XQP: XQuery Processing on a P2P System

XQP is a dynamic and scalable architecture for indexing and querying a large distributed repository of schema-less XML data, implemented on top of a structured peer-to-peer (P2P) system (Pastry). Unlike other approaches, XQP can process most forms of XQuery extended with full-text search, including queries that search for multiple documents related through join conditions. The indexing is based on both the text terms and the structural summary of a document. Given an XQuery, our system can find all the plausible structural summaries applicable to the query using only one peer lookup. Each structural summary is associated with a small, dynamically adapting sub-space of peers that share the inverted lists related to all the documents that conform to this particular structural summary. Peers may participate in multiple sub-spaces, and each sub-space may grow or shrink dynamically, depending on the workload. To locate multiple documents that are related through join conditions, XQP uses value histograms distributed over the P2P network.
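
To make the idea of indexing by structural summary more concrete, the following is a minimal single-process sketch in Python. It is not XQP's implementation. It assumes that a structural summary can be approximated by the set of distinct root-to-element label paths in a document, and it uses a SHA-1 hash of that set as a stand-in for the key a DHT such as Pastry would route on. The function names (structural_summary, summary_key, build_index) and the sample documents are hypothetical and chosen only for illustration.

```python
import hashlib
import xml.etree.ElementTree as ET
from collections import defaultdict


def structural_summary(xml_text):
    """Return the sorted tuple of distinct root-to-element label paths.

    Assumption: this path set approximates a structural summary.
    """
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return tuple(sorted(paths))


def summary_key(summary):
    """Hash a summary to a hex string (hypothetical stand-in for a DHT key)."""
    return hashlib.sha1("\n".join(summary).encode("utf-8")).hexdigest()


def build_index(docs):
    """Group documents by summary key and build one inverted list per term.

    Result shape: {summary_key: {term: {doc_id, ...}}}
    """
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, xml_text in docs.items():
        key = summary_key(structural_summary(xml_text))
        for elem in ET.fromstring(xml_text).iter():
            for term in (elem.text or "").lower().split():
                index[key][term].add(doc_id)
    return index


# Hypothetical sample documents: d1 and d2 share a structure, d3 does not.
docs = {
    "d1": "<book><title>Peer Systems</title><author>Lee</author></book>",
    "d2": "<book><title>Query Processing</title><author>Kim</author></book>",
    "d3": "<article><title>Peer Networks</title></article>",
}
index = build_index(docs)
book_key = summary_key(structural_summary(docs["d1"]))
print(sorted(index[book_key]["peer"]))  # ['d1']
```

In this sketch, documents d1 and d2 hash to the same key because they have the same path set, so a full-text lookup for "peer" restricted to the book structure returns only d1 and never touches the inverted lists kept for d3. In a distributed setting, each key would presumably identify the sub-space of peers holding those lists.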

