HXQ: Release Notes
- HXQ-0.19.0 (released on 01/09/10)
- HXQ-0.18.0 (released on 09/29/09)
- Added namespaces, XML Schema validation, and type inference. Added general URL for doc files based on the HTTP package.
- HXQ-0.17.2 (released on 06/17/09)
- Replaced editline/readline with haskeline. Fixed HXQ.cabal.
- HXQ-0.17.0 (released on 03/30/09)
- Improved streaming: Before this release, the cache requirement for parsing an XML document was the size
of the largest root child generated by the parser. For DBLP, this was not a problem, since this data was generated from a relational
database, so it was mostly flat. But for XMark, which is automatically generated, document-centric XML data,
the root contains only one child, which in turn contains one child, and so on for up to 3 levels. This created a space
leak for XMark. I fixed this by looking at the event stream before constructing the XML tree
and deciding at which level to chop the stream so that the first XML tree requires only bounded space.
Then, during parsing, when the tree construction reaches this level, it immediately closes the current tree and starts a new tree.
This was done in such a way that XQueries that do their testing/output beyond the chopping level still produce the correct result.
For all small documents and for most large documents (such as DBLP), chopping was not necessary.
For XMark, though, the chopping was done at level 3. Now 1.1GB of XMark needs just 10MB of heap. The drawback: some queries that work
on level<3 will not return correct results. For example, count(doc('data/xdata.xml')/site/sites/europe) over this XMark file will return 3M instead of just 1.
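The chopping idea can be illustrated with a small sketch. This is not the HXQ implementation (which is written in Haskell); it is a Python approximation under stated assumptions: the event format, the names `Node`, `chop`, `wrap` and `count_path`, and the choice to re-wrap each chunk in copies of its ancestors are all hypothetical. It shows why queries below the chopping level stay correct while queries at or above it can overcount.

```python
class Node:
    """A minimal XML element: a tag plus child nodes or text strings."""
    def __init__(self, tag):
        self.tag = tag
        self.children = []

def wrap(path, node):
    """Re-wrap a chunk in fresh copies of its ancestor tags (outermost first in `path`)."""
    for tag in reversed(path):
        parent = Node(tag)
        parent.children.append(node)
        node = parent
    return node

def chop(events, level):
    """Yield one small tree per element found just below `level`.

    `events` is an iterable of ("start", tag), ("end", tag), ("text", data).
    Elements at depth <= level are never materialized as one big tree;
    only their tags are remembered so each chunk can be wrapped in them.
    Wrappers without deeper children produce no chunk in this sketch.
    """
    path = []    # tags of open elements at depth <= level
    stack = []   # nodes under construction at depth > level
    depth = 0
    for kind, value in events:
        if kind == "start":
            depth += 1
            if depth <= level:
                path.append(value)
            else:
                node = Node(value)
                if stack:
                    stack[-1].children.append(node)
                stack.append(node)
        elif kind == "end":
            if depth <= level:
                path.pop()
            else:
                node = stack.pop()
                if not stack:            # a whole chunk is complete: emit it
                    yield wrap(path, node)
            depth -= 1
        elif kind == "text" and stack:   # text directly under a wrapper is dropped
            stack[-1].children.append(value)

def count_path(tree, steps):
    """Count the elements reached by the absolute path `steps` in one tree."""
    if not steps or tree.tag != steps[0]:
        return 0
    if len(steps) == 1:
        return 1
    return sum(count_path(c, steps[1:])
               for c in tree.children if isinstance(c, Node))

# A tiny document: one europe element holding three items.
events = [("start", "site"), ("start", "sites"), ("start", "europe")]
for _ in range(3):
    events += [("start", "item"), ("text", "x"), ("end", "item")]
events += [("end", "europe"), ("end", "sites"), ("end", "site")]

chunks = list(chop(events, 3))
items = sum(count_path(t, ["site", "sites", "europe", "item"]) for t in chunks)
europes = sum(count_path(t, ["site", "sites", "europe"]) for t in chunks)
# items is 3 (correct); europes is also 3, although the document has one europe
```

Each chunk holds only one item subtree at a time, so memory stays bounded, but every chunk carries its own copy of the europe wrapper, which is the kind of overcounting described above.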
- Added virtual views using the syntax declare view ..., which is similar to function declarations.
In contrast to functions, views are unfolded before query optimization, so they cannot be recursive.
Also, special care must be taken to use the view parameters linearly, since a parameter that is used more than once is
duplicated when the view is unfolded, which may lead to code explosion.
XML views are very important for the data integration project that I am currently implementing.
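To see why non-linear parameter use matters when views are unfolded, here is a rough Python sketch. It does not use HXQ's actual view syntax or internals: expressions are modeled as nested tuples, and the names `substitute`, `unfold`, `size`, and the views `dup` and `once` are hypothetical, chosen only for illustration.

```python
def substitute(expr, bindings):
    """Replace parameter names (strings such as "$x") by their bound expressions."""
    if isinstance(expr, str):
        return bindings.get(expr, expr)
    return tuple(substitute(e, bindings) for e in expr)

def unfold(expr, views):
    """Inline every view call in `expr`.

    An expression is a string (variable or literal) or a tuple (op, arg, ...).
    `views` maps a view name to (parameter names, body).
    A recursive view would make this loop forever, which is why
    unfolded views cannot be recursive.
    """
    if isinstance(expr, str):
        return expr
    op, *args = expr
    args = [unfold(a, views) for a in args]
    if op in views:
        params, body = views[op]
        return unfold(substitute(body, dict(zip(params, args))), views)
    return (op, *args)

def size(expr):
    """Number of symbols in an expression, as a crude measure of code size."""
    if isinstance(expr, str):
        return 1
    return sum(size(e) for e in expr)

views = {
    "once": (["$x"], ("wrap", "$x")),          # linear: $x is used once
    "dup":  (["$x"], ("concat", "$x", "$x")),  # non-linear: $x is used twice
}

linear = unfold(("once", ("once", ("once", "doc"))), views)
nonlinear = unfold(("dup", ("dup", ("dup", "doc"))), views)
# size(linear) is 4, size(nonlinear) is 15: the argument is copied at every level
```

With the linear view the unfolded query grows by one symbol per nesting level, while with the non-linear view it roughly doubles at each level.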
Last modified: 01/09/10 by Leonidas Fegaras