Altova Mailing List Archives


Seek in huge xml-files

From: Bogomir Engel <bogomir@-----------.-->
To: NULL
Date: 8/9/2008 12:38:00 AM

Hi all,

For a student project I have to look up information in XML files that 
are several GB in size. Data has to be displayed depending on what the 
user enters through the GUI, and it is not feasible to parse the whole 
file for every input. We can't use DOM since it would load the whole 
file into memory. Our current approaches are based on SAX. We thought of 
generating some sort of index that stores, for every data set, its byte 
offset in the file. The project has to be implemented in Java, so we 
wanted to do something like

Reader.skip(offsetBytes)

That way we could jump to the location of our data set without having to 
parse the whole file. The problem is that we have no idea how to obtain 
the index information. How can you find out where in a file the SAX 
parser currently is (meaning the byte offset)?
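
To make the index idea concrete, here is a minimal sketch of one way to collect byte offsets without asking the parser at all: scan the raw bytes once and record where each start tag of a given element begins. The class name OffsetIndexer, the method indexStartTags and the element name "record" are placeholders for illustration, not part of any library. The sketch assumes an ASCII-compatible encoding such as UTF-8 and that the tag text never appears inside comments or CDATA sections.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class OffsetIndexer {

    /**
     * Returns the byte offset of the '<' of every start tag named elementName.
     * Assumption: UTF-8 (or another ASCII-compatible encoding) and no
     * occurrences of the tag text inside comments or CDATA sections.
     */
    public static List<Long> indexStartTags(Path file, String elementName) throws IOException {
        byte[] pattern = ("<" + elementName).getBytes(StandardCharsets.UTF_8);
        List<Long> offsets = new ArrayList<>();
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            long pos = 0;    // number of bytes consumed so far
            int matched = 0; // number of pattern bytes matched so far
            int b;
            while ((b = in.read()) != -1) {
                pos++;
                if (matched == pattern.length) {
                    // The name must end here, otherwise "<records" would match "<record".
                    if (b == '>' || b == '/' || b == ' ' || b == '\t' || b == '\r' || b == '\n') {
                        offsets.add(pos - 1 - pattern.length);
                    }
                    matched = 0;
                }
                if ((byte) b == pattern[matched]) {
                    matched++;
                } else {
                    // '<' cannot occur inside an element name, so restarting at 0 or 1 is enough.
                    matched = ((byte) b == pattern[0]) ? 1 : 0;
                }
            }
        }
        return offsets;
    }
}
```

The returned offsets could be stored alongside whatever key the GUI searches by, so that a later lookup only needs one seek instead of a full parse.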

Another point is that in our tests, skipping bytes in the SAX parser's 
input source produced this exception:

Content is not allowed in prolog

So we are wondering whether it is possible to jump to a given position 
and then parse from there.
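
As a sketch of what "jump and parse from there" could look like: the error above typically appears when the parser meets text before any root element, for example because the skip landed in the middle of a tag or of character data. Note also that Reader.skip counts characters, not bytes. The sketch below instead positions a FileChannel at a byte offset that is assumed to point exactly at the '<' of a start tag, and uses a StAX pull parser so that reading can stop as soon as that element is closed. The class name FragmentReader and the method readTextAt are made up for illustration; UTF-8 is assumed, as is a fragment that does not rely on namespace or entity declarations from enclosing elements.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class FragmentReader {

    /**
     * Parses the single element that starts at byteOffset and returns its
     * concatenated text content (including text of nested elements).
     * Assumption: byteOffset points exactly at the '<' of a start tag.
     */
    public static String readTextAt(Path file, long byteOffset)
            throws IOException, XMLStreamException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            channel.position(byteOffset); // seek by bytes, not characters
            InputStream in = Channels.newInputStream(channel);
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(in, "UTF-8");
            StringBuilder text = new StringBuilder();
            int depth = 0;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                } else if (event == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT && --depth == 0) {
                    // Stop pulling events here: what follows in the file is not
                    // well-formed from the parser's point of view.
                    break;
                }
            }
            reader.close();
            return text.toString();
        }
    }
}
```

A push-style SAX parser keeps reading after the element ends and would then complain about the trailing content, which is why a pull parser is used in this sketch.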

I'd be thankful for any advice, since I'm quite stuck right now. Many thanks!
Bogomir Engel

