Altova Mailing List Archives


Re: Seek in huge xml-files

From: "jimmy Zhang" <crackeur@-------.--->
To: NULL
Date: 8/10/2008 8:02:00 PM

If the file is smaller than 2 GB and you have enough memory, try vtd-xml:
http://vtd-xml.sf.net

"Bogomir Engel" <bogomir@g...> wrote in message 
news:g7ihsv$o1m$02$1@n......
> Hi all,
>
> For a student project I have to look up information in XML files that 
> are several GB in size. Data has to be displayed depending on the user's 
> input through the GUI, and parsing the whole file for every input is not 
> feasible. We can't use DOM, since it would load the whole file into 
> memory. Our current approaches are based on SAX. We thought of 
> generating some sort of index that provides the byte offset of every 
> data set in the file. The project has to be implemented in Java, so we 
> wanted to do something like
>
> Reader.skip(offsetBytes)
>
> That way we could jump to the location of our data set without having to 
> parse the whole file. The problem is that we have no idea how to obtain 
> the index information. How can you find out where in a file the SAX 
> parser currently is (meaning the byte offset)?
>
> Another point is that our tests with the SAX parser, when skipping bytes 
> in its input source, produced this exception:
>
> Content is not allowed in prolog
>
> So we are wondering whether it's possible to jump to a given position 
> and then parse from there.
>
> I'd be thankful for any advice, since I'm quite stuck right now. Many thanks!
> Bogomir Engel 
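
On the byte-offset index itself: the SAX Locator reports only line and column numbers, not byte offsets, so one option is to build the index without the XML parser by scanning the raw bytes once for the record's start and end tags. Also note that Reader.skip counts characters rather than bytes, so RandomAccessFile.seek is likely the better fit for jumping. The "Content is not allowed in prolog" error typically shows up when the parser is handed input that does not begin like a document, which is what happens after skipping to an arbitrary position. The sketch below avoids that by cutting out exactly one complete element and parsing it as a small document of its own. It is only a rough illustration under several assumptions: the class and method names are made up, the tag name "item" is hypothetical, the file is UTF-8 (or another ASCII-compatible encoding), records are not nested or self-closing, the tag text does not occur inside comments or CDATA, each record is far smaller than 2 GB, and records do not rely on namespace prefixes or entities declared elsewhere in the file.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

class OffsetIndexSketch {

    /** One pass over the raw bytes; returns {start, end} offsets (end exclusive) per record. */
    static List<long[]> buildIndex(Path file, String tag) {
        byte[] open = ("<" + tag).getBytes(StandardCharsets.UTF_8);
        byte[] close = ("</" + tag + ">").getBytes(StandardCharsets.UTF_8);
        List<long[]> index = new ArrayList<>();
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            long pos = 0;                // number of bytes consumed so far
            long start = -1;             // start offset of the open record, or -1
            long candidate = -1;         // offset where "<tag" was just matched
            boolean pendingOpen = false; // "<tag" seen, next byte must end the name
            int o = 0, c = 0;            // matched prefix lengths of open / close
            int b;
            while ((b = in.read()) != -1) {
                pos++;
                if (start < 0) {
                    if (pendingOpen) {
                        pendingOpen = false;
                        // Accept "<item>" or "<item ...", but not "<items>".
                        if (b == '>' || b == ' ' || b == '\t' || b == '\n' || b == '\r') {
                            start = candidate;
                            continue;
                        }
                    }
                    // '<' can only be the first pattern byte, so this fallback is enough.
                    o = (b == open[o]) ? o + 1 : (b == open[0] ? 1 : 0);
                    if (o == open.length) {
                        candidate = pos - open.length;
                        pendingOpen = true;
                        o = 0;
                    }
                } else {
                    c = (b == close[c]) ? c + 1 : (b == close[0] ? 1 : 0);
                    if (c == close.length) {
                        index.add(new long[] {start, pos});
                        start = -1;
                        c = 0;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return index;
    }

    /** Seeks to one indexed record and parses only those bytes as a standalone document. */
    static Element readRecord(Path file, long[] span) {
        byte[] fragment = new byte[(int) (span[1] - span[0])];
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(span[0]);
            raf.readFully(fragment);
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(fragment)).getDocumentElement();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Demo helper: writes a small sample document to a temporary file. */
    static Path writeTemp(String xml) {
        try {
            Path p = Files.createTempFile("records", ".xml");
            Files.write(p, xml.getBytes(StandardCharsets.UTF_8));
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path p = writeTemp("<?xml version=\"1.0\"?><items>"
                + "<item id=\"a\">one</item><item id=\"b\">two</item></items>");
        List<long[]> index = buildIndex(p, "item");
        Element second = readRecord(p, index.get(1));
        System.out.println(second.getAttribute("id") + " = " + second.getTextContent());
    }
}
```

In practice you would persist the index (for example, keyed by a record id) so the multi-GB file is scanned only once. Parsing the small fragment with DOM is fine here because only one record is ever in memory; a SAX parser could be fed the same ByteArrayInputStream instead.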



