Altova Mailing List Archives: comp.text.xml

Re: Seek in huge xml-files
Date: 8/10/2008 8:02:00 PM

If the file is smaller than 2 GB and you have enough memory, try vtd-xml:
http://vtd-xml.sf.net

"Bogomir Engel" <bogomir@g...> wrote in message news:g7ihsv$o1m$02$1@n......
> Hi all,
>
> For a student project I have to be able to look up information in
> XML files that are several GB in size. Depending on the user's input
> through the GUI, data has to be displayed, and it is not feasible to parse
> the whole file for every input. We can't use DOM since it would load the
> whole file into memory. Our current approaches are based on SAX.
> We thought of generating some sort of index that would give us, for every
> data set, its byte offset in the file. The project has to be
> implemented in Java, so we wanted to do something like
>
> Reader.skip(offsetBytes)
>
> so that we could jump to the location of our data set without having to
> parse the whole file. The problem is that we have no
> idea how to obtain the index information. How can you find out where
> in the file the SAX parser currently is (meaning the byte offset)?
>
> Another point is that our tests with the SAX parser, when skipping bytes in
> its input source, produced this exception:
>
> Content is not allowed in prolog
>
> So we are wondering whether it's possible to jump to some given position
> and then parse from there.
>
> I'm thankful for any advice since I'm quite helpless now. Many thanks!
> Bogomir Engel
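
The index-and-seek idea from the quoted question can be sketched roughly as follows. This is not taken from the thread; it is a minimal illustration under several assumptions: the class and method names (XmlRecordIndex, buildIndex, readRecord, parseRecord) are hypothetical, the record elements are not nested inside each other, each record has an explicit end tag written without whitespace (for example </record>), the tag text does not appear inside comments or CDATA sections, and the file uses an ASCII-compatible encoding such as UTF-8. Instead of asking the SAX parser for its byte position, the sketch scans the raw bytes once to record where each record starts and ends. It then reads exactly one record's bytes with RandomAccessFile.seek and parses that fragment as a small standalone document. Handing the parser a complete element, rather than a stream that starts at an arbitrary byte, is what should avoid the "Content is not allowed in prolog" error.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any library.
final class XmlRecordIndex {

    /** Byte range of one record element within the file. */
    static final class Entry {
        final long offset; // position of the record's '<'
        final int length;  // number of bytes up to and including the end tag
        Entry(long offset, int length) { this.offset = offset; this.length = length; }
    }

    /** One sequential pass over the file, noting where each <tag ...>...</tag> lies. */
    static List<Entry> buildIndex(String path, String tag) throws IOException {
        byte[] open = ("<" + tag).getBytes(StandardCharsets.US_ASCII);
        byte[] close = ("</" + tag + ">").getBytes(StandardCharsets.US_ASCII);
        List<Entry> index = new ArrayList<>();
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            long pos = 0;    // offset of the next unread byte
            long start = -1; // offset of the current record's '<', or -1 outside a record
            int matched = 0; // how many bytes of the current pattern have matched
            int b;
            while ((b = in.read()) != -1) {
                pos++;
                if (start < 0 && matched == open.length) {
                    // "<tag" matched; accept only if the element name ends here,
                    // so that "<records>" is not mistaken for "<record>".
                    if (b == '>' || b == ' ' || b == '\t' || b == '\n' || b == '\r') {
                        start = pos - 1 - open.length;
                        matched = 0;
                    } else {
                        matched = (b == '<') ? 1 : 0;
                    }
                    continue;
                }
                byte[] pattern = (start < 0) ? open : close;
                // '<' occurs only at index 0 of both patterns, so a plain restart suffices.
                matched = (b == pattern[matched]) ? matched + 1 : (b == '<' ? 1 : 0);
                if (start >= 0 && matched == close.length) {
                    index.add(new Entry(start, (int) (pos - start)));
                    start = -1;
                    matched = 0;
                }
            }
        }
        return index;
    }

    /** Jump straight to one record and read only its bytes. */
    static byte[] readRecord(String path, Entry e) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            raf.seek(e.offset);
            byte[] buf = new byte[e.length];
            raf.readFully(buf);
            return buf;
        }
    }

    /** Parse a single record as if it were a complete XML document. */
    static void parseRecord(byte[] fragment, DefaultHandler handler) throws Exception {
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(fragment), handler);
    }
}
```

In this sketch the index lives in memory; for a file of several GB it would probably make sense to write the (key, offset, length) triples to a side file once and reload them on startup. Because the fragment is parsed without the original prolog, anything declared outside the record (a DTD with entity definitions, namespace declarations on ancestor elements, a non-UTF-8 encoding declaration) is not visible to the parser and would need to be handled separately.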
