Altova Mailing List Archives


RE: [xsl] parsing large xml files using Saxon 6.5.2

From: "Michael Kay" <mhk@--------->
To:
Date: 8/11/2003 8:00:00 AM
So what's the difference between the 18.1Mb run that ran "for hours",
and the 19.2Mb run that ran in 26 seconds? Somewhere there is a
significant difference that explains the problem, and you haven't given
us enough information to find it.

Running with the -T option can be useful. It will produce far more
information than you can analyse, and will slow down processing
considerably, but it should give you some indication as to whether the
processing is hung, looping, or just doing a lot of work.

The evidence of your measurements is that the stylesheet's performance
is essentially linear.
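
One rough way to check a claim like this is to divide each run time by the file size and see whether the cost per megabyte stays roughly constant. The sketch below uses the five sizes and times reported in the quoted message. The variable and function names are illustrative only, and each timing is a single run, so the ratios are indicative at best.

```python
# Sizes (Mb) and run times (seconds) as reported in the quoted message.
runs = [(1.4, 1.2), (4.4, 3.3), (7.3, 6.0), (10.3, 9.8), (19.2, 26.1)]


def seconds_per_mb(size_mb, seconds):
    """Cost per megabyte; roughly constant if scaling is linear."""
    return seconds / size_mb


ratios = [seconds_per_mb(size, secs) for size, secs in runs]

# Print one line per run so the trend is easy to eyeball.
for (size, secs), ratio in zip(runs, ratios):
    print(f"{size:5.1f} Mb  {secs:5.1f} s  {ratio:.2f} s/Mb")
```

On these figures the cost per megabyte stays within a factor of two across a roughly fourteen-fold range of input sizes, with some growth at the largest size. Nothing in that trend would predict a run of hours for an 18.1Mb file.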

I would advise, by the way, moving off Instant Saxon to full Saxon for
any serious work. The Microsoft Java VM is now a thing of the past, so
any benefits that Instant Saxon once offered have pretty well
disappeared.

Michael Kay

> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx 
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of marina
> Sent: 11 August 2003 13:36
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] parsing large xml files using Saxon 6.5.2
> 
> 
> Hi,
> 
> I am having problems parsing some XML files. I have a
> 1GHz processor and 256Mb of RAM.
> 
> The XSLT stylesheet "wordgroup.xsl" from Dimitri
> (thank you!) was tested and worked perfectly on smaller
> test files. When I run it on a larger file,
> "1cl.xml" = 18.1Mb, it builds the tree for
> strSplit-to-Words.xsl and then sits there for hours.
> 
> See output below.
> 
> --------------------------------------------------------------
> Microsoft Windows 2000 [Version 5.00.2195]
> (C) Copyright 1985-2000 Microsoft Corp.
> 
> h:\saxon\testbed>saxon -t -o output.txt 1cl.xml wordgroup.xsl
> SAXON 6.5.2 from Michael Kay
> Java version 1.1.4
> Preparation time: 371 milliseconds
> Processing file:/h:/saxon/testbed/1cl.xml
> Building tree for file:/h:/saxon/testbed/1cl.xml using class com.icl.saxon.tinytree.TinyBuilder
> Tree built in 7070 milliseconds
> Building tree for file:/h:/saxon/testbed/strSplit-to-Words.xsl using class com.icl.saxon.tinytree.TinyBuilder
> Tree built in 10 milliseconds
> 
> --------------------------------------------------------------
> 
> 
> So I made another XML file, "little.xml", by pasting
> sections of 1cl.xml in different sizes to see
> where it was having problems processing.
> 
> little.xml = 1.4Mb   time = 1.2 sec
> little.xml = 4.4Mb   time = 3.3 sec
> little.xml = 7.3Mb   time = 6 sec
> little.xml = 10.3Mb  time = 9.8 sec
> little.xml = 19.2Mb (bigger than the file I want to parse!)  time = 26.1 sec! (see nice output below)
> 
> 
> h:\saxon\testbed>saxon -t -o output.txt little.xml wordgroup.xsl
> SAXON 6.5.2 from Michael Kay
> Java version 1.1.4
> Preparation time: 701 milliseconds
> Processing file:/h:/saxon/testbed/little.xml
> Building tree for file:/h:/saxon/testbed/little.xml using class com.icl.saxon.tinytree.TinyBuilder
> Tree built in 7912 milliseconds
> Building tree for file:/h:/saxon/testbed/strSplit-to-Words.xsl using class com.icl.saxon.tinytree.TinyBuilder
> Tree built in 20 milliseconds
> Execution time: 26178 milliseconds
> 
> Any ideas for me to try?
> 
> Thanks
> 
> Marina
> 
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
> 


