Altova Mailing List Archives


Re: [xsl] Optimization Question

From: Tony Lavinio <xml1@----------->
To:
Date: 2/1/2005 1:45:00 AM
There are a couple of XSLT profilers out there.  Stylus Studio, for
example, has one that hooks into the internal processor or Saxon or
Xalan-J.  It would tell you where the time is being spent.

See www.---.com



The web page is a little out of date; the current version also added
Saxon 8 profiling, and the output includes cute little graphs.

In the profilers I've seen, including this one, the output format
itself is XML, letting you do some interesting transforms with the
profiling data.





On 01-31-2005 6:41 PM, Michael Nguyen wrote:



All,

   I've been trying to find a more efficient way of transforming a large
group of files using Xalan.  I have about 435,000 XML documents (sizes
ranging from 600 bytes to 8 KB) that I need to transform.  Each document
undergoes exactly the same transformation.  Part of the transformation
involves doing a lookup in another XML document (about 5 MB).  The
source XML files are stored in a hierarchical structure, such that the
files are distributed across 40 directories.  If I perform the
transformations without the external document lookup, the entire process
takes about two hours.  When I perform the transformation with the
lookup, it runs at roughly 8,000 documents per 12 hours.  I'm running
this on a 3 GHz P4 system with 1 GB of RAM.  I'm using xsl:key for the
lookup, like the following:



<xsl:key name="all-groups-key" match="GROUP_DOC/COVER_SHEET/TITLE"
         use="ancestor::GROUP_DOC/@CRG" />

<xsl:variable name="all-groups"
              select="document('../current/groups.xml')//GROUP_DOC/COVER_SHEET/TITLE" />

... code ...snip
   <xsl:template match="CC">
       <xsl:variable name="group_name" select="." />
       <xsl:variable name="crg" select="substring-after(., '-')" />

       <xsl:value-of select="$group_name" />

       <xsl:variable name="group_full_name"
                     select="$all-groups[generate-id() = generate-id(key('all-groups-key', $crg))]" />

       <xsl:choose>
           <xsl:when test="$group_full_name != ''">
               <a href="../../../group/{$crg}.html"><xsl:value-of select="$group_full_name" /></a>
           </xsl:when>
           <xsl:otherwise>
               <xsl:value-of select="$group_name" />
           </xsl:otherwise>
       </xsl:choose>

       <xsl:if test="following-sibling::CC"><br /></xsl:if>
   </xsl:template>

----------------------

The lookup is done for each CC tag in the document.  Each document has
at least one CC that matches.

It seems to me that the difference in processing time is solely due to
the lookup code above, because as soon as I remove it, all 435,000 files
are processed in relatively little time.  Once I put the code back in,
however, it brings my machine to a grinding halt trying to process the
files.  It seems to be loading the groups.xml file each time I perform
the transformation.  What I want to try to do is store this stylesheet
with the lookup XML in memory to reduce the number of times the
groups.xml file is loaded.

Thanks,

Michael Nguyen
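
Editorial sketch (not part of the original thread): in XSLT 1.0, key()
searches the document that contains the context node.  In the posted
template the key() call sits inside a predicate on $all-groups, so it may
be evaluated once for every TITLE node in groups.xml rather than once per
CC.  A common idiom is to switch the context to the lookup document with
xsl:for-each and call key() a single time.  The stylesheet below is a
minimal sketch of that idiom, assuming the same element names and file
path as the post; the variable name groups-doc and the html output method
are assumptions, and it has not been run against the poster's data.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the post emits HTML links, so html output is used here. -->
  <xsl:output method="html" />

  <!-- Same key as in the post: index TITLE by the CRG attribute of its GROUP_DOC. -->
  <xsl:key name="all-groups-key" match="GROUP_DOC/COVER_SHEET/TITLE"
           use="ancestor::GROUP_DOC/@CRG" />

  <!-- Hypothetical name: the root node of the lookup document,
       resolved relative to the stylesheet's own location. -->
  <xsl:variable name="groups-doc" select="document('../current/groups.xml')" />

  <xsl:template match="CC">
    <xsl:variable name="group_name" select="." />
    <xsl:variable name="crg" select="substring-after(., '-')" />

    <!-- Kept from the original post: the short name is always written first. -->
    <xsl:value-of select="$group_name" />

    <!-- Switch the context node to groups.xml so key() searches that
         document, then do one indexed lookup for this CC. -->
    <xsl:variable name="group_full_name">
      <xsl:for-each select="$groups-doc">
        <xsl:value-of select="key('all-groups-key', $crg)" />
      </xsl:for-each>
    </xsl:variable>

    <xsl:choose>
      <!-- The variable is a result tree fragment; compare its string value. -->
      <xsl:when test="string($group_full_name) != ''">
        <a href="../../../group/{$crg}.html"><xsl:value-of select="$group_full_name" /></a>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$group_name" />
      </xsl:otherwise>
    </xsl:choose>

    <xsl:if test="following-sibling::CC"><br /></xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

This only changes how the lookup is performed within one transformation.
It does not by itself stop groups.xml from being parsed again on every
run; whether and how the parsed document can be kept in memory across
runs depends on how the processor is invoked, which the post does not
describe.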

