Altova Mailing List Archives


Re: XSL Improvement

From: rdcpro@---------.------
To: NULL
Date: 2/1/2005 5:41:00 PM
Well, I played around with some large example files using xsl:key and a 
variety of other techniques. Pretty much everything was slower than using 
preceding-sibling, due to the overhead involved in indexing the data.  But 
one approach, the slightly different XPath posted below, shaved about 200 
milliseconds off the 1000 or so that it otherwise took to execute.  The 
biggest factor with my smaller file was parsing, but as the file approached 
100 MB, everything slowed down.

Also, the numbers below are for MSXML 4.0, not .NET 1.1.  This is on decent 
hardware (3 GHz P4, 2 GB RAM).

Using your approach on a 28 MB file:

Source document load time:     7093 milliseconds
Stylesheet document load time: .628 milliseconds
Stylesheet compile time:       .472 milliseconds
Stylesheet execution time:     1063 milliseconds

Using an xsl:key to index the transaction nodes:

Source document load time:     7300 milliseconds
Stylesheet document load time: .705 milliseconds
Stylesheet compile time:       .575 milliseconds
Stylesheet execution time:     1449 milliseconds

Using a slightly different XPath (posted below):

Source document load time:     7300 milliseconds
Stylesheet document load time: .679 milliseconds
Stylesheet compile time:       .515 milliseconds
Stylesheet execution time:     867.9 milliseconds

So, by using a different XPath to get to the same nodes, I can shave about 
200 ms from a 1063 ms execution time.  With a 53 MB document, the times are:

Your Approach:
Source document load time:     20877 milliseconds
Stylesheet document load time: .739 milliseconds
Stylesheet compile time:       .614 milliseconds
Stylesheet execution time:     2063 milliseconds

New XPath:
Source document load time:     21141 milliseconds
Stylesheet document load time: .683 milliseconds
Stylesheet compile time:       .649 milliseconds
Stylesheet execution time:     1646 milliseconds

Parse time nearly tripled, but execution time doubled.

On 106 MB:

Your XPath:
Source document load time:     68023 milliseconds
Stylesheet document load time: .662 milliseconds
Stylesheet compile time:       .495 milliseconds
Stylesheet execution time:     4058 milliseconds

New XPath:
Source document load time:     67808 milliseconds
Stylesheet document load time: .705 milliseconds
Stylesheet compile time:       .530 milliseconds
Stylesheet execution time:     3275 milliseconds

So parse time tripled again, but execution time only doubled.
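
To make the scaling explicit, the ratios can be worked out directly from the 
figures above.  A small Python sketch (the variable and function names are 
only illustrative; the numbers are the ones posted for the original XPath 
and the new XPath):

```python
# Timings in milliseconds, copied from the measurements above,
# keyed by source file size in MB.
load_ms = {28: 7093, 53: 20877, 106: 68023}      # source load, original XPath runs
exec_ms = {28: 1063, 53: 2063, 106: 4058}        # execution, original XPath
exec_new_ms = {28: 867.9, 53: 1646, 106: 3275}   # execution, new XPath


def growth(series, small, large):
    """Return how many times larger the timing is at `large` MB than at `small` MB."""
    return series[large] / series[small]


# Compare each size step: how much did load and execution time grow?
for small, large in [(28, 53), (53, 106)]:
    print(f"{small} -> {large} MB: "
          f"load x{growth(load_ms, small, large):.2f}, "
          f"exec x{growth(exec_ms, small, large):.2f}")

# Fraction of execution time saved by the new XPath at each size.
for size in exec_ms:
    saved = 1 - exec_new_ms[size] / exec_ms[size]
    print(f"{size} MB: new XPath saves {saved:.0%} of execution time")
```

With the numbers as posted, load time grows by roughly 2.9x and then 3.3x 
for each approximate doubling of file size, while execution time grows by a 
little under 2x each time, and the new XPath saves about a fifth of the 
execution time at every size.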

Now, in .NET, you obviously have the choice of an XmlDocument or an 
XPathDocument, and the XPathDocument should greatly improve things.  I don't 
know which you're using, though.  It seems like 7 minutes is a long time for 
a 60 MB document...especially without using the preceding-sibling axis.

Here are the various approaches I tried.  It would be interesting to see how 
an XslTransform on an XPathDocument would work using the xsl:key approach.

Using xsl:key:

<?xml version="1.0"?>
<xsl:transform version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
xmlns:msxsl="urn:schemas-microsoft-com:xslt">
	<xsl:output method="xml" version="1.0" indent="yes" 
omit-xml-declaration="yes" encoding="utf-8"/>
	<xsl:key name="kTrans" match="Transaction837P" use="generate-id()"/>
	<xsl:template match="/">
		<xsl:element name="root">
			<xsl:for-each select="file/Transaction837P">
				<xsl:variable name="vId" select="generate-id()"/>
				<xsl:for-each select="PatientHierarchicalLevelLoop/ClaimInformationLoop">
					<xsl:variable name="vClaimPosition" select="position()"/>
					<xsl:element name="CLAIM">
						<xsl:attribute name="LAST_NAME"><xsl:value-of select="key('kTrans', $vId)/UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@LastName"/></xsl:attribute>
						<!-- <xsl:value-of select="../preceding-sibling::UserHierarchicalLevelLoop[1]/UserNameLoop/UserName/@LastName"/> -->
						<xsl:attribute name="FIRST_NAME"><xsl:value-of select="key('kTrans', $vId)/UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@FirstName"/></xsl:attribute>
						<!-- <xsl:value-of select="../preceding-sibling::UserHierarchicalLevelLoop[1]/UserNameLoop/UserName/@FirstName"/> -->
						<xsl:element name="test">
							<xsl:for-each select="test"><xsl:value-of select="."/></xsl:for-each>
						</xsl:element>
					</xsl:element>
				</xsl:for-each>
			</xsl:for-each>
		</xsl:element>
	</xsl:template>
</xsl:transform>


Using a different XPath:

<?xml version="1.0"?>
<xsl:transform version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
xmlns:msxsl="urn:schemas-microsoft-com:xslt">
	<xsl:output method="xml" version="1.0" indent="yes" 
omit-xml-declaration="yes" encoding="utf-8"/>
	<xsl:template match="/">
		<xsl:element name="root">
			<xsl:for-each select="file/Transaction837P">
				<xsl:for-each select="PatientHierarchicalLevelLoop/ClaimInformationLoop">
					<xsl:variable name="vClaimPosition" select="position()"/>
					<xsl:element name="CLAIM">
						<xsl:attribute name="LAST_NAME"><xsl:value-of select="../../UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@LastName"/></xsl:attribute>
						<xsl:attribute name="FIRST_NAME"><xsl:value-of select="../../UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@FirstName"/></xsl:attribute>
						<xsl:element name="test">
							<xsl:for-each select="test"><xsl:value-of select="."/></xsl:for-each>
						</xsl:element>
					</xsl:element>
				</xsl:for-each>
			</xsl:for-each>
		</xsl:element>
	</xsl:template>
</xsl:transform>

This was on an XML document structured like:

<file>
	<Transaction837P>
		<SubmitterNameLoop></SubmitterNameLoop>
		<ReceiverNameLoop></ReceiverNameLoop>
		<Billing_Pay_toProviderHierarchicalLevelLoop></Billing_Pay_toProviderHierarchicalLevelLoop>
		<UserHierarchicalLevelLoop>
			<UserNameLoop>
				<UserName LastName="LAST1" FirstName="FIRST1"></UserName>
			</UserNameLoop>
			<PayerNameLoop></PayerNameLoop>
		</UserHierarchicalLevelLoop>
		<PatientHierarchicalLevelLoop>
			<ClaimInformationLoop>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
			</ClaimInformationLoop>
		</PatientHierarchicalLevelLoop>
		<UserHierarchicalLevelLoop>
			<UserNameLoop>
				<UserName LastName="LAST2" FirstName="FIRST2"></UserName>
			</UserNameLoop>
			<PayerNameLoop></PayerNameLoop>
		</UserHierarchicalLevelLoop>
		<PatientHierarchicalLevelLoop>
			<ClaimInformationLoop>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
			</ClaimInformationLoop>
		</PatientHierarchicalLevelLoop>
		<UserHierarchicalLevelLoop>
			<UserNameLoop>
				<UserName LastName="LAST3" FirstName="FIRST3"></UserName>
			</UserNameLoop>
			<PayerNameLoop></PayerNameLoop>
		</UserHierarchicalLevelLoop>
		<PatientHierarchicalLevelLoop>
			<ClaimInformationLoop>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
				<test>a</test>
			</ClaimInformationLoop>
		</PatientHierarchicalLevelLoop>
	</Transaction837P>

And there were many more <Transaction837P> nodes, some with very large 
sets of child nodes.

It seems to me that, with the document structure you have to live with, a 
streaming API is probably the way to go.  Rather than parsing into a DOM tree 
or an XPathDocument, use an XmlTextReader to read the document as a stream, 
processing only the nodes you're interested in.  An XmlTextWriter could 
generate your output stream at the same time.
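
As a rough illustration of the streaming idea, here is a minimal sketch in 
Python rather than .NET.  It is an analog, not the XmlTextReader/XmlTextWriter 
API itself: xml.etree.ElementTree.iterparse stands in for the reader and 
xml.sax.saxutils.XMLGenerator for the writer.  The names stream_claims, 
write_claims and SAMPLE are made up for the example, and it assumes each 
PatientHierarchicalLevelLoop follows the UserHierarchicalLevelLoop it belongs 
to, as in the structure shown above.

```python
import io
import xml.etree.ElementTree as ET
from xml.sax.saxutils import XMLGenerator

# Hypothetical cut-down input with the same shape as the sample document.
SAMPLE = """<file><Transaction837P>
<UserHierarchicalLevelLoop><UserNameLoop>
<UserName LastName="LAST1" FirstName="FIRST1"/></UserNameLoop></UserHierarchicalLevelLoop>
<PatientHierarchicalLevelLoop><ClaimInformationLoop>
<test>a</test><test>a</test></ClaimInformationLoop></PatientHierarchicalLevelLoop>
<UserHierarchicalLevelLoop><UserNameLoop>
<UserName LastName="LAST2" FirstName="FIRST2"/></UserNameLoop></UserHierarchicalLevelLoop>
<PatientHierarchicalLevelLoop><ClaimInformationLoop>
<test>a</test></ClaimInformationLoop></PatientHierarchicalLevelLoop>
</Transaction837P></file>"""


def stream_claims(source):
    """Yield (last_name, first_name, test_values) for each ClaimInformationLoop.

    Assumption: the most recently seen UserName belongs to the claim.
    """
    last = first = None
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "UserName":
            # Remember the name; it applies to the claims that follow.
            last, first = elem.get("LastName"), elem.get("FirstName")
        elif elem.tag == "ClaimInformationLoop":
            yield last, first, [t.text or "" for t in elem.findall("test")]
        elif elem.tag in ("UserHierarchicalLevelLoop", "PatientHierarchicalLevelLoop"):
            elem.clear()  # drop children already processed to keep memory low
        elif elem.tag == "Transaction837P":
            elem.clear()
            last = first = None  # names do not carry across transactions


def write_claims(source, out):
    """Write one CLAIM element per claim to `out` while the input is being read."""
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startElement("root", {})
    for last, first, tests in stream_claims(source):
        gen.startElement("CLAIM", {"LAST_NAME": last or "", "FIRST_NAME": first or ""})
        gen.startElement("test", {})
        gen.characters("".join(tests))
        gen.endElement("test")
        gen.endElement("CLAIM")
    gen.endElement("root")


out = io.StringIO()
write_claims(io.BytesIO(SAMPLE.encode("utf-8")), out)
print(out.getvalue())
```

The point of the design is that neither the input tree nor the output is 
ever held in memory as a whole: each claim is written as soon as it has been 
read, and finished subtrees are cleared.  The root element still keeps the 
emptied Transaction837P shells, so a real implementation on a 100 MB file 
might want to release those as well.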

Regards,
Mike Sharp



