Altova Mailing List Archives


Re: XSL Improvement

From: steveg@---------.---
To: NULL
Date: 2/2/2005 11:24:00 AM
Mike,

I appreciate all your feedback and testing on my behalf.

I have been using the XPathDocument class in conjunction with
XslTransform under .NET for my testing.  However, I ported my code to
MSXML late yesterday and have been suitably impressed.  It's probably
not news to most of the folks around here, but I was surprised by how
much MSXML outclassed .NET performance-wise.  I hear that's likely to
change in the next major release of the .NET Framework, but for the time
being I'll keep my code in MSXML.

I'll give your approaches a try and see if I can shave some additional
time off my transform.  I hadn't considered the XMLReader/XMLWriter
approach previously; it sounds promising.  Thanks for all your
assistance.

Steve


rdcpro wrote:
> Well, I played around with some large example files using xsl:key and a
> variety of techniques. Pretty much everything was slower than using
> preceding-sibling, due to the overhead involved in indexing the data. But
> this approach shaved a few hundred milliseconds off the 1000 or so that it
> otherwise took to execute.  The biggest factor in my smaller file was
> parsing, but as the file approached 100 MB, everything slowed up.
>
> Also, the numbers below are for MSXML 4.0, not .NET 1.1.  This is on decent
> hardware (3 GHz P4, 2 GB RAM).
>
> Using your approach on a 28 MB file:
>
> Source document load time:     7093 milliseconds
> Stylesheet document load time: .628 milliseconds
> Stylesheet compile time:       .472 milliseconds
> Stylesheet execution time:     1063 milliseconds
>
> Using an xsl:key to index the transaction nodes:
>
> Source document load time:     7300 milliseconds
> Stylesheet document load time: .705 milliseconds
> Stylesheet compile time:       .575 milliseconds
> Stylesheet execution time:     1449 milliseconds
>
> Using a slightly different XPath (posted below):
>
> Source document load time:     7300 milliseconds
> Stylesheet document load time: .679 milliseconds
> Stylesheet compile time:       .515 milliseconds
> Stylesheet execution time:     867.9 milliseconds
>
> So, by using a different XPath to get to the same nodes, I can shave about
> 200 ms from a 1063 ms execution time.  With a 53 MB document, the times are:
>
> Your Approach:
> Source document load time:     20877 milliseconds
> Stylesheet document load time: .739 milliseconds
> Stylesheet compile time:       .614 milliseconds
> Stylesheet execution time:     2063 milliseconds
>
> New XPath:
> Source document load time:     21141 milliseconds
> Stylesheet document load time: .683 milliseconds
> Stylesheet compile time:       .649 milliseconds
> Stylesheet execution time:     1646 milliseconds
>
> Parse time nearly tripled, but execution time doubled.
>
> On 106 MB:
>
> Your XPath:
> Source document load time:     68023 milliseconds
> Stylesheet document load time: .662 milliseconds
> Stylesheet compile time:       .495 milliseconds
> Stylesheet execution time:     4058 milliseconds
>
> New XPath:
> Source document load time:     67808 milliseconds
> Stylesheet document load time: .705 milliseconds
> Stylesheet compile time:       .530 milliseconds
> Stylesheet execution time:     3275 milliseconds
>
> So parse time tripled again, but execution time only doubled.
>
> Now, in .NET, you obviously have the choice of an XmlDocument or
> XPathDocument, and the XPathDocument should greatly improve things.  I don't
> know which you're using, though.  It seems like 7 minutes is a long time for
> a 60 MB document...especially without using the preceding-sibling axis.
>
> Here are the various approaches I tried.  It would be interesting to see how
> an XslTransform on an XPathDocument would work using the xsl:key approach.
>
> Using xsl:key:
>
> <?xml version="1.0"?>
> <xsl:transform version="1.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> xmlns:msxsl="urn:schemas-microsoft-com:xslt">
> 	<xsl:output method="xml" version="1.0" indent="yes"
> omit-xml-declaration="yes" encoding="utf-8"/>
> 	<xsl:key name="kTrans" match="Transaction837P" use="generate-id()"/>
> 	<xsl:template match="/">
> 		<xsl:element name="root">
> 			<xsl:for-each select="file/Transaction837P">
> 				<xsl:variable name="vId" select="generate-id()"/>
> 				<xsl:for-each select="PatientHierarchicalLevelLoop/ClaimInformationLoop">
> 					<xsl:variable name="vClaimPosition" select="position()"/>
> 					<xsl:element name="CLAIM">
> 						<xsl:attribute name="LAST_NAME"><xsl:value-of select="key('kTrans', $vId)/UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@LastName"/></xsl:attribute>
> 						<!-- <xsl:value-of select="../preceding-sibling::UserHierarchicalLevelLoop[1]/UserNameLoop/UserName/@LastName"/> -->
> 						<xsl:attribute name="FIRST_NAME"><xsl:value-of select="key('kTrans', $vId)/UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@FirstName"/></xsl:attribute>
> 						<!-- <xsl:value-of select="../preceding-sibling::UserHierarchicalLevelLoop[1]/UserNameLoop/UserName/@FirstName"/> -->
> 						<xsl:element name="test">
> 							<xsl:for-each select="test"><xsl:value-of select="."/></xsl:for-each>
> 						</xsl:element>
> 					</xsl:element>
> 				</xsl:for-each>
> 			</xsl:for-each>
> 		</xsl:element>
> 	</xsl:template>
> </xsl:transform>
>
>
> Using a different XPath:
>
> <?xml version="1.0"?>
> <xsl:transform version="1.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> xmlns:msxsl="urn:schemas-microsoft-com:xslt">
> 	<xsl:output method="xml" version="1.0" indent="yes"
> omit-xml-declaration="yes" encoding="utf-8"/>
> 	<xsl:template match="/">
> 		<xsl:element name="root">
> 			<xsl:for-each select="file/Transaction837P">
> 				<xsl:for-each select="PatientHierarchicalLevelLoop/ClaimInformationLoop">
> 					<xsl:variable name="vClaimPosition" select="position()"/>
> 					<xsl:element name="CLAIM">
> 						<xsl:attribute name="LAST_NAME"><xsl:value-of select="../../UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@LastName"/></xsl:attribute>
> 						<xsl:attribute name="FIRST_NAME"><xsl:value-of select="../../UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@FirstName"/></xsl:attribute>
> 						<xsl:element name="test">
> 							<xsl:for-each select="test"><xsl:value-of select="."/></xsl:for-each>
> 						</xsl:element>
> 					</xsl:element>
> 				</xsl:for-each>
> 			</xsl:for-each>
> 		</xsl:element>
> 	</xsl:template>
> </xsl:transform>
>
> This was on an Xml Document structured like:
>
> <file>
> 	<Transaction837P>
> 		<SubmitterNameLoop></SubmitterNameLoop>
> 		<ReceiverNameLoop></ReceiverNameLoop>
> 		<Billing_Pay_toProviderHierarchicalLevelLoop></Billing_Pay_toProviderHierarchicalLevelLoop>
> 		<UserHierarchicalLevelLoop>
> 			<UserNameLoop>
> 				<UserName LastName="LAST1" FirstName="FIRST1"></UserName>
> 			</UserNameLoop>
> 			<PayerNameLoop></PayerNameLoop>
> 		</UserHierarchicalLevelLoop>
> 		<PatientHierarchicalLevelLoop>
> 			<ClaimInformationLoop>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 			</ClaimInformationLoop>
> 		</PatientHierarchicalLevelLoop>
> 		<UserHierarchicalLevelLoop>
> 			<UserNameLoop>
> 				<UserName LastName="LAST2" FirstName="FIRST2"></UserName>
> 			</UserNameLoop>
> 			<PayerNameLoop></PayerNameLoop>
> 		</UserHierarchicalLevelLoop>
> 		<PatientHierarchicalLevelLoop>
> 			<ClaimInformationLoop>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 			</ClaimInformationLoop>
> 		</PatientHierarchicalLevelLoop>
> 		<UserHierarchicalLevelLoop>
> 			<UserNameLoop>
> 				<UserName LastName="LAST3" FirstName="FIRST3"></UserName>
> 			</UserNameLoop>
> 			<PayerNameLoop></PayerNameLoop>
> 		</UserHierarchicalLevelLoop>
> 		<PatientHierarchicalLevelLoop>
> 			<ClaimInformationLoop>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 				<test>a</test>
> 			</ClaimInformationLoop>
> 		</PatientHierarchicalLevelLoop>
> 	</Transaction837P>
>
> And there were many, many more <Transaction837P> nodes, some with very large
> sets of child nodes.
>
> It seems to me with the document structure you have to live with, that a
> streaming API is probably the way to go.  Rather than parse into a DOM tree
> or an XPathDocument, use an XmlTextReader and parse the document in a stream,
> processing the nodes you're interested in.  An XmlTextWriter at the same time
> could generate your output stream.
> 
> Regards,
> Mike Sharp


