Re: XSL Improvement
To: NULL
Date: 2/1/2005 5:41:00 PM

Well, I played around with some large example files using xsl:key and a variety of other techniques. Pretty much everything was slower than using preceding-sibling, due to the overhead involved in indexing the data. But one approach, a slightly different XPath (posted below), shaved a few hundred milliseconds off the 1000 or so that it otherwise took to execute. The biggest factor with my smaller file was parsing, but as the file approached 100 MB, everything slowed down. Also, the numbers below are for MSXML 4.0, not .NET 1.1. This is on decent hardware (3 GHz P4, 2 GB RAM).

Using your approach on a 28 MB file:
  Source document load time:      7093 milliseconds
  Stylesheet document load time:  0.628 milliseconds
  Stylesheet compile time:        0.472 milliseconds
  Stylesheet execution time:      1063 milliseconds

Using an xsl:key to index the transaction nodes:
  Source document load time:      7300 milliseconds
  Stylesheet document load time:  0.705 milliseconds
  Stylesheet compile time:        0.575 milliseconds
  Stylesheet execution time:      1449 milliseconds

Using a slightly different XPath (posted below):
  Source document load time:      7300 milliseconds
  Stylesheet document load time:  0.679 milliseconds
  Stylesheet compile time:        0.515 milliseconds
  Stylesheet execution time:      867.9 milliseconds

So, by using a different XPath to get to the same nodes, I can shave about 200 ms from a 1063 ms execution time.
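To put the 28 MB execution times side by side, here is a small sketch in Python (not part of the MSXML test harness; the labels "original XPath" and "new XPath" and the function name are just illustrative). It only restates the three execution times above as differences from the original approach.

```python
# Stylesheet execution times (ms) for the 28 MB file, copied from the figures above.
# The dictionary keys are illustrative labels, not names used by MSXML.
execution_ms = {
    "original XPath": 1063.0,
    "xsl:key": 1449.0,
    "new XPath": 867.9,
}


def relative_to_baseline(times, baseline="original XPath"):
    """Return {name: (delta_ms, delta_percent)} relative to the baseline entry."""
    base = times[baseline]
    return {
        name: (ms - base, 100.0 * (ms - base) / base)
        for name, ms in times.items()
    }


if __name__ == "__main__":
    for name, (delta, percent) in relative_to_baseline(execution_ms).items():
        print(f"{name:15s} {delta:+8.1f} ms ({percent:+.1f}%)")
```

A negative number means faster than the original approach, a positive one slower.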
With a 53 MB document, the times are:

Your approach:
  Source document load time:      20877 milliseconds
  Stylesheet document load time:  0.739 milliseconds
  Stylesheet compile time:        0.614 milliseconds
  Stylesheet execution time:      2063 milliseconds

New XPath:
  Source document load time:      21141 milliseconds
  Stylesheet document load time:  0.683 milliseconds
  Stylesheet compile time:        0.649 milliseconds
  Stylesheet execution time:      1646 milliseconds

Compared with the 28 MB file, parse time nearly tripled, but execution time only about doubled.

On 106 MB:

Your XPath:
  Source document load time:      68023 milliseconds
  Stylesheet document load time:  0.662 milliseconds
  Stylesheet compile time:        0.495 milliseconds
  Stylesheet execution time:      4058 milliseconds

New XPath:
  Source document load time:      67808 milliseconds
  Stylesheet document load time:  0.705 milliseconds
  Stylesheet compile time:        0.530 milliseconds
  Stylesheet execution time:      3275 milliseconds

So parse time tripled again, but execution time only doubled.

Now, in .NET, you obviously have the choice of an XmlDocument or an XPathDocument, and the XPathDocument should greatly improve things. I don't know which you're using, though. It seems like 7 minutes is a long time for a 60 MB document... especially without using the preceding-sibling axis.

Here are the various approaches I tried. It would be interesting to see how an XslTransform on an XPathDocument would work using the xsl:key approach.
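As a quick check on the "tripled" and "doubled" remarks, the following Python sketch computes the growth factors between file sizes from the load and execution times reported for the original XPath. The function name and data layout are illustrative only.

```python
# Timings in milliseconds for the original XPath, keyed by file size in MB.
# The values are copied from the figures reported in this post.
timings = {
    28: {"load": 7093, "exec": 1063},
    53: {"load": 20877, "exec": 2063},
    106: {"load": 68023, "exec": 4058},
}


def growth(table):
    """Return (small, large, size_factor, load_factor, exec_factor) per size step."""
    sizes = sorted(table)
    rows = []
    for small, large in zip(sizes, sizes[1:]):
        rows.append((
            small,
            large,
            large / small,
            table[large]["load"] / table[small]["load"],
            table[large]["exec"] / table[small]["exec"],
        ))
    return rows


if __name__ == "__main__":
    for small, large, size_x, load_x, exec_x in growth(timings):
        print(f"{small} -> {large} MB: size x{size_x:.2f}, "
              f"load x{load_x:.2f}, exec x{exec_x:.2f}")
```

In both steps the file size roughly doubles; the load factor comes out near 3 and the execution factor just under 2, so on these numbers parsing grows faster than linearly with file size while execution stays roughly linear.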
Using xsl:key:

<?xml version="1.0"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt">
  <xsl:output method="xml" version="1.0" indent="yes" omit-xml-declaration="yes" encoding="utf-8"/>
  <xsl:key name="kTrans" match="Transaction837P" use="generate-id()"/>
  <xsl:template match="/">
    <xsl:element name="root">
      <xsl:for-each select="file/Transaction837P">
        <xsl:variable name="vId" select="generate-id()"/>
        <xsl:for-each select="PatientHierarchicalLevelLoop/ClaimInformationLoop">
          <xsl:variable name="vClaimPosition" select="position()"/>
          <xsl:element name="CLAIM">
            <xsl:attribute name="LAST_NAME"><xsl:value-of select="key('kTrans', $vId)/UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@LastName"/></xsl:attribute>
            <!-- <xsl:value-of select="../preceding-sibling::UserHierarchicalLevelLoop[1]/UserNameLoop/UserName/@LastName"/> -->
            <xsl:attribute name="FIRST_NAME"><xsl:value-of select="key('kTrans', $vId)/UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@FirstName"/></xsl:attribute>
            <!-- <xsl:value-of select="../preceding-sibling::UserHierarchicalLevelLoop[1]/UserNameLoop/UserName/@FirstName"/> -->
            <xsl:element name="test">
              <xsl:for-each select="test"><xsl:value-of select="."/></xsl:for-each>
            </xsl:element>
          </xsl:element>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:element>
  </xsl:template>
</xsl:transform>

Using a different XPath:

<?xml version="1.0"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt">
  <xsl:output method="xml" version="1.0" indent="yes" omit-xml-declaration="yes" encoding="utf-8"/>
  <xsl:template match="/">
    <xsl:element name="root">
      <xsl:for-each select="file/Transaction837P">
        <xsl:for-each select="PatientHierarchicalLevelLoop/ClaimInformationLoop">
          <xsl:variable name="vClaimPosition" select="position()"/>
          <xsl:element name="CLAIM">
            <xsl:attribute name="LAST_NAME"><xsl:value-of select="../../UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@LastName"/></xsl:attribute>
            <xsl:attribute name="FIRST_NAME"><xsl:value-of select="../../UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@FirstName"/></xsl:attribute>
            <xsl:element name="test">
              <xsl:for-each select="test"><xsl:value-of select="."/></xsl:for-each>
            </xsl:element>
          </xsl:element>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:element>
  </xsl:template>
</xsl:transform>

This was on an XML document structured like:

<file>
  <Transaction837P>
    <SubmitterNameLoop></SubmitterNameLoop>
    <ReceiverNameLoop></ReceiverNameLoop>
    <Billing_Pay_toProviderHierarchicalLevelLoop></Billing_Pay_toProviderHierarchicalLevelLoop>
    <UserHierarchicalLevelLoop>
      <UserNameLoop>
        <UserName LastName="LAST1" FirstName="FIRST1"></UserName>
      </UserNameLoop>
      <PayerNameLoop></PayerNameLoop>
    </UserHierarchicalLevelLoop>
    <PatientHierarchicalLevelLoop>
      <ClaimInformationLoop>
        <test>a</test> <test>a</test> <test>a</test> <test>a</test>
        <test>a</test> <test>a</test> <test>a</test> <test>a</test>
      </ClaimInformationLoop>
    </PatientHierarchicalLevelLoop>
    <UserHierarchicalLevelLoop>
      <UserNameLoop>
        <UserName LastName="LAST2" FirstName="FIRST2"></UserName>
      </UserNameLoop>
      <PayerNameLoop></PayerNameLoop>
    </UserHierarchicalLevelLoop>
    <PatientHierarchicalLevelLoop>
      <ClaimInformationLoop>
        <test>a</test> <test>a</test> <test>a</test> <test>a</test>
        <test>a</test> <test>a</test> <test>a</test> <test>a</test>
        <test>a</test> <test>a</test> <test>a</test> <test>a</test>
        <test>a</test>
      </ClaimInformationLoop>
    </PatientHierarchicalLevelLoop>
    <UserHierarchicalLevelLoop>
      <UserNameLoop>
        <UserName LastName="LAST3" FirstName="FIRST3"></UserName>
      </UserNameLoop>
      <PayerNameLoop></PayerNameLoop>
    </UserHierarchicalLevelLoop>
    <PatientHierarchicalLevelLoop>
      <ClaimInformationLoop>
        <test>a</test> <test>a</test> <test>a</test> <test>a</test>
        <test>a</test> <test>a</test>
      </ClaimInformationLoop>
    </PatientHierarchicalLevelLoop>
  </Transaction837P>
And there were many, many more <Transaction837P> nodes, some with very large sets of child nodes.

It seems to me that, with the document structure you have to live with, a streaming API is probably the way to go. Rather than parsing into a DOM tree or an XPathDocument, use an XmlTextReader to read the document as a stream, processing only the nodes you're interested in. An XmlTextWriter could generate your output stream at the same time.

Regards,
Mike Sharp
