Altova Mailing List Archives > microsoft.public.xsl

Re: XSL Improvement
To: NULL
Date: 2/2/2005 11:24:00 AM

Mike,

I appreciate all your feedback and testing on my behalf. I have been using the XPathDocument class in conjunction with XslTransform under .NET for my testing. However, I ported my code to MSXML late yesterday and have been suitably impressed. It's probably not news to most of the folks around here, but I was surprised by how much MSXML outclassed .NET performance-wise. I hear that's likely to change in the next major release of the .NET Framework, but for the time being I'll keep my code in MSXML.

I'll give your approaches a try and see if I can shave some additional time off my transform. I hadn't considered the XmlReader/XmlWriter approach previously; it sounds promising.

Thanks for all your assistance.

Steve

rdcpro wrote:
> Well, I played around with some large example files using xsl:keys and a
> variety of techniques. Pretty much everything was slower than using
> preceding-sibling, due to the overhead involved in indexing the data. But
> one approach (the alternative XPath posted below) shaved a few hundred
> milliseconds off the 1000 or so that it otherwise took to execute. The
> biggest factor in my smaller file was parsing, but as the file approached
> 100 MB, everything slowed up.
>
> Also, the numbers below are for MSXML 4.0, not .NET 1.1. This is on decent
> hardware (3 GHz P4, 2 GB RAM).
>
> Using your approach on a 28 MB file:
>
> Source document load time: 7093 milliseconds
> Stylesheet document load time: .628 milliseconds
> Stylesheet compile time: .472 milliseconds
> Stylesheet execution time: 1063 milliseconds
>
> Using an xsl:key to index the transaction nodes:
>
> Source document load time: 7300 milliseconds
> Stylesheet document load time: .705 milliseconds
> Stylesheet compile time: .575 milliseconds
> Stylesheet execution time: 1449 milliseconds
>
> Using a slightly different XPath (posted below):
>
> Source document load time: 7300 milliseconds
> Stylesheet document load time: .679 milliseconds
> Stylesheet compile time: .515 milliseconds
> Stylesheet execution time: 867.9 milliseconds
>
> So, by using a different XPath to get to the same nodes, I can shave about
> 200 ms from a 1063 ms execution time. With a 53 MB document, the times are:
>
> Your approach:
> Source document load time: 20877 milliseconds
> Stylesheet document load time: .739 milliseconds
> Stylesheet compile time: .614 milliseconds
> Stylesheet execution time: 2063 milliseconds
>
> New XPath:
> Source document load time: 21141 milliseconds
> Stylesheet document load time: .683 milliseconds
> Stylesheet compile time: .649 milliseconds
> Stylesheet execution time: 1646 milliseconds
>
> Parse time nearly tripled, but execution time doubled.
>
> On 106 MB:
>
> Your XPath:
> Source document load time: 68023 milliseconds
> Stylesheet document load time: .662 milliseconds
> Stylesheet compile time: .495 milliseconds
> Stylesheet execution time: 4058 milliseconds
>
> New XPath:
> Source document load time: 67808 milliseconds
> Stylesheet document load time: .705 milliseconds
> Stylesheet compile time: .530 milliseconds
> Stylesheet execution time: 3275 milliseconds
>
> So parse time tripled again, but execution time only doubled.
>
> Now, in .NET, you obviously have the choice of an XmlDocument or
> XPathDocument, and the XPathDocument should greatly improve things.
> I don't know which you're using, though. It seems like 7 minutes is a long
> time for a 60 MB document...especially without using the preceding-sibling
> axis.
>
> Here are the various approaches I tried. It would be interesting to see how
> an XslTransform on an XPathDocument would work using the xsl:key approach.
>
> Using xsl:key:
>
> <?xml version="1.0"?>
> <xsl:transform version="1.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>     xmlns:msxsl="urn:schemas-microsoft-com:xslt">
>   <xsl:output method="xml" version="1.0" indent="yes"
>       omit-xml-declaration="yes" encoding="utf-8"/>
>   <xsl:key name="kTrans" match="Transaction837P" use="generate-id()"/>
>   <xsl:template match="/">
>     <xsl:element name="root">
>       <xsl:for-each select="file/Transaction837P">
>         <xsl:variable name="vId" select="generate-id()"/>
>         <xsl:for-each select="PatientHierarchicalLevelLoop/ClaimInformationLoop">
>           <xsl:variable name="vClaimPosition" select="position()"/>
>           <xsl:element name="CLAIM">
>             <xsl:attribute name="LAST_NAME"><xsl:value-of select="key('kTrans', $vId)/UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@LastName"/></xsl:attribute>
>             <!-- <xsl:value-of select="../preceding-sibling::UserHierarchicalLevelLoop[1]/UserNameLoop/UserName/@LastName"/> -->
>             <xsl:attribute name="FIRST_NAME"><xsl:value-of select="key('kTrans', $vId)/UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@FirstName"/></xsl:attribute>
>             <!-- <xsl:value-of select="../preceding-sibling::UserHierarchicalLevelLoop[1]/UserNameLoop/UserName/@FirstName"/> -->
>             <xsl:element name="test">
>               <xsl:for-each select="test"><xsl:value-of select="."/></xsl:for-each>
>             </xsl:element>
>           </xsl:element>
>         </xsl:for-each>
>       </xsl:for-each>
>     </xsl:element>
>   </xsl:template>
> </xsl:transform>
>
>
> Using a different XPath:
>
> <?xml version="1.0"?>
> <xsl:transform version="1.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>     xmlns:msxsl="urn:schemas-microsoft-com:xslt">
>   <xsl:output method="xml" version="1.0" indent="yes"
>       omit-xml-declaration="yes" encoding="utf-8"/>
>   <xsl:template match="/">
>     <xsl:element name="root">
>       <xsl:for-each select="file/Transaction837P">
>         <xsl:for-each select="PatientHierarchicalLevelLoop/ClaimInformationLoop">
>           <xsl:variable name="vClaimPosition" select="position()"/>
>           <xsl:element name="CLAIM">
>             <xsl:attribute name="LAST_NAME"><xsl:value-of select="../../UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@LastName"/></xsl:attribute>
>             <xsl:attribute name="FIRST_NAME"><xsl:value-of select="../../UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@FirstName"/></xsl:attribute>
>             <xsl:element name="test">
>               <xsl:for-each select="test"><xsl:value-of select="."/></xsl:for-each>
>             </xsl:element>
>           </xsl:element>
>         </xsl:for-each>
>       </xsl:for-each>
>     </xsl:element>
>   </xsl:template>
> </xsl:transform>
>
> This was on an XML document structured like:
>
> <file>
>   <Transaction837P>
>     <SubmitterNameLoop></SubmitterNameLoop>
>     <ReceiverNameLoop></ReceiverNameLoop>
>     <Billing_Pay_toProviderHierarchicalLevelLoop></Billing_Pay_toProviderHierarchicalLevelLoop>
>     <UserHierarchicalLevelLoop>
>       <UserNameLoop>
>         <UserName LastName="LAST1" FirstName="FIRST1"></UserName>
>       </UserNameLoop>
>       <PayerNameLoop></PayerNameLoop>
>     </UserHierarchicalLevelLoop>
>     <PatientHierarchicalLevelLoop>
>       <ClaimInformationLoop>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>       </ClaimInformationLoop>
>     </PatientHierarchicalLevelLoop>
>     <UserHierarchicalLevelLoop>
>       <UserNameLoop>
>         <UserName LastName="LAST2" FirstName="FIRST2"></UserName>
>       </UserNameLoop>
>       <PayerNameLoop></PayerNameLoop>
>     </UserHierarchicalLevelLoop>
>     <PatientHierarchicalLevelLoop>
>       <ClaimInformationLoop>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>       </ClaimInformationLoop>
>     </PatientHierarchicalLevelLoop>
>     <UserHierarchicalLevelLoop>
>       <UserNameLoop>
>         <UserName LastName="LAST3" FirstName="FIRST3"></UserName>
>       </UserNameLoop>
>       <PayerNameLoop></PayerNameLoop>
>     </UserHierarchicalLevelLoop>
>     <PatientHierarchicalLevelLoop>
>       <ClaimInformationLoop>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>         <test>a</test>
>       </ClaimInformationLoop>
>     </PatientHierarchicalLevelLoop>
>   </Transaction837P>
>
> And there were many, many more <Transaction837P> nodes, some with very large
> sets of child nodes.
>
> It seems to me that, with the document structure you have to live with, a
> streaming API is probably the way to go. Rather than parse into a DOM tree
> or an XPathDocument, use an XmlTextReader and parse the document as a stream,
> processing only the nodes you're interested in. An XmlTextWriter could
> generate your output stream at the same time.
>
> Regards,
> Mike Sharp
