Re: XSL Improvement
Date: 2/2/2005 2:57:00 PM

Late last year I saw a couple of talks by a Microsoft PM, and he was emphatic about major XML performance improvements in .NET 2.0 (apparently XSL processing can be up to 400% faster). If you can get your hands on the 2.0 Framework beta, you could try running your old code against it to see if that helps. I don't know whether you can expect MSXML-like performance, though, so if you have that working you might want to stick with it.

<steveg@p...> wrote in message news:1107372220.034627.81910@o......
> Mike,
>
> I appreciate all your feedback and testing on my behalf.
>
> I have been using the XPathDocument class in conjunction with
> XslTransform under .NET for my testing. However, I ported my code to
> MSXML late yesterday and have been suitably impressed. It's probably
> not news to most of the folks around here, but I was surprised by how
> much MSXML outclassed .NET performance-wise. I hear that's likely to
> change in the next major release of the .NET Framework, but for the
> time being I'll keep my code in MSXML.
>
> I'll give your approaches a try and see if I can shave some additional
> time off my transform. I hadn't considered the XmlReader/XmlWriter
> approach previously; it sounds promising. Thanks for all your
> assistance.
>
> Steve
>
>
> rdcpro wrote:
> > Well, I played around with some large example files using xsl:keys and a
> > variety of techniques. Pretty much everything was slower than using
> > preceding-sibling, due to the overhead involved in indexing the data. But
> > this approach shaved a few hundred milliseconds off the 1000 or so that it
> > otherwise took to execute. The biggest factor in my smaller file was
> > parsing, but as the file approached 100 MB, everything slowed down.
> >
> > Also, the numbers below are for MSXML 4.0, not .NET 1.1. This is on decent
> > hardware (3 GHz P4, 2 GB RAM).
> >
> > Using your approach on a 28 MB file:
> >
> > Source document load time:     7093 milliseconds
> > Stylesheet document load time: .628 milliseconds
> > Stylesheet compile time:       .472 milliseconds
> > Stylesheet execution time:     1063 milliseconds
> >
> > Using an xsl:key to index the transaction nodes:
> >
> > Source document load time:     7300 milliseconds
> > Stylesheet document load time: .705 milliseconds
> > Stylesheet compile time:       .575 milliseconds
> > Stylesheet execution time:     1449 milliseconds
> >
> > Using a slightly different XPath (posted below):
> >
> > Source document load time:     7300 milliseconds
> > Stylesheet document load time: .679 milliseconds
> > Stylesheet compile time:       .515 milliseconds
> > Stylesheet execution time:     867.9 milliseconds
> >
> > So, by using a different XPath to get to the same nodes, I can shave about
> > 200 ms from a 1063 ms execution time. With a 53 MB document, the times are:
> >
> > Your approach:
> > Source document load time:     20877 milliseconds
> > Stylesheet document load time: .739 milliseconds
> > Stylesheet compile time:       .614 milliseconds
> > Stylesheet execution time:     2063 milliseconds
> >
> > New XPath:
> > Source document load time:     21141 milliseconds
> > Stylesheet document load time: .683 milliseconds
> > Stylesheet compile time:       .649 milliseconds
> > Stylesheet execution time:     1646 milliseconds
> >
> > Parse time nearly tripled, but execution time only doubled.
> >
> > On 106 MB:
> >
> > Your XPath:
> > Source document load time:     68023 milliseconds
> > Stylesheet document load time: .662 milliseconds
> > Stylesheet compile time:       .495 milliseconds
> > Stylesheet execution time:     4058 milliseconds
> >
> > New XPath:
> > Source document load time:     67808 milliseconds
> > Stylesheet document load time: .705 milliseconds
> > Stylesheet compile time:       .530 milliseconds
> > Stylesheet execution time:     3275 milliseconds
> >
> > So parse time tripled again, but execution time only doubled.
> >
> > Now, in .NET, you obviously have the choice of an XmlDocument or an
> > XPathDocument, and the XPathDocument should greatly improve things. I don't
> > know which you're using, though. It seems like 7 minutes is a long time for
> > a 60 MB document...especially without using the preceding-sibling axis.
> >
> > Here are the various approaches I tried. It would be interesting to see how
> > an XslTransform on an XPathDocument would work using the xsl:key approach.
> >
> > Using xsl:key:
> >
> > <?xml version="1.0"?>
> > <xsl:transform version="1.0"
> >     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> >     xmlns:msxsl="urn:schemas-microsoft-com:xslt">
> >   <xsl:output method="xml" version="1.0" indent="yes"
> >       omit-xml-declaration="yes" encoding="utf-8"/>
> >   <xsl:key name="kTrans" match="Transaction837P" use="generate-id()"/>
> >   <xsl:template match="/">
> >     <xsl:element name="root">
> >       <xsl:for-each select="file/Transaction837P">
> >         <xsl:variable name="vId" select="generate-id()"/>
> >         <xsl:for-each select="PatientHierarchicalLevelLoop/ClaimInformationLoop">
> >           <xsl:variable name="vClaimPosition" select="position()"/>
> >           <xsl:element name="CLAIM">
> >             <xsl:attribute name="LAST_NAME"><xsl:value-of select="key('kTrans', $vId)/UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@LastName"/></xsl:attribute>
> >             <!-- <xsl:value-of select="../preceding-sibling::UserHierarchicalLevelLoop[1]/UserNameLoop/UserName/@LastName"/> -->
> >             <xsl:attribute name="FIRST_NAME"><xsl:value-of select="key('kTrans', $vId)/UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@FirstName"/></xsl:attribute>
> >             <!-- <xsl:value-of select="../preceding-sibling::UserHierarchicalLevelLoop[1]/UserNameLoop/UserName/@FirstName"/> -->
> >             <xsl:element name="test">
> >               <xsl:for-each select="test"><xsl:value-of select="."/></xsl:for-each>
> >             </xsl:element>
> >           </xsl:element>
> >         </xsl:for-each>
> >       </xsl:for-each>
> >     </xsl:element>
> >   </xsl:template>
> > </xsl:transform>
> >
> >
> > Using a different XPath:
> >
> > <?xml version="1.0"?>
> > <xsl:transform version="1.0"
> >     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> >     xmlns:msxsl="urn:schemas-microsoft-com:xslt">
> >   <xsl:output method="xml" version="1.0" indent="yes"
> >       omit-xml-declaration="yes" encoding="utf-8"/>
> >   <xsl:template match="/">
> >     <xsl:element name="root">
> >       <xsl:for-each select="file/Transaction837P">
> >         <xsl:for-each select="PatientHierarchicalLevelLoop/ClaimInformationLoop">
> >           <xsl:variable name="vClaimPosition" select="position()"/>
> >           <xsl:element name="CLAIM">
> >             <xsl:attribute name="LAST_NAME"><xsl:value-of select="../../UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@LastName"/></xsl:attribute>
> >             <xsl:attribute name="FIRST_NAME"><xsl:value-of select="../../UserHierarchicalLevelLoop[$vClaimPosition]/UserNameLoop/UserName/@FirstName"/></xsl:attribute>
> >             <xsl:element name="test">
> >               <xsl:for-each select="test"><xsl:value-of select="."/></xsl:for-each>
> >             </xsl:element>
> >           </xsl:element>
> >         </xsl:for-each>
> >       </xsl:for-each>
> >     </xsl:element>
> >   </xsl:template>
> > </xsl:transform>
> >
> > This was on an XML document structured like:
> >
> > <file>
> >   <Transaction837P>
> >     <SubmitterNameLoop></SubmitterNameLoop>
> >     <ReceiverNameLoop></ReceiverNameLoop>
> >     <Billing_Pay_toProviderHierarchicalLevelLoop></Billing_Pay_toProviderHierarchicalLevelLoop>
> >     <UserHierarchicalLevelLoop>
> >       <UserNameLoop>
> >         <UserName LastName="LAST1" FirstName="FIRST1"></UserName>
> >       </UserNameLoop>
> >       <PayerNameLoop></PayerNameLoop>
> >     </UserHierarchicalLevelLoop>
> >     <PatientHierarchicalLevelLoop>
> >       <ClaimInformationLoop>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >       </ClaimInformationLoop>
> >     </PatientHierarchicalLevelLoop>
> >     <UserHierarchicalLevelLoop>
> >       <UserNameLoop>
> >         <UserName LastName="LAST2" FirstName="FIRST2"></UserName>
> >       </UserNameLoop>
> >       <PayerNameLoop></PayerNameLoop>
> >     </UserHierarchicalLevelLoop>
> >     <PatientHierarchicalLevelLoop>
> >       <ClaimInformationLoop>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >       </ClaimInformationLoop>
> >     </PatientHierarchicalLevelLoop>
> >     <UserHierarchicalLevelLoop>
> >       <UserNameLoop>
> >         <UserName LastName="LAST3" FirstName="FIRST3"></UserName>
> >       </UserNameLoop>
> >       <PayerNameLoop></PayerNameLoop>
> >     </UserHierarchicalLevelLoop>
> >     <PatientHierarchicalLevelLoop>
> >       <ClaimInformationLoop>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >         <test>a</test>
> >       </ClaimInformationLoop>
> >     </PatientHierarchicalLevelLoop>
> >   </Transaction837P>
> >
> > And there were many, many more <Transaction837P> nodes, some with very large
> > sets of child nodes.
> >
> > It seems to me that, with the document structure you have to live with, a
> > streaming API is probably the way to go. Rather than parse into a DOM tree
> > or an XPathDocument, use an XmlTextReader and parse the document in a stream,
> > processing the nodes you're interested in. An XmlTextWriter at the same time
> > could generate your output stream.
> >
> > Regards,
> > Mike Sharp
