Altova Mailing List Archives
Re: Fw: [xsl] Question on duplicate node elimination
To: xsl-list@-----.------------.---
Date: 8/30/2010 6:54:00 PM
This is a question on "pointers" in XSLT.

The sample ancestor3.xml [1] demonstrates the nodes "//*" and "ids($nodes/ancestor::*)"; this excludes the root node. ancestor4.xml [2] demonstrates "ids($nodes/ancestor::node())" and the nodes "/|//*" (which includes the root node).

This is the modified key definition needed by dupelim4.xsl [3]:

  <xsl:key name="nodes-by-genid" match="/" use="generate-id()"/>
  <xsl:key name="nodes-by-genid" match="node()" use="generate-id()"/>

With this definition each of the seven node types in the XML data model [4] is covered.

Having an id-node-set of the form

  <id>some_id_1</id>
  <id>some_id_2</id>
  ...
  <id>some_id_k</id>

as in [3] makes it possible to (efficiently) "address" the represented nodes in the XML tree with the key() function. Every node-set can be represented by such an id-node-set. Result tree fragments of id-node-sets can be converted to id-node-sets with the exslt:node-set() function, as in [3]. This allows new id-node-sets to be generated iteratively.

I did a quick search for "XSLT pointer" and found hits for pointers in C implementations of XSLT processors or for "XPointer".

Can representing the current node by

  <id><xsl:value-of select="generate-id()"/></id>

in conjunction with "bulk" conversion to the corresponding (real) node-set by

  key('nodes-by-genid',exslt:node-set($nodes)/id)

for an id-node-set $nodes be considered an XSLT "pointer" representation of the current node, as in C?
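The store-and-dereference round trip described above can be sketched as a small standalone stylesheet. This is a minimal sketch and not the dupelim4.xsl from [3]: the third key declaration for "@*" and the text report are assumptions added for illustration. They are included because, in XSLT 1.0, the pattern "node()" matches only nodes on the child axis, so attribute nodes would need their own pattern (and namespace nodes cannot be matched by any pattern). It assumes a processor that supports exslt:node-set(), such as xsltproc.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:exslt="http://exslt.org/common"
  exclude-result-prefixes="exslt">

  <xsl:output method="text"/>

  <!-- key declarations as quoted in the post: root node plus child-axis nodes -->
  <xsl:key name="nodes-by-genid" match="/"      use="generate-id()"/>
  <xsl:key name="nodes-by-genid" match="node()" use="generate-id()"/>
  <!-- assumption, not in the post: attribute nodes need their own pattern -->
  <xsl:key name="nodes-by-genid" match="@*"     use="generate-id()"/>

  <xsl:template match="/">
    <!-- store "pointers": one <id> per ancestor of every <c>, with repeats -->
    <xsl:variable name="aux">
      <xsl:for-each select="//c">
        <xsl:for-each select="ancestor::node()">
          <id><xsl:value-of select="generate-id()"/></id>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:variable>
    <xsl:variable name="ids" select="exslt:node-set($aux)/id"/>

    <!-- "dereference": key() returns a node-set, so repeated ids collapse -->
    <xsl:variable name="targets" select="key('nodes-by-genid', $ids)"/>

    <!-- report how many ids were stored and how many nodes they denote -->
    <xsl:value-of select="concat(count($ids), ' ids stored, ',
                                 count($targets), ' distinct nodes, ',
                                 'root included: ', boolean($targets[not(..)]),
                                 '&#10;')"/>
  </xsl:template>
</xsl:stylesheet>
```

Run against ancestor3.xml, this should report 12 stored ids that dereference to 4 distinct nodes (the root, a, and the two b elements), since each of the four c elements contributes its three ancestors.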
[1] http://stamm-wilbrandt.de/en/xsl-list/ancestor3.xml
[2] http://stamm-wilbrandt.de/en/xsl-list/ancestor4.xml
[3] http://stamm-wilbrandt.de/en/xsl-list/dupelim4.xsl
[4] http://www.w3.org/TR/xpath/#data-model

Best wishes,

Hermann Stamm-Wilbrandt
Developer, XML Compiler, L3
WebSphere DataPower SOA Appliances
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martin Jetter
Management: Dirk Wittkopp
Registered office: Boeblingen
Commercial register: Amtsgericht Stuttgart, HRB 243294


From: Hermann Stamm-Wilbrandt/Germany/IBM@IBMDE
To: xsl-list@l...
Date: 08/27/2010 03:00 PM
Subject: Re: Fw: [xsl] Question on duplicate node elimination

Michael,

> ... Instead, whenever you
> are evaluating an operation that returns a node-set, represent that
> node-set as a string containing the generate-id values of the nodes in
> the node-set, space-separated. Elimination of duplicates then reduces to
> an operation on strings: not trivial, but not especially difficult
> either.

Yesterday's solution [1], based on the id() function, worked well. But I thought about it again, and the single-file solution below, which applies the key() function twice for duplicate elimination, is much better:
* it does not need any separately created structure (like idcopy in [1])
* it is really short, just a few lines (not counting comments)
* it works on ALL major browsers (IE support by David Carlisle's trick [4])

Below are
* execution by xsltproc
* listing of dupelim3.xsl [2]
* listing of ancestor3.xml [3] (open that in browser)

$ xsltproc dupelim3.xsl ancestor3.xml
<html><pre><h2>Duplicate node elimination by applying key() function twice</h2>
See <a href="dupelim3.xsl">dupelim3.xsl</a> for details.

Tested to work with these browsers:
Chrome
Firefox
Internet Explorer
Opera
Safari
(clicking reload shows different ids)

ids(//*)
a       id2619817
+-b     id2619788
! +-c   id2619830
! +-c   id2619802
+-b     id2619245
! +-c   id2619317
! +-c   id2619321
<hr>
ids(//c):
<id>id2619830</id><id>id2619802</id><id>id2619317</id><id>id2619321</id>
<hr>
nodes="ids(//c)"<br>ids($nodes/ancestor::*):
<id>id2619817</id><id>id2619788</id><id>id2619245</id>
</pre></html>
$
$ cat dupelim3.xsl
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:exslt="http://exslt.org/common"
  xmlns:msxsl="urn:schemas-microsoft-com:xslt"
  exclude-result-prefixes="exslt msxsl"
>
  <xsl:output method="html"/>

  <xsl:key name="nodes-by-genid" match="node()" use="generate-id()"/>

  <xsl:template match="/">

    <!-- initial node-set sample, represented by <id> nodes -->
    <xsl:variable name="nodes">
      <xsl:for-each select="//c">
        <id><xsl:value-of select="generate-id()"/></id>
      </xsl:for-each>
    </xsl:variable>

    <!-- do ancestor location step -->
    <xsl:variable name="result">

      <!-- application of "ancestor::*" on $nodes;
           $aux might contain duplicate id nodes -->
      <xsl:variable name="aux">
        <!-- use key() function to determine real nodes -->
        <xsl:for-each select="key('nodes-by-genid',exslt:node-set($nodes)/id)">
          <!-- location step on each real node -->
          <xsl:for-each select="ancestor::*">
            <!-- generate <id>s for new nodes -->
            <id><xsl:value-of select="generate-id()"/></id>
          </xsl:for-each>
        </xsl:for-each>
      </xsl:variable>

      <!-- use key() function for duplicate elimination -->
      <xsl:for-each select="key('nodes-by-genid',exslt:node-set($aux)/id)">
        <!-- generate <id>s, now for unique new nodes -->
        <id><xsl:value-of select="generate-id()"/></id>
      </xsl:for-each>
    </xsl:variable>

<html><pre>
<h2>Duplicate node elimination by applying key() function twice</h2>
See <a href="dupelim3.xsl">dupelim3.xsl</a> for details.

Tested to work with these browsers:
Chrome
Firefox
Internet Explorer
Opera
Safari
(clicking reload shows different ids)

    <!-- node name vs genid output -->
    <xsl:text> ids(//*)</xsl:text>
    <xsl:for-each select="//*">
      <xsl:value-of select=
        "concat('&#10;',substring('! +-',5-2*count(ancestor::*)),name(),
                substring('      ',1+2*count(ancestor::*)),' ',generate-id())"/>
    </xsl:for-each>
    <xsl:text> </xsl:text><hr/><xsl:text> </xsl:text>

    <!-- for verification -->
    <xsl:text>ids(//c): </xsl:text>
    <xsl:copy-of select="$nodes"/>
    <xsl:text> </xsl:text><hr/><xsl:text> </xsl:text>

    <!-- output of result -->
    <xsl:text>nodes="ids(//c)"</xsl:text><br/>
    <xsl:text>ids($nodes/ancestor::*): </xsl:text>
    <xsl:copy-of select="$result"/>
    <xsl:text> </xsl:text>
</pre></html>

  </xsl:template>

  <!-- from http://dpcarlisle.blogspot.com/2007/05/exslt-node-set-function.html -->
  <msxsl:script language="JScript" implements-prefix="exslt">
    this['node-set'] = function (x) {
      return x;
    }
  </msxsl:script>

</xsl:stylesheet>
$
$ cat ancestor3.xml
<?xml-stylesheet href="dupelim3.xsl" type="text/xsl"?>
<a>
  <b>
    <c>1</c>
    <c>2</c>
  </b>
  <b>
    <c>3</c>
    <c>4</c>
  </b>
</a>
$

[1] http://www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/201008/msg00291.html
[2] http://stamm-wilbrandt.de/en/xsl-list/ancestor3.xml
[3] http://stamm-wilbrandt.de/en/xsl-list/dupelim3.xml
[4] http://dpcarlisle.blogspot.com/2007/05/exslt-node-set-function.html

Best wishes,
Hermann Stamm-Wilbrandt


From: Michael Kay <mike@s...>
To: xsl-list@l...
Date: 08/24/2010 02:17 PM
Subject: Re: Fw: [xsl] Question on duplicate node elimination

I haven't understood your logic in any detail, but I wonder if it suggests an alternative approach to the problem: namely, avoid creating RTFs entirely, at least for intermediate results.
Instead, whenever you are evaluating an operation that returns a node-set, represent that node-set as a string containing the generate-id values of the nodes in the node-set, space-separated. Elimination of duplicates then reduces to an operation on strings: not trivial, but not especially difficult either.

Michael Kay
Saxonica

--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe@l...>
--~--
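
Michael Kay's suggestion in this thread, keeping an intermediate node-set as a space-separated string of generate-id() values and eliminating duplicates with string operations, could look roughly like the following XSLT 1.0 sketch. It is one possible reading of that suggestion, not code from any of the posts: the template name "dedup-ids", its parameters, and the final lookup by substring test are hypothetical choices for illustration, and the sketch assumes every id in the list is followed by exactly one space.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- hypothetical helper: copies each id of a space-terminated list
       to the output the first time it occurs and skips later repeats -->
  <xsl:template name="dedup-ids">
    <xsl:param name="ids"/>
    <!-- ids visited so far, delimited by spaces on both sides -->
    <xsl:param name="seen" select="' '"/>
    <xsl:if test="contains($ids, ' ')">
      <xsl:variable name="head" select="substring-before($ids, ' ')"/>
      <xsl:if test="not(contains($seen, concat(' ', $head, ' ')))">
        <xsl:value-of select="concat($head, ' ')"/>
      </xsl:if>
      <!-- recurse on the rest of the list -->
      <xsl:call-template name="dedup-ids">
        <xsl:with-param name="ids"  select="substring-after($ids, ' ')"/>
        <xsl:with-param name="seen" select="concat($seen, $head, ' ')"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <xsl:template match="/">
    <!-- "ancestor::*" applied to every <c>; the id string contains repeats -->
    <xsl:variable name="aux">
      <xsl:for-each select="//c">
        <xsl:for-each select="ancestor::*">
          <xsl:value-of select="concat(generate-id(), ' ')"/>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:variable>

    <!-- duplicate elimination purely on strings -->
    <xsl:variable name="unique">
      <xsl:call-template name="dedup-ids">
        <xsl:with-param name="ids" select="string($aux)"/>
      </xsl:call-template>
    </xsl:variable>

    <!-- map the id string back to real nodes and list them -->
    <xsl:for-each select="//*[contains(concat(' ', $unique),
                                       concat(' ', generate-id(), ' '))]">
      <xsl:value-of select="concat(name(), ' ', generate-id(), '&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The two variables are formally result tree fragments, but they are only ever used as strings, so no node-set() extension is needed. With ancestor3.xml as input the final loop should list a and the two b elements once each. The substring lookup at the end scans every element, so for large documents the key() lookup used elsewhere in the thread is likely the better way to get back to the nodes.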

