Altova Mailing List Archives


Re: [xsl] Collapsing run-on tag chains not working in saxon or xalan

From: "M. David Peterson" <m.david@---------->
To:
Date: 11/1/2004 7:32:00 PM
Richard,



Unfortunately I don't have time to look at your problem, although I wish 
I did, as it sounds like an interesting one.  One thing that struck me, 
though, is that you used MSXML as the basis for your standards 
compliance and treated Xalan and Saxon as non-compliant with what you 
expect as your output.  I would in no way argue with the logic of Xalan 
(Xalan is a fantastic processor, but it definitely "extends" the 1.0 
specification to its own interpretation here and there [e.g. variables 
allowed in match attributes of xsl:template when the spec most 
definitely specifies this as a no-no]... which does have some benefits 
here and there but limits the portability of your stylesheets...). 



But as far as Saxon is concerned...  the general rule of thumb is to use 
Saxon as your standards benchmark against which all other processors 
should be measured.  You could argue the same of the XT processor from 
James Clark, but this wasn't part of your comparison, so we'll leave XT 
out of this for now.



The fact that Xalan produces the same output as Saxon suggests to me, 
in absolute terms, that it's MSXML that is incorrect, which is not 
something to be considered a shocker. 



What MSXML is and what MSXML is not...



MSXML = FAST AS HELL XML PARSER AND XSLT PROCESSOR
MSXML != STANDARDS COMPLIANT PROCESSOR

Start with Saxon and move forward from there and you might find the 
solution to your problem...



Best regards,



<M:D/>



Richard Bondi wrote:



Dear All,



With the following xml and xsl, the Microsoft msxmldom 4 is producing
the expected output, but xalan 2.4, 2.6, and saxon 6.5.3 are not: they
all produce the same, unexpected output.

The purpose of this code is to collapse run-on chains like
<ilink>foo</ilink><ilink id="1234">bar</ilink> into a single tag
<ilink>foo bar<id id="1234"/></ilink>. The xsl will also collapse
run-on chains of b, i, sup, sub, and similar tags.

Can anyone explain to me whether xalan and saxon just have a bug, and
preferably how to get xalan and/or saxon to transform the way msxml4
does here (which I believe is correct)?

TMIA,
Richard Bondi


Sample input:



<Chapter>
	<ChapterTitle>The chapter title must be immediately followed by a
section title</ChapterTitle>
	<Body>
		<SectionTitle>The section title</SectionTitle>
		<Title>Internal Links: _ilink</Title>
		<Paragraph>The internal link to Proteins and Membranes, optionally
including the cont_id would look like: <ilink id="1234">Proteins and
		Membranes</ilink>. You could also just type <ilink>Proteins and
Membranes</ilink>. Another option is <ilink>CBIO|Proteins and
Membranes</ilink>, or
		even just <ilink id="1234"/>. You can also do <ilink
id="1234">CBIO|Proteins and Membranes</ilink>. Spaces on either side
of a pipe (|) are
		optional.</Paragraph>
		<Paragraph>Feel free to include crazy formatting, as in <ilink>CBIO|</ilink>
			<ilink>
				<i>Proteins</i>
			</ilink>
			<ilink> and Membranes</ilink> or <ilink>
				<b>
					<i>Pr</i>
				</b>
			</ilink>
			<ilink>
				<sup>
					<b>
						<i>o</i>
					</b>
				</sup>
			</ilink>
			<ilink>
				<sub>
					<b>
						<i>t</i>
					</b>
				</sub>
			</ilink>
			<ilink>
				<b>
					<i>ei</i>
				</b>
			</ilink>
			<ilink>
				<b>
					<i>
						<u>n</u>
					</i>
				</b>
			</ilink>
			<ilink>
				<b>
					<i>s</i>
				</b>
			</ilink>
			<ilink id="1234">and Membranes</ilink>. </Paragraph>
	</Body>
</Chapter>


Xsl:



<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:output encoding="ISO-8859-1"/>
	<xsl:template match="/">
		<xsl:apply-templates/>
	</xsl:template>
	<!-- run of ilinks -->
	<xsl:template match="ilink">
		<xsl:if test="not(local-name(preceding-sibling::node()[1])='ilink')">
			<ilink>
				<xsl:if test="not(name(following-sibling::node()[1])='ilink')"><xsl:copy-of
select="@*"/></xsl:if>
				<xsl:apply-templates/>
				<xsl:if test="name(following-sibling::node()[1])='ilink'"><xsl:apply-templates
select="following-sibling::node()[1]" mode="following"/></xsl:if>
			</ilink>
		</xsl:if>
	</xsl:template>
	<xsl:template match="ilink" mode="following" >
		<xsl:apply-templates/>
		<xsl:if test="not(name(following-sibling::node()[1])='ilink') and
@*"><id><xsl:copy-of select="@*"/></id></xsl:if>
		<xsl:if test="name(following-sibling::node()[1])='ilink'"><xsl:apply-templates
select="following-sibling::node()[1]" mode="following"/></xsl:if>
	</xsl:template>
	<!-- run of formatting tags, eg tags without attributes -->
	<xsl:template match="b | i | sup | sub | u | smallcaps | red" priority="2">
		<xsl:variable name="ename" select="name(.)"/>
		<xsl:if test="not(local-name(preceding-sibling::node()[1])=string($ename))">
			<xsl:element name="{$ename}">
				<xsl:apply-templates/>
				<xsl:if test="name(following-sibling::node()[1])=string($ename)"><xsl:apply-templates
select="following-sibling::node()[1]" mode="following"/>
				</xsl:if>
			</xsl:element>
		</xsl:if>
	</xsl:template>
	<xsl:template match="b | i | sup | sub | u | smallcaps | red"
mode="following" >
		<xsl:variable name="ename" select="name(.)"/>
		<xsl:apply-templates/>
		<xsl:if test="name(following-sibling::node()[1])=string($ename)"><xsl:apply-templates
select="following-sibling::node()[1]" mode="following"/>
		</xsl:if>
	</xsl:template>
	<xsl:template match="@* | node()">
		<xsl:copy >
			<xsl:apply-templates select="@*" />
			<xsl:apply-templates />
		</xsl:copy>
	</xsl:template>
</xsl:stylesheet>


Output using msxml4 (correct output, IMHO):



<Chapter>
	<ChapterTitle>The chapter title must be immediately followed by a
section title</ChapterTitle>
	<Body>
		<SectionTitle>The section title</SectionTitle>
		<Title>Internal Links: _ilink</Title>
		<Paragraph>The internal link to Proteins and Membranes, optionally
including the cont_id would look like: <ilink id="1234">Proteins and
		Membranes</ilink>. You could also just type <ilink>Proteins and
Membranes</ilink>. Another option is <ilink>CBIO|Proteins and
Membranes</ilink>, or
		even just <ilink id="1234"/>. You can also do <ilink
id="1234">CBIO|Proteins and Membranes</ilink>. Spaces on either side
of a pipe (|) are
		optional.</Paragraph>
		<Paragraph>Feel free to include crazy formatting, as in
<ilink>CBIO|<i>Proteins</i> and Membranes</ilink> or <ilink>
				<b>
					<i>Pr</i>
				</b>
				<sup>
					<b>
						<i>o</i>
					</b>
				</sup>
				<sub>
					<b>
						<i>t</i>
					</b>
				</sub>
				<b>
					<i>ei</i>
				</b>
				<b>
					<i>
						<u>n</u>
					</i>
				</b>
				<b>
					<i>s</i>
				</b>and Membranes<id id="1234"/>
			</ilink>. </Paragraph>
	</Body>
</Chapter>


Output of xalan 2.4, 2.6.0, and instant saxon 6.5.3 (appears to do
nothing, actually):

<Chapter>
	<ChapterTitle>The chapter title must be immediately followed by a
section title</ChapterTitle>
	<Body>
		<SectionTitle>The section title</SectionTitle>
		<Title>Internal Links: _ilink</Title>
		<Paragraph>The internal link to Proteins and Membranes, optionally
including the cont_id would look like: <ilink id="1234">Proteins and
		Membranes</ilink>. You could also just type <ilink>Proteins and
Membranes</ilink>. Another option is <ilink>CBIO|Proteins and
Membranes</ilink>, or
		even just <ilink id="1234"/>. You can also do <ilink
id="1234">CBIO|Proteins and Membranes</ilink>. Spaces on either side
of a pipe (|) are
		optional.</Paragraph>
		<Paragraph>Feel free to include crazy formatting, as in <ilink>CBIO|</ilink>
			<ilink>
				<i>Proteins</i>
			</ilink>
			<ilink> and Membranes</ilink> or <ilink>
				<b>
					<i>Pr</i>
				</b>
			</ilink>
			<ilink>
				<sup>
					<b>
						<i>o</i>
					</b>
				</sup>
			</ilink>
			<ilink>
				<sub>
					<b>
						<i>t</i>
					</b>
				</sub>
			</ilink>
			<ilink>
				<b>
					<i>ei</i>
				</b>
			</ilink>
			<ilink>
				<b>
					<i>
						<u>n</u>
					</i>
				</b>
			</ilink>
			<ilink>
				<b>
					<i>s</i>
				</b>
			</ilink>
			<ilink id="1234">and Membranes</ilink>. </Paragraph>
	</Body>
</Chapter>
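
A possible explanation for the difference, offered as a sketch and not something confirmed in this thread: the sample input has line breaks and tabs between adjacent <ilink> elements, and under XSLT 1.0 those whitespace-only text nodes stay in the source tree unless the stylesheet asks for them to be stripped. In Saxon and Xalan, preceding-sibling::node()[1] and following-sibling::node()[1] would then select the whitespace text node rather than the neighbouring <ilink>, so every name() test fails and each element is copied unchanged. MSXML's DOM drops whitespace-only text nodes when loading a document unless preserveWhiteSpace is set to true, which would account for its different result. Assuming that is the cause, adding one top-level declaration to the stylesheet above should make the other processors see the elements as adjacent:

```xml
<!-- Assumption: whitespace-only text nodes are what separates the run-on tags.
     Place this as a direct child of xsl:stylesheet, for example right after xsl:output. -->
<xsl:strip-space elements="*"/>
```

Note that this also removes a whitespace-only text node that was meant as a real space between two inline elements, for example in <b>a</b> <i>b</i>, so it may be worth listing only the element names whose whitespace-only children should go instead of "*".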

