Altova Mailing List Archives


RE: [xsl] Collapsing run-on tag chains not working in saxon or xalan

From: Wendell Piez <wapiez@---------------->
To:
Date: 11/2/2004 5:43:00 PM
At 04:31 AM 11/2/2004, Mike wrote:
> Actually, XSLT 1.0 is a little ambiguous on this. It does recognize that the
> source tree can be constructed by various routes, it is not necessarily the
> direct result of parsing a source XML document. It mentions source trees
> derived from a DOM as a specific example. XSLT 2.0 says the same thing much
> more explicitly: you can construct a tree any way you like. Or any way your
> vendor likes. Most vendors are forced into line by market forces, but some
> seem to be able to sell their products regardless.

This is all fine -- I am not inclined to argue that all "XML" handled by 
XSLT must start life as "XML" in the sense of the W3C Rec ... if it can sit 
in the file system as HTML or SGML, or in an RDBMS or whatever, and is 
presented to an XSLT processor through some means that builds a tree out of 
it, that seems to me on balance to be a good thing.



But it's not MSXML the XSLT engine that fails conformance here: it's MSXML 
the parser/processor, that does not report all whitespace to the 
application. Apparently a decision was made at some point that the parser 
could know better than the author of the application (me) which text nodes 
were important and which ones were not. This seems to have been done in the 
belief that without understanding full application semantics, a parser's 
best option is to rely on a faulty principle to establish which whitespace 
I want to see ... a principle by which, for example, whitespace appearing 
with no other text between two elements in mixed content gets thrown away.
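
To make the effect concrete, here is a minimal sketch in Python. It uses the standard library's xml.dom.minidom, which keeps whitespace-only text nodes, and then simulates a parser that throws them away. The sample markup and the helper names (string_value, drop_whitespace_only_text) are illustrative assumptions, not MSXML's actual logic.

```python
from xml.dom import minidom

# Hypothetical mixed content: the only text between <b> and <i> is one space.
SAMPLE = "<p><b>run</b> <i>on</i></p>"


def string_value(node):
    """Concatenate all descendant text, like the XPath string value."""
    if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE):
        return node.data
    return "".join(string_value(child) for child in node.childNodes)


def drop_whitespace_only_text(node):
    """Simulate a parser that discards every whitespace-only text node."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and child.data.strip() == "":
            node.removeChild(child)
        else:
            drop_whitespace_only_text(child)


doc = minidom.parseString(SAMPLE)
preserved = string_value(doc.documentElement)

drop_whitespace_only_text(doc.documentElement)
stripped = string_value(doc.documentElement)

print(repr(preserved))
print(repr(stripped))
```

With the space reported, the paragraph reads "run on"; once the whitespace-only node is dropped, the two words collapse into "runon", and no stylesheet can put the space back because it never reaches the tree.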



Almost any other decision -- pass it all through, examine sibling text 
nodes before throwing away any whitespace, or consult a DTD or schema to 
ensure #PCDATA wasn't allowed there before doing so -- would have been 
better for me, even if a case could be made against any of them. But 
because it can be debated what the "XML application" is in this case, an 
argument can be made that MSXML is not, in fact, non-conformant.
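
One of the alternatives above, examining sibling text nodes before discarding whitespace, might look roughly like this sketch. The rule chosen here (drop a whitespace-only text node only when its parent has no other text child with real content) is one assumed reading of that idea, and the function names and sample document are hypothetical.

```python
from xml.dom import minidom


def is_whitespace_text(node):
    """True for a text node containing only whitespace."""
    return node.nodeType == node.TEXT_NODE and node.data.strip() == ""


def has_real_text(parent):
    """True if any direct child is a text node with non-whitespace content."""
    return any(
        child.nodeType == child.TEXT_NODE and child.data.strip() != ""
        for child in parent.childNodes
    )


def strip_cautiously(node):
    """Drop whitespace-only text nodes only where no sibling carries text."""
    mixed = has_real_text(node)
    for child in list(node.childNodes):
        if is_whitespace_text(child):
            if not mixed:
                node.removeChild(child)
        else:
            strip_cautiously(child)


# Hypothetical input: indentation inside <doc>, mixed content inside <p>.
SAMPLE = "<doc>\n  <p>See <b>run</b> <i>on</i> tags.</p>\n</doc>"

doc = minidom.parseString(SAMPLE)
strip_cautiously(doc.documentElement)
result = doc.documentElement.toxml()
print(result)
```

Here the indentation around the p element goes away while the space between the b and i elements survives. A case can be made against this rule too: a paragraph consisting only of `<b>run</b> <i>on</i>` has no sibling text to signal mixed content, so it would still lose its space.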



It's not the spec itself that's unclear; rather, it's some murkiness at the 
interface between specs (here, between XML parsing and XSLT: which is the 
application?). Like Dimitre, I have some concern that similar variations 
will be the rule in the more complex technologies to come. It's not that I 
believe all "XML" must actually start as honest XML: processing "XML" (by 
which I mean the tree-thing we build and then transform, not XML the data 
format) is too useful not to expect reasonable people will want to do that. 
Nor do I expect XML processors all to do exactly the same thing. But if the 
whole point is that variation is a good thing because it allows me choices, 
then I'll make choices. MSXML's whitespace-handling bug makes it not the 
premier choice for the kind of work I do (where mixed content abounds), 
irrespective of questions of conformance. That the unambiguously conformant 
behavior (not throwing away the whitespace unasked), in this case, is also 
the right thing to do, just makes it easier for me to choose.



Cheers,
Wendell


======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================

