Altova Mailing List Archives


Re: Validating Entities (was Re: XML Torture Test: Parsers Fail)

From: "Richard L. Goerwitz" <richard@----.---.-----.--->
To: David Megginson <david@---------.--->
Date: 4/7/1999 9:18:00 PM

David Megginson wrote:

>   3. Each of the parsed entities which is referenced directly or
>      indirectly within the document is well-formed

If I've seemed harsh, then forgive me.  I have a great deal of respect
for your views, and I don't think you're wrong here per se.

While I agree with what you've inferred about the standard, I'm not at
all certain that the standard itself forces your interpretation.  In the
above case, for example, the standard talks about well-formed documents
as if every parsed entity used in the document must be read in.  In
fact, this is not a requirement.  The whole reason parameter entities,
for example, are not supposed to be used inside markup in the internal
DTD subset is that this lets a parser bypass them when it is not
validating.

(Incidentally, does it bother anyone else that you can have valid
documents that aren't well-formed?  Imagine an external entity used
inside an attribute value.  If it is declared in such a way that a
non-validating parser doesn't realize it's external, the non-validating
parser lets it through, while the validating parser rejects it as an
error (you can't have external entities in this context).  There are
other such cases, although this is the main one that comes to mind.)
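
The attribute-value case can be sketched with a non-validating parser.
What follows is only an illustration, assuming Python's bundled expat
bindings (xml.parsers.expat) with their default settings, under which
the external DTD subset is not fetched.  The names "ext", "ext.txt" and
"doc.dtd" are hypothetical, and neither file is ever read.

```python
import xml.parsers.expat
from xml.parsers.expat import errors

# Case 1: the external entity is declared in the internal subset, so even
# a non-validating parser sees that "ext" is external.
VISIBLE_DECL = b"""<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ENTITY ext SYSTEM "ext.txt">
]>
<doc attr="&ext;"/>
"""

# Case 2: any declaration of "ext" would live in the external subset,
# which expat does not fetch by default. "doc.dtd" is hypothetical.
HIDDEN_DECL = b"""<?xml version="1.0"?>
<!DOCTYPE doc SYSTEM "doc.dtd">
<doc attr="&ext;"/>
"""


def check(document):
    """Return None if expat accepts the document, else the expat error code."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError as exc:
        return exc.code
    return None


visible_result = check(VISIBLE_DECL)
hidden_result = check(HIDDEN_DECL)

# Print the expat message for each case, or None when the parse succeeded.
print("declaration visible:", visible_result and errors.messages[visible_result])
print("declaration hidden: ", hidden_result and errors.messages[hidden_result])
```

With the declaration in plain sight the parser can apply the "no
external entity references in attribute values" rule; with the
declaration out of sight it has nothing to apply the rule to.  A
validating parser, which must read the external subset, would be in the
first position for both documents.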

My general point is that what you do while validating is not simply a
superset of what you do when just parsing with well-formedness in mind.
You process documents in somewhat different ways depending on which of
these two alternatives you've chosen.  And so the question of what
context an external entity should be checked in, if validating, is not
clearly answered by the spec without exegesis and, I would argue,
background knowledge.

Anyway, even if I grant that it says what you want it to, then the point
should still be made that it does so in a way that's not easy to interpret
or understand.  The fact that the writers of IE's parser apparently got it
wrong is therefore not at all unexpected.

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@g...

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
