Altova Mailing List Archives


Re: [xml-dev] Caution using XML Schema backward- or forward-compatibility as a versioning strategy for data exchange

From: noah_mendelsohn@--.---.---
To: "Fraser Goffin" <goffinf@----------.--->
Date: 1/3/2008 10:32:00 PM
Fraser Goffin writes:

> yes I agree that structural validation is important, and I 
> further agree that the various checks that are made on the data
> are cumulative and go to the heart of data integrity.

I think there are some further nuances worth setting out.  In most of this 
discussion, there has been an implicit assumption that the schema 
validation languages are not Turing Complete [1].  For those unfamiliar 
with the term, what I mean is that languages like XSD or RelaxNG aren't 
powerful enough to compute all the things you can with languages like C, 
Java, or Cobol.  For example, you can't compute all the prime numbers in 
XSD or RelaxNG, so you can't in practice write a schema type that would 
validate only prime integers as the content of some element.  If your 
schema language were, say, Java then you could write a schema to make sure 
that your XML element contained a prime number, and for a mathematician 
that would be a very sensible check to attempt.  There are, of course, 
good reasons for not using Turing Complete languages as our main schema 
languages.  One obvious one is that programs in Turing Complete languages 
don't necessarily execute in bounded time.  You can always check an XML 
instance against an XSD or RelaxNG schema in bounded time, and usually 
quite quickly.  Most of our schema languages also handle the simple cases, 
such as looking for a fixed sequence of elements, very easily. Incidentally, 
all the Turing Complete languages like C and Java have the same 
computational power:  if you can compute prime numbers in one, you can do 
it in all the others.
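
To make the prime-number example concrete, here is a minimal sketch of that
check written in a general-purpose language (Python, chosen only for
illustration).  The function names and the <n> element in the usage note are
hypothetical; the integer test mirrors what a schema language could do on its
own, and the primality test is the part it could not.

```python
import re
import xml.etree.ElementTree as ET


def is_prime(n: int) -> bool:
    """Trial division: fine for a sketch, slow for very large numbers."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3
    if n % 2 == 0:
        return False
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def element_holds_prime(xml_text: str) -> bool:
    """Return True if the root element's content is a prime integer."""
    root = ET.fromstring(xml_text)
    text = (root.text or "").strip()
    # Step 1: the kind of check a schema language handles (content is an integer).
    if re.fullmatch(r"[+-]?[0-9]+", text) is None:
        return False
    # Step 2: the check that needs a general-purpose language (the integer is prime).
    return is_prime(int(text))
```

With these definitions, element_holds_prime("<n>13</n>") is True, while
"<n>12</n>" and "<n>abc</n>" are both rejected, the latter at the integer
step.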

Anyway, I'd say there are at least four shades of grey to consider:

* Content validation that can be implemented in your schema language (the 
element name is legal, and the content is an integer)
* Content validation that your schema language can't handle (the number is 
prime)
* Business validation (that looks like a credit card number, but our 
records show that the card was stolen, so it's not "valid" for use in a 
purchasing transaction)
* Semantic incompatibility (we used to use the field for an account 
number, but in Version 2 of the language it identifies a particular credit 
card)
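
The first three shades can be sketched as checks that run in order, each one
assuming the previous one passed.  Everything named here is hypothetical: the
STOLEN_CARDS set stands in for "our records", and a Luhn checksum stands in
for the second shade, on the assumption that it is a check you would not try
to express in XSD or RelaxNG.  The fourth shade, semantic incompatibility, is
deliberately absent, because no check on the value alone can tell you that
the meaning of the field has changed between versions.

```python
import re

# Hypothetical stand-in for the business records mentioned above.
STOLEN_CARDS = {"4111111111111111"}


def luhn_ok(number: str) -> bool:
    """Luhn checksum over a string of decimal digits."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def validate_card(text: str) -> str:
    # Shade 1: something a schema language can state (13 to 19 digits).
    if re.fullmatch(r"[0-9]{13,19}", text) is None:
        return "rejected: structural"
    # Shade 2: content validation done in code rather than in the schema.
    if not luhn_ok(text):
        return "rejected: checksum"
    # Shade 3: business validation against external records.
    if text in STOLEN_CARDS:
        return "rejected: business"
    return "accepted"
```

A value can pass every earlier layer and still fail a later one:
"4111111111111111" is well formed and has a correct checksum, but it is
rejected at the business layer because it appears in the hypothetical
stolen-card set.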

BTW: I know I've sent this link from time to time before, but if you're 
interested in the tradeoffs between using powerful vs. less powerful 
languages, Tim BL did a very nice analysis, and I helped him edit it as a 
TAG finding last year.  It's at [2]. 

Noah

[1] http://en.wikipedia.org/wiki/Turing_complete
[2] http://www.w3.org/2001/tag/doc/leastPower.html

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

