Altova Mailing List Archives


RE: [xml-dev] Schematron Best Practice: A Schematron schema's area of responsibility?

From: "Rick Jelliffe" <rjelliffe@-------.---.-->
To: xml-dev@-----.---.---
Date: 7/18/2007 1:10:00 AM
> Mark Delaney asks:
>
>> Are there order-of-magnitude variations in efficiency, in either memory
>> use or time, between alternative languages? If so, are these variations
>> essential, or merely a quirk of the available implementations?

There are probably order-of-magnitude differences between alternative
implementations of the same language, let alone between languages!
(Certainly this is true with XSLT-based systems.)

The primary issue is whether a constraint can be tested in
 1) Streaming order with no state saved
 2) Streaming order with state (or values) saved (e.g. ID checking)
 3) Random access
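
A minimal Python sketch of the three classes, assuming a made-up order
document and made-up rules (none of the element names or checks come from
any real schema, and this is not how any Schematron implementation works).
The third check could be done in a stream with extra bookkeeping; it is
written against the whole tree only to show what random access looks like.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this illustration.
DOC = b"""<order>
  <summary count="2"/>
  <item id="a" qty="2"/>
  <item id="b" qty="0"/>
  <item id="a" qty="1"/>
</order>"""

def stateless_errors(source):
    """Class 1: each element is judged alone as it streams past."""
    errors = []
    for _event, elem in ET.iterparse(source, events=("start",)):
        if elem.tag == "item" and int(elem.get("qty", "0")) < 1:
            errors.append(f"item {elem.get('id')}: qty must be at least 1")
    return errors

def stateful_errors(source):
    """Class 2: still one pass, but a set of seen IDs is carried along."""
    seen, errors = set(), []
    for _event, elem in ET.iterparse(source, events=("start",)):
        ident = elem.get("id")
        if ident is not None:
            if ident in seen:
                errors.append(f"duplicate id {ident}")
            seen.add(ident)
    return errors

def random_access_errors(source):
    """Class 3: the whole tree is loaded and navigated freely."""
    root = ET.parse(source).getroot()
    declared = int(root.find("summary").get("count"))
    actual = len(root.findall("item"))
    if declared == actual:
        return []
    return [f"summary says {declared}, found {actual}"]

print(stateless_errors(io.BytesIO(DOC)))
print(stateful_errors(io.BytesIO(DOC)))
print(random_access_errors(io.BytesIO(DOC)))
```

The first two functions never need more than the current element (plus,
for the second, a set of IDs); the third needs the finished tree.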

What grammar-based schema languages do is limit themselves to 1), then also
allow some convenient number of 2) where some abstraction can be used to
make it coherent why the grammar model has been sidestepped (ALL, ID,
etc).  What the default Schematron implementation does is start from 3)
using XSLT1, then allow implementations to figure out optimizations if
random access is not allowed.  For example, an implementation could split
up a schema so that the streamable constraints are tested first (e.g. as
the DOM is being built), then the random-access constraints are checked
when the DOM is ready.
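
A rough sketch of that kind of split, in Python, under the assumption that
the schema has already been sorted into two lists of rules. The function
names, the rule signatures and the invoice document are all hypothetical;
a real implementation would derive the split from the schema itself.

```python
import io
import xml.etree.ElementTree as ET

def validate(source, streaming_rules, tree_rules):
    """Run streamable rules while the tree is built, the rest afterwards."""
    errors, root = [], None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem          # first element opened is the root
            continue
        # "end" event: this element is complete, so per-element rules can run
        for rule in streaming_rules:
            errors.extend(rule(elem))
    # The tree is now fully built; rules needing free navigation run here.
    for rule in tree_rules:
        errors.extend(rule(root))
    return errors

# Hypothetical streamable rule: looks at one element only.
def line_has_price(elem):
    if elem.tag == "line" and elem.get("price") is None:
        return ["line without price"]
    return []

# Hypothetical random-access rule: compares the root with its children.
def line_count_matches(root):
    declared = int(root.get("lines"))
    actual = len(root.findall("line"))
    if declared == actual:
        return []
    return [f"invoice declares {declared} lines, found {actual}"]

INVOICE = b'<invoice lines="3"><line price="1"/><line/></invoice>'
print(validate(io.BytesIO(INVOICE), [line_has_price], [line_count_matches]))
```

The streamable failures are reported before parsing has finished, which is
the point of doing the split at all.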

More than this, ISO DSDL looks like adopting the STX streaming XPath
language. When Schematron is used with this, then you certainly get a
streaming implementation that would not have object creation overhead.

The other aspect is that you tend to express different things in
different schema languages: the grammars force you to pay a lot of
attention to sequencing issues and are not at all good with partial orders.
Ticking through a big state machine is very easy, but when the state
transitions don't reflect business requirements they may be a burden and a
cost. Furthermore, the grammars actively discourage separation of
concerns: each stakeholder, agent and process in the pipeline may have
different, uncoordinated and independent constraints. Grammars, notably
XSD, have proved themselves to be unattractive for validation: people
choose not to validate because with XSD and grammars they have to
over-validate (validate things they are not interested in, and omit to
validate things they are interested in) without getting useable
diagnostics. So when considering efficiency, are systems that promote, in
effect, no validation actually more "efficient" than systems that promote
effective partial validation?

Cheers
Rick Jelliffe



