Altova Mailing List Archives


Re: Whitespace normalization for union types

From: ht@---.--.--.-- (----- -. --------)
To: Kasimier Buchcik <kbuchcik@---------.-->
Date: 6/2/2005 2:10:00 PM
Consider:

 <xsd:simpleType name="fooType">
  <xsd:union memberTypes="xsd:string xsd:token"/>
 </xsd:simpleType>

 <xsd:simpleType name="fooSubType">
  <xsd:restriction base="fooType">
   <xsd:pattern value="[a-z]"/>
  </xsd:restriction>
 </xsd:simpleType>

 <xsd:element name="foo" type="fooSubType"/>

Wrt this instance:
<foo> a   </foo>

I think on balance Kasimier and Xan are _both_ right, and therefore
none of the processors are wrong.

Here's my reasoning:

[First, it has to be noted that the definition of Datatype Valid [1]
is broken -- it implies that if there's a *pattern* facet, the string
being checked need not be in the lexical space of the type!]

On the one hand the process of validation of a restricted union
could be understood in two steps -- checking the union, and then
enforcing the restriction.  This is because without checking the
union, we don't know what the string _is_, because the only way we get
a string to check is by using a type defn with a whitespace facet.

On this account (Kasimier's too, I guess) things go like this:

1) Check Datatype Valid for the pre-lexical form wrt each member of
   the union in turn:
     " a   " against xs:string -- whiteSpace is preserve, so
                                  lexical form is " a   ", which
                                  _is_ in the lexical space of
                                  xs:string, and the corresponding
                                  value is in the value space of
                                  xs:string, so we win

2) Check the facets of the union:
     " a   " against [a-z] -- fails

So, invalid.
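
The two-step reading above can be sketched in Python. This is only an illustration under stated assumptions, not how any particular processor is implemented: the names normalize, MEMBERS and validate_two_step are hypothetical, and lexical-space checking is reduced to the observation that every normalized string is a legal xs:string.

```python
import re

def normalize(raw: str, white_space: str) -> str:
    """Apply an XML Schema whiteSpace facet value to a raw string."""
    if white_space == "preserve":
        return raw
    if white_space == "collapse":
        # Replace each run of tab, LF, CR, or space with one space, then trim.
        return re.sub(r"[\t\n\r ]+", " ", raw).strip(" ")
    raise ValueError(f"unsupported whiteSpace value: {white_space}")

# Member types of fooType in declaration order, with their whiteSpace values.
MEMBERS = [("xs:string", "preserve"), ("xs:token", "collapse")]

def validate_two_step(raw: str, pattern: str = "[a-z]") -> bool:
    # Step 1: the first member whose lexical space contains the normalized
    # string wins. Every string is a legal xs:string (assumption of this
    # sketch), so the first member always wins and xs:token is never tried.
    _member, white_space = MEMBERS[0]
    lexical = normalize(raw, white_space)
    # Step 2: check the pattern facet of the restriction against that form.
    # XSD patterns must match the whole string, hence fullmatch.
    return re.fullmatch(pattern, lexical) is not None

print(validate_two_step(" a   "))  # False
print(validate_two_step("a"))      # True
```

On this reading the whitespace is fixed by whichever member wins step 1, so the pattern never sees the collapsed form.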

The alternative reading is that the facets on the union are
distributed into the member types of the union, in which case Xan's
analysis is correct and things go like this:

1) Check Datatype Valid for the pre-lexical form wrt each member of
   the union, plus the facets on the union itself, in turn:

1a)  " a   " against xs:string -- whiteSpace is preserve, so
                                  lexical form is " a   ", which
                                  _is_ in the lexical space of
                                  xs:string, and the corresponding
                                  value is in the value space of
                                  xs:string, so we check the facets
                                  check  " a   " against [a-z] -- fails
1b)  " a   " against xs:token  -- whiteSpace is collapse, so
                                  lexical form is "a", which
                                  _is_ in the lexical space of
                                  xs:token, and the corresponding
                                  value is in the value space of
                                  xs:token, so we check the facets
                                  check  "a" against [a-z] -- succeeds
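
The distributed reading in 1a) and 1b) can be sketched the same way. Again this is a minimal illustration: normalize, MEMBERS and validate_distributed are hypothetical names, and the only facet modelled is the pattern from fooSubType.

```python
import re
from typing import Optional

def normalize(raw: str, white_space: str) -> str:
    """Apply an XML Schema whiteSpace facet value to a raw string."""
    if white_space == "preserve":
        return raw
    if white_space == "collapse":
        # Replace each run of tab, LF, CR, or space with one space, then trim.
        return re.sub(r"[\t\n\r ]+", " ", raw).strip(" ")
    raise ValueError(f"unsupported whiteSpace value: {white_space}")

# Member types of fooType in declaration order, with their whiteSpace values.
MEMBERS = [("xs:string", "preserve"), ("xs:token", "collapse")]

def validate_distributed(raw: str, pattern: str = "[a-z]") -> Optional[str]:
    """Return the first member that accepts raw with the facet pushed down."""
    for member, white_space in MEMBERS:
        # Each member applies its own whiteSpace before the pattern is checked.
        lexical = normalize(raw, white_space)
        if re.fullmatch(pattern, lexical) is not None:
            return member
    return None  # no member accepts: invalid

print(validate_distributed(" a   "))  # xs:token
print(validate_distributed("a"))      # xs:string
print(validate_distributed("ab"))     # None
```

Here " a   " fails as an xs:string in 1a) but succeeds as an xs:token in 1b), so the instance is valid.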

I don't believe it's actually at all clear which is correct.

This actually interacts with an existing issue, regarding the
semantics of a type allowed as the type of e.g. an attribute as part
of a complex type derived by restriction from a base type with a
restricted union for that attribute (whew!) -- example:

 <xs:complexType name="base">
  <xs:attribute name="foo" type="fooSubType"/>
 </xs:complexType>

 <xs:complexType name="restr">
  <xs:complexContent>
   <xs:restriction base="base">
    <xs:attribute name="foo" type="xs:token"/>
   </xs:restriction>
  </xs:complexContent>
 </xs:complexType>

Currently this is a) allowed but b) means that the restricted type
allows _more_ than the base type, which is not supposed to happen.
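
A small Python sketch of the widening, under the assumption that the base attribute type fooSubType is read with the facets distributed onto the members (on the two-step reading the base accepts even less, so the gap only grows). The helpers collapse, base_accepts and restr_accepts are hypothetical names.

```python
import re

def collapse(raw: str) -> str:
    # whiteSpace="collapse": runs of tab, LF, CR, or space become one space,
    # and leading and trailing spaces are removed.
    return re.sub(r"[\t\n\r ]+", " ", raw).strip(" ")

def base_accepts(raw: str) -> bool:
    # fooSubType: the preserved form (xs:string member) or the collapsed form
    # (xs:token member) must match the pattern [a-z] as a whole.
    return any(re.fullmatch("[a-z]", form) for form in (raw, collapse(raw)))

def restr_accepts(raw: str) -> bool:
    # xs:token: after collapsing, the string must have no tab, LF, or CR,
    # no leading or trailing space, and no run of two or more spaces.
    lexical = collapse(raw)
    return (
        not any(ch in lexical for ch in "\t\n\r")
        and lexical == lexical.strip(" ")
        and "  " not in lexical
    )

samples = ["a", " a   ", "abc", "hello world"]
widened = [s for s in samples if restr_accepts(s) and not base_accepts(s)]
print(widened)  # ['abc', 'hello world']
```

Values such as "abc" are accepted by the attribute in restr but rejected by the attribute in base.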

We should probably solve both these problems together (and the latter
issue suggests we'll go in Xan's direction, that is, we'll push the
facets down onto all the member types. . .)

ht

[1] http://www.w3.org/TR/xmlschema-2/#defn-validation-rules
-- 
 Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                     Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
            Fax: (44) 131 650-4587, e-mail: ht@i...
                   URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]


