Altova Mailing List Archives


RE: validator disparity

From: "Michael Kay" <mike@--------.--->
To: "'Paul Warren'" <pdw@------------.--->, <bud@---------.--->
Date: 6/14/2007 5:53:00 PM
There's a long history here, only a small part of which is captured at:

http://www.w3.org/Bugs/Public/show_bug.cgi?id=1889

In fact the current spec makes three statements within a few lines of each
other, none of which agrees with the others, and there are no clues as to
which one takes precedence:

(1)
[17]   charRange   ::=   seRange | XmlCharIncDash

which says that "-" is always a valid character-range

(2)
The [, ], - and \ characters are not valid character ranges;

which says that "-" can't be a character range

(3)
The - character is a valid character range only at the beginning or end of a
·positive character group·.

which says it's sometimes valid and sometimes isn't (and says it in a very
odd way, because how do you know whether you're at the end of a positive
character group, especially where subtraction is involved?)

In my current implementation in Saxon I decided to allow "-" anywhere within
a character range, interpreting it as representing itself except in a
context where it can be interpreted as a range operator [a-z] or a
subtraction operator [\p{Lu}-[AEIOU]].
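
As a rough illustration of the three roles described above, here is a minimal sketch. The schema wrapper and the type names are invented for this example, and whether the third pattern is accepted depends on the processor, as this thread shows.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- "-" between two characters: range operator (a through z) -->
  <xsd:simpleType name="lowercaseWord_Type">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[a-z]*"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- "-" followed by a bracketed group: subtraction operator
       (upper-case letters other than A, E, I, O, U) -->
  <xsd:simpleType name="upperNonVowel_Type">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[\p{Lu}-[AEIOU]]*"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- "-" in first position: stands for itself under the reading above;
       this is the form the Xerces-J based service rejected -->
  <xsd:simpleType name="digitsOrHyphen_Type">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[-0-9]*"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```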

Users would be well-advised to steer clear of this and escape the "-"
everywhere.
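
For the two patterns quoted below, escaping would look something like the following sketch. The type names are taken from the attached schema; the enclosing schema element is added here only to make the fragment self-contained. The "\-" form is the single-character escape for a literal hyphen, and it is also what the Xerces error message asks for.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="integerOrNull_Type">
    <xsd:restriction base="xsd:string">
      <!-- was [-0-9]* -->
      <xsd:pattern value="[\-0-9]*"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="telephoneNumber_Type">
    <xsd:restriction base="xsd:string">
      <!-- was [-0-9+ ()]* -->
      <xsd:pattern value="[\-0-9+ ()]*"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

Written this way, the hyphen no longer depends on its position in the group, so the pattern should not depend on which of the three spec statements a given processor follows.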

Michael Kay
http://www.saxonica.com/

 

> -----Original Message-----
> From: xmlschema-dev-request@w... 
> [mailto:xmlschema-dev-request@w...] On Behalf Of Paul Warren
> Sent: 14 June 2007 15:43
> To: bud@s...
> Cc: xmlschema-dev@w...; tools@d...
> Subject: Re: validator disparity
> 
> 
> Hi Bud,
> 
> I should point out that our schema validation service is 
> simply a web frontend to the Xerces-J Schema Validator (this 
> used to be made clear on the web page, but it seems that 
> notice has gone AWOL - I'll get that fixed).
> 
> A quick look at the schema spec suggests that Xerces is wrong on this
> point:
> 
> "The - character is a valid character range only at the 
> beginning or end of a ·positive character group·."
> 
> regards,
> 
> Paul
> 
> 
> On 14 Jun 2007, at 15:26, Bud Hovell wrote:
> 
> > Hi, folks ...
> >
> > I've run across a bit of a puzzle and thought I'd at least report it
> > for examination by others technically qualified.
> >
> > In the course of validating a test output file found at
> > http://www.amexpat.com/primeloc.xml, I discovered both it and the schema
> > validate without complaint on the W3C validator at
> > http://www.w3.org/2001/03/webdata/xsv, but the schema does not validate at
> > http://tools.decisionsoft.com/schemaValidate/, which offers the
> > following complaint:
> > ================== OUTPUT =====================
> > XML Schema Validator
> >
> > Well Formed: VALID
> > Schema Validation: INVALID
> >
> > The following errors were found:
> > TYPE        LOC      MESSAGE
> > Validation  128, 38  InvalidRegex: Pattern value '[-0-9]*' is not a valid
> >                      regular expression. The reported error was: ''-' is an
> >                      invalid character range. Write '\-'.'.
> > Validation  134, 38  InvalidRegex: Pattern value '[-0-9+ ()]*' is not a
> >                      valid regular expression. The reported error was: ''-'
> >                      is an invalid character range. Write '\-'.'.
> >
> > ================ END OUTPUT ===============
> >
> > ... evidently because the non-range-denoting "-" character is shown in
> > the first position of the pattern match in brackets rather than last.
> > I'm not acquainted with the specific rules for schema validation, but
> > seem to recall that most regex matching rules DO require a literal
> > naked dash to be mentioned last.  In this case, the parser evidently
> > wants to see it backslashed so it is understood to denote a literal
> > rather than a range.
> >
> > This is the text of the two relevant blocks in the 2007-05-21 schema
> > file (attached in full) which I received from the provider and have
> > input for testing at DecisionSoft:
> >
> >         <xsd:simpleType name="integerOrNull_Type">
> >                 <xsd:restriction base="xsd:string">
> >                         <xsd:pattern value="[-0-9]*"/>
> >                 </xsd:restriction>
> >         </xsd:simpleType>
> >         <!-- Telephone can contain numbers, spaces, brackets, +'s and -'s /-->
> >         <xsd:simpleType name="telephoneNumber_Type">
> >                 <xsd:restriction base="xsd:string">
> >                         <xsd:pattern value="[-0-9+ ()]*"/>
> >                 </xsd:restriction>
> >         </xsd:simpleType>
> >
> > ... which shows no evidence of a backslash to protect the literal 
> > dash.
> >
> > These two parsers offer conflicting results given identical input.
> > While I'm agnostic as to which may be judged correct, they should at
> > least agree even if both are in error. :)
> >
> > I'm jointly addressing this to the W3C team and the folks over at 
> > DecisionSoft in hope this disparity may be resolved.
> >
> > Best regards,
> > -- Bud Hovell bud@s... http://www.syndafeed.com
> >
> > <?xml version="1.0" encoding="utf-8"?>
> > <!-- edited with XMLSpy v2006 rel. 3 sp1 (http://www.altova.com) by
> > Andy Dawkins (Primelocation) -->
> > <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
> > 	<xsd:annotation>
> > 		<xsd:documentation xml:lang="en">
> >             PrimeLocation.com FastcropX1 data schema - Last Update 2007-05-21
> >         </xsd:documentation>
> > 	</xsd:annotation>
> > 	<xsd:element name="root" type="root_Type"/>
> > 	<xsd:complexType name="root_Type">
> > 		<xsd:sequence>
> > 			<xsd:element name="agentGroup" type="agentGroup_Type" minOccurs="0" maxOccurs="unbounded"/>
> > 		</xsd:sequence>
> > 	</xsd:complexType>
> > 	<xsd:complexType name="agentGroup_Type">
> > 		<xsd:sequence>
> > 			<xsd:element name="mode" type="agentGroupMode_Type" default="FULL"/>
> > 			<xsd:element name="exportDate" type="xsd:dateTime" minOccurs="0"/>
> > 			<!-- Not mandatory but useful for debugging -->
> > 			<xsd:element name="agentBranch" type="agentBranch_Type" minOccurs="0" maxOccurs="unbounded"/>
> > 		</xsd:sequence>
> > 		<xsd:attribute name="code" type="xsd:string" use="required"/>
> > 	</xsd:complexType>
> > 	<xsd:complexType name="agentBranch_Type">
> > 		<xsd:sequence>
> > 			<xsd:element name="property" type="property_Type" minOccurs="0" maxOccurs="unbounded"/>
> > 		</xsd:sequence>
> > 		<xsd:attribute name="code" type="xsd:string" use="required"/>
> > 	</xsd:complexType>
> > 	<xsd:complexType name="property_Type">
> > 		<xsd:choice>
> > 			<xsd:sequence>
> > 				<!-- Property Address Details /-->
> > 				<xsd:element name="fullPostCode" type="xsd:string"/>
> > 				<xsd:element name="countryCode" type="countryCode_Type" default="GB" minOccurs="0"/>
> > 				<xsd:element name="name" type="xsd:string"/>
> > 				<xsd:element name="address" type="xsd:string"/>
> > 				<xsd:element name="regionCode" type="xsd:string" minOccurs="0"/>
> > 				<!-- Property Description /-->
> > 				<xsd:element name="summary" type="xsd:string" minOccurs="0"/>
> > 				<xsd:element name="details" type="xsd:string" minOccurs="0"/>
> > 				<!-- Property Price Information /-->
> > 				<xsd:element name="pricePrefix" type="pricePrefix_Type"/>
> > 				<xsd:element name="price" type="integerRange_Type"/>
> > 				<xsd:element name="priceCurrency" type="priceCurrency_Type" default="GBP" minOccurs="0"/>
> > 				<!-- Property sale specifics /-->
> > 				<xsd:element name="sellingState" type="sellingState_Type"/>
> > 				<xsd:element name="propertyType" type="propertyType_Type"/>
> > 				<xsd:element name="newHome" type="xsd:string" minOccurs="0"/>
> > 				<xsd:element name="saleOrRent" type="saleOrRent_Type"/>
> > 				<xsd:element name="sharedCommission" type="xsd:string" minOccurs="0"/>
> > 				<!-- Rental Information /-->
> > 				<xsd:element name="groundRent" type="xsd:decimal" minOccurs="0"/>
> > 				<!-- Value in GBP per annum /-->
> > 				<xsd:element name="serviceCharge" type="xsd:decimal" minOccurs="0"/>
> > 				<!-- Value in GBP per annum /-->
> > 				<xsd:element name="furnished" type="xsd:boolean" minOccurs="0"/>
> > 				<xsd:element name="rentalLength" type="xsd:int" minOccurs="0"/>
> > 				<!-- Tenure Information /-->
> > 				<xsd:element name="tenure" type="tenure_Type" default="" minOccurs="0"/>
> > 				<xsd:element name="leaseholdYearsRemaining" type="integerOrNull_Type" minOccurs="0"/>
> > 				<!-- Property Room Information /-->
> > 				<xsd:element name="bedrooms" type="integerRange_Type"/>
> > 				<xsd:element name="bathrooms" type="integerRange_Type"/>
> > 				<xsd:element name="receptionRooms" type="integerRange_Type"/>
> > 				<!-- Property Images, Supported types: JPG, PNG, GIF /-->
> > 				<xsd:element name="mainImage" type="asset_Type" minOccurs="0"/>
> > 				<!-- The file name of the image /-->
> > 				<xsd:element name="additionalImage1" type="asset_Type" minOccurs="0"/>
> > 				<xsd:element name="additionalImage2" type="asset_Type" minOccurs="0"/>
> > 				<xsd:element name="additionalImage3" type="asset_Type" minOccurs="0"/>
> > 				<xsd:element name="additionalImage4" type="asset_Type" minOccurs="0"/>
> > 				<!-- Floorplans, Up to four images ( JPG, PNG, GIF ) OR a single PDF /-->
> > 				<xsd:element name="floorPlan1" type="asset_Type" minOccurs="0"/>
> > 				<!-- The file name of the image /-->
> > 				<xsd:element name="floorPlan2" type="asset_Type" minOccurs="0"/>
> > 				<xsd:element name="floorPlan3" type="asset_Type" minOccurs="0"/>
> > 				<xsd:element name="floorPlan4" type="asset_Type" minOccurs="0"/>
> > 				<!-- Brochure, A single PDF /-->
> > 				<xsd:element name="brochure" type="asset_Type" minOccurs="0"/>
> > 				<!-- The file name of the pdf /-->
> > 				<!-- Virtual Tour -->
> > 				<xsd:element name="vTourURL" type="xsd:string" minOccurs="0"/>
> > 				<!-- URL to a virtual Tour -->
> > 				<!-- Virtual Tour -->
> > 				<xsd:element name="vTour2URL" type="xsd:string" minOccurs="0"/>
> > 				<!-- URL to a virtual Tour -->
> > 				<!-- HIP Document -->
> > 				<xsd:element name="HIPDocument" type="asset_Type" minOccurs="0"/>
> > 				<!-- Filename or URL to an HIP Document -->
> > 				<!-- EPC Document -->
> > 				<xsd:element name="EPCDocument" type="asset_Type" minOccurs="0"/>
> > 				<!-- Filename or URL to an EPC Document -->
> > 				<!-- Energy Efficiency Ratings -->
> > 				<xsd:element name="EERImage" type="asset_Type" minOccurs="0"/>
> > 				<xsd:element name="EERCurrent" type="xsd:integer" minOccurs="0"/>
> > 				<xsd:element name="EERPotential" type="xsd:integer" minOccurs="0"/>
> > 				<!-- Environment Impact Ratings -->
> > 				<xsd:element name="EIRImage" type="asset_Type" minOccurs="0"/>
> > 				<xsd:element name="EIRCurrent" type="xsd:integer" minOccurs="0"/>
> > 				<xsd:element name="EIRPotential" type="xsd:integer" minOccurs="0"/>
> > 				<!-- Optional Contact Information. If provided will be used instead of contact information of the agent branch -->
> > 				<xsd:element name="contactName" type="xsd:string" minOccurs="0"/>
> > 				<xsd:element name="contactTelephone" type="telephoneNumber_Type" minOccurs="0"/>
> > 				<xsd:element name="contactEmail" type="xsd:string" minOccurs="0"/>
> > 				<!-- Additional Record Information /-->
> > 				<xsd:element name="createdDate" type="xsd:dateTime" minOccurs="0"/>
> > 				<xsd:element name="modifiedDate" type="xsd:dateTime" minOccurs="0"/>
> > 				<xsd:element name="additionalKeywords" type="xsd:string" minOccurs="0"/>
> > 				<xsd:element name="notes" type="xsd:string" minOccurs="0"/>
> > 			</xsd:sequence>
> > 			<xsd:sequence>
> > 				<xsd:element name="delete" type="xsd:string" default="1" minOccurs="0"/>
> > 			</xsd:sequence>
> > 		</xsd:choice>
> > 		<xsd:attribute name="propertyID" type="xsd:string" use="required"/>
> > 	</xsd:complexType>
> > 	<xsd:complexType name="asset_Type">
> > 		<xsd:simpleContent>
> > 			<xsd:extension base="xsd:string">
> > 				<xsd:attribute name="modifiedDate" type="xsd:dateTime" use="optional"/>
> > 			</xsd:extension>
> > 		</xsd:simpleContent>
> > 	</xsd:complexType>
> > 	<!-- countryCode is always 2 alpha characters /-->
> > 	<xsd:simpleType name="countryCode_Type">
> > 		<xsd:restriction base="xsd:string">
> > 			<xsd:pattern value="[A-Za-z]{2}"/>
> > 		</xsd:restriction>
> > 	</xsd:simpleType>
> > 	<!-- priceCurrency is always 3 alpha characters /-->
> > 	<xsd:simpleType name="priceCurrency_Type">
> > 		<xsd:restriction base="xsd:string">
> > 			<xsd:pattern value="[A-Za-z]{3}"/>
> > 		</xsd:restriction>
> > 	</xsd:simpleType>
> > 	<!-- price,bedrooms,bathrooms, etc
> >          can be a string representation of an integer
> >          or an integer range of two integers separated by ' TO ' or ' - ' /-->
> > 	<xsd:simpleType name="integerRange_Type">
> > 		<xsd:restriction base="xsd:string">
> > 			<xsd:pattern value="([0-9]* ?(TO|-) ?[0-9]*|[0-9]*)"/>
> > 		</xsd:restriction>
> > 	</xsd:simpleType>
> > 	<xsd:simpleType name="integerOrNull_Type">
> > 		<xsd:restriction base="xsd:string">
> > 			<xsd:pattern value="[-0-9]*"/>
> > 		</xsd:restriction>
> > 	</xsd:simpleType>
> > 	<!-- Telephone can contain numbers, spaces, brackets, +'s and -'s /-->
> > 	<xsd:simpleType name="telephoneNumber_Type">
> > 		<xsd:restriction base="xsd:string">
> > 			<xsd:pattern value="[-0-9+ ()]*"/>
> > 		</xsd:restriction>
> > 	</xsd:simpleType>
> > 	<!-- agentGroupMode has a set list of possible values /-->
> > 	<xsd:simpleType name="agentGroupMode_Type">
> > 		<xsd:restriction base="xsd:string">
> > 			<xsd:enumeration value="FULL"/>
> > 			<xsd:enumeration value="INCR"/>
> > 			<!-- Full /-->
> > 			<!-- Incremental /-->
> > 		</xsd:restriction>
> > 	</xsd:simpleType>
> > 	<!-- pricePrefix has a set list of possible values /-->
> > 	<xsd:simpleType name="pricePrefix_Type">
> > 		<xsd:restriction base="xsd:string">
> > 			<xsd:enumeration value="F"/>
> > 			<xsd:enumeration value="I"/>
> > 			<xsd:enumeration value="O"/>
> > 			<xsd:enumeration value="A"/>
> > 			<xsd:enumeration value="S"/>
> > 			<xsd:enumeration value="R"/>
> > 			<xsd:enumeration value="B"/>
> > 			<xsd:enumeration value="G"/>
> > 			<xsd:enumeration value="P"/>
> > 			<xsd:enumeration value="W"/>
> > 			<xsd:enumeration value="M"/>
> > 			<xsd:enumeration value="N"/>
> > 			<!-- Asking price of /-->
> > 			<!-- Offers in the region of /-->
> > 			<!-- Offers in excess of /-->
> > 			<!-- Auction guide price of /-->
> > 			<!-- Subject to contract /-->
> > 			<!-- Price range of /-->
> > 			<!-- Prices from /-->
> > 			<!-- Guide price /-->
> > 			<!-- Price on Application /-->
> > 			<!-- Weekly rental of /-->
> > 			<!-- Monthly rental of /-->
> > 			<!-- Annual rental of /-->
> > 		</xsd:restriction>
> > 	</xsd:simpleType>
> > 	<!-- sellingState has a set list of possible values /-->
> > 	<xsd:simpleType name="sellingState_Type">
> > 		<xsd:restriction base="xsd:string">
> > 			<xsd:enumeration value="V"/>
> > 			<xsd:enumeration value="U"/>
> > 			<xsd:enumeration value="H"/>
> > 			<xsd:enumeration value="N"/>
> > 			<xsd:enumeration value="S"/>
> > 			<xsd:enumeration value="L"/>
> > 			<!-- Viewing /-->
> > 			<!-- Under offer /-->
> > 			<!-- Hidden /-->
> > 			<!-- New Instruction /-->
> > 			<!-- Sold /-->
> > 			<!-- Let /-->
> > 		</xsd:restriction>
> > 	</xsd:simpleType>
> > 	<!-- propertyType has a set list of possible values /-->
> > 	<xsd:simpleType name="propertyType_Type">
> > 		<xsd:restriction base="xsd:string">
> > 			<xsd:enumeration value="H"/>
> > 			<xsd:enumeration value="F"/>
> > 			<xsd:enumeration value="A"/>
> > 			<!-- House /-->
> > 			<!-- Flat /-->
> > 			<!-- Agricultural  /-->
> > 		</xsd:restriction>
> > 	</xsd:simpleType>
> > 	<!-- saleOrRent has a set list of possible values /-->
> > 	<xsd:simpleType name="saleOrRent_Type">
> > 		<xsd:restriction base="xsd:string">
> > 			<xsd:enumeration value="S"/>
> > 			<xsd:enumeration value="R"/>
> > 			<!-- Sale /-->
> > 			<!-- Rent /-->
> > 		</xsd:restriction>
> > 	</xsd:simpleType>
> > 	<!-- tenure has a set list of possible values /-->
> > 	<xsd:simpleType name="tenure_Type">
> > 		<xsd:restriction base="xsd:string">
> > 			<xsd:enumeration value="F"/>
> > 			<xsd:enumeration value="S"/>
> > 			<xsd:enumeration value="L"/>
> > 			<xsd:enumeration value="X"/>
> > 			<xsd:enumeration value=""/>
> > 			<!-- Freehold /-->
> > 			<!-- Share of freehold /-->
> > 			<!-- Leasehold /-->
> > 			<!-- Not Specified /-->
> > 			<!-- Not Specified /-->
> > 		</xsd:restriction>
> > 	</xsd:simpleType>
> > </xsd:schema>
> 
> --
> CTO, DecisionSoft Limited
> +44 1865 203192 / +44 7968 408138
> 
> 
> 

