Altova Mailing List Archives


Re: [xml-dev] datatype functionality I'd like to see

From: Jeni Tennison <jeni@------------.--->
To: bry@------.---
Date: 7/8/2004 1:16:00 PM

Hi,

> Perhaps I am uninformed however, can anyone think of any particular
> schema language one can do this in, and if you are the person who
> knows of such a language can you give me an example if possible.
> (not that it's something I need to do, just something I thought
> would be extremely useful to be able to do at some point)

This was one of the features of the datatype library language that I
have been working on [1]. You could do something like (bearing in mind
I don't know how SSNs actually work):

  <define name="Digit">
    <charGroup><range from="0" to="9" /></charGroup>
  </define>

  <datatype name="SSN">
    <parse>
      <group name="state">
        <repeat exactly="3"><ref name="Digit" /></repeat>
      </group>
      <string>-</string>
      <group name="individual">
        <repeat exactly="2"><ref name="Digit" /></repeat>
        <string>-</string>
        <repeat exactly="4"><ref name="Digit" /></repeat>
      </group>
    </parse>
    ...
  </datatype>

and in the rest of the datatype definition you'd work with a tree
containing <state> and <individual> elements. For example, the SSN
123-12-1234 would become:

  <SSN><state>123</state>-<individual>12-1234</individual></SSN>
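
To make the parse step concrete, here is a rough Python sketch of what
the <parse> definition above describes. It is not the datatype library
implementation: the helper names (take_digits, take_literal, parse_ssn)
are made up for illustration, and xml.etree.ElementTree is just one
convenient way to hold the resulting tree.

```python
import xml.etree.ElementTree as ET

DIGITS = "0123456789"  # mirrors the "Digit" charGroup (range 0 to 9)


def take_digits(text, pos, count):
    """Consume exactly `count` digits at `pos`; return (digits, new_pos)."""
    chunk = text[pos:pos + count]
    if len(chunk) != count or any(ch not in DIGITS for ch in chunk):
        raise ValueError(f"expected {count} digits at offset {pos}")
    return chunk, pos + count


def take_literal(text, pos, literal):
    """Consume a fixed string at `pos`; return (literal, new_pos)."""
    if not text.startswith(literal, pos):
        raise ValueError(f"expected {literal!r} at offset {pos}")
    return literal, pos + len(literal)


def parse_ssn(text):
    """Parse text into an <SSN> element with <state> and <individual> children."""
    pos = 0
    state, pos = take_digits(text, pos, 3)      # group "state"
    sep, pos = take_literal(text, pos, "-")     # literal between the groups
    start = pos                                 # group "individual" starts here
    _, pos = take_digits(text, pos, 2)
    _, pos = take_literal(text, pos, "-")
    _, pos = take_digits(text, pos, 4)
    if pos != len(text):
        raise ValueError(f"unexpected trailing text at offset {pos}")

    root = ET.Element("SSN")
    state_el = ET.SubElement(root, "state")
    state_el.text = state
    state_el.tail = sep                         # the "-" sits between the children
    individual_el = ET.SubElement(root, "individual")
    individual_el.text = text[start:pos]
    return root


print(ET.tostring(parse_ssn("123-12-1234"), encoding="unicode"))
```

The literal "-" between the two groups is kept as text between the
child elements, which matches the mixed-content tree shown above.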

At http://www.jenitennison.com/datatypes/#implementation, there's an
implementation that transforms the datatype library syntax into an
XSLT 2.0 stylesheet containing a set of functions for each datatype.
You could probably use this with Schematron: declare the datatypes in
the Schematron schema and then use them in the test expressions, as
long as you were happy using an XSLT 2.0 processor. I haven't pursued
that, though.
  
I'm currently in the process of revising the language I initially came
up with so that (among other changes) you can just use named
subexpressions within a regular expression; something like:

  <datatype name="SSN">
    <format>
      <regex>(?[state][0-9]{3})-(?[individual][0-9]{2}-[0-9]{4})</regex>
    </format>
    ...
  </datatype>

or use other (extensible) methods for expressing the format of a
value, such as BNF or PEGs or whatever the particular datatype library
processor understands, but it's all work in progress...
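
As an illustration of the named-subexpression idea only, the same
split can be sketched with Python's re module. Note the assumptions:
Python spells named groups (?P<name>...), which is not the
(?[name]...) syntax proposed above, and the function name split_ssn
is hypothetical.

```python
import re

# Same structure as the <regex> above, rewritten in Python's named-group syntax.
SSN_PATTERN = re.compile(r"(?P<state>[0-9]{3})-(?P<individual>[0-9]{2}-[0-9]{4})")


def split_ssn(value):
    """Return a dict of the named parts, or None if the value does not match."""
    match = SSN_PATTERN.fullmatch(value)
    return match.groupdict() if match else None


print(split_ssn("123-12-1234"))
```

Each named group plays the role of one of the <state> and
<individual> elements in the tree form, so the rest of a datatype
definition could refer to the parts by name.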

Cheers,

Jeni

[1] http://www.jenitennison.com/datatypes/

---
Jeni Tennison
http://www.jenitennison.com/

