Altova Mailing List Archives


Re: Conditional Levels of a Schema

From: "C. M. Sperberg-McQueen" <cmsmcq@-------------.--->
To: Dieter Menne <dieter.menne@------------.-->
Date: 4/6/2009 4:08:00 PM

On 2 Apr 2009, at 12:05 , Dieter Menne wrote:

> Hi,
>
> we are currently defining a format for medical data storage
> (hrmconsensus.org). The full version is available at
> http://hrmconsensus.org/media/hrm/xhrm/xhrm02/xhrm0_2.xsd .
>
> In the simplified example below, we have the always mandatory
> deviceType. For patientsType, we would like to have a global
> conditional switch so that three flavors are possible
>
> -- minOccurs = "0" for internal clinical use
> -- minOccurs = "1" for archiving, must contain patient info
> -- minOccurs = "never" anonymized, must not contain patient info

I may be being dense, but it's not clear to me what your requirement
is.  Is it that

(A) You want the internal clinical systems to use a schema with

   <xs:element name="patients" type="patientsType" minOccurs="0"/>

while the archival system uses

   <xs:element name="patients" type="patientsType" minOccurs="1"/>

while tools and data flows for anonymized data should use

   <xs:element name="patients" type="patientsType" maxOccurs="0"/>

?  In other words, you want to work with three related but different
schemas?

Or is it that

(B) based on some signal in the XML, the 'patients' element must occur,
must not occur, or may occur?

You don't seem to mention any visible signal in the XML, so I'm
guessing it's not B.


> I know that the latter is not possible, that conditionals are not
> supported in XSL,

I'm not sure what you mean by that.  There are many conditions one
can check with the subset of regular languages which XSD uses for
content models.  It's true that to check conditions with a content
model you may need to write the content model in a particular way.

> and that Schematron would be an alternative.  Note that the
> conditionals occur in several nesting levels, so that we cannot easily
> combine versions of a master element with details, but they are
> always of the type "may", "must", "must not".

I'm not sure what you mean by this.

> We would like to avoid having several xsd files and prefer a common
> file with branching.

Is this (a) in order to avoid redundancy and eliminate the problem
of inconsistent updates during maintenance of the schema document(s)?
Or (b) because there are some important consumers of your work (maybe
potential users, maybe your bosses, maybe ISO Pascal programmers) who
might, you suspect, find it too hard to grasp the idea of a schema
made up by consulting more than one file at schema construction time?
Or (c) because you have no control over the schema processors to
be used with this schema, and you do not believe that xsd:include
is sufficiently interoperable to be relied upon? Or (d) because
you believe in your hearts that you are defining a single language
here, and you want to make that fact manifest by producing a single
schema document?  (In this case, there is the troubling fact that
the 'patients' element follows three different syntactic rules based
not on syntactic context but based on application context, which
suggests that formally speaking you really are defining not one
language, but three.) Or (e) for some other reason?

Any of these can be a plausible reason (so forgive me if my tone
seems flippant or dismissive -- no offense to you intended), but
what you need to do may vary a lot depending on which reason you have.

> Any ideas or references to ideas are appreciated.

Some possibilities that occur to me off the top of my head.

(1) You single-source the schema document using a literate
programming system (or a macro processor).  So you have eliminated
the inconsistent-maintenance problem.  From your single source
you generate three schema documents, called clinical.xsd,
archival.xsd, and anonymized.xsd.  The appropriate tools and
systems use the appropriate schema document.

The suggestions made by Michael Kay and Pete Cordell both fall
into this category, I think.
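
A minimal sketch of one way to do the single-sourcing in (1), written
as an XSLT 1.0 stylesheet.  It is an illustration under stated
assumptions, not necessarily what either of those suggestions had in
mind: the parameter name 'flavor' and its three values are made up
here, and the stylesheet rewrites every declaration named 'patients'
in a master schema document, copying everything else unchanged.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical switch: 'clinical' (default), 'archival', 'anonymized' -->
  <xsl:param name="flavor" select="'clinical'"/>

  <!-- Identity transform: copy everything not matched below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rewrite the occurrence bounds on each 'patients' declaration -->
  <xsl:template match="xs:element[@name='patients']">
    <xsl:copy>
      <!-- Keep all attributes except the two being regenerated -->
      <xsl:apply-templates
          select="@*[name()!='minOccurs' and name()!='maxOccurs']"/>
      <xsl:choose>
        <xsl:when test="$flavor='archival'">
          <!-- must occur; maxOccurs is left at its default of 1 -->
          <xsl:attribute name="minOccurs">1</xsl:attribute>
        </xsl:when>
        <xsl:when test="$flavor='anonymized'">
          <!-- must not occur -->
          <xsl:attribute name="minOccurs">0</xsl:attribute>
          <xsl:attribute name="maxOccurs">0</xsl:attribute>
        </xsl:when>
        <xsl:otherwise>
          <!-- clinical: may occur -->
          <xsl:attribute name="minOccurs">0</xsl:attribute>
        </xsl:otherwise>
      </xsl:choose>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With xsltproc, for example, the archival variant could be produced by
something like

   xsltproc --stringparam flavor archival variant.xsl master.xsd

where variant.xsl and master.xsd are placeholder file names.  If the
conditionals sit at several nesting levels, the match pattern would
need to name each affected declaration.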

(2) A particular variant of the preceding.  In the main schema
document, the relevant declaration reads

   <xs:element name="patients" type="patientsType"
       minOccurs="&patients.minOccurs;"
       maxOccurs="&patients.maxOccurs;"
   />

And the document begins

   <!DOCTYPE xs:schema SYSTEM ... >

By whatever means you choose, the different tools use different
entity declarations for patients.minOccurs and patients.maxOccurs.
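
A sketch of what (2) might look like with the entity declarations
placed in the internal DTD subset, which is the simplest case to show
in one piece.  The 'record' element and the two empty type definitions
are placeholders invented for the example.  To keep a single main
document, the same declarations would instead live in three small
external files selected through the SYSTEM identifier.  Note that this
relies on the schema processor's XML parser expanding entities; parsers
configured to reject DOCTYPE declarations will refuse the document.

```xml
<?xml version="1.0"?>
<!DOCTYPE xs:schema [
  <!-- Clinical flavor: 'patients' may occur.
       Archival would declare "1" and "1";
       anonymized would declare "0" and "0". -->
  <!ENTITY patients.minOccurs "0">
  <!ENTITY patients.maxOccurs "1">
]>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Placeholder types, standing in for the real definitions -->
  <xs:complexType name="deviceType"/>
  <xs:complexType name="patientsType"/>

  <!-- Hypothetical parent element -->
  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="device" type="deviceType"/>
        <xs:element name="patients" type="patientsType"
            minOccurs="&patients.minOccurs;"
            maxOccurs="&patients.maxOccurs;"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```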

(3) You declare that the syntactic rule in the language you are
defining is that 'patients' may occur optionally, and specify that
it is up to application-level checking to ensure that each
of the three applications you have described checks to see that
'patients' occurs, or does not occur, as prescribed.  (That is,
you kick the problem over to the business rule people and tell
them it's their problem not yours.)
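
Since Schematron was mentioned as an alternative, here is a sketch of
how the application-level check in (3) could be expressed with
Schematron phases, assuming ISO Schematron and a hypothetical parent
element named 'record'.  The XSD itself would declare 'patients' with
minOccurs="0"; each application then validates with the phase that
matches its role.  The phase and pattern names are invented for the
example.

```xml
<?xml version="1.0"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">

  <!-- Clinical use needs no extra rule: the XSD already makes
       'patients' optional.  Validate with one of the phases below
       rather than with all patterns active, because the two
       patterns contradict each other. -->
  <sch:phase id="archival">
    <sch:active pattern="patients-required"/>
  </sch:phase>
  <sch:phase id="anonymized">
    <sch:active pattern="patients-forbidden"/>
  </sch:phase>

  <sch:pattern id="patients-required">
    <!-- 'record' is a hypothetical parent element name -->
    <sch:rule context="record">
      <sch:assert test="patients">
        An archival record must contain patient info.
      </sch:assert>
    </sch:rule>
  </sch:pattern>

  <sch:pattern id="patients-forbidden">
    <sch:rule context="record">
      <sch:assert test="not(patients)">
        An anonymized record must not contain patient info.
      </sch:assert>
    </sch:rule>
  </sch:pattern>

</sch:schema>
```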

(4) You wrap 'patients' in an enclosing element whose name indicates
which of the three rules the instance document is supposed to
be following at the moment.  So the sequence which now contains
deviceType and patients now reads instead:

    <xsd:sequence>
     <xsd:element name="device" type="deviceType"/>
     <xsd:choice>
      <xsd:element name="clinicalpatients">
       <xsd:complexType>
        <xsd:sequence>
         <xsd:element name="patients" type="patientsType" minOccurs="0"/>
        </xsd:sequence>
       </xsd:complexType>
      </xsd:element>
      <xsd:element name="archivalpatients">
       <xsd:complexType>
        <xsd:sequence>
         <xsd:element name="patients" type="patientsType" minOccurs="1"/>
        </xsd:sequence>
       </xsd:complexType>
      </xsd:element>
      <xsd:element name="anonymizedpatients">
       <xsd:complexType>
        <xsd:sequence/>
       </xsd:complexType>
      </xsd:element>
     </xsd:choice>
    </xsd:sequence>

The systems which transfer records from the clinical applications to
the archiving application, or to applications using anonymized data,
are responsible for changing the wrapper, which thus becomes a visible
signal that the record has been touched by the transfer application.
(This may be useful in debugging records transfer problems.)

(5) You get rid of the nesting and simply replace 'patients'
with three flavors of patients, all using the same type but
with different occurrence requirements.  Your sequence now becomes

    <xsd:sequence>
     <xsd:element name="device" type="deviceType"/>
     <xsd:choice>
      <xsd:element name="clinicalpatients" type="patientsType" minOccurs="0"/>
      <xsd:element name="archivalpatients" type="patientsType" minOccurs="1"/>
      <xsd:element name="anonymizedpatients">
       <xsd:complexType>
        <xsd:sequence/>
       </xsd:complexType>
      </xsd:element>
     </xsd:choice>
    </xsd:sequence>

Again the records transfer tools are responsible for changing the
name of the element in order to signal that they have done their work.

If you really want to document that 'clinicalpatients' and
'archivalpatients' and 'anonymizedpatients' are all really just
flavors of 'patients', by all means define an abstract 'patients'
element and make them all substitutable for it.
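
A sketch of the abstract-element variant, under some assumptions: the
three flavors become top-level declarations so that they can join a
substitution group, the abstract head is given no type (so it defaults
to xsd:anyType, which lets the empty 'anonymizedpatients' type join as
well), and the content model refers to the three members individually
so that each keeps its own occurrence rule.  The 'record' element and
the empty placeholder types are invented for the example.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Placeholder types, standing in for the real definitions -->
  <xsd:complexType name="deviceType"/>
  <xsd:complexType name="patientsType"/>

  <!-- Abstract head: never appears in instances, only documents
       that the three flavors belong together -->
  <xsd:element name="patients" abstract="true"/>

  <xsd:element name="clinicalpatients" type="patientsType"
      substitutionGroup="patients"/>
  <xsd:element name="archivalpatients" type="patientsType"
      substitutionGroup="patients"/>
  <xsd:element name="anonymizedpatients" substitutionGroup="patients">
    <xsd:complexType>
      <xsd:sequence/>
    </xsd:complexType>
  </xsd:element>

  <!-- Hypothetical parent element -->
  <xsd:element name="record">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="device" type="deviceType"/>
        <xsd:choice>
          <xsd:element ref="clinicalpatients" minOccurs="0"/>
          <xsd:element ref="archivalpatients"/>
          <xsd:element ref="anonymizedpatients"/>
        </xsd:choice>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

Here the substitution group serves as documentation only: a reference
to the head itself would apply one occurrence rule to all three
members, which is not what is wanted.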

(6) You put an appropriate flag into the content model not as a
wrapper around 'patients' but as a preceding sibling:

    <xsd:sequence>
     <xsd:element name="device" type="deviceType"/>
     <xsd:choice>
      <xsd:sequence>
       <xsd:element name="clinical" type="our:flavor" minOccurs="1"/>
       <xsd:element name="patients" type="patientsType" minOccurs="0"/>
      </xsd:sequence>
      <xsd:sequence>
       <xsd:element name="archival" type="our:flavor" minOccurs="1"/>
       <xsd:element name="patients" type="patientsType" minOccurs="1"/>
      </xsd:sequence>
      <xsd:sequence>
       <xsd:element name="anonymized" type="our:flavor" minOccurs="1"/>
      </xsd:sequence>
     </xsd:choice>
    </xsd:sequence>
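
To make the flag-as-sibling idea in (6) concrete, instances of the
three flavors might look like the following.  This is an illustration
only: the 'examples' wrapper exists just to put three records in one
well-formed document, 'record' is a hypothetical parent element, and
the elements are shown empty on the assumption that the real types
(including our:flavor) are filled in elsewhere.

```xml
<?xml version="1.0"?>
<examples>

  <!-- Clinical: flag present, 'patients' optional (omitted here) -->
  <record>
    <device/>
    <clinical/>
  </record>

  <!-- Archival: flag present, 'patients' required -->
  <record>
    <device/>
    <archival/>
    <patients/>
  </record>

  <!-- Anonymized: flag present, 'patients' not allowed -->
  <record>
    <device/>
    <anonymized/>
  </record>

</examples>
```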

Which of these seems most appealing will depend on a lot of things,
including what it is you really want when you say you want a
conditional, and possibly including also what you think the other
tools you work with are going to be capable of doing.

I hope this helps.

Michael Sperberg-McQueen


-- 
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************




