Re: Structuring flat data
Date: 3/5/2006 10:13:00 PM

ariana_paris@y... wrote:

> Cutting a long story short, I have some files in a rather flat XML
> structure which I now want to upgrade to a more sophisticated schema.

The problem of encapsulation can be solved using SGML if you can write a
DTD which allows missing start-tags and end-tags on the right selection
of elements. For example:

    <!DOCTYPE xml [
    <!ELEMENT xml - - (chapter+)>
    <!ELEMENT chapter O O (title?,section+)>
    <!ELEMENT section O O (p|break)*>
    <!ELEMENT (title,p) - - (#PCDATA)>
    <!ELEMENT break - O EMPTY>
    ]>
    <xml>
    <title>Title 1</title>
    <p>Aaaaaa</p>
    <break>
    <p>Bbbbbb</p>
    <title>Title 2</title>
    <p>Cccccc</p>
    </xml>

(converting the <break/> to the SGML format <break>). Running this
through OSGMLNORM (part of SP, same place onsgmls comes from) using the
SGML Declaration for DocBook 3 with NAMECASE GENERAL NO gives:

    <xml>
      <chapter>
        <title>Title 1</title>
        <section>
          <p>Aaaaaa</p>
          <break>
          <p>Bbbbbb</p>
        </section>
      </chapter>
      <chapter>
        <title>Title 2</title>
        <section>
          <p>Cccccc</p>
        </section>
      </chapter>
    </xml>

You just need to convert <break> back to <break/> afterwards. However,
you would need to ensure that all non-ASCII characters are given as
numeric character references, or do some deep surgery on the SGML
Declaration to allow non-ASCII characters.

> The files can have the following structures:
>
> Example 1:
> <xml>
> <p>blablabla</p>
> </xml>

Gives:

    <xml>
      <chapter>
        <section>
          <p>blablabla</p>
        </section>
      </chapter>
    </xml>

> Example 2:
> <xml>
> <p>Aaaaaa</p>
> <break/>
> <p>Bbbbbb</p>
> </xml>

Gives:

    <xml>
      <chapter>
        <section>
          <p>Aaaaaa</p>
          <break>
          <p>Bbbbbb</p>
        </section>
      </chapter>
    </xml>

> The point being that the files can have various combinations of titles
> and breaks, or none at all.
> I'm hoping I can get a one-pass solution

Modulo the caveats on content models and character encoding, this should
work if you get the DTD right. SGML does still have its uses :-)

///Peter
--
XML FAQ: http://xml.silmaril.ie/
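
The two text conversions around the normalizer run (turning <break/> into <break> beforehand and back afterwards, and replacing non-ASCII characters with numeric character references) could be scripted roughly as follows. This is only a sketch in Python, which is not part of the original post. The function names `xml_to_sgml` and `sgml_to_xml` are made up for illustration, `break` is assumed to be the only empty element, and the OSGMLNORM run itself is deliberately left out.

```python
import re


def xml_to_sgml(text: str) -> str:
    """Prepare flat XML for the SGML normalizer (hypothetical helper).

    Assumes <break/> is the only empty element in the input.
    """
    # XML empty-element syntax -> SGML start-tag with omitted end-tag.
    text = re.sub(r"<break\s*/>", "<break>", text)
    # Replace every non-ASCII character with a decimal numeric
    # character reference, e.g. "é" becomes "&#233;".
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def sgml_to_xml(text: str) -> str:
    """Restore XML empty-element syntax in the normalized output."""
    # Only a bare <break> matches; an existing <break/> is left alone.
    return re.sub(r"<break\s*>", "<break/>", text)
```

Call `xml_to_sgml` on the file contents before prepending the DOCTYPE and running the normalizer, then `sgml_to_xml` on the result. The regular expressions only handle a `break` element without attributes, which matches the examples in this thread.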
