Altova Mailing List Archives > microsoft.public.xsl

Re: Structuring flat data
To: NULL
Date: 3/6/2006 9:54:00 PM

Ariana wrote:

> Thank you very much for the answer. There is a slight misunderstanding
> on what I want to do with the breaks. For example, my example 2:
>
>> Example 2:
>> <xml>
>> <p>Aaaaaa</p>
>> <break/>
>> <p>Bbbbbb</p>
>> </xml>
>
> should be transformed as follows:
>
> <xml>
> <section>
> <p>Aaaaaa</p>
> </section>
> <section>
> <p>Bbbbbb</p>
> </section>
> </xml>
>
> In other words, <section> elements should contain everything that
> starts and/or finishes with a <break/> element.

Ah. OK, that changes the rules significantly.

> And therein lies the
> rub, since the <break/> element only appears in between "sections", so
> it is difficult to programmatically find the start of the first section
> and the end of the last one.

This is where the SGML solution still works: if you modify the content
model for section...

<!DOCTYPE xml [
<!ELEMENT xml - - (chapter+)>
<!ELEMENT chapter O O (title?,section+)>
<!ELEMENT section O O (break?,p+)>
<!ELEMENT (title,p) - - (#PCDATA)>
<!ELEMENT break - O EMPTY>
]>
<xml>
<p>Aaaaaa</p>
<break>
<p>Bbbbbb</p>
</xml>

Now a break element will cause a new section:

<xml>
<chapter>
<section>
<p>Aaaaaa</p>
</section>
<section>
<break>
<p>Bbbbbb</p>
</section>
</chapter>
</xml>

You will of course have to remove the unwanted break elements
afterwards, but that is easy enough to script.

> The caveat about character encoding suggests that this may not be the
> best solution in any case, since several of the files are in Spanish
> and therefore contain a very high proportion of non-ASCII characters.

But they are easy to change into numeric character references or even
character entity references. A simple sed script can do that very fast.
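As a rough illustration of the two clean-up steps mentioned above (dropping the leftover break elements and turning non-ASCII characters into numeric character references), here is a minimal Python sketch, used in place of the sed script suggested. The function names `strip_breaks` and `to_char_refs` are made up for this example. It assumes the normalized markup is already available as a string and that break tags appear only as `<break>`, `<break/>` or `<break />` in either case, since SGML normalizers often upper-case element names.

```python
import re

# Assumption: leftover break tags look like <break>, <break/> or <break />,
# possibly upper-cased by the SGML normalizer. Trailing whitespace is eaten
# so that no empty line is left behind.
BREAK_TAG = re.compile(r"<break\s*/?>\s*", re.IGNORECASE)


def strip_breaks(markup: str) -> str:
    """Remove break elements once they have done their job of splitting sections."""
    return BREAK_TAG.sub("", markup)


def to_char_refs(text: str) -> str:
    """Replace each non-ASCII character with a decimal numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    sample = "<section>\n<break>\n<p>Año nuevo</p>\n</section>"
    print(to_char_refs(strip_breaks(sample)))
```

The character step should run before the files go through the SGML parser and the break step after it. They are chained here only to keep the demonstration short.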

> But aside from that, the SGML solution you proposed is interesting -
> it's good to know it still has its uses even though I haven't used it
> for years (over 10, in fact - boy, I feel old :)

Hehe. Still life in the old dog yet.

///Peter
