Newsgroup: comp.text.xml
Subject: Re: BBC news story: Judge bans Microsoft Word sales
Date: 8/19/2009 9:50:00 AM

On Aug 17, 5:31 pm, Piet van Oostrum <p...@cs.uu.nl> wrote:
> >>>>> Pete Becker <p...@versatilecoding.com> (PB) wrote:
> >PB> The cited news article is rather superficial. Be careful about drawing
> >PB> conclusions about how the legal system works from reading such sources.
> >PB> They're often wrong.
> >PB> The patent itself was filed in 1994 (not 1998, as the article says) and
> >PB> issued in 1998. It mentions SGML (the parent of XML) in several places, and
> >PB> says that the method at issue is fundamentally different because it does
> >PB> not put structural information in the data stream. More particularly:
> >PB>      Thus, in sharp contrast to the prior art the present
> >PB>      invention is based on the practice of separating encoding
> >PB>      conventions from the content of a document. The invention
> >PB>      does not use embedded metacoding to differentiate the content
> >PB>      of the document, but rather, the metacodes of the document are
> >PB>      separated from the content and held in distinct storage in a
> >PB>      structure called a metacode map, whereas document content is
> >PB>      held in a mapped content area. Raw content is an extreme
> >PB>      example of mapped content wherein the latter is totally
> >PB>      unstructured and has no embedded metacodes in the data stream.
> >PB> That doesn't sound like a description of XML.
>
> Well, read the whole patent. What they do is process a document with
> embedded markup (like troff, SGML, XML, or maybe even TeX) in such a way
> that inside the program the markup is separated from the plain text.
> The external representation is still the marked up text. So it does apply to
> XML. This is quite a primitive way of parsing the markup. It is just
> scanning the input until you find a tag (called metacode in the patent),
> copying the text before the tag to an output area, and copying the tag
> to a list of tags (called a metacode map in the patent). So compared to
> modern parsing techniques there are two differences: (1) nowadays you
> usually build a parse tree; they have just a degenerate tree (only a
> list). (2) usually the plain text is put in the leaves of the tree; they
> have the text in one contiguous area, and the `parse tree' contains
> pointers or indices to this area.
>
> The advantage of their structure comes when you need more than one tag
> structure on top of the text: for example when you both have the
> hierarchical XML structure and a structure with lines and pages.
>
> SGML has the possibility of having more than one structure in the same
> document and that fact is mentioned in the patent.
>
> The only innovative idea in the patent is this separation because it
> makes it easier to do editing on the document when you have more than
> one structure on top of it. And I don't know how innovative it is
> because once you need to edit a marked up text with more than one (markup)
> structure on top of it, this is quite a logical choice. And moreover
> ideas cannot be patented, so the idea doesn't count (but IANAL).
>
> Once you have this idea, implementing it is peanuts. You could give this
> to any student that attends a beginner's programming course when they
> have had strings, arrays and loops, and they should be able to solve it.
>
> So the patent is about the transformation of the marked up text to the
> separated data structure and v.v. and about calculating another
> structure from the first one, plus some minor other things. I find it
> really silly that you can get a patent for this kind of thing.
>
> I am writing a small Python program that illustrates the patented
> algorithms.
> --
> Piet van Oostrum <p...@cs.uu.nl>
> URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
> Private email: p...@vanoostrum.org

Isn't this very, very similar to the weave and tangle system used in
LaTeX/TeX?
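
For reference, the scan-and-split step described in the quoted message (copy the text before each tag to an output area, copy the tag to a list) can be sketched in a few lines of Python. This is only an illustration of that description, not the patent's actual algorithm and not the program Piet mentions. The names `split_markup` and `join_markup` are made up here, and the sketch assumes a tag is simply anything between `<` and `>`.

```python
import re

# Assumption: a "metacode" is anything of the form <...>. Real markup
# (comments, CDATA, attribute values containing '>') needs a real parser.
TAG = re.compile(r"<[^>]*>")


def split_markup(marked_up):
    """Separate tags from text (hypothetical helper).

    Returns (content, metacode_map): content is the plain text in one
    contiguous string, metacode_map is a list of (offset, tag) pairs,
    where offset is the position in content at which the tag stood.
    """
    pieces = []
    metacode_map = []
    offset = 0  # length of the plain text emitted so far
    pos = 0     # read position in the input
    for match in TAG.finditer(marked_up):
        text = marked_up[pos:match.start()]
        pieces.append(text)
        offset += len(text)
        metacode_map.append((offset, match.group()))
        pos = match.end()
    pieces.append(marked_up[pos:])
    return "".join(pieces), metacode_map


def join_markup(content, metacode_map):
    """Inverse direction: re-insert each tag at its recorded offset."""
    out = []
    pos = 0
    for offset, tag in metacode_map:
        out.append(content[pos:offset])
        out.append(tag)
        pos = offset
    out.append(content[pos:])
    return "".join(out)


if __name__ == "__main__":
    doc = "<p>Hello <b>world</b>!</p>"
    content, tags = split_markup(doc)
    print(content)
    print(tags)
    print(join_markup(content, tags) == doc)
```

Because the text lives in one string and the tags only carry offsets into it, a second list of (offset, tag) pairs, say for line or page breaks, can be laid over the same content without touching the first one. That is the multiple-structure case mentioned in the quoted message.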
