Altova Mailing List Archives


Re: BBC news story: Judge bans Microsoft Word sales

From: Peter Flynn <peter.nosp@-.--------.-->
To: NULL
Date: 8/17/2009 9:44:00 PM
Pete Becker wrote:
> SD_Data_Dude wrote:
>> On Aug 12, 3:30 pm, Joseph Wright <joseph.wri...@morningstar2.co.uk>
>> wrote:
>>> On Aug 12, 8:51 pm, Jonathan Fine <jf...@pytex.org> wrote:
>>>
>>>> http://news.bbc.co.uk/1/hi/technology/8197990.stm
>>>> ===
>>>> I4i filed a patent in 1998 that outlined a means for "manipulating the
>>>> architecture and the content of a document separately from each other"
>>>> invoking XML as a means allowing users to format text documents.
>>>> XML is also used extensively among other word-processing programs such
>>>> as OpenOffice.
>>>> ===
>>>> -- 
>>>> Jonathan
>>> That is "interesting". I suspect that lots of people will want to
>>> support MS here: I doubt many people fancy paying fees to produce XML
>>> (or use it, or ...).
>>> -- 
>>> Joseph Wright
>>
>> This is, again, another evil and wrong use of the patenting system.
>> There is no "unique art" to this idea.  It's akin to patenting the
>> letter "q".
>>
>> The patent system should be reformed, and these crappy patents of the
>> blatantly obvious need to be eliminated.
> 
> The cited news article is rather superficial. Be careful about drawing 
> conclusions about how the legal system works from reading such sources. 
> They're often wrong.

Indeed. The application of the US patent system may be seriously flawed, 
but the point at issue is a little more complex than that.

[Warning: cross-posted to c.t.x]

This was (unsurprisingly) discussed in some detail in the hallways at 
the Balisage (markup) conference in Montreal last week.

> The patent itself was filed in 1994 (not 1998, as the article says) and 
> issued in 1998. It mentions SGML (the parent of XML) in several places, 
> and says that the method at issue is fundamentally different because it 
> does not put structural information in the data stream. More particularly:
> 
>     Thus, in sharp contrast to the prior art the present
>     invention is based on the practice of separating encoding
>     conventions from the content of a document. The invention
>     does not use embedded metacoding to differentiate the content
>     of the document, but rather, the metacodes of the document are
>     separated from the content and held in distinct storage in a
>     structure called a metacode map, whereas document content is
>     held in a mapped content area. Raw content is an extreme
>     example of mapped content wherein the latter is totally
>     unstructured and has no embedded metacodes in the data stream.
> 
> That doesn't sound like a description of XML.

It's not. It's largely garbage. Let's look a little more closely...

>> Thus, in sharp contrast to the prior art the present invention is 
>> based on the practice of separating encoding conventions from the 
>> content of a document.

By "encoding conventions" they appear to mean the specification of the 
method of encoding applied. Keeping these separate from the document 
content is not "in sharp contrast" to the prior art at all: SGML does 
precisely this in the SGML Declaration for a given DTD. On the other 
hand if by "encoding conventions" they just mean "markup", then they are 
equally incorrect: out-of-line markup ("standoff" markup) does precisely 
that, and has been in use as prior art for decades.

>> The invention does not use embedded metacoding to differentiate the
>> content of the document, 

To "differentiate the content of the document" from what, exactly? 
Omitting the indirect object from this sentence renders it null. Perhaps 
they mean "differentiate the document content from the markup"; if we 
allow them this latitude, then they are implying that they do not 
intersperse markup and character data ("mixed content"), which 
contradicts their earlier claim.

>> but rather, the metacodes of the document are separated from the
>> content and held in distinct storage in a structure called a
>> metacode map,

This tends to support that reading of the previous sentence: that they 
are using out-of-line markup.

>> whereas document content is held in a mapped content area. 

Ditto. They are claiming, in effect, that they keep the metadata (that 
is, markup) separate from the document content. Mapping this way 
(using some kind of pointer mechanism) is certainly "prior art".
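
As a rough illustration of what "metacodes held apart from content via 
a pointer mechanism" can look like, here is a minimal Python sketch. It 
is not taken from the patent and is not i4i's or Microsoft's 
implementation; the function names and the (offset, tag) tuple layout 
are assumptions made up for this example.

```python
# Toy standoff ("out-of-line") markup: the tags live in a separate map
# that points into the raw text by character offset.
# Hypothetical names; the (offset, tag) layout is an assumption.

def split_markup(embedded):
    """Split inline <tag> markup into (raw_text, metacode_map)."""
    raw = []
    metacode_map = []  # list of (offset_into_raw_text, tag_string)
    i = 0
    while i < len(embedded):
        if embedded[i] == "<":
            end = embedded.index(">", i)
            # Record where the tag sat, measured in raw-text characters.
            metacode_map.append((len(raw), embedded[i:end + 1]))
            i = end + 1
        else:
            raw.append(embedded[i])
            i += 1
    return "".join(raw), metacode_map


def merge_markup(raw, metacode_map):
    """Rebuild the inline form from raw text plus the metacode map."""
    out = []
    pos = 0
    for offset, tag in metacode_map:
        out.append(raw[pos:offset])
        out.append(tag)
        pos = offset
    out.append(raw[pos:])
    return "".join(out)


raw, metacode_map = split_markup("<p>Hello <em>world</em>!</p>")
```

Here `raw` holds only the character data and `metacode_map` holds only 
the tags with their offsets; `merge_markup` shows that nothing is lost 
by keeping the two apart.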

>> Raw content is an extreme example of mapped content wherein the
>> latter is totally unstructured and has no embedded metacodes in the
>> data stream.

Quite why they feel it necessary to restate this in different words, 
when it has already been said, is unclear. Alternatively, if they are 
claiming "raw content" is akin to CDATA content (free of markup), then 
they are implying that their other kind of content ("document content") 
*does* have embedded metacodes (markup) in it -- which means it is *not* 
as earlier described (mapped and kept separate); which contradicts their 
original statement.

This whole claim appears to be entirely spurious and without merit. If 
they are unable to describe accurately what they claim to be patenting, 
then the patent should be voidable (and if you're a lawyer reading this, 
you can quote me). If they are claiming a patent on out-of-line markup, 
then it is already voidable on the grounds of prior art.

> The problem really has nothing to do with XML. 

It would appear so, except that it would be perfectly possible to 
implement such a system using XML as a carrier for the standoff markup 
metadata, which is in some places exactly what OOXML does.
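
To make "XML as a carrier for the standoff markup" concrete, here is a 
small Python sketch using the standard library's xml.etree.ElementTree. 
The element and attribute names (doc, content, map, span, start, end, 
type) are invented for this example; this is emphatically not the 
actual OOXML structure, only the general shape of the technique.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the text is stored once, unmarked, and a
# separate <map> points into it by character offset.
STANDOFF = """<doc>
  <content>Hello world!</content>
  <map>
    <span start="6" end="11" type="em"/>
  </map>
</doc>"""


def annotated_spans(xml_text):
    """Return (type, covered_text) for each standoff span."""
    root = ET.fromstring(xml_text)
    text = root.findtext("content")
    return [
        (span.get("type"), text[int(span.get("start")):int(span.get("end"))])
        for span in root.iter("span")
    ]


spans = annotated_spans(STANDOFF)
```

The file is perfectly well-formed XML, yet the document structure can 
only be recovered by resolving the offsets, which is the kind of 
indirection that makes such formats harder to process.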

> It's that MS used XML in a way that violated the patent. They could
> change to a completely proprietary non-XML file format and that
> wouldn't fix the problem.

OOXML does indeed use some standoff markup techniques (wholly 
unnecessarily) which obfuscate the document structure and make it harder 
to process. Microsoft already has a proprietary non-XML format (.doc) 
which operates in a similar (although differently implemented) manner. 
On the surface, i4i may therefore have a claim, although their own 
patent could be challenged on the grounds I have already described.

Microsoft could have avoided this by using standard XML document type 
design techniques in the construction of OOXML instead, but they failed 
to acquire sufficient knowledge of the technology before starting.

Purely from a document management point of view, and leaving aside any 
prejudices for or against the US patent system, Microsoft, or i4i, it 
would appear to be a deeply unwise business practice to commit any 
important information to the OOXML format alone (or to any format that 
cannot be adequately described without recourse to legal argument). Its 
use should be restricted to temporary output or interchange applications 
only, with business-critical information stored in some other more 
reliable XML container.

///Peter
-- 
XML FAQ: http://xml.silmaril.ie/

