Altova Mailing List Archives


RE: [xml-dev] Is it time for the binary XML permathread to start up again?

From: noah_mendelsohn@--.---.---
To: "Alexander Philippou" <alex@------.--->
Date: 7/20/2007 4:44:00 PM

Alexander Philippou writes:

> And since the processing penalty of compression is proportional to doc size,

Yes, typically, at least to a first approximation (actually, some 
compression algorithms do a bit better on large documents, to the extent 
that the overhead of building dictionaries of commonly used terms gets 
done toward the beginning, and leveraged throughout).

> using FI instead of text makes sense even when doing http+gzip.

To the extent FI itself compresses, that's surprising.  I'm not 
disagreeing that gzip might run faster on the FI form than on the larger 
text form;  I'm surprised that size(gzip(FI)) << size(FI).  You wouldn't 
expect compression systems like gzip to do well on things that are already 
tightly coded.  On the contrary, many compression algorithms will actually 
somewhat expand things that are already compressed using other algorithms.
Basically, compression algorithms take a gamble that they can recognize
some form(s) of redundancy and get them out.  If the input doesn't have 
redundancy in such forms, then you tend to wind up at best restating the 
input, plus a bit of overhead for the compression framework itself. 
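
The effect described above can be checked with a small experiment. The following is only a sketch in Python using the standard gzip module; the sample markup is made up, and random bytes stand in for "already tightly coded" input, which is an assumption and not a measurement of any real FI encoder.

```python
import gzip
import os


def gzip_ratio(data: bytes) -> float:
    """Return size(gzip(data)) / size(data); values above 1.0 mean expansion."""
    return len(gzip.compress(data)) / len(data)


# Highly redundant, text-XML-like input (invented sample, not real traffic).
redundant = b"<item><name>widget</name><qty>1</qty></item>\n" * 2000

# Stand-in for input with no redundancy that gzip can recognize.
# Assumption: random bytes approximate a tightly coded binary stream.
random_like = os.urandom(64 * 1024)

print(f"redundant text: ratio {gzip_ratio(redundant):.3f}")
print(f"dense bytes:    ratio {gzip_ratio(random_like):.3f}")
```

On the redundant input the ratio should come out far below 1. On the dense input it should land slightly above 1, the excess being the gzip header, trailer, and block framing.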

If gzip is going to make the FI form larger, or not much smaller, then 
it's a bad use of time to run it, even if the time to gzip the FI is 
indeed much lower than the time to gzip the original text.
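
One way to act on that is to keep the gzip output only when it pays for itself. The sketch below is hypothetical: the function name maybe_gzip and the 10% threshold are invented for illustration. Note that it still spends the time to gzip each payload in order to find out, so in practice one would more likely measure a sample per content type once and then decide up front.

```python
import gzip
import os


def maybe_gzip(payload: bytes, min_saving: float = 0.10):
    """Return (body, content_encoding).

    The gzip form is kept only if it is at least `min_saving` smaller
    than the input; otherwise the payload is sent unchanged.
    The threshold is an arbitrary illustrative choice.
    """
    if not payload:
        return payload, "identity"
    packed = gzip.compress(payload)
    if len(packed) <= len(payload) * (1.0 - min_saving):
        return packed, "gzip"
    return payload, "identity"


# Redundant text is worth compressing; dense bytes are passed through.
text_body, text_enc = maybe_gzip(b"<a>hello</a>" * 500)
dense_input = os.urandom(4096)  # stand-in for an already compact encoding
dense_body, dense_enc = maybe_gzip(dense_input)
print(text_enc, dense_enc)
```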

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

