Altova Mailing List Archives


Re: [xml-dev] [Shannon: information ~ uncertainty] Ramifications to XML data exchange?

From: "Thomas B. Passin" <tpassin@-------.--->
To: xml-dev@-----.---.---
Date: 10/11/2004 1:30:00 PM

Roger L. Costello wrote:
> I am trying to get an understanding of Claude Shannon's work on
> information theory. Below I describe one small part of Shannon's
> work.  I would like to hear your thoughts on its ramifications to
> information exchange using XML.
> 
> INFORMATION
> 
> Shannon defines information as follows:
> 
> Information is proportional to uncertainty.  High uncertainty equates
>  to a high amount of information.  Low uncertainty equates to a low 
> amount of information.
> 
> More specifically, Shannon talks about a set of possible data. A set
> comprised of 10 possible choices of data has less information than a
> set comprised of a hundred possible choices.

> ... QUESTIONS
> 
> 1. How does this aspect (information ~ uncertainty) of Shannon's work
> relate to data exchange using XML?  (I realize that this is a very
> broad question. Its intent is to stimulate discussion on the
> application of Shannon's information/uncertainty ideas to XML data
> exchange)
> 
> 2. A schema is used to restrict the allowable forms that an instance 
> document may take.  So doesn't a schema reduce information?

I think you should be very cautious and thoughtful about trying to apply
Shannon to the sending of xml messages.  I think there are some tricky 
aspects that could make things non-obvious.  Some examples:

1) A schema does not necessarily reduce the number of possible messages 
if it is possible to send schema-invalid messages over the channel. 
What conditions about restricting the message set need to be in place 
for Shannon's work to apply directly?

2) Under most schemas, there are an infinite number of possible messages 
(since most or all elements or attributes could hold content of 
indefinite length).  The usual measures of log N or log N/N aren't 
useful in this circumstance.

3) Shannon's work is usually thought of in terms of whether tokens get 
through the communications channel uncorrupted or not.  Is it 
technically correct to think of a single xml message as a token (I think 
not)?  If not, what if anything would play this role in an xml message?
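
To make the log N measure from point 2 concrete, here is a minimal Python sketch. It assumes every message in the set is equally likely, which is the simplest case and not something the thread itself states, and the function names are made up for illustration. It shows that a set of 100 choices carries twice the bits of a set of 10, and that the figure keeps growing as the permitted content length grows, which is why it stops being useful when length is unbounded.

```python
import math


def bits_for_equiprobable(n: int) -> float:
    """Bits of information in one choice among n equally likely messages."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.log2(n)


def count_strings(alphabet_size: int, max_len: int) -> int:
    """Count the strings of length 0..max_len over an alphabet of the given size."""
    return sum(alphabet_size ** k for k in range(max_len + 1))


if __name__ == "__main__":
    # The 10-choice and 100-choice sets from the quoted message.
    print("10 choices: ", bits_for_equiprobable(10), "bits")
    print("100 choices:", bits_for_equiprobable(100), "bits")

    # Hypothetical element whose content is drawn from a 26-letter alphabet
    # (an assumption for illustration). Raising the length cap raises the
    # number of possible messages, and log N with it, without limit.
    for max_len in (1, 10, 100):
        n = count_strings(26, max_len)
        print("max length", max_len, "->", round(bits_for_equiprobable(n), 1), "bits")
```

The uniform-probability assumption matters: if some messages are far more likely than others, the information per message is lower than log N, so this sketch only gives an upper bound for a finite set.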

I am not well-versed in this area, so I will step back and let others 
who are do the talking.

Cheers,

Tom P


-- 
Thomas B. Passin
Explorer's Guide to the Semantic Web (Manning Books)
http://www.manning.com/catalog/view.php?book=passin

