
Altova Mailing List Archives


[Shannon: information ~ uncertainty] Ramifications to XML data exchange?

From: "Roger L. Costello" <costello@-----.--->
To: <xml-dev@-----.---.--->
Date: 10/11/2004 12:46:00 PM

Hi Folks,

I am trying to get an understanding of Claude Shannon's work on information
theory. Below I describe one small part of Shannon's work.  I would like to
hear your thoughts on its ramifications to information exchange using XML. 

INFORMATION

Shannon defines information as follows: 

    Information is proportional to uncertainty.  High uncertainty equates
    to a high amount of information.  Low uncertainty equates to a low
    amount of information.

    More specifically, Shannon talks about a set of possible data.
    A set comprising 10 possible choices of data has less information than
    a set comprising 100 possible choices.

This may seem rather counterintuitive, but bear with me as I give an
example. 

In a book I am reading[1] the author gives an example that provides a nice
intuition for Shannon's statement that information is proportional to
uncertainty.

EXAMPLE

Imagine that a man is in prison and wants to send a message to his wife.
Suppose that the prison only allows one message to be sent, "I am fine".
Even if the prisoner is deathly ill, all he can send is "I am fine".
Clearly there is no information in this message.

Here the set of possible messages has one member.  There is no uncertainty
and there is no information.

Suppose that the prison allows one of two messages to be sent, "I am fine"
or "I am ill".  If the prisoner sends one of these messages then some
information will be passed to his wife.

Here the set of possible messages has two members.  There is uncertainty
(about which message will be sent).  When one of the two messages is
selected by the prisoner and sent to his wife, some information is passed.

Suppose that the prison allows one of four messages to be sent:

1. I am healthy and happy
2. I am healthy but not happy
3. I am happy but not healthy
4. I am not happy and not healthy

If the prisoner sends one of these messages then even more information will
be passed.

Thus, the bigger the set of potential messages, the more uncertainty.  The
more uncertainty there is, the more information there is.

Interestingly, it doesn't matter what the messages are.  As long as each
message in the set is equally likely, all that matters is the number of
messages in the set.  Thus, there is the same amount of information in this
set:

   {"I am fine", "I am ill"}

as there is in this set:

   {A, B}

SIDE NOTES

a. Part of Shannon's goal was to measure the "amount" of information.
   In the example above where there are two possible messages the amount
   of information is 1 bit (log2 of 2).  In the example where there are
   four possible messages the amount of information is 2 bits (log2 of 4).
   Both figures assume the messages are equally likely.

b. Shannon refers to uncertainty as "entropy".  Thus, the higher the
   entropy (uncertainty) the higher the information.  The lower the
   entropy the lower the information.
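
To make side notes (a) and (b) concrete, here is a minimal Python sketch.
It is not taken from Ashby's book; the function names are made up for
illustration, and it assumes the standard textbook definitions: log2(N) bits
for N equally likely messages, and H = sum of p * log2(1/p) when the
messages have unequal probabilities.

```python
from math import log2


def bits_for_equally_likely(n_messages):
    """Information, in bits, of one choice among n equally likely messages."""
    return log2(n_messages)


def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a message set with the given probabilities.

    Messages with probability 0 contribute nothing and are skipped.
    """
    return sum(p * log2(1 / p) for p in probabilities if p > 0)


# The three prison scenarios: 1, 2, and 4 allowed messages.
print(bits_for_equally_likely(1))  # 0.0
print(bits_for_equally_likely(2))  # 1.0
print(bits_for_equally_likely(4))  # 2.0

# With equal probabilities, entropy agrees with the counts above.
print(entropy_bits([0.5, 0.5]))    # 1.0

# If "I am fine" is sent 90% of the time, the uncertainty is below 1 bit.
print(entropy_bits([0.9, 0.1]) < 1.0)  # True
```

The last line hints at why the "only the number of messages matters" rule
holds for equally likely messages only: a lopsided set of two messages
carries less than 1 bit on average.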

QUESTIONS

1. How does this aspect (information ~ uncertainty) of Shannon's work relate
to data exchange using XML?  (I realize that this is a very broad question.
Its intent is to stimulate discussion on the application of Shannon's
information/uncertainty ideas to XML data exchange.)

2. A schema is used to restrict the allowable forms that an instance
document may take.  So doesn't a schema reduce information?  
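
One way to make question 2 concrete is a toy count of allowed instances.
The sketch below is only an illustration under strong assumptions: the two
"schemas" are hypothetical dictionaries of allowed field values (not real
XML Schema documents), fields are independent, and every allowed instance
is treated as equally likely.

```python
from math import log2, prod


def instance_bits(allowed_values_per_field):
    """Bits needed to pick one instance when every field independently takes
    one of its allowed values and all instances are equally likely."""
    count = prod(len(values) for values in allowed_values_per_field.values())
    return log2(count)


# Hypothetical two-field message, loosely modelled on the prisoner example.
loose_schema = {
    "health": ["healthy", "not healthy"],
    "mood": ["happy", "not happy"],
}

# A tighter hypothetical schema that fixes the mood field to a single value.
tight_schema = {
    "health": ["healthy", "not healthy"],
    "mood": ["happy"],
}

print(instance_bits(loose_schema))  # 2.0
print(instance_bits(tight_schema))  # 1.0
```

In this toy model the tighter schema leaves fewer possible instances, so
each instance resolves less uncertainty for a receiver who already knows
the schema.  Whether that is the right way to think about real schemas is
exactly what the question asks.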

/Roger  
 
[1] An Introduction to Cybernetics by Ross Ashby


