Altova Mailing List Archives


RE: [xml-dev] xml over http - RFC 3023

From: "Michael Kay" <mike@--------.--->
To: "'Andrew Welch'" <andrew.j.welch@-----.--->, "'Rick Jelliffe'"
Date: 12/1/2008 10:54:00 AM
> > The out-of-band signalling of character encoding is a fundamentally 
> > broken idea, because there are no mechanisms for programs which 
> > generate data to memoize the character encoding used that can then 
> > feed the rest of the food-chain.
> 
> How about the BOM - that's one way isn't it?  I wonder if a 
> similar ignorable byte sequence could be added to the start 
> of all byte sequences to indicate the encoding of what's coming.

None of the ideas in use here is fundamentally broken; they are all doing
their best to deliver results in an imperfect world.

There are two things that would work, in a more perfect world:

(a) an XML document carries the knowledge of its own encoding (preferably
without the bizarre feature that you need to know what the encoding is
before you can decode the encoding name!), and the carrier doesn't meddle
with it

(b) an XML document is text; the carrier is responsible for knowing the
encoding of text and is allowed to change it; but it needs to know correctly
what the original encoding of the text is, and needs to inform the recipient
reliably what the final encoding of the text is.
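To make the "bizarre feature" in (a) concrete, here is a rough sketch of the
bootstrapping a parser has to do: guess the encoding family from the first few
bytes, and only then decode the XML declaration to read the encoding name. The
byte patterns follow the non-normative autodetection appendix of XML 1.0; the
function names are hypothetical, EBCDIC is left out, and a real parser does
considerably more than this.

```python
import re

# Byte-order marks, longest first so UTF-32 is not mistaken for UTF-16.
_BOMS = [
    (b"\x00\x00\xfe\xff", "utf-32-be"),
    (b"\xff\xfe\x00\x00", "utf-32-le"),
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
]

# No BOM: the bytes of "<?" (or "<?xm") still reveal the encoding family.
_NO_BOM = [
    (b"\x00<\x00?", "utf-16-be"),
    (b"<\x00?\x00", "utf-16-le"),
    (b"<?xm", "ascii"),  # some ASCII-compatible encoding; enough to read the declaration
]

_DECL = re.compile(
    r"""^<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def guess_family(data: bytes):
    """Return (provisional codec name, number of BOM bytes to skip)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    for prefix, name in _NO_BOM:
        if data.startswith(prefix):
            return name, 0
    return "utf-8", 0  # XML default when there is neither BOM nor declaration


def declared_encoding(data: bytes):
    """Decode just enough of the document to read the encoding name, if any."""
    family, skip = guess_family(data)
    head = data[skip:skip + 400].decode(family, errors="replace")
    match = _DECL.match(head)
    return match.group(1) if match else None


def sniff(data: bytes) -> str:
    """Declared encoding if present, otherwise the family guessed from the bytes."""
    return declared_encoding(data) or guess_family(data)[0]
```

Note that the document has to be provisionally decoded before it can say what
encoding it is in, which is exactly the circularity complained about in (a).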

Both of these break primarily because they make invalid assumptions about
the rest of the system. An XML document doesn't know its own encoding
because it's frequently created using tools such as text editors that don't
know they are dealing with XML and don't regard it as their responsibility
to set the encoding correctly. Equally, a carrier often doesn't know the
encoding of its payload message because APIs don't require the information
to be provided correctly.
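As an illustration of how the two sources of truth can disagree, the following
sketch compares the charset a carrier announces in a Content-Type header with
the encoding name the document declares for itself. It assumes the RFC 3023
rules as I understand them (a charset parameter is authoritative; text/xml
without one defaults to us-ascii; application/xml without one falls back to
the document's own rules). The function names are hypothetical, and the
fallback to "utf-8" ignores BOM detection for brevity.

```python
import codecs
from email.message import Message


def carrier_charset(content_type: str):
    """Charset implied by the header alone, or None if the carrier is silent."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    if isinstance(charset, str) and charset:
        return charset.lower()
    if msg.get_content_type() == "text/xml":
        return "us-ascii"  # RFC 3023 default for text/xml (assumption stated above)
    return None


def same_encoding(a: str, b: str) -> bool:
    """Compare encoding names through Python's codec aliases where possible."""
    try:
        return codecs.lookup(a).name == codecs.lookup(b).name
    except LookupError:
        return a.lower() == b.lower()


def effective_encoding(content_type: str, declared):
    """Return (encoding to use, True if carrier and document disagree).

    `declared` is the encoding name from the XML declaration, or None.
    """
    carrier = carrier_charset(content_type)
    if carrier is None:
        return declared or "utf-8", False
    conflict = declared is not None and not same_encoding(carrier, declared)
    return carrier, conflict
```

For example, a file saved as ISO-8859-1 and served as plain text/xml yields
("us-ascii", True): neither party has lied on purpose, yet the recipient is
told something the payload contradicts.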

So don't try to blame any one spec in this area. They are all doing their
best. But so long as we have systems that aren't type-safe end-to-end (for
example, operating system filestores without any real metadata) we're going
to get character encoding glitches somewhere along the line.

Michael Kay
http://www.saxonica.com/


_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@l...
subscribe: xml-dev-subscribe@l...
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php


