Altova Mailing List Archives


Re: [xsl] I18N / UTF-8 versus US-ASCII

From: David Carlisle <davidc@--------->
To:
Date: 4/4/2006 12:16:00 PM

> It would be interesting to know if anyone who was using US-ASCII
> output had to switch to a broader encoding because of some issue....

I also tend to use US-ASCII for anything that might go near a web server,
although I make sure to use omit-xml-declaration="yes" in that case as
well so that the files are well formed as UTF-8 documents and parsable
by all XML systems. An XML system may (and if I recall correctly, some
older ones, e.g. IE6 running under Windows 2000 in some configurations,
did) issue a fatal "unknown encoding" error if presented with a file saying
<?xml version="1.0" encoding="US-ASCII"?>
even though the same file would parse correctly if this line were
removed.
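
As a rough illustration of why that works (a sketch in Python rather
than XSLT, and not part of the original message): text written as pure
ASCII with numeric character references, and with no XML declaration,
is already a valid UTF-8 document, so a parser that falls back to the
default encoding reads it without complaint. The function name
to_ascii_xml and the <p> wrapper element are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def to_ascii_xml(text: str) -> bytes:
    """Wrap text in a hypothetical <p> element, ASCII only, no XML declaration."""
    # escape() handles &, < and >; the "xmlcharrefreplace" error handler
    # turns every non-ASCII character into a decimal reference like &#233;
    body = escape(text).encode("ascii", "xmlcharrefreplace")
    return b"<p>" + body + b"</p>"


text = "caf\u00e9 \u20ac"          # "café €"
doc = to_ascii_xml(text)

# With no declaration the parser assumes UTF-8, and ASCII bytes are
# valid UTF-8, so the references resolve back to the original text.
parsed = ET.fromstring(doc)
print(doc)
print(parsed.text == text)
```

Note that Python's error handler emits decimal references; an XSLT
serializer may choose hex or decimal, which makes no difference to the
parser.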

Of course the other cases where you cannot use a restricted encoding
are cases where the element or attribute names use non-ASCII characters,
since character references are not allowed in names.

> The only disadvantage I'm aware of is that anyone reading the file is
> presented with the numeric character references instead of the
> characters themselves - this is not normally a problem as the only
> people who ever examine the XML itself are developers, users only see
> the parsed content (at which point all the references have been
> resolved).

File size can be an issue. People have long arguments about whether the
(typically) one, two or three bytes per character cost of utf8 is
better or worse than the typically 2 or 4 bytes per character of utf16.
Anyone who thinks that is an issue worthy of consideration isn't going
to like the idea of using character references.
Encoding characters as &#x1234; is 8 bytes per character (or 9 if you
need higher planes). For some document types, languages and document
distribution methods, making the file 4 times bigger than it would be in
utf16 isn't really an option.
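
To make the arithmetic concrete, here is a small Python sketch (an
illustration added to the figures above, not a benchmark) that counts
the bytes for one character under each representation. The helper name
encoded_sizes is made up, and utf-16-be is used only so that no byte
order mark is counted.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Byte counts for text as UTF-8, UTF-16 (no BOM) and hex character references."""
    hex_refs = "".join(f"&#x{ord(ch):X};" for ch in text)
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-be")),
        "char-refs": len(hex_refs.encode("ascii")),
    }


bmp = encoded_sizes("\u1234")         # a character in the Basic Multilingual Plane
astral = encoded_sizes("\U0001D11E")  # a character in a higher plane

print(bmp)     # {'utf-8': 3, 'utf-16': 2, 'char-refs': 8}
print(astral)  # {'utf-8': 4, 'utf-16': 4, 'char-refs': 9}
```

For the BMP character the reference form is 8 bytes against 2 in
utf16, which is where the factor of 4 comes from.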

David


