Re: [xml-dev] Use of UTF-8 and UTF-16
To: Elliotte Harold <elharo@-------.---.--->
Date: 11/2/2005 2:06:00 PM

Elliotte Harold wrote:
> Rick Jelliffe wrote:
>
>> For CJK (Chinese, Japanese, Korean) XML documents, where three (or six)
>> bytes may be used by UTF-8 instead of UCS-16's two (or four), UTF-16
>> files will usually be smaller.
>
>
> First a correction: UTF-8 never uses six bytes for anything. The largest
> UTF-8 character you'll ever see is 4 bytes wide.
>
Hi,

I read somewhere that:
UTF-8 uses up to 6 bytes for ISO/IEC 10646
UTF-8 uses up to 4 bytes for Unicode
Unicode is a subset of ISO/IEC 10646 (in terms of addressing)
ISO/IEC 10646 is a subset of Unicode (in terms of semantics)
XML uses Unicode
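
To make the 4-versus-6 distinction concrete, here is a rough sketch of how the byte count follows from the code point value. It assumes the 6-byte figure refers to the older UTF-8 definition in RFC 2279 (code points up to 0x7FFFFFFF) and the 4-byte figure to RFC 3629 (code points up to U+10FFFF). The function name `utf8_length` and its `legacy` flag are made up for illustration.

```python
# Upper bound of the code point range covered by 1, 2, 3, ... bytes.
RFC3629_LIMITS = [0x7F, 0x7FF, 0xFFFF, 0x10FFFF]                      # at most 4 bytes
RFC2279_LIMITS = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]  # up to 6 bytes


def utf8_length(code_point: int, legacy: bool = False) -> int:
    """Return how many bytes UTF-8 needs for a single code point.

    legacy=False: range limited to U+10FFFF (RFC 3629), so at most 4 bytes.
    legacy=True:  older 31-bit range (RFC 2279), so up to 6 bytes.
    Surrogate code points are not treated specially in this sketch.
    """
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    limits = RFC2279_LIMITS if legacy else RFC3629_LIMITS
    for n_bytes, limit in enumerate(limits, start=1):
        if code_point <= limit:
            return n_bytes
    raise ValueError("code point out of range for this UTF-8 definition")


if __name__ == "__main__":
    cjk = 0x4E2D  # a CJK ideograph in the Basic Multilingual Plane
    print(utf8_length(cjk))                    # bytes in UTF-8
    print(len(chr(cjk).encode("utf-16-be")))   # bytes in UTF-16 (no BOM)
    print(utf8_length(0x7FFFFFFF, legacy=True))
```

For a typical CJK character in the Basic Multilingual Plane this gives three bytes in UTF-8 against two in UTF-16, which is the size difference discussed in the quoted text.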
--
Regards,
///
(. .)
-----ooO--(_)--Ooo-----
| Philippe Poulard |
-----------------------
