Altova Mailing List Archives


Re: UTF-8 & Unicode

From: "EU citizen" <noaddress@-----.--->
To: NULL
Date: 2/2/2005 4:47:00 PM
"Richard Tobin" <richard@c...> wrote in message
news:ctq7fk$51s$1@p......
> In article <4Z0Md.51$tX4.47@n...>,
> EU citizen <noaddress@f...> wrote:
>
> >> > Do web pages have to be created in unicode in order to use UTF-8 encoding?
>
> >> That's kind of a silly question because UTF-8 is a unicode encoding.
>
> >I wish people would give simple answers to simple questions.
>
> It may be a simple question for you, because you know what you mean,
> but for the rest of us it's a hard-to-understand question, because
> if you use UTF-8, you are inevitably using Unicode, since it's a
> way of writing Unicode.
>
> But from what you say now, it looks as if your question is really
> about some Windows software.

No. I am using a version of Windows (like most computer users on this
planet). However, my question isn't specific to Windows. For all I knew,
declaring utf-8 encoding might've caused the file to be transformed into
utf-8 regardless of the original file format.
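
To illustrate the point that a declaration only labels the bytes and does not transform them, here is a minimal Python sketch. It assumes the file was written in ISO-8859-1 (standing in for whatever a legacy editor actually uses) and uses the standard library's `xml.etree.ElementTree` parser; the `<note>` element and the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

text = "æøå"

# Bytes as a legacy single-byte editor might write them (assumed: ISO-8859-1).
body = f"<note>{text}</note>".encode("iso-8859-1")

# The declaration claims UTF-8, but the bytes are not UTF-8,
# so the parser rejects the document instead of converting it.
mislabelled = b'<?xml version="1.0" encoding="UTF-8"?>' + body
try:
    ET.fromstring(mislabelled)
    mislabelled_ok = True
except ET.ParseError:
    mislabelled_ok = False

# The declaration names the encoding actually used, so decoding succeeds.
labelled = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + body
parsed_text = ET.fromstring(labelled).text
```

In this sketch the same bytes fail under one label and parse under the other, which suggests the label is a description of the file rather than an instruction to convert it.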


>
> >To let your XML parser understand these characters, you should save your
> >XML documents as Unicode.
> >Windows 95/98 Notepad cannot save files in Unicode format.
> >You can use Notepad to edit and save XML documents that contain foreign
> >characters (like Norwegian or French æøå and êèé),
> >But if you save the file and open it with IE 5.0, you will get an ERROR
> >MESSAGE.
>
> Presumably this means that Notepad saves documents containing those
> characters in some non-Unicode encoding, in which case you must put
> an appropriate encoding declaration at the top of the document.  But
> you will need to know the name of the encoding that Notepad uses.
>
>   <?xml version="1.0" encoding="whatever-the-notepad-encoding-is"?>

Based on what I know now, I agree. I always assumed that Notepad, being a
simple text editor, saved files in ASCII format. Nothing in Notepad's Help,
Windows' Help or Microsoft's website says anything about the format used by
Notepad. Through experimentation with the W3C HTML validator, I've worked
out that iso-8859-1 will work for Notepad files with standard English text
plus acute accented vowels.
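
The same kind of experiment can be done by inspecting the bytes directly. The sketch below is a rough heuristic, not a reliable detector: `guess_encoding` is a hypothetical helper, and the ISO-8859-1 fallback is an assumption, since every byte sequence decodes as ISO-8859-1 and other single-byte encodings cannot be ruled out this way.

```python
def guess_encoding(data: bytes) -> str:
    """Rough guess at how a text file was saved (hypothetical helper)."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Any byte sequence decodes as ISO-8859-1, so this is a fallback, not proof.
        return "iso-8859-1"


plain = "cafe".encode("ascii")
legacy = "café".encode("iso-8859-1")  # one byte for é: b'caf\xe9'
modern = "café".encode("utf-8")       # two bytes for é: b'caf\xc3\xa9'

guesses = [guess_encoding(b) for b in (plain, legacy, modern)]
```

An accented vowel stored as a single byte above 0x7F is what rules out both ASCII and UTF-8 here, which is consistent with iso-8859-1 being accepted for such files.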

>
> >Windows 95/98 Notepad files must be saved with an encoding attribute.
>
> This is mysterious.  What does it mean?  That Notepad won't save
> them without one?  Or that you have to add one to make it work
> in the web browser?

I can't make head or tail of it.

>
> >To avoid this error you can add an encoding attribute to your XML
> >declaration, but you cannot use Unicode.
> >The encoding below (open it with IE 5.0), will NOT give an error message:
> ><?xml version="1.0" encoding="UTF-8"?>
>
> It only makes sense to say that you're using UTF-8 if you are.  If Notepad
> really doesn't know about Unicode, this will only be true if you
> restrict yourself to ASCII characters, because they're the same
> in UTF-8 as they are in ASCII and most other common encodings.
>
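
The ASCII point in the quoted paragraph can be checked in a few lines of Python. This is only a sketch; the sample strings are invented for the example.

```python
ascii_only = '<?xml version="1.0" encoding="UTF-8"?><p>plain text</p>'
accented = "<p>été</p>"

# Pure ASCII text produces identical bytes in all three encodings.
same_for_ascii = (
    ascii_only.encode("ascii")
    == ascii_only.encode("utf-8")
    == ascii_only.encode("iso-8859-1")
)

# As soon as an accented character appears, UTF-8 and ISO-8859-1 differ.
same_for_accented = accented.encode("utf-8") == accented.encode("iso-8859-1")
```

So a UTF-8 declaration on an ASCII-only file happens to be true, while the same declaration on a file with accented characters saved in a single-byte encoding is not.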

The need for the XML encoding statement to match the original file format
was not mentioned in any of the (many) articles I've read on XML/XHTML over
the last *four* years.
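
One way to make the declaration and the file agree is to re-encode the content and write the matching declaration at the same time. The following Python sketch assumes the original bytes are ISO-8859-1 and carry no XML declaration of their own; `relabel_as_utf8` is a hypothetical helper, not part of any library.

```python
import xml.etree.ElementTree as ET


def relabel_as_utf8(body: bytes, actual_encoding: str) -> bytes:
    """Decode with the encoding the file really uses, then re-encode as UTF-8
    and prepend a declaration that matches the new bytes (hypothetical helper).
    Assumes `body` has no XML declaration of its own."""
    text = body.decode(actual_encoding)
    return b'<?xml version="1.0" encoding="UTF-8"?>\n' + text.encode("utf-8")


# Bytes as a legacy editor might have saved them (assumed: ISO-8859-1).
legacy_body = "<note>êèé</note>".encode("iso-8859-1")

utf8_doc = relabel_as_utf8(legacy_body, "iso-8859-1")
round_tripped = ET.fromstring(utf8_doc).text
```

The conversion step needs to be told the real source encoding; the declaration alone cannot supply it.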



