Altova Mailing List Archives, comp.text.xml archive

Re: UTF-8 & Unicode
To: NULL
Date: 2/2/2005 4:47:00 PM

"Richard Tobin" <richard@c...> wrote in message news:ctq7fk$51s$1@p......
> In article <4Z0Md.51$tX4.47@n...>,
> EU citizen <noaddress@f...> wrote:
>
> >> > Do web pages have to be created in unicode in order to use UTF-8
> >encoding?
>
> >> That's kind of a silly question because UTF-8 is a unicode encoding.
>
> >I wish people would give simple answers to simple questions.
>
> It may be a simple question for you, because you know what you mean,
> but for the rest of us it's a hard-to-understand question, because
> if you use UTF-8, you are inevitably using Unicode, since it's a
> way of writing Unicode.
>
> But from what you say now, it looks as if your question is really
> about some Windows software.

No. I am using a version of Windows (like most computer users on this planet). However, my question isn't specific to Windows. For all I knew, declaring UTF-8 encoding might've caused the file to be transformed into UTF-8 regardless of the original file format.

>
> >To let your XML parser understand these characters, you should save your XML
> >documents as Unicode.
> >Windows 95/98 Notepad cannot save files in Unicode format.
> >You can use Notepad to edit and save XML documents that contain foreign
> >characters (like Norwegian or French æøå and êèé),
> >But if you save the file and open it with IE 5.0, you will get an ERROR
> >MESSAGE.
>
> Presumably this means that Notepad saves documents containing those
> characters in some non-Unicode encoding, in which case you must put
> an appropriate encoding declaration at the top of the document. But
> you will need to know the name of the encoding that Notepad uses.
>
> <?xml version="1.0" encoding="whatever-the-notepad-encoding-is"?>

Based on what I know now, I agree.
I always assumed that Notepad, being a simple text editor, saved files in ASCII format. Nothing in Notepad's Help, Windows' Help or Microsoft's website says anything about the format used by Notepad. Through experimentation with the W3C HTML validator, I've worked out that iso-8859-1 will work for Notepad files with standard English text plus acute accented vowels.

>
> >Windows 95/98 Notepad files must be saved with an encoding attribute.
>
> This is mysterious. What does it mean? That Notepad won't save
> them without one? Or that you have to add one to make it work
> in the web browser? I can't make head or tail of it.
>
> >To avoid this error you can add an encoding attribute to your XML
> >declaration, but you cannot use Unicode.
> >The encoding below (open it with IE 5.0), will NOT give an error message:
> ><?xml version="1.0" encoding="UTF-8"?>
>
> It only makes sense to say that you're using UTF-8 if you are. If Notepad
> really doesn't know about Unicode, this will only be true if you
> restrict yourself to ASCII characters, because they're the same
> in UTF-8 as they are in ASCII and most other common encodings.
>

The need for the XML encoding statement to match the original file format was not mentioned in any of the (many) articles I've read on XML/XHTML over the last *four* years.
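
The point about the declaration having to match the bytes actually saved can be checked with a small experiment. What follows is only a sketch under stated assumptions: it uses Python's standard `xml.etree.ElementTree` parser rather than IE 5.0, it assumes the editor wrote ISO-8859-1 bytes (as the validator experiment above suggests), and `build_document` and `parses` are hypothetical helper names made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Sample content using the accented characters mentioned in the thread.
TEXT = "<note>æøå êèé</note>"


def build_document(declared: str, actual: str) -> bytes:
    """Return XML bytes whose declaration names `declared`
    but whose bytes are really encoded as `actual`.
    (Hypothetical helper, not part of any library.)"""
    declaration = f'<?xml version="1.0" encoding="{declared}"?>'
    return (declaration + TEXT).encode(actual)


def parses(document: bytes) -> bool:
    """Report whether the parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Declaration matches the bytes on disk: accepted.
matching = build_document("ISO-8859-1", "iso-8859-1")

# Declaration claims UTF-8 but the bytes are ISO-8859-1: the accented
# characters are not valid UTF-8 sequences, so the parser rejects them.
mismatched = build_document("UTF-8", "iso-8859-1")

# Pure ASCII text has the same bytes in ASCII, UTF-8 and ISO-8859-1,
# which is why a wrong declaration goes unnoticed for plain English text.
ascii_only = "plain English text"

if __name__ == "__main__":
    print("matching declaration parses:", parses(matching))
    print("mismatched declaration parses:", parses(mismatched))
    print("ASCII bytes identical:",
          ascii_only.encode("utf-8") == ascii_only.encode("iso-8859-1"))
```

Declaring an encoding does not convert the file. It only tells the parser how to read bytes that are already there, so the declaration has to name whatever encoding the editor really used.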
