Altova Mailing List Archives


Re: [xsl] Character substitution

From: Jim Fuller <jim.fuller@-------------->
To:
Date: 1/10/2005 11:32:00 AM
> Your input has a reference to Unicode 128. That is a control character
> (on the meaning of which you explicitly shouldn't depend).

I would call it ANSI character number 128 (calling it windows-1252 is just 
too newfangled for me), with its associated Unicode number being 8364.



> Now if you output to ISO-8859-1 then probably you should get the same
> control character (i.e. byte 128). If your browser happens to decide to be
> non-conformant but friendly and show that as a euro, that's either good
> or bad, depending on your point of view. If you output to Windows-1252
> then that doesn't have those control characters (as the space is taken

Isn't ANSI what you wordy folks call windows-1252 (for me this was once 
known as CP1252)? Doesn't ANSI define character 128 as the euro?
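
The two readings of byte 128 can be checked directly. This is a minimal sketch, assuming Python 3 and its built-in "cp1252" and "iso-8859-1" codecs stand in for the encodings discussed here; the variable names are illustrative only.

```python
# Decode the single byte 128 (0x80) under the two encodings in question.
raw = bytes([128])

# Windows-1252 (the "ANSI" code page, CP1252) assigns this slot to the euro.
as_cp1252 = raw.decode("cp1252")

# ISO-8859-1 maps byte n straight to code point n, so this is a C1 control.
as_latin1 = raw.decode("iso-8859-1")

print(ord(as_cp1252))  # 8364 (U+20AC EURO SIGN)
print(ord(as_latin1))  # 128  (U+0080, a control character)
```

So both descriptions are right, each under its own encoding: the same byte is the euro in CP1252 and a control character in ISO-8859-1.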



It is my (feeble) understanding that ANSI characters 32 to 127 
correspond to those in the 7-bit ASCII character set, which is just the 
Basic Latin Unicode character range. The next set of characters, 
160 to 255, corresponds to those in the Latin-1 Supplement Unicode 
character range; positions 128 to 159 in Latin-1 Supplement are reserved 
for controls. Though I might be mistaken, are not most of these used for 
printable characters in ANSI, of which 128 is the euro?
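
The layout described above can be probed with a short script. This is a sketch under the assumption that Python 3's "cp1252" codec reflects the ANSI code page being discussed; cp1252_char is a hypothetical helper written for this illustration.

```python
def cp1252_char(byte_value):
    """Return the character cp1252 assigns to a byte, or None if unassigned."""
    try:
        return bytes([byte_value]).decode("cp1252")
    except UnicodeDecodeError:
        return None

# Bytes 32-127 and 160-255 map to the code point with the same number.
same_low = all(cp1252_char(b) == chr(b) for b in range(32, 128))
same_high = all(cp1252_char(b) == chr(b) for b in range(160, 256))

# Bytes 128-159: where Unicode has C1 controls, cp1252 has mostly printables.
c1_slots = {b: cp1252_char(b) for b in range(128, 160)}
assigned = {b: c for b, c in c1_slots.items() if c is not None}
unassigned = sorted(b for b, c in c1_slots.items() if c is None)

print(same_low, same_high)  # True True
print(len(assigned))        # 27
print(unassigned)           # [129, 141, 143, 144, 157]
print(assigned[128])        # the euro sign, U+20AC
```

With this codec, 27 of the 32 positions carry printable characters and five are left unassigned, which matches the "most of these" in the question.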



OK, I understand the anachronism now... which is why I have probably 
carried this with me for so long, and why I always used &#8364; on 
UNIX-based systems.



> up with extra printing symbols) so you should get a fatal encoding error

btw the same forgiveness occurs when using &#8364;: it renders as a 
euro symbol in Mozilla when the character encoding is ISO-8859-1. I must 
admit that I find it difficult to determine the default behavior.
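
One point that may help with the default behavior: a numeric character reference such as &#8364; names a Unicode code point, not a byte in the document's encoding, so a conforming parser should resolve it to the euro even when the document is declared as ISO-8859-1. A minimal sketch, assuming Python 3's standard xml.etree.ElementTree parser; the sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# A document declared as ISO-8859-1, which has no byte for the euro sign.
doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><p>&#8364;</p>'

# The parser resolves the reference to code point 8364 regardless of encoding.
text = ET.fromstring(doc).text
print(ord(text))  # 8364

# Going the other way, a serializer can fall back to a character reference
# when the target encoding cannot represent the character.
serialized = "\u20ac".encode("iso-8859-1", "xmlcharrefreplace")
print(serialized)  # b'&#8364;'
```

Seen this way, the euro appearing for &#8364; is the expected result rather than browser leniency; the leniency only comes into play for &#128; or a raw byte 128.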



> telling you that you can't linearise character 128 into the windows
> encoding (as that slot is taken up to linearise character 8364).  If
> however the encoding support silently linearises both 128 and 8364 on to
> the same slot (so destroying the round tripping that is supposed to be
> preserved by linearisation) you will see a euro, but whether that is
> good or bad depends on your point of view...
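
The strict and the lenient behaviors described above can be sketched as follows. This assumes Python 3's built-in codecs as the strict case; try_encode and lenient_cp1252 are hypothetical helpers, and the lenient fallback only imitates the kind of encoder the quoted text warns about, not any particular product.

```python
def try_encode(ch, encoding):
    """Encode one character strictly; return None if the codec refuses."""
    try:
        return ch.encode(encoding)
    except UnicodeEncodeError:
        return None

euro = "\u20ac"  # character 8364
c1 = "\u0080"    # character 128, a C1 control

print(try_encode(euro, "cp1252"))      # b'\x80'
print(try_encode(c1, "cp1252"))        # None: the slot belongs to the euro
print(try_encode(c1, "iso-8859-1"))    # b'\x80'
print(try_encode(euro, "iso-8859-1"))  # None: no euro in ISO-8859-1

def lenient_cp1252(ch):
    """Hypothetical lenient encoder: pass C1 controls through as raw bytes."""
    encoded = try_encode(ch, "cp1252")
    if encoded is None and 0x80 <= ord(ch) <= 0x9F:
        return bytes([ord(ch)])
    return encoded

# Both characters land on the same byte, so decoding cannot tell them apart.
collided = lenient_cp1252(c1) == lenient_cp1252(euro)
round_trip = lenient_cp1252(c1).decode("cp1252")
print(collided)           # True
print(round_trip == c1)   # False: the control came back as a euro
```

The strict codec gives the fatal error from the quoted text; the lenient one gives the euro on screen at the cost of the round trip.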

Not quite sure what my view is on this. With all the trickery that lets 
my computer display characters in multiple encodings while managing all 
the backward compatibility issues, etc., it seems to be a ball of twine 
and knots which just makes things work... and makes very little logical 
sense.



cheers, Jim Fuller
