Altova Mailing List Archives
Re: [xml-dev] Re: [Sax-devel] SAX2 r2 ... last call!
To: "Simon St.Laurent" <simonstl@--------.--->
Date: 1/10/2002 10:37:00 PM
> I just suspect the point's worth making a little more strongly, as so
> many of us have been brainwashed to think Java char=Unicode character.
> Surrogate pairs whacked me a lot harder over the head than I thought,
> and Java doesn't seem to take note.

True for most folk. XML made me get my hands dirty with I18N stuff,
and that one took a while for me to grok. I don't think it'll be
intuitive to most folk, who've rarely had to look at such I18N issues.

> > Point is that anyone working at the "character" level MUST
> > NOT ASSUME that such characters consist only of a single
> > Java "char" value. And that'd be true even if "char" were
> > to make an incompatible change, and acquire a few extra
> > bits at the left so that surrogates could in some cases be
> > eliminated.
>
> So could the paragraph above appear in the documentation somewhere? I
> think that would take care of all my concerns.

Yes, I was thinking of doing that. After I imbibe the other thread
a bit more deeply, to make sure I pick up any other details. That
should make it into the SAX2 r2 ContentHandler docs, and maybe also
LexicalHandler.comment() if I get ambitious.

- Dave
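
To illustrate the point about "char" versus "character", here is a minimal sketch, not part of the original message. It assumes a buffer shaped like the (ch, start, length) arguments of ContentHandler.characters(), and it relies on Character.codePointAt() and Character.charCount(), which were only added in Java 5, so they were not available when this thread was written. The class and method names are hypothetical.

```java
class CodePointDemo {
    // Counts Unicode characters (code points) in a slice of a char buffer.
    // The (ch, start, length) shape mirrors ContentHandler.characters().
    static int countCodePoints(char[] ch, int start, int length) {
        int end = start + length;
        int count = 0;
        int i = start;
        while (i < end) {
            // Reads one code point; a valid surrogate pair is read as a unit.
            int cp = Character.codePointAt(ch, i, end);
            // Advance by 1 char for BMP characters, 2 for a surrogate pair.
            i += Character.charCount(cp);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // 'a', U+1D504 (encoded as the surrogate pair D835 DD04), 'b'
        char[] buf = "a\uD835\uDD04b".toCharArray();
        System.out.println("Java char values:   " + buf.length);
        System.out.println("Unicode characters: "
                + countCodePoints(buf, 0, buf.length));
    }
}
```

The sample string holds three characters but four "char" values, which is the mismatch the quoted paragraph warns about. The sketch does not attempt to handle a surrogate pair that straddles the end of the slice; an unpaired surrogate is simply counted as one unit.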