Altova Mailing List Archives


Re: Multilingual support in generated XML (RSS)

From: "Anthony Jones" <Ant@------------.--->
To: NULL
Date: 11/2/2006 4:07:00 PM


"BarakF" <frohlinger@y...> wrote in message
news:1162476269.294020.111830@i......
> > Is there any difference between the ASP page that reads from the DB and
> > writes the XML and that backpage.asp that reads from the DB and writes
> > the HTML output (which is UTF-8 encoded)? Are those ASP pages encoded
> > with the same code page/encoding?
>
>
> The backpage.asp - which reads the string from the DB and displays it -
> has the meta tag on top:
> <meta HTTP-EQUIV="content-type" CONTENT="text/html; charset=UTF-8">
>
> The server page that prepares the XML (it is a single function called
> on a server page), as well as the page that inserts it into the DB -
> does not have any encoding directive.
> None of the following:
> Response.CharSet = "UTF-8"
> Response.ContentType = "text/xml"
> Response.CodePage = 65001
>
>
> > How is the text stored in the DB, what DB is that, what column type is
> > the text stored in?
>
> The DB is Microsoft SQL Server 2000
> The field is defined as nvarchar and when inserting the string into the
> DB, through SP, I use
> "adVarWChar" define.
>
> BUT -
> When adding Response.CodePage = 65001 to all server pages - I see the
> Hebrew characters correctly.
> Should I always add Response.CodePage = 65001 to server pages?
> Should I leave the <meta HTTP-EQUIV="content-type" CONTENT="text/html;
> charset=UTF-8"> for HTML pages?
>
> Now, what I inserted previously looks like garbage, and the new posts
> in Hebrew look OK (in the DB, in the XML and in the ASP pages).
>
> I am confused. I don't know in which case I did right, and on which
> case I did wrong...
> What are the basic rules for UTF-8 support?

The problem is in the way ASP decodes the form inputs.

First you need to understand that the encoding a browser uses when
submitting the content of a form is taken from the encoding of the loaded
page.

Somewhat counterintuitively, ASP uses the Response codepage to decide how to
decode form fields in the request.  Hence, for correct operation, a page
receiving a form post should have its Response codepage set to the codepage
that matches the character set specified when sending the original form.

So in your case you have a form in a UTF-8 page.  Text entered is posted to
the server in UTF-8 encoding.  However, the receiving page is currently set
to an ANSI code page, so each byte of the 2-byte encodings that some of the
UTF-8 characters use is treated as an individual character, and that's how
the text is stored in the DB.  Hence the content of the DB is corrupt.
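
To make the mechanism concrete, here is a rough Python sketch (not ASP) of
what happens to the posted bytes.  The sample word, the variable names and
the use of latin-1 as a stand-in for the server's ANSI code page are all my
assumptions for illustration; the real ANSI code page on your server may
differ.

```python
from urllib.parse import quote, unquote

# Hypothetical sample input: a four-letter Hebrew word, written with escapes.
typed = "\u05e9\u05dc\u05d5\u05dd"

# A form on a UTF-8 page is posted as percent-encoded UTF-8 bytes.
posted = quote(typed, encoding="utf-8")

# Receiving page with the matching codepage (like Response.CodePage = 65001).
decoded_ok = unquote(posted, encoding="utf-8")

# Receiving page with a single-byte code page. latin-1 is an assumption
# standing in for the server's ANSI code page: every byte becomes one character.
decoded_bad = unquote(posted, encoding="latin-1")

print(posted)
print(len(decoded_ok), len(decoded_bad))
```

With the matching codepage the four letters come back as four characters.
With the single-byte code page the same eight bytes become eight unrelated
characters, and that is the kind of value that ends up in the nvarchar column.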

Now that you are specifying the codepage of your output pages correctly, you
are seeing the corruption.
Before that change you were telling the client it was receiving UTF-8 but
using an ANSI code page in the response.  It appeared to work because this
incorrect setup reverses the corruption of the characters in the DB when
they are sent to the client.
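
The way the two mistakes cancel out can be sketched in Python as well.
Again this is only an illustration under assumptions: latin-1 stands in for
the ANSI code page, and browser_view is a made-up helper that mimics "the
server encodes the response, the browser decodes it as UTF-8".

```python
ANSI = "latin-1"  # assumption: stand-in for the server's single-byte code page

original = "\u05e9\u05dc\u05d5\u05dd"  # hypothetical Hebrew sample text

# What the old receiving page put in the DB: UTF-8 bytes read one byte at a time.
stored = original.encode("utf-8").decode(ANSI)

def browser_view(text, response_codepage):
    """Encode text with the response codepage, then decode it the way a
    browser would after being told charset=UTF-8."""
    wire_bytes = text.encode(response_codepage)
    return wire_bytes.decode("utf-8", errors="replace")

# Old setup: ANSI response codepage, UTF-8 meta tag. The corruption is undone.
old_setup = browser_view(stored, ANSI)

# New setup: UTF-8 response codepage. The corrupt text is shown as stored.
new_setup = browser_view(stored, "utf-8")

print(old_setup == original)
print(new_setup == stored)
```

So the old rows only looked right because the response re-created the
original UTF-8 bytes by accident, while rows written after the change are
stored correctly and need no such accident.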

I hope I've made it clear. Char sets and code pages can really bend your
mind. ;)



>
> Thanks, Gabi.
>



