Altova Mailing List Archives


Re: XML parsing error: An invalid character was found in text cont

From: "Michael Rys [MSFT]" <mrys@------.---------.--->
To: NULL
Date: 4/2/2007 9:36:00 PM

Stefan explained well why you get the error. Note that in SQL Server
2005, you can pass UTF-8 to the XML datatype via either DBTYPE XML or
varbinary.

Best regards
Michael

"Anthony Jones" <Ant@y...> wrote in message 
news:%23chcWKWYHHA.2556@T......
>
> "Doug Chandler" <DougChandler@d...> wrote in message
> news:9F2519B4-E24B-4E7B-B15B-6352F06F298E@m......
>> Hi Stefan,
>>
>> Thanks for your explanation. Your suggestions are consistent with what I
>> have found over the past week. ISO-8859-1 does work except for the Euro
>> symbol (€), which becomes a question mark (?). It also copes with many
>> other western European characters. I was hoping to find a way of using
>> UTF-8 and the TEXT datatype. The production of the XML is from other
>> systems but we can specify the encoding to use. With UTF-16, yes,
>> everything works, but the files become twice as big and the resources
>> used to process the file are twice as much.
>>
>> It sounds like the solution for now is to use ISO-8859-1, which was the
>> decision I was coming to anyway; I just could not understand why UTF-8
>> would not work into SQL Server, which you have now explained. I may have
>> to trap and escape the Euro symbol so that it works properly, just in
>> case it is used.
>>
>> My other option seems to be to get the XML produced using UTF-8 and
>> change it to UTF-16 before pumping it to the database using the NText
>> datatype. A bit of a hack bodge, methinks.
>>
>> Many thanks for your help.
>>
>
> What data type is Xml.Text  ?
>
> It will be a string.
>
> What encoding does .NET use to store strings?
>
> Unicode
>
> It doesn't matter what encoding the XML declaration specifies in the
> document: as long as it is consistent with the actual encoding of the
> document contents, the Unicode string result of Xml.Text will be correct.
>
> As has been pointed out, your SP should use NText for the @XML parameter
> (regardless of the data types of the destination fields).  Your
> destination fields should use a collation (a poor word MS chose to use
> there) that can accept all the expected characters.
>
> All should work fine assuming the XML generation is conforming to the
> rules.
>
> My point is you are barking up the wrong tree if you think that changing
> the encoding of the incoming XML from UTF-8 to UTF-16 is going to solve
> anything.
>
>
> You should also note that, strictly speaking, € does not exist in the
> ISO-8859-1 charset, although in many cases it may appear to work since
> tools such as IE will render char 128 from a source claiming to be
> ISO-8859-1 as €. 128 is the euro symbol in the Windows-1252 charset,
> which is compatible with ISO-8859-1 except for characters 128-159, where
> ISO has a set of almost never used control codes that Windows has
> replaced with a set of more useful characters.
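
The difference between the two charsets described above can be checked with a minimal Python sketch. It is only an illustration using Python's built-in codecs, not anything taken from the original posters' systems; the variable names are made up for the example.

```python
# Illustrative only: how the euro sign behaves in the two charsets discussed.
EURO = "\u20ac"  # the euro sign

# Windows-1252 assigns the euro sign to byte 0x80 (decimal 128).
cp1252_bytes = EURO.encode("cp1252")

# ISO-8859-1 has no euro sign, so a strict encode fails.
try:
    EURO.encode("iso-8859-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False

# Read as ISO-8859-1, byte 0x80 is a C1 control code, not a printable character.
as_latin1 = cp1252_bytes.decode("iso-8859-1")

# A lossy encode substitutes "?", one way a question mark can end up
# where a euro sign was expected.
lossy = EURO.encode("iso-8859-1", errors="replace")

print(cp1252_bytes, latin1_has_euro, repr(as_latin1), lossy)
```

Whether the question mark reported in the thread comes from exactly this kind of lossy conversion is an assumption; the sketch only shows that such a conversion produces one.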
>
>
>
> 
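
Two points from the thread, that a correctly declared encoding does not change the decoded string and that UTF-16 costs more bytes for mostly western text, can be sketched in Python as follows. This is a hedged illustration with the standard library's `xml.etree.ElementTree`, not the .NET or SQL Server code under discussion; the sample text and the `build` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

TEXT = "caf\u00e9 costs 3 \u20ac"  # hypothetical sample content


def build(encoding: str) -> bytes:
    """Serialize a tiny document whose declaration matches its real encoding."""
    doc = f'<?xml version="1.0" encoding="{encoding}"?><note>{TEXT}</note>'
    return doc.encode(encoding)


utf8_doc = build("utf-8")
utf16_doc = build("utf-16")  # Python's utf-16 codec prepends a byte order mark

# Once parsed, both documents yield the same Unicode string.
utf8_text = ET.fromstring(utf8_doc).text
utf16_text = ET.fromstring(utf16_doc).text

print(utf8_text == utf16_text, len(utf8_doc), len(utf16_doc))
```

The byte counts differ, but the parsed text does not, which is the sense in which re-encoding the incoming XML changes nothing about the resulting string.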



