Altova Mailing List Archives: microsoft.public.xml

Re: XML parsing error: An invalid character was found in text cont
Date: 4/2/2007 9:36:00 PM

Stefan explained well why you get the error. Note that in SQL Server 2005 you can pass UTF-8 to the XML datatype via either DBTYPE XML or varbinary.

Best regards
Michael

"Anthony Jones" <Ant@y...> wrote in message news:%23chcWKWYHHA.2556@T......
>
> "Doug Chandler" <DougChandler@d...> wrote in message
> news:9F2519B4-E24B-4E7B-B15B-6352F06F298E@m......
>> Hi Stefan,
>>
>> Thanks for your explanation. Your suggestions are consistent with what I
>> have found over the past week. ISO-8859-1 does work except for the Euro
>> symbol (€), which becomes a question mark (?). It also copes with many
>> other western European characters. I was hoping to find a way of using
>> UTF-8 and the TEXT datatype. The XML is produced by other systems, but we
>> can specify the encoding to use. With UTF-16, yes, everything works, but
>> the files become twice as big and the resources used to process the file
>> are twice as much.
>>
>> It sounds like the solution for now is to use ISO-8859-1, which was the
>> decision I was coming to anyway; I just could not understand why UTF-8
>> would not work into SQL Server, which you have now explained. I may have
>> to trap and escape the Euro symbol so that it works properly, just in
>> case it is used.
>>
>> My other option seems to be to get the XML produced using UTF-8 and
>> change it to UTF-16 before pumping it to the database using the NText
>> datatype. A bit of a hack bodge, methinks.
>>
>> Many thanks for your help.
>>
>
> What data type is Xml.Text?
>
> It will be a string.
>
> What encoding does .NET use to store strings?
>
> Unicode
>
> It doesn't matter what encoding the XML declaration specifies in the
> document; as long as it is consistent with the actual encoding of the
> document contents, the Unicode string result of Xml.Text will be correct.
>
> As has been pointed out, your SP should use NText for the @XML parameter
> (regardless of the data types of the destination fields). Your destination
> fields should use a collation (a poor word MS chose to use there) that can
> accept all the expected characters.
>
> All should work fine, assuming the XML generation is conforming to the
> rules.
>
> My point is you are barking up the wrong tree if you think that changing
> the encoding of the incoming XML from UTF-8 to UTF-16 is going to solve
> anything.
>
> You should also note that, strictly speaking, € does not exist in the
> ISO-8859-1 charset, although in many cases it may appear to work, since
> tools such as IE will render char 128 from a source claiming to be
> ISO-8859-1 as €. 128 is the euro symbol in the Windows-1252 charset, which
> is compatible with ISO-8859-1 except for characters 128-159, where ISO has
> a set of almost never used control codes that Windows has replaced with a
> set of more useful characters.
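
The encoding behaviour discussed in this thread can be reproduced outside SQL Server. The following is a minimal Python sketch, not anything the posters used: the function name `encode_report` and the sample string are hypothetical, and it only shows how the euro sign behaves under ISO-8859-1, Windows-1252, UTF-8 and UTF-16. It says nothing about how SQL Server or .NET handle the data internally.

```python
# Hypothetical helper for illustration; not part of the original thread.
def encode_report(text):
    """Show how `text` fares under the encodings discussed in the thread."""
    report = {}

    # ISO-8859-1 has no euro sign, so a strict encode fails.
    try:
        text.encode("iso-8859-1")
        report["latin1_strict_ok"] = True
    except UnicodeEncodeError:
        report["latin1_strict_ok"] = False

    # A lossy encode substitutes "?", matching the symptom Doug describes.
    report["latin1_replaced"] = text.encode("iso-8859-1", errors="replace")

    # "Trap and escape": emit an XML numeric character reference instead.
    report["latin1_escaped"] = text.encode(
        "iso-8859-1", errors="xmlcharrefreplace"
    )

    # Windows-1252 does define the euro sign, at byte 0x80 (decimal 128).
    report["cp1252"] = text.encode("cp1252")

    # Byte counts without a BOM: pure ASCII text doubles in UTF-16, while a
    # euro sign takes 3 bytes in UTF-8 and 2 in UTF-16.
    report["utf8_len"] = len(text.encode("utf-8"))
    report["utf16_len"] = len(text.encode("utf-16-le"))
    return report


if __name__ == "__main__":
    # Assumed sample document fragment, chosen only for demonstration.
    sample = '<price currency="EUR">€ 10</price>'
    for key, value in encode_report(sample).items():
        print(key, value)

    # Byte 128 read as true ISO-8859-1 is a C1 control code, not a euro sign.
    print(repr(b"\x80".decode("iso-8859-1")))
```

The `xmlcharrefreplace` handler is one way to do the escaping Doug mentions: the euro sign is written as a numeric character reference, which any conforming XML parser turns back into the euro sign regardless of the declared encoding.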
