Altova Mailing List Archives


Re: [xsl] character entities

From: "Andrew Welch" <andrew.j.welch@--------->
To:
Date: 11/3/2008 8:55:00 AM
Hi,

> I'm having a wee spot of bother with character entities.

It's character encoding rather than character entities.

> This data is then put into fields within a Zend Search Lucene index, via
> php (that's why I first "flattened" it).
>
> This index data is then queried (again via php) and the results sent
> to/rendered by a browser.
>
> If I put &#241_; (minus the underline character, which I've added so
> this email is not mis-parsed) in my original xml, and using
> encoding="iso-8859-1" for it and my xsl stylesheet, then my xsl
> transforms that into a (Spanish) n character with a tilde on top: ñ.
>
> If I tell ZSL to index fields using 'iso-8859-1' encoding, my Spanish n
> becomes: CB1. If I tell ZSL to index fields using 'utf-8' encoding, my
> Spanish n becomes: C1.

These sorts of issues are nearly always a case of writing in one
encoding and reading in another, and you just need to track down where
the reading and writing is happening - it could be a string to byte
conversion in your code, or parsing of the markup in the browser, or
even the text viewer you are using to check the output (such as the
Eclipse output window).
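
As a minimal sketch of that mismatch (in Python rather than the PHP used in the thread, purely for illustration; the variable names are made up), writing the n-tilde as UTF-8 and then reading the same bytes as ISO-8859-1 produces two characters instead of one:

```python
# Illustration only: one character written as UTF-8, then read as ISO-8859-1.
original = "\u00f1"  # n-tilde, code point 241

utf8_bytes = original.encode("utf-8")          # two bytes: 0xC3 0xB1
latin1_bytes = original.encode("iso-8859-1")   # one byte: 0xF1

# Decoding the UTF-8 bytes with the wrong codec yields two characters
# (A-tilde followed by the plus-minus sign).
misread = utf8_bytes.decode("iso-8859-1")

# Decoding with the matching codec round-trips cleanly.
roundtrip = utf8_bytes.decode("utf-8")

print(utf8_bytes, latin1_bytes, repr(misread), repr(roundtrip))
```

The same effect can appear at any boundary where bytes become text or text becomes bytes, which is why each such boundary needs checking.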

> I believe I need to prevent all parsers bar the browser at the end from
> parsing my "special characters", right? But how?

Not really, that's just a way of bypassing encoding problems and
doesn't address the underlying issue.

> Latest effort: I tried using encoding="utf-8" for all levels: my original
> xml, my xsl output, and the input to ZSL's index, & I also saved my xml file
> as utf-8 format, and used the Spanish n inside my xml, i.e. ñ rather than
> &#241;. Doing that, the Spanish n was preserved through the xsl output, but
> ZSL stores it as: C1, & that's also how my browser displays it.

Ahh ok, well that's the right approach, you just need to examine the
code at every step and isolate that point where it's going wrong -
you've got to the output of transform ok, next is to carefully step
through what happens between that and "ZSL".
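
One way to do that stepping-through is to look at the raw bytes at each stage instead of trusting how a viewer renders them. The helper below is a hypothetical sketch in Python (the name `describe` is an assumption, not part of any library mentioned in the thread); the same idea could be applied in PHP by hex-dumping the string before it is handed to the index:

```python
def describe(data: bytes) -> str:
    """Return the bytes in hex plus whether they decode as UTF-8."""
    hex_view = " ".join(f"{b:02x}" for b in data)
    try:
        data.decode("utf-8")
        verdict = "valid UTF-8"
    except UnicodeDecodeError:
        verdict = "not valid UTF-8 (possibly ISO-8859-1)"
    return f"{hex_view} -> {verdict}"


n_tilde = "\u00f1"

# Correct UTF-8 output: c3 b1
print(describe(n_tilde.encode("utf-8")))

# ISO-8859-1 output: a single f1 byte, which is not valid UTF-8
print(describe(n_tilde.encode("iso-8859-1")))

# Double encoding: UTF-8 bytes misread as ISO-8859-1, then written as UTF-8 again
double = n_tilde.encode("utf-8").decode("iso-8859-1").encode("utf-8")
print(describe(double))
```

If the bytes are `c3 b1` leaving the transform but `c3 83 c2 b1` inside the index, the extra conversion happened somewhere in between.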

Using the actual n-tilde character or the character reference 241
shouldn't make any difference, by the way...
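
A quick way to check this, sketched here with Python's standard-library XML parser as a stand-in for whatever parser is actually in the pipeline (the element name `w` and the sample word are made up for the example):

```python
import xml.etree.ElementTree as ET

# Same document twice: once with the numeric character reference,
# once with the literal character stored as UTF-8 bytes.
with_reference = ET.fromstring(
    b'<?xml version="1.0" encoding="utf-8"?><w>a&#241;o</w>'
)
with_literal = ET.fromstring(
    '<?xml version="1.0" encoding="utf-8"?><w>a\u00f1o</w>'.encode("utf-8")
)

# After parsing, both hold the same three-character string.
print(with_reference.text == with_literal.text)
print([hex(ord(c)) for c in with_reference.text])
```

Once parsed, the two forms are indistinguishable, so any difference seen downstream comes from how the text is later serialised or decoded.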


cheers
--
Andrew Welch
http://andrewjwelch.com
Kernow: http://kernowforsaxon.sf.net/

