Altova Mailing List Archives: comp.text.xml

Re: Is it possible NOT to replace entity references?
To: NULL
Date: 9/6/2005 12:54:00 AM

Hi,

thanks for the detailed explanation.

You are right, these are two 'issues'. I confused them because the Python SAX parser I use replaces both the predefined and the non-predefined entity references, which is OK. I simply assumed an XSLT processor would also replace both, but that assumption is probably wrong.

I don't know why I prefer &auml; over 'ä', maybe because 7-bit ASCII seems to be more portable, but I can't really find a use case where 'ä' would be less portable.

Thanks,

Stephan

Martin Honnen wrote:
>
> Stephan Hoffmann wrote:
>
>> I use XML mainly as a source for HTML. HTML browsers 'know'
>> certain entity references like &eacute; or &auml;.
>>
>> When I use XSL to transform XML to HTML or XML, these entities are
>> replaced by what they refer to.
>>
>> Is there a way to avoid that?
>
> XSLT/XPath 1.0 at least, which is the current version and the one
> implemented by lots of processors and in wide-spread use, does not
> provide anything in its data model or in its instructions to create
> entity references and to ensure that these are preserved and not
> replaced by the entity content when the result of a transformation is
> serialized.
> You would need to look at a specific XSLT processor and check whether it
> provides any mechanisms outside the standards to deal with entities and
> entity references.
> Saxon 6 has an extension function documented here:
> <http://saxon.sourceforge.net/saxon6.5.4/extensions.html#saxon:entity-ref>
>
>> Two reasons to avoid that:
>> - On my linux machine xsltproc replaced the entities in a way that
>>   my browser did not correctly display the resulting HTML
>>   (I updated my linux distribution and it now works).
>>
>> - &lt; is replaced by < and the output is no longer valid XML/HTML
>
> But &lt; and &gt; are references to entities predefined in XML, and
> certainly if any application supposed to output XML or HTML outputs &lt;
> as a plain '<' character then the application is seriously broken.
> This is a different issue: those characters '<' and '>' are obviously
> special as they delimit tags in both XML and HTML and therefore need to
> be escaped as &lt; and &gt; respectively.
> &auml; in HTML 4 stands for the character 'ä' and that has no special
> meaning in XML or HTML, so if an XSLT processor or other application
> supposed to output XML or HTML simply inserts 'ä' instead of &auml; in a
> document properly encoded and with the proper encoding used and declared
> then there are no problems with well-formedness (or even validity).
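
The two issues separated in this thread can be illustrated with a minimal Python sketch. It is not taken from either poster: the helper name `to_ascii_markup` and the sample string are hypothetical, and it uses only the standard library. It shows that the markup characters must be escaped in any case, while 'ä' can either be written literally in a properly declared encoding or, if 7-bit ASCII output is wanted, be written as a numeric character reference with no named entity involved.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def to_ascii_markup(text: str) -> bytes:
    """Escape markup-significant characters, then emit pure 7-bit ASCII."""
    # escape() turns '&', '<' and '>' into &amp;, &lt; and &gt;.
    escaped = escape(text)
    # Every non-ASCII character becomes a numeric character reference.
    return escaped.encode("ascii", errors="xmlcharrefreplace")


sample = "a < b, K\u00e4se"  # hypothetical sample text containing '<' and 'ä'
ascii_markup = to_ascii_markup(sample)
print(ascii_markup)  # b'a &lt; b, K&#228;se'

# An XML parser expands both kinds of reference when reading the text back.
round_trip = ET.fromstring(b"<p>" + ascii_markup + b"</p>").text
print(round_trip == sample)  # True
```

Numeric character references such as `&#228;` need no DTD, so this output stays well-formed XML and is also understood by HTML browsers, which is one way to keep the result 7-bit ASCII without preserving named entity references.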
