Altova Mailing List Archives


converting character entities to us-ascii /equivalents/

From: Robert Koberg <rob@------.--->
To: XML Developers List <xml-dev@-----.---.--->
Date: 10/6/2004 9:57:00 PM
Hi,

I need to output several versions of a page (through XSL 
transformations), one of which is us-ascii (for email). But, the content 
might contain some characters that are not supported by us-ascii (like 
em dash - &#151;).

I want the character entities to remain in the content. When 
transforming to us-ascii, I want to replace the entities with some ascii 
text equivalent: For example, '&#151;' would get converted to '--'.

The XML is pulled into the transformation through the document function 
using a custom URIResolver.

Is there an existing solution to this?

Do Apache FOP and its text renderer handle this type of thing?

I have tried to set a ContentHandler (actually a DefaultHandler) on the 
XMLReader and tried to replace a character entity, but I am doing 
something wrong and am confused about how to proceed. Using the code 
below I get a recoverable error using saxon/aelfred and a failure when 
using saxon/xerces.

Here is a snippet from the URIResolver:


InputSource in = new InputSource(file.getAbsolutePath());
SAXSource source = new SAXSource(in);
XMLReader reader = null;
try {
   reader = XMLReaderFactory.createXMLReader("com.icl.saxon.aelfred.SAXDriver");
   //reader = XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
} catch (SAXException e) {
   System.err.println(e.getMessage());
}

reader.setContentHandler(new AsciiHandler());

source.setXMLReader(reader);

return source;



And the DefaultHandler has one method:


public void characters(char[] text, int start, int length) {

   String str = new String(text, start, length);
   if (str.indexOf(174) > -1) {
    str.replaceAll("\u00AE", "(Registered Trademark)");
   }
   text = str.toCharArray();
}

How can I do this? Is there a better way to handle this type of thing?
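
One possible direction, offered as a minimal sketch rather than a tested fix for this setup: wrap the XMLReader in a SAX filter instead of setting a ContentHandler on it. The handler set in the resolver is probably replaced when the transformer installs its own, and in the characters() method above the result of replaceAll() is discarded and reassigning the text parameter does not change anything the parser passes on. The class name AsciiFilter, the helper toAscii, and the replacement table are hypothetical; only XMLFilterImpl and its methods come from the standard org.xml.sax.helpers package.

```java
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: sits between the parser and whatever handler the
// transformer installs, rewriting character data on the way through.
class AsciiFilter extends XMLFilterImpl {

    AsciiFilter(XMLReader parent) {
        super(parent);
    }

    // Illustrative mapping only; extend the switch as needed.
    // Any other character above 127 is replaced with '?'.
    static String toAscii(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\u2014': out.append("--"); break;                     // em dash
                case '\u00AE': out.append("(Registered Trademark)"); break; // registered sign
                case '\u00A9': out.append("(c)"); break;                    // copyright sign
                default: out.append(c < 128 ? c : '?');
            }
        }
        return out.toString();
    }

    @Override
    public void characters(char[] text, int start, int length) throws SAXException {
        // Build a new array and forward it downstream; the incoming
        // array belongs to the parser and is left untouched.
        char[] replaced = toAscii(new String(text, start, length)).toCharArray();
        super.characters(replaced, 0, replaced.length);
    }
}
```

In the URIResolver snippet this would presumably replace the setContentHandler call, with source.setXMLReader(new AsciiFilter(reader)) in place of source.setXMLReader(reader). Note that the parser has already resolved character references by the time characters() is called, so the filter sees the actual character (for example U+00AE), not the text of the reference.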

thanks,
-Rob

