Re: [xml-dev] practical question re: Java/XML handling
To: Andrew Welch <andrew.j.welch@-----.--->
Date: 9/3/2009 1:47:00 PM

>> How do you handle entities in the XML ?
This code handles entities just fine as long as their system IDs don't end in ".dtd":
    if (systemId.toLowerCase().endsWith(".dtd"))
        return new InputSource(new NullInputStream());
    else
        // every other entity is resolved the normal way
        return super.resolveEntity(publicId, systemId);
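
The NullInputStream used here is not shown anywhere in the thread. A minimal sketch of how the pieces might fit together follows. The NullInputStream body, the DtdSkippingHandler and DtdSkippingDemo names, the null check on systemId, and the example.invalid URL are all assumptions made for illustration, not the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Assumed stand-in for the NullInputStream named in the snippet: always at end of stream. */
class NullInputStream extends InputStream {
    @Override
    public int read() {
        return -1; // never yields a byte
    }
}

/** Hypothetical handler: substitutes an empty DTD for any system ID ending in ".dtd". */
class DtdSkippingHandler extends DefaultHandler {
    String rootName; // first element name seen, recorded only for the demo

    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws IOException, SAXException {
        // The null check is an added precaution, not part of the original snippet.
        if (systemId != null && systemId.toLowerCase().endsWith(".dtd")) {
            return new InputSource(new NullInputStream());
        }
        return super.resolveEntity(publicId, systemId);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (rootName == null) {
            rootName = qName;
        }
    }
}

class DtdSkippingDemo {
    /** Parses the XML with the handler above and returns the root element name. */
    static String parseRoot(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            DtdSkippingHandler handler = new DtdSkippingHandler();
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.rootName;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The DTD URL is deliberately unreachable; it should never be fetched.
        String xml = "<!DOCTYPE doc SYSTEM \"http://example.invalid/missing.dtd\"><doc>hi</doc>";
        System.out.println(parseRoot(xml));
    }
}
```

Entities declared only in the skipped DTD are not available to the parser with this approach, since the DTD content is replaced by nothing.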
David A. Lee
dlee@c...
http://www.calldei.com
http://www.xmlsh.org
812-482-5224
Andrew Welch wrote:
> How do you handle entities in the XML ?
>
>
> 2009/9/3 David A. Lee <dlee@c...>:
>
>> I solved this problem in a different way that is less destructive. This also
>> works to replace a DTD with a different one, or to force validation against a
>> schema even if a non-existent DTD is specified.
>>
>> This particular implementation requires using the SAXParser, but I believe
>> the idea would work with other parsers that provide similar functionality,
>> namely an override of "resolveEntity". The key trick is to resolve all
>> DTDs with a "NullInputStream" (these are trivial to write, so I won't supply
>> the code here).
>> An empty DTD file validates any XML (at least it does in my tests).
>>
>> Here's the snippet:
>>
>>
>> private class ValidatorHandler extends DefaultHandler {
>>     // ..... other methods as needed
>>     @Override
>>     public InputSource resolveEntity(String publicId, String systemId)
>>             throws IOException, SAXException {
>>         if (systemId.toLowerCase().endsWith(".dtd"))
>>             return new InputSource(new NullInputStream());
>>         else
>>             return super.resolveEntity(publicId, systemId);
>>     }
>> }
>>
>> SAXParserFactory f = SAXParserFactory.newInstance();
>> // .... set up the factory
>>
>> SAXParser parser = f.newSAXParser();
>> // ... set up the parser
>>
>> parser.parse(xml, new ValidatorHandler());
>>
>>
>> David A. Lee
>> dlee@c...
>> http://www.calldei.com
>> http://www.xmlsh.org
>> 812-482-5224
>>
>> Mike Sokolov wrote:
>>
>> After all the discussion about "What is data?" I don't know if this list is
>> the place to discuss actual details of implementation, but please feel free
>> to send me elsewhere if you can think of a better venue.
>>
>> I have a need to handle XML that references a non-existent DTD. The DTD is
>> irrelevant to the actual processing of the XML, and isn't available
>> anywhere, but it is declared in the DOCTYPE. I'm sure many of you have
>> encountered this situation: it's practically the norm, in my experience.
>>
>> After years of dealing with this inherently unsatisfactory situation in a
>> variety of ways, I came up with a new one that I am liking at the moment,
>> which is to insert a Stream into a Java XML processing stack that strips out
>> the prolog of the XML document before handing it off to a parser. This has
>> the nice property that it doesn't require modifications to the stored XML
>> files. It loses PIs and comments and the XML decl, but I can live with
>> that.
>>
>> My question is twofold:
>>
>> 1) does the following code snippet actually do what it is claiming to? Does
>> anybody see any obvious mistakes? My knowledge of the format of DOCTYPE
>> decls and so on is somewhat limited. I read the spec and this seems to work
>> on the examples I have, but I suspect there are some cases I'm not handling.
>>
>> 2) Is there a better approach? Existing code to do the same thing? Some
>> way to tell parsers to ignore the DOCTYPE (even though that seems to run
>> counter to the spec)?
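
On the second question: parsers derived from Apache Xerces, which includes the default JDK parser, recognize a feature that switches off loading of the external DTD when not validating. What follows is only a sketch, assuming such a parser is in use; other parsers may reject the feature URI with a SAXNotRecognizedException. The NoExternalDtdDemo class, the rootOf method, and the example.invalid URL are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class NoExternalDtdDemo {
    // Xerces-specific feature URI; this is an assumption about the parser in use.
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    /** Parses without loading the external DTD and returns the root element name. */
    static String rootOf(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false);                 // the feature only applies when not validating
            factory.setFeature(LOAD_EXTERNAL_DTD, false); // do not fetch the external subset
            SAXParser parser = factory.newSAXParser();

            final String[] root = new String[1];
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes atts) {
                            if (root[0] == null) {
                                root[0] = qName;
                            }
                        }
                    });
            return root[0];
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE doc SYSTEM \"http://example.invalid/missing.dtd\"><doc>hi</doc>";
        System.out.println(rootOf(xml));
    }
}
```

This leaves the stored files and the DOCTYPE untouched, at the cost of depending on a parser-specific feature rather than on standard SAX behavior.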
>>
>> Thanks for your attention...
>>
>> -Mike Sokolov
>>
>> import java.io.IOException;
>> import java.io.InputStream;
>> import java.io.PushbackInputStream;
>>
>> /**
>>  * An InputStream for XML that strips off the prolog of an XML
>>  * document. The idea is to avoid having to prevent parsers from
>>  * attempting to process an external DTD.
>>  *
>>  * @author sokolov
>>  */
>> class XmlNoPrologInputStream extends PushbackInputStream {
>>     XmlNoPrologInputStream(InputStream base) throws IOException {
>>         super(base, 2);
>>         int c;
>>         while ((c = read()) >= 0) {
>>             if (c == '<') {
>>                 int c1 = read();
>>                 if (c1 < 0) {
>>                     // ill-formed: input ended right after '<'. Push the '<'
>>                     // back so the parser reports the error; reset() is not
>>                     // supported by PushbackInputStream and would throw.
>>                     unread(c);
>>                     return;
>>                 }
>>                 // XML declaration, PI, comment or DOCTYPE
>>                 if (c1 == '?' || c1 == '!')
>>                     continue;
>>                 // must be the start of the document: arrange to begin
>>                 // reading here
>>                 unread(c1);
>>                 unread(c);
>>                 return;
>>             }
>>         }
>>     }
>> }
>>
>> _______________________________________________________________________
>>
>> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
>> to support XML implementation and development. To minimize
>> spam in the archives, you must subscribe before posting.
>>
>> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
>> Or unsubscribe: xml-dev-unsubscribe@l...
>> subscribe: xml-dev-subscribe@l...
>> List archive: http://lists.xml.org/archives/xml-dev/
>> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>>
>>
>
>
>
>
