Re: [xml-dev] practical question re: Java/XML handling
To: Mike Sokolov <sokolov@--------.--->
Date: 9/3/2009 1:37:00 PM

I solved this problem in a different way that is less destructive. It also works to replace a DTD with a different one, or to force validation against a schema even if a nonexistent DTD is specified.

This particular implementation requires using the SAXParser, but I believe the idea would work with other parsers that provide similar functionality, namely an override of "resolveEntity".

The key trick is to resolve all DTDs with a "NullInputStream" (these are trivial to write, so I won't supply the code here). An empty DTD file validates any XML (at least it does in my tests).

Here's the snippet:

    private class ValidatorHandler extends DefaultHandler {
        ..... // other methods as needed

        @Override
        public InputSource resolveEntity(String publicId, String systemId)
                throws IOException, SAXException {
            if (systemId.toLowerCase().endsWith(".dtd"))
                return new InputSource(new NullInputStream());
            else
                return super.resolveEntity(publicId, systemId);
        }
    }

    SAXParserFactory f = SAXParserFactory.newInstance();
    // .... set up the factory
    SAXParser parser = f.newSAXParser();
    // ... set up the parser
    parser.parse(xml, new ValidatorHandler());

David A. Lee
dlee@c...
http://www.calldei.com
http://www.xmlsh.org
812-482-5224

Mike Sokolov wrote:
> After all the discussion about "What is data?" I don't know if this
> list is the place to discuss actual details of implementation, but
> please feel free to send me elsewhere if you can think of a better venue.
>
> I have a need to handle XML that references a non-existent DTD. The
> DTD is irrelevant to the actual processing of the XML, and isn't
> available anywhere, but it is declared in the DOCTYPE.
> I'm sure many of you have encountered this situation: it's practically
> the norm, in my experience.
>
> After years of dealing with this inherently unsatisfactory situation
> in a variety of ways, I came up with a new one that I am liking at the
> moment, which is to insert a Stream into a Java XML processing stack
> that strips out the prolog of the XML document before handing it off
> to a parser. This has the nice property that it doesn't require
> modifications to the stored XML files. It loses PIs and comments and
> the XML decl, but I can live with that.
>
> My question is twofold:
>
> 1) Does the following code snippet actually do what it is claiming
> to? Does anybody see any obvious mistakes? My knowledge of the
> format of DOCTYPE decls and so on is somewhat limited. I read the
> spec and this seems to work on the examples I have, but I suspect
> there are some cases I'm not handling.
>
> 2) Is there a better approach? Existing code to do the same thing?
> Some way to tell parsers to ignore the DOCTYPE (even though that seems
> to run counter to the spec)?
>
> Thanks for your attention...
>
> -Mike Sokolov
>
> import java.io.IOException;
> import java.io.InputStream;
> import java.io.PushbackInputStream;
>
> /**
>  * An InputStream for XML that strips off the prolog of an XML
>  * document. The idea is to avoid having to prevent parsers from
>  * attempting to process an external DTD.
>  *
>  * @author sokolov
>  */
> class XmlNoPrologInputStream extends PushbackInputStream {
>     XmlNoPrologInputStream(InputStream base) throws IOException {
>         super(base, 2);
>         int c;
>         while ((c = read()) >= 0) {
>             if (c == '<') {
>                 int c1 = read();
>                 if (c1 < 0) {
>                     // ill-formed: input ends right after '<'. Push the
>                     // '<' back; PushbackInputStream.reset() always throws.
>                     unread(c);
>                     return;
>                 }
>                 // XML declaration, PI, comment or DOCTYPE
>                 if (c1 == '?' || c1 == '!')
>                     continue;
>                 // must be the start of the document: arrange to begin
>                 // reading here
>                 unread(c1);
>                 unread(c);
>                 return;
>             }
>         }
>     }
> }
>
> _______________________________________________________________________
>
> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
> to support XML implementation and development. To minimize
> spam in the archives, you must subscribe before posting.
>
> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
> Or unsubscribe: xml-dev-unsubscribe@l...
> subscribe: xml-dev-subscribe@l...
> List archive: http://lists.xml.org/archives/xml-dev/
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
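
For readers who want to try the resolveEntity approach from the reply end to end, here is a minimal sketch. It is not the poster's actual code: the NullInputStream body is only one guess at the "trivial" class the reply leaves out, and the names EmptyDtdDemo, RootNameHandler and rootName are hypothetical, invented for this illustration. It assumes the JDK's default, non-validating SAX parser, and the null check on systemId is an added guard that the original snippet does not have.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class EmptyDtdDemo {

    /** Assumed shape of the "NullInputStream": a stream that is always at end of input. */
    static class NullInputStream extends InputStream {
        @Override
        public int read() {
            return -1;
        }
    }

    /** Hypothetical handler: remembers the root element and answers every *.dtd request with an empty stream. */
    static class RootNameHandler extends DefaultHandler {
        String rootName;

        @Override
        public InputSource resolveEntity(String publicId, String systemId)
                throws IOException, SAXException {
            // Added guard: systemId can be null for some entities.
            if (systemId != null && systemId.toLowerCase().endsWith(".dtd")) {
                return new InputSource(new NullInputStream());
            }
            return super.resolveEntity(publicId, systemId);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (rootName == null) {
                rootName = qName; // first element seen is the document root
            }
        }
    }

    /** Parses the given XML text and returns the qualified name of its root element. */
    static String rootName(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser(); // non-validating unless configured otherwise
            RootNameHandler handler = new RootNameHandler();
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.rootName;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // The DOCTYPE points at a DTD that cannot be fetched; the resolver never tries to.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE doc SYSTEM \"http://example.invalid/missing.dtd\">\n"
                + "<doc><item>text</item></doc>";
        System.out.println(rootName(xml));
    }
}
```

Because the handler is passed to SAXParser.parse, it also acts as the parser's entity resolver, so the stored XML stays untouched and comments, PIs and the XML declaration are still seen by the parser. If DTD validation is switched on in the factory, an empty DTD may no longer be enough, since no elements are declared in it.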