Re: [xml-dev] practical question re: Java/XML handling

To: "David A. Lee" <dlee@-------.--->
Date: 9/3/2009 1:41:00 PM

How do you handle entities in the XML?

2009/9/3 David A. Lee <dlee@c...>:
> I solved this problem in a different way that is less destructive. This also
> works to replace a DTD with a different one or to force validation on a
> schema even if a non-existent DTD is specified.
>
> This particular implementation requires using the SAXParser but I believe
> the idea would work with other parsers that provide similar functionality,
> namely an override of "resolveEntity". The key trick is to resolve all
> DTDs with a "NullInputStream" (these are trivial to write so I won't supply
> the code here).
> An empty DTD file validates any XML (at least it does in my tests).
>
> Here's the snippet:
>
> private class ValidatorHandler extends DefaultHandler {
>     // ..... other methods as needed
>     @Override
>     public InputSource resolveEntity(String publicId, String systemId)
>             throws IOException, SAXException {
>
>         if (systemId.toLowerCase().endsWith(".dtd"))
>             return new InputSource(new NullInputStream());
>         else
>             return super.resolveEntity(publicId, systemId);
>     }
> }
>
> SAXParserFactory f = SAXParserFactory.newInstance();
> // .... set up the factory
>
> SAXParser parser = f.newSAXParser();
> // ... set up the parser
>
> parser.parse(xml, new ValidatorHandler());
>
> David A. Lee
> dlee@c...
> http://www.calldei.com
> http://www.xmlsh.org
> 812-482-5224
>
> Mike Sokolov wrote:
>
> After all the discussion about "What is data?" I don't know if this list is
> the place to discuss actual details of implementation, but please feel free
> to send me elsewhere if you can think of a better venue.
>
> I have a need to handle XML that references a non-existent DTD. The DTD is
> irrelevant to the actual processing of the XML, and isn't available
> anywhere, but it is declared in the DOCTYPE. I'm sure many of you have
> encountered this situation: it's practically the norm, in my experience.
>
> After years of dealing with this inherently unsatisfactory situation in a
> variety of ways, I came up with a new one that I am liking at the moment,
> which is to insert a Stream into a Java XML processing stack that strips out
> the prolog of the XML document before handing it off to a parser. This has
> the nice property that it doesn't require modifications to the stored XML
> files. It loses PIs and comments and the XML decl, but I can live with
> that.
>
> My question is twofold:
>
> 1) Does the following code snippet actually do what it is claiming to? Does
> anybody see any obvious mistakes? My knowledge of the format of DOCTYPE
> decls and so on is somewhat limited. I read the spec and this seems to work
> on the examples I have, but I suspect there are some cases I'm not handling.
>
> 2) Is there a better approach? Existing code to do the same thing? Some
> way to tell parsers to ignore the DOCTYPE (even though that seems to run
> counter to the spec)?
>
> Thanks for your attention...
>
> -Mike Sokolov
>
> import java.io.IOException;
> import java.io.InputStream;
> import java.io.PushbackInputStream;
>
> /**
>  * An InputStream for XML that strips off the prolog of an XML
>  * document. The idea is to avoid having to prevent parsers from
>  * attempting to process an external DTD.
>  *
>  * @author sokolov
>  */
> class XmlNoPrologInputStream extends PushbackInputStream {
>     XmlNoPrologInputStream(InputStream base) throws IOException {
>         super(base, 2);
>         int c;
>         while ((c = read()) >= 0) {
>             if (c == '<') {
>                 int c1 = read();
>                 if (c1 < 0) {
>                     // ill-formed: push the '<' back and let the parser
>                     // report the error (PushbackInputStream.reset()
>                     // always throws IOException)
>                     unread(c);
>                     return;
>                 }
>                 // XML declaration, PI, comment or DOCTYPE
>                 if (c1 == '?' || c1 == '!')
>                     continue;
>                 // must be the start of the document: arrange to begin
>                 // reading here
>                 unread(c1);
>                 unread(c);
>                 return;
>             }
>         }
>     }
> }
>
> _______________________________________________________________________
>
> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
> to support XML implementation and development. To minimize
> spam in the archives, you must subscribe before posting.
>
> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
> Or unsubscribe: xml-dev-unsubscribe@l...
> subscribe: xml-dev-subscribe@l...
> List archive: http://lists.xml.org/archives/xml-dev/
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>

--
Andrew Welch
http://andrewjwelch.com
Kernow: http://kernowforsaxon.sf.net/
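
For reference, here is a minimal, self-contained sketch of the "NullInputStream" idea from the quoted message, since that class was not supplied there. The names IgnoreDtdExample, RootNameHandler and parseRootName, and the example.invalid URL, are hypothetical and chosen only for illustration. The sketch assumes the JDK's default JAXP SAX parser, which asks the handler's resolveEntity for the external DTD subset. It only shows that parsing can proceed when the referenced DTD cannot be fetched; it says nothing about documents that reference entities declared in the missing DTD.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class IgnoreDtdExample {

    /** Hypothetical stand-in for the NullInputStream mentioned above: always at end of stream. */
    static class NullInputStream extends InputStream {
        @Override
        public int read() {
            return -1;
        }
    }

    /** Remembers the root element name and resolves every *.dtd system id to an empty stream. */
    static class RootNameHandler extends DefaultHandler {
        String rootName;

        @Override
        public InputSource resolveEntity(String publicId, String systemId)
                throws IOException, SAXException {
            if (systemId != null && systemId.toLowerCase().endsWith(".dtd")) {
                // Hand the parser an empty DTD instead of fetching the real one.
                return new InputSource(new NullInputStream());
            }
            return super.resolveEntity(publicId, systemId);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (rootName == null) {
                rootName = qName;
            }
        }
    }

    /** Parses the given XML text and returns the qualified name of its root element. */
    static String parseRootName(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        RootNameHandler handler = new RootNameHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.rootName;
    }

    public static void main(String[] args) throws Exception {
        // The DTD URL is deliberately unreachable (hypothetical address).
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE doc SYSTEM \"http://example.invalid/missing.dtd\">\n"
                + "<doc><item/></doc>";
        System.out.println(parseRootName(xml));
    }
}
```

Unlike the prolog-stripping stream, this leaves the document bytes untouched, so comments, PIs and the XML declaration still reach the parser.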
