Altova Mailing List Archives


RE: [xml-dev] JDOM XSLT TransformerException (newbie)

From: "Michael Kay" <mike@--------.--->
To: "'Jack Bush'" <netbeansfan@-----.---.-->
Date: 1/6/2009 3:44:00 PM
It's not a good idea to address your questions to me personally.

It's the XML parser that is fetching the DTD, not the transformation engine.

And it's not fetching the DTD in order to do validation, it's fetching it for the other information that a DTD contains, such as entity definitions. Switching off validation doesn't switch off reading the DTD. (I thought I made that clear in my previous response.)
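
One common way to stop a parser from fetching an external DTD is to give it an EntityResolver that answers every request with an empty document. The following is a minimal sketch of that idea using only the JDK's built-in parser; the class name, method name and sample document are made up for illustration and are not taken from the thread. JDOM's SAXBuilder has a setEntityResolver method that should allow the same approach. Note that with an empty DTD any entity the DTD would have defined (such as &nbsp;) becomes undefined, so this only suits documents that do not rely on such entities.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

public class OfflineDtdDemo {

    // Hypothetical sample input: the DOCTYPE points at a DTD hosted on www.w3.org.
    static final String XHTML =
        "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
        + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
        + "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><p>ok</p></body></html>";

    // Parses the document without any network access and returns the root element name.
    static String parseOffline(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Turning validation off does not, by itself, stop the DTD from being read.
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Answer every external entity request (including the DTD) with empty content.
        EntityResolver emptyDtd =
            (publicId, systemId) -> new InputSource(new StringReader(""));
        builder.setEntityResolver(emptyDtd);

        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getLocalName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseOffline(XHTML));
    }
}
```

A more complete alternative is to resolve the public identifier to a local copy of the DTD, which keeps the entity definitions available while still avoiding the network.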
 
Michael Kay
http://www.saxonica.com/



  
  
  From: Jack Bush [mailto:netbeansfan@y...]
  Sent: 06 January 2009 13:25
  To: Michael Kay
  Cc: xml-dev@l...
  Subject: Re: [xml-dev] JDOM XSLT TransformerException (newbie)


  
  
  
  Hi Michael,
   
  By replacing the FileReader with an InputStream, the following code finally allows me to read state.xml and transform it to state.html, but only when there is an Internet connection:
   
  19. SAXBuilder stateBuilder = new SAXBuilder("org.ccil.cowan.tagsoup.Parser", false);
  20. stateBuilder.setValidation(false);
  21. FileInputStream stateIS = new FileInputStream("E:\\state.xml");
  22. BufferedInputStream stateBIS = new BufferedInputStream(stateIS);
  23. Document stateOriginaljdomDocument = stateBuilder.build(stateBIS);
  24. TransformerFactory stateFactory = TransformerFactory.newInstance();
  25. Transformer stateTransformer = stateFactory.newTransformer(new StreamSource("E:\\stateStyleSheet.xsl"));
  26. JDOMSource stateSource = new JDOMSource(stateOriginaljdomDocument);
  27. JDOMResult stateResult = new JDOMResult();
  28. stateTransformer.transform(stateSource, stateResult);
  ......
  Offline:
  javax.xml.transform.TransformerException: org.jdom.JDOMException: DTD parsing error: www.w3.org
  at org.apache.xalan.transformer.TransformerImpl.fatalError(TransformerImpl.java:738)
  at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:712)
  at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1126)
  at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1104)
  at XMLProject.main(generateXML.java:28)
  Caused by: org.jdom.JDOMException: DTD parsing error: www.w3.org
  at org.jdom.transform.JDOMSource$DocumentReader.parse(JDOMSource.java:525)
  at org.apache.xml.dtm.ref.DTMManagerDefault.getDTM(DTMManagerDefault.java:478)
  at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:655)
  ... 3 more
   
  It appears that the transformation process is trying to validate DTD even 
  though I have turned validation off during parsing. Can you confirm whether 
  the validation attempt is occurring during parsing or transformation step? And 
  how to prevent it from recurring? 
   
  At the same time, the content of state.html is:
   
  <?xml version="1.0" encoding="UTF-8"?>
  <html>
    <body>
      <h2>Transformed State Detail</h2>
      <table border="1">
        <tr bgcolor="lightblue">
          <th align="left">Area Link</th>
          <th align="left">Area Name</th>
        </tr>
      </table>
    </body>
  </html>

  This means that the XPath expression in stateStyleSheet.xsl below is not retrieving either Area Link or Area Name from state.xml:
   
  <?xml version="1.0" encoding="ISO-8859-1"?>
  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="/">
      <html>
        <body>
          <h2>Transformed State Detail</h2>
          <table border="1">
            <tr bgcolor="lightblue">
              <th align="left">Area Link</th>
              <th align="left">Area Name</th>
            </tr>
            <xsl:for-each select="/html/body/div[@id='content']/table[@class='sresults']/tr/td/a">
              <tr>
                <td><xsl:value-of select="@href"/></td>
                <td><xsl:value-of select="@title"/></td>
              </tr>
            </xsl:for-each>
          </table>
        </body>
      </html>
    </xsl:template>
  </xsl:stylesheet>
   
  By contrast, the following JDOM XPath statements do find Area Link and Area Name in state.xml:
   
  XPath stateXpath = XPath.newInstance("/ns:html/ns:body/ns:div[@id='content']/ns:table[@class='sresults']/ns:tr/ns:td/ns:a");
  stateXpath.addNamespace("ns", "http://www.w3.org/1999/xhtml");

  In short, how do I declare the namespace (implicit, explicit, or both) so that stateStyleSheet.xsl accurately picks up these values?
   
  Many thanks again for your valuable advice,
  Jack
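
In XSLT 1.0 an unprefixed name in an XPath expression matches only elements in no namespace, so a path such as /html/body/... selects nothing when the source elements are in the XHTML namespace. The usual remedy is to bind a prefix to that namespace in the stylesheet and use it in every step, just as the JDOM XPath above does with "ns". The sketch below illustrates this with the JDK's built-in transformer; the class name, the prefix "h", the sample document and the text output format are assumptions chosen for the illustration, not part of the original messages.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class NamespaceAwareXslt {

    // Hypothetical source document whose elements are in the XHTML namespace.
    static final String XHTML =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><div id=\"content\">"
        + "<table class=\"sresults\"><tr><td>"
        + "<a href=\"/area/1\" title=\"Area One\">Area One</a>"
        + "</td></tr></table></div></body></html>";

    // The stylesheet binds the prefix "h" to the XHTML namespace and uses it in each step.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns:h=\"http://www.w3.org/1999/xhtml\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:for-each select=\"/h:html/h:body/h:div[@id='content']"
        + "/h:table[@class='sresults']/h:tr/h:td/h:a\">"
        + "<xsl:value-of select=\"@href\"/><xsl:text>|</xsl:text>"
        + "<xsl:value-of select=\"@title\"/><xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template></xsl:stylesheet>";

    // Applies the stylesheet to the document and returns the result as a string.
    static String transform(String xml, String xsl) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Prints one "href|title" line per matching link.
        System.out.print(transform(XHTML, XSL));
    }
}
```

The prefix used in the stylesheet does not have to match any prefix in the source document; only the namespace URI has to be the same.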

  

  
  
  From: Michael Kay <mike@s...>
  To: Jack Bush <netbeansfan@y...>; Robert Koberg <rob@k...>
  Cc: xml-dev@l...
  Sent: Monday, 5 January, 2009 10:19:14 PM
  Subject: RE: [xml-dev] JDOM XSLT TransformerConfigurationException


  

  Well, for some reason it looks as if you are trying to 
  parse using TagSoup but the stack trace shows you are actually parsing using 
  Xerces.
   
  Michael Kay
  http://www.saxonica.com/

  
    
    
    From: Jack Bush [mailto:netbeansfan@y...]
    Sent: 05 January 2009 02:38
    To: Michael Kay; Robert Koberg
    Cc: xml-dev@l...
    Subject: Re: [xml-dev] JDOM XSLT TransformerConfigurationException


    
    
    
    Hi Michael, 
     
    The following statements generated state.xml file: 
     
    // Download the page and save the raw bytes as state.html
    URL stateUrl = new URL("http://www.abc.com");
    URLConnection stateconnection = stateUrl.openConnection();
    stateisInHtml = stateconnection.getInputStream();
    statedisInHtml = new DataInputStream(new BufferedInputStream(stateisInHtml));
    System.out.flush();
    statefosOutHtml = new FileOutputStream("state.html");
    while ((oneChar = statedisInHtml.read()) != -1)
        statefosOutHtml.write(oneChar);
    .....
     
    // Re-read state.html through TagSoup and write it out as state.xml
    // (FileReader and FileWriter use the platform default charset)
    statefrInHtml = new FileReader("state.html");
    statebrInHtml = new BufferedReader(statefrInHtml);
    SAXBuilder statesaxBuilder = new SAXBuilder("org.ccil.cowan.tagsoup.Parser", false);
    org.jdom.Document statejdomDocument = statesaxBuilder.build(statebrInHtml);
    XMLOutputter stateoutputter = new XMLOutputter();
    statefwOutXml = new FileWriter("state.xml");
    statebwOutXml = new BufferedWriter(statefwOutXml);
    stateoutputter.output(statejdomDocument, statebwOutXml);
     
    XPath had no problem looking up state.xml.
     
    Thanks,
    Jack
     
    
    
    
    From: Michael Kay <mike@s...>
    To: Jack Bush <netbeansfan@y...>; Robert Koberg <rob@k...>
    Cc: xml-dev@l...
    Sent: Monday, 5 January, 2009 2:13:33 AM
    Subject: RE: [xml-dev] JDOM XSLT TransformerConfigurationException


    

    Nevertheless, I have now encountered another issue:
    
      java.io.UTFDataFormatException: Invalid byte 1 of 1-byte UTF-8 sequence.
       
      There's only one 
      explanation of that: the parser is expecting the document to be encoded in 
      UTF-8 but it isn't. To understand why it isn't, you need to examine how 
      the document was created and any transcodings that might have taken place 
      before it reached the parser.
       
      
        
        at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
        at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
        at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
        at org.apache.xerces.impl.XMLEntityScanner.skipChar(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.jdom.input.SAXBuilder.build(SAXBuilder.java:489)
        at org.jdom.input.SAXBuilder.build(SAXBuilder.java:928)
        at JDOMTrAXPojoInvestmentBean.main(JDOMTrAXPojoInvestmentBean.java:45)
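
One way such a mismatch can arise is when a document that declares encoding="UTF-8" is written through a Writer that uses a different charset. FileWriter and FileReader use the platform default charset, which before Java 18 was frequently not UTF-8. The sketch below reproduces the effect in memory with the JDK's built-in parser: the same text parses when its bytes really are UTF-8 and fails when they are ISO-8859-1. The class name and sample text are invented for the illustration, and whether this is what happened to state.xml is an assumption that would need checking against how the file was produced.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;

public class EncodingMismatchDemo {

    // Hypothetical document: declares UTF-8 and contains one non-ASCII character.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>Caf\u00e9</name>";

    // Returns true if the bytes parse as XML, false if the parser rejects them.
    static boolean parses(byte[] bytes) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new ByteArrayInputStream(bytes));
            return true;
        } catch (IOException | SAXException e) {
            // A malformed byte sequence surfaces as one of these exceptions.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Bytes match the declaration: the parse succeeds.
        System.out.println(parses(XML.getBytes(StandardCharsets.UTF_8)));
        // Bytes are ISO-8859-1 but the declaration says UTF-8: the parse fails.
        System.out.println(parses(XML.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

When writing the file, wrapping a FileOutputStream in an OutputStreamWriter with an explicit UTF-8 charset, or passing an OutputStream rather than a Writer to the serializer, keeps the bytes consistent with the declaration.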

      The header of state.xml is as follows:
       
      
        <?xml version="1.0" encoding="UTF-8" ?>
        <!DOCTYPE html (View Source for full doctype...)>
      - <html xmlns="http://www.w3.org/1999/xhtml" xmlns:html="http://www.w3.org/1999/xhtml">

      Any ideas on what is causing this issue and how to overcome it? Likewise, how do I define the correct namespace prefix? Is it possible that this document has two namespaces, a default one and one with the prefix 'html'? If so, which one should I use?
       
      It's certainly inelegant to bind the same namespace to two prefixes like this, though it's not incorrect. Again, to prevent it happening we need to understand how you created the document.
       
      Michael Kay

    

