Altova Mailing List Archives


RE: [xml-dev] JDOM XSLT TransformerConfigurationException

From: "Michael Kay" <mike@--------.--->
To: "'Jack Bush'" <netbeansfan@-----.---.-->, "'Robert Koberg'"
Date: 1/5/2009 11:20:00 AM
Well, for some reason it looks as if you are trying to parse using TagSoup
but the stack trace shows you are actually parsing using Xerces.
 
Michael Kay
http://www.saxonica.com/


  _____  

From: Jack Bush [mailto:netbeansfan@y...] 
Sent: 05 January 2009 02:38
To: Michael Kay; Robert Koberg
Cc: xml-dev@l...
Subject: Re: [xml-dev] JDOM XSLT TransformerConfigurationException



Hi Michael, 

 

The following statements generated the state.xml file:

 

URL stateUrl = new URL("http://www.abc.com");
URLConnection stateconnection = stateUrl.openConnection();
stateisInHtml = stateconnection.getInputStream();
statedisInHtml = new DataInputStream(new BufferedInputStream(stateisInHtml));
System.out.flush();
statefosOutHtml = new FileOutputStream("state.html");
while ((oneChar=statedisInHtml.read()) != -1)
    statefosOutHtml.write(oneChar);
.....

statefrInHtml = new FileReader("state.html");
statebrInHtml = new BufferedReader(statefrInHtml);
SAXBuilder statesaxBuilder = new SAXBuilder("org.ccil.cowan.tagsoup.Parser", false);
org.jdom.Document statejdomDocument = statesaxBuilder.build(statebrInHtml);
XMLOutputter stateoutputter = new XMLOutputter();
statefwOutXml = new FileWriter("state.xml");
statebwOutXml = new BufferedWriter(statefwOutXml);
stateoutputter.output(statejdomDocument, statebwOutXml);

 

XPath had no problem looking up state.xml.

 

Thanks,

Jack

 


  _____  

From: Michael Kay <mike@s...>
To: Jack Bush <netbeansfan@y...>; Robert Koberg <rob@k...>
Cc: xml-dev@l...
Sent: Monday, 5 January, 2009 2:13:33 AM
Subject: RE: [xml-dev] JDOM XSLT TransformerConfigurationException


Nevertheless, I have now encountered another issue:

 
java.io.UTFDataFormatException: Invalid byte 1 of 1-byte UTF-8 sequence. 
 
There's only one explanation of that: the parser is expecting the document
to be encoded in UTF-8 but it isn't. To understand why it isn't, you need to
examine how the document was created and any transcodings that might have
taken place before it reached the parser.
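One way such a mismatch can arise, offered only as a sketch and not as a confirmed diagnosis of this case: text that declares encoding="UTF-8" is pushed through a Writer that uses a different charset. In the Java versions of that period, FileReader and FileWriter used the platform default charset. The class name EncodingMismatchDemo, the sample document and the use of the JDK's built-in JAXP parser (instead of JDOM and standalone Xerces) are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

final class EncodingMismatchDemo {

    // Hypothetical sample document: it declares UTF-8 and contains one non-ASCII character.
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?><p>caf\u00e9</p>";

    // Serialise the text through a Writer bound to the given charset,
    // which is what a FileWriter does implicitly with the default charset.
    static byte[] encode(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        }
        return bytes.toByteArray();
    }

    // Returns true if the JDK's default JAXP parser accepts the bytes.
    static boolean parses(byte[] data) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(data));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Bytes written as ISO-8859-1 contradict the UTF-8 declaration.
        System.out.println(parses(encode(XML, StandardCharsets.ISO_8859_1)));
        // Bytes written as UTF-8 agree with the declaration.
        System.out.println(parses(encode(XML, StandardCharsets.UTF_8)));
    }
}
```

If this is what happened, the usual remedies are to hand the parser an InputStream so it can detect the encoding itself, and to write the serialised document to an OutputStream or to a Writer constructed with an explicit UTF-8 charset.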
 

        at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
        at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
        at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
        at org.apache.xerces.impl.XMLEntityScanner.skipChar(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.jdom.input.SAXBuilder.build(SAXBuilder.java:489)
        at org.jdom.input.SAXBuilder.build(SAXBuilder.java:928)
        at JDOMTrAXPojoInvestmentBean.main(JDOMTrAXPojoInvestmentBean.java:45)

The header of state.xml is as follows:
 
  <?xml version="1.0" encoding="UTF-8" ?> 
  <!DOCTYPE html (View Source for full doctype...)> 
- <html xmlns="http://www.w3.org/1999/xhtml"
xmlns:html="http://www.w3.org/1999/xhtml">

Any ideas on what is causing this issue and how to overcome it?
Likewise, how do I define the correct namespace prefix? Is it possible
that this document has two namespace bindings, a default one and one with
the prefix 'html'? If so, which one should I use?
 
 It's certainly inelegant to bind the same namespace to two prefixes like
this, though it's not incorrect. Again to prevent it happening we need to
understand how you created the document.
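As for which prefix to use: in XPath the prefix is only a local alias for the namespace URI, so any prefix bound to the XHTML namespace should select the same elements. The sketch below illustrates that with the JDK's own JAXP XPath API, not JDOM's XPath class. The class name TwoPrefixesDemo, the sample document and the prefix "h" are hypothetical.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class TwoPrefixesDemo {

    static final String XHTML = "http://www.w3.org/1999/xhtml";

    // Hypothetical document that binds the XHTML namespace twice,
    // once as the default namespace and once to the prefix "html".
    static final String DOC =
            "<html xmlns=\"" + XHTML + "\" xmlns:html=\"" + XHTML + "\">"
            + "<html:head><title>States</title></html:head></html>";

    // Evaluates the expression with the given prefix bound to the XHTML namespace.
    static String evaluate(final String prefix, String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(DOC)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String p) {
                return prefix.equals(p) ? XHTML : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) {
                return XHTML.equals(uri) ? prefix : null;
            }
            public Iterator<String> getPrefixes(String uri) {
                return XHTML.equals(uri)
                        ? Collections.singletonList(prefix).iterator()
                        : Collections.<String>emptyIterator();
            }
        });
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        // The prefix used in the expression need not appear in the document.
        System.out.println(evaluate("h", "/h:html/h:head/h:title"));
        System.out.println(evaluate("html", "/html:html/html:head/html:title"));
        // Unprefixed names mean "no namespace" in XPath 1.0, so this selects nothing.
        System.out.println(evaluate("h", "/html/head/title").isEmpty());
    }
}
```

Both prefixed expressions select the same title element, while the unprefixed one matches nothing, which is why a prefix has to be declared on the XPath side even though the document uses a default namespace.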
 
Michael Kay 

