Altova Mailing List Archives


Re: [xml-dev] Accurate line and column numbers for elements and attributes in Java

From: John Snelson <john.snelson@------.--->
To: "Klotz, Leigh" <Leigh.Klotz@-----.--->
Date: 6/6/2008 5:45:00 PM

Hi Leigh,

There's a reason that most XML parsers don't report accurate line and 
column numbers - the XML specification makes it difficult in a number of 
ways. For instance, the namespaces spec basically requires that all 
attributes of an element be parsed before the element can be reported, 
which often means that line and column information for the attributes is 
lost. Similarly, the XML spec applies attribute value normalization to 
each attribute value, meaning that the original position of characters 
in the source document is lost.
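
To make the two effects above concrete, here is a minimal sketch using the JDK's built-in SAX parser. The class and method names (AttributePositionDemo, report) are made up for illustration. It records, for each attribute, the value the parser hands over and the position the Locator gives at that moment. With typical parsers, every attribute of an element shares a single position (usually the end of the start tag), and a literal newline inside a value has already become a space.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

public class AttributePositionDemo {
    /** Returns one entry per attribute, formatted as "name=[value] @line:column". */
    static List<String> report(String xml) throws Exception {
        final List<String> out = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator l) {
                locator = l;
            }

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                // The locator is queried once per attribute, but the parser has
                // already consumed the whole start tag, so the position is the
                // same for all of them.
                for (int i = 0; i < atts.getLength(); i++) {
                    out.add(atts.getQName(i) + "=[" + atts.getValue(i) + "] @"
                            + locator.getLineNumber() + ":" + locator.getColumnNumber());
                }
            }
        });
        return out;
    }

    public static void main(String[] args) throws Exception {
        // The value of attribute "a" contains a literal newline in the source.
        String xml = "<t\n  a=\"x\ny\"\n  b=\"2\"/>";
        for (String line : report(xml)) {
            System.out.println(line);
        }
    }
}
```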

What a lot of XML tools end up doing is reporting a line number and then 
doing something like quoting the attribute value marked with the 
position of the error. This is quite unsatisfactory when you're used to 
accurate line and column numbers from other tools - especially for 
programming languages.
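
As a rough illustration of that workaround, the sketch below formats a message from a line number, the attribute name, the (already normalized) value, and an offset into that value. The names AttributeErrorQuote and quote are hypothetical and do not come from any particular tool.

```java
public class AttributeErrorQuote {
    /**
     * Builds a two-line message: the quoted attribute on the first line and a
     * caret under the offending character on the second. The offset is relative
     * to the normalized value, not to the source file.
     */
    static String quote(int line, String attrName, String value, int offset) {
        if (offset < 0 || offset > value.length()) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        String prefix = "line " + line + ": " + attrName + "=\"";
        StringBuilder sb = new StringBuilder();
        sb.append(prefix).append(value).append("\"\n");
        // Pad so the caret lines up with value.charAt(offset) on the line above.
        for (int i = 0; i < prefix.length() + offset; i++) {
            sb.append(' ');
        }
        sb.append('^');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(quote(12, "select", "foo(bar", 3));
    }
}
```

The reader gets the right line and can see where in the value the problem is, but still has to find that attribute in the source by hand.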

In writing my own parser FAXPP [1] (in C, so not great for you), one of 
my aims was to be able to use it to report accurate line and column 
numbers when parsing XSLT. In order to do that it has an option to turn 
off attribute value normalization - but of course the application then 
has to perform the same operation later to be described as a conformant 
XML parser.
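
For a sense of what "performing the same operation later" involves, here is a sketch in Java (not FAXPP's C API, and not its actual implementation) of normalizing a raw CDATA attribute value while remembering where each resulting character came from. It assumes the application has the raw text between the quotes; the names AttrValueNormalizer, Result and rawOffsets are invented for the example. It handles whitespace, character references and the five predefined entities only; other general entities need the DTD and are rejected here.

```java
import java.util.Arrays;

public class AttrValueNormalizer {
    /** Normalized value plus, for each normalized char, its offset in the raw text. */
    static final class Result {
        final String value;
        final int[] rawOffsets;
        Result(String value, int[] rawOffsets) {
            this.value = value;
            this.rawOffsets = rawOffsets;
        }
    }

    private static String predefined(String name) {
        switch (name) {
            case "amp": return "&";
            case "lt": return "<";
            case "gt": return ">";
            case "apos": return "'";
            case "quot": return "\"";
            default: return null; // would need the DTD; not handled in this sketch
        }
    }

    static Result normalize(String raw) {
        StringBuilder out = new StringBuilder();
        int[] offsets = new int[raw.length()]; // output is never longer than the input
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            int start = i;
            String piece;
            if (c == '&') {
                int semi = raw.indexOf(';', i);
                if (semi < 0) throw new IllegalArgumentException("unterminated reference at " + i);
                String name = raw.substring(i + 1, semi);
                if (name.startsWith("#x")) {
                    piece = new String(Character.toChars(Integer.parseInt(name.substring(2), 16)));
                } else if (name.startsWith("#")) {
                    piece = new String(Character.toChars(Integer.parseInt(name.substring(1))));
                } else {
                    piece = predefined(name);
                    if (piece == null) throw new IllegalArgumentException("unsupported entity: " + name);
                }
                i = semi + 1;
            } else if (c == '\r') {
                // Line-end handling: CRLF or a lone CR counts as one line break,
                // which then becomes a single space.
                piece = " ";
                i += (i + 1 < raw.length() && raw.charAt(i + 1) == '\n') ? 2 : 1;
            } else if (c == '\n' || c == '\t') {
                piece = " ";
                i++;
            } else {
                piece = String.valueOf(c);
                i++;
            }
            for (int k = 0; k < piece.length(); k++) {
                offsets[out.length() + k] = start;
            }
            out.append(piece);
        }
        return new Result(out.toString(), Arrays.copyOf(offsets, out.length()));
    }
}
```

Given the line and column where the raw value starts, rawOffsets lets an error at index n of the normalized value be reported against the original source text. Note that a character reference such as `&#10;` yields a real newline that is not turned into a space, which is part of why the mapping cannot be reconstructed from the normalized value alone.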

John

[1] http://faxpp.sourceforge.net/

Klotz, Leigh wrote:
> I'm developing an XML representation for a templating system which can
> report errors at both parse and at execution time.
> 
> The existing system does a good job of reporting input file line numbers
> on its element- and attribute-like constructs, and I'd like to provide
> that same functionality with the XML representation.
> 
> Most SAX2 implementations (including the default one with recent Sun
> Java systems) offer start and end line numbers of tags, but provide only
> the ending column number of element content.
> 
> While I'm able to make use of this information, the most critical need
> for column numbers comes in attribute values, so that errors in
> attribute-value template expressions can be reported with column numbers
> accurate in the original input stream.  SAX2 is no help here, as the
> attributes spring forth fully formed as if from Zeus's forehead.
> 
> I've taken a look at Piccolo, which has a commented-out set of column
> number info (on elements only), and at Michael Kay's suggestion of using
> the Saxon AugmentedSource (
> http://www.nabble.com/-xml-dev--Line-number-of-a-node-to7990605.html#a7990605
> ) but again this provides only line numbers, not column numbers,
> and not for attributes.
> 
> Does anyone have experience or recommendations for a non-validating,
> namespace-aware XML parser in Java which supports or can easily be made
> to support accurate beginning line and column numbers of both elements
> and attributes (and maybe text)?  The solution needs to be released
> under a BSD license.
> 
> Thank you,
> Leigh.

-- 
John Snelson, Oracle Corporation            http://snelson.org.uk/john
Berkeley DB XML:            http://oracle.com/database/berkeley-db/xml
XQilla:                                  http://xqilla.sourceforge.net

