Re: SAX multiple calls to characters()
Date: 8/5/2006 5:53:00 PM
On 02-08-2006, metzger@r... <metzger@r...> wrote:
> I am using the function listed below to handle characters events in
> SAX. It does not handle multiple sequential calls to this function
> correctly.
> For example, I am getting
> "2 4 816 32 64" as a value for an element when processing <vec> 2 4 8
> 16 32 64 </vec>
> because I am getting 2 calls to process the text in this element, one
> for "2 4 8" and the other for "16 32 64".
A SAX API is expected to preserve all characters of the input document,
whitespace included. Your problem is either in your code or (less
likely ;-)) in the SAX implementation you are using.
> I have tried appending a
> blank to the result after each call to this function, but that
> sometimes splits numbers or words, depending on what is passed to this
> function.
I think this is definitely not a good solution :-)
> Is there a better way of handling multiple characters events? Thanks
> public void characters(char[] chars, int start, int length) {
> while ( (length > 0) && Character.isWhitespace(chars[start]) )
> {
> ++start;
> --length;
> }
> while ( (length > 0) &&
> Character.isWhitespace(chars[start+length-1]) ) {
> --length;
> }
As far as I understand, this piece of code is what causes the behaviour you
complain about! You strip the leading and trailing whitespace from every
chunk of characters you receive, so how can you expect to see that
whitespace in the output string?
> if ( length > 0 ) {
> _text += new String(chars,start,length);
> }
> }
Repeated String concatenation is bad for performance, because every +=
copies the whole accumulated string. Consider using a StringBuffer instead
(sb.append(chars, start, length)).
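To make the two points above concrete, here is a minimal sketch of a handler that appends every chunk untouched and trims only once, when the element ends. The class name TextCollector and the getLastValue() accessor are made up for illustration, and it assumes elements contain text only (no nested child elements). It uses StringBuilder, the unsynchronized equivalent of StringBuffer.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of the element most recently closed.
public class TextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private String lastValue;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Start a fresh buffer for each element (assumes no mixed content).
        text.setLength(0);
    }

    @Override
    public void characters(char[] chars, int start, int length) {
        // Append the chunk as-is; the parser may split text at any point,
        // so trimming here would drop whitespace between chunks.
        text.append(chars, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Trim once, now that the complete text of the element is known.
        lastValue = text.toString().trim();
        text.setLength(0);
    }

    public String getLastValue() {
        return lastValue;
    }
}
```

With this approach the whitespace that separates "8" from "16" survives no matter where the parser splits the text, and only the leading and trailing whitespace of the whole element is removed.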