Altova Mailing List Archives

Re: [xsl] xsl:sort with msxml english language, danish characters, weird results

From: "W. Eliot Kimber" <ekimber@------------------->
Date: 10/25/2004 8:38:00 AM
Michael Kay wrote:

I'm not sure I'm following here--at least using Java 
you should be able to achieve any collation sequence whatsoever.

But I'm not sure what you mean by sorting 646 before 10646.

A possible algorithm is that any sequence of digits counts as a single
collation unit, which is collated before the first collation unit derived
from non-digit characters, and has a collation value equal to its decimal value.

I don't believe you can achieve this with a RuleBasedCollator.

Ah, I understand now--I misunderstood your comment as being about the 
standards, not the strings "646" and "10646".

I think you are correct, although I'll have to test it.

Of course, this type of rule can be implemented using a custom 
Comparator that applies whatever rule you want while 
delegating the character-level comparison to a rule-based collator. I 
don't think there's any way for a purely declarative mechanism, which 
is what I understand the UCA to define (and what RuleBasedCollator 
implements), to handle all cases.
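
A minimal sketch of what such a Comparator might look like, following the algorithm described above: each run of digits is one unit compared by its decimal value, a digit run sorts before a non-digit run, and everything else is handed to a Collator. The class name DigitRunComparator and its helper methods are hypothetical, not from this thread or any library, and only ASCII digits are treated as digits here, which is a simplifying assumption.

```java
import java.math.BigInteger;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

/** Hypothetical example class: numeric comparison for digit runs, Collator for the rest. */
final class DigitRunComparator implements Comparator<String> {
    private final Collator collator;

    DigitRunComparator(Collator collator) {
        this.collator = collator;
    }

    @Override
    public int compare(String a, String b) {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            boolean aDigits = isAsciiDigit(a.charAt(i));
            boolean bDigits = isAsciiDigit(b.charAt(j));
            int iEnd = runEnd(a, i, aDigits);
            int jEnd = runEnd(b, j, bDigits);
            String runA = a.substring(i, iEnd);
            String runB = b.substring(j, jEnd);

            int c;
            if (aDigits && bDigits) {
                // Both units are digit runs: compare by decimal value.
                c = new BigInteger(runA).compareTo(new BigInteger(runB));
            } else if (aDigits != bDigits) {
                // A digit run collates before any non-digit unit.
                c = aDigits ? -1 : 1;
            } else {
                // Character-level comparison is delegated to the collator.
                c = collator.compare(runA, runB);
            }
            if (c != 0) {
                return c;
            }
            i = iEnd;
            j = jEnd;
        }
        // One string is exhausted: the one with nothing left sorts first.
        return Integer.compare(a.length() - i, b.length() - j);
    }

    private static boolean isAsciiDigit(char ch) {
        return ch >= '0' && ch <= '9';
    }

    /** Returns the index just past the run of digits (or non-digits) starting at start. */
    private static int runEnd(String s, int start, boolean digits) {
        int k = start;
        while (k < s.length() && isAsciiDigit(s.charAt(k)) == digits) {
            k++;
        }
        return k;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(List.of("ISO 10646", "ISO 646"));
        names.sort(new DigitRunComparator(Collator.getInstance(Locale.ENGLISH)));
        System.out.println(names);
    }
}
```

Note that in this sketch "007" and "7" compare as equal, so a real implementation would probably want a tie-breaker. Any Collator can be passed in, including a RuleBasedCollator built from custom rules.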


W. Eliot Kimber
Professional Services
Innodata Isogen
9390 Research Blvd, #410
Austin, TX 78759
(512) 372-8122



