Re: [xsl] for-each-group grouping accented versions of letters

Date: 4/21/2012 1:03:00 AM
You can strip the accents by applying Unicode decomposition and then 
removing the diacritical marks:

<xsl:for-each-group select="index-0"
   group-by="upper-case(
               substring(
                 replace(
                   normalize-unicode(heading, 'NFKD'),
                   '\p{Mn}', ''
                 ), 1, 1
               )
             )">
   <xsl:sort select="current-grouping-key()"/>

When writing the group (= starting letter) to an output file further 
down in your template, you should sort it using the upper-case(…) 
part as the first sort key, then the actual heading as a second 
(tie-breaker) sort key.
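
A minimal sketch of that two-level sort, under some assumptions: each 
index-0 has exactly one heading child, the first sort key is the whole 
accent-stripped, upper-cased heading (one reading of "the upper-case(…) 
part"), and the <group> wrapper is a hypothetical stand-in for whatever 
you actually write to the result document.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="index">
    <xsl:for-each-group select="index-0"
        group-by="upper-case(substring(replace(
                    normalize-unicode(heading, 'NFKD'), '\p{Mn}', ''), 1, 1))">
      <xsl:sort select="current-grouping-key()"/>
      <!-- hypothetical wrapper element, for illustration only -->
      <group key="{current-grouping-key()}">
        <xsl:for-each select="current-group()">
          <!-- first key: heading with combining marks removed, upper-cased -->
          <xsl:sort select="upper-case(replace(
                              normalize-unicode(heading, 'NFKD'), '\p{Mn}', ''))"/>
          <!-- tie-breaker: the heading exactly as written -->
          <xsl:sort select="heading"/>
          <xsl:copy-of select="."/>
        </xsl:for-each>
      </group>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

With this, entries that differ only in accents or case sort next to 
each other, and the second key fixes their relative order.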

So it’s best to make a function (call it, e.g., my:sortkey) out of 
the upper-case(…) expression.

In that function, you can also do other useful stuff, such as 
eliminating stop words or replacing all numbers with a zero, so that 
everything that starts with a number will be in the same group.
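
Such a function might look roughly like the sketch below. The my 
namespace URI, the stop-word list (THE, AN, A) and the choice to 
collapse each run of digits into a single zero are all assumptions for 
illustration; adjust them to your data.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="xs my">

  <!-- my:sortkey: hypothetical helper; the namespace URI is a placeholder -->
  <xsl:function name="my:sortkey" as="xs:string">
    <xsl:param name="heading" as="xs:string?"/>
    <!-- decompose, drop combining marks, upper-case -->
    <xsl:variable name="plain" as="xs:string"
        select="upper-case(replace(
                  normalize-unicode($heading, 'NFKD'), '\p{Mn}', ''))"/>
    <!-- drop one leading stop word (assumed list) -->
    <xsl:variable name="no-stop" as="xs:string"
        select="replace($plain, '^(THE|AN|A)\s+', '')"/>
    <!-- replace every run of digits with a single zero -->
    <xsl:sequence select="replace($no-stop, '[0-9]+', '0')"/>
  </xsl:function>

  <xsl:template match="index">
    <xsl:for-each-group select="index-0"
        group-by="substring(my:sortkey(heading), 1, 1)">
      <xsl:sort select="current-grouping-key()"/>
      <xsl:for-each select="current-group()">
        <xsl:sort select="my:sortkey(heading)"/>
        <xsl:sort select="heading"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the key in one function means the grouping key and the sort 
key cannot drift apart when you later change the rules.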


On 2012-04-21 02:03, Graydon wrote:
> So I've got an XML index file, which is too large for some downstream
> processing to be entirely pleased with.  The requirement is to split the
> file up, grouping index entries (index-0 elements; the index element is
> the overall container element) by the first character of their child
> heading element.
> Using XSLT 2.0, this is pretty easy:
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet exclude-result-prefixes="xs xd" version="2.0"
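>    xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
>    xmlns:xs="http://www.w3.org/2001/XMLSchema"
>    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">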
>    <xsl:template match="/wkna-shared-cms/index">
>      <xsl:for-each-group group-by="substring(heading,1,1)" select="index-0">
>        <xsl:sort select="./heading"/>
>        <xsl:result-document href="eitaindex+Topical_Index_{current-grouping-key()}.xml">
>          <wkna-shared-cms>
>            <index area="{/wkna-shared-cms/index/@area}"
>              xml:lang="{/wkna-shared-cms/index/@xml:lang}">
>              <num cite="Topical Index {current-grouping-key()}">
>                <xsl:sequence select="current-grouping-key()"/>
>              </num>
>              <xsl:copy-of select="/wkna-shared-cms/index/index-metadata"/>
>              <xsl:copy-of select="current-group()"/>
>            </index>
>          </wkna-shared-cms>
>        </xsl:result-document>
>      </xsl:for-each-group>
>    </xsl:template>
> </xsl:stylesheet>
> The problem is that some of the initial characters of the headings have
> accents, and it's desired that the accented characters and the
> unaccented characters group together, so that E and É and Ê, etc. all
> group together in a group with a current-grouping-key() of "E".
> I can imagine doing this in a painful way with conditional statements
> and an exhaustive list of characters, but I'm hoping someone can tell me
> there's a better way.
> Thanks!
> -- Graydon
