Re: [xsl] for-each-group grouping accented versions of letters
To: xsl-list@-----.------------.---
Date: 4/21/2012 1:03:00 AM
You can strip the accents by applying Unicode decomposition (NFKD) and
then removing the combining diacritical marks (U+0300 to U+036F):
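<xsl:for-each-group select="index-0"
    group-by="substring(
                upper-case(
                  replace(
                    normalize-unicode(heading, 'NFKD'),
                    '[&#x300;-&#x36F;]',
                    ''
                  )
                ), 1, 1
              )">
  <xsl:sort select="current-grouping-key()"/>
  <!-- … group content, e.g. your xsl:result-document, goes here … -->
</xsl:for-each-group>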
When writing each group (= starting letter) to an output file further
down in your template, you should sort its entries with the upper-case(…)
part as the first sort key, then with the actual heading as a second
(tie-breaker) sort key.
So it’s best to make a function (call it, e.g., my:sortkey) out of
upper-case(…).
In that function, you can also do other useful stuff, such as
eliminating stop words or replacing all numbers with a zero, so that
everything that starts with a number will be in the same group.
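A rough sketch of what such a function might look like, in case it helps. The namespace URI bound to the my: prefix, the wrapper element name "group" and the leading-digits rule are only assumptions for illustration; stop-word removal is left out, and you would put your xsl:result-document where the wrapper element is:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical helper: decompose, drop combining marks (U+0300 to U+036F),
       collapse a leading run of digits to a single 0, then upper-case. -->
  <xsl:function name="my:sortkey" as="xs:string">
    <xsl:param name="heading" as="xs:string?"/>
    <xsl:sequence select="
      upper-case(
        replace(
          replace(normalize-unicode($heading, 'NFKD'), '[&#x300;-&#x36F;]', ''),
          '^[0-9]+', '0'
        )
      )"/>
  </xsl:function>

  <xsl:template match="index">
    <!-- Group by the first character of the normalized key -->
    <xsl:for-each-group select="index-0"
        group-by="substring(my:sortkey(heading), 1, 1)">
      <xsl:sort select="current-grouping-key()"/>
      <!-- "group" is a placeholder wrapper, not part of the original schema -->
      <group key="{current-grouping-key()}">
        <xsl:for-each select="current-group()">
          <!-- First key: normalized form; second key: actual heading as tie-breaker -->
          <xsl:sort select="my:sortkey(heading)"/>
          <xsl:sort select="heading"/>
          <xsl:copy-of select="."/>
        </xsl:for-each>
      </group>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

With this, headings starting with E, É or Ê should all land in the group whose current-grouping-key() is "E", and headings starting with a digit in the group "0".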
Gerrit
On 2012-04-21 02:03, Graydon wrote:
> So I've got an XML index file, which is too large for some downstream
> processing to be entirely pleased with. The requirement is to split the
> file up, grouping index entries (index-0 elements; the index element is
> the overall container element) by the first character of their child
> heading element.
>
> Using XSLT 2.0, this is pretty easy:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet exclude-result-prefixes="xs xd" version="2.0"
>     xmlns:xd="www.---.com;
>     xmlns:xs="http://www.w3.org/2001/XMLSchema"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:template match="/wkna-shared-cms/index">
>     <xsl:for-each-group group-by="substring(heading,1,1)" select="index-0">
>       <xsl:sort select="./heading"/>
>       <xsl:result-document href="eitaindex+Topical_Index_{current-grouping-key()}.xml">
>         <wkna-shared-cms>
>           <index area="{/wkna-shared-cms/index/@area}"
>               xml:lang="{/wkna-shared-cms/index/@xml:lang}">
>             <num cite="Topical Index {current-grouping-key()}">
>               <xsl:sequence select="current-grouping-key()"/>
>             </num>
>             <xsl:copy-of select="/wkna-shared-cms/index/index-metadata"/>
>             <xsl:copy-of select="current-group()"/>
>           </index>
>         </wkna-shared-cms>
>       </xsl:result-document>
>     </xsl:for-each-group>
>   </xsl:template>
> </xsl:stylesheet>
>
> The problem is that some of the initial characters of the headings have
> accents, and it's desired that the accented characters and the
> unaccented characters group together, so that E and É and Ê, etc. all
> group together in a group with a current-grouping-key() of "E".
>
> I can imagine doing this in a painful way with conditional statements
> and an exhaustive list of characters, but I'm hoping someone can tell me
> there's a better way.
>
> Thanks!
>
> -- Graydon
>
> --~------------------------------------------------------------------
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
> To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
> or e-mail:<mailto:xsl-list-unsubscribe@l...>
> --~--
>
--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@l..., http://www.le-tex.de
Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930
Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
Thomas Schmidt, Dr. Reinhard Vöckler