Altova Mailing List Archives
RE: [xsl] Better Way to Group Siblings By Start/End Markers?
Date: 6/23/2008 11:05:00 PM
Another possibility is to use xsl:for-each-group with group-starting-with. I
seem to remember that when I last did this, however, it turned out to be
easier using sibling recursion - that is, have each w:r element
apply-templates to its immediately following sibling.

Either way, processing Word XML using XSLT is not for the faint-hearted.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Eliot Kimber [mailto:ekimber@xxxxxxxxxxxx]
> Sent: 23 June 2008 23:04
> To: xsl-list
> Subject: [xsl] Better Way to Group Siblings By Start/End Markers?
>
> I am experimenting with using XSLT to convert Office Open XML
> into InCopy INCX (the CS3 Word import fails to capture some
> things I need captured from the Word data).
>
> One challenge is handling Word fields, which need to be
> converted to any number of different, and
> differently-structured, INCX constructs (whose details are
> not important here).
>
> A Word field is organized as a sequence of w:r elements
> within a larger sequence of w:r elements. A field start is
> indicated by a w:r with a field start indicator and the field
> end is indicated by another w:r with a field end indicator.
> The w:r elements between these two marker elements comprise
> the field data, which can be any number of things, including
> w:r elements that would easily occur outside the scope of the
> field (e.g., w:r containing literal document content).
>
> Here is a typical sample:
>
> <w:r>
>   <w:t xml:space="preserve">- </w:t>
> </w:r>
> <w:r w:rsidR="00BA1D13">
>   <w:fldChar w:fldCharType="begin"/>
> </w:r>
> <w:r w:rsidR="00BA1D13">
>   <w:instrText>HYPERLINK "http://www.example.com/"</w:instrText>
> </w:r>
> <w:r w:rsidR="00BA1D13">
>   <w:fldChar w:fldCharType="separate"/>
> </w:r>
> <w:r w:rsidRPr="00B233E5">
>   <w:t>HTTP</w:t>
> </w:r>
> <w:r w:rsidR="00BA1D13">
>   <w:fldChar w:fldCharType="end"/>
> </w:r>
>
> I have this for-each-group that seems to group correctly, but
> I'm wondering if there's a simpler expression that does what I want:
>
> <xsl:for-each-group select="w:r"
>   group-adjacent="
>     string(self::*[w:fldChar[@w:fldCharType = 'begin' or
>                              @w:fldCharType = 'end']] or
>       (self::*[preceding-sibling::*/w:fldChar[@w:fldCharType = 'begin']] and
>        self::*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']] and
>        count((self::*[preceding-sibling::*/w:fldChar[@w:fldCharType = 'begin']])/(*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']])
>              |
>              (self::*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']])) = 1
>       ))
>   "
> >
>
> In prose (at least this is what I intend the above expression
> to mean): if w:r has child w:fldChar where @w:fldCharType =
> 'begin' or 'end' or w:r has both a preceding sibling w:r with
> a w:fldChar of type 'begin' and a following sibling w:r with
> a w:fldChar of type 'end' AND the nearest preceding sibling
> field start has the same nearest following sibling field end
> as the current node, then return the grouping key "true" else
> return the grouping key "false".
>
> Whew.
>
> I can't think of a simpler way to say this. Is there one?
>
> I realize I could factor some of the complexity of the
> expression out into a function or two, which I will probably do.
>
> Thanks,
>
> Eliot
>
> ----
> Eliot Kimber | Senior Solutions Architect | Really Strategies, Inc.
> email: ekimber@xxxxxxxxxxxx
> office: 610.631.6770 | cell: 512.554.9368
> 2570 Boulevard of the Generals | Suite 213 | Audubon, PA 19403
> http://www.reallysi.com | http://blog.reallysi.com | http://www.rsuitecms.com
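
For readers of the archive, here is a minimal sketch of the group-starting-with approach mentioned in the reply. It is not code from either poster. It assumes XSLT 2.0, that the w:r runs are children of a w:p element, and that fields are not nested; the result element names para, field and runs are hypothetical placeholders. The outer grouping starts a new group at each field-begin run, and an inner group-ending-with pass then cuts that group just after the field-end run.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    exclude-result-prefixes="w">

  <!-- Assumption: the runs to be grouped are children of w:p. -->
  <xsl:template match="w:p">
    <para> <!-- hypothetical result element -->
      <!-- Outer pass: a new group starts at every field-begin run. -->
      <xsl:for-each-group select="w:r"
          group-starting-with="w:r[w:fldChar/@w:fldCharType = 'begin']">
        <xsl:choose>
          <!-- The context item is the first run of the current group. -->
          <xsl:when test="w:fldChar/@w:fldCharType = 'begin'">
            <!-- Inner pass: close a group at the field-end run, so the
                 first inner group is the field (begin through end) and
                 any later group holds the ordinary runs that follow. -->
            <xsl:for-each-group select="current-group()"
                group-ending-with="w:r[w:fldChar/@w:fldCharType = 'end']">
              <xsl:choose>
                <xsl:when test="position() = 1">
                  <field><xsl:copy-of select="current-group()"/></field>
                </xsl:when>
                <xsl:otherwise>
                  <runs><xsl:copy-of select="current-group()"/></runs>
                </xsl:otherwise>
              </xsl:choose>
            </xsl:for-each-group>
          </xsl:when>
          <!-- Runs that precede the first field in the paragraph. -->
          <xsl:otherwise>
            <runs><xsl:copy-of select="current-group()"/></runs>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </para>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the sample in the quoted message, this should wrap the leading "- " run in runs and the five runs from the begin marker through the end marker in field. Two limitations of the sketch: a field nested inside another field would be split at the inner begin marker, and a begin marker with no matching end marker would cause every remaining run in that group to be treated as field content.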