Altova Mailing List Archives


Re: [xsl] How to split an RegEx into several lines for readability?

From: Abel Braaksma <abel.online@--------->
To:
Date: 5/2/2007 12:19:00 PM
Dimitre Novatchev wrote:




<xsl:variable name="myregex" as="xs:string">
   (          <!-- grab everything -->
   "          <!-- start of a q. string -->
   [^"]*      <!-- zero or more non-quotes -->
   "          <!-- end of a q. string -->
   )          <!-- closing 'grab all' -->
</xsl:variable>





I think this is probably the most amazing and useful tip I got in this
thread -- fully deserves to be in the XSLT FAQ!

I have never before seen a string variable defined in this way --
probably because we do not have an "x" modifier (or any modifiers at all)
when generally defining variables.

And you don't need to, as you can see above :)
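
For readers who want to try it, here is a minimal sketch of how such a variable might be put to use. It assumes an XSLT 2.0 processor, and it relies on the "x" flag to drop the whitespace that the layout introduces. The named template "main" and the sample input string are made up for illustration and are not from the thread.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- The XML comments are not part of the string value; only the
       regex text and the whitespace around it remain. -->
  <xsl:variable name="myregex" as="xs:string">
     (          <!-- grab everything -->
     "          <!-- start of a q. string -->
     [^"]*      <!-- zero or more non-quotes -->
     "          <!-- end of a q. string -->
     )          <!-- closing 'grab all' -->
  </xsl:variable>

  <!-- Hypothetical entry point, e.g. invoked as the initial template -->
  <xsl:template name="main">
    <!-- flags="x" removes the whitespace left in $myregex before matching -->
    <xsl:analyze-string
        select="'say &quot;hello&quot; and &quot;bye&quot;'"
        regex="{$myregex}" flags="x">
      <xsl:matching-substring>
        <xsl:value-of select="regex-group(1)"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

The same variable should work with matches(), replace() and tokenize() as long as 'x' is passed as the flags argument.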



Btw, many languages define an object or type Regex, which is something 
like a precompiled regular expression. Unfortunately, the W3C 
never created something like xs:regex, which keeps us from defining 
precompiled regexes. If I understood Michael's comment correctly, the 
optimizer may or may not 'see' whether the regular expression is static 
or dynamic. If it decides that it is dynamic, the method above will 
introduce quite a performance hit, as Saxon will recompile the regex on 
each use.



If only I were capable of having an input XML document with elements of 
type xs:regex, the parser could (then) precompile and reuse them (but 
now I'm drifting...).




Also, I don't think I've ever before seen comments interspersed within
the string contents of an xsl:variable.

Well, you should have a look at my code ;)

We do a lot of text-to-XML transformation, and complex regular 
expressions help us a lot but are notoriously hard to read. Hence, about 
a year ago, I asked much the same question, and I summarized the 
cumulative answers here: 
http://www.biglist.com/lists/xsl-list/archives/200607/msg00733.html



Other ideas suggested by several people included:



 * use a custom function that takes a string, removes your self-defined 
comments and returns a valid regex (there's another thread where I 
offered a regex-test by a regex, which may come in useful)

 * use the AVT possibilities to introduce XPath comments:
   regex=".*{''(: comment here :)}[abc]" or
   regex=".*{()(: comment here :)}[abc]"

 * various ways of concat/join plus interspersed XPath comments
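
To make the three ideas above a bit more concrete, here is a rough sketch, assuming an XSLT 2.0 processor. The my: namespace, the my:strip-comments function, the '##' comment syntax, the variable names and the sample input are all invented for this illustration; the code originally posted in those threads may well look different.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:regex-helpers"
    exclude-result-prefixes="xs my">

  <xsl:output method="text"/>

  <!-- Idea 1 (hypothetical helper): drop everything from '##' to the
       end of the line, then remove all remaining whitespace. -->
  <xsl:function name="my:strip-comments" as="xs:string">
    <xsl:param name="commented" as="xs:string"/>
    <xsl:sequence
        select="replace(replace($commented, '##[^\n]*', ''), '\s+', '')"/>
  </xsl:function>

  <xsl:variable name="raw" as="xs:string">
    "        ## start of a quoted string
    [^"]*    ## zero or more non-quotes
    "        ## end of a quoted string
  </xsl:variable>
  <xsl:variable name="stripped" as="xs:string"
      select="my:strip-comments($raw)"/>

  <!-- Idea 3: join the pieces, with XPath comments in between -->
  <xsl:variable name="joined" as="xs:string" select="
      string-join((
        '&quot;',       (: start of a quoted string :)
        '[^&quot;]*',   (: zero or more non-quotes  :)
        '&quot;'        (: end of a quoted string   :)
      ), '')"/>

  <xsl:template name="main">
    <xsl:variable name="input" as="xs:string"
        select="'say &quot;hello&quot; and &quot;bye&quot;'"/>

    <!-- Idea 2: empty-string expressions in the AVT carry the comments -->
    <xsl:analyze-string select="$input"
        regex="&quot;{''(: start of a quoted string :)}[^&quot;]*{''(: end of a quoted string :)}&quot;">
      <xsl:matching-substring>
        <xsl:value-of select="."/>
        <xsl:text>&#10;</xsl:text>
      </xsl:matching-substring>
    </xsl:analyze-string>

    <!-- All three variants are meant to describe the same pattern -->
    <xsl:value-of
        select="matches($input, $stripped), matches($input, $joined)"/>
  </xsl:template>

</xsl:stylesheet>
```

Note that the helper as sketched would also strip a literal '##' or a significant space from the regex itself, so a real version would need an escape convention.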



Also, in that thread, Michael explained why there is no construct 
similar to the '#' comment in Perl regexes.



In practice, where the regex ends up in an AVT, I found the xsl:variable 
approach extremely useful, not only because it makes verbose commenting 
easy, but also because it relieves me of doubling { and } characters and 
of escaping (mixed) quotes.
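
A small side-by-side sketch of that last point, assuming XSLT 2.0 (the pattern, the variable name "quoted-digits", the template name "main" and the sample input are made up for illustration):

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- In element content, braces and quotes can be written as they are -->
  <xsl:variable name="quoted-digits" as="xs:string">"\d{2,4}"</xsl:variable>

  <xsl:template name="main">
    <xsl:variable name="input" as="xs:string"
        select="'id=&quot;2007&quot; x=&quot;7&quot;'"/>

    <!-- Written inline: the regex attribute is an AVT, so { and } must be
         doubled, and the double quote must be escaped as &quot; -->
    <xsl:analyze-string select="$input" regex="&quot;\d{{2,4}}&quot;">
      <xsl:matching-substring>
        <xsl:value-of select="."/>
        <xsl:text>&#10;</xsl:text>
      </xsl:matching-substring>
    </xsl:analyze-string>

    <!-- Via the variable: the AVT only contains the variable reference -->
    <xsl:analyze-string select="$input" regex="{$quoted-digits}">
      <xsl:matching-substring>
        <xsl:value-of select="."/>
        <xsl:text>&#10;</xsl:text>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Both xsl:analyze-string instructions are meant to apply the same regex; only the amount of escaping differs.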



Cheers,
-- Abel Braaksma

