
Altova MapForce 2020 Basic Edition

MapForce can use regular expressions in the pattern parameter of the tokenize-regexp function to find specific strings.


The regular expression syntax and semantics for XSLT and XQuery are identical to those defined in the W3C XML Schema Part 2: Datatypes specification. Please note that there are slight differences in regular expression syntax between the various programming languages.




input: the string that the regex works on

pattern: the regular expression

flags: optional parameter to define how the regular expression is to be interpreted

result: the result of the function


Tokenize-regexp returns a sequence of strings. The connection to the Rows item creates one row per item in the sequence.
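As a rough illustration of this behavior outside MapForce, the sketch below uses Python's re.split, which is assumed here to be a reasonable stand-in for tokenize-regexp. The sample input and the comma pattern are invented for the example, and Python's regex dialect is not identical to the one MapForce uses.

```python
import re

# Hypothetical input and pattern, chosen only for illustration.
input_string = "alpha, beta,gamma"
pattern = r",\s*"  # a comma followed by optional whitespace

# re.split returns a list of strings, comparable to the sequence
# of strings that tokenize-regexp returns.
tokens = re.split(pattern, input_string)

# Each item in the sequence would become one row in the target.
for row_number, token in enumerate(tokens, start=1):
    print(row_number, token)
```

Here the three tokens alpha, beta and gamma would correspond to three rows.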


Regex syntax

Literals, e.g. a single character:

The letter "a" is the most basic regex. It matches the first occurrence of the character "a" in the string.


Character classes []

This is a set of characters enclosed in square brackets.


Exactly one of the characters in the square brackets is matched.

[aeiou]

Matches a lowercase vowel.

[mj]ust

Matches must or just.


Please note that the pattern is case sensitive: a lowercase a does not match an uppercase A.
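The character class behavior described above can be tried out with a small Python sketch. The patterns [aeiou] and [mj]ust are assumed examples chosen to fit the descriptions, the sample strings are invented, and Python's re module only approximates the regex dialect that MapForce uses.

```python
import re

vowel = re.compile(r"[aeiou]")         # exactly one lowercase vowel
must_or_just = re.compile(r"[mj]ust")  # "m" or "j", followed by "ust"

# The uppercase "A" is skipped because matching is case sensitive;
# the first match is the lowercase "o" at index 3.
first_vowel = vowel.search("Altova")
print(first_vowel.group(), first_vowel.start())  # o 3

# Both words are found, each through a different class member.
words = must_or_just.findall("I must go, just now")
print(words)  # ['must', 'just']

# A string with only uppercase vowels does not match at all.
print(vowel.search("AEIOU"))  # None
```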



Character ranges [a-z]

Creates a range between the two characters. Only one character from the range is matched at a time.

[a-z]

Matches any lowercase character between a and z.
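A minimal Python sketch of a character range follows. The sample string is invented, and Python's re module is used only as an approximation of the regex dialect in MapForce.

```python
import re

lowercase = re.compile(r"[a-z]")  # one character in the range a to z

# Each match is a single character; uppercase letters, the space
# and the digits fall outside the range and are skipped.
letters = lowercase.findall("MapForce 2020")
print(letters)  # ['a', 'p', 'o', 'r', 'c', 'e']
```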



Negated classes [^]

Using the caret as the first character after the opening bracket negates the character class.




Matches any character not in the character class, including newlines.
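The following Python sketch shows a negated class, including the newline case. The pattern [^a-z] and the sample string are assumptions made for the example, and Python's re module only approximates the MapForce regex dialect.

```python
import re

not_lowercase = re.compile(r"[^a-z]")  # any character except a to z

# The newline, the uppercase letter and the digit all match,
# because none of them is in the range a to z.
matches = not_lowercase.findall("ab\nC1")
print(matches)  # ['\n', 'C', '1']
```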


Meta characters "."

The dot meta character matches any single character (except for newline).




Matches any single character.
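The dot can be tried with the short Python sketch below. The pattern b.t and the sample string are invented for illustration, and Python's default dot behavior (no match on newline) is assumed to mirror the description above.

```python
import re

pattern = re.compile(r"b.t")  # "b", any single character, "t"

# "bat", "bit" and "b t" match; "b\nt" does not, because the dot
# does not match a newline by default.
hits = pattern.findall("bat bit b\nt b t")
print(hits)  # ['bat', 'bit', 'b t']
```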


Quantifiers ? + * {}

Quantifiers define how often a regex component must repeat within the input string, for a match to occur.



? (zero or one): the preceding string/chunk is optional.

+ (one or more): the preceding string/chunk may match one or more times.

* (zero or more): the preceding string/chunk may match zero or more times.

{} (min / max): the number of repetitions that a string/chunk has to match, e.g. mo{1,3} matches mo, moo, mooo.
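The four quantifiers can be compared side by side in a short Python sketch. Apart from mo{1,3}, the patterns and sample words are invented for the example, and re.fullmatch is used so that the whole word has to match. Python's re module is only an approximation of the regex dialect in MapForce.

```python
import re

# ? makes the preceding "u" optional.
optional = re.fullmatch(r"colou?r", "color")

# + requires at least one "o".
one_or_more = re.fullmatch(r"mo+", "mooo")
none_given = re.fullmatch(r"mo+", "m")

# * also accepts zero occurrences of "o".
zero_or_more = re.fullmatch(r"mo*", "m")

# {1,3} accepts between one and three occurrences of "o".
candidates = ["m", "mo", "moo", "mooo", "moooo"]
bounded = [w for w in candidates if re.fullmatch(r"mo{1,3}", w)]
print(bounded)  # ['mo', 'moo', 'mooo']
```

Note that in each pattern the quantifier applies only to the single character before it, not to the whole word.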





Parentheses () are used to group parts of a regex together.



Alternation/or |: allows the testing of subexpressions from left to right.

(horse|make) sense - will match "horse sense" or "make sense"
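Grouping and alternation can be checked together with the pattern from the example above. The third test string is invented to show a non-match, and Python's re module stands in for the MapForce regex engine here.

```python
import re

pattern = re.compile(r"(horse|make) sense")

samples = ["horse sense", "make sense", "common sense"]

# The group limits the alternation to "horse" or "make";
# " sense" must follow in either case.
results = [pattern.search(s) is not None for s in samples]
print(results)  # [True, True, False]
```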



Flags

These are optional parameters that define how the regular expression is to be interpreted. Each option is represented by a single letter and is switched on when that letter is present. Letters may be in any order and can be repeated.



s: If present, the matching process operates in "dot-all" mode.

In this mode the meta character "." matches any character whatsoever, including newline. If the input string contains "hello" and "world" on two different lines, the regular expression "hello.*world" will only match if the s flag is set.
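The effect of dot-all mode can be sketched in Python, where re.DOTALL is assumed to play the role of the s flag. The pattern used is hello.*world (a dot followed by a star), and the two-line input string is invented for the example.

```python
import re

text = "hello\nworld"  # "hello" and "world" on two different lines

# Without the flag, "." stops at the newline, so there is no match.
without_flag = re.search(r"hello.*world", text)
print(without_flag)  # None

# With dot-all mode, "." also matches the newline.
with_flag = re.search(r"hello.*world", text, flags=re.DOTALL)
print(with_flag is not None)  # True
```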



m: If present, the matching process operates in multi-line mode.


In multi-line mode the caret ^ matches at the start of any line, i.e. at the start of the entire string and immediately after a newline character.


The dollar character $ matches at the end of any line, i.e. at the end of the entire string and immediately before a newline character.


Newline is the character #x0A.
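A short Python sketch of multi-line mode follows, with re.MULTILINE assumed to correspond to the flag described above. The two-line input string and the \w+ patterns are invented for the example.

```python
import re

text = "first line\nsecond line"

# Default mode: ^ matches only at the start of the entire string.
default_starts = re.findall(r"^\w+", text)
print(default_starts)  # ['first']

# Multi-line mode: ^ also matches right after each newline.
multiline_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
print(multiline_starts)  # ['first', 'second']

# Multi-line mode: $ also matches right before each newline.
multiline_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)
print(multiline_ends)  # ['line', 'line']
```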



i: If present, the matching process operates in case-insensitive mode.

The regular expression [a-z] plus the i flag would then match all letters a-z and A-Z.
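Case-insensitive matching can be compared with the default in a few lines of Python, where re.IGNORECASE is assumed to correspond to the i flag. The sample string is invented for the example.

```python
import re

text = "MapForce"

# Default: only the lowercase letters match [a-z].
sensitive = re.findall(r"[a-z]", text)
print(sensitive)  # ['a', 'p', 'o', 'r', 'c', 'e']

# Case-insensitive: the uppercase "M" and "F" match as well.
insensitive = re.findall(r"[a-z]", text, flags=re.IGNORECASE)
print(insensitive)  # ['M', 'a', 'p', 'F', 'o', 'r', 'c', 'e']
```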



x: If present, whitespace characters are removed from the regular expression prior to the matching process. Whitespace characters are #x09, #x0A, #x0D and #x20.


Exception: Whitespace characters within character class expressions are not removed, e.g. [#x20].
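The closest Python counterpart to this behavior is re.VERBOSE, used in the sketch below as an assumed analogue only: unlike the flag described above, it also treats # as the start of a comment. The patterns and test strings are invented for the example.

```python
import re

# Whitespace between the letters is ignored, so this is "hello".
spaced = re.compile(r"h e l l o", flags=re.VERBOSE)
print(spaced.fullmatch("hello") is not None)  # True

# Whitespace inside a character class is kept, so [ ] still
# requires a literal space between the two words.
with_class = re.compile(r"hello [ ] world", flags=re.VERBOSE)
print(with_class.fullmatch("hello world") is not None)  # True
print(with_class.fullmatch("helloworld") is not None)   # False
```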


Note: When generating code, the advanced features of the regex syntax might differ slightly between the various languages. Please see the specific regex documentation for your language.

© 2020 Altova GmbH