XPath/XQuery replace function

Summary

Returns a string produced from the input string by replacing any substrings that match a given regular expression with a supplied replacement string.

Signatures

fn:replace(
  $input as xs:string?,
  $pattern as xs:string,
  $replacement as xs:string
) as xs:string

fn:replace(
  $input as xs:string?,
  $pattern as xs:string,
  $replacement as xs:string,
  $flags as xs:string
) as xs:string

Properties

This function is deterministic, context-independent, and focus-independent.

Rules

The effect of calling the first version of this function (omitting the argument $flags) is the same as the effect of calling the second version with the $flags argument set to a zero-length string. Flags are defined in .

The $flags argument is interpreted in the same manner as for the fn:matches function.

If $input is the empty sequence, it is interpreted as the zero-length string.

The function returns the xs:string that is obtained by replacing each non-overlapping substring of $input that matches the given $pattern with an occurrence of the $replacement string.

If two overlapping substrings of $input both match the $pattern, then only the first one (that is, the one whose first character comes first in the $input string) is replaced.
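
The non-overlapping rule can be illustrated with a minimal sketch in Python. Python's re module is used here only as a stand-in, which is an assumption: its regular expression dialect is not the XPath one, but for this simple pattern its left-to-right, non-overlapping scan behaves as described above.

```python
import re

# "aba" occurs at offsets 0 and 2 of "ababa", and the two occurrences
# overlap on the middle "a". Only the occurrence that starts first is
# replaced; scanning then resumes after it, where "ba" no longer matches.
result = re.sub("aba", "*", "ababa")
print(result)  # *ba
```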

If the q flag is present, the replacement string is used as is.
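
As a rough illustration of the q flag, the following Python sketch treats both arguments as plain strings. It assumes that, as for fn:matches, the q flag also causes the pattern to be taken literally, and it ignores any combination with other flags. The function name replace_with_q_flag is hypothetical.

```python
def replace_with_q_flag(input_string: str, pattern: str, replacement: str) -> str:
    # Assumption: with the q flag the pattern is a literal string, so an
    # empty pattern would match the zero-length string and is rejected.
    if pattern == "":
        raise ValueError("pattern matches the zero-length string")
    # The replacement is used as is: "$" and "\" have no special meaning.
    return input_string.replace(pattern, replacement)

print(replace_with_q_flag("a.b.c", ".", "$1"))  # a$1b$1c
```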

Otherwise, within the $replacement string, a variable $N may be used to refer to the substring captured by the Nth parenthesized sub-expression in the regular expression. For each match of the pattern, these variables are assigned the value of the content matched by the relevant sub-expression, and the modified replacement string is then substituted for the characters in $input that matched the pattern. $0 refers to the substring captured by the regular expression as a whole.

More specifically, the rules are as follows, where S is the number of parenthesized sub-expressions in the regular expression, and N is the decimal number formed by taking all the digits that consecutively follow the $ character:

  1. If N=0, then the variable is replaced by the substring matched by the regular expression as a whole.

  2. If 1<=N<=S, then the variable is replaced by the substring captured by the Nth parenthesized sub-expression. If the Nth parenthesized sub-expression was not matched, then the variable is replaced by the zero-length string.

  3. If S<N<=9, then the variable is replaced by the zero-length string.

  4. Otherwise (if N>S and N>9), the last digit of N is taken to be a literal character to be included "as is" in the replacement string, and the rules are reapplied using the number N formed by stripping off this last digit.

For example, if the replacement string is "$23" and the regular expression contains 5 parenthesized sub-expressions, the result contains the value of the substring captured by the second sub-expression, followed by the digit 3.
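
The four rules above can be sketched in Python as follows. This is an illustrative model, not a normative implementation. The function name expand_replacement and its parameters are hypothetical: whole stands for the substring matched by the regular expression as a whole, and groups holds the substrings captured by sub-expressions 1 to S, with None for a sub-expression that was not matched.

```python
from typing import Optional, Sequence

DIGITS = "0123456789"

def expand_replacement(replacement: str, whole: str,
                       groups: Sequence[Optional[str]]) -> str:
    s = len(groups)  # S: number of parenthesized sub-expressions
    out = []
    i = 0
    while i < len(replacement):
        ch = replacement[i]
        if ch == "\\":
            # \\ and \$ stand for a literal backslash and a literal dollar sign.
            nxt = replacement[i + 1:i + 2]
            if nxt not in ("\\", "$"):
                raise ValueError("backslash must be followed by \\ or $")
            out.append(nxt)
            i += 2
        elif ch == "$":
            # N is formed from all the digits that consecutively follow "$".
            j = i + 1
            while j < len(replacement) and replacement[j] in DIGITS:
                j += 1
            digits = replacement[i + 1:j]
            if not digits:
                raise ValueError("$ must be followed by a digit")
            # Rule 4: while N > S and N > 9, strip the last digit and
            # treat it as a literal character.
            while int(digits) > s and int(digits) > 9:
                digits = digits[:-1]
            n = int(digits)
            if n == 0:
                value = whole                 # rule 1
            elif n <= s:
                value = groups[n - 1] or ""   # rule 2 (unmatched group gives "")
            else:
                value = ""                    # rule 3: S < N <= 9
            literal_tail = replacement[i + 1 + len(digits):j]
            out.append(value + literal_tail)
            i = j
        else:
            out.append(ch)
            i += 1
    return "".join(out)

# "$23" with 5 sub-expressions: the second capture, followed by the digit 3.
print(expand_replacement("$23", "abcde", ["a", "b", "c", "d", "e"]))  # b3
```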

Unless the q flag is used, a literal $ character within the replacement string must be written as \$, and a literal \ character must be written as \\.
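
When the replacement text comes from data and should be inserted literally without using the q flag, it can be escaped beforehand. The following Python helper is a minimal sketch of that escaping; the name escape_replacement is hypothetical.

```python
def escape_replacement(literal: str) -> str:
    # Escape backslashes first, so that the backslashes added in front of
    # "$" in the second step are not doubled again.
    return literal.replace("\\", "\\\\").replace("$", "\\$")

print(escape_replacement("$5.00"))     # \$5.00
print(escape_replacement("C:\\temp"))  # C:\\temp
```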

If two alternatives within the pattern both match at the same position in the $input, then the match that is chosen is the one matched by the first alternative. For example:

 fn:replace("abcd", "(ab)|(a)", "[1=$1][2=$2]") returns "[1=ab][2=]cd"
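
The same choice of the first alternative can be observed in a short Python sketch. Python's re module is used as a stand-in (an assumption; its dialect is not the XPath one, but it also prefers the first alternative that matches at a given position). The helper name show_groups is hypothetical.

```python
import re

def show_groups(match):
    # A group that did not take part in the match is None in Python;
    # fn:replace substitutes the zero-length string in that case.
    return "[1={}][2={}]".format(match.group(1) or "", match.group(2) or "")

first = re.sub("(ab)|(a)", show_groups, "abcd")
print(first)   # [1=ab][2=]cd

# With the alternatives swapped, "(a)" wins at offset 0 and "(ab)" is never used.
second = re.sub("(a)|(ab)", show_groups, "abcd")
print(second)  # [1=a][2=]bcd
```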

Examples

The expression fn:replace("abracadabra", "bra", "*") returns "a*cada*".

The expression fn:replace("abracadabra", "a.*a", "*") returns "*".

The expression fn:replace("abracadabra", "a.*?a", "*") returns "*c*bra".

The expression fn:replace("abracadabra", "a", "") returns "brcdbr".

The expression fn:replace("abracadabra", "a(.)", "a$1$1") returns "abbraccaddabbra".

The expression fn:replace("abracadabra", ".*?", "$1") raises an error, because the pattern matches the zero-length string.

The expression fn:replace("AAAA", "A+", "b") returns "b".

The expression fn:replace("AAAA", "A+?", "b") returns "bbbb".

The expression fn:replace("darted", "^(.*?)d(.*)$", "$1c$2") returns "carted".
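
The difference between greedy and reluctant quantifiers in the examples above can be reproduced with a short Python sketch. Python's re module is only a stand-in here (an assumption; its dialect is not the XPath one, but these particular patterns behave the same way in it).

```python
import re

# Greedy ".*" extends the match as far as possible: one match covers the whole string.
greedy = re.sub("a.*a", "*", "abracadabra")
print(greedy)     # *

# Reluctant ".*?" stops at the nearest following "a": "abra" and "ada" are replaced.
reluctant = re.sub("a.*?a", "*", "abracadabra")
print(reluctant)  # *c*bra

# "A+" consumes all four characters in one match; "A+?" matches one character at a time.
one_match = re.sub("A+", "b", "AAAA")
four_matches = re.sub("A+?", "b", "AAAA")
print(one_match, four_matches)  # b bbbb
```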

Error Conditions

A dynamic error is raised if the value of $pattern is invalid according to the rules described in section .

A dynamic error is raised if the value of $flags is invalid according to the rules described in section .

A dynamic error is raised if the pattern matches a zero-length string, that is, if the expression fn:matches("", $pattern, $flags) returns true. It is not an error, however, if a captured substring is zero-length.
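
The zero-length test can be sketched in Python as a check against the empty string, mirroring the fn:matches("", $pattern, $flags) condition. Python's re module is a stand-in (an assumption), and the function name matches_zero_length is hypothetical.

```python
import re

def matches_zero_length(pattern: str, flags: int = 0) -> bool:
    # Does the pattern match somewhere in the zero-length string?
    return re.search(pattern, "", flags) is not None

print(matches_zero_length(".*?"))    # True: fn:replace would raise an error
print(matches_zero_length("a(.)"))   # False
# A group may capture a zero-length substring as long as the match
# as a whole cannot be zero-length.
print(matches_zero_length("(x?)y"))  # False
```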

In the absence of the q flag, a dynamic error is raised if the value of $replacement contains a dollar sign ($) character that is not immediately followed by a digit 0-9 and not immediately preceded by a backslash (\).

In the absence of the q flag, a dynamic error is raised if the value of $replacement contains a backslash (\) character that is not part of a \\ pair, unless it is immediately followed by a dollar sign ($) character.
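
The two conditions on the replacement string can be checked together with a small validator. The Python sketch below is one possible interpretation, not a normative one: it reads the string left to right as a sequence of tokens, so a backslash pair is consumed before a following dollar sign is examined. The names _VALID_REPLACEMENT and is_valid_replacement are hypothetical.

```python
import re

# One token is: any character other than "\" and "$", an escaped "\\" or "\$",
# or "$" followed by a digit 0-9.
_VALID_REPLACEMENT = re.compile(r"(?:[^\\$]|\\[\\$]|\$[0-9])*")

def is_valid_replacement(replacement: str) -> bool:
    # The whole string must be made up of valid tokens.
    return _VALID_REPLACEMENT.fullmatch(replacement) is not None

print(is_valid_replacement("a$1$1"))       # True
print(is_valid_replacement("cost: \\$5"))  # True  (escaped dollar sign)
print(is_valid_replacement("$x"))          # False ("$" not followed by a digit)
print(is_valid_replacement("\\n"))         # False (backslash before "n")
```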

Notes

If the input string contains no substring that matches the regular expression, the result of the function is a single string identical to the input string.