1. Functions

    XPath 3.0 introduces several enhancements to functions:

    • XPath 3.0 supports the creation of 'user-defined' functions, which was not possible in XPath 1.0 or 2.0.
    • Functions have been elevated to first-class status ('higher order' functions).
    • New functions have been added to the built-in function library.
    1. Inline Functions

      In XPath 3.0, user-defined functions are created as inline functions. Inline functions are 'anonymous', meaning that the function has no name.

      An inline function definition begins with the keyword 'function', followed by a parenthesized list of parameters, which may be empty. Each parameter may also specify a type. The parameter list is followed by an optional return type, followed by the body of the function in curly braces.

      function () {'hello world'}
      This function declaration contains an empty parameter list followed by the body of the function. This function simply returns the string 'hello world'.
      function ($arg) {$arg * 2}
      This function declaration declares a parameter '$arg' and the function returns the value of '$arg' multiplied by 2.
      function ($arg as xs:integer) as xs:integer {$arg * 2}
      This function declaration is 'strongly typed' i.e. it specifies types for the parameters and return value of the function. In this case it takes a parameter '$arg' of type 'xs:integer' and returns a value of type 'xs:integer'. The body of the function simply multiplies the value of '$arg' by 2.

      Because inline functions have no name, they are usually bound to a variable in a 'let' expression. The variable acts as an identifier through which the function can be called. The variable, and therefore the function, is in scope only within the 'let' expression in which it is declared.

      let $double_funct :=  
               function ($arg)
                  {
                      $arg * 2 
                  }        
               return
               $double_funct(45)
      result:
      90
      This 'let' expression declares a variable '$double_funct' which is bound to the inline function. In the 'return' clause the function bound to '$double_funct' is called with the value 45, which returns the integer value 90 i.e. 45 * 2.
    2. Higher Order Functions

      Functions have been elevated to first class status in XPath 3.0. This means that:

      • It is possible to pass a function as an argument to another function
      • It is possible to return a function from a function
      • It is possible to assign a function to a variable

      An example of a higher order function is shown below:

      let $temp :=
              function($t as xs:integer, $funct as function(xs:integer) as xs:decimal) as xs:decimal
                 {
                    $funct($t)
                 },
      
              $f_to_c :=
                function($t as xs:integer) as xs:decimal
                  {
                    ($t - 32) * 5 div 9  (: multiply before dividing so that the xs:decimal result is exact :)
                  },
               
               $c_to_f :=
                function($t as xs:integer) as xs:decimal
                  {
                    (9 div 5)*($t) + 32
                  }
            
               return
               ('fahrenheit to celsius: ', $temp(68, $f_to_c), ', celsius to fahrenheit: ', $temp(0 , $c_to_f))
      result:
      ('fahrenheit to celsius: ', 20, ', celsius to fahrenheit: ', 32)
      This 'let' expression declares a variable '$temp' which is bound to an inline function. The function takes two arguments, an 'xs:integer' value and a function, and returns an 'xs:decimal' value. The function argument must itself take an 'xs:integer' argument and return an 'xs:decimal' value. The 'let' expression also declares two other variables, '$f_to_c' and '$c_to_f', each bound to an inline function of exactly that type. The 'return' clause calls the higher order function bound to '$temp' twice: first with the value 68 and the function bound to '$f_to_c', then with the value 0 and the function bound to '$c_to_f'.

      The higher order functions which are built in to XPath 3.0 are covered in the following subsection.

      1. Built-in Higher Order Functions

        XPath 3.0 contains several new built-in functions. Some of these new built-in functions are higher order functions. The built-in higher order functions are covered in this subsection.

        The built-in higher order functions include:

        • for-each()
        • filter()
        • fold-left()
        • fold-right()
        • for-each-pair()
        1. for-each()

          The 'for-each' function is a built-in higher order function which takes two arguments and returns a sequence of items.

          It applies the function supplied in the second argument to each item in the sequence supplied in the first argument.

          The first argument is a sequence of items.

          The second argument is a function which takes one argument, an item, and returns a sequence of items.

          for-each(1 to 5, function($arg) {$arg *  10})
          result:
          (10, 20, 30, 40, 50)
          • In this example the first argument of the 'for-each' function is the sequence 1 to 5 i.e. (1, 2, 3, 4, 5). The second argument is an inline function which takes an 'item' as an argument and returns a sequence of items. The function passed as the second argument is applied to each item in the sequence of the first argument, and multiplies it by 10. The result of this function call is the sequence (10, 20, 30, 40, 50).
          • The 'for-each' function is very similar to the 'for' expression covered in chapter 3. The equivalent 'for' expression of the 'for-each' function example above is: for $i in (1, 2, 3, 4, 5) return $i*10
          • Note that in this example the inline function is passed directly as an argument, rather than being bound to a variable in a 'let' expression as in the previous examples.
        2. filter()

          The 'filter' function is a built-in higher order function which takes two arguments and returns a sequence of items.

          It filters the sequence of items in the first argument to only those items for which the function supplied in the second argument returns 'true'.

          The first argument is a sequence of items.

          The second argument is a function which takes an item as an argument and returns an 'xs:boolean' value.

          filter(1 to 5, function($arg) {$arg mod 2 = 0})
          result:
          (2, 4)
          • In this example the first argument of the 'filter' function is the sequence of items '1 to 5' i.e. (1, 2, 3, 4, 5). The second argument is an inline function which checks whether an item mod 2 = 0 (i.e. whether dividing the item by 2 leaves no remainder) and returns 'true' or 'false' accordingly. 'filter' keeps only the items for which the function returns 'true'. Of the sequence (1, 2, 3, 4, 5), only 2 and 4 leave no remainder when divided by 2, so in this example 'filter' returns the sequence (2, 4).
          • The above example would be equivalent to (1, 2, 3, 4, 5) [. mod 2 = 0], where '[. mod 2 = 0]' is a predicate and '.' is the context item in the sequence.
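          For readers more familiar with Python, the same selection can be sketched as follows. This is an analogue for checking the logic rather than XPath, and the names 'is_even', 'kept' and 'same' are introduced here purely for illustration.

```python
# Python analogue of: filter(1 to 5, function($arg) {$arg mod 2 = 0})
# 'is_even', 'kept' and 'same' are hypothetical names used only here.
def is_even(n):
    """Predicate: true when n leaves no remainder on division by 2."""
    return n % 2 == 0

items = range(1, 6)                      # the sequence 1 to 5
kept = list(filter(is_even, items))      # keep items for which the predicate is true
same = [n for n in items if is_even(n)]  # analogue of the predicate form [. mod 2 = 0]

print(kept)  # [2, 4]
print(same)  # [2, 4]
```

          As in XPath, the predicate itself only answers true or false; it is the filtering function that decides which items are kept.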
        3. fold-left()

          The 'fold-left' function is a built-in higher order function which takes three arguments and returns a sequence of items.

          A fold reduces a sequence of items, typically to a single value. 'fold-left' processes the sequence from left to right.

          The first argument '$seq' is a sequence of items to be processed (from left to right).

          The second argument is a base value.

          The third argument is a function '$func'. The function takes two arguments, '$arg1' (a sequence of items) and '$arg2' (a single item), and returns a sequence of items.

          'fold-left' applies '$func' repeatedly, with the accumulated result as the first argument '$arg1' and the next item in the sequence '$seq' as the second argument '$arg2'. The accumulated result is initially set to the base value. By convention the base value is chosen so that combining it with another value returns that value unchanged. For example, in the case of addition the base value might be 0, because 0 + $arg2 = $arg2 (i.e. '$arg2' is unchanged), whereas in the case of multiplication a base value of 1 would be more appropriate because 1 * $arg2 = $arg2.

          fold-left(1 to 5, 1, function($arg1, $arg2) {$arg1 * $arg2})
          result:
          120
          In this example the inline function multiplies '$arg1' by '$arg2' with a base value of 1 i.e. (((((1 * 1) * 2) * 3) * 4) * 5)
          fold-left(1 to 5, 0, function($arg1, $arg2) {$arg1 * $arg2})
          result:
          0
          In this example the inline function multiplies '$arg1' by '$arg2' with a base value of 0 i.e. (((((0 * 1) * 2) * 3) * 4) * 5)
          fold-left(1 to 5, 0, function($arg1, $arg2) {$arg1 + $arg2})
          result:
          15
          In this example the inline function adds '$arg2' to '$arg1' with a base value of 0 i.e. (((((0 + 1) + 2) + 3) + 4) + 5)
          fold-left(1 to 5, 0, function($arg1, $arg2) {$arg1 - $arg2})
          result:
          -15
          In this example the inline function subtracts '$arg2' from '$arg1' with a base value of 0 i.e. (((((0 - 1) - 2) - 3) - 4) - 5)
          fold-left(('a', 'b', 'c'), 'z' , function($arg1, $arg2) {concat($arg1, $arg2)})
          result:
          'zabc'
          In this example the inline function concatenates '$arg1' with '$arg2', with a base value of 'z' i.e. concat(concat(concat('z','a'), 'b'), 'c')
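          To see how the accumulated result develops step by step, the left fold can be sketched in Python. This is an analogue rather than XPath; 'fold_left' and 'fold_left_steps' are hypothetical helper names, built on Python's 'functools.reduce', which also folds from the left.

```python
from functools import reduce

def fold_left(seq, base, func):
    """Apply func(accumulator, item) from left to right, starting from base."""
    return reduce(func, seq, base)

def fold_left_steps(seq, base, func):
    """Return every intermediate accumulator value, beginning with base."""
    steps = [base]
    for item in seq:
        steps.append(func(steps[-1], item))
    return steps

one_to_five = range(1, 6)  # the sequence 1 to 5

product = fold_left(one_to_five, 1, lambda acc, item: acc * item)        # 120
difference = fold_left(one_to_five, 0, lambda acc, item: acc - item)     # -15
joined = fold_left(["a", "b", "c"], "z", lambda acc, item: acc + item)   # 'zabc'
trace = fold_left_steps(one_to_five, 0, lambda acc, item: acc - item)

print(product, difference, joined)
print(trace)  # [0, -1, -3, -6, -10, -15]
```

          The trace for the subtraction example shows the base value 0 entering first and each item being subtracted in turn, matching (((((0 - 1) - 2) - 3) - 4) - 5).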
        4. fold-right()

          The 'fold-right' function is a built-in higher order function which takes three arguments and returns a sequence of items.

          A fold reduces a sequence of items, typically to a single value. 'fold-right' processes the sequence from right to left.

          The first argument '$seq' is a sequence of items to be processed (from right to left).

          The second argument is a base value.

          The third argument is a function '$func'. The function takes two arguments, '$arg1' (a single item) and '$arg2' (a sequence of items), and returns a sequence of items.

          'fold-right' applies '$func' repeatedly, working from the right, with the next item in the sequence as the first argument '$arg1' and the accumulated result of processing the items to its right as the second argument '$arg2'. The accumulated result is initially set to the base value. By convention the base value is chosen so that combining it with another value returns that value unchanged. For example, in the case of addition the base value might be 0, because $arg1 + 0 = $arg1 (i.e. '$arg1' is unchanged), whereas in the case of multiplication a base value of 1 would be more appropriate because $arg1 * 1 = $arg1.

          fold-right(1 to 5, 0, function($arg1, $arg2) {$arg1 + $arg2})
          result:
          15
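          • In this example the inline function adds '$arg2' (the accumulated result) to '$arg1' (the next item in the sequence) with a base value of 0 i.e. (1 + (2 + (3 + (4 + (5 + 0)))))
          • 'fold-left' and 'fold-right' will yield the same result when the function performs an associative operation on its arguments and the base value is the identity for that operation (0 for addition, 1 for multiplication). Addition and multiplication are associative operations, division and subtraction are not. String concatenation is associative, but the base value 'z' used in the 'concat' examples is not its identity, so 'fold-left' returns 'zabc' while 'fold-right' returns 'abcz'.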
          fold-right(1 to 5, 0, function($arg1, $arg2) {$arg1 - $arg2})
          result:
          3
          In this example the inline function subtracts $arg2 from $arg1 i.e. (1 - (2 - (3 - (4 - (5 - 0)))))
          fold-right(('a', 'b', 'c'), 'z' , function($arg1, $arg2) {concat($arg1, $arg2)})
          result:
          'abcz'
          In this example the inline function concatenates '$arg1' and '$arg2' with a base value of 'z' i.e. concat('a', concat('b', concat( 'c', 'z')))
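          The right-to-left order can be checked with a small Python sketch. This is an analogue rather than XPath, and 'fold_right' is a hypothetical helper written for this illustration (Python has no built-in right fold).

```python
def fold_right(seq, base, func):
    """Apply func(item, accumulator) from right to left, starting from base."""
    acc = base
    for item in reversed(list(seq)):
        acc = func(item, acc)  # the item is the first argument, as in fold-right
    return acc

one_to_five = range(1, 6)  # the sequence 1 to 5

total = fold_right(one_to_five, 0, lambda item, acc: item + acc)         # 15
difference = fold_right(one_to_five, 0, lambda item, acc: item - acc)    # 3
joined = fold_right(["a", "b", "c"], "z", lambda item, acc: item + acc)  # 'abcz'

print(total, difference, joined)
```

          For the subtraction case the accumulator takes the values 0, 5, -1, 4, -2 and finally 3, which is why the result differs from the -15 produced by the left fold.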
        5. for-each-pair()

          The 'for-each-pair' function is a built-in higher order function which takes three arguments and returns a sequence of items.

          It applies the function passed as the third argument to each pair of corresponding items from the sequences in the first and second arguments and returns the resulting sequence. If the sequences differ in length, the extra items of the longer sequence are ignored.

          for-each-pair((1, 10, 100), (2, 5, 10), function($arg1, $arg2) {$arg1 * $arg2})
          result:
          (2, 50, 1000)
          In this example the function passed as the third argument to the 'for-each-pair' function multiplies the corresponding items of the first and second sequences and returns the result for each pair, 1 * 2, 10 * 5 and 100 * 10 i.e. (2, 50, 1000)
          for-each-pair((1, 10, 100), (2, 5), function($arg1, $arg2) {$arg1 * $arg2})
          result:
          (2, 50)
          In this example the first sequence has 3 items and the second sequence has 2 items, therefore only the first and second items of the first sequence will be processed (1 * 2) and (10 * 5) i.e. (2, 50)
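          The pairing behaviour, including the handling of sequences of unequal length, resembles Python's built-in 'zip', which also stops at the shorter input. The sketch below is an analogue rather than XPath, and 'for_each_pair' is a hypothetical helper name.

```python
def for_each_pair(seq1, seq2, func):
    """Apply func to corresponding items; zip stops at the shorter sequence."""
    return [func(a, b) for a, b in zip(seq1, seq2)]

def multiply(a, b):
    return a * b

equal_lengths = for_each_pair([1, 10, 100], [2, 5, 10], multiply)  # [2, 50, 1000]
unequal_lengths = for_each_pair([1, 10, 100], [2, 5], multiply)    # [2, 50]

print(equal_lengths)
print(unequal_lengths)
```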
    3. Function Composition

      Function composition is when the output of one function is used as the input of another function. For example, we can create a function 'a' which takes two functions 'b' and 'c' as arguments and returns a new function in which the output of function 'c' is used as the input of function 'b'. This is best demonstrated by example.

      let $z :=
                function($a as function(*), $b as function(*))
                as function(*)
                   {
                     function($c as item()*)
                      {
                       $a($b($c))
                      }
                   },
      
                $div10 :=
                function($x as xs:double) as xs:double
                  {
                    $x div 10
                  },
      
                $pow2 :=
                function($y as xs:double) as xs:double
                  {
                   $y * $y
                  }
      
                 return
                 ($z($div10, $pow2)(5), $z($pow2, $div10)(5))
      result:
      (2.5, 0.25)
      In this example the function bound to the variable '$z' takes two arguments '$a' and '$b', which are both functions, and returns a new function which applies '$b' to its argument and then applies '$a' to the result. Two other functions are declared in the 'let' expression and bound to the '$div10' and '$pow2' variables respectively. The 'return' clause calls the function bound to '$z' twice: first with '$div10' as the first argument and '$pow2' as the second, then with '$pow2' as the first argument and '$div10' as the second. In both cases the value 5 is passed to the returned function, resulting in the output (2.5, 0.25) i.e. (5 * 5) div 10 = 2.5, and (5 div 10) * (5 div 10) = 0.25.
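      The order in which the two functions are applied is easy to get backwards, so it may help to see the same composition sketched in Python. This is an analogue rather than XPath; 'compose' is a hypothetical name standing in for the function bound to '$z'.

```python
def compose(a, b):
    """Return a new function that applies b first and then a."""
    def composed(c):
        return a(b(c))
    return composed

def div10(x):
    return x / 10

def pow2(y):
    return y * y

square_then_divide = compose(div10, pow2)  # div10(pow2(c))
divide_then_square = compose(pow2, div10)  # pow2(div10(c))

results = (square_then_divide(5), divide_then_square(5))
print(results)  # (2.5, 0.25)
```

      The function named second is applied first, because it sits innermost in the call a(b(c)).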
    4. Partial Functions

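      A partial function application is a function call in which the '?' placeholder is used in place of one or more arguments, to indicate that those arguments will be provided at a 'later point'. The result of the call is a new function which takes one argument for each placeholder.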

      for-each-pair( 1 to 5, ( 'London', 'New York', 'Vienna', 'Paris', 'Tokyo' ), concat( ?, ' ',  ? ) )
      result:
      ('1 London', '2 New York', '3 Vienna', '4 Paris', '5 Tokyo')
      In this example, the partial application of 'concat' in the third argument of 'for-each-pair' fixes the middle argument to a space and leaves two placeholders, producing a function of two arguments. 'for-each-pair' calls this function with each pair of items from the first and second sequences and returns the resulting sequence.

      Put another way, partial function application binds some of the arguments of a function to values, creating a new function which only needs to be passed the remaining arguments. Calling the new function with the 'remaining' arguments is therefore equivalent to calling the 'base' function with all arguments.

      let $tax_rate :=
                function($rate as xs:integer, $amount as xs:decimal) as xs:decimal
                {
                  ($rate div 100) * $amount
                },
      
                $income_tax :=
                function($amount as xs:decimal) as xs:decimal
                {
                 $tax_rate(15, ?)($amount)
                },
      
               $luxury_tax :=
                function($amount as xs:decimal) as xs:decimal
                {
                  $tax_rate(50, ?)($amount)
                }
      
              return 
                  ($income_tax(300), $luxury_tax(50))
      result:
      (45, 25)
      In this example an anonymous function which takes two arguments, an integer '$rate' and a decimal '$amount', is bound to the '$tax_rate' variable. It divides '$rate' by 100 and multiplies the result by '$amount'. Two other anonymous functions are bound to '$income_tax' and '$luxury_tax' respectively. Each takes one argument '$amount' and calls the function bound to '$tax_rate' with a value for the first argument and the placeholder '?' in place of the second. The partial application, for example '$tax_rate(15, ?)', returns a new function of one argument, and the '($amount)' that follows calls that new function with the value passed to '$income_tax' or '$luxury_tax'. The 'return' clause of the 'let' expression calls both '$income_tax' and '$luxury_tax' with a single argument, which is equivalent to calling '$tax_rate(15, 300)' and '$tax_rate(50, 50)'. Because a partial application is itself a function, '$tax_rate(15, ?)' could also be bound to '$income_tax' directly, without the wrapping inline function.
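      The idea of fixing one argument now and supplying the rest later can also be sketched in Python, where 'functools.partial' plays a role similar to the '?' placeholder. This is an analogue rather than XPath. 'Fraction' is used here only as an assumption to keep the arithmetic exact, in the spirit of 'xs:decimal'.

```python
from fractions import Fraction
from functools import partial

def tax_rate(rate, amount):
    """Return rate percent of amount, using exact rational arithmetic."""
    return Fraction(rate, 100) * amount

# Fix the first argument; the remaining argument is supplied at call time,
# much like $tax_rate(15, ?) and $tax_rate(50, ?).
income_tax = partial(tax_rate, 15)
luxury_tax = partial(tax_rate, 50)

results = (income_tax(300), luxury_tax(50))
print(results)  # (Fraction(45, 1), Fraction(25, 1))
```

      Calling 'income_tax(300)' gives the same value as calling 'tax_rate(15, 300)' with both arguments.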
    5. Closures

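      By now we know that a higher order function can take functions as parameters and/or return a function. If a function 'a' returns a function 'b', and function 'b' refers to variables that belong to function 'a' (such as its parameters), function 'b' retains those values even after function 'a' has returned. This is known as a closure.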

      let $date :=
             function($upperdate as xs:date)
                as function(xs:date, xs:string) as xs:string
                  {
                     function($birthdate as xs:date, $name as xs:string) as xs:string
                         {
                            'On ' || fn:adjust-date-to-timezone($upperdate, ()) || ', ' || $name || ' was ' || fn:days-from-duration($upperdate - $birthdate ) || ' days old'
                          }
                   },
              $days_old:= $date(fn:current-date())
              return
              ($days_old(xs:date('1957-08-13'), 'Mary' ), $days_old(xs:date('1990-12-13'), 'John' ))
      result:
      ('On 2014-10-24, Mary was 20890 days old', 'On 2014-10-24, John was 8715 days old')
      In this example a function which takes an 'xs:date' argument '$upperdate' is bound to the '$date' variable. The '$date' function returns a function which takes an 'xs:date' and an 'xs:string' value. The returned function uses the '$upperdate' value which was passed to '$date', and it retains that value after '$date' has returned, demonstrating a closure. Because '$upperdate' is set from 'fn:current-date()', the result shown depends on the date on which the expression is evaluated.
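      The capture of an outer value by an inner function can be sketched in Python as well. This is an analogue rather than XPath; 'make_days_old' is a hypothetical name, and fixed, made-up dates are used instead of the current date so that the output is reproducible.

```python
from datetime import date

def make_days_old(upper_date):
    """Return a function that remembers upper_date after this call returns."""
    def days_old(birth_date, name):
        days = (upper_date - birth_date).days  # upper_date comes from the outer call
        return f"On {upper_date.isoformat()}, {name} was {days} days old"
    return days_old

# Hypothetical dates chosen for illustration only.
days_old = make_days_old(date(2000, 3, 1))
message = days_old(date(2000, 1, 1), "Mary")

print(message)  # On 2000-03-01, Mary was 60 days old
```

      The inner function never receives the upper date as an argument of its own; it carries the value over from the call that created it.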