XPath/XQuery xml-to-json function

Summary

Converts an XML tree, whose format corresponds to the XML representation of JSON defined in this specification, into a string conforming to the JSON grammar.

Signatures

fn:xml-to-json($input as node()?) as xs:string?

fn:xml-to-json($input as node()?, $options as map(*)) as xs:string?

Properties

This function is deterministic, context-independent, and focus-independent.

Rules

The effect of the one-argument form of this function is the same as calling the two-argument form with an empty map as the value of the $options argument.

The first argument $input is a node; the subtree rooted at this node will typically be the XML representation of a JSON document as defined in this specification.

If $input is the empty sequence, the function returns the empty sequence.

The $options argument can be used to control the way in which the conversion takes place. The option parameter conventions apply.

The entries that may appear in the $options map are as follows:

  • Meaning: Determines whether additional whitespace should be added to the output to improve readability.

  • Type: xs:boolean

  • Default: false

  • Value false: The processor must not insert any insignificant whitespace between JSON tokens.

  • Value true: The processor may insert whitespace between JSON tokens in order to improve readability. The specification imposes no constraints on how this is done.

The node supplied as $input must be one of the following:

  1. An element node whose name matches the name of a global element declaration in the schema for the XML representation of JSON ("the schema") and that is valid as defined below:

    If the type annotation of the element matches the type of the relevant element declaration in the schema (indicating that the element has been validated against the schema), then the element is considered valid.

    Otherwise, the processor may attempt to validate the element against the schema, in which case it is treated as valid if and only if the outcome of validation is valid.

    Otherwise (if the processor does not attempt validation using the schema), the processor must ensure that the content of the element, after stripping all attributes (at any depth) in namespaces other than http://www.w3.org/2005/xpath-functions, is such that validation against the schema would have an outcome of valid.

    The process described here is not precisely equivalent to schema validation. For example, schema validation will fail if there is an invalid xsi:type or xsi:nil attribute, whereas this process will ignore such attributes.

  2. An element node E having a key attribute and/or an escaped-key attribute provided that E would satisfy one of the above conditions if the key and/or escaped-key attributes were removed.

  3. A document node having exactly one element child and no text node children, where the element child satisfies one of the conditions above.
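The attribute-stripping step in condition 1 can be illustrated with a minimal Python sketch. It assumes the input has been parsed with the standard library's xml.etree.ElementTree; the function name strip_foreign_attributes is hypothetical, and the sketch does not perform schema validation itself.

```python
import xml.etree.ElementTree as ET

FN_NS = "http://www.w3.org/2005/xpath-functions"

def strip_foreign_attributes(root):
    """Remove, at any depth, attributes in a namespace other than FN_NS.

    No-namespace attributes (such as key) are left in place, so a later
    validity check still sees them.
    """
    for elem in root.iter():
        for name in list(elem.attrib):
            # ElementTree spells a namespaced attribute name as "{uri}local".
            if name.startswith("{") and not name.startswith("{" + FN_NS + "}"):
                del elem.attrib[name]
    return root
```

Because xsi:type and xsi:nil live in a namespace other than http://www.w3.org/2005/xpath-functions, a step like this removes them as well, which is consistent with the remark that such attributes are ignored.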

Furthermore, $input must satisfy the following constraint (which cannot be conveniently expressed in the schema). Every element M that is a descendant-or-self of $input and has local name map and namespace URI http://www.w3.org/2005/xpath-functions must satisfy the following rule: there must not be two distinct children of M (say C1 and C2) such that the normalized key of C1 is equal to the normalized key of C2. The normalized key of an element C is as follows:

  • If C has the attribute value escaped-key="true", then the value of the key attribute of C, with all JSON escape sequences replaced by the corresponding Unicode characters according to the JSON escaping rules.

  • Otherwise (the escaped-key attribute of C is absent or set to false), the value of the key attribute of C.
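The normalized-key rule can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the helper names (unescape_json, normalized_key, find_duplicate_key) are hypothetical, elements are assumed to come from xml.etree.ElementTree, escaped-key="1" is treated the same as "true", and a \uHHHH surrogate pair is not combined into a single character.

```python
import re
import xml.etree.ElementTree as ET

_SIMPLE = {'"': '"', "\\": "\\", "/": "/", "b": "\b", "f": "\f",
           "n": "\n", "r": "\r", "t": "\t"}
# Either \uHHHH, or a backslash followed by at most one other character.
_ESCAPE = re.compile(r"\\(?:u([0-9A-Fa-f]{4})|(.?))", re.DOTALL)

def unescape_json(text):
    """Replace JSON escape sequences by the characters they denote."""
    def repl(match):
        if match.group(1) is not None:
            # Assumption: surrogate pairs are left as two code units.
            return chr(int(match.group(1), 16))
        try:
            return _SIMPLE[match.group(2)]
        except KeyError:
            raise ValueError("invalid JSON escape: " + match.group(0))
    return _ESCAPE.sub(repl, text)

def normalized_key(child):
    """Return the normalized key of a child element of a map."""
    key = child.get("key", "")
    # Assumption: the xs:boolean lexical form "1" counts as true.
    if child.get("escaped-key", "").strip() in ("true", "1"):
        return unescape_json(key)
    return key

def find_duplicate_key(map_elem):
    """Return the first repeated normalized key, or None if all differ."""
    seen = set()
    for child in map_elem:
        key = normalized_key(child)
        if key in seen:
            return key
        seen.add(key)
    return None
```

With this sketch, two children whose keys are \n and \u000A, both labeled escaped-key="true", are reported as duplicates, whereas the same two key values without the escaped-key attribute are treated as distinct literal strings.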

Nodes in the input tree are handled by applying the following rules, recursively. In these rules the term "an element named N" means "an element node whose local name is N and whose namespace URI is http://www.w3.org/2005/xpath-functions".

  1. A document node having a single element node child is processed by processing that child.

  2. An element named null results in the output null.

  3. An element $E named boolean results in the output true or false depending on the result of xs:boolean(fn:string($E)).

  4. An element $E named number results in the output of the string result of xs:string(xs:double(fn:string($E))).

  5. An element named string results in the output of the string value of the element, enclosed in quotation marks, with any special characters in the string escaped as described below.

  6. An element named array results in the output of the children of the array element, each processed by applying these rules recursively: the items in the resulting list are enclosed between square brackets, and separated by commas.

  7. An element named map results in the output of a sequence of map entries corresponding to the children of the map element, enclosed between curly braces and separated by commas. Each entry comprises the value of the key attribute of the child element, enclosed in quotation marks and escaped as described below, followed by a colon, followed by the result of processing the child element by applying these rules recursively.

  8. Comments, processing instructions, and whitespace text node children of map and array are ignored.
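The recursive rules above can be sketched in Python using the standard library's xml.etree.ElementTree. The sketch makes several assumptions that are not part of the rules: the input is already valid against the schema, no string or key is labeled escaped="true" or escaped-key="true", and number formatting only approximates the XPath cast xs:string(xs:double(...)) (integral values below one million match, other values use Python's own float formatting). The names to_json, quote, and format_number are hypothetical.

```python
import math
import xml.etree.ElementTree as ET

FN_NS = "http://www.w3.org/2005/xpath-functions"
_SHORT = {'"': '\\"', "\\": "\\\\", "\b": "\\b", "\f": "\\f",
          "\n": "\\n", "\r": "\\r", "\t": "\\t"}

def quote(text):
    """Quote a string or key that is NOT labeled as already escaped."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in _SHORT:
            out.append(_SHORT[ch])
        elif 1 <= cp <= 31 or 127 <= cp <= 159:
            out.append("\\u%04X" % cp)
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'

def format_number(text):
    """Approximation of xs:string(xs:double(text)); see the assumptions."""
    value = float(text)
    if not math.isfinite(value):
        raise ValueError("infinity and NaN are not valid JSON numbers")
    if value.is_integer() and abs(value) < 1e6:
        return str(int(value))
    return repr(value)

def to_json(elem):
    """Serialize one element of the XML representation of JSON."""
    prefix = "{" + FN_NS + "}"
    if not elem.tag.startswith(prefix):
        raise ValueError("unexpected element: %r" % elem.tag)
    name = elem.tag[len(prefix):]
    text = "".join(elem.itertext())
    if name == "null":
        return "null"
    if name == "boolean":
        return "true" if text.strip() in ("true", "1") else "false"
    if name == "number":
        return format_number(text)
    if name == "string":
        return quote(text)
    # Iterating an element yields child elements only; the default parser
    # drops comments and processing instructions, and whitespace text is
    # never a child, so rule 8 needs no extra code here.
    if name == "array":
        return "[" + ",".join(to_json(c) for c in elem) + "]"
    if name == "map":
        return "{" + ",".join(
            quote(c.get("key", "")) + ":" + to_json(c) for c in elem) + "}"
    raise ValueError("unexpected element name: " + name)
```

A document node is handled by passing its single element child to to_json, for example the result of ET.fromstring(...).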

Strings are escaped as follows:

  1. If the attribute escaped="true" is present for a string value, or escaped-key="true" for a key value, then:

    any valid JSON escape sequence present in the string is copied unchanged to the output;

    any invalid JSON escape sequence results in a dynamic error;

    any unescaped occurrence of quotation mark, backspace, form-feed, newline, carriage return, tab, or solidus is replaced by \", \b, \f, \n, \r, \t, or \/ respectively;

    any other codepoint in the range 1-31 or 127-159 is replaced by an escape in the form \uHHHH where HHHH is the upper-case hexadecimal representation of the codepoint value.

  2. Otherwise (that is, in the absence of the attribute escaped="true" for a string value, or escaped-key="true" for a key value):

    any occurrence of backslash is replaced by \\;

    any occurrence of quotation mark, backspace, form-feed, newline, carriage return, or tab is replaced by \", \b, \f, \n, \r, or \t respectively;

    any other codepoint in the range 1-31 or 127-159 is replaced by an escape in the form \uHHHH where HHHH is the upper-case hexadecimal representation of the codepoint value.
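Both escaping modes can be sketched as a single Python function. This is a minimal illustration rather than a definitive implementation: the function name escape_json and its escaped parameter are hypothetical, and the dynamic error is modeled as a Python ValueError. The result does not include the enclosing quotation marks.

```python
import re

_SHORT = {'"': '\\"', "\b": "\\b", "\f": "\\f",
          "\n": "\\n", "\r": "\\r", "\t": "\\t"}
# A valid JSON escape: backslash plus one of " \ / b f n r t, or \uHHHH.
_VALID_ESCAPE = re.compile(r'\\(?:["\\/bfnrt]|u[0-9A-Fa-f]{4})')

def _escape_char(ch, escape_solidus):
    cp = ord(ch)
    if ch in _SHORT:
        return _SHORT[ch]
    if ch == "/" and escape_solidus:
        return "\\/"
    if 1 <= cp <= 31 or 127 <= cp <= 159:
        return "\\u%04X" % cp  # upper-case hexadecimal, four digits
    return ch

def escape_json(text, escaped=False):
    """Escape a string value or key for output.

    escaped=True corresponds to escaped="true" (or escaped-key="true"):
    existing escape sequences are copied through and validated.
    escaped=False corresponds to the attribute being absent or false:
    every backslash is literal and is doubled.
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            if not escaped:
                out.append("\\\\")
                i += 1
                continue
            match = _VALID_ESCAPE.match(text, i)
            if match is None:
                raise ValueError(
                    "invalid JSON escape sequence at offset %d" % i)
            out.append(match.group(0))
            i = match.end()
            continue
        out.append(_escape_char(ch, escape_solidus=escaped))
        i += 1
    return "".join(out)
```

Note the asymmetry the rules call for: a bare solidus is escaped only in the escaped="true" mode, while a backslash is doubled only in the other mode.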

Examples

The input <array xmlns="http://www.w3.org/2005/xpath-functions"><number>1</number><string>is</string><boolean>1</boolean></array> produces the result [1,"is",true].

The input <map xmlns="http://www.w3.org/2005/xpath-functions"><number key="Sunday">1</number><number key="Monday">2</number></map> produces the result {"Sunday":1,"Monday":2}.

Error Conditions

A dynamic error is raised if the value of $options includes an entry whose key is defined in this specification, and whose value is not a permitted value for that key.

A dynamic error is raised if the value of $input is not a document or element node or is not valid according to the schema for the XML representation of JSON, or if a map element has two children whose normalized key values are the same.

A dynamic error is raised if the value of $input includes a string labeled with escaped="true", or a key labeled with escaped-key="true", where the content of the string or key contains an invalid JSON escape sequence: specifically, where it contains a backslash (\) that is not followed by one of the characters ", \, /, b, f, n, r, t, or u, or where it contains the characters \u not followed by four hexadecimal digits (that is [0-9A-Fa-f]{4}).

Notes

The rule requiring schema validity has a number of consequences, including the following:

  • The input cannot contain no-namespace attributes, or attributes in the namespace http://www.w3.org/2005/xpath-functions, except where explicitly allowed by the schema. Attributes in other namespaces, however, are ignored.

  • Nodes that do not affect schema validity, such as comments, processing instructions, namespace nodes, and whitespace text node children of map and array, are ignored.

  • Numeric values are restricted to those that are valid in JSON: the schema disallows positive and negative infinity and NaN.

  • Duplicate key values are not permitted. Most cases of duplicate keys are prevented by the rules in the schema; additional cases (where the keys are equal only after expanding JSON escape sequences) are prevented by the prose rules of this function. For example, the key values \n and \u000A are treated as duplicates even though the rules in the schema do not treat them as such.

The rule allowing the top-level element to have a key attribute (which is ignored) allows any element in the output of the fn:json-to-xml function to be processed: for example, it is possible to take a JSON document, convert it to XML, select a subtree based on the value of a key attribute, and then convert this subtree back to JSON, perhaps after a transformation. The rule means that an element with the appropriate name will be accepted if it has been validated against one of the types mapWithinMapType, arrayWithinMapType, stringWithinMapType, numberWithinMapType, booleanWithinMapType, or nullWithinMapType.