XPath/XQuery parse-xml-fragment function

Summary

This function takes as input an XML external entity represented as a string, and returns the document node at the root of an XDM tree representing the parsed document fragment.

Signature

fn:parse-xml-fragment(
$arg as xs:string?
) as document-node()?

Properties

This function is nondeterministic, context-dependent, and focus-independent. It depends on the static base URI.

Rules

If $arg is the empty sequence, the function returns the empty sequence.

The input must be a namespace-well-formed external general parsed entity. More specifically, it must be a string conforming to the production rule extParsedEnt in the XML specification, it must contain no entity references other than references to predefined entities, and it must satisfy all the rules of the Namespaces in XML specification for namespace-well-formed documents, with the exception that the rule requiring it to be a well-formed document is replaced by the rule requiring it to be a well-formed external general parsed entity.

The string is parsed to form a sequence of nodes which become children of the new document node, in the same way as the content of any element is converted into a sequence of children for the resulting element node.

Schema validation is not invoked, which means that the nodes in the returned document will all be untyped.
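
A minimal XQuery sketch of what "untyped" means in practice, assuming an XQuery 3.0 or later processor; the element name price and the input string are invented for illustration. Atomizing an element from the returned tree yields xs:untypedAtomic, so an explicit cast is needed for integer arithmetic.

```xquery
xquery version "3.1";

(: Illustrative input: the element name "price" is an assumption, not part of the specification. :)
let $doc := fn:parse-xml-fragment("<price>42</price><price>8</price>")
return (
  (: Atomizing an untyped element yields xs:untypedAtomic, not xs:integer. :)
  fn:data($doc/price[1]) instance of xs:untypedAtomic,
  (: Cast each value explicitly before summing as integers. :)
  fn:sum($doc/price ! xs:integer(.))
)
```

Under these assumptions the query evaluates to true followed by 50.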

The precise process used to construct the XDM instance is implementation-defined. In particular, it is implementation-defined whether an XML 1.0 or XML 1.1 parser is used.

The static base URI from the static context of the fn:parse-xml-fragment function call is used as the base URI of the document node that is returned.

The document URI of the returned node is absent.
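
The two URI properties can be inspected with a query along the following lines. This is a sketch only: it assumes the static base URI is defined in the environment where the query runs, and the values actually returned depend on that environment.

```xquery
xquery version "3.1";

let $doc := fn:parse-xml-fragment("<a/>")
return (
  (: Base URI of the new document node, taken from the static context. :)
  fn:base-uri($doc),
  (: The static base URI itself, shown for comparison with the value above. :)
  fn:static-base-uri(),
  (: The document URI is absent, so fn:document-uri returns the empty sequence. :)
  fn:empty(fn:document-uri($doc))
)
```

If the rules above are followed, the first two items are the same URI and the last item is true.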

The function is not deterministic: that is, if the function is called twice with the same arguments, it is implementation-dependent whether the same node is returned on both occasions.
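
The following sketch, assuming an XQuery 3.0 or later processor, contrasts node identity with structural equality for two calls on the same input. The variable names are chosen for illustration.

```xquery
xquery version "3.1";

let $input := "<a/>"
let $first := fn:parse-xml-fragment($input)
let $second := fn:parse-xml-fragment($input)
return (
  (: Node identity: may be true or false, depending on the implementation. :)
  $first is $second,
  (: Structural comparison does not depend on node identity. :)
  fn:deep-equal($first, $second)
)
```

Code that relies on the first result is not portable; the second comparison is the safer way to check that two parses produced equivalent trees.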

Examples

The expression fn:parse-xml-fragment("<alpha>abcd</alpha><beta>abcd</beta>") returns a newly created document node, having two elements named alpha and beta as its children; each of these elements in turn is the parent of a text node.

The expression fn:parse-xml-fragment("He was <i>so</i> kind") returns a newly created document node having three children: a text node whose string value is "He was ", an element node named i having a child text node with string value "so", and a text node whose string value is " kind".

The expression fn:parse-xml-fragment("") returns a document node having no children.

The expression fn:parse-xml-fragment(" ") returns a document node whose children comprise a single text node whose string value is a single space.

The expression fn:parse-xml-fragment('<?xml version="1.0" encoding="utf8" standalone="yes"?><a/>') results in a dynamic error because the "standalone" keyword is not permitted in the text declaration that appears at the start of an external general parsed entity. (Thus, it is not the case that any input accepted by the fn:parse-xml function will also be accepted by fn:parse-xml-fragment.)

Error Conditions

A dynamic error is raised if the content of $arg is not a well-formed external general parsed entity, if it contains entity references other than references to predefined entities, or if a document that incorporates this well-formed parsed entity would not be namespace-well-formed.
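
The three error conditions can be exercised with try/catch, as in the sketch below. The helper local:try-parse is hypothetical and introduced only for this example, and the err prefix is declared explicitly because it is not guaranteed to be predeclared. Note that in an XQuery string literal the ampersand must itself be escaped, so the literal "&amp;nbsp;" passes the text &nbsp; to the function.

```xquery
xquery version "3.1";

declare namespace err = "http://www.w3.org/2005/xqt-errors";

(: Hypothetical helper: returns the parsed fragment, or the error code as a string if parsing fails. :)
declare function local:try-parse($s as xs:string) as item()? {
  try { fn:parse-xml-fragment($s) }
  catch * { "error: " || fn:string($err:code) }
};

(
  local:try-parse("<a><b></a>"),         (: not well-formed: mismatched tags :)
  local:try-parse("<a>&amp;nbsp;</a>"),  (: reference to an entity that is not predefined :)
  local:try-parse("<p:a/>")              (: undeclared prefix: not namespace-well-formed :)
)
```

On a conforming processor each of the three calls is expected to take the catch branch and return a string beginning with "error: ".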

Notes

See also the notes for the fn:parse-xml function.

The main difference between fn:parse-xml and fn:parse-xml-fragment lies in the permitted children of the resulting document node. For fn:parse-xml, the children must include exactly one element node and no text nodes, whereas for fn:parse-xml-fragment, the resulting document node can have any number (including zero) of element and text nodes among its children. An additional difference is that the text declaration at the start of an external entity has slightly different syntax from the XML declaration at the start of a well-formed document.
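
As a rough illustration of this difference, the sketch below passes the same input to both functions. The input string is invented for the example, and try/catch (XQuery 3.0 or later) is assumed to be available.

```xquery
xquery version "3.1";

(: Two top-level elements separated by top-level text. :)
let $input := "<alpha/>text<beta/>"
return (
  (: fn:parse-xml needs a well-formed document, so this call is expected to fail. :)
  try { fn:count(fn:parse-xml($input)/node()) }
  catch * { "rejected by fn:parse-xml" },
  (: fn:parse-xml-fragment accepts the mix and yields three child nodes. :)
  fn:count(fn:parse-xml-fragment($input)/node())
)
```

The expected result is the string "rejected by fn:parse-xml" followed by 3.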

Note that all whitespace outside the text declaration is significant, including whitespace that precedes the first element node.
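
A small sketch of the whitespace rule, using an invented input with two leading spaces. It assumes the processor is not configured to strip whitespace from parsed input.

```xquery
xquery version "3.1";

let $doc := fn:parse-xml-fragment("  <a/>")
return (
  (: Number of children: the leading whitespace becomes a text node of its own. :)
  fn:count($doc/node()),
  (: The first child is that whitespace text node, not the element. :)
  $doc/node()[1] instance of text(),
  (: Both leading spaces are preserved. :)
  fn:string-length($doc/text()[1])
)
```

Under that assumption the query returns 2, true, 2.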

One use case for this function is to handle XML fragments stored in databases, which frequently allow zero or more top-level element nodes. Another use case is to parse the contents of a CDATA section embedded within another XML document.
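
The CDATA use case might look like the following sketch. The outer document and its msg element are made up for illustration; in practice the outer document would come from elsewhere.

```xquery
xquery version "3.1";

(: Outer document whose msg element carries markup hidden inside a CDATA section. :)
let $outer := fn:parse-xml("<msg><![CDATA[<b>bold</b> and <i>italic</i>]]></msg>")
(: The string value of msg is the raw markup, which can be parsed as a fragment. :)
let $inner := fn:parse-xml-fragment(fn:string($outer/msg))
(: List the names of the top-level elements recovered from the CDATA text. :)
return $inner/* ! fn:name(.)
```

Here the result is the two strings "b" and "i"; the text " and " between them becomes a text node child of the same document node.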