This chapter explains what XPath is and provides an introduction to two fundamental concepts in XPath, the 'expression' and the 'sequence'.
XPath stands for XML Path Language.
The 'X' in XPath comes from its roots in XML, the eXtensible Markup Language.
The 'Path' in XPath comes from the fact that XPath uses a 'path like' syntax.
With XPath it is possible to identify parts of an XML document and perform computations on data in the XML document.
Most of us are familiar with the 'path like' syntax used to navigate the hierarchical 'tree like' structure of directories and files on Windows, Linux and Mac filesystems. In much the same way, an XPath location path expression uses a 'path like' syntax to identify and navigate nodes in the hierarchical 'tree like' structure of an XML document.
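The analogy can be sketched with Python's standard library. This is only an illustration under stated assumptions: `xml.etree.ElementTree` supports a limited subset of XPath location paths, and the sample document and its values are hypothetical (the element names follow the `/company/office/employee` path used later in this chapter).

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

# Hypothetical sample document; the values are made up for illustration.
XML = """
<company>
  <office location="Boston">
    <employee><first_name>Ann</first_name></employee>
    <employee><first_name>Raj</first_name></employee>
  </office>
  <office location="Vienna">
    <employee><first_name>Mia</first_name></employee>
  </office>
</company>
"""

# A filesystem path names one step per level of the directory tree.
fs_path = PurePosixPath("/company/office/employee")
fs_steps = fs_path.parts[1:]  # drop the leading "/" root component

# ElementTree paths are relative to the element they are called on, so the
# XPath /company/office/employee/first_name becomes
# office/employee/first_name when evaluated from the <company> root element.
root = ET.fromstring(XML)
first_names = [node.text for node in root.findall("office/employee/first_name")]

print(fs_steps)
print(first_names)
```

Unlike a filesystem path, which names at most one file or directory, the location path here selects every matching node, so the result is a list of all three first names in document order.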
In addition to being able to identify nodes in an XML document, XPath also includes a wealth of operators and built-in functions to perform computations on the data in the XML document.
XPath is used widely by a number of XML technologies including XML Schema, XQuery, XSLT and XPointer because of its capability to identify nodes and perform computations on data in an XML document.
The basic building block of XPath is the expression. An expression is simply a string of Unicode characters made up of keywords, symbols, and operands.
In the previous section we briefly encountered 'location path' expressions, from which XPath derives its name. In addition to location paths, there are several other types of expression in XPath. Although each expression type is described separately in this document, an XPath expression is not limited to a single type; in fact, XPath expressions are frequently constructed from several different expression types. The following example combines an iterative expression, location paths and a function call:
for $i in /company/office/employee return concat('Your name is ', $i/first_name)
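To see what this expression computes, here is a minimal Python sketch that emulates it. It is not an XPath 2.0 processor: the standard library's `xml.etree.ElementTree` only evaluates simple location paths, so the `for` and `concat` parts are written in plain Python. The sample data is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the values are made up for illustration.
XML = """
<company>
  <office>
    <employee><first_name>Ann</first_name></employee>
    <employee><first_name>Raj</first_name></employee>
  </office>
</company>
"""

root = ET.fromstring(XML)

# for $i in /company/office/employee  -> iterate over the selected nodes
# return concat('Your name is ', ...) -> build one string per node
# $i/first_name                       -> a path evaluated relative to $i
greetings = [
    "Your name is " + employee.findtext("first_name", default="")
    for employee in root.findall("office/employee")
]

print(greetings)
```

The result has one string per selected `employee` node, in document order, which mirrors the sequence that the XPath expression returns.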
Because location path expressions are so important to a fundamental understanding of XPath, Chapter 2 is dedicated to covering them in greater detail. Chapter 3 covers the other types of XPath expression, which include iterative expressions, conditional expressions, quantified expressions, comparison expressions etc.
Sequences are very important in XPath because every XPath expression returns a sequence. A sequence is simply an ordered collection of zero or more items.
A sequence with zero items is known as the empty sequence.
A sequence with a single item is known as a singleton sequence.
In XPath 3.0, an item in a sequence can be a node, an atomic value, or a function.
The XPath 3.0 data model defines 7 node types: document, element, attribute, text, namespace, processing instruction and comment.
An atomic value is a permissible value in the set of values for a particular atomic type. The atomic types are all of the types which are derived either directly or indirectly from xs:anyAtomicType, shown in the type hierarchy diagram of the XML Schema 1.1 Datatypes specification.
The function is a new item type. In XPath 2.0 an item in a sequence could only be a node or an atomic value; in XPath 3.0 an item can be a node, an atomic value or a function.
In many of the examples in this training you will see an XPath expression followed by its result. The result is shown in parentheses only when it is an empty sequence or a sequence containing multiple items. If the result is an error, or a sequence containing only one item, it is shown without parentheses.
(4, 8, 75, 16, 2)
(5, 'hello', 75, 'world', 2, /company/office/employee/first_name)
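As a rough model, a sequence can be pictured as a Python list, with the caveat that XPath sequences are always flat while Python lists can nest. The sketch below applies the display convention described above; the helper name `show_result` is hypothetical and exists only for this illustration.

```python
def show_result(sequence):
    """Format a result: parentheses only for empty or multi-item sequences."""
    # Quote strings the way the examples in this chapter do ('hello').
    items = [repr(item) if isinstance(item, str) else str(item) for item in sequence]
    if len(items) == 1:
        return items[0]  # singleton sequence: shown without parentheses
    return "(" + ", ".join(items) + ")"


empty = []                            # the empty sequence
singleton = [42]                      # a singleton sequence
mixed = [5, "hello", 75, "world", 2]  # items of different atomic types

print(show_result(empty))      # ()
print(show_result(singleton))  # 42
print(show_result(mixed))      # (5, 'hello', 75, 'world', 2)
```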
XPath technology has existed for over 15 years. There are three versions of XPath: 1.0, 2.0 and 3.0.
XPath version 1.0 is very basic in comparison with versions 2.0 and 3.0. The type system supports only 4 datatypes ('node-set', 'boolean', 'number' and 'string'), and the built-in function library consists of a mere 27 functions.
XPath version 2.0 represented a major leap forward. A comprehensive type system supporting approximately 50 types (including all simple types defined in the XML Schema specification) was introduced, along with new operators and a greatly expanded built-in function library (well over 100 functions). The concept of sequences was also introduced: every value in an XPath 2.0 expression is a sequence. Node sets in XPath 1.0 were replaced by sequences of nodes in XPath 2.0.
XPath version 3.0 also introduces new datatypes, operators and built-in functions (there are now over 200); however, the most significant change is the promotion of functions to first-class values (higher-order functions). A function can now be passed to another function (as an argument), or be returned from a function. In addition, in XPath 3.0 it is also possible for programmers to write their own functions (inline functions), and create their own function libraries.
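For readers who have not met first-class functions before, the idea can be sketched in Python. This is an analogy only, not XPath syntax: the helper names `for_each` and `make_adder` are hypothetical, and the lambdas stand in for XPath 3.0 inline functions.

```python
def for_each(sequence, action):
    """Apply `action` to every item of `sequence` and collect the results."""
    return [action(item) for item in sequence]


def make_adder(n):
    """Return a new function: an example of a function returned from a function."""
    return lambda x: x + n


# An inline (anonymous) function passed to another function as an argument.
doubled = for_each([1, 2, 3], lambda x: x * 2)

# A function returned as a value, then passed on like any other item.
add_ten = make_adder(10)
shifted = for_each([1, 2, 3], add_ten)

print(doubled)
print(shifted)
```

In both calls the function is handled like any other value: it is stored in a variable, passed as an argument, or returned as a result.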
The XPath 3.0 specification consists of three documents which specify the syntax, data model, and operators and functions respectively: