XPath Expressions You’ll Actually Use

XPath (XML Path Language) is a powerful query language used to navigate and select nodes in XML documents. Whether you're parsing XML files, scraping web data, or writing automated tests, mastering XPath expressions is essential. This tutorial combines categorized expressions with function-focused examples to provide a comprehensive understanding of XPath.

🔍Want to try XPath expressions as you go?

This XPath tester is a handy tool that lets you experiment with XPath queries with our sample XML document.

Sample XML Document

We'll use the following XML document for our examples:

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
    <book category="cooking">
        <title lang="en">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
    </book>
    <book category="children">
        <title lang="en">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
    </book>
    <book category="web">
        <title lang="en">XQuery Kick Start</title>
        <author>James McGovern</author>
        <author>Per Bothner</author>
        <author>Kurt Cagle</author>
        <author>James Linn</author>
        <author>Vaidyanathan Nagarajan</author>
        <year>2003</year>
        <price>49.99</price>
    </book>
    <book category="web">
        <title lang="en">Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003</year>
        <price>39.95</price>
    </book>
</bookstore>

Categorized XPath Expressions

XPath expressions can be grouped based on how they select and filter nodes. Understanding these categories helps you write better queries faster.

Selecting Nodes

XPath uses path expressions to select nodes in an XML document. A node is selected by following a path, one step at a time.

Expression	Description	Result
/bookstore	Selects the root element bookstore	<bookstore>...</bookstore>
/bookstore/book	Selects all book elements under bookstore	All <book> elements
//book	Selects all book elements in the document	All <book> elements
bookstore//book	Selects all book elements under bookstore, at any level	All <book> elements
//@lang	Selects all attributes named lang	lang="en" from each <title> element
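To see these selections in code, here is a minimal sketch using Python's standard-library xml.etree.ElementTree. It assumes an abridged copy of the sample document, and the variable names are illustrative only. ElementTree implements just a subset of XPath: paths are written relative to the element you search from, and attribute nodes cannot be returned directly, so the last query reads the attribute from the matching elements instead.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sample document (authors and years omitted).
XML = """<bookstore>
    <book category="cooking">
        <title lang="en">Everyday Italian</title><price>30.00</price>
    </book>
    <book category="children">
        <title lang="en">Harry Potter</title><price>29.99</price>
    </book>
    <book category="web">
        <title lang="en">XQuery Kick Start</title><price>49.99</price>
    </book>
    <book category="web">
        <title lang="en">Learning XML</title><price>39.95</price>
    </book>
</bookstore>"""

root = ET.fromstring(XML)  # root is the <bookstore> element

# /bookstore/book -> direct <book> children of the root
books = root.findall("book")

# //book -> <book> elements at any depth below the root
all_books = root.findall(".//book")

# //@lang -> select the elements that carry the attribute, then read it
langs = [t.get("lang") for t in root.findall(".//title[@lang]")]

print(len(books), len(all_books), langs)
```

A library with full XPath 1.0 support would accept the absolute expressions from the table unchanged.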

Predicates

Predicates are used to find a specific node or a node that contains a specific value. Predicates are always embedded in square brackets.

Expression	Description	Result
/bookstore/book[1]	Selects the first book element under bookstore	First <book> element
/bookstore/book[last()]	Selects the last book element under bookstore	Last <book> element
/bookstore/book[position()<3]	Selects the first two book elements under bookstore	First two <book> elements
//title[@lang]	Selects all title elements with a lang attribute	All <title> elements with lang attribute
//title[@lang='en']	Selects all title elements with lang attribute equal to 'en'	All <title lang="en"> elements
/bookstore/book[price>35.00]	Selects all book elements with price greater than 35.00	<book> elements with price > 35.00
/bookstore/book[price>35.00]/title	Selects the title of books with price greater than 35.00	<title> elements of books with price > 35.00
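The predicates above can be tried out with a short Python sketch based on the standard-library xml.etree.ElementTree. It assumes an abridged copy of the sample document. ElementTree understands positional and attribute predicates but not numeric comparisons, so the price filter is applied in Python rather than inside the path.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sample document (authors and years omitted).
XML = """<bookstore>
    <book category="cooking">
        <title lang="en">Everyday Italian</title><price>30.00</price>
    </book>
    <book category="children">
        <title lang="en">Harry Potter</title><price>29.99</price>
    </book>
    <book category="web">
        <title lang="en">XQuery Kick Start</title><price>49.99</price>
    </book>
    <book category="web">
        <title lang="en">Learning XML</title><price>39.95</price>
    </book>
</bookstore>"""

root = ET.fromstring(XML)

first = root.find("book[1]")                    # /bookstore/book[1]
last = root.find("book[last()]")                # /bookstore/book[last()]
english = root.findall(".//title[@lang='en']")  # //title[@lang='en']

# /bookstore/book[price>35.00]/title, with the comparison done in Python
pricey = [
    book.findtext("title")
    for book in root.findall("book")
    if float(book.findtext("price")) > 35.00
]

print(first.findtext("title"), "|", last.findtext("title"))
print(len(english), pricey)
```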

Selecting Unknown Nodes

XPath wildcards can be used to select unknown XML nodes.

Expression	Description	Result
/bookstore/*	Selects all child elements of bookstore	All <book> elements
//*	Selects all elements in the document	All elements in the document
//title[@*]	Selects all title elements with any attribute	All <title> elements with attributes
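As a rough illustration, the wildcard selections map onto Python's standard-library xml.etree.ElementTree as follows. The sketch assumes an abridged copy of the sample document, so the element counts are smaller than for the full file. ElementTree has no `@*` test, so "any attribute" is checked in Python.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sample document (authors and years omitted).
XML = """<bookstore>
    <book category="cooking">
        <title lang="en">Everyday Italian</title><price>30.00</price>
    </book>
    <book category="children">
        <title lang="en">Harry Potter</title><price>29.99</price>
    </book>
    <book category="web">
        <title lang="en">XQuery Kick Start</title><price>49.99</price>
    </book>
    <book category="web">
        <title lang="en">Learning XML</title><price>39.95</price>
    </book>
</bookstore>"""

root = ET.fromstring(XML)

children = root.findall("*")    # /bookstore/* -> the four <book> elements
everything = list(root.iter())  # //* -> every element, root included

# //title[@*] -> titles that have at least one attribute
with_attrs = [t for t in root.iter("title") if t.attrib]

print(len(children), len(everything), len(with_attrs))
```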

Selecting Several Paths

By using the | operator in an XPath expression, you can select several paths.

Expression	Description	Result
//title | //price	Selects all title and price elements in the document	All <title> and <price> elements
/bookstore/book[1] | /bookstore/book[4]	Selects both the first and fourth book elements	The books "Everyday Italian" and "Learning XML"
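Python's standard-library xml.etree.ElementTree does not support the | operator, so the sketch below emulates the two unions by hand, assuming an abridged copy of the sample document. An XPath 1.0 union contains each node once, and engines typically return it in document order, which is why the first emulation walks the tree once instead of concatenating two result lists.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sample document (authors and years omitted).
XML = """<bookstore>
    <book category="cooking">
        <title lang="en">Everyday Italian</title><price>30.00</price>
    </book>
    <book category="children">
        <title lang="en">Harry Potter</title><price>29.99</price>
    </book>
    <book category="web">
        <title lang="en">XQuery Kick Start</title><price>49.99</price>
    </book>
    <book category="web">
        <title lang="en">Learning XML</title><price>39.95</price>
    </book>
</bookstore>"""

root = ET.fromstring(XML)

# //title | //price -> one pass over the tree keeps document order
wanted = {"title", "price"}
union = [el for el in root.iter() if el.tag in wanted]

# /bookstore/book[1] | /bookstore/book[4]
pair = [root.find("book[1]"), root.find("book[4]")]

print([el.tag for el in union])
print([book.findtext("title") for book in pair])
```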

Commonly Used XPath Functions

1. Selecting Text with `text()`

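If you want to extract the visible text inside an element (for example, the name of a product or the text on a button in an HTML page), text() helps you do just that. Strictly speaking, text() is a node test rather than a function, but it is written and used like one.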

Example: //title/text() selects the text of all <title> elements.

2. Using `contains()` to Match Partial Text or Attributes

Sometimes you want to match something even if you only know part of it—like a button that says “Sign in now” or a class name that changes slightly.

That’s where contains() comes in. It lets you match partial strings.

Example: //title[contains(text(), 'XML')] selects titles containing text 'XML'.
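The two expressions above, //title/text() and //title[contains(text(), 'XML')], can be mirrored in Python. This is a sketch using the standard-library xml.etree.ElementTree, which does not evaluate text() or contains() itself, so the text is read from each element and the substring test uses Python's `in` operator. The document is an abridged copy of the sample, reduced to titles.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sample document (titles only).
XML = """<bookstore>
    <book category="cooking"><title lang="en">Everyday Italian</title></book>
    <book category="children"><title lang="en">Harry Potter</title></book>
    <book category="web"><title lang="en">XQuery Kick Start</title></book>
    <book category="web"><title lang="en">Learning XML</title></book>
</bookstore>"""

root = ET.fromstring(XML)

# //title/text() -> the text of every <title>
titles = [t.text for t in root.iter("title")]

# //title[contains(text(), 'XML')] -> titles whose text includes 'XML'
xml_titles = [text for text in titles if "XML" in text]

print(titles)
print(xml_titles)
```

Like contains(), the `in` test is case-sensitive, so 'xml' would match nothing here.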

3. Using `position()` to Choose Elements by Order

What if there are many similar elements, and you only want the first or second one? position() lets you select elements based on their order in the document.

Example: /bookstore/book[position()=2] selects the second book.

4. Using `last()` to Get the Final Match

Want to select the last item in a list? last() returns the position of the last node in the current node set, so a [last()] predicate selects that node.

Example: /bookstore/book[last()] selects the last book.

5. Using `and`, `or`, and `not` for Logical Conditions

These logical operators let you combine or negate conditions inside a predicate.

and: Both conditions must be true
or: Either condition can be true
not(): The condition must not be true

Example: You can use //book[price>30 and price<50] to select books priced strictly between 30 and 50. This XPath returns all <book> nodes that fall within the specified range:

<book category="web">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
</book>
<book category="web">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
</book>

💡 If you need to retrieve the title of the book, try adding /title/text() at the end of this XPath.
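Here is how the same logic might look in Python, as a sketch built on the standard-library xml.etree.ElementTree and an abridged copy of the sample document. ElementTree does not evaluate `and`, `or`, or `not()` in predicates, so the conditions are written as ordinary Python expressions. The helper `price_of` is a hypothetical name introduced only for this example.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sample document (authors and years omitted).
XML = """<bookstore>
    <book category="cooking">
        <title lang="en">Everyday Italian</title><price>30.00</price>
    </book>
    <book category="children">
        <title lang="en">Harry Potter</title><price>29.99</price>
    </book>
    <book category="web">
        <title lang="en">XQuery Kick Start</title><price>49.99</price>
    </book>
    <book category="web">
        <title lang="en">Learning XML</title><price>39.95</price>
    </book>
</bookstore>"""

root = ET.fromstring(XML)
books = root.findall("book")


def price_of(book):
    """Return the book's <price> as a float (hypothetical helper)."""
    return float(book.findtext("price"))


# //book[price>30 and price<50]/title/text()
in_range = [b.findtext("title") for b in books
            if price_of(b) > 30 and price_of(b) < 50]

# //book[not(@category='web')]/title/text()
not_web = [b.findtext("title") for b in books
           if not b.get("category") == "web"]

print(in_range)
print(not_web)
```

Note that "Everyday Italian" at exactly 30.00 is excluded, because the comparison is strict.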

6. `following-sibling::` and `preceding-sibling::`

These axes help you navigate between sibling elements, that is, elements that share the same parent in the document tree.

For example, suppose your automation needs to capture the <price> element that follows a specified <title>: //title[.='Learning XML']/following-sibling::price

This finds the price that comes after the book titled Learning XML.

To go the other way—finding the <title> that comes before a given <price>—you can use: //price[.='39.95']/preceding-sibling::title

This selects the <title> of the book whose price is 39.95.
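Python's standard-library xml.etree.ElementTree has no sibling axes, but in this document a title and its price always share a <book> parent, so the same nodes can be reached by filtering on the parent instead. The sketch below assumes an abridged copy of the sample document. The rewritten paths match the sibling expressions only for documents shaped like this one.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sample document (authors and years omitted).
XML = """<bookstore>
    <book category="cooking">
        <title lang="en">Everyday Italian</title><price>30.00</price>
    </book>
    <book category="children">
        <title lang="en">Harry Potter</title><price>29.99</price>
    </book>
    <book category="web">
        <title lang="en">XQuery Kick Start</title><price>49.99</price>
    </book>
    <book category="web">
        <title lang="en">Learning XML</title><price>39.95</price>
    </book>
</bookstore>"""

root = ET.fromstring(XML)

# Stand-in for //title[.='Learning XML']/following-sibling::price
price = root.findtext(".//book[title='Learning XML']/price")

# Stand-in for //price[.='39.95']/preceding-sibling::title
title = root.findtext(".//book[price='39.95']/title")

print(price, "|", title)
```

Both predicates compare text exactly, so '39.95' would not match a price written as 39.950.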

XPath Operators

Operators enhance expression power:

Operator	Usage	Example	Result
`=`	Equals	`//book[@category='web']`	Web books
`!=`	Not equals	`//book[@category!='web']`	Non-web books
`<`, `<=`, `>`, `>=`	Numeric comparisons	`//book[price>40]`	XQuery Kick Start
`and`, `or`, `not()`	Logical	`//book[price>30 and price<50]`	The 2 books whose prices fall within the specified range

XPath Axes

Axes let you move in relation to nodes (parents, children, siblings).

Axis	Description	Example	Result
`child::`	Direct children	`child::title`	`<title>` elements only
`parent::`	Parent of node	`//price/parent::book`	The `<book>` element containing the price
`following-sibling::`	Siblings after the node	`//title/following-sibling::price`	`<price>` after title
`preceding-sibling::`	Siblings before the node	`//price/preceding-sibling::title`	`<title>` before price
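Of the axes in the table, Python's standard-library xml.etree.ElementTree supports the child and parent steps in their abbreviated forms (a bare tag name and `..`). A small sketch, assuming an abridged copy of the sample document:

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sample document (authors and years omitted).
XML = """<bookstore>
    <book category="cooking">
        <title lang="en">Everyday Italian</title><price>30.00</price>
    </book>
    <book category="children">
        <title lang="en">Harry Potter</title><price>29.99</price>
    </book>
    <book category="web">
        <title lang="en">XQuery Kick Start</title><price>49.99</price>
    </book>
    <book category="web">
        <title lang="en">Learning XML</title><price>39.95</price>
    </book>
</bookstore>"""

root = ET.fromstring(XML)
first_book = root.find("book")

# child::title, evaluated with the first <book> as the context node
child_titles = first_book.findall("title")

# //price/parent::book, written with the abbreviated parent step
parents = root.findall(".//price/..")

print([t.text for t in child_titles])
print([p.tag for p in parents])
```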

XPath Functions Cheat Sheet

Function	Description	Example
`text()`	Selects the text content of a node	`//title/text()`
`contains()`	Checks if a string contains a substring	`contains(title, 'XML')`
`starts-with()`	Checks if a string starts with a substring	`starts-with(title, 'Learning')`
`normalize-space()`	Strips leading and trailing whitespace and collapses internal runs of whitespace to a single space	`normalize-space(title)`
`string-length()`	Returns the length of a string	`string-length(title)`
`position()`	Returns the position of a node	`position()=1`
`last()`	Returns the number of nodes in the current node set, which is the position of the last node	`last()`
`not()`	Negates a condition	`not(contains(title, 'XML'))`
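If you know Python, the string functions in this cheat sheet have close counterparts there, which can make their behavior easier to remember. The sketch below is only an analogy: the padded title string and the `normalize_space` helper are made up for illustration, and Python's `split()` treats more characters as whitespace than XPath does.

```python
def normalize_space(text):
    """Approximate XPath normalize-space(): trim and collapse whitespace."""
    return " ".join(text.split())


raw_title = "  Learning   XML "  # hypothetical title with stray whitespace

clean = normalize_space(raw_title)              # normalize-space(title)
has_xml = "XML" in clean                        # contains(title, 'XML')
starts = clean.startswith("Learning")           # starts-with(title, 'Learning')
length = len(clean)                             # string-length(title)
lacks_query = not ("XQuery" in clean)           # not(contains(title, 'XQuery'))

print(repr(clean), has_xml, starts, length, lacks_query)
```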

To sum up, XPath may look intimidating at first glance, but most tasks require only a handful of expressions and functions. By understanding how paths, predicates, and common functions work, you can start writing more reliable and precise expressions. Practice with real XML or HTML documents, experiment in browser DevTools, and come back to this guide as a quick reference when you're stuck. Happy XPath-ing!
