
XPath Expressions You’ll Actually Use

Written by Sophie
Updated over a week ago

XPath (XML Path Language) is a query language for navigating and selecting nodes in XML documents. Whether you're parsing XML files, scraping web data, or writing automated tests, XPath expressions are worth knowing well. This tutorial pairs categorized expressions with function-focused examples so you can see both how paths are built and how they are filtered.

🔍 Want to try XPath expressions as you go?

This XPath tester is a handy tool that lets you run XPath queries against our sample XML document.

Sample XML Document

We'll use the following XML document for our examples:

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="children">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="web">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="web">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>


Categorized XPath Expressions

XPath expressions can be grouped based on how they select and filter nodes. Understanding these categories helps you write better queries faster.

Selecting Nodes

XPath uses path expressions to select nodes in an XML document. A node is selected by following a path, which is a sequence of steps separated by slashes.

| Expression | Description | Result |
| --- | --- | --- |
| /bookstore | Selects the root element bookstore | <bookstore>...</bookstore> |
| /bookstore/book | Selects all book elements under bookstore | All <book> elements |
| //book | Selects all book elements in the document | All <book> elements |
| bookstore//book | Selects all book elements under bookstore, at any level | All <book> elements |
| //@lang | Selects all attributes named lang | lang="en" from each <title> element |
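The path expressions above can be tried from code as well. The following is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree module, which supports only a limited subset of XPath: paths are relative to the element you query, so a leading / is not allowed, and attribute nodes cannot be returned directly. The XML string is a condensed copy of the sample document with the authors omitted, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the sample document (authors omitted for brevity)
XML = """<bookstore>
  <book category="cooking"><title lang="en">Everyday Italian</title><year>2005</year><price>30.00</price></book>
  <book category="children"><title lang="en">Harry Potter</title><year>2005</year><price>29.99</price></book>
  <book category="web"><title lang="en">XQuery Kick Start</title><year>2003</year><price>49.99</price></book>
  <book category="web"><title lang="en">Learning XML</title><year>2003</year><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(XML)  # root is the <bookstore> element

# Counterpart of /bookstore/book: direct <book> children of the root
books = root.findall("book")

# Counterpart of //book: <book> elements at any depth below the root
all_books = root.findall(".//book")

# Counterpart of //@lang: ElementTree cannot return attribute nodes,
# so select the elements that carry the attribute and read its value
langs = [title.get("lang") for title in root.findall(".//title[@lang]")]

print(root.tag)        # bookstore
print(len(books))      # 4
print(len(all_books))  # 4
print(langs)           # ['en', 'en', 'en', 'en']
```

A full XPath 1.0 engine would accept the expressions exactly as written in the table; the relative forms here are an ElementTree convention.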

Predicates

Predicates are used to find a specific node or a node that contains a specific value. Predicates are always embedded in square brackets.

| Expression | Description | Result |
| --- | --- | --- |
| /bookstore/book[1] | Selects the first book element under bookstore | First <book> element |
| /bookstore/book[last()] | Selects the last book element under bookstore | Last <book> element |
| /bookstore/book[position()<3] | Selects the first two book elements under bookstore | First two <book> elements |
| //title[@lang] | Selects all title elements with a lang attribute | All <title> elements with lang attribute |
| //title[@lang='en'] | Selects all title elements with lang attribute equal to 'en' | All <title lang="en"> elements |
| /bookstore/book[price>35.00] | Selects all book elements with price greater than 35.00 | <book> elements with price > 35.00 |
| /bookstore/book[price>35.00]/title | Selects the title of books with price greater than 35.00 | <title> elements of books with price > 35.00 |
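To see what these predicates return, here is a small sketch, again assuming Python's standard-library xml.etree.ElementTree. It understands positional predicates such as [1] and [last()] and attribute tests such as [@lang='en'], but not position()<3 or numeric comparisons, so those two rows are approximated with ordinary Python filtering. The XML is a condensed copy of the sample document (authors omitted) and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the sample document (authors omitted for brevity)
XML = """<bookstore>
  <book category="cooking"><title lang="en">Everyday Italian</title><year>2005</year><price>30.00</price></book>
  <book category="children"><title lang="en">Harry Potter</title><year>2005</year><price>29.99</price></book>
  <book category="web"><title lang="en">XQuery Kick Start</title><year>2003</year><price>49.99</price></book>
  <book category="web"><title lang="en">Learning XML</title><year>2003</year><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(XML)

# /bookstore/book[1] and /bookstore/book[last()] (positions start at 1)
first_book = root.find("book[1]")
last_book = root.find("book[last()]")

# /bookstore/book[position()<3], approximated with a list slice
first_two = [b.findtext("title") for b in root.findall("book")[:2]]

# //title[@lang='en']
english_titles = root.findall(".//title[@lang='en']")

# /bookstore/book[price>35.00]/title, approximated by converting
# each price to a number in Python and filtering
pricey = [
    b.findtext("title")
    for b in root.findall("book")
    if float(b.findtext("price")) > 35.00
]

print(first_book.findtext("title"))  # Everyday Italian
print(last_book.findtext("title"))   # Learning XML
print(first_two)                     # ['Everyday Italian', 'Harry Potter']
print(len(english_titles))           # 4
print(pricey)                        # ['XQuery Kick Start', 'Learning XML']
```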

Selecting Unknown Nodes

XPath wildcards can be used to select unknown XML nodes.

| Expression | Description | Result |
| --- | --- | --- |
| /bookstore/* | Selects all child elements of bookstore | All <book> elements |
| //* | Selects all elements in the document | All elements in the document |
| //title[@*] | Selects all title elements with any attribute | All <title> elements with attributes |

Selecting Several Paths

By using the | operator in an XPath expression, you can select several paths.

Expression

Description

Result

//title|//price

Selects all title and price elements in the document

All <title> and <price> elements

/bookstore/book[1] | /bookstore/book[4]

Selects both the first and fourth book elements

The "Everyday Italian" and "Learning XML" <book> elements


Commonly Used XPath Functions

1. Selecting Text with text()

If you want to extract the text inside an element (for example, the name of a product or the label on a button), text() does just that. Strictly speaking, text() is a node test rather than a function: it selects the text nodes that are direct children of an element.

  • Example: //title/text() selects the text of all <title> elements.

2. Using contains() to Match Partial Text or Attributes

Sometimes you want to match something even if you only know part of it—like a button that says “Sign in now” or a class name that changes slightly.

That’s where contains() comes in. It lets you match partial strings.

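  • Example: //title[contains(text(), 'XML')] selects the <title> elements whose text contains 'XML'. In the sample document that is only Learning XML, because the match is case-sensitive.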
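The text() and contains() examples can be mirrored in code. This is a rough sketch, assuming Python's standard-library xml.etree.ElementTree, which supports neither text() nor contains() in its path syntax, so the element's .text attribute and Python's in operator stand in for them. The XML is a condensed copy of the sample document and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the sample document (titles only)
XML = """<bookstore>
  <book category="cooking"><title lang="en">Everyday Italian</title></book>
  <book category="children"><title lang="en">Harry Potter</title></book>
  <book category="web"><title lang="en">XQuery Kick Start</title></book>
  <book category="web"><title lang="en">Learning XML</title></book>
</bookstore>"""

root = ET.fromstring(XML)

# Counterpart of //title/text(): read the text of every <title>
titles = [title.text for title in root.iter("title")]

# Counterpart of //title[contains(text(), 'XML')]:
# a case-sensitive substring test, like contains() itself
xml_titles = [t for t in titles if "XML" in t]

print(titles)      # ['Everyday Italian', 'Harry Potter', 'XQuery Kick Start', 'Learning XML']
print(xml_titles)  # ['Learning XML']
```

Note that XQuery Kick Start is not matched: it contains "XQ", not the substring "XML".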

3. Using position() to Choose Elements by Order

What if there are many similar elements, and you only want the first or second one? position() lets you select elements by their order among the nodes matched at that step. Counting starts at 1, not 0.

  • Example: /bookstore/book[position()=2] selects the second book.

4. Using last() to Get the Final Match

Want to select the last item in a list? last() returns the number of nodes in the current context, so the predicate [last()] matches the final node.

  • Example: /bookstore/book[last()] selects the last book.

5. Using and, or, and not for Logical Conditions

These logical operators let you combine or negate conditions inside a predicate.

  • and: Both conditions must be true

  • or: Either condition can be true

  • not(): The condition must not be true

  • Example: You can use //book[price>30 and price<50] to select books priced above 30 and below 50. Both comparisons are strict, so Everyday Italian (30.00) is excluded. This XPath returns the <book> nodes that fall within the range:

<book category="web">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="web">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>

💡 If you need to retrieve the title of the book, try adding /title/text() at the end of this XPath.
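The same range check can be written out in code to confirm which books match. This is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree, which does not support and, or, not(), or numeric comparisons in its path syntax; the conditions are therefore expressed as plain Python. The XML is a condensed copy of the sample document and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the sample document (authors omitted for brevity)
XML = """<bookstore>
  <book category="cooking"><title lang="en">Everyday Italian</title><year>2005</year><price>30.00</price></book>
  <book category="children"><title lang="en">Harry Potter</title><year>2005</year><price>29.99</price></book>
  <book category="web"><title lang="en">XQuery Kick Start</title><year>2003</year><price>49.99</price></book>
  <book category="web"><title lang="en">Learning XML</title><year>2003</year><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(XML)
books = root.findall(".//book")


def price_of(book):
    """Return the book's <price> as a float."""
    return float(book.findtext("price"))


# //book[price>30 and price<50]/title/text()
in_range = [b.findtext("title") for b in books if price_of(b) > 30 and price_of(b) < 50]

# //book[price<30 or price>45]/title/text()
cheap_or_pricey = [b.findtext("title") for b in books if price_of(b) < 30 or price_of(b) > 45]

# //book[not(@category='web')]/title/text()
not_web = [b.findtext("title") for b in books if not b.get("category") == "web"]

print(in_range)         # ['XQuery Kick Start', 'Learning XML']
print(cheap_or_pricey)  # ['Harry Potter', 'XQuery Kick Start']
print(not_web)          # ['Everyday Italian', 'Harry Potter']
```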

6. following-sibling:: and preceding-sibling::

These axes let you move between sibling elements, that is, elements that share the same parent in the document tree.

For example, suppose your automation needs to capture the <price> element that follows a specified <title>: //title[.='Learning XML']/following-sibling::price

This finds the price that comes after the book titled Learning XML.

To go the other way—finding the <title> that comes before a given <price>—you can use: //price[.='39.95']/preceding-sibling::title

This selects the <title> of the book whose price is 39.95.
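To make the sibling relationship concrete, here is a sketch that reproduces both lookups by hand. It assumes Python's standard-library xml.etree.ElementTree, which has no sibling axes, so two hypothetical helpers (following_siblings and preceding_siblings) slice the parent's list of children instead. It also assumes <title> and <price> are direct children of <book>, as in the sample. The XML is a two-book excerpt of the sample document.

```python
import xml.etree.ElementTree as ET

# Two-book excerpt of the sample document
XML = """<bookstore>
  <book category="children">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="web">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>"""

root = ET.fromstring(XML)


def following_siblings(parent, node):
    """Return the children of parent that come after node."""
    children = list(parent)
    return children[children.index(node) + 1:]


def preceding_siblings(parent, node):
    """Return the children of parent that come before node."""
    children = list(parent)
    return children[:children.index(node)]


# //title[.='Learning XML']/following-sibling::price
prices = [
    sib.text
    for book in root.iter("book")
    for title in book.findall("title")
    if title.text == "Learning XML"
    for sib in following_siblings(book, title)
    if sib.tag == "price"
]

# //price[.='39.95']/preceding-sibling::title
titles = [
    sib.text
    for book in root.iter("book")
    for price in book.findall("price")
    if price.text == "39.95"
    for sib in preceding_siblings(book, price)
    if sib.tag == "title"
]

print(prices)  # ['39.95']
print(titles)  # ['Learning XML']
```

The <author> and <year> elements between the title and the price are siblings too; the axis covers all of them, and the trailing name test (price or title) is what narrows the result.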


XPath Operators

Operators let you compare values and combine conditions inside predicates:

| Operator | Usage | Example | Result |
| --- | --- | --- | --- |
| = | Equals | //book[@category='web'] | Web books |
| != | Not equals | //book[@category!='web'] | Non-web books |
| <, <=, >, >= | Numeric comparisons | //book[price>40] | XQuery Kick Start |
| and, or, not() | Logical | //book[price>30 and price<50] | The 2 books priced above 30 and below 50 |


XPath Axes

Axes let you move in relation to nodes (parents, children, siblings).

| Axis | Description | Example | Result |
| --- | --- | --- | --- |
| child:: | Direct children | child::title | <title> children of the current node |
| parent:: | Parent of node | //price/parent::book | The <book> element containing the price |
| following-sibling:: | All siblings after the node | //title/following-sibling::price | <price> after title |
| preceding-sibling:: | All siblings before the node | //price/preceding-sibling::title | <title> before price |
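The child and parent axes have direct counterparts in Python's standard-library xml.etree.ElementTree, which this sketch assumes: a bare tag name selects children, and .. selects the parent. The sibling axes are not supported there. The XML is a condensed copy of the sample document and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the sample document (authors omitted for brevity)
XML = """<bookstore>
  <book category="cooking"><title lang="en">Everyday Italian</title><year>2005</year><price>30.00</price></book>
  <book category="children"><title lang="en">Harry Potter</title><year>2005</year><price>29.99</price></book>
  <book category="web"><title lang="en">XQuery Kick Start</title><year>2003</year><price>49.99</price></book>
  <book category="web"><title lang="en">Learning XML</title><year>2003</year><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(XML)

# child::title, evaluated with the first <book> as the current node
first_book = root.find("book")
child_titles = [t.text for t in first_book.findall("title")]

# //price/parent::book, written with the ".." step that ElementTree supports
parents = root.findall(".//price/..")
parent_categories = [p.get("category") for p in parents]

print(child_titles)       # ['Everyday Italian']
print(parent_categories)  # ['cooking', 'children', 'web', 'web']
```

The result of child::title depends entirely on the current node, which is why the sketch picks a specific <book> before evaluating it.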


XPath Functions Cheat Sheet

| Function | Description | Example |
| --- | --- | --- |
| text() | Selects the text nodes of an element | //title/text() |
| contains() | Checks if a string contains a substring | contains(title, 'XML') |
| starts-with() | Checks if a string starts with a substring | starts-with(title, 'Learning') |
| normalize-space() | Removes leading and trailing whitespace and collapses internal runs of whitespace into a single space | normalize-space(title) |
| string-length() | Returns the length of a string | string-length(title) |
| position() | Returns the position of a node within the current context | position()=1 |
| last() | Returns the number of nodes in the current context, which is also the position of the last node | last() |
| not() | Negates a condition | not(contains(title, 'XML')) |
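If it helps to reason about the string functions in a familiar language, the sketch below shows rough Python equivalents. These are analogies for building intuition, not an XPath engine. normalize_space is a hypothetical helper written for this example, and the padded title string is made-up input.

```python
import re

# XPath treats space, tab, carriage return, and newline as whitespace
XML_WHITESPACE = " \t\r\n"


def normalize_space(value):
    """Mimic normalize-space(): trim the ends, collapse inner whitespace runs."""
    return re.sub(r"[ \t\r\n]+", " ", value.strip(XML_WHITESPACE))


raw_title = "  Learning   XML \n"  # made-up input with untidy whitespace
title = normalize_space(raw_title)

print(title)                         # Learning XML
print("XML" in title)                # contains(title, 'XML')         -> True
print(title.startswith("Learning"))  # starts-with(title, 'Learning') -> True
print(len(title))                    # string-length(title)           -> 12
print(not ("XML" in title))          # not(contains(title, 'XML'))    -> False
```

Wrapping a value in normalize-space() before comparing it is a common way to make matches tolerant of stray indentation and line breaks in the source document.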


To sum up, XPath may look intimidating at first glance, but most tasks require only a handful of expressions and functions. By understanding how paths, predicates, and common functions work, you can start writing more reliable and precise expressions. Practice with real XML or HTML documents, experiment in browser DevTools, and come back to this guide as a quick reference when you're stuck. Happy XPath-ing!
