What is an XPath injection? | Tutorial & examples

XPath injection: the basics

What is XPath injection?

XPath is a query language for XML documents. It was designed to simplify selecting nodes within the document structure.

XPath injection is a type of attack that changes the intent of an XPath query executed on an application’s backend. An application may be vulnerable when user-supplied input is concatenated, without filtering, with other strings to construct an XPath query that is executed against an XML document. An attacker can then inject special characters into that input to alter the query.

The impact of this attack can include bypassing authentication logic or disclosing sensitive data within the XML document being queried.

About this lesson

In this lesson, you will learn how XPath injection works and how to protect your applications against it. We will begin by exploiting an XPath injection vulnerability in a simple application. Then we will analyze the vulnerable code and explore some options for remediation and prevention.

XPath injection in action

The Red Hills County Softball League has a web application built for its members and fans to view information about upcoming games, match results, and teams. But these small web applications aren't always built with security in mind.

Testing for XPath injection

Setting the stage

Mallory had a falling out with her former Bears teammates. The Red Hills County Softball League app is a great target to practice her hacking skills.

XPath injection under the hood

How does XPath injection work?

Firstly, let's recap what took place in the interactive example above:

Certain special characters, such as a single quote, injected into the user-supplied query-string parameter “team” caused an error condition in the application.
The error messages presented by the application suggested that the server-side code was likely using XPath queries to retrieve the data displayed to the user.
A combination of special characters, including a null byte, was found that changed the scope of the XPath query while keeping its syntax valid.
As a result of the injection attack, the application retrieved more data than it was supposed to display and disclosed it to the attacker.

We will have a look at this vulnerable application in more detail by going through the server-side code.

In our example attack, the hacker injected the string: Bears']%00

The null byte (%00) terminated the string representing the XPath query, so the rest of the string concatenated after the user input was ignored, and the XPath query effectively became the following string: /teams/team[name='Bears']<null character>

The query returned all child nodes of the <team> node where the <name> node’s value was Bears, including nodes such as the team members’ date of birth, address, and email, which were supposed to be private and not displayed to the application user.
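The string manipulation described above can be sketched as follows. This is a minimal illustration, not the application's actual code: the function name buildTeamQuery is hypothetical, the query suffix is taken from the remediation example later in this lesson, and cutting the string at the null character only models the truncation behavior described here.

```javascript
// Hypothetical reconstruction of the vulnerable query building:
// user input is concatenated straight into the XPath expression.
function buildTeamQuery(teamName) {
  return "/teams/team[name='" + teamName + "']/members/member/name/text()";
}

// The web framework URL-decodes the query-string value, so %00 becomes
// an actual null character at the end of the input.
const injected = decodeURIComponent("Bears']%00");
const fullQuery = buildTeamQuery(injected);

// Assumption: the XPath processing treats the null character as the end
// of the string. We model that by keeping only the part before it.
const effectiveQuery = fullQuery.split('\0')[0];

console.log(buildTeamQuery('Bears')); // the intended query
console.log(effectiveQuery);          // the query after injection
```

The intended query selects only member names, while the truncated query selects the whole matching team node with all of its children.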

Impacts of XPath injection

By exploiting XPath injection, a malicious actor could disclose sensitive data within the XML document being queried. For example, in the vulnerable application we looked at above, the personal information of team members was leaked, resulting in a violation of privacy for those individuals.

If the application uses the XML data for any security-related decisions, such as a database of usernames and passwords to authenticate users against, then authentication could be bypassed.
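As a rough sketch of how such a bypass could arise, consider a login check that builds its query by concatenation. The document layout (/users/user with username and password children) and the function name buildLoginQuery are assumptions made for illustration and do not come from the application in this lesson.

```javascript
// Hypothetical login query built by string concatenation.
function buildLoginQuery(username, password) {
  return "/users/user[username='" + username + "' and password='" + password + "']";
}

// A legitimate login attempt produces the intended predicate.
const normalQuery = buildLoginQuery('mallory', 's3cret');

// Injecting a quote followed by an always-true comparison rewrites the
// predicate. Because "and" binds more tightly than "or" in XPath, the
// trailing "or '1'='1'" makes the predicate true for every user node.
const payload = "' or '1'='1";
const bypassQuery = buildLoginQuery(payload, payload);

console.log(normalQuery);
console.log(bypassQuery);
```

If the application treats any non-empty result as a successful login, the second query would authenticate the attacker without valid credentials.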

Disclosure of the contents of other files on the application server’s file system may also be possible depending on the XPath library in use and its configuration. The doc() and doc-available() XPath functions, when implemented in the XPath library, can allow the reading of files on the local filesystem.

XPath injection mitigation

Use an allowlist

Restricting the user-supplied input used to construct the XPath query to known safe characters allows the query to be constructed securely.
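One way to apply an allowlist is to validate the input against a strict pattern before it reaches the query. The sketch below assumes that team names consist only of letters, digits, spaces, and hyphens, and are at most 40 characters long. Both the pattern and the function name buildTeamQuerySafely are illustrative choices, not part of the original application.

```javascript
// Assumption: valid team names use only these characters (1 to 40 of them).
const TEAM_NAME_PATTERN = /^[A-Za-z0-9 -]{1,40}$/;

function buildTeamQuerySafely(teamName) {
  // Reject anything outside the allowlist instead of trying to repair it.
  if (typeof teamName !== 'string' || !TEAM_NAME_PATTERN.test(teamName)) {
    throw new Error('Invalid team name');
  }
  // Only allowlisted characters can reach the concatenation below.
  return "/teams/team[name='" + teamName + "']/members/member/name/text()";
}

console.log(buildTeamQuerySafely('Bears'));

try {
  buildTeamQuerySafely("Bears']\0");
} catch (err) {
  console.log(err.message); // the injected quote, bracket, and null byte are rejected
}
```

Rejecting invalid input outright keeps the rule easy to reason about: quotes, brackets, and null characters never become part of the query.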

Encode user input

By encoding special characters injected into the user-supplied input, such as the single quote to its XML entity representation, we can avoid a situation where malicious user input breaks out of the intended XPath query syntax:

import {encode} from 'html-entities';
//…
const nodes = xpath.select("/teams/team[name='" + encode(teamName) + "']/members/member/name/text()", doc);

In this modified application code, the malicious user input demonstrated in the attack above would be encoded so that it is: Bears&apos;]

And the resulting constructed XPath query would be: /teams/team[name='Bears&apos;]']/members/member/name/text()
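To make the effect of the encoding step visible without installing anything, the sketch below uses a small stand-in for the library's encode function. The helper encodeXmlSpecialChars is hypothetical and handles only the five XML special characters, so treat it as an illustration rather than a replacement for a maintained library.

```javascript
// Hypothetical stand-in for an entity encoder: maps the five XML
// special characters to their entity representations.
const XML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&apos;',
};

function encodeXmlSpecialChars(value) {
  return value.replace(/[&<>"']/g, (ch) => XML_ENTITIES[ch]);
}

// The single quote from the attack string no longer appears as a quote.
const encoded = encodeXmlSpecialChars("Bears']");
const query = "/teams/team[name='" + encoded + "']/members/member/name/text()";

console.log(encoded);
console.log(query);
```

Because the injected single quote is replaced before concatenation, it can no longer close the string literal that the application opened, and the input stays inside the quoted value.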

Parameterized XPath queries

A better option is to use parameterized XPath queries. However, this depends on the specific XPath library or API the application uses, as some libraries do not implement query parameterization. As with parameterized SQL queries, the user input is passed into the query as a variable, so any special characters in that input either cause the query to fail or are automatically escaped, and they cannot change the syntax of the query. A parameterized XPath query may look similar to the following:

//team[name = $teamname]

Here, $teamname is supplied as an argument to the parameterization method call.

Static analysis tool

Adding a static application security testing (SAST) tool to your DevOps pipeline as an additional line of defense is an excellent way to catch vulnerabilities before they make it to production. There are many, but Snyk Code is our personal favorite, as it scans in real-time, provides actionable remediation advice, and is available from your favorite IDE.
