XPath injection

Construct XPath queries to guard against malicious input

XPath injection: the basics

What is XPath injection?

XPath is a query language for XML documents. It was designed to simplify selecting nodes within the document structure.

XPath injection is an attack that changes the intent of an XPath query executed on an application’s backend. An application may be vulnerable when user-supplied input is concatenated, without filtering, with other strings to construct an XPath query that is then executed against an XML document. An attacker can place special characters in that input to alter the query.

Impacts of this attack can include bypassing authentication logic, or the disclosure of sensitive data within the XML document being queried.

About this lesson

In this lesson, you will learn how XPath injection works and how to protect your applications against it. We will begin by exploiting an XPath injection vulnerability in a simple application. Then we will analyze the vulnerable code and explore some options for remediation and prevention.

FUN FACT

Blind XPath injection

Although XPath injection is not a commonly reported class of vulnerability, some high-profile applications have been affected by it. One is the popular open-source e-commerce application Adobe Magento, which was affected by a blind XPath injection vulnerability that was discovered and fixed in 2021. Its severity was deemed critical because it could lead to arbitrary code execution.

XPath injection in action

The Red Hills County Softball League has a web application built for its members and fans to view information about upcoming games, match results, and teams. But these small web applications aren't always built with security in mind.

Testing for XPath injection

Setting the stage

Mallory had a falling out with her former Bears teammates. The Red Hills County Softball League app is a great target to practice her hacking skills.

XPath injection under the hood

How does XPath injection work?

Firstly, let's recap what took place in the interactive example above:

  • Certain special characters, like a single quote, that were injected in the user-supplied query-string parameter “team” caused an error condition in the application
  • The error messages presented by the application suggested that the server-side code was likely using XPath queries to retrieve data that was displayed to the user
  • A combination of special characters was found, including injecting a null byte, which manipulated the XPath query in a way that changed the extent of the query but still resulted in valid XPath syntax
  • As a result of the injection attack, the application retrieved more data than was supposed to be displayed and disclosed it to the attacker

We will have a look at this vulnerable application in more detail by going through the server-side code.

The team parameter in the query string is read from the GET request and stored in the teamName variable without any input validation or encoding.

In our example attack, the hacker injected the string: Bears']%00

The null byte (%00) terminated the string representing the XPath query, so the rest of the string concatenated after the user input was ignored. The XPath query effectively became: /teams/team[name='Bears']<null character>

The query returned all child nodes of the <team> node where the <name> node’s value was Bears, including nodes such as the team members’ date of birth, address, and email, which were supposed to be private and not displayed to the application user.
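The steps above can be reproduced with plain string handling. The following is a minimal sketch, not the lesson application's actual code: `buildTeamQuery` is a hypothetical helper whose query string is taken from the mitigation example later in this lesson, and splitting at the null character only simulates a library that treats the null byte as a string terminator.

```javascript
// Hypothetical helper: builds the query by naive string concatenation.
function buildTeamQuery(teamName) {
  return "/teams/team[name='" + teamName + "']/members/member/name/text()";
}

// Intended use: only the member names of one team are selected.
const benignQuery = buildTeamQuery("Bears");

// The attack value as it arrives after URL decoding (%00 becomes U+0000).
const attackValue = decodeURIComponent("Bears']%00");
const injectedQuery = buildTeamQuery(attackValue);

// Simulation only: a parser that stops at the first null character
// sees just the part of the query before it.
const effectiveQuery = injectedQuery.split("\u0000")[0];

console.log(benignQuery);    // /teams/team[name='Bears']/members/member/name/text()
console.log(effectiveQuery); // /teams/team[name='Bears']
```

The effective query no longer ends in /members/member/name/text(), which is why the whole team node is returned instead of only the member names.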

Impacts of XPath injection

By exploiting XPath injection, a malicious actor could disclose sensitive data within the XML document being queried. For example, in the vulnerable application we looked at above, the personal information of team members was leaked, resulting in a violation of privacy for those individuals.

If the application uses the XML data for any security-related decisions, such as a database of usernames and passwords to authenticate users against, then authentication could be bypassed.
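As an illustration of the authentication case, the sketch below shows how a login check built by concatenation can be turned into an always-true predicate. The /users/user document layout, the `buildLoginQuery` helper, and the payload are assumptions made for this example; they are not part of the softball application.

```javascript
// Hypothetical login query built by concatenation (vulnerable pattern).
function buildLoginQuery(username, password) {
  return "/users/user[username='" + username + "' and password='" + password + "']";
}

// Tautology payload supplied in both the username and password fields.
const tautology = "' or '1'='1";
const loginQuery = buildLoginQuery(tautology, tautology);

console.log(loginQuery);
// /users/user[username='' or '1'='1' and password='' or '1'='1']
```

Because `and` binds more tightly than `or` in XPath, the trailing `or '1'='1'` makes the predicate true for every user node, so a check that only tests whether the query returned a node would succeed without valid credentials.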

Depending on the XPath library in use and its configuration, disclosure of the contents of other files on the application server’s file system may also be possible. The doc() and doc-available() functions, which were introduced in XPath 2.0, can allow files on the local filesystem to be read or probed for when the library implements them.

XPath injection mitigation

Use an allowlist

Restricting the user-supplied input used to construct the XPath query to known safe characters allows the query to be constructed securely.
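A minimal sketch of an allowlist check follows. The permitted character set and length limit are assumptions chosen for illustration, and `buildTeamQueryWithAllowlist` is a hypothetical helper; a real application would pick rules that fit its own team names.

```javascript
// Assumed policy: team names are 1-50 ASCII letters, digits, or spaces.
const SAFE_TEAM_NAME = /^[A-Za-z0-9 ]{1,50}$/;

// Hypothetical helper: rejects anything outside the allowlist before
// the value is placed into the query string.
function buildTeamQueryWithAllowlist(teamName) {
  if (typeof teamName !== "string" || !SAFE_TEAM_NAME.test(teamName)) {
    throw new Error("Invalid team name");
  }
  return "/teams/team[name='" + teamName + "']/members/member/name/text()";
}
```

With this check in place, a value such as Bears']%00 is rejected before any query is built, because the quote, bracket, and null byte are not in the allowed set.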

Encode user input

By encoding special characters injected into the user-supplied input, such as the single quote to its XML entity representation, we can avoid a situation where malicious user input breaks out of the intended XPath query syntax:

import {encode} from 'html-entities';
//…
// Encode XML special characters (including the single quote) before concatenating.
const nodes = xpath.select("/teams/team[name='" + encode(teamName) + "']/members/member/name/text()", doc);

With this modified application code, the malicious user input from the attack shown above would be encoded as: Bears&apos;]

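The resulting XPath query would be: /teams/team[name='Bears&apos;]']/members/member/name/text()

The encoded quote can no longer close the string literal, so the predicate compares the team name against the literal text Bears&apos;], which should match no team.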
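Entity encoding is one way to stop a quote from closing the string literal. Another commonly used approach, shown here as a minimal sketch that is not part of the lesson's application code, is to quote the value as an XPath 1.0 string literal. XPath 1.0 has no escape sequence for quotes inside a literal, so the hypothetical helper `toXPathLiteral` picks the quote character the value does not contain and falls back to concat() when it contains both.

```javascript
// Hypothetical helper: quote a value as an XPath 1.0 string literal.
function toXPathLiteral(value) {
  if (!value.includes("'")) return "'" + value + "'";
  if (!value.includes('"')) return '"' + value + '"';
  // Both quote types are present: wrap each piece in single quotes
  // and join the pieces with a double-quoted single quote via concat().
  const parts = value.split("'").map((part) => "'" + part + "'");
  return "concat(" + parts.join(", \"'\", ") + ")";
}

// Hypothetical query builder using the helper.
function buildQuotedTeamQuery(teamName) {
  return "/teams/team[name=" + toXPathLiteral(teamName) + "]/members/member/name/text()";
}

console.log(buildQuotedTeamQuery("Bears']"));
// /teams/team[name="Bears']"]/members/member/name/text()
```

This only handles quoting. It does not remove control characters such as the null byte, so it is best combined with an allowlist check.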

Parameterized XPath queries

A better option is to use parameterized XPath queries. However, this depends on the specific XPath library or API the application uses, as some libraries do not implement query parameterization. As with parameterized SQL queries, the user input is passed to the query as a variable, so any special characters in that input either cause the query to fail or are automatically escaped, and they cannot change the syntax of the query. A parameterized XPath query may look similar to the following:

//team[name = $teamname]

Here, $teamname is supplied as an argument to the parameterization method call.
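A sketch of what this can look like in JavaScript follows. It assumes a library object exposing parse() and an evaluator with select({ node, variables }), which is the interface the xpath npm package documents; other libraries differ, so treat the call shapes as an assumption. The `selectMemberNames` helper is hypothetical, and the library is passed in as a parameter so the snippet stays self-contained.

```javascript
// `xpathLib` is assumed to expose parse(), as the `xpath` npm package does.
function selectMemberNames(xpathLib, doc, teamName) {
  // The expression is a constant: user input never becomes part of it.
  const evaluator = xpathLib.parse(
    "/teams/team[name = $teamname]/members/member/name/text()"
  );
  // The user-supplied value is bound to the $teamname variable instead.
  return evaluator.select({ node: doc, variables: { teamname: teamName } });
}

// Usage, assuming `xpath` is imported and `doc` is a parsed XML document:
//   const nodes = selectMemberNames(xpath, doc, teamName);
```

Because the value is bound as data rather than spliced into the expression text, quotes or brackets in it are compared literally and cannot alter the structure of the query.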

Static analysis tool

Adding a static application security testing (SAST) tool to your DevOps pipeline as an additional line of defense is an excellent way to catch vulnerabilities before they make it to production. There are many, but Snyk Code is our personal favorite, as it scans in real-time, provides actionable remediation advice, and is available from your favorite IDE.

Quiz

Test your knowledge!

Which of the following techniques effectively prevents XPath injection attacks in JavaScript by encoding special characters in user input before passing it to an XPath expression?

Congratulations

You have taken your first step into learning what XPath injection is, how it works, what the impacts are, and how to protect your own applications. We hope that you will apply this knowledge to make your applications safer.
