Directory traversal

Unintended disclosure of sensitive files

Directory traversal: the basics

What is directory traversal?

A directory traversal attack aims to access files and directories that are stored outside the intended folder. By manipulating file references with "dot-dot-slash (../)" sequences and their variations, or by using absolute file paths, an attacker may be able to access arbitrary files and directories stored on the filesystem, including application source code, configuration, and other critical system files.

About this lesson

In this lesson, you will learn how directory traversal works and how to mitigate it in your application. You will first use a directory traversal attack to hack a vulnerable web server. We will then explain directory traversal by showing you the backend code of that vulnerable server. Finally, we will teach you how to prevent directory traversal from affecting your code.

Ready to learn? Buckle your seat belts, put on your hacker's hat, and let's get started!

Fun Fact

Valar Morghulis - The faceless vulnerability

The directory traversal vulnerability wears many faces. Some people also call it path traversal, path manipulation, dot-dot-slash, directory climbing, or the backtracking vulnerability. All of these are actually the same vulnerability.

Directory traversal in action

Try now

Hacking a to-do app

To increase revenue and survive until the next funding round, a company called startup.io decided to create a side product. Since the market for image hosting platforms has recently become a bit saturated, the firm made a call to build an app for managing to-do lists instead.

Sadly, their to-do app is vulnerable to a directory traversal attack. Let's use a terminal window and curl to exploit the vulnerability. Our goal is to view the /etc/passwd file stored on the backend server.

cURL the about page

The application is hosted on https://todoapp.startup.io. First, let's try to curl a page we should have access to by running the following in the terminal:

curl https://todoapp.startup.io/public/about.html

We see the about.html page returned, which is to be expected. Notice that this HTML page is being served from the public directory.

Directory traversal under the hood

How does directory traversal work?

Essentially, the attack is accomplished by adding sequences such as ../ to a URL that serves content from a directory structure. The content is usually served from a base directory, such as /public. An attacker can supply filenames that contain ../ or a URL-encoded equivalent such as %2e%2e%2f. Because each ../ moves one level up the directory tree, these URLs allow the attacker to break out of the base directory and view files stored in other folders on the filesystem.
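To make the mechanics concrete, here is a minimal Go sketch of how a naive server might turn a requested filename into a filesystem path. The function name resolveRequest and the base directory /opt/todoapp/public are assumptions made for illustration; they are not taken from the to-do app's actual code. The paths in the comments assume a Unix-like system.

```go
package main

import (
	"fmt"
	"net/url"
	"path/filepath"
)

// resolveRequest is a hypothetical helper: it URL-decodes the requested
// name and joins it onto the base directory without any validation.
func resolveRequest(base, requested string) (string, error) {
	decoded, err := url.PathUnescape(requested) // turns %2e%2e%2f into ../
	if err != nil {
		return "", err
	}
	// filepath.Join cleans the result, so every ../ climbs one directory.
	return filepath.Join(base, decoded), nil
}

func main() {
	base := "/opt/todoapp/public" // assumed base directory
	requests := []string{
		"about.html",                            // stays inside the base directory
		"../../../etc/passwd",                   // plain dot-dot-slash payload
		"%2e%2e%2f%2e%2e%2f%2e%2e%2fetc/passwd", // URL-encoded variant
	}
	for _, name := range requests {
		resolved, err := resolveRequest(base, name)
		if err != nil {
			fmt.Println("bad request:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", name, resolved)
	}
}
```

Both payloads resolve to the same file outside the base directory, which is why filtering only the literal string ../ is not a sufficient defense.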

A directory traversal attack which shows a hacker sending malicious payload to a server and accessing files which shouldn't be publicly accessible

To illustrate this, let's jump into the code. Consider a function that constructs a filesystem path from the URL. All files and directories returned by the function are served statically by the web server.

os.Getwd() returns the current working directory of the process.
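The lesson's original listing is not reproduced here, so the following is only a sketch of what such a vulnerable function might look like. It assumes the filename arrives in a query string parameter named logo and that images live in an images directory under the working directory; unsafeLogoPath and logoHandler are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"path/filepath"
)

// unsafeLogoPath joins the user-supplied name onto the images directory
// and never checks where the resulting path ends up.
func unsafeLogoPath(workDir, logo string) string {
	return filepath.Join(workDir, "images", logo)
}

// logoHandler is a hypothetical handler that serves whatever file the
// "logo" query parameter points to.
func logoHandler(w http.ResponseWriter, r *http.Request) {
	workDir, err := os.Getwd() // current working directory of the process
	if err != nil {
		http.Error(w, "internal error", http.StatusInternalServerError)
		return
	}
	http.ServeFile(w, r, unsafeLogoPath(workDir, r.URL.Query().Get("logo")))
}

func main() {
	// Demonstrate the path construction without starting a server.
	fmt.Println(unsafeLogoPath("/opt/todoapp", "logo.png"))
	fmt.Println(unsafeLogoPath("/opt/todoapp", "../../../etc/passwd"))
}
```

With an assumed working directory of /opt/todoapp, the second call yields a path outside the images directory, because the three ../ sequences cancel out images, todoapp, and opt in turn.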

Directory traversal mitigation

Permit only safe filesystem paths

Go’s filepath.Join() joins the given path components into a single path and cleans the result, resolving any .. elements along the way. To keep requests inside the intended directory, we should check that the directory of the logo file path is the same as the expected base filesystem path where the image files reside.

Let’s walk through the remediated approach. First, we generate the directory component for the base path separately and assign it to the basePath variable. We then generate the absolute file path for the logo by combining basePath with the query string parameter for the logo filename. To find out which directory the logo is in, we use the filepath.Dir() function, which returns the directory portion of the logo file path without the filename on the end. We can then compare this directory with the base path and reject any request where the two differ. Because filepath.Join() cleans its result, an injected path is reduced to its shortest equivalent form, with every ../ sequence resolved, once it is joined with the base path.

If our app’s base path is “/opt/todoapp/images” and the attack payload of “../../../etc/passwd” is injected into the logo filename query string parameter, it will result in an error because the directory “/etc/” is outside the base path directory of “/opt/todoapp/images”.
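The check described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the lesson's exact remediated code: the function name safeLogoPath, the error value errOutsideBase, and the images subdirectory are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// errOutsideBase is a hypothetical error returned for rejected requests.
var errOutsideBase = errors.New("requested file is outside the base path")

// safeLogoPath builds the logo path and accepts it only if the file sits
// directly inside the base directory.
func safeLogoPath(workDir, logo string) (string, error) {
	basePath := filepath.Join(workDir, "images")
	logoPath := filepath.Join(basePath, logo) // cleaned: ../ is resolved here
	if filepath.Dir(logoPath) != basePath {
		return "", errOutsideBase
	}
	return logoPath, nil
}

func main() {
	for _, name := range []string{"logo.png", "../../../etc/passwd"} {
		p, err := safeLogoPath("/opt/todoapp", name)
		if err != nil {
			fmt.Printf("%s rejected: %v\n", name, err)
			continue
		}
		fmt.Printf("%s accepted: %s\n", name, p)
	}
}
```

Note two consequences of this design. Comparing directories for equality also rejects files in subdirectories of the base path, which is fine for a flat image folder but would need adjusting otherwise. The comparison is also purely lexical: it does not follow symbolic links, so an application that allows symlinks inside the base directory may additionally want to resolve them with filepath.EvalSymlinks() before comparing.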

Fun Fact

Zip Slip - A more dangerous cousin

As presented in this lesson, directory traversal is a read-only vulnerability: it allows the attacker to read sensitive files. However, there is a more dangerous cousin in the directory traversal family tree. That cousin is called Zip Slip, and it allows the attacker to execute commands by overwriting files on a remote server. Sounds scary? It is! Check out more about Zip Slip on Snyk's Zip Slip research page.

Congratulations

You’ve learned what directory traversal is and how to protect your systems from it. We hope you will apply your new knowledge wisely and make your code much safer. Feel free to rate how valuable this lesson was for you and provide feedback to make it even better! Also, make sure to check out our lessons on other common vulnerabilities.