What is directory traversal?

Resolving paths

We can use PHP's realpath() function to resolve the requested (relative) path and return the absolute path of the file.

By defining a base path (or an array of allowable paths) in our program, we can compare them to the output of realpath and decide whether a request should be permitted or not. If the prefix paths do not match up, then our user could be attempting to access files they’re not supposed to!

Now images are served only if they reside within /var/www/html/site.com/images/, and the request cannot lead to code injection because we use readfile() instead of include(). If the requested file doesn’t exist, or its path doesn’t begin with the one we specify, then this code will return a default placeholder.
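The same resolve-then-compare idea can be sketched in Python. This is a minimal illustration rather than the lesson's original code: the function name resolve_image and its parameters are hypothetical, and os.path.realpath stands in for PHP's realpath().

```python
import os


def resolve_image(base_dir: str, requested: str, placeholder: str) -> str:
    """Return the real path of `requested` inside `base_dir`, else `placeholder`."""
    base = os.path.realpath(base_dir)
    # realpath collapses "../" segments and follows symlinks.
    candidate = os.path.realpath(os.path.join(base, requested))
    # Compare against base + separator so "/images-old" never matches "/images".
    inside_base = candidate.startswith(base + os.sep)
    if inside_base and os.path.isfile(candidate):
        return candidate
    # Missing file or a path outside the base: fall back to the placeholder.
    return placeholder
```

Calling resolve_image("/var/www/html/site.com/images", "../../../etc/passwd", "placeholder.png") would return the placeholder, because the resolved path no longer starts with the base directory.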

Validate canonical path

The most robust way to prevent directory traversal attacks is to avoid relying on user-supplied input when dealing with the filesystem APIs. Unfortunately, this is easier said than done and might require rewriting a considerable chunk of your application.

A more realistic mitigation is to ensure that the user-supplied path never resolves to a location outside the directory used to serve static content.

For example, if an application serves files from /wwwroot/public/, any canonical representation of the user-requested path must start with /wwwroot/public/. Otherwise, the request could break out of the target directory.

The code example below shows how to normalize the user-supplied path and check whether it starts with the expected directory. To achieve this in JavaScript, use path.normalize or similar.
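As a language-neutral illustration of the normalize-then-check step, here is a minimal Python sketch. It assumes POSIX-style paths and uses posixpath.normpath as a rough counterpart to path.normalize; the name is_within_root is hypothetical. Note that normpath is purely lexical and does not follow symlinks.

```python
import posixpath

PUBLIC_ROOT = "/wwwroot/public"  # serving directory from the example above


def is_within_root(user_path: str, root: str = PUBLIC_ROOT) -> bool:
    """Return True if `user_path`, joined to `root`, stays inside `root`."""
    root = posixpath.normpath(root)
    # Strip leading slashes so an absolute request cannot discard the root on join.
    candidate = posixpath.normpath(posixpath.join(root, user_path.lstrip("/")))
    # commonpath compares whole path components, so "/wwwroot/public-backup"
    # is not mistaken for a child of "/wwwroot/public".
    return posixpath.commonpath([root, candidate]) == root
```

With this sketch, is_within_root("css/site.css") is accepted, while is_within_root("../../etc/passwd") is rejected because the normalized path is /etc/passwd.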

Directory traversal mitigation

Verify the input

Path normalization deals with malicious inputs such as https://todoapp.startup.io/public/../. However, an attacker can still trick it by encoding the . character as %2e. To be fully protected, you need to sanitize user-supplied data and reject unexpected inputs, for instance:

Maintain a set of allowed filesystem paths and compare the user input against that set.
Allow only alphanumeric characters and reject inputs that contain other characters.
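
Both checks in the list above can be sketched in a few lines of Python. The allowed set and the function names are hypothetical and only illustrate the idea.

```python
import re

# Hypothetical allowlist of files the application is willing to serve.
ALLOWED_FILES = {"logo.png", "banner.png"}

# One or more ASCII letters or digits, and nothing else.
ALPHANUMERIC = re.compile(r"[A-Za-z0-9]+")


def is_allowed_file(name: str) -> bool:
    """Accept only names that appear verbatim in the allowlist."""
    return name in ALLOWED_FILES


def is_alphanumeric(name: str) -> bool:
    """Accept only names made entirely of ASCII letters and digits."""
    return ALPHANUMERIC.fullmatch(name) is not None
```

A strict alphanumeric rule also rejects dots, so under that rule the file extension would have to be appended by the server rather than supplied by the user.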

If the above measures are impractical, consider disallowing dangerous characters explicitly. For example, the code below removes the URL-encoded characters from the user-supplied input.
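
One way to strip URL-encoded characters is sketched below in Python. This is an assumption about how such a filter might look, not the lesson's original snippet; the name strip_url_encoding is hypothetical. The loop matters because a single pass can be bypassed with nested input such as %%2e2e, which turns into %2e once the inner sequence is removed.

```python
import re

# A percent sign followed by exactly two hexadecimal digits, e.g. "%2e" or "%2F".
ENCODED_CHAR = re.compile(r"%[0-9A-Fa-f]{2}")


def strip_url_encoding(value: str) -> str:
    """Remove percent-encoded sequences until none remain."""
    previous = None
    # Repeat until the string stops changing, so removing one sequence
    # cannot reveal another one hidden around it.
    while previous != value:
        previous = value
        value = ENCODED_CHAR.sub("", value)
    return value
```

Rejecting any request that contains a percent sign is a simpler and stricter alternative when encoded input is never expected.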

Don't reinvent the wheel: use open-source libraries

Correct sanitization of user input is hard work and requires constant verification against newly discovered ways to bypass known protection methods. In almost all cases, it is a better choice to use a well-maintained open-source library.

For instance, to serve static files in JavaScript, consider using st. Another option is to build your application with a web framework such as Express, which has built-in support for serving static content.

To decide which libraries to trust, use Snyk Advisor! Snyk Advisor provides information on a given package's popularity, community support, and security. Also, check your open-source libraries with vulnerability scanners such as Snyk, which will notify you about new vulnerabilities discovered in any libraries you are using and help you mitigate them.

How do you mitigate directory traversal?

To recap: to mitigate directory traversal in your codebase, avoid calling filesystem APIs with user-supplied data as input. If that is not practical, validate that the user-supplied path is a child of the directory the application serves files from. Remember to sanitize the input so that malicious payloads cannot bypass your checks through techniques such as URL encoding. Finally, instead of writing all the logic yourself, consider using popular open-source libraries that handle this for you.

Bonus: directory traversal in the wild

st is a popular JavaScript library used for serving static files. An early version was vulnerable to directory traversal, which posed a serious security threat to the entire npm ecosystem.

The diff below is from the commit that added sanitization to catch directory traversal attempts that use URL encoding.

Validate canonical path

The most robust way to prevent directory traversal attacks is to avoid relying on user-supplied input when dealing with the filesystem APIs. Unfortunately, this is easier said than done and might require rewriting a considerable chunk of your application.

A more realistic mitigation is to ensure that the user-supplied path never resolves to a location outside the directory used to serve static content.

For example, if an application serves files from /wwwroot/public/, any canonical representation of the user-requested path must start with /wwwroot/public/. Otherwise, the request could break out of the target directory.

The code example below shows how to normalize the user-supplied path and check whether it starts with the expected directory. To achieve this in Java, use Path.normalize or similar.

Verify the input

Path normalization deals with malicious inputs such as https://todoapp.startup.io/public/../. However, an attacker can still trick it by encoding the . character as %2e. To be fully protected, you need to sanitize user-supplied data and reject unexpected inputs, for instance:

Maintain a set of allowed filesystem paths and compare the user input against that set.
Allow only alphanumeric characters and reject inputs that contain other characters.

If the above measures are impractical, consider disallowing dangerous characters explicitly. For example, the code below removes the URL-encoded characters from the user-supplied input:

Don't reinvent the wheel: use open-source libraries

Correct sanitization of user input is hard work and requires constant verification against newly discovered ways to bypass known protection methods. In almost all cases, it is a better choice to use a well-maintained open-source library.

For instance, consider building your web application with Spring Boot, which has built-in support for serving static content.

Removing Symbolic Links

Symbolic links (or symlinks) point to another file or directory in the file system. Relative components such as ../, which refers to the parent directory, can be abused in a similar way to reach files outside the intended directory. We can use os.path.realpath to resolve both symlinks and ../ components, return the absolute path, and restrict access to the current working directory. You can see here in the result that our injected path is converted to a canonical path before it is joined with our working directory:

When combined with the working directory, the request produces an error because no path /home/etc/passwd exists under the application's working directory, so an attacker cannot traverse outside of that directory.

We can update our code (with some detailed comments!) to make it safer:
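
A minimal sketch of such an update is shown below. It is not the lesson's original code: the name read_user_file is hypothetical, POSIX paths are assumed, and the first step uses os.path.normpath (a purely lexical cleanup) so that symlinks at the filesystem root do not affect the result. os.path.realpath is then applied to the final path to catch symlinks inside the base directory that point elsewhere.

```python
import os


def read_user_file(base_dir: str, requested: str) -> bytes:
    """Read `requested` from inside `base_dir`, refusing anything outside it."""
    base = os.path.realpath(base_dir)
    # Step 1: canonicalise the request as if it were rooted at "/".
    # "../" segments cannot climb above the root, so they are simply dropped.
    rooted = os.path.normpath(os.path.join(os.sep, requested))
    # Step 2: re-anchor the cleaned path under the base directory and
    # resolve any symlinks along the way.
    target = os.path.realpath(os.path.join(base, rooted.lstrip(os.sep)))
    # Step 3: refuse targets that a symlink has redirected outside the base.
    if os.path.commonpath([base, target]) != base:
        raise PermissionError("path escapes the base directory")
    # A traversal payload now points at a non-existent file under the base,
    # so open() raises FileNotFoundError instead of leaking data.
    with open(target, "rb") as handle:
        return handle.read()
```

With a base directory of /home, the payload ../../etc/passwd is re-anchored as /home/etc/passwd, which matches the behaviour described above.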

Permit only safe filesystem paths

Go’s filepath.Join() joins the given path elements into a single path and cleans the result, which collapses any ../ segments. To restrict access to the current working directory, we should check that the directory of the logo file path is the same as the expected base filesystem path where the image files reside.

Let’s look at how to do this by walking through the remediated code shown below. First, we generate the directory component for the base path separately and assign it to the basePath variable. We then generate the absolute file path for the logo by combining basePath and the query string parameter for the logo filename. To find out which directory the logo is in, we use the filepath.Dir() function, which returns the directory path of the logo file without the filename on the end. We can then check whether this directory is the same as the base path and reject any request where it is not. You can see here in the result that our injected path is converted to a canonical path after it is joined with our working directory:

If our app’s base path is /opt/todoapp/images and the attack payload of ../../../etc/passwd is injected into the logo filename query string parameter, it will result in an error because the directory /etc/ is outside the base path directory of /opt/todoapp/images.
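
The join, take-the-directory, and compare steps can be illustrated with an analogous Python sketch. It is not the Go code from the lesson: posixpath.normpath(posixpath.join(...)) only roughly mirrors filepath.Join, posixpath.dirname stands in for filepath.Dir, and the name logo_path is hypothetical.

```python
import posixpath

BASE_PATH = "/opt/todoapp/images"  # base path from the example above


def logo_path(filename: str, base_path: str = BASE_PATH) -> str:
    """Return the full logo path, or raise if it is not directly in base_path."""
    base = posixpath.normpath(base_path)
    # Join the base path and the filename, then collapse "../" segments lexically.
    full_path = posixpath.normpath(posixpath.join(base, filename))
    # The directory part of the result must be exactly the base path.
    if posixpath.dirname(full_path) != base:
        raise ValueError("logo must be located directly in the base path")
    return full_path
```

Because the directory must equal the base path exactly, this check also rejects files in subdirectories of the base path, which is stricter than a prefix comparison.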

Resolving paths

In C#, we can use the System.IO.Path class to resolve relative paths and return the absolute path of a file. Its Path.GetFullPath method returns this absolute path:

string abs_img_path = Path.GetFullPath(img_path);

By defining a base path (or an array of allowable paths) in our program, we can compare them to the output of Path.GetFullPath and decide whether a request should be permitted or not. If the prefix paths do not match up, then our user could be attempting to access files they’re not supposed to!

Now when the file is retrieved from the file system, we know that it comes from a whitelisted (approved) directory, so unless you store sensitive information in that directory, your code is free of directory traversal!
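
The comparison against an array of allowable paths can be sketched as follows. This Python illustration is not the lesson's C# code: the base directories and the name is_permitted are hypothetical, and the input is assumed to be an absolute path already, as Path.GetFullPath would produce.

```python
import posixpath

# Hypothetical list of directories the application may serve files from.
ALLOWED_BASES = ["/var/www/site/images", "/var/www/site/downloads"]


def is_permitted(abs_path: str, allowed_bases=ALLOWED_BASES) -> bool:
    """Return True if the absolute path lies inside one of the allowed bases."""
    if not posixpath.isabs(abs_path):
        # Relative input was never resolved, so refuse it outright.
        return False
    resolved = posixpath.normpath(abs_path)
    # commonpath compares whole components, so "/var/www/site/images-old"
    # does not count as being inside "/var/www/site/images".
    return any(
        posixpath.commonpath([base, resolved]) == base for base in allowed_bases
    )
```

A plain string prefix test would accept sibling directories whose names merely start with an allowed base, which is why the sketch compares path components instead.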

Directory traversal

Unintended disclosure of sensitive files
