Type confusion
The dangers of assuming a type
Type confusion, also known as type manipulation, is an attack vector that can occur in interpreted languages such as JavaScript and PHP, which use dynamic typing. In dynamic typing, the type of a variable is determined and updated at runtime, as opposed to being set at compile-time in a statically typed language.
While dynamic typing can make software more flexible and development faster, it also opens the door to type confusion attacks. In this type of attack, an attacker modifies the type of a given variable in order to trigger unintended behavior. This can lead to all types of bypasses and vulnerabilities, such as cross-site scripting, access control bypasses, and denial of service.
About this lesson
In this lesson, you will learn how type confusions work and how to protect your applications against them. We will begin by exploiting a type confusion vulnerability in a simple application. Then we will analyze the vulnerable code and explore some options for remediation and prevention.
We step into the shoes of Joe, who is a back-end developer at a social media company.
Joe previously suffered a cross-site scripting attack on his social media site because users could add < and > to their usernames. This let users modify their pages and spread XSS worms.
After this incident, Joe implemented a strict username policy under which the characters < and > are no longer allowed. Joe also updated every username in the database to exclude those characters. But has he done enough?
Let’s break down the story above and look in detail at why changing the parameter type bypassed the character restrictions.
The social media site used the following JavaScript (Node.js) code for the sign-up endpoint:
As you can see, the signup function is rather simple. It takes in two POST parameters, does a username validation check, and performs some password transformations and validations before adding the user to the database and returning a “thank you” message.
The isValidUsername function is responsible for determining whether the username contains forbidden characters. This is the function that, in theory, should have prevented the cross-site scripting attack from happening.
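As a rough illustration of the flow described above, here is a minimal sketch of what such an endpoint could look like. It is not Joe's actual code: the in-memory users array, the mockRes helper, the error messages, and the password length check are all assumptions made for the example, and password hashing is left out.

```javascript
// Hypothetical sketch of the vulnerable sign-up flow (not the original code).
// `users` stands in for the database.
const users = [];

// Returns true when the username contains neither "<" nor ">".
// Hidden assumption: `username` is always a string.
function isValidUsername(username) {
  return !username.includes("<") && !username.includes(">");
}

// Express-style handler: reads two POST parameters from req.body.
function signup(req, res) {
  const { username, password } = req.body;

  if (!isValidUsername(username)) {
    return res.status(400).send("Username contains forbidden characters");
  }

  // Stand-in for the password transformations and validations
  // (real code would also hash the password before storing it).
  if (password.trim().length < 8) {
    return res.status(400).send("Password is too short");
  }

  // The username is stored and echoed back as a string.
  users.push({ username: String(username) });
  return res.send(`Thank you for signing up, ${username}!`);
}

// Hypothetical stand-in for an Express response object.
function mockRes() {
  return {
    statusCode: 200,
    body: null,
    status(code) { this.statusCode = code; return this; },
    send(body) { this.body = body; return this; },
  };
}

// A string containing "<" is rejected as intended.
const blocked = mockRes();
signup({ body: { username: "<b>joe</b>", password: "correct horse" } }, blocked);
console.log(blocked.statusCode); // 400

// The same kind of payload wrapped in an array passes the check.
const bypassed = mockRes();
signup({ body: { username: ["<script>alert(1)</script>"], password: "correct horse" } }, bypassed);
console.log(bypassed.statusCode); // 200
console.log(users[0].username);   // <script>alert(1)</script>
```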
Can you spot the mistake?
JavaScript, the language Node.js runs, is dynamically typed, meaning that the username parameter does not have a static type and can thus be anything: an object, array, string, boolean, number, etc. However, the developer (Joe) mistakenly assumed that the username POST parameter would always be a string, and programmed the validation logic according to that assumption.
When an attacker supplied a username of type array, the req.body.username variable became an array, meaning the isValidUsername function received an array rather than a string.
As you might have figured out, running includes on ["test<script>alert(\"Hello World\")</script>"] with the values < and > returns false, because on an array includes compares whole elements, and no element is exactly equal to < or >.
The following examples show the behavior of the includes function on type string and array:
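A short sketch of that difference, reusing the payload from above (the variable names asString and asArray are only illustrative):

```javascript
const asString = 'test<script>alert("Hello World")</script>';
const asArray = ['test<script>alert("Hello World")</script>'];

// String.prototype.includes performs a substring search.
console.log(asString.includes("<")); // true
console.log(asString.includes(">")); // true

// Array.prototype.includes compares each whole element with the argument.
console.log(asArray.includes("<")); // false
console.log(asArray.includes(">")); // false

// It only matches when an element equals the argument exactly.
console.log(asArray.includes(asString)); // true
```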
Because the includes call returned false, the username was deemed valid and the user was created.
Note that the string representation of ["<script>alert(\"Hello World\")</script>"] is <script>alert("Hello World")</script>, so that would be the username of the user after utilizing the bypass.
Impacts of type confusion
Type confusion almost always leads to unexpected behavior that could result (if exploited) in all kinds of vulnerabilities. Type confusion can happen when user input is not validated or when data is passed between different parts of the application without being properly converted or checked.
In the case above, the application expected a string, but an array was supplied, leading to a sanitization bypass and resulting in cross-site scripting.
There is no predefined impact that can be achieved with type confusion. This is why we call it unexpected behavior. The following are some examples of impacts that have been achieved in the past with known type confusion bugs: cross-site scripting, denial of service, data exposure, access control bypass, file inclusion, and remote code execution.
Preventing type confusions is easier than locating them. Each sort of type confusion will have a different approach to fixing or preventing it.
Let's look back at the sign-up code from Joe and apply fixes.
In the previous section, we established that the isValidUsername function can be fooled by supplying a username of type array. Mitigating this consists of type checking or explicit type casting.
For example:
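A minimal sketch of such a fix, under the same assumptions as before: the users array, the mockRes helper, the error messages, and the password length check are hypothetical stand-ins, not the site's real code.

```javascript
// Hypothetical sketch of the sign-up handler with explicit type checks.
const users = [];

// Unchanged validation logic; it is now only ever called with a string.
function isValidUsername(username) {
  return !username.includes("<") && !username.includes(">");
}

function signup(req, res) {
  const { username, password } = req.body;

  // Treat the type as untrusted input: reject anything that is not a string.
  if (typeof username !== "string" || typeof password !== "string") {
    return res.status(400).send("Username and password must be strings");
  }

  if (!isValidUsername(username)) {
    return res.status(400).send("Username contains forbidden characters");
  }

  // Stand-in for the password transformations and validations.
  if (password.trim().length < 8) {
    return res.status(400).send("Password is too short");
  }

  users.push({ username });
  return res.send(`Thank you for signing up, ${username}!`);
}

// Hypothetical stand-in for an Express response object.
function mockRes() {
  return {
    statusCode: 200,
    body: null,
    status(code) { this.statusCode = code; return this; },
    send(body) { this.body = body; return this; },
  };
}

// The array payload is now rejected before validation runs.
const attack = mockRes();
signup({ body: { username: ["<script>alert(1)</script>"], password: "correct horse" } }, attack);
console.log(attack.statusCode); // 400
console.log(users.length);      // 0

// A regular string username still works.
const regular = mockRes();
signup({ body: { username: "joe", password: "correct horse" } }, regular);
console.log(regular.statusCode); // 200
```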
The original code has been extended with a typeof check on the username and password.
When the attacker now supplies an array as username or password, the condition of the if statement evaluates to true and an error message is returned, which blocks the attack.
In essence, variable types should be considered user input as much as the value of the variable. Never assume the type, or pass it around without being explicit (such as type checking or casting).
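Explicit casting, the other option mentioned earlier, could be sketched as follows. The normalizeUsername helper is hypothetical; the idea is to convert the input to a string once, validate that string, and use only the converted value afterwards.

```javascript
// Hypothetical helper: cast first, then validate the resulting string.
// Returns the normalized username, or null when it is not allowed.
function normalizeUsername(input) {
  const name = String(input); // an array is joined into a single string here
  if (name.includes("<") || name.includes(">")) {
    return null;
  }
  return name;
}

console.log(normalizeUsername("joe"));                         // joe
console.log(normalizeUsername(["<script>alert(1)</script>"])); // null
console.log(normalizeUsername(["joe"]));                       // joe
```

Casting is more permissive than a typeof check: String(undefined) yields the text "undefined" and a plain object becomes "[object Object]", so rejecting non-string input outright is often the stricter choice.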
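Additionally, a type-safe language like TypeScript can help prevent type confusion by catching type errors before the program runs. Keep in mind that TypeScript types are erased at compile time, so data arriving in a request still needs a runtime check.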
Keep learning
To learn more about type confusion, check out some other great content:
- CodeQL has a detection module for type confusion.
- Snyk has a blog post series about type confusion at https://snyk.io/blog/type-manipulation/
- Examples, information and definitions can be found at the CWE-843 page