Race condition
Also known as: concurrent execution using shared resource with improper synchronization
What is a race condition?
Race conditions are a type of vulnerability that occurs in multi-threaded or concurrent processing environments. They arise when two or more processes access and manipulate the same data or resource simultaneously without proper coordination. This concurrent access can lead to unexpected and erratic behaviors, as the outcome of the operations depends on the sequence or timing of these processes.
In practice, race conditions can manifest in various forms, often subtle and difficult to detect. A compelling example in web applications is the single-packet attack, in which an attacker completes multiple HTTP requests with a single TCP packet so that they all reach the server at virtually the same moment. Because network jitter no longer spreads the requests out, the server is likely to process them concurrently, which can cause conflicts in data handling or transaction processing.
About this lesson
In this lesson, you will learn about vulnerabilities stemming from race conditions and how to protect your applications against them. We will step into the shoes of a hacker named Jordan, who doubled his account balance by abusing a race condition in his online banking platform.
Meet Jordan, a savvy hacker who is closely examining the online payment system of Saturn Bank, a renowned digital banking platform. His recent loan rejection from them has motivated him to probe for potential vulnerabilities in their system.
Jordan zeroes in on their wire transfer functionality. He suspects that executing simultaneous requests might reveal an underlying flaw: a race condition. To test this, he creates a bash script designed to launch two wire transfer POST requests at the same time.
Let’s break down what happened in the story above. Jordan sent two wire transfer requests simultaneously, and both transfers were processed even though his balance covered only one of them.
A simplified version of the backend code from Saturn Bank's payment API is as follows:
In this example, the complicated balance-processing logic has been replaced with simple, demonstrative account balances. A small, artificially introduced delay simulates the more expensive operations that such a transfer usually involves, for example database queries or communication with other systems.
As you might have noticed, the balance check and the balance update are not performed as one atomic operation. When Jordan's script sends two simultaneous requests, the delay of 10 milliseconds is long enough for both requests to pass the balance check before either one reaches the deduction. Both requests see sufficient funds in Jordan's account, which at that point still holds $100, so both transfers go through even though the account has enough funds for only one of them.
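The check-then-act flaw described above can be sketched in Go as follows. This is an illustrative reconstruction, not Saturn Bank's actual code: the `bank` type, the `transferVulnerable` and `fireConcurrently` functions, and the account names are all hypothetical. Each individual read and write is guarded by a lock so that the program itself stays free of data races, which isolates the real problem: nothing protects the gap between the check and the update.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// processingDelay simulates slow work such as database queries (assumed value
// taken from the 10 ms delay mentioned in the text).
var processingDelay = 10 * time.Millisecond

// bank is a hypothetical in-memory account store; amounts are whole dollars.
type bank struct {
	mu       sync.Mutex
	balances map[string]int
}

// balance returns the current balance of one account.
func (b *bank) balance(id string) int {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.balances[id]
}

// adjust adds delta (which may be negative) to one account.
func (b *bank) adjust(id string, delta int) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.balances[id] += delta
}

// transferVulnerable checks the balance first and deducts later. The lock is
// released between the two steps, so the sequence as a whole is not atomic.
func (b *bank) transferVulnerable(from, to string, amount int) bool {
	if b.balance(from) < amount { // check
		return false
	}
	time.Sleep(processingDelay) // race window: other requests can pass the check
	b.adjust(from, -amount)     // act
	b.adjust(to, amount)
	return true
}

// fireConcurrently starts n transfers at once and counts how many succeed.
func fireConcurrently(n int, transfer func() bool) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	succeeded := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if transfer() {
				mu.Lock()
				succeeded++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return succeeded
}

func main() {
	b := &bank{balances: map[string]int{"jordan": 100, "mallory": 0}}
	ok := fireConcurrently(2, func() bool {
		return b.transferVulnerable("jordan", "mallory", 100)
	})
	fmt.Printf("successful transfers: %d, jordan: %d, mallory: %d\n",
		ok, b.balance("jordan"), b.balance("mallory"))
}
```

On a typical run both transfers succeed and Jordan's balance ends up negative, although, as with any race condition, the exact outcome depends on timing.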
What is the impact of a race condition?
The impact of race conditions is highly context-dependent, with their effects varying widely based on the environment and application. Common impacts include unauthorized data access, corruption of data, and system crashes. In scenarios involving user privileges or subscriptions, race conditions can allow users to exceed their authorized limits, such as accessing features or resources beyond their subscription plan. In other cases, they might lead to bypassing security checks, enabling unauthorized actions within a system.
Another typical impact is the disruption of the normal operation flow, resulting in unreliable application behavior. This can manifest in everything from minor annoyances, like duplicated posts on a social platform, to severe outcomes like incorrect data processing in a critical system, such as in the story with Jordan. The unpredictable nature of race conditions also means that they can cause intermittent issues that are challenging to replicate and diagnose, leading to prolonged system vulnerabilities.
In Jordan’s scenario, the race condition gave him a significant financial advantage: concurrent transactions let him move more money than he had, a situation akin to creating money without a legitimate source. This kind of impact is especially critical in financial systems, where the integrity and reliability of transactions are paramount.
However, the ability to exploit a race condition depends on several factors, including the system's response time and the timing of the requests. Jordan sent only two concurrent requests, but with a larger time window or repeated attempts he could potentially have pushed a dozen requests through. Finally, race conditions are not consistently reproducible. Their occurrence depends on numerous variables, which makes them unreliable to exploit and, at the same time, dangerous in their unpredictability.
Mitigating race conditions involves implementing strategies that ensure synchronized access to shared resources in concurrent processing environments. The cornerstone of prevention lies in designing systems where access to shared data or resources is controlled and coordinated. This often involves synchronization mechanisms such as mutexes (mutual exclusion locks), which allow only one thread at a time to access a shared resource, or semaphores, which cap the number of threads that may access it at once. Either way, the uncontrolled simultaneous access that leads to a race condition is prevented. Another effective strategy is to adopt immutable objects, which cannot be modified once created. This makes them inherently safe in concurrent environments, as there are no state changes that could lead to race conditions. Additionally, atomic operations, which are indivisible, ensure that no other process can interrupt the operation or observe it halfway through, thereby maintaining consistency.
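As one illustration of the atomic-operation approach mentioned above, the following minimal Go sketch performs the balance check and the deduction as a single compare-and-swap step using the standard `sync/atomic` package. The `withdraw`, `newBalance`, and `runWithdrawals` helpers are hypothetical names chosen for this example, and it assumes Go 1.19 or later for the `atomic.Int64` type.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// newBalance returns an atomic balance initialised to v (hypothetical helper).
func newBalance(v int64) *atomic.Int64 {
	b := new(atomic.Int64)
	b.Store(v)
	return b
}

// withdraw deducts amount only if the balance is sufficient. The deduction is
// applied with CompareAndSwap, so it only takes effect if the balance is still
// the value that passed the check; otherwise the loop re-reads and retries.
func withdraw(balance *atomic.Int64, amount int64) bool {
	for {
		current := balance.Load()
		if current < amount {
			return false
		}
		if balance.CompareAndSwap(current, current-amount) {
			return true
		}
		// Another goroutine changed the balance in the meantime: try again.
	}
}

// runWithdrawals fires n concurrent withdrawals and counts the successes.
func runWithdrawals(balance *atomic.Int64, n int, amount int64) int64 {
	var succeeded atomic.Int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if withdraw(balance, amount) {
				succeeded.Add(1)
			}
		}()
	}
	wg.Wait()
	return succeeded.Load()
}

func main() {
	balance := newBalance(100)
	ok := runWithdrawals(balance, 10, 100)
	fmt.Println(ok, balance.Load()) // prints "1 0"
}
```

A compare-and-swap loop like this suits a single numeric value. An operation that spans several values, such as debiting one account and crediting another, generally needs a lock or a database transaction instead.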
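Different systems and programming contexts require tailored approaches to prevent race conditions. For instance, in systems involving database logic, the use of transactions is crucial. Transactions ensure that a series of database operations is treated as a single atomic unit, either all succeeding or all failing, which maintains data consistency. Atomicity alone is not enough, though: the check and the update must also be isolated from other concurrent transactions, for example through row-level locking or a sufficiently strict isolation level.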
In the specific scenario of Jordan and his bank, several targeted mitigation strategies could be employed. Firstly, introducing proper locking mechanisms within the application code can prevent concurrent requests from simultaneously modifying the account balances. This could be implemented by using a mutex to lock the account balance for the duration of the check and update operations, ensuring that only one transaction can modify the balance at a time. Furthermore, implementing rate limiting on the API could prevent an excessive number of concurrent requests, reducing the risk of race conditions being exploited by malicious users.
Here is what the modified backend code, no longer vulnerable to this race condition, would look like:
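A minimal sketch of such a fix in Go, assuming an in-memory store: the hypothetical `ledger` type holds a single mutex for the whole duration of the balance check, the simulated delay, and the update, so a second request cannot pass the check until the first one has finished. All names here are illustrative and not taken from Saturn Bank's code.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// processingDelay simulates slow work such as database queries.
var processingDelay = 10 * time.Millisecond

// ledger is a hypothetical in-memory account store guarded by one global lock.
type ledger struct {
	mu       sync.Mutex
	balances map[string]int
}

// transfer holds the lock across the check, the delay, and the update, which
// makes the whole check-then-act sequence one critical section.
func (l *ledger) transfer(from, to string, amount int) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.balances[from] < amount {
		return false
	}
	time.Sleep(processingDelay)
	l.balances[from] -= amount
	l.balances[to] += amount
	return true
}

// balance returns the current balance of one account.
func (l *ledger) balance(id string) int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.balances[id]
}

func main() {
	l := &ledger{balances: map[string]int{"jordan": 100, "mallory": 0}}
	results := make(chan bool, 2)
	for i := 0; i < 2; i++ {
		go func() { results <- l.transfer("jordan", "mallory", 100) }()
	}
	succeeded := 0
	for i := 0; i < 2; i++ {
		if <-results {
			succeeded++
		}
	}
	// prints "successful transfers: 1, jordan: 0, mallory: 100"
	fmt.Printf("successful transfers: %d, jordan: %d, mallory: %d\n",
		succeeded, l.balance("jordan"), l.balance("mallory"))
}
```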
Note that this example uses a single global mutex instead of one lock per account, which keeps the demonstration code simple and easy to read. In a production system, however, a global lock would significantly slow down transfers: every transfer has to wait for the lock to be released by another transaction that most likely touches different accounts, which makes the wait unnecessary.
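For comparison, a per-account locking scheme could be sketched in Go as follows. This is one possible design under stated assumptions, not the lesson's implementation: the `account` and `bank` types and the `newBank` helper are hypothetical, the account map is assumed to be read-only after setup, and the two locks are always acquired in a fixed order (by account ID) so that two transfers in opposite directions cannot deadlock.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// processingDelay simulates slow work such as database queries.
var processingDelay = 10 * time.Millisecond

// account carries its own lock, so unrelated transfers do not block each other.
type account struct {
	mu      sync.Mutex
	balance int
}

// bank maps account IDs to accounts; the map is never modified after setup.
type bank struct {
	accounts map[string]*account
}

// newBank builds a bank from initial balances (hypothetical helper).
func newBank(initial map[string]int) *bank {
	b := &bank{accounts: make(map[string]*account)}
	for id, amount := range initial {
		b.accounts[id] = &account{balance: amount}
	}
	return b
}

// transfer locks only the two accounts involved, always in ID order.
func (b *bank) transfer(fromID, toID string, amount int) bool {
	from, to := b.accounts[fromID], b.accounts[toID]
	if from == nil || to == nil || from == to || amount <= 0 {
		return false
	}
	first, second := from, to
	if toID < fromID {
		first, second = to, from
	}
	first.mu.Lock()
	defer first.mu.Unlock()
	second.mu.Lock()
	defer second.mu.Unlock()

	if from.balance < amount {
		return false
	}
	time.Sleep(processingDelay)
	from.balance -= amount
	to.balance += amount
	return true
}

// balance returns the balance of an existing account.
func (b *bank) balance(id string) int {
	a := b.accounts[id]
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.balance
}

func main() {
	b := newBank(map[string]int{"jordan": 100, "mallory": 0, "alice": 50, "bob": 0})
	results := make(chan bool, 3)
	go func() { results <- b.transfer("jordan", "mallory", 100) }()
	go func() { results <- b.transfer("jordan", "mallory", 100) }()
	go func() { results <- b.transfer("alice", "bob", 50) }() // unrelated accounts
	ok := 0
	for i := 0; i < 3; i++ {
		if <-results {
			ok++
		}
	}
	fmt.Println(ok, b.balance("jordan"), b.balance("bob")) // prints "2 0 50"
}
```

Only one of Jordan's two transfers succeeds, while the transfer between the unrelated accounts does not have to wait for either of them.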
Keep learning
Learn more about race conditions with these resources:
- The CWE page matching this vulnerability type
- A non-web focussed blog post about race conditions in Node.js
- Official Golang article about detecting and fixing race conditions in Go
- A PortSwigger post about web race conditions
- James Kettle’s research about race conditions using various new techniques
- A pentest guide to race conditions