• Browse topics
Login
Login

SNYK LEARN LOGIN

OTHER REGIONS

For Snyk Enterprise customers with regional contracts. More info

Uncontrolled Recursion

How infinite loops can ruin your life and crash your systems.

~20mins estimated

Python

Uncontrolled Recursion: The Basics

What is uncontrolled recursion?

The term recursion comes from a specific kind of function built into backend code. But what does it really mean? You might have come across recursion before in math classes. An example is the Fibonacci sequence, where each element is the sum of the previous two (any term can be written as F(n) = F(n-1) + F(n-2)). The meaning behind this comes from the ability to break the whole into smaller and smaller parts, and each element is self-referential.

fibonacci

In programming, a recursive function uses a unique method of problem-solving based on that process. It will take the given problem and break it down into smaller and smaller pieces until it reaches what is called the ‘base case,’ at which point the recursion can terminate. Intermediately, the function calls itself over and over on these pieces until it stops. The reason this is such a powerful tool is that it allows problems that naturally follow this kind of inductive structure up from the base case (each step being called a ‘recurrence relation’, like the one given above for Fibonacci) to be figured out much more naturally. Plus, it keeps code succinct and organized!

As you can imagine, though, a serious problem arises when the function struggles to find its base case. If you keep breaking problems down into tiny pieces and trying to use the function again, with no defined end in sight, memory resources are wasted dramatically by a persistent, never-ending loop.

FUN FACT

Compression!

Recursion is often used in data compression and image processing. One common mathematical method for this is singular value decomposition, a form of data compression that breaks the structure down into parts and eliminates unnecessary singular values. Although this process is usually iterative, there is also a recursive form that can be used more dynamically.

About This Lesson

Take a look at recursive functions with us, and learn what causes them to loop infinitely (leading to problems like stack overflow and program crashes). Then, see what this can mean in terms of exploits, and pick up some mitigation tactics to keep yourself and your organization safe.

Importance and Background of Uncontrolled Recursion

A couple of data structures that recursive functions are specifically compatible with are linked lists and trees. The reason being that the function can analyze each node or element by calling itself on the child node(s) (in a tree) or element (in a list). By making its way down to the final node or element, the entire data structure can be easily covered quickly and efficiently.

Some salient applications of recursive functions include sorting and searching. A real-world example is web crawling, which employs recursive functions to navigate through a trail of connected websites and extract defined data. Also, there is a mechanism called route reflection that uses recursive routing for the Border Gateway Protocol. In this scenario, routers move information through the network hierarchy recursively, which is similar to the domain name system, where recursive queries contact the authoritative DNS servers iteratively until a final domain name answer is discovered. Clearly, this thinking comes into play in various infrastructures and systems.

However, when strategies are not implemented to make sure recursion terminates, the repeated function calls lead to stack memory and resource consumption, which can then be exploited by bad actors. The call stack is a finite amount of memory allocated to storing temporary data, such as during function calls. Each time the function runs, a stack frame is added to memory, and it continues to do so until it is terminated. When the space runs out, for example when a recursive function runs infinitely, the additional stack frames can invade adjacent memory, and can then be used to corrupt the program (leading to things like privilege escalation), crash an app/system (denial of service), corrupt or leak sensitive data (see our PII lesson), or even execute malicious code.

Uncontrolled recursion In Action

Imagine you are building a tool to index files for a search engine. Your code is designed to enter a folder, list the files, and then recurse into any subfolders it finds.

The Setup

An attacker creates a folder structure that looks like this:

Documents/
└── Project/
└── Shortcut_to_Documents (A symbolic link pointing back to Documents/)

When your script hits Shortcut_to_Documents, it thinks it has found a new folder. It enters this supposed folder, finds Project, enters that, finds the shortcut again, and begins an infinite descent.

WARNING: This code is for educational purposes. It is designed to fail and will cause a RecursionError or high CPU usage if the "circular link" logic is triggered.

Scan your code & stay secure with Snyk - for FREE!

Did you know you can use Snyk for free to verify that your code
doesn't include this or other vulnerabilities?

Scan your code

Uncontrolled Recursion Under the Hood

To understand why the code fails, let’s look at what the computer is doing behind the scenes. Every time a function is called, the system must also remember where it came from in order to return home.

Let's break down the vulnerable logic line by line.

Every time this line executes, the CPU creates a Stack Frame, a small block of memory that stores the current path string and the depth integer. The depth parameter is incremented with each call, but note that it is never used as a guard. Nothing in this code will stop the recursion based on how deep it goes.

The program looks at everything inside the current folder. If the folder contains a circular link, a symbolic link pointing back to a parent directory, full_path becomes a path the program has technically already visited. Crucially, the program has no way of knowing that yet.

This is where the vulnerability lives. The code asks: "Is this a directory?" The operating system answers "Yes" because a symbolic link to a folder is identified as a directory. With no check for whether this path has been visited before, the function calls itself again, unconditionally.

The Stack Overflow

In memory, the call stack grows like a physical stack of plates. Each new function call places a frame on top.

Frame 1: path="/Users/Shared/App/Home", depth=1 ← Bottom
Frame 2: path="/Users/Shared/App/Home/Project", depth=2
Frame 3: path="/Users/Shared/App/Home/Project/Shortcut_to_Documents", depth=3
Frame 4: path="/Users/Shared/App/Home", depth=4 ← Same as Frame 1
Frame 5: path="/Users/Shared/App/Home/Project", depth=5
...
Frame 1000: Stack limit reached

Python tracks this and raises a RecursionError once the call stack hits its limit. Without that built-in safety net, in a language like C, for example, the stack would silently overflow, corrupting memory or crashing the process entirely. Python's limit prevents the worst outcome, but the program still fails completely and any work done up to that point is lost.

Why This Vulnerability is Dangerous

Denial of Service (DoS)

This is the most common result. An attacker can provide a specially crafted input (like a ZIP file containing a circular directory structure, often called a "Zip Bomb") that causes the application to consume 100% of the CPU and RAM. This freezes the application, making it unavailable to legitimate users.

Memory Corruption

In lower-level languages like C or C++, the system doesn't always have a "safety net" like Python’s RecursionError. When the stack overflows, it can begin overwriting data in adjacent memory regions (the Heap). If an attacker can control what is written into that adjacent memory, they can perform a Stack-Based Buffer Overflow, potentially hijacking the program's execution flow.

Remote Code Execution (RCE)

By corrupting the stack, a sophisticated attacker might overwrite the home base of a function. Instead of the function returning to the safe part of your code, it jumps to a piece of malicious code the attacker has injected into memory. Now, the attacker is running commands with the same privileges as your application.

Here is an example of the parameter condition not being uploaded to the else clause in a lower-level language (in this case, C++):

FUN FACT

AI??

In terms of AI/Machine Learning applications, recursive neural networks are used in natural language processing to break sentences apart and repeatedly apply operations to words and their grammatical structures. Decision tree algorithms also have a recursive construction, where each node splits the feature space based on attributes used to predict some target variable. Recursion is important in the design and implementation of intelligent systems.

Uncontrolled Recursion Mitigation

Secure Code

Now, let’s look at a secure version of the directory crawler. It uses two independent defenses: a depth limit to cap how far the recursion can grow, and a set of visited inode IDs to detect cycles before they can repeat.

Why two mitigations instead of one? They defend against different threats. Cycle detection handles the symbolic link attack described in this lesson. Depth-limited handles a different scenario, where an attacker creates an extremely deep but non-circular folder tree, which would exhaust the stack without ever revisiting a node. Together, they cover both possible vulnerabilities.

In general, here are a few go-to mitigation strategies…

Management of Call Stack/Buffer Above

As mentioned before, each function call means that the system allocates a stack frame containing local variables, return addresses, and parameters. As a safegaurd, when this stack has a fixed limit, the maximum depth is strictly defined, and the system can throw and exception before it crashes or leaks adjacent memory.

Plus, buffer protections manage the buffer within these frames. Some common techniques are stack canaries, address space layout randomization (ASLR), and non-executable stack. Stack canaries act as a special check that the function must do before returning to the return address. If it was overwritten, the entire program shuts down immediately. ASLR randomizes where the stack is located in memory, making it harder for an attacker to overwrite. Finally, a non-executable stack makes stack memory only available for data, so even if malicious code is injected, the CPU will refuse to execute it. Overall, these prevent an attacker from using a recursive loop to corrupt a local variable and change the home base/return address (which again has the potential to redirect the program to malicious code).

Tail Recursion/Tail Call Optimization

In tail recursion/tail call optimization, the recursive call, or step that allows the function to repeat itself, is the last operation within the function. Therefore, the compiler or interpreter can use the same stack frame for each recursive call, getting rid of the need to take up additional stack space. This works because the function does not need to remember anything after the recursive step, and hence has no reason to store local variables, return addresses, etc. Not only can this make the function more efficient, but it can also prevent stack overflow errors since it is physically impossible for an attacker to cause deep recursion.

Input Validation and Termination Conditions

Termination depends on the set base case. If an attacker is able to use input to bypass this base case, it causes the function to recurse infinitely. However, hardening input conditions ensures that this stopping criteria is always reachable and prevents malicious actors from using input to cause incorrect behavior.

To provide an example, say you have a simple recursive function to calculate Fibonacci numbers. If an attacker inputs a huge integer, this could cause a dangerous execution that may lead to resource exhaustion. However, sanitizing input by only allowing the function to execute when the input is less than 1,000 caps the function at a safe depth.

Updating Parameters

Finally, updating parameters can help to guarantee convergence of the function. A safe recursive function gets closer to termination with each step, so the parameters must update at each step or the algorithm will reach stagnation. A clever attacker might target the updating logic within a function in order to trigger a fail or stall. Making sure that parameters are strictly transformed in a way that progresses the function helps to prevent this possible attack vector.

Quiz

Quiz

Which of the following describes Tail Call Optimization (TCO) as a defense mechanism?

Keep Learning

Now that you know all about how to mitigate possible vulnerabilities introduced by uncontrolled recursion, check out some of the other Snyk Top 10 lessons if you haven't already!

Or some other related lessons:

Congratulations

Great work!! You are now well informed about uncontrolled recursive patterns in your software, and know how to mitigate their exploit. Keep on going with more Learn lessons about memory, or continue with the Snyk Top 10 to effectively secure your systems.