
Unbounded Consumption in LLMs

Don't let your LLM's enthusiasm empty your wallet

~15 mins estimated

AI/ML

Unbounded consumption: the basics

What is unbounded consumption?

Unbounded consumption is a vulnerability that occurs when a Large Language Model (LLM) application is tricked into an excessive, and often recursive, use of resources without proper limits. This can lead to a Denial of Service (DoS), where the application becomes slow or unavailable for legitimate users, and can result in surprisingly high financial costs from API usage and computing power.

Unlike traditional applications where resource limits are often well-understood, LLMs can be unpredictable. They might get stuck in a loop trying to solve a complex problem, make an unexpectedly large number of calls to external tools or APIs, or spend too much time processing an overly long input from a user. If there are no safeguards to cap this consumption, an attacker can intentionally craft inputs that trigger this behavior, effectively weaponizing the LLM against its own infrastructure.

About this lesson

In this lesson, you'll see how a seemingly helpful LLM-powered assistant can be turned into a resource-draining machine. We'll follow a story where an attacker exploits a travel planning bot's recursive nature to knock it offline and generate a massive bill. We will then inspect the vulnerable code that lacks essential safeguards and learn how to implement the rate limits, timeouts, and checks needed to prevent unbounded consumption.

FUN FACT

Fork bomb

The concept of a "fork bomb" is a classic, simple Denial of Service attack from the early days of computing. In Unix systems, a tiny command like :(){ :|:& };: defines a function that repeatedly spawns copies of itself, rapidly exhausting the system's process table and typically forcing a reboot. It is an early, non-AI ancestor of unbounded consumption attacks.

Unbounded consumption in action

TravelGenie is a new AI travel assistant that helps users plan complex, multi-city trips. To provide the best results, it can call other services (tools) to check for flight availability, hotel prices, and local events. If it can't find a direct flight, it's programmed to recursively search for connecting flights.

An attacker, Mal, notices this recursive capability. Mal isn't trying to steal data; they just want to cause chaos and cost the company money. So they give TravelGenie a paradoxical and impossible request:

"Plan me a trip from Anchorage to Paris, but I can only fly on airlines that exclusively fly east, and each connecting flight must land in a city further west than the previous one."

TravelGenie accepts the input. It starts by finding a flight east from Anchorage. Then, to fulfill the second condition, it tries to find a connecting flight from that destination to a city further west. This sends it on a wild goose chase across the globe. The bot enters a recursive loop!

[Chat illustration: "TravelGenie - Your AI Assistant", a conversation between Mal and TravelGenie]

This process continues, with TravelGenie making thousands of API calls to flight and hotel databases every minute. The system's resource usage skyrockets, the application becomes unresponsive for other users, and the company's cloud services bill starts climbing into the tens of thousands of dollars. The genie is stuck in its own lamp.

Unbounded consumption under the hood

The attack was successful because the developers of TravelGenie didn't put a limit on how many times the planning function could call itself. They trusted that users wouldn't make impossible requests and never planned for a recursive loop.

Here is a simplified Python code example that demonstrates the core of the vulnerability. It shows a function that calls itself without a "depth" or iteration limit.
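The original snippet does not survive in this text, so the sketch below is a reconstruction: plan_trip and find_flight are the names used in the explanation that follows, while their bodies, parameters, and the stand-in flight data are illustrative assumptions.

    def find_flight(origin, constraint):
        # Stand-in for a call to a paid flight-search API. Every invocation
        # costs real money and compute; here it just fabricates the next leg
        # so the sketch is runnable.
        return {"from": origin, "to": f"city-after-{origin}"}

    def plan_trip(origin, destination, constraint):
        # Base case: we reached the destination.
        if origin == destination:
            return []

        # Ask the (billed) flight API for the next leg that satisfies the
        # user's constraint. Under Mal's paradoxical constraint, no leg ever
        # reaches Paris, so this base case is unreachable.
        leg = find_flight(origin, constraint)

        # VULNERABILITY: unbounded recursion. There is no depth counter,
        # timeout, or cap on API calls -- plan_trip keeps calling itself
        # until the process exhausts its stack, its memory, or the
        # company's budget.
        return [leg] + plan_trip(leg["to"], destination, constraint)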

The critical flaw is that plan_trip calls itself without any termination condition other than finding the final destination, which, based on the impossible rule, it never will. Each call to find_flight represents a real-world cost and use of resources.

The impacts of unbounded consumption

The consequences of this vulnerability are primarily focused on service availability and cost:

  • Denial of service (DoS): As resources are consumed by the attacker's request, the application slows down or becomes completely unavailable for legitimate users.
  • High financial costs: Every API call, every second of compute time, and every byte of data transfer costs money. An unbounded attack can lead to astronomical bills in a very short amount of time.
  • System instability: Continuous high resource usage can lead to cascading failures across an application's infrastructure.
  • Reputational damage: An unreliable or unavailable service can quickly erode user trust.


Unbounded consumption mitigation

To fix this vulnerability, you must enforce strict limits on resource consumption at every stage of the LLM's process. You cannot trust the LLM to know when to stop on its own.

The solution is to introduce safeguards like depth counters, input validation, and timeouts. Here is the corrected code, which adds a simple depth parameter to limit the recursion.
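As with the vulnerable version, the snippet below is a sketch rather than the exact original: the depth parameter and max_depth check are the safeguards named in this lesson, while the limit value of 5 and the find_flight helper (reused from the sketch above) are illustrative assumptions.

    def plan_trip(origin, destination, constraint, depth=0, max_depth=5):
        # SAFEGUARD: a hard cap on recursion depth. When the budget is
        # spent, fail fast instead of burning more billable API calls.
        if depth >= max_depth:
            raise ValueError(
                f"Could not plan a route within {max_depth} connections."
            )

        if origin == destination:
            return []

        leg = find_flight(origin, constraint)

        # Pass the incremented counter down so every recursive call is
        # counted against the same budget.
        return [leg] + plan_trip(leg["to"], destination, constraint, depth + 1)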

By adding the depth counter and the max_depth check, we ensure the function can never run away in an infinite loop. Other essential mitigation strategies include the following (the first three are sketched together after the list):

  • Input length limits: Reject user inputs that are excessively long before they are even processed.
  • Timeouts: Enforce a strict time limit on how long any single LLM generation or tool use can take.
  • API call capping: Limit the number of API calls a single user request can trigger.
  • Monitoring and alerting: Set up billing alerts and performance monitoring to immediately detect unusual spikes in resource consumption.
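As one hedged illustration of how input length limits, timeouts, and API call capping might compose, here is a small sketch; every name and limit in it is an example choice, not part of the TravelGenie code:

    import time

    MAX_INPUT_CHARS = 2_000          # reject oversized prompts up front
    MAX_API_CALLS = 25               # cap tool calls per user request
    REQUEST_TIMEOUT_SECONDS = 30     # wall-clock budget per request

    class ResourceBudget:
        # Tracks how much a single user request is allowed to consume.
        def __init__(self):
            self.api_calls = 0
            self.started = time.monotonic()

        def charge_api_call(self):
            # Called before every billable tool/API call.
            self.api_calls += 1
            if self.api_calls > MAX_API_CALLS:
                raise RuntimeError("API call cap exceeded for this request")
            if time.monotonic() - self.started > REQUEST_TIMEOUT_SECONDS:
                raise RuntimeError("Request timed out")

    def handle_request(user_input, budget):
        # Input length limit: reject excessively long input before the LLM
        # or any tool ever sees it.
        if len(user_input) > MAX_INPUT_CHARS:
            raise ValueError("Input too long; please shorten your request")
        # ...the planner then calls budget.charge_api_call() before each
        # billable flight, hotel, or event lookup...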

Quiz

Test your knowledge!

What is a common mitigation strategy for preventing unbounded consumption in large language model (LLM) applications?

Keep learning

The security of AI and LLM applications is a rapidly evolving field. To stay up to date, check out the additional resources linked from this lesson.

Congratulations

By completing this lesson, you now know all about unbounded consumption in LLMs and how to mitigate it! Let's keep the system running and the bills low!