Excessive agency
AI overstepping its bounds: understanding and mitigation
AI/ML
What is excessive agency?
Large language models (LLMs) that interface with external systems have introduced a new class of vulnerability known as excessive agency (LLM08 in the OWASP Top 10 for LLM applications). This issue surfaces when LLMs, designed to automate tasks and interact with other systems, end up performing actions they shouldn't. These unintended actions can range from sending unauthorized emails to issuing incorrect financial refunds, whether because the LLM misinterprets a prompt or simply executes a task incorrectly.
About this lesson
In this lesson, you will learn about vulnerabilities stemming from excessive agency and how to protect your applications against them. We will step into the shoes of a hacker who successfully exploits an AI assistant with overly broad permissions.
The incident with the AI assistant is a textbook example of excessive agency resulting from inadequate access control. Connecting the assistant to the store's backend account was intended to let it access customer-related information, making it more effective at handling customer support queries. This integration was essential for the assistant to retrieve and use information such as order details and the status of support tickets.
However, this is where the critical oversight occurred. While integrating the assistant with their backend account, the development team neglected to restrict its access rights. Ideally, the access token should have been limited to only those sections of the account suite relevant to the AI's role in customer support. Instead, the assistant was inadvertently granted far more extensive access than necessary, including capabilities such as scheduling meetings, which far exceeded its intended functional scope.
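To make the oversight concrete, here is a minimal sketch of the difference between the token the team effectively issued and a least-privilege alternative. The `BackendToken` class and the scope names are hypothetical, invented for illustration; real backends typically issue scoped credentials via OAuth or a similar mechanism.

```python
from dataclasses import dataclass

# Hypothetical token model for illustration only; these scope names are
# invented, and real backends issue scoped credentials via OAuth or a
# similar mechanism.
@dataclass(frozen=True)
class BackendToken:
    scopes: frozenset

    def allows(self, scope: str) -> bool:
        return scope in self.scopes

# What the team effectively granted: a broad, account-wide token.
overly_broad = BackendToken(frozenset({
    "orders:read", "tickets:read", "tickets:write",
    "calendar:write", "refunds:write",
}))

# What the assistant's customer-support role actually required.
least_privilege = BackendToken(frozenset({"orders:read", "tickets:read"}))

print(overly_broad.allows("calendar:write"))     # True: excessive agency
print(least_privilege.allows("calendar:write"))  # False: out of scope
```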
What is the impact of excessive agency?
The impact of excessive agency in LLM-based systems largely depends on the context in which the LLM operates and the specific permissions it has been granted. When an LLM is given broader access or capabilities than necessary for its intended purpose, it can lead to unintended and potentially harmful outcomes. This situation becomes particularly problematic when users manipulate the LLM, intentionally or not, to perform actions outside its original design parameters.
For instance, consider a customer service LLM that interfaces with a payments system to issue service credits or refunds. The developers intended to restrict refunds to a maximum of one month's subscription fee. However, this limitation was only specified in the system's instructions to the LLM and not enforced in the refund API. A malicious customer, recognizing this vulnerability, could perform a direct prompt injection attack, convincing the LLM to issue a refund far exceeding the intended limit, such as 100 years' worth of fees. This scenario illustrates how relying on an LLM to enforce policy limits through system prompts rather than embedding these limits directly into the operational APIs can lead to significant issues.
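A hedged sketch of the fix follows, assuming a hypothetical `issue_refund` endpoint and a `MONTHLY_FEE` constant: the refund cap is enforced server-side in the API itself, so a prompt-injected request is rejected regardless of what the model was convinced to do.

```python
# Hedged sketch: MONTHLY_FEE and issue_refund are hypothetical names,
# not part of any real payments API.
MONTHLY_FEE = 9.99  # one month's subscription fee

def issue_refund(customer_id: str, amount: float) -> float:
    """Refund endpoint that enforces the policy itself rather than
    trusting the LLM's system prompt to respect the limit."""
    if amount <= 0:
        raise ValueError("Refund amount must be positive")
    # Server-side cap: even a prompt-injected request for 100 years'
    # worth of fees is rejected here, independent of the model.
    if amount > MONTHLY_FEE:
        raise PermissionError(
            f"Refund {amount:.2f} exceeds the {MONTHLY_FEE:.2f} policy cap"
        )
    # ...call the real payments system here...
    return amount
```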
Excessive agency mitigation
In addressing the challenge of excessive agency in LLM systems, as demonstrated by Max's interaction with the MegaStore assistant, several strategies highlighted by OWASP are crucial for prevention and mitigation. The complete list of mitigation strategies can be found in the OWASP publication.
Firstly, it is crucial to minimize the permissions granted to the LLM. This includes limiting the plugins/tools LLM agents can access to only the minimum necessary functions. For example, a plugin that accesses a user's mailbox to summarize emails may only require the ability to read emails; it should not include other functionality such as deleting or sending messages.
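As a sketch, the tool surface exposed to such an agent might look like the following. The manifest format and tool names are illustrative, not any specific framework's API:

```python
# Illustrative tool manifest; the schema and names are hypothetical,
# not tied to any specific agent framework.
email_summary_tools = [
    {
        "name": "read_emails",
        "description": "Fetch recent emails (read-only) for summarization",
        "parameters": {"folder": "string", "limit": "integer"},
    },
    # Deliberately absent: send_email, delete_email, forward_email.
    # The model cannot call a capability that is never registered.
]
```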
Secondly, avoid open-ended functions wherever possible. For instance, instead of giving the agent a plugin that runs arbitrary shell commands, use plugins/tools with more granular functionality, such as a file-writing plugin that can only write to specific files. Additionally, limit the permissions that LLM plugins/tools are granted on other systems to the minimum necessary. An LLM agent using a product database should, for example, only have read access to a 'products' table and should not be able to insert, update, or delete records.
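One way to enforce such a restriction below the model layer is to open the database connection itself read-only, as in this minimal sketch (the `products.db` path is hypothetical and must already exist):

```python
import sqlite3

# Minimal sketch, assuming a hypothetical products.db file exists: the
# catalog is opened read-only, so the agent's SQL tool physically cannot
# INSERT, UPDATE, or DELETE, whatever the model is persuaded to try.
conn = sqlite3.connect("file:products.db?mode=ro", uri=True)

def query_products(sql: str) -> list[tuple]:
    """Read-only query tool exposed to the LLM agent."""
    # Writes on a read-only connection raise sqlite3.OperationalError,
    # enforcing the restriction below the model layer.
    return conn.execute(sql).fetchall()
```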
Implementing rate limiting is also a key strategy. Setting a maximum limit on the number of actions an LLM can execute within a specific timeframe acts as a control mechanism. This approach is particularly effective in averting situations where an LLM could initiate a series of rapid and potentially damaging actions due to programming errors or external manipulation.
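A minimal sketch of such a control, using an invented `ActionRateLimiter` class with illustrative limits, might look like this:

```python
import time
from collections import deque

# Minimal sketch of a sliding-window rate limiter for LLM-initiated
# actions; the class name and limits are illustrative.
class ActionRateLimiter:
    def __init__(self, max_actions: int = 10, window_seconds: float = 60.0):
        self.max_actions = max_actions
        self.window = window_seconds
        self.timestamps: deque = deque()

    def allow(self) -> bool:
        now = time.monotonic()
        # Discard actions that have aged out of the window.
        while self.timestamps and now - self.timestamps[0] > self.window:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.max_actions:
            return False  # budget exhausted: reject or defer the action
        self.timestamps.append(now)
        return True

limiter = ActionRateLimiter(max_actions=10, window_seconds=60)
if not limiter.allow():
    raise RuntimeError("LLM action budget exceeded; flag for review")
```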
In addition, integrating human-in-the-loop control is a vital safeguard, requiring human approval before the LLM executes an action. This ensures human oversight, particularly for actions that could have significant impacts or involve sensitive data or operations.
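A minimal sketch of such a gate follows; the tool names and the `execute_tool` dispatcher are hypothetical placeholders for whatever your agent framework provides:

```python
# Minimal sketch of a human-in-the-loop gate; the tool names and the
# execute_tool dispatcher are hypothetical placeholders.
HIGH_IMPACT_TOOLS = {"issue_refund", "delete_ticket", "schedule_meeting"}

def execute_tool(tool_name: str, args: dict) -> str:
    """Stand-in dispatcher; a real agent framework would route this."""
    return f"executed {tool_name} with {args}"

def run_with_approval(tool_name: str, args: dict) -> str:
    if tool_name in HIGH_IMPACT_TOOLS:
        # Pause and require an explicit human decision before acting.
        answer = input(f"Approve {tool_name}({args})? [y/N] ")
        if answer.strip().lower() != "y":
            return "Action rejected by human reviewer"
    return execute_tool(tool_name, args)
```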
Additionally, it is important to implement authorization in downstream systems rather than relying on the LLM to decide whether an action is allowed. When implementing tools/plugins, enforce the complete mediation principle so that every request made to downstream systems via the plugins/tools is validated against security policies.
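The sketch below illustrates complete mediation under assumed names (the `support-assistant` principal and scope strings are invented): the downstream API checks policy on every request, so the assistant is denied write operations no matter what it asks for.

```python
# Minimal sketch of complete mediation: the downstream API validates
# every request against a security policy instead of trusting the LLM.
# The principal and scope names are illustrative.
PERMISSIONS = {
    "support-assistant": {"orders:read", "tickets:read"},
}

def mediate(principal: str, required_scope: str) -> None:
    """Check every request against policy; no exceptions, no caching."""
    if required_scope not in PERMISSIONS.get(principal, set()):
        raise PermissionError(f"{principal} lacks {required_scope}")

def get_order(principal: str, order_id: str) -> dict:
    mediate(principal, "orders:read")   # allowed for the assistant
    return {"order_id": order_id, "status": "shipped"}  # stubbed lookup

def cancel_order(principal: str, order_id: str) -> dict:
    mediate(principal, "orders:write")  # denied for the assistant
    return {"order_id": order_id, "status": "cancelled"}
```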
Finally, while some options may not prevent excessive agency outright, they can limit the damage it causes. Logging and monitoring the activity of LLM plugins/tools and downstream systems makes it possible to identify where undesirable actions are taking place and respond accordingly. Rate limiting, as discussed above, also reduces the number of undesirable actions that can occur within a given time, increasing the opportunity to discover them through monitoring before significant damage occurs.
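As a minimal sketch, audit logging of tool calls could look like the following; the field layout is illustrative, and real deployments would forward these records to a monitoring or SIEM pipeline:

```python
import logging

# Minimal sketch of audit logging for LLM tool calls; the field layout
# is illustrative.
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
audit = logging.getLogger("llm.audit")

def log_tool_call(tool_name: str, args: dict, outcome: str) -> None:
    # One structured record per action makes anomalies, such as a burst
    # of refunds, visible to monitoring before major damage accumulates.
    audit.info("tool=%s args=%s outcome=%s", tool_name, args, outcome)

log_tool_call("issue_refund", {"amount": 9.99}, "allowed")
```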
Test your knowledge!
Keep learning
Learn more about excessive agency and other LLM vulnerabilities.
- OWASP slides about their new LLM top 10 list
- An OWASP publication on their new LLM top 10 list
- The general overview page of OWASP about the LLM top 10 list
- A Snyk publication on the OWASP LLM top 10 list