• Browse topics
Login

OWASP Top 10 LLM and GenAI

~2hrs 30mins estimated

Artificial Intelligence (AI) is now seen across multiple industries. Understanding the security implications of these technologies is more important than ever. The Open Web Application Security Project (OWASP) has identifyed and addressed the most critical security risks in software development. With the rapid rise of Large Language Models (LLMs) and generative AI, OWASP has extended its expertise to highlight the top 10 security concerns specific to these advanced AI systems. This learning path is designed to provide you with a comprehensive understanding of these risks and equip you with the knowledge to secure AI-driven applications.

By exploring these lessons, you will gain insight into the vulnerabilities inherent in LLMs, the potential consequences of these risks, and the best practices to mitigate them. Whether you are a developer, security professional, or AI enthusiast, mastering these concepts is essential to ensure the safe and ethical deployment of AI technologies in the real world.

Save your learning progress.

  • Track your learning progress
  • Keep up to date with the latest vulnerabilities
  • Scan your application code to stay secure
Sign up for free

LLM01: Prompt Injection

Prompt injection is a vulnerability where attackers manipulate the input prompts of a language model to alter its behavior, potentially causing it to generate misleading, harmful, or unauthorized outputs.

LLM02: Sensitive Information Disclosure

LLMs often have access to sensitive information. They may inadvertently disclose this information, either by mistake or through prompt injection. Sensitive information could include personally identifiable information, credit card information, or proprietary information. Having any of these disclosed could lead to security breaches, privacy violations, and a distrust of the system by its users.

LLM03: Supply Chain Vulnerabilities

Supply Chain Vulnerabilities refers to the risks that arise from the use of third-party components in software development. These components can include libraries, frameworks, and other dependencies that developers incorporate into their projects. In LLMs, this can be risks from third-party models, libraries, and other dependencies.

LLM04: Data and Model Poisoning

Data and model poisoning refers to the deliberate manipulation or corruption of the data used to train machine learning models. The goal is to introduce biases, vulnerabilities, or inaccuracies into the model. Once the training data has been poisoned, it can lead to biased or unethical behavior. The LLM could also make unreliable or dangerous decisions.

LLM05: Improper Output Handling

Improper output handling occurs when the outputs generated by a large language model are not properly managed, sanitized, or validated before being used in the application. This can lead to various security risks, such as injection attacks, data leaks, or the propagation of harmful content.

LLM06: Excessive Agency

Excessive agency surfaces when LLMs, designed to automate tasks and interact with other systems, end up performing actions they shouldn’t. These unintended actions can range from sending unauthorized emails to issuing incorrect financial refunds, all due to the LLM misinterpreting prompts or simply executing tasks incorrectly.

LLM07: System Prompt Leakage

LLMs operate based on a combination of user input and hidden system prompts—the instructions that guide the model’s behavior. These system prompts are meant to be secret and trusted, but if users can coax or extract them, it’s called system prompt leakage. This vulnerability can expose business logic, safety rules, or even sensitive credentials embedded in the prompt.

LLM08: Vector and Embedding Weaknesses

Vector and embedding weaknesses refer to vulnerabilities that arise from manipulating the mathematical representations of data (vectors) that LLMs use to understand meaning and relationships. LLMs convert text into numerical vectors called "embeddings." These embeddings are stored in a specialized database called a vector database, which allows the LLM to find semantically similar information.

LLM09: Misinformation

LLMs are powerful tools for generating human-like responses, but they don’t actually “know” the truth. Instead, they predict text based on patterns in their training data. This means they can generate convincing yet completely inaccurate information—often with unwarranted confidence. This phenomenon, known as hallucination, poses a serious risk when LLMs are used in sensitive applications.

LLM10: Unbounded Consumption

Unbounded consumption is a vulnerability that occurs when a Large Language Model (LLM) application is tricked into an excessive, and often recursive, use of resources without proper limits. This can lead to a Denial of Service (DoS), where the application becomes slow or unavailable for legitimate users, and can result in surprisingly high financial costs from API usage and computing power.