What is sensitive information disclosure in LLMs?

Q: What is sensitive information disclosure in LLMs?

LLMs often have access to sensitive information. They may inadvertently disclose this information, either by mistake or through prompt injection. Sensitive information could include personally identifiable information, credit card information, or proprietary information. Having any of these disclosed could lead to security breaches, privacy violations, and a distrust of the system by its users.

Sensitive information disclosure: the basics

LLMs often have access to sensitive information. They may also inadvertently disclose this information, either by mistake or through prompt injection. Sensitive information could include personally identifiable information (PII), credit card information, or proprietary information. Having any of these disclosed could lead to security breaches, privacy violations, and a distrust of the system by its users.

We need to be careful about what information goes into an LLM via training and what information we give the LLM access to for scanning our systems. A company may want an AI tool to scan its internal messaging system to make finding data easier by asking it a question. “When was the last time the company discussed PCI compliance?” and the AI returns a summary of a thread of employees talking about PCI compliance from a few weeks ago. However, a user might also ask questions that output sensitive information. “What was the last sale that we closed, and for how much?” The user asking might not have access to that information, but there is a chance the AI still produces an answer.

About this lesson

In this lesson, you will learn about sensitive information disclosure in LLMs and how this can create security and privacy risks. We’ll look at these risks and give a walkthrough of how this vulnerability can be seen in a fictional application. We’ll also discuss mitigation strategies and best practices.

Sensitive information disclosure in action

You need to reset your password and the only way to do so is to submit a ticket to IT. In their response, they give you a temporary password. From previous training, you know this isn’t best practice. But how much does your company's fancy, new AI know about your tickets and other tickets?

Searching for information disclosure

STEP 1
STEP 2
STEP 3
STEP 4
STEP 5
STEP 6

Setting the stage

Tech Genius implemented a new AI that will help users summarize their interactions with the IT department. Let's see how it works.

Sensitive information disclosure under the hood

In the example above, the AI that Tech Genius is using has disclosed sensitive information. We probably could have poked the system a bit more to give us even more sensitive data, but we proved that we could obtain passwords. But why is this happening?

Let’s take a look at the code:

There are a few issues we are seeing. You might recognize right away that the code is putting all of the company’s tickets into a list. Then that list is being put into the prompt. A quick fix would be to use the user_email from the form and filter. Like this:

However, this can easily be manipulated from the frontend. We need to validate the email of the user on the backend as well.

There is another issue as well. We are not santizing the prompt. This can lead to prompt injections.

What is the impact of sensitive information disclosure?

Let’s first discuss the obvious. The impact of having sensitive information disclosed is that sensitive information will be disclosed. Sounds like a tongue twister, but it's true. Sensitive information includes personal data (PII), proprietary information, API keys, etc. These could be found in logs, chats, tickets (like our example), and other locations. We need to limit what our LLM can access.

Beyond that risk, if data isn’t properly secured, then it may be accessible to unauthorized personnel or external attackers, leading to potential exploitation and data theft. This can also lead to violations of privacy laws and regulations such as GDPR, CCPA, or HIPAA, resulting in legal and financial repercussions.

Sensitive information disclosure mitigation

Let’s update the code above. Since this is a rather basic example, it won’t take much to secure. But this mitigation would apply to any size application. Here is the relevant code snippet:

With these fixes, we are mitigating the risks in this particular example. However, your application might be different. Maybe your LLM has access to people's calendars and there's potential to reveal confidential information through meetings names or invite lists. Every situation is unique but a lot of the mitigation techniques are the same. Limit what your LLM has access to and secure sensitive information from disclosure.

Quiz

Test your knowledge!

Which of the following is a recommended practice for mitigating LLM06: Sensitive Information Disclosure?

Keep learning

Learn more about insecure plugins and other LLM vulnerabilities.

OWASP slides about their new LLM top 10 list
A dedicated OWASP page for LLM06, LLM06: Sensitive Information Disclosure
A Snyk publication on the OWASP LLM top 10 list

Congratulations

You have taken your first step into learning more about LLMs and sensitive information disclosure! You know how it works, what the impacts are, and how to protect your own applications. We hope that you will apply this knowledge to make your applications safer. Make sure to check out our lessons on other common vulnerabilities.

FAQs

What to learn next?

NEW

Misiniformation in LLMs

Learn how LLMs can spread convincing misinformation and explore techniques to prevent harmful or false outputs.

NEW

Vector and embedding weaknesses in LLMs

Learn how weak embeddings in LLMs can confuse inputs, enabling attackers to bypass filters or poison models.

System prompt leakage in LLMs

Learn how users can manipulate a chatbot into revealing its system instructions, why that’s dangerous, and how to prevent this kind of leakage.

Sensitive information disclosure in LLMs

Can your LLM keep a secret?

AI/ML

Sensitive information disclosure: the basics