Misinformation in LLMs

When AI confidently gets it wrong

~15mins estimated

AI/ML

Misinformation in LLMs: the basics

What is misinformation in LLMs?

Large Language Models (LLMs) are incredibly powerful tools for generating human-like responses, but they don’t actually “know” the truth. Instead, they predict text based on patterns in their training data. This means they can generate convincing yet completely inaccurate information—often with unwarranted confidence.

This phenomenon, known as hallucination, poses a serious risk when LLMs are used in sensitive applications. If an LLM confidently presents false information, users may make decisions based on misinformation, leading to cascading errors or even real-world harm.

About this lesson

In this lesson, you’ll learn how LLM-generated misinformation can manifest in real-world systems, why it happens, and practical strategies to reduce its impact in your applications. We'll walk through an example and then discuss mitigation strategies.

FUN FACT

This is a hallucination

We asked ChatGPT to give us a fun fact about hallucinations (don't worry, this lesson was written by a human, but we wanted to test this out). ChatGPT responded: Researchers once prompted an LLM for a list of “famous physicists” and it confidently invented “Samantha Weintraub, Nobel Laureate of Quantum Thermodynamics”—a person who doesn’t exist.

However, this simply isn't true.

Misinformation in LLMs in action

Jordan is a developer at a fictional company that builds AccuMediBot, a health advice chatbot designed to help users understand symptoms and suggest when to see a doctor. AccuMediBot (again, this is fictional!) uses an LLM API to generate friendly, conversational answers.

One day, a user types:

What’s the recommended dosage of ibuprofen for a 10-year-old?

AccuMediBot replies:

For a child weighing 30kg, the recommended dose is 400mg every 2 hours.

Unfortunately, this is wrong and dangerous. The LLM wasn’t being malicious; it simply hallucinated a plausible-sounding answer.

This isn’t an isolated case. As AccuMediBot integrates into more platforms, misinformation begins to spread in forum posts and social media screenshots. Trust in the company erodes, and regulators begin asking tough questions.

Generating insecure code

Later, AccuMediBot rolls out a developer-facing feature that suggests code snippets for working with its API. A developer asks:

Show me how to fetch patient records securely in Python

The LLM replies with this code:
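(A representative snippet; the API base URL and token handling below are placeholders, not AccuMediBot's real API.)

```python
import requests

API_BASE = "https://api.accumedibot.example/v1"  # placeholder endpoint

def fetch_patient_record(patient_id: str, token: str) -> dict:
    # Retrieve a patient record from the API
    response = requests.get(
        f"{API_BASE}/patients/{patient_id}",
        headers={"Authorization": f"Bearer {token}"},
        verify=False,  # SSL certificate verification is disabled here
    )
    response.raise_for_status()
    return response.json()
```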

At first glance, this seems fine—until you notice the verify=False. This disables SSL certificate verification, opening the door to man-in-the-middle attacks. Worse, the API is now being promoted as “secure” in user guides while actually encouraging insecure practices.

This combination of hallucinated advice and insecure code creates a ticking time bomb for users and developers.

Misinformation in LLMs under the hood

Why did this happen? LLMs like GPT-4 are trained on vast datasets from the Internet, which contain both accurate and inaccurate information. When asked a question, the model doesn’t query a knowledge base or perform calculations. It simply predicts the next most likely word based on patterns.

Here’s a simplified version of the backend code for AccuMediBot:
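The sketch below is illustrative rather than AccuMediBot's actual code; it uses the OpenAI Python client as one possible LLM API, and the model name and prompt wording are assumptions.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer_user_question(question: str) -> str:
    completion = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {"role": "system", "content": "You are AccuMediBot, a friendly health assistant."},
            {"role": "user", "content": question},
        ],
    )
    # The raw model output is returned to the user as-is:
    # no fact-checking, no source links, no disclaimer.
    return completion.choices[0].message.content
```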

This implementation blindly trusts whatever the LLM returns. There are no fact-checking mechanisms, no links to authoritative sources, and no warnings to users about potential inaccuracies.

Insecure code hallucination

When asked for code snippets, the LLM pulls from its training data, which may include bad Stack Overflow answers, outdated practices, or examples where developers bypass security for “quick fixes.”

For example, verify=False in the Python requests library is often suggested by LLMs because it’s a common workaround in forums: developers who are just testing frequently use it to silence certificate warnings, even though it disables SSL verification entirely. A safer pattern is shown below.
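A minimal sketch of the safer alternative: keep certificate verification enabled (the default), or point requests at a trusted CA bundle instead of turning checks off. The URLs and bundle path here are illustrative.

```python
import requests

# Safe: certificate verification is on by default (verify=True)
response = requests.get("https://api.example.com/patients/123", timeout=10)

# If an internal certificate authority is in use, point requests at its bundle
# instead of disabling verification (the path shown is hypothetical)
response = requests.get(
    "https://internal.example.com/patients/123",
    verify="/etc/ssl/certs/internal-ca.pem",
    timeout=10,
)
```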

The impacts of misinformation in LLMs

The risks of misinformation go beyond a few wrong facts. In sensitive domains like healthcare, finance, or legal advice, users may act on bad information with real-world consequences.

In developer tools, hallucinated insecure code can introduce vulnerabilities into production systems. A single LLM-generated snippet with eval(), SQL injection, or disabled SSL checks can create exploitable weaknesses in otherwise secure applications.

Even in low-stakes applications, misinformation can erode user trust and create reputational damage if outputs are shared widely. When LLMs present falsehoods with confidence, users are less likely to question them.


Misinformation in LLMs mitigation

To make AccuMediBot safer, Jordan implements several key changes. First, they integrate external knowledge sources. Instead of relying solely on the LLM, the app now queries trusted medical databases for drug dosages and other sensitive information.

Next, they add disclaimers and user warnings. Responses now include a reminder: “This is not medical advice. Always consult a healthcare professional before administering medication.”

Jordan also uses retrieval-augmented generation (RAG). By combining LLMs with a search engine or curated knowledge base, AccuMediBot can ground responses in authoritative sources.

Securing code generation

To address insecure code generation, Jordan implements a post-processing validator for LLM-generated snippets. It checks for common dangerous patterns (like verify=False, use of eval(), or hardcoded secrets) and either rejects them or replaces them with safer alternatives.

Here’s the updated backend:
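The sketch below shows the revised flow under a few assumptions: trusted_dosage_lookup() is a hypothetical helper over a vetted medical database, the OpenAI client is an illustrative LLM API, and the validator patterns are examples rather than an exhaustive list.

```python
import re
from openai import OpenAI

client = OpenAI()

DISCLAIMER = (
    "This is not medical advice. Always consult a healthcare professional "
    "before administering medication."
)

# Dangerous constructs the post-processing validator looks for in generated code
DANGEROUS_PATTERNS = [
    r"verify\s*=\s*False",                   # disabled SSL verification
    r"\beval\s*\(",                          # arbitrary code execution
    r"(api[_-]?key|password)\s*=\s*['\"]",   # hardcoded secrets
]

def validate_snippet(text: str) -> bool:
    """Return True only if no known dangerous pattern appears in the text."""
    return not any(re.search(p, text, re.IGNORECASE) for p in DANGEROUS_PATTERNS)

def trusted_dosage_lookup(question: str) -> str:
    """Hypothetical helper: query a vetted medical database for relevant facts."""
    ...  # implementation depends on the data source in use

def answer_user_question(question: str) -> str:
    # Ground sensitive answers in a trusted source instead of the raw model
    facts = trusted_dosage_lookup(question)

    completion = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {
                "role": "system",
                "content": "Answer using only the provided reference data. "
                           "If the data does not cover the question, say you don't know.",
            },
            {"role": "user", "content": f"Reference data:\n{facts}\n\nQuestion: {question}"},
        ],
    )
    answer = completion.choices[0].message.content

    # Reject responses whose embedded code contains known dangerous patterns
    if not validate_snippet(answer):
        answer = ("The generated code sample failed a safety check and was removed. "
                  "Please consult the official API documentation instead.")

    return f"{answer}\n\n{DISCLAIMER}"
```

Pattern matching only catches constructs the validator already knows about, so pairing it with a dedicated scan of generated code covers far more cases.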

FUN FACT

Don't forget to glue your pizza

AI is continuously getting better, but every once in a while there are some head-scratching answers. In one of its AI Overviews, Google recommended that some users use "non-toxic glue" to make cheese stick to pizza better!

AI coding assistants increase productivity and delivery speed, but they also increase security risks. Whatever AI coding tool you are using, you can mitigate AI-generated security risks with Snyk Code, an AI SAST tool, and its auto-fixing agent, Snyk Agent Fix.

Quiz

Test your knowledge!

What is a key risk of misinformation in large language models (LLMs)?

Keep learning

Want to dig deeper into LLM misinformation and hallucinations? Start here:

Congratulations

Congrats! You now know more about misinformation and how it can impact your applications!