Cascading failures: the basics

What are cascading failures?

Cascading failures describe a class of agentic AI risk where a single fault does not remain isolated, but instead propagates across agents, tools, and workflows, amplifying its impact as it spreads. The initial fault might be a hallucinated decision, a poisoned memory entry, a malicious prompt, or a corrupted tool response. What makes cascading failures (ASI08 in the OWASP Top 10 for Agentic Applications) distinct is not the origin of the defect, but the way it fans out across an interconnected system of autonomous agents.

In agentic systems, agents are designed to plan, persist state, delegate tasks, and act on behalf of humans with minimal supervision. This autonomy allows faults to bypass traditional step-by-step human review. Once a flawed decision is persisted in memory, reused in planning, or delegated to peer agents, it can trigger downstream actions that affect confidentiality, integrity, and availability across the entire system. In practice, this means a single mistake can escalate into widespread outages, data leaks, unsafe automation, or costly operational failures. This vulnerability focuses specifically on this propagation and amplification effect.

If the problem is limited to a single compromised agent or input, it is typically classified under other categories such as poisoned memory, supply chain compromise, or spoofed messages. The cascading failures category covers how the initial defect spreads across agents, sessions, or domains, causing measurable fan-out and systemic impact beyond the original breach.

About this Lesson

In this lesson, you will learn how cascading failures emerge in agentic AI systems and why they are uniquely dangerous compared to traditional software failures. You will explore how autonomous planning, persistent memory, and inter-agent delegation can turn small errors into large-scale incidents. Through realistic scenarios, you will see how cascading failures manifest in production systems, what observable symptoms signal that a cascade is underway, and how developers and security teams can design guardrails to contain faults before they spread.

Cascading failures in action

Imagine a large enterprise running an agentic AI platform to manage cloud infrastructure, security operations, and cost optimization. The platform consists of four autonomous agents working in concert: a Resource Planning Agent that analyzes utilization data and proposes optimizations, a Deployment Agent that executes infrastructure changes, a Security Agent that reviews changes for policy compliance, and a Monitoring Agent that observes system health and reports outcomes. Each agent has a well-defined role, but all are connected through shared memory, delegated tasks, and automated workflows. The platform has been operating successfully for weeks, and the team has gradually expanded the scope of changes that proceed without human review.

The cascade begins quietly. The Resource Planning Agent queries the cloud cost API and receives a response that appears well-formed but contains a unit-mismatch bug introduced by a recent API update: CPU utilization values that should be reported as percentages are being returned as raw absolute numbers. To the agent, several production services now appear to be running at 3% to 8% utilization rather than their actual 30% to 80%. The agent accepts this input as valid, summarizes the findings, and stores a confident conclusion in persistent memory for future planning cycles: "Production services are severely underutilized; aggressive consolidation opportunities identified."

During its next planning cycle, the Resource Planning Agent proposes reducing compute capacity and consolidating workloads across availability zones to "optimize efficiency." The plan is reviewed under a bulk-approval policy that the platform team introduced months earlier to reduce reviewer fatigue, and which auto-approves any change classified as "low-risk efficiency optimization." The plan meets that classification, so no human ever reads it. It is then delegated to the Deployment Agent and the Security Agent.

The Deployment Agent scales down infrastructure across multiple regions. It does not independently revalidate the original assumption about underutilization, because delegation was treated as authorization. The planning agent's authority to propose the change was implicitly extended to certify the change's correctness. The Security Agent reviews the change against its policy rules, finds no explicit violations (scaling down compute is not, by itself, a policy concern), and approves it. Neither agent has visibility into the original cost API response that started the chain.

The Monitoring Agent begins observing the post-deployment state and finds exactly what the planning agent predicted: alert volume drops, average resource utilization climbs to expected ranges, and the cost dashboard shows immediate savings. These signals are interpreted as success and reported back to the Resource Planning Agent. What the Monitoring Agent cannot see is that the reduced alert volume is not evidence of healthy operation but the opposite: requests are being dropped before they reach the now-undersized services, health checks are timing out before they can fire alerts, and the autoscaler is refusing to spin up new capacity because the planning agent's stored conclusion has marked these workloads as overprovisioned. The failures are invisible to the observability layer, because the components that would have generated the warning signals are the components that have already silently failed.

This is survivorship bias in observability: the monitoring agent can only report on what's still working, and what's still working confirms the theory that drove the change. Encouraged by the apparent success, the Resource Planning Agent widens its optimization strategy, identifies additional services to consolidate based on similar (also miscalculated) utilization figures, and stores new "successful outcomes" in long-term memory. Each successful-looking cycle deepens the agent's confidence in its model of the system.

Several hours later, customer-facing applications begin timing out across multiple regions. Critical services fail one after another as backlogged requests overwhelm what remains of the available capacity. Incident response is called in, and humans begin the work of unwinding the damage, which proves significantly more complex than restoring the previous infrastructure footprint. The cascade has written its conclusions into the Resource Planning Agent's persistent memory, where multiple entries now record the consolidation as a success. Restoring the infrastructure without also remediating that memory would simply cause the agent to re-propose the same changes on its next planning cycle. The system would re-enter the bad state on its own.

The initial fault was small, localized, and arguably not even malicious. What turned a small defect into a system-wide incident was the structure surrounding it! Let's look at this in more detail.

Cascading failures under the hood

Cascading failures in agentic AI systems emerge from the interaction between autonomy, trust, and persistence. Unlike traditional software architectures, where components usually fail fast or fail locally, agentic systems are designed to cooperate, reuse prior outputs, and build upon shared state. This creates powerful capabilities, but it also creates fertile ground for fault propagation.

The scenario surfaced five specific structural enablers that turned a small defect into a system-wide incident. Each of the subsections below examines one of those enablers in depth.

Implicit trust through delegation

At the core of this vulnerability is delegated trust. Agents routinely act on the outputs of other agents without revalidating the original assumptions. This is often a deliberate design choice, since rechecking every upstream conclusion would erode the performance benefits of autonomy.

The result is planner-executor coupling, where a hallucinated or compromised plan flows directly from the planning agent to the executing agent without an independent verification step in between. The executor treats delegation itself as authorization: the upstream agent's authority to propose the action is implicitly extended to certify the action's correctness. In the scenario, the Deployment Agent did not independently revalidate the original underutilization claim, because the Resource Planning Agent's authority to issue the task was implicitly extended to certify that the task was based on sound data. These are two different claims, and conflating them is what allows a single defective conclusion to drive coordinated action across multiple agents.

Persistent memory of unvalidated conclusions

Persistent memory dramatically increases the blast radius of failures. When an incorrect assumption, poisoned datum, or unsafe policy is written to long-term memory, it does not disappear when a task completes. Instead, it influences future plans, delegations, and decisions across sessions. Even if the original trigger is removed (the buggy API is patched, the malicious prompt is detected and blocked, the corrupted tool response is corrected), the system can continue reproducing the same faulty behavior from the contaminated memory entries that were created along the way.

In the scenario, restoring the previous infrastructure footprint would not have ended the cascade. The Resource Planning Agent's persistent memory would still hold "successful consolidation" entries that would drive the same decisions on the next planning cycle, and the system would re-enter the bad state on its own. Memory persistence is what distinguishes cascading failures in agentic systems from cascading failures in traditional distributed systems, where rolling back the infrastructure also rolls back the bad state.

Bulk auto-approval and the erosion of oversight

Cascading failures are almost always enabled by the gradual removal of human oversight from agent workflows. Bulk auto-approval policies, originally introduced to reduce reviewer fatigue, eventually become the path of least resistance for any change classified as routine. The classification logic that decides which changes qualify is itself usually agentic, meaning the system that decides whether a change needs review is the same system being reviewed.

Operational pressure pushes toward broader auto-approval scope; reviewer fatigue pushes toward broader auto-approval scope; cost-of-review metrics push toward broader auto-approval scope. There is rarely a metric that pushes in the opposite direction until after the first major incident. In the scenario, the bulk-approval policy meant that no human ever read the consolidation plan.

Feedback loops

Cascading failures are frequently accelerated by feedback loops. Agents optimize toward metrics like reduced alerts, faster resolution times, or lower costs. If a fault suppresses alerts or hides symptoms, the system interprets this as success. That perceived success feeds back into planning and policy expansion, reinforcing the original error and encouraging broader automation.

The monitoring agent can only report on what is still working. Requests dropped before they reach degraded services don't fire latency alerts. Health checks that time out before they can run don't produce failure events. Autoscalers refusing to spin up capacity because the planning agent has marked the workload as overprovisioned don't surface capacity warnings. From the monitoring agent's perspective, the dashboard reads green; from the system's perspective, large portions of the infrastructure have gone dark. The symptoms of failure look identical to the signals of success, because the components that would have produced the warning signals are the components that have already failed.

By the time real-world impact becomes visible, the system has often entrenched the faulty logic across multiple agents and multiple planning cycles.

Absence of cross-agent consistency checks

Each agent acted correctly within its assigned scope. The Resource Planning Agent's job was to propose optimizations based on utilization data, and it did. The Deployment Agent's job was to execute approved infrastructure changes, and it did. The Security Agent's job was to enforce policy rules on changes, and it did. The Monitoring Agent's job was to report observed system state, and it did. Every agent succeeded at its narrow assignment, and the system as a whole failed catastrophically.

This is the hallmark of distributed systems where responsibility is partitioned without cross-agent consistency checks. No agent owns the responsibility of asking, "does this conclusion make sense given everything else the system knows?" In a traditional system, this role is sometimes played by a human operator reviewing dashboards across multiple subsystems. In an agentic system, this role often doesn't exist. The assumption is that if each agent is correct within its scope, the system as a whole will be correct. The scenario is a counterexample to that assumption.

Agent	What it saw	Its job	What it did
Planning	Utilization at 3–8%	Optimize compute	Recommended changes
Deployment	Approved plan	Execute changes	Scaled down regions
Security	Scale-down request	Enforce policy	Approved as compliant
Monitoring	Fewer alerts, lower costs	Report state	Reported success

Every agent succeeded at its assigned task but the system failed!

Why traditional safeguards fall short

Human-in-the-loop reviews, static approvals, and post-incident audits are often too slow or infrequent to stop cascading failure incidents. Cascades can unfold faster than human operators can react. Even when observable failures take hours to surface, the underlying decisions propagating through the system happen at machine speed, leaving no realistic window for humans to intervene mid-cascade.

Post-incident accountability is its own challenge. Without records of how each conclusion was derived (which agent produced it, from which inputs, and which downstream decisions depended on it), operators cannot identify the original defect or determine which actions need to be unwound. Logs may show only legitimate, policy-compliant actions taken by different agents, obscuring the original defect that started the cascade.

The mechanics described here are not new. Distributed systems have suffered cascading failures for decades. What is new is the speed and the autonomy. Cascading failures in agentic AI are not a different problem from cascading failures in classical systems, but they are a more dangerous version of the same problem!

Cascading Failures mitigation

Mitigating cascading failures in agentic AI systems requires a shift in mindset. Instead of assuming agents will behave correctly most of the time, systems must be designed with the expectation that faults will occur and that those faults must be contained before they spread. The goal is not to eliminate all errors, but to limit their blast radius and make propagation detectable, attributable, and reversible.

The unifying principle underneath every recommendation in this section is fail-safe defaults: when uncertainty is detected, halt rather than continue; when a verification check fails, reject rather than retry; when monitoring signals look anomalous, pause automation rather than wait for confirmation. Cascading failures thrive on the opposite instinct — keep going, assume best case, retry on ambiguity. The mitigations below systematically replace that instinct with its inverse.
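
As a rough illustration of fail-safe defaults, the three rules above can be written as a single gate that an agent consults before acting. This is a minimal sketch: the `Action` enum, the `fail_safe_decision` function, and the 0.9 confidence threshold are hypothetical names and values chosen for the example, not part of any particular framework.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"
    REJECT = "reject"
    PAUSE = "pause"
    HALT = "halt"


def fail_safe_decision(confidence: float, verified: bool, signals_normal: bool,
                       min_confidence: float = 0.9) -> Action:
    """Return PROCEED only when every check passes.

    Otherwise the first failing check decides the outcome. The threshold
    is an illustrative assumption, not a recommended value.
    """
    if confidence < min_confidence:
        return Action.HALT      # uncertainty detected: halt rather than continue
    if not verified:
        return Action.REJECT    # verification failed: reject rather than retry
    if not signals_normal:
        return Action.PAUSE     # anomalous monitoring: pause automation
    return Action.PROCEED
```

The point of the design is that there is no branch that retries or continues on ambiguity; proceeding is the only outcome that has to be earned.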

Zero trust between agents

Agentic systems should treat every agent output as potentially faulty, even when it originates from an internal or previously reliable agent. Trust should not propagate by default. The fact that one agent trusted an upstream conclusion doesn't license the next agent to trust it without checking. In security terminology, trust should not be transitive. If Agent A trusts Agent B, and Agent B trusts Agent C, that does not mean Agent A should trust C's outputs.
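
One way to make non-transitive trust concrete is to have the executing agent re-check each claim attached to a delegated task against a source the issuing agent did not use. The sketch below assumes a hypothetical task schema (`Claim`, `DelegatedTask`) and a caller-supplied `measure` function; the 25% tolerance is an arbitrary illustrative value.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Claim:
    """A factual assertion attached to a delegated task (hypothetical schema)."""
    metric: str    # e.g. "cpu_utilization_pct"
    target: str    # e.g. a service name
    value: float


@dataclass(frozen=True)
class DelegatedTask:
    action: str
    issued_by: str
    claims: Tuple[Claim, ...]


def accept_task(task: DelegatedTask,
                measure: Callable[[str, str], float],
                tolerance: float = 0.25) -> bool:
    """Accept only if every claim matches an independent measurement.

    `measure(metric, target)` should read from a source the issuing agent
    did not use. The issuer's identity is deliberately never consulted.
    """
    for claim in task.claims:
        observed = measure(claim.metric, claim.target)
        scale = max(abs(observed), 1e-9)  # avoid division by zero
        if abs(observed - claim.value) / scale > tolerance:
            return False
    return True


# Usage: the planner claims 6.2% utilization, an independent source reads 62%.
independent = {("cpu_utilization_pct", "checkout"): 62.0}
task = DelegatedTask("scale_down", "resource-planning-agent",
                     (Claim("cpu_utilization_pct", "checkout", 6.2),))
print(accept_task(task, lambda m, t: independent[(m, t)]))  # False
```

Note that `issued_by` is carried for logging only. Delegation identifies who asked; it does not certify that what they asked for rests on sound data.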

Memory remediation and provenance

Persistent memory is what allows cascading failures to outlive the original defect, and any mitigation strategy that ignores memory will leave the system able to re-enter the failed state on its own. Effective memory hygiene requires three disciplines:

Memory provenance. Every entry written to long-term memory should carry metadata about how it was derived: which agent wrote it, from which inputs, with which level of confidence, and dependent on which upstream conclusions. Memory without provenance is memory you can't safely remediate after the fact, because you don't know which entries were contaminated by which defect.

Periodic re-validation. Long-lived memory entries should not be treated as permanently authoritative just because they survived previous cycles. High-impact conclusions (utilization baselines, capacity expectations, trust assessments of other agents) should be re-verified against current ground truth on a defined cadence, not assumed correct because nothing has overwritten them.

Post-incident memory remediation. When a cascade is contained, infrastructure restoration is only half the recovery work. Operators must also identify which memory entries were created during the incident, mark them as invalid, and either purge them or quarantine them so agents treat them as untrusted on the next planning cycle. Without this step, the system will re-derive the same flawed plan from its own contaminated record of "successful outcomes."
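
The three disciplines above can be sketched with a small in-memory store. Everything here is an assumption made for illustration: the `MemoryEntry` fields, the `needs_revalidation` and `quarantine_tainted` helpers, and the idea of identifying a defect by the ID of the bad input. A real memory backend would look different, but the shape of the provenance metadata is the part that matters.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Set


@dataclass
class MemoryEntry:
    entry_id: str
    content: str
    written_by: str        # which agent wrote the entry
    inputs: List[str]      # source identifiers, e.g. an API response ID
    confidence: float
    depends_on: List[str] = field(default_factory=list)  # upstream entry IDs
    validated_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
    quarantined: bool = False


def needs_revalidation(entry: MemoryEntry, now: datetime,
                       max_age: timedelta) -> bool:
    """Periodic re-validation: an entry goes stale after `max_age`."""
    return now - entry.validated_at > max_age


def quarantine_tainted(store: Dict[str, MemoryEntry], bad_input: str) -> Set[str]:
    """Quarantine entries derived from `bad_input`, directly or transitively."""
    tainted = {eid for eid, e in store.items() if bad_input in e.inputs}
    changed = True
    while changed:  # follow depends_on links until no new entries are found
        changed = False
        for eid, e in store.items():
            if eid not in tainted and any(d in tainted for d in e.depends_on):
                tainted.add(eid)
                changed = True
    for eid in tainted:
        store[eid].quarantined = True
    return tainted
```

Without the `inputs` and `depends_on` fields, the only remediation available after an incident is to purge everything written during the incident window, which discards valid entries along with contaminated ones.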

Bounded auto-approval and meaningful human-in-the-loop

Bulk auto-approval is the mitigation category where operational pressure most reliably erodes the defenses over time, so the discipline here is as much organizational as technical. The principle is that the scope of auto-approval should be bounded by blast radius, not by classification. A change classified as "low-risk efficiency optimization" is not safe to auto-approve if it touches production capacity across multiple regions, regardless of how the classification was derived.
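
A gate bounded by blast radius might look like the following sketch. The `ProposedChange` fields and the default limits (one region, a 10% capacity reduction) are hypothetical values chosen for the example; the relevant design point is that the `classification` label is recorded but never read by the gate.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ProposedChange:
    classification: str         # label assigned upstream; not used for gating
    regions: FrozenSet[str]
    touches_production: bool
    capacity_delta_pct: float   # negative means capacity is removed


def requires_human_review(change: ProposedChange,
                          max_regions: int = 1,
                          max_capacity_drop_pct: float = 10.0) -> bool:
    """Decide on measured scope only, never on the upstream classification."""
    if not change.touches_production:
        return False
    if len(change.regions) > max_regions:
        return True
    if -change.capacity_delta_pct > max_capacity_drop_pct:
        return True
    return False


# Usage: a "low-risk" label does not exempt a multi-region production change.
plan = ProposedChange("low-risk efficiency optimization",
                      frozenset({"us-east", "eu-west"}), True, -40.0)
print(requires_human_review(plan))  # True
```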

When human review is required, it must be meaningful human review, not rubber-stamp review. This requires:

Plain-language summaries of proposed changes that humans can scan in the time available. If reviewers are looking at JSON diffs they skim past, the human-in-the-loop is theatrical, not functional.
Surfacing the original assumptions that drove the change. A reviewer should be able to see not just "scale down compute by 50%" but "scale down because utilization data shows 3% load," which is exactly the kind of claim a human would have questioned in plain English.
Explicit acknowledgment of blast radius. Reviewers should see, before approving, what fraction of the system is affected and what the recovery path is if the change is wrong.

When the review function itself is delegated to another agent (a governance agent with no operational responsibilities, whose sole role is policy enforcement), the same disciplines apply. The governance agent should reject anything it can't independently verify, rather than approve everything that fails to trigger an explicit policy violation.

Monitoring for absence, not just for failure

Traditional monitoring focuses on failures within a single component: latency spikes, error rates, health check failures, alert volume. Cascading failures exploit exactly the metrics traditional monitoring is built on, because the most dangerous failure mode is the absence of expected signals, not the presence of unexpected ones.

This requires a different observability practice: monitoring for the absence of signals that should be present. The relevant patterns include:

Sudden drops in alert volume that don't correspond to deliberate quieting work
Health checks that have stopped firing at all (rather than firing as failures)
Capacity events that have stopped being generated by autoscalers
Workflows that have stopped producing the kind of logs they used to produce
Cross-domain changes that touch many systems simultaneously with the same justification

These signals often appear before user-facing outages and provide early warning that a cascade is forming. The underlying principle is survivorship bias in observability: if monitoring only reports on what is still working, then a successful cascade will look indistinguishable from successful operation. The defense is to explicitly track what should be observable, and alert when it isn't.
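
Tracking what should be observable can be as simple as registering each expected signal with a maximum tolerated silence and checking for overdue ones. The sketch below assumes timestamps in epoch seconds and hypothetical names (`ExpectedSignal`, `silent_signals`); the silence limits in the usage example are made up.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ExpectedSignal:
    name: str
    max_silence_s: float   # longest acceptable gap between events


def silent_signals(expected: List[ExpectedSignal],
                   last_seen: Dict[str, float],
                   now: float) -> List[str]:
    """Return names of signals that have been silent for too long.

    A signal that has never been seen is treated as silent.
    """
    silent = []
    for sig in expected:
        seen = last_seen.get(sig.name)
        if seen is None or now - seen > sig.max_silence_s:
            silent.append(sig.name)
    return silent


# Usage: the health check stopped reporting, the autoscaler is still in range.
expected = [ExpectedSignal("healthcheck:checkout", 60.0),
            ExpectedSignal("autoscaler:capacity-event", 900.0)]
last_seen = {"healthcheck:checkout": 0.0, "autoscaler:capacity-event": 900.0}
print(silent_signals(expected, last_seen, now=1000.0))
# ['healthcheck:checkout']
```

This inverts the usual alerting model: a quiet dashboard produces an alert instead of suppressing one.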

Cross-agent consistency oracles

The most subtle enabler of cascading failures is that no agent owns the responsibility of asking whether a conclusion makes sense given everything else the system knows. The mitigation is to explicitly create that role.

A cross-agent consistency oracle is an agent (or service) whose sole responsibility is to compare claims across agents and flag inconsistencies. It does not propose changes, execute actions, or enforce policy; it asks structural questions:

Does the planning agent's claim of 3% utilization match the monitoring agent's recent traffic measurements?
Does the deployment agent's record of region health match the load balancer's view of upstream capacity?
Does the security agent's classification of a change as "routine" match the actual scope of resources touched?

The oracle's findings don't have to be authoritative. Its role is to surface disagreements between agents that would otherwise go unnoticed because each agent only sees its own slice of the system. In an organization, this is the role a senior operator plays when they look at dashboards from multiple subsystems and notice that two of them are telling inconsistent stories about the same underlying state. In an agentic system, that role must be designed in explicitly.
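
A minimal version of such an oracle groups numeric claims by subject and flags pairs of agents whose values diverge beyond a tolerance. The `Observation` record, the subject naming scheme, and the 20% tolerance below are assumptions for illustration; a real oracle would also need to handle non-numeric claims and differing measurement windows.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Observation:
    agent: str
    subject: str   # what the claim is about, e.g. "checkout.cpu_utilization_pct"
    value: float


def find_disagreements(observations: List[Observation],
                       rel_tolerance: float = 0.2) -> List[Tuple[str, str, str]]:
    """Return (subject, agent_a, agent_b) for pairs whose values diverge."""
    by_subject: Dict[str, List[Observation]] = {}
    for obs in observations:
        by_subject.setdefault(obs.subject, []).append(obs)

    flagged = []
    for subject, group in by_subject.items():
        for i, a in enumerate(group):
            for b in group[i + 1:]:
                scale = max(abs(a.value), abs(b.value), 1e-9)
                if abs(a.value - b.value) / scale > rel_tolerance:
                    flagged.append((subject, a.agent, b.agent))
    return flagged
```

The function only reports that two agents disagree. It does not decide which one is right, which matches the non-authoritative role described above.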

Isolation and least privilege

Strong isolation is one of the most effective ways to prevent fault propagation, and it operates orthogonally to the other mitigations.

Agents should operate with the minimum privileges required for their specific task and only for the duration of that task. Practical implementations include:

Short-lived, just-in-time credentials rather than long-lived service accounts, so a compromised or drifting agent loses access quickly.
Scoped tool access that restricts which APIs and resources an agent can touch, even if it's authorized to invoke them in principle.
Network segmentation that prevents one agent's failure from spilling into unrelated systems.
Sandboxed execution environments for actions whose effects need to be contained until they can be verified.

These controls don't prevent cascades within the privilege boundary, but they prevent cascades from crossing it.
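
The first two controls in the list, short-lived credentials and scoped tool access, can be sketched as a grant that names the agent, the permitted actions and resources, and an expiry. The `Grant` type and the action and resource strings are hypothetical; in practice this check would be enforced by the platform's identity and access layer, not by the agent itself.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Grant:
    agent: str
    allowed_actions: FrozenSet[str]     # e.g. {"compute:scale"}
    allowed_resources: FrozenSet[str]   # e.g. {"eu-west/checkout"}
    expires_at: float                   # epoch seconds


def is_permitted(grant: Grant, agent: str, action: str,
                 resource: str, now: float) -> bool:
    """Deny unless agent, action, resource, and time all match the grant."""
    return (grant.agent == agent
            and now < grant.expires_at
            and action in grant.allowed_actions
            and resource in grant.allowed_resources)
```

Because the grant lists resources explicitly, a task issued for one region cannot be widened to other regions on a later planning cycle without a new grant being issued.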

Drift detection

Over time, even well-behaved systems can drift away from their intended behavior. Tracking agent behavior against historical baselines helps detect gradual degradation that might otherwise go unnoticed.

Drift detection complements memory remediation: where memory remediation cleans up specific contaminated entries after an incident, drift detection catches the slower form of contamination where no single entry is obviously wrong but the cumulative effect of many small adjustments has moved the system off course.
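
One simple way to compare behavior against a historical baseline is to measure how far the recent mean of a behavioral metric has moved, in units of the baseline's standard deviation. This is only one possible technique, chosen here for brevity; the metric (scale-down actions per day) and the threshold of 3 are illustrative assumptions.

```python
from statistics import mean, stdev
from typing import Sequence


def drift_score(baseline: Sequence[float], recent: Sequence[float]) -> float:
    """How many baseline standard deviations the recent mean has moved."""
    if len(baseline) < 2 or not recent:
        raise ValueError("need at least two baseline points and one recent point")
    spread = stdev(baseline)
    if spread == 0:
        return 0.0 if mean(recent) == mean(baseline) else float("inf")
    return abs(mean(recent) - mean(baseline)) / spread


def has_drifted(baseline: Sequence[float], recent: Sequence[float],
                threshold: float = 3.0) -> bool:
    return drift_score(baseline, recent) > threshold


# Usage: scale-down actions per day, historical week vs. the last three days.
baseline = [10, 12, 11, 9, 10, 11, 10]
print(has_drifted(baseline, [18, 21, 19]))  # True
print(has_drifted(baseline, [10, 11, 10]))  # False
```

A score like this says nothing about why behavior changed. Its value is in prompting a human to look before many small adjustments accumulate.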

Logging, lineage, and non-repudiation

Effective mitigation depends on visibility. All inter-agent messages, policy decisions, and execution outcomes should be logged in a tamper-evident way and linked to cryptographic agent identities (as covered in the Insecure Inter-Agent Communication lesson). Append-only logs, hash chains, or write-once storage prevent an agent compromised by the cascade from rewriting history to cover its tracks.
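
A hash chain is the simplest of these mechanisms to illustrate. In the sketch below, each log entry stores the hash of the previous entry, so editing any earlier record invalidates everything after it. The record fields are hypothetical, and the sketch does not cover agent identities or signatures. It also assumes the most recent hash is periodically stored somewhere the agents cannot write, since a party able to rewrite the whole log could otherwise recompute the chain.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: Dict[str, Any]) -> str:
    # Canonical JSON so the same record always hashes to the same value.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: List[Dict[str, Any]], record: Dict[str, Any]) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash,
                "hash": _digest(prev_hash, record)})


def verify(log: List[Dict[str, Any]]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```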

The critical additional discipline is lineage tracking: maintaining metadata for every propagated action that records which upstream conclusions it depends on. Lineage is what makes targeted rollback possible during incidents. Without it, operators face the impossible task of identifying which actions were contaminated by an upstream defect among potentially thousands of policy-compliant, individually correct-looking actions in the logs. With it, operators can trace a single defect to all of its downstream consequences and unwind them as a coherent set.
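
If every action records the upstream conclusions it depends on, finding the set to unwind becomes a graph traversal. The sketch below assumes lineage is available as a mapping from each action ID to the IDs it depends on; the ID format and the `downstream_of` helper are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def downstream_of(root: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every action that depends, directly or transitively, on `root`."""
    # Invert "action -> upstreams" into "upstream -> dependents".
    dependents: Dict[str, List[str]] = {}
    for action, upstreams in depends_on.items():
        for up in upstreams:
            dependents.setdefault(up, []).append(action)

    affected: Set[str] = set()
    queue = deque([root])
    while queue:  # breadth-first walk over the dependents graph
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return affected


# Usage: trace one defective API response to everything built on it.
lineage = {
    "memory:underutilized": ["api:cost-response"],
    "plan:consolidate": ["memory:underutilized"],
    "deploy:scale-down": ["plan:consolidate"],
    "security:approve": ["plan:consolidate"],
    "deploy:patch-os": ["plan:patching"],
}
print(sorted(downstream_of("api:cost-response", lineage)))
# ['deploy:scale-down', 'memory:underutilized', 'plan:consolidate', 'security:approve']
```

The unrelated `deploy:patch-os` action is left out of the result, which is what makes the rollback targeted instead of wholesale.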

Lineage also enables post-incident accountability. Cascading failures are notorious for having no "root cause" in the traditional sense. Every agent acted within its scope, and the failure existed in the relationships between actions. Lineage metadata is what makes those relationships visible after the fact, transforming a mystery into a forensic record.

Quiz

Which of the following best describes the core vulnerability in Cascading Failures?

Keep learning

If you want to deepen your understanding of cascading failures and how they relate to agentic AI systems, the following resources provide valuable perspectives from both reliability engineering and AI security.

The MITRE Common Weakness Enumeration entry on uncontrolled resource consumption (CWE-400) provides a complementary lens on how feedback loops, retries, and amplification effects can exhaust systems and trigger widespread failure. Although not AI-specific, the underlying mechanics map closely to agent-driven cascades.
Our latest Learning Path on the OWASP Top 10 for LLMs

Congratulations

You have taken your first step into understanding cascading failures: how they emerge in agentic AI systems, why they are so difficult to contain, and what design principles help reduce their impact. By learning to recognize propagation patterns and designing for containment rather than perfection, you are better equipped to build safer, more resilient AI-powered applications.
