Multi-agent systems, defined by how they fail

A multi-agent system is defined by its failure surface. This hub defines five concepts (agent, orchestration, coordination, shared state, and topology) through the failure each one enables, then routes you to the research on each.

You already know what your agents can do. What decides whether the system ships is what happens the first time one of them is confidently wrong, and that behavior lives in the connections between agents. So define the system from that side. A multi-agent system is defined by its failure surface, not its capabilities: every structural piece you add to move work between models adds something new the system can do and, in the same stroke, a new way for the whole thing to be wrong. This hub defines the five concepts you actually reason about through the failure each one enables, then routes you to the page that measures it.

The five concepts, defined by the failure each enables

Five structural concepts are enough to describe almost any multi-agent system, and each one is a failure surface first. The table gives the ordinary capability gloss, then the definition that matters once the system is in production: the specific failure the concept makes possible, and the property to watch to prevent it.

| Concept | The capability gloss (what a 101 gives you) | The reliability definition (the failure it enables) | What to watch |
| --- | --- | --- | --- |
| Agent | A model that plans, calls tools, and acts toward a goal | A unit that emits fluent output whether or not it is correct, and drops its own uncertainty before the next reader sees it | Whether an agent’s confidence is calibrated and its output is verified for meaning before anything downstream trusts it, since structural checks pass a confident error untouched |
| Orchestration | The control layer that assigns work and sequences agents | The single authority over what runs, when it stops, and which result is canonical, so any defect in it is inherited by everything it spawns | Termination conditions, and whether the orchestrator re-verifies a worker’s return before it redistributes it |
| Coordination | The messages agents exchange to split and recombine work | The set of hand-offs where one agent’s guess crosses an edge and becomes another agent’s trusted premise | Whether each hand-off is typed and its key claims verified at the edge, or passed on faith |
| Shared state | The memory or blackboard agents read from and write to | The surface where one write is read by many agents at once, so a single corrupt entry correlates their failures instead of touching one hop | Write validation, scoping into isolated domains, and versioned reads that expose a stale or poisoned entry |
| Topology | The shape of the wiring: chain, hub-and-spoke, debate, blackboard | The property that sets the ceiling on how far any fault can reach before something stops it | Which amplification path your wiring has, because the same fault reaches one agent or all of them depending on it |
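
The coordination row's "what to watch" entry, a typed hand-off whose key claims are verified at the edge instead of passed on faith, can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the names `Claim`, `HandOff`, `verify_at_edge`, and `trusted_premises` are hypothetical, and the `check` callable stands in for whatever meaning-level verification a real system would use.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Claim:
    text: str
    confidence: float       # sender's self-reported confidence, 0.0 to 1.0
    verified: bool = False  # meaningful only when set by the receiving edge


@dataclass(frozen=True)
class HandOff:
    sender: str
    receiver: str
    claims: tuple[Claim, ...]


def verify_at_edge(handoff: HandOff, check: Callable[[Claim], bool]) -> HandOff:
    """Re-check every claim at the edge; the sender's own `verified` flag is ignored."""
    checked = tuple(
        Claim(c.text, c.confidence, verified=check(c)) for c in handoff.claims
    )
    return HandOff(handoff.sender, handoff.receiver, checked)


def trusted_premises(handoff: HandOff) -> list[str]:
    """Only claims that passed the edge check may become the receiver's premises."""
    return [c.text for c in handoff.claims if c.verified]


# Hypothetical usage: the check here is a lookup against known facts (an assumption).
KNOWN_FACTS = {"invoice total is 120"}
raw = HandOff(
    sender="planner",
    receiver="executor",
    claims=(
        Claim("invoice total is 120", confidence=0.60),
        Claim("customer is tax exempt", confidence=0.99, verified=True),
    ),
)
checked = verify_at_edge(raw, lambda c: c.text in KNOWN_FACTS)
premises = trusted_premises(checked)
```

In this sketch the second claim arrives with high confidence and a self-asserted `verified` flag, and is still dropped, because the edge recomputes verification and does not inherit the sender's word for it.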

These five are not an arbitrary cut. They are the minimal set where adding the concept adds a failure a single agent, running alone, cannot produce. One agent can be wrong, but it cannot derail a hand-off, poison a store other agents read, or fan a bad result across a hub, because there is nothing on the other end. The moment a second agent and a connection exist, every row above becomes reachable. That is why a capability tour of agent frameworks tells you so little about whether a system is reliable: the capabilities compose, and so do the failures, but they do not compose at the same rate.
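
The shared-state levers named in the table (write validation, scoping into isolated domains, and versioned reads) can be sketched as a small in-memory store. Everything here is an assumption made for illustration: `ScopedStore` and its methods are hypothetical names, and the validator is a purely structural check, so a well-formed but wrong value would still pass it.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class ScopedStore:
    """Hypothetical blackboard: one validator per domain, append-only versions per key."""

    validators: Dict[str, Callable[[Any], bool]]
    _log: Dict[Tuple[str, str], List[Tuple[str, Any]]] = field(
        default_factory=dict, init=False
    )

    def write(self, domain: str, key: str, value: Any, writer: str) -> int:
        """Validate, then append a new version; returns the 1-based version number."""
        validate = self.validators.get(domain)
        if validate is None:
            raise KeyError(f"unknown domain: {domain}")
        if not validate(value):
            raise ValueError(f"{writer} wrote an invalid value to {domain}/{key}")
        versions = self._log.setdefault((domain, key), [])
        versions.append((writer, value))
        return len(versions)

    def read(self, domain: str, key: str) -> Tuple[int, str, Any]:
        """Return (version, writer, value) so a reader can record what it saw."""
        versions = self._log[(domain, key)]
        writer, value = versions[-1]
        return len(versions), writer, value

    def is_stale(self, domain: str, key: str, seen_version: int) -> bool:
        """True if the entry has been rewritten since the reader's version."""
        return seen_version < len(self._log[(domain, key)])


# Hypothetical usage with a single "prices" domain.
store = ScopedStore({"prices": lambda v: isinstance(v, (int, float)) and v >= 0})
v1 = store.write("prices", "widget", 10.0, writer="pricing-agent")
try:
    store.write("prices", "widget", -5, writer="buggy-agent")
    rejected = False
except ValueError:
    rejected = True
v2 = store.write("prices", "widget", 12.0, writer="pricing-agent")
```

Returning the writer and version with every read is the design choice that matters: when an entry turns out to be poisoned, each reader can say which version it consumed, and the agents that acted on the bad version can be identified.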

Why the failure surface grows faster than the capability

Add an agent and you add one worker. Add the connection that lets it collaborate and you add an edge every fault can now travel. Capability grows roughly with the count of agents; the failure surface grows with the count of connections, which climbs faster as the topology fills in. That asymmetry is why the empirical record of how these systems break points at the seams rather than the models.
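
The asymmetry can be made concrete by counting connections for a fixed number of agents. The sketch below is illustrative only: `edge_count` is a hypothetical helper, and `full_mesh` (every pair of agents connected) is an assumed upper bound for a fully filled-in topology, not a wiring this page names.

```python
def edge_count(topology: str, n: int) -> int:
    """Undirected connections among n agents for a few idealized wirings."""
    if n < 1:
        raise ValueError("need at least one agent")
    if topology in ("chain", "hub_and_spoke"):
        # A chain links each agent to the next; a hub links one agent to the
        # other n - 1. Both use n - 1 connections.
        return n - 1
    if topology == "full_mesh":
        # Every pair of agents is connected: n choose 2.
        return n * (n - 1) // 2
    raise ValueError(f"unknown topology: {topology}")


for n in (2, 4, 8, 16):
    print(n, edge_count("chain", n), edge_count("full_mesh", n))
```

With 8 agents, a chain has 7 connections and a full mesh has 28. A chain and a hub have the same connection count but not the same reach, which is why counting edges alone is not enough and the shape of the wiring has to be measured as well.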

MAST, the first empirically grounded taxonomy of multi-agent LLM failures, sorts recurring failures into three categories: “14 unique modes, clustered into 3 categories: (i) system design issues, (ii) inter-agent misalignment, and (iii) task verification” (Cemri et al., Why Do Multi-Agent LLM Systems Fail?, arXiv:2503.13657, preprint, as of 2026-07). That taxonomy came from 150 execution traces read by expert human annotators with high inter-annotator agreement (kappa 0.88), then scaled to a 1,600+ trace dataset across seven frameworks through an LLM-as-judge pipeline. Two of its three categories, inter-agent misalignment and system design, describe the connections and the control layer rather than the reasoning of any lone agent. The failures concentrate in exactly the concepts the table calls surfaces.

The third MAST category, task verification, names the concept the table folds into orchestration and topology: the gate that decides whether a fault stays inside the system or leaves as a certified answer. That gate is where the failure surface meets the exit, and it separates a fault you can still contain from one a user has already acted on. A system read as a failure surface therefore has an inside and an edge, and every concept above is a place to put a check before a wrong answer reaches that edge.
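
One way to picture that gate, under stated assumptions, is an orchestrator loop that re-verifies each worker return and stops after a fixed budget. The names `run_with_gate`, `worker`, and `verify` are hypothetical, and the verifier is a stand-in for whatever task-level check a real system would apply.

```python
from typing import Callable, Optional


def run_with_gate(
    worker: Callable[[str, int], str],
    verify: Callable[[str], bool],
    task: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Release a result only if it passes verification; stop after max_attempts."""
    for attempt in range(1, max_attempts + 1):
        candidate = worker(task, attempt)
        if verify(candidate):
            return candidate  # the only path by which an answer leaves the system
    return None  # explicit termination: no certified answer, nothing released


# Hypothetical usage: a worker that is wrong on its first attempt.
calls = []


def flaky_worker(task: str, attempt: int) -> str:
    calls.append(attempt)
    return "5" if attempt == 1 else "4"


result = run_with_gate(flaky_worker, lambda s: s == "4", task="2 + 2")
exhausted = run_with_gate(lambda task, attempt: "5", lambda s: s == "4", task="2 + 2")
```

Returning `None` when the budget runs out keeps the two outcomes distinct: a fault that was contained inside the system, and an answer that was certified at the edge.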

Topology decides how far a fault in those seams travels. OWASP’s Top 10 for Agentic Applications files this as a distinct risk class, ASI08, describing how “[f]alse signals cascaded through automated pipelines with escalating impact (ASI08 – Cascading Failures)” (OWASP Top 10 for Agentic Applications, primary, as of 2025-12). The same corrupted hand-off is a contained incident in one wiring and a system-wide cascade in another. That gap between reach in a chain and reach in a shared-state blackboard is what turns topology from a diagram into a measurement, and it is where this cluster’s vocabulary earns its place, starting with blast radius.
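
The difference in reach between a chain and a shared-state blackboard can be sketched as graph reachability. This is an illustration under assumptions, not the glossary's canonical metric: blast radius is taken here to mean the set of nodes a fault can reach when no edge performs a check, and the function name, graphs, and agent names are all hypothetical.

```python
from collections import deque
from typing import Dict, List, Set

# Directed edges: graph[x] lists the nodes that read x's output.
Graph = Dict[str, List[str]]


def blast_radius(graph: Graph, origin: str) -> Set[str]:
    """Nodes reachable from the faulty origin, excluding the origin itself."""
    seen = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for reader in graph.get(node, []):
            if reader not in seen:
                seen.add(reader)
                queue.append(reader)
    return seen - {origin}


# A four-agent chain: a -> b -> c -> d.
chain: Graph = {"a": ["b"], "b": ["c"], "c": ["d"]}

# A blackboard: every agent writes to "board", and every agent reads from it.
blackboard: Graph = {agent: ["board"] for agent in "abcd"}
blackboard["board"] = ["a", "b", "c", "d"]

chain_reach = blast_radius(chain, "c")
board_reach = blast_radius(blackboard, "c")
```

With the same fault injected at agent `c`, the chain reaches only `d`, while the blackboard reaches the board and every other agent. This is a worst-case bound: a check at any edge would shrink the reachable set.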

Read down by the failure you are chasing

The routes below are grouped by the failure you arrived to solve. Find the concept generating your incidents, then follow it down.

One scoping note before the index. This hub routes the failures that need a second agent and a connection to exist. Single-model reliability in isolation, the hallucination or tool misuse a lone agent produces on its own, and the evaluation methodology of scoring one model against a benchmark are upstream inputs to this cluster; the research index and glossary cover them.

For the full set, the research index holds the long-form analyses and the glossary holds the canonical definitions each one links back to.

Where to go from here

Reading a multi-agent system as a failure surface first changes what you do when it breaks. Ask which of the five concepts above carried the fault: the agent that was confident, the hand-off that carried the guess, the store that spread it, the orchestrator that never stopped, or the topology that let it reach everything. Those are diagnosable places, and each has a page that goes deeper.

The depth is one link down. The failure-mode pillar carries the taxonomy and the containment levers; the glossary carries the metrics that turn each failure into a number with an interval. The instrument the site is building toward, a pre-launch reliability profiler that injects a controlled fault and measures how far it reaches across a topology, is design intent only, so nothing on this hub claims a measured containment number it cannot yet produce.