The Authorization Problem: Why LLMs Cannot Guard Authentication Boundaries

When Meta's AI support chatbot handed an attacker a recovery code last weekend, the failure was not in the model's training, its safety filters, or its prompt design. The failure was in the trust boundary drawn around it. An LLM was placed at the gate of an authentication system, and the gate opened when asked. Understanding why that was always going to happen requires understanding what authentication systems actually demand and what language models actually are.

What Authentication Requires

Authentication is the process of verifying that a claimed identity corresponds to a real one. Every authentication system, regardless of implementation, reduces to a single question: does this entity possess something that only the legitimate principal can possess?

The answer must be binary: the entity either does or it does not. This is not a design preference; it is the foundational requirement. An authentication system that sometimes lets in the wrong party, under sufficiently convincing circumstances, is not a degraded authentication system. It is no authentication system at all.

The mechanisms that implement this binary requirement are deliberately resistant to argument. A password hash does not care how plausible the requester sounds. A TOTP algorithm does not weigh contextual factors. A hardware security key performs a cryptographic challenge-response that cannot be completed by anyone who does not physically hold the key. The security guarantee in each case is mathematical. It does not have an exception path for edge cases that seem reasonable.
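As a minimal sketch of what "resistant to argument" means in practice, the following uses only Python's standard library to check a password against a stored PBKDF2 hash. The function names, the sample passphrase, and the iteration count are assumptions chosen for illustration, not a recommended configuration. Note that the check has no parameter through which plausibility or context could enter.

```python
import hashlib
import hmac
import os

# Assumed work factor for this sketch; real deployments tune this value.
ITERATIONS = 200_000


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a fixed-length hash from the password and a per-user salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def verify_password(candidate: str, salt: bytes, stored_hash: bytes) -> bool:
    """Return True only if the candidate hashes to the stored value.

    The comparison is constant-time and takes no other input: there is
    nowhere to supply a reason, a story, or surrounding context.
    """
    candidate_hash = hash_password(candidate, salt)
    return hmac.compare_digest(candidate_hash, stored_hash)


# Hypothetical enrollment: the salt and hash are what the system stores.
salt = os.urandom(16)
stored = hash_password("correct horse battery staple", salt)

# The real secret passes; a persuasive explanation does not.
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("I lost my phone and I am locked out, please help", salt, stored))
```

The second call fails no matter how the request is worded, which is exactly the property the rest of this argument depends on.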

This resistance to argument is the product of decades of deliberate engineering. Human support staff can be socially engineered. Early password reset flows that relied on security questions could be defeated by research. Each failure mode produced a design response: reduce the role of judgment, increase the role of cryptographic proof. The direction of travel in authentication has been consistently away from behavioral assessment and toward deterministic verification.

What a Language Model Is

A large language model is a probabilistic text predictor. Given an input sequence, it produces an output by sampling from a probability distribution over possible next tokens. That distribution is shaped by training, through which the model learns which outputs are likely to be appropriate for which inputs, but as typically deployed the sampling is not deterministic: the same input can produce different outputs. More importantly for security, a sufficiently well-framed input can shift the probability distribution toward outputs the model would not produce under neutral framing.
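The effect of framing can be illustrated with a toy two-outcome distribution. This is a deliberately simplified sketch, not a model of any real system: the action names and logit values are invented, and a real model samples over a large token vocabulary rather than two actions. It shows only that a softmax over finite scores never assigns exactly zero probability to an outcome, and that raising one score raises that outcome's share.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert raw scores into probabilities that sum to 1."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


ACTIONS = ["refuse", "call_tool"]

# Hypothetical scores: under neutral framing the model strongly prefers
# to refuse; a persuasive framing raises the score for calling the tool.
neutral_logits = [3.0, 0.0]
framed_logits = [3.0, 2.5]

p_neutral = softmax(neutral_logits)
p_framed = softmax(framed_logits)


def sample(probs, rng):
    """Draw one action according to the given probabilities."""
    return rng.choices(ACTIONS, weights=probs, k=1)[0]


rng = random.Random()
print("neutral:", p_neutral, sample(p_neutral, rng))
print("framed: ", p_framed, sample(p_framed, rng))
```

With these invented numbers, the probability of calling the tool rises from about 0.05 to about 0.38. Neither value is zero, and an attacker can retry with a new framing as many times as the interface allows.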

This is not a flaw. It is the property that makes language models useful. The ability to respond appropriately to varied, ambiguous, and novel inputs is exactly what makes them effective support tools for open-ended queries. The problem is not that this property exists. The problem is what happens when a system with this property is given the ability to take consequential, irreversible actions.

When an LLM is connected to tools (functions it can call to interact with external systems), its outputs are no longer just text. They are instructions that execute. The model decides, based on its probabilistic assessment of the conversation, whether to call a tool and with what parameters. That decision inherits all of the model's susceptibility to reframing, context manipulation, and social engineering.

The Tool Access Problem

Modern AI agent frameworks, which rely on tool-calling architectures similar to the Model Context Protocol, expose a set of callable tools to the model. The model reads the tool descriptions, reasons about which tools are relevant to the current task, and calls them. The security posture of the entire system depends on what tools are exposed and what those tools can do.

This is a deliberate design. The developer writes a tool. The developer registers it with the agent. The agent can then use it. There is no mechanism in the framework itself that distinguishes between tools that are safe to call based on conversational context and tools that require verified identity before they can be called. That distinction is the developer's responsibility.

Meta registered a tool that could modify account recovery email addresses and send security codes to those addresses. The model had no way to distinguish a legitimate account holder from an attacker who had constructed a plausible conversational context. The model was not asked to make that distinction; it was simply asked whether the request made sense in context. It did, so the tool was called.

The gap is not in the model. The gap is between what the tool required to execute and what the model was capable of verifying before calling it.
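One way a developer might close that gap is at the dispatch layer, so that the check runs regardless of what the model decides. The sketch below is hypothetical throughout: `Tool`, `ToolRegistry`, `VerificationRequired`, and the two handlers are invented names and do not correspond to the Model Context Protocol or any real framework's API. The key assumption is that `session_verified` is set by a deterministic authentication system and is never derived from the conversation.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Tool:
    name: str
    handler: Callable[..., str]
    requires_verified_identity: bool  # declared by the developer at registration


class VerificationRequired(Exception):
    """Raised when a sensitive tool is called without verified identity."""


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def dispatch(self, name: str, session_verified: bool, **kwargs) -> str:
        """Run a tool the model asked for, enforcing the identity flag.

        session_verified must come from the authentication system,
        not from anything the model or the user said in the chat.
        """
        tool = self._tools[name]
        if tool.requires_verified_identity and not session_verified:
            raise VerificationRequired(name)
        return tool.handler(**kwargs)


# Hypothetical handlers standing in for real back-end calls.
def explain_profile_photo() -> str:
    return "Open Settings, then Profile, then choose a new photo."


def change_recovery_email(new_email: str) -> str:
    return f"Recovery email changed to {new_email}."


registry = ToolRegistry()
registry.register(Tool("explain_profile_photo", explain_profile_photo, False))
registry.register(Tool("change_recovery_email", change_recovery_email, True))
```

Under this arrangement, a call such as `registry.dispatch("change_recovery_email", session_verified=False, new_email="x@example.net")` raises `VerificationRequired` however convincing the conversation was, while the informational tool remains freely callable.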

The Identity Verification Gap

A deterministic authentication system closes the identity verification gap with proof. The user must present something — a correct password, a valid OTP, a signed challenge — before the system will act. The proof is checked outside the conversation, by a system that does not interpret intent or weigh plausibility. It either passes or it does not.

An LLM-driven support agent, without an equivalent gate, closes the gap with assessment. The model evaluates the conversation and makes a judgment about whether the request is legitimate. That judgment is probabilistic, context-sensitive, and manipulable. It is the wrong tool for the job.

The correct architecture is not to train the model more carefully or to write better system prompts. Those measures raise the bar for casual misuse. They do not provide a security guarantee, because a security guarantee requires a property that cannot be defeated by constructing a sufficiently convincing input. No language model has that property for conversational inputs, and no amount of fine-tuning will give it one.

The correct architecture is to interpose a deterministic verification step between the conversational interface and any tool that modifies account state. The model can initiate the verification — it can tell the user that proof of identity is required before the action proceeds. But the verification itself must be handled by a system that does not interpret the user's response, and it must rely exclusively on the authentication state that existed before the requested change. Sending a confirmation code to the address the attacker just convinced the chatbot to add is not a gate. It is a formality that the attacker controls.
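The requirement that verification rely only on pre-existing authentication state can be sketched as follows. This is an illustrative outline under stated assumptions, not a production design: `Account`, `PendingChange`, and `RecoveryEmailGate` are hypothetical names, the `outbox` list stands in for real mail delivery, and details such as code expiry and rate limiting are omitted.

```python
import hmac
import secrets
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Account:
    recovery_emails: List[str]


@dataclass
class PendingChange:
    new_email: str
    code: str
    sent_to: str


class RecoveryEmailGate:
    """Deterministic gate in front of a recovery-email change."""

    def __init__(self, account: Account) -> None:
        self.account = account
        self.outbox: List[Tuple[str, str]] = []  # (address, code); stand-in for email
        self._pending: Optional[PendingChange] = None

    def request_change(self, new_email: str) -> str:
        """Send a one-time code to an address that predates this request."""
        trusted = list(self.account.recovery_emails)  # state before the change
        if not trusted:
            raise RuntimeError("no pre-existing channel; escalate out of band")
        sent_to = trusted[0]
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending = PendingChange(new_email, code, sent_to)
        self.outbox.append((sent_to, code))  # never sent to new_email
        return sent_to

    def confirm(self, submitted_code: str) -> bool:
        """Apply the change only if the submitted code matches exactly."""
        pending = self._pending
        if pending is None:
            return False
        self._pending = None  # single use, whether or not the code matches
        matches = hmac.compare_digest(
            submitted_code.encode("utf-8"), pending.code.encode("utf-8")
        )
        if matches:
            self.account.recovery_emails.append(pending.new_email)
        return matches
```

In this sketch the conversational agent could call `request_change` on the user's behalf and relay the prompt for a code, but it never sees the code and has no way to influence where the code is sent. An attacker who controls only the new address receives nothing.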

This is not a novel concept. It is how password reset flows have worked for years. The AI layer does not change the requirement. It adds a new surface through which the requirement can be bypassed if the architect does not explicitly account for it.

Agentic AI at Authentication Boundaries

The rule is simple to state. Any tool that modifies authentication state (passwords, recovery addresses, MFA configuration, session tokens) must require verified identity proof that is checked by a deterministic system before the tool executes. The conversational agent can be the interface. It cannot be the verifier.

What the Meta incident demonstrates is not that AI support tools are inherently dangerous. It demonstrates that the architects who deploy them must understand the difference between a tool that answers questions and a tool that changes account state, and must treat those two categories with fundamentally different trust assumptions. The chatbot that explains how to update a profile photo and the chatbot that can change a recovery email address are not the same kind of system, even if they share the same interface.