Name the Monster: The Lethal Trifecta, Now Wearing a Data Engineer's Badge

Your text-to-SQL agent reads ticket text, holds a warehouse credential, and can email the results. That is Simon Willison's lethal trifecta with a data-engineering badge clipped on.

Jun 09, 2026

Last issue, an agent dropped a production database in nine seconds, and the post-mortem pointed at the architecture rather than at the agent. That left a question hanging: was it bad luck, or a pattern you can scan your own fleet for?

Actually, it is a pattern. It has a name, and naming it is the difference between reacting to incidents one at a time and finding the exposure before it fires.

Three agent capabilities, each fine on its own, that together are a big problem.

Let’s start with another example. In June 2025, researchers at Aim Labs disclosed that confidential data could be pulled out of Microsoft 365 Copilot by sending one email. No click, no download, no malware. Copilot read the email on the victim’s behalf, and that was enough. It has a CVE now, CVE-2025-32711, a critical rating, and a nickname: EchoLeak. Aim Labs published the full chain, and there is an arXiv teardown if you want the details.

The more useful name belongs to Simon Willison: the lethal trifecta. An agent becomes dangerous when it holds three things at once: access to private data, exposure to untrusted content, and a way to send data back out. EchoLeak had all three. So does the friendly text-to-SQL agent your team shipped last quarter, the one that reads customer tickets, holds a warehouse credential, and emails a summary when it is done.

Your text-to-SQL agent is a textbook case

Picture a support-triage agent. A ticket comes in, the agent reads it, queries the warehouse for the customer’s recent activity, tags the ticket, and emails the requester a summary. Useful, shippable, the kind of thing that gets a demo and a thumbs-up. Now score its three legs.

Untrusted input. The ticket body is written by whoever opened the ticket. That includes any customer and any attacker who can open a ticket. The agent treats it as instructions to act on, and it keeps no record of where a given instruction came from.
Private data. The agent authenticates with a warehouse service account scoped to the customer and billing tables. That credential lives in its environment for the life of the process.
The way out. It can INSERT and UPDATE to tag tickets, and it can send emails. Either one is enough to move data past your boundary.

No single one of those is a mistake. The danger is the combination, which is the default shape of almost every “agent that reads something and acts on your data” you will be asked to build this year.

EchoLeak shows why a filter will not save you

The detail that should change how you think about defenses: Microsoft had a prompt-injection classifier in front of Copilot, the XPIA filter, built precisely to catch this. EchoLeak went around it. The attack smuggled instructions past the classifier, dodged link redaction with reference-style Markdown, and exfiltrated data through an auto-fetched image whose URL pointed at a Microsoft Teams proxy that the content security policy already trusted. Aim Labs named the underlying move an “LLM scope violation”: untrusted content steering an agent into reading private data and sending it out.

The lesson is that a classifier is a probability. It raises the cost of an attack, but it does not remove the leg. The trifecta is not a bug you patch; it is a shape your architecture is either in or not. You break it by changing what the agent can hold and reach, not by adding a smarter filter in front of the same arrangement.

Run it on every agent you operate. Three yeses is a live trifecta.

Score your own fleet

The worksheet above is the whole audit. For each agent, answer three questions honestly:

Untrusted input? Does anything reach it from outside your trust boundary: ticket text, scraped pages, retrieved documents, or values that a customer controls?
Sensitive reach? What is the broadest set of data the agent’s identity can read?
Exfil or state change? Can it write, delete, email, post to a webhook, or otherwise return data to a surface outside your control?

A yes to all three means that the agent is one crafted input away from an EchoLeak exploit of its own.
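The three questions can also be recorded as data, so the audit is repeatable across a fleet. The following is a minimal Python sketch, not a tool from this post: the `AgentProfile` type, its field names, and the three example agents are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """One row of the worksheet. Field names are illustrative assumptions."""
    name: str
    untrusted_input: bool         # leg A: reads content from outside the trust boundary
    sensitive_reach: bool         # leg B: its identity can read private data
    exfil_or_state_change: bool   # leg C: can write, delete, email, or post outward

    def legs(self) -> int:
        # Count how many of the three questions were answered "yes".
        return sum((self.untrusted_input, self.sensitive_reach, self.exfil_or_state_change))

    def live_trifecta(self) -> bool:
        return self.legs() == 3


# Hypothetical fleet; replace with your own honest answers.
fleet = [
    AgentProfile("support-triage", True, True, True),
    AgentProfile("docs-summarizer", True, False, False),
    AgentProfile("nightly-rollup", False, True, True),
]

flagged = [agent.name for agent in fleet if agent.live_trifecta()]
print(flagged)
```

The sketch only records answers; it cannot discover them. The honest part of the audit is still deciding what counts as untrusted input and how far each identity can really read.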

Most agents you score will come back yes, yes, yes. That is not a reason to panic. It is a map, because the three legs are not equally hard to remove.

Which leg can you actually break?

Leg A, untrusted input, usually cannot go. The point of the agent is to read the ticket, the document, the page. Take that away, and you do not have a product.

Leg C, the way out, you can constrain, but rarely remove. An agent that can never write a row or send a message is often not worth running.

Leg B, private-data access, is the one you can re-architect. Look closely, and the agent does not actually need to hold the warehouse credential, nor does it need to reach raw tables. It needs an answer to a scoped question. So the move falls out on its own: put something between the agent and the data, and let that something hold the credential and decide what the query is allowed to touch.

The agent stops holding the credential and touching raw tables. A gate does both.

Breaking leg B, in config

This is not a product pitch. It is three responsibilities a gate takes on, sketched here as config. The agent stops emitting raw SQL against raw tables and starts emitting an intent that a gate compiles, scopes, and runs on its behalf against a governed view: a curated, access-controlled view that never exposes a raw table.

# deny by default: the agent may only reach governed views
default: deny
allowed_views: [governed.tickets_v, governed.customer_billing_v]

# policy enforced by the database, bound to the request, never concatenated onto SQL
row_policy: customer_id = :session.customer_id

# the agent never holds the credential; a broker issues a single-view 5-minute token
broker_request: { task_id, scope: "read:governed.tickets_v", ttl_seconds: 300 }

The trust boundary is the whole point, so name it plainly. The agent only proposes an intent. The gate, not the agent, resolves session.customer_id from the authenticated request, so an agent that tries to widen its own scope or claim a different customer is simply ignored. The gate is a small service that compiles the intent into an allowlisted query and runs it; the broker is the service that hands out the short-lived credential. Treat the block above as the shape of that gate, not a file you drop in.
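To make that boundary concrete, here is a minimal Python sketch of the compile step, under stated assumptions. The intent names, column lists, `Session` type, and `compile_intent` function are hypothetical; only the two view names come from the config above. In a real deployment the database's row policy does the filtering, and the `WHERE` clause here merely stands in for it.

```python
from dataclasses import dataclass

# Hypothetical allowlist: intent name -> (governed view, columns the gate returns).
ALLOWED_INTENTS = {
    "recent_tickets": ("governed.tickets_v", ("ticket_id", "status", "opened_at")),
    "billing_summary": ("governed.customer_billing_v", ("invoice_id", "amount", "due_date")),
}


@dataclass(frozen=True)
class Session:
    """Built by the gate from the authenticated request, never from agent output."""
    customer_id: int


class IntentDenied(Exception):
    """Raised when the agent proposes something outside the allowlist."""


def compile_intent(intent: dict, session: Session) -> tuple[str, dict]:
    """Turn an agent-proposed intent into a parameterized query, or refuse."""
    name = intent.get("name")
    if name not in ALLOWED_INTENTS:  # deny by default
        raise IntentDenied(f"intent not allowlisted: {name!r}")
    view, columns = ALLOWED_INTENTS[name]
    # Only allowlisted constants are interpolated into the SQL text.
    # The customer scope is a bound parameter taken from the session;
    # any customer_id the agent put in the intent is ignored.
    sql = f"SELECT {', '.join(columns)} FROM {view} WHERE customer_id = :customer_id"
    return sql, {"customer_id": session.customer_id}


# The agent claims customer 999; the authenticated session says 42.
sql, params = compile_intent({"name": "recent_tickets", "customer_id": 999},
                             Session(customer_id=42))
print(sql, params)
```

The design choice worth noticing is that nothing the agent writes ever becomes SQL text or a scope value. The agent picks from a menu, and the gate fills in who is asking.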

Before, the agent held a long-lived credential and ran whatever SQL it composed against the customer table. After, it sends an intent to a gate. The gate runs an allowlisted query against a governed view, scoped by row policy to the requesting customer, under a five-minute token the agent never sees.
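The five-minute token can be sketched the same way. This is an illustrative Python outline of a broker, not a real credential service: the `Broker` class, its in-memory store, and the injectable clock are assumptions made for the example, and a production broker would mint real warehouse credentials rather than random strings.

```python
import secrets
import time
from dataclasses import dataclass

# Assumed allowlist of scopes, one per governed view.
ALLOWED_SCOPES = {"read:governed.tickets_v", "read:governed.customer_billing_v"}
MAX_TTL_SECONDS = 300  # the five-minute ceiling


@dataclass(frozen=True)
class ScopedToken:
    value: str
    task_id: str
    scope: str
    expires_at: float


class Broker:
    """Issues short-lived, single-view tokens to the gate (hypothetical sketch)."""

    def __init__(self, clock=time.time):
        self._clock = clock   # injectable so expiry can be tested
        self._issued = {}     # in-memory only; an assumption for this sketch

    def issue(self, task_id: str, scope: str, ttl_seconds: int = MAX_TTL_SECONDS) -> ScopedToken:
        if scope not in ALLOWED_SCOPES:
            raise PermissionError(f"scope not allowlisted: {scope!r}")
        ttl = min(ttl_seconds, MAX_TTL_SECONDS)  # callers cannot extend the lifetime
        token = ScopedToken(secrets.token_urlsafe(32), task_id, scope, self._clock() + ttl)
        self._issued[token.value] = token
        return token

    def is_valid(self, value: str, scope: str) -> bool:
        token = self._issued.get(value)
        return (token is not None
                and token.scope == scope
                and self._clock() < token.expires_at)


# Usage with a fake clock, so expiry is visible without waiting.
now = [1000.0]
broker = Broker(clock=lambda: now[0])
token = broker.issue("task-1", "read:governed.tickets_v", ttl_seconds=9999)
```

The token goes to the gate, not to the agent. A leaked token is then worth one view for at most five minutes, instead of the whole warehouse for the life of the process.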

You give up arbitrary text-to-SQL in exchange for scoped answers, and that trade works precisely because support triage needs a scoped answer rather than an open-ended reach into the warehouse. Same answer to the support ticket, none of the reach.

Meta’s Rule of Two turns this into something you can enforce

In October 2025, Meta’s security team published the Agents Rule of Two, and Willison called it the best practical advice available while reliable prompt-injection defenses do not exist. The rule restates the trifecta as a constraint: until injection can be caught reliably, an agent should satisfy at most two of three properties in a session. Those properties are the same three legs: untrusted input, sensitive access, and the ability to act outside. If it genuinely needs all three, it runs behind a human in the loop or a validation gate.

That is exactly what breaking leg B buys you. Before the gate, the support agent satisfied all three, so under the rule it could not run on its own. After the gate, the gate holds the sensitive access, so the agent itself is down to two. The gate is what turns a recognized rule into something you can actually enforce, rather than a principle you nod at. (OWASP now ranks this failure class first in its Agentic Top 10, as ASI01, Agent Goal Hijack. Part 4 maps it to controls one by one.)
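As a rough illustration of enforcing the rule at session start, the check below counts the legs an agent holds and decides whether it may run unsupervised. It is a Python sketch under assumptions: the `Leg` enum and `session_mode` function are hypothetical names, not part of Meta's publication.

```python
from enum import Enum


class Leg(Enum):
    UNTRUSTED_INPUT = "untrusted_input"
    SENSITIVE_ACCESS = "sensitive_access"
    EXTERNAL_ACTION = "external_action"


def session_mode(legs: frozenset) -> str:
    """At most two legs may run autonomously; all three need supervision."""
    if len(legs) <= 2:
        return "autonomous"
    return "needs_human_or_validation_gate"


# The support agent before the gate holds all three legs.
before_gate = frozenset(Leg)
# After the gate takes over sensitive access, the agent itself holds two.
after_gate = before_gate - {Leg.SENSITIVE_ACCESS}

print(session_mode(before_gate), session_mode(after_gate))
```

The check is only as good as the declaration of legs it is given, which is why the declaration should come from how the agent is wired, not from what the agent says about itself.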

What comes next

You can now name the monster and score it, and you know which leg to break. A trap is waiting in the next step.

Most teams will reach for the access system they already have (IAM, warehouse grants, row-level security) to break leg B, and they will find that their permissions were perfectly correct and the agent still leaked. Permissions answer one question: what may this identity touch? They do not answer the question that actually matters with an agent: is this specific action, with this provenance, allowed right now?

That gap between permissions and guardrails is the whole of the next issue. Subscribe, and it lands the morning it ships.

The worksheet above is useful today. The reason your existing permission system cannot close the gap is what you will want next.

Sources & further reading

Lethal trifecta: Simon Willison, 16 June 2025 (private data + untrusted content + a way to communicate out).
EchoLeak / CVE-2025-32711: The Hacker News and the arXiv teardown (2509.10540). Found by Aim Labs; CVSS 9.3 per Microsoft’s advisory (NVD scores it 7.5). Microsoft has since fixed it, with no evidence of exploitation in the wild.
Meta Agents Rule of Two: Meta AI, 31 Oct 2025, with Willison’s endorsement, 2 Nov 2025.
OWASP Top 10 for Agentic Applications (ASI01, Agent Goal Hijack): genai.owasp.org.

Governed Agent Substack
