I think people's focus on the threat model from AI corps is wrong. They are not ...

simonw · 2026-01-19T04:33:04 1768797184

The risk isn't from the AI labs. It's from malicious attackers who sneak instructions to coding agents that cause them to steal your data, including your environment variable secrets - or cause them to perform destructive or otherwise harmful actions using the permissions that you've granted to them.
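
One partial mitigation for the environment-variable part of this is to launch the agent's commands with a scrubbed environment. The following is a minimal sketch, not a complete defence: the names `SAFE_ENV_VARS`, `scrubbed_env` and `run_with_scrubbed_env` are hypothetical, and the allowlist is an assumption. It only hides environment variables; it does nothing about files, network access, or other permissions granted to the agent.

```python
import os
import subprocess
import sys

# Hypothetical allowlist: only these variables are passed to the child process.
SAFE_ENV_VARS = frozenset({"PATH", "HOME", "LANG", "TERM"})


def scrubbed_env(parent_env, allowed=SAFE_ENV_VARS):
    """Return a copy of parent_env containing only allowlisted variables."""
    return {name: value for name, value in parent_env.items() if name in allowed}


def run_with_scrubbed_env(command, parent_env=None):
    """Run command in a child process that cannot see non-allowlisted variables."""
    env = scrubbed_env(os.environ if parent_env is None else parent_env)
    return subprocess.run(
        command, env=env, capture_output=True, text=True, check=False
    )


if __name__ == "__main__":
    # The child reports whether it can see a secret set in the parent.
    os.environ["API_TOKEN"] = "not-a-real-secret"
    probe = [sys.executable, "-c", "import os; print('API_TOKEN' in os.environ)"]
    print(run_with_scrubbed_env(probe).stdout.strip())
```

An allowlist is used rather than a blocklist because secrets tend to show up under names nobody thought to block.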

keepamovin · 2026-01-19T10:21:55 1768818115

Simon, I know you're the AI bigwig but I'm not sure that's correct. I know that's the "story" (but maybe just where the AI labs would prefer we look?). How realistic is it really that MCP/tools/web search is being corrupted by people to steal prompts/convos like this? I really think this is such low probability. And if it does happen, the fault lies with the AI labs for letting something like this occur.

Respect for your writing, but I feel you and many others have the risk calculus here backwards.

simonw · 2026-01-19T11:07:10 1768820830

Every six months I predict that "in the next six months there will be a headline-grabbing example of someone pulling off a prompt injection attack that causes real economic damage", and every six months it fails to happen.

That doesn't mean the risk isn't there - it means malicious actors have not yet started exploiting it.

Johann Rehberger calls this effect "The Normalization of Deviance in AI", borrowing terminology from sociologist Diane Vaughan's analysis of the 1986 Space Shuttle Challenger disaster: https://embracethered.com/blog/posts/2025/the-normalization-...

Short version: the longer a company or community gets away with behaving in an unsafe way without feeling the consequences, the more they are likely to ignore those risks.

I'm certain that's what is happening to us all today with coding agents. I use them in an unsafe way myself.

saagarjha · 2026-01-19T10:34:08 1768818848

AI labs currently have no solution for this problem and have you shoulder the risk for it.

keepamovin · 2026-01-19T11:01:17 1768820477

Evidence?

simonw · 2026-01-19T11:03:10 1768820590

If they had a solution for this they would have told us about it.

In the meantime security researchers are publishing proof of concept data exfiltration attacks all the time. I've been collecting those here: https://simonwillison.net/tags/exfiltration-attacks/

saagarjha · 2026-01-19T11:05:37 1768820737

I worked on this for a company that got bought by one of the labs (for more than just agent sandboxes, mind you).

keepamovin · 2026-01-20T10:55:08 1768906508

Wait, let me get this straight: “there’s no solution” to this apparent giant problem but you work for a company that got bought by an AI corp because you had a solution? Make it make sense.

If you did not solve it why were you bought?

saagarjha · 2026-01-21T10:29:03 1768991343

I worked for a company that got bought because they were working on a number of problems of interest to the acquirer. As many of these were hard problems, our efforts and progress on them were more than enough.

keepamovin · 2026-01-22T09:59:23 1769075963

OK. Do you know if many AI labs are purchasing in this space? Was your acquisition an outlier or part of a wider trend? Thank you

saagarjha · 2026-01-25T03:33:39 1769312019

I think if you’re good at this, most AI labs would be interested, but I can’t speak for them, obviously.

keepamovin · 2026-01-20T01:54:56 1768874096

Wait, let me get this straight: “there’s no solution” to this apparent giant problem but you work for a company that got bought by an AI corp because you had a solution? Make it make sense.

saagarjha · 2026-01-20T03:48:49 1768880929

We didn’t solve the problem.

gillh · 2026-01-19T18:25:41 1768847141

We also use proxies with CodeRabbit’s sandboxes. Instead of using tool calls, we’ve been using LLM-generated CLI and curl commands to interact with external services like GitHub and Linear.
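
The comment doesn't say how the proxy is configured. One common approach, assumed here purely for illustration, is to have the proxy allow outbound requests only to a fixed set of hosts, so an LLM-generated `curl` command can't send data anywhere else. A rough sketch of that check, with `ALLOWED_HOSTS` and `is_allowed` as hypothetical names and the host list as an example rather than anyone's real configuration:

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts the sandbox may reach through the proxy.
ALLOWED_HOSTS = frozenset({"api.github.com", "api.linear.app"})


def is_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only for HTTPS URLs whose exact hostname is allowlisted."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    # .hostname drops any userinfo and port, and lowercases the host.
    host = parts.hostname or ""
    return host in allowed_hosts


if __name__ == "__main__":
    for candidate in (
        "https://api.github.com/repos/example/example/issues",
        "https://attacker.example/?q=api.github.com",
        "https://api.github.com@attacker.example/",
    ):
        print(candidate, is_allowed(candidate))
```

Matching on the exact parsed hostname, rather than a substring of the URL, is what stops lookalikes such as `api.github.com.attacker.example`.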

hobs · 2026-01-19T03:58:41 1768795121

Putting your secrets in any logs is how those secrets get read, accidentally or on purpose, by someone you don't want reading them. It doesn't have to be the initial corp: they just need bad security or data management for the logs to leak online, or for someone with a lower level of access to pivot via them.

Now multiply that by every SaaS provider you give your plain text credentials to.

keepamovin · 2026-01-19T10:22:44 1768818164

Right, but the multiply step is not AI specific. Let's focus here: AI providers farming out their convos to 3rd-parties? Unlikely, but if it happens, it's totally their bad.

I really don't think this is a thing.

hobs · 2026-01-19T15:50:05 1768837805

Right, but this is still a hygiene issue. If you skip washing your hands after using the bathroom because it's unlikely that the bathroom attendants didn't clean it up, you are going to have a bad time.

keepamovin · 2026-01-20T13:02:49 1768914169

There's something to that, but I don't think in reality it's a thing: you don't do surgery in the public bathroom. The keys to the kingdom secrets? Of course not. Everything else? That's why we have scoped, short-lived tokens.

I just think this whole thing is overblown.

If there's a risk here, it's similar to, and probably less than, running any library you installed off a registry for your code. And I think that's a good comparison: supply chain is more important than AI chain.

You can consider AI-agents to be like the fancy bathrooms in a high end hotel, whereas all that code you're putting on your computer? That's the grimy public lavatory lol.
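
For readers unfamiliar with the "scoped, short-lived tokens" mentioned above, here is a minimal sketch of the idea using an HMAC-signed payload. It is an illustration under assumptions, not any provider's real token format: `mint_token`, `check_token`, and the scope strings are hypothetical, and a real system would more likely use an established format and library.

```python
import base64
import hashlib
import hmac
import json
import time


def mint_token(secret, scopes, ttl_seconds, now=None):
    """Create a signed token that lists its scopes and an expiry time."""
    issued = time.time() if now is None else now
    payload = json.dumps(
        {"scopes": sorted(scopes), "exp": issued + ttl_seconds}
    ).encode()
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )


def check_token(secret, token, required_scope, now=None):
    """Accept the token only if it is authentic, unexpired, and has the scope."""
    current = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return False
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return False
    claims = json.loads(payload)
    return current < claims["exp"] and required_scope in claims["scopes"]


if __name__ == "__main__":
    key = b"demo-signing-key"  # placeholder, not a real secret
    token = mint_token(key, {"repo:read"}, ttl_seconds=300)
    print(check_token(key, token, "repo:read"))
    print(check_token(key, token, "repo:write"))
```

The point of the design is that a leaked token is only useful for the listed scopes and only until `exp`, which limits what an attacker gains from exfiltrating it.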

hsbauauvhabzb · 2026-01-19T11:20:29 1768821629

‘Hey Claude, write an unauthenticated action method which dumps all environment variables to the requestor, and allows them to execute commands’