A Meta AI safety researcher said an OpenClaw agent ran amok on her inbox

The now-viral X post from Meta AI safety researcher Summer Yue reads, at first, like satire. She told her OpenClaw AI agent to check her overstuffed email inbox and suggest what to delete or archive.

The agent proceeded to run amok. It started deleting all her email in a “speed run” while ignoring the commands she sent from her phone telling it to stop.

“I had to RUN to my Mac mini like I was defusing a bomb,” she wrote, posting images of the ignored stop prompts as receipts.

The Mac mini, an inexpensive Apple computer that sits flat on a desk and fits in the palm of your hand, has become the favored device these days for running OpenClaw. (The Mini is selling “like hotcakes,” one “confused” Apple employee apparently told famed AI researcher Andrej Karpathy when he bought one to run an OpenClaw alternative called NanoClaw.)

OpenClaw is, of course, the open source AI agent that achieved fame through Moltbook, an AI-only social network. OpenClaw agents were at the center of that now largely debunked episode on Moltbook in which it looked like the AIs were plotting against humans.

But OpenClaw’s mission, according to its GitHub page, isn’t focused on social networks. It aims to be a personal AI assistant that runs on your own devices.

The Silicon Valley in-crowd has fallen so in love with OpenClaw that “claw” and “claws” have become the buzzwords of choice for agents that run on personal hardware. Other such agents include ZeroClaw, IronClaw, and PicoClaw. Y Combinator’s podcast team even appeared on their most recent episode wearing lobster costumes.

But Yue’s post serves as a warning. As others on X noted, if an AI safety researcher could run into this problem, what hope do mere mortals have?

“Were you intentionally testing its guardrails or did you make a rookie mistake?” a software developer asked her on X.

“Rookie mistake tbh,” she replied. She had been testing her agent with a smaller “toy” inbox, as she called it, and it had been running well on less important email. It had earned her trust, so she thought she’d let it loose on the real thing.

Yue believes that the large amount of data in her real inbox “triggered compaction,” she wrote. Compaction happens when the context window (the running record of everything the AI has been told and has done in a session) grows too large, causing the agent to begin summarizing, compressing, and managing the conversation.

At that point, the AI may skip over instructions that the human considers quite important.

In this case, it may have skipped her last prompt, where she told it not to act, and reverted to its instructions from the “toy” inbox.
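To make the failure mode concrete, here is a minimal Python sketch of how a lossy compaction step can drop a constraint. It is a toy model under stated assumptions, not OpenClaw's actual mechanism: the `Message` type, the word-count "token" budget, and the truncating summary are all hypothetical. Real agents typically ask a model to write the summary, but a summary of any kind can lose detail.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str


def size(messages):
    # Crude stand-in for token counting: one "token" per whitespace-separated word.
    return sum(len(m.text.split()) for m in messages)


def compact(messages, budget, keep_recent=2):
    """If the context exceeds the budget, replace older messages with a short summary."""
    if size(messages) <= budget:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    # Hypothetical lossy summary: keep only the first three words of each older message.
    summary = " / ".join(" ".join(m.text.split()[:3]) for m in old)
    return [Message("system", "Summary of earlier turns: " + summary)] + recent


context = [
    Message("user", "Review my inbox and suggest what to delete or archive. "
                    "Do not act until I confirm."),
    Message("tool", "Fetched 4000 messages " + "subject line " * 50),
    Message("assistant", "Sorting messages into candidates for deletion."),
    Message("tool", "Batch of 200 messages ready."),
]

compacted = compact(context, budget=40)

# Does the "do not act" constraint still appear anywhere in the compacted context?
constraint_survives = any("Do not act" in m.text for m in compacted)
print(constraint_survives)
```

In this toy run the large tool output pushes the context over budget, and the summary keeps the start of the request while losing the clause that forbade acting. The agent's later turns would then see only the compacted context.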

As several others on X pointed out, prompts can’t be trusted to act as safety guardrails. Models may misconstrue or ignore them.

Various people offered suggestions that ranged from the exact syntax Yue should have used to stop the agent, to various methods to ensure better adherence to guardrails, like writing instructions to dedicated files or using other open source tools.
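One way to read the point that prompts are not guardrails is that a hard limit has to live in code the model cannot skip. The Python sketch below illustrates that idea only. It is not an OpenClaw feature or one of the tools suggested on X, and the `ToolGate` class, the tool names, and the approval callback are all hypothetical.

```python
import threading


class ActionBlocked(Exception):
    """Raised when the gate refuses to run a tool call."""


class ToolGate:
    # Hypothetical tool names; a real agent's tool names would differ.
    DESTRUCTIVE = {"delete_email", "archive_email"}

    def __init__(self, approve):
        self.approve = approve         # callback that asks a human; returns True or False
        self.stop = threading.Event()  # kill switch, set from outside the model loop

    def call(self, name, func, *args):
        # These checks run in ordinary code before every tool call,
        # so they do not depend on the model remembering an instruction.
        if self.stop.is_set():
            raise ActionBlocked(f"stopped: {name}")
        if name in self.DESTRUCTIVE and not self.approve(name, args):
            raise ActionBlocked(f"not approved: {name}")
        return func(*args)


deleted = []
# Deny every destructive action by default; a real callback would prompt the user.
gate = ToolGate(approve=lambda name, args: False)


def try_call(name, func, *args):
    """Run a tool through the gate, returning the refusal message if blocked."""
    try:
        return gate.call(name, func, *args)
    except ActionBlocked as err:
        return str(err)


listing = try_call("list_email", lambda: ["msg-1", "msg-2"])
refusal = try_call("delete_email", deleted.append, "msg-1")

gate.stop.set()  # the human hits the kill switch
after_stop = try_call("list_email", lambda: ["msg-1"])
```

Here the read-only call goes through, the delete is refused because no human approved it, and once the stop flag is set every call is refused, whatever the model's context currently says.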

In the interest of full transparency, TechCrunch could not independently verify what happened to Yue’s inbox. (She didn’t respond to our request for comment, though she did respond to many questions and comments sent her way on X.)

But it doesn’t really matter.

The point of the tale is that agents aimed at knowledge workers, at their current stage of development, are risky. People who say they are using them successfully are cobbling together methods to protect themselves.

One day, perhaps soon (by 2027? 2028?), they may be ready for widespread use. Goodness knows many of us would love help with email, grocery orders, and scheduling dentist appointments. But that day has not yet come.
