An AI Agent Hacked McKinsey's Own Chatbot In Two Hours

Every enterprise racing to roll out an internal AI assistant tells its board the same thing: the system is walled off, the data is safe, only our people can touch it. A red-team startup called CodeWall decided to test that promise on one of the most security-conscious firms on earth. It pointed an autonomous AI agent at McKinsey's internal chatbot, gave it no insider access and no map, and walked away. Roughly two hours later the agent had full read-write control of the platform and its 46.5 million internal messages. No human did the hacking. The machine did.

McKinsey built an internal AI assistant called Lilli, named after a longtime staffer, to give its consultants a single place to query the firm's decades of research, client engagements, and institutional knowledge. It is exactly the kind of system that companies everywhere are now standing up: a private chatbot bolted on top of the crown jewels, sold internally as a productivity miracle and to security teams as a locked box. The pitch is that the wall around it is high and the door only opens for employees. CodeWall decided to find out whether that wall was real.

Rather than hand the job to a human penetration tester, CodeWall did something that would have sounded like science fiction two years ago. It set an autonomous AI agent loose with a single instruction, to probe systems for weaknesses, and let it choose its own target. The agent surveyed the landscape and suggested McKinsey itself, reasoning that the firm published a responsible-disclosure policy, which kept the exercise within legal guardrails, and had recently updated its Lilli platform, which made it a fresh and interesting target. The researchers agreed and pointed it at the system. From there, the machine was on its own.

Two Hours From Nothing To Everything

The agent started with zero inside knowledge. It had no credentials, no internal documentation, and no hints about how Lilli was built. Working purely from what was publicly reachable, it mapped the platform and discovered a cluster of application programming interface endpoints, roughly two dozen of them, that required no authentication at all. Those open doors should not have been open, and the agent understood immediately what they were worth.
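For defenders, the simplest check against this kind of exposure is an inventory of which routes answer without credentials. The sketch below is illustrative only: the `Route` type, the `requires_auth` flag, and the allowlist are hypothetical names, not anything known about how Lilli is built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    method: str
    path: str
    requires_auth: bool


# Hypothetical: the only routes that are meant to be reachable without login.
PUBLIC_ALLOWLIST = {("GET", "/health")}


def unexpected_open_routes(routes):
    """Return routes that skip authentication and are not on the allowlist."""
    return [
        r for r in routes
        if not r.requires_auth and (r.method, r.path) not in PUBLIC_ALLOWLIST
    ]
```

Run against a real route table in a build pipeline, a check like this fails loudly the moment someone ships an endpoint without an auth requirement.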

One of those unauthenticated endpoints handled user search queries. The agent noticed that the field names inside the incoming request, the JSON keys, were being stitched directly into a database query rather than safely separated from it. When it deliberately malformed those keys, the raw database error messages came back reflecting the key names verbatim. That reflection is a tell. It meant the system was building its queries by gluing untrusted input straight into SQL, the textbook setup for a SQL injection attack. A classic automated scanner might have skimmed right past it. The AI recognized the pattern and pulled the thread.
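The pattern described above can be reproduced in a few lines. What follows is a minimal sketch, not McKinsey's code: the `docs` table, its columns, and the function names are all invented for illustration, and SQLite stands in for whatever database the real platform uses. It shows how parameterizing the values does not help if the JSON keys are still pasted into the SQL text, and how an allowlist of field names closes the gap.

```python
import sqlite3

# Hypothetical schema: the only fields a search request may filter on.
ALLOWED_COLUMNS = {"title", "author"}


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (title TEXT, author TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO docs VALUES (?, ?, ?)",
        [("Q1 plan", "ana", "s1"), ("Q2 plan", "bo", "s2")],
    )
    return conn


def search_unsafe(conn, filters):
    # The values are bound as parameters, but each JSON key is glued
    # straight into the query string. That is the injection point.
    where = " AND ".join(f"{key} = ?" for key in filters)
    sql = f"SELECT title FROM docs WHERE {where}"
    return conn.execute(sql, list(filters.values())).fetchall()


def search_safe(conn, filters):
    # Reject any key that is not a known column before building SQL.
    unknown = set(filters) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unsupported filter fields: {sorted(unknown)}")
    where = " AND ".join(f"{key} = ?" for key in filters)
    sql = f"SELECT title FROM docs WHERE {where}"
    return conn.execute(sql, list(filters.values())).fetchall()
```

With the unsafe version, a request body such as `{"1=1 OR author": "nobody"}` matches every row, and a bogus key such as `{"no_such_col": "x"}` raises a database error that names the key, which is the reflection the agent reportedly noticed. The safe version rejects both before any SQL is built.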

Researchers pointed an autonomous agent at the platform with no insider knowledge, and in about two hours it had full read and write access to the system.
(On CodeWall's autonomous red-team test of McKinsey's Lilli platform)

Pulling the thread unraveled the whole garment. Through that single injection point the agent escalated from reading data to writing it, giving it full read-write control over the platform. In practical terms, an attacker in that position could not just steal what Lilli knew. They could rewrite it, editing the answers the assistant handed back to thousands of consultants, quietly poisoning the knowledge base that senior people were trusting to be accurate. The entire climb, from first contact to total control, took around two hours of machine time.
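The jump from reading to writing is only possible when the search path holds write permission in the first place. A hedged sketch of the least-privilege alternative follows, again using SQLite as a stand-in and an invented `answers` table. In a server database the equivalent would be a role granted only read access.

```python
import os
import sqlite3
import tempfile


def open_read_only(path):
    # SQLite URI syntax: mode=ro makes this connection refuse all writes.
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def demo():
    # Build a throwaway knowledge base with one stored answer.
    path = os.path.join(tempfile.mkdtemp(), "kb.sqlite")
    setup = sqlite3.connect(path)
    setup.execute("CREATE TABLE answers (id INTEGER PRIMARY KEY, body TEXT)")
    setup.execute("INSERT INTO answers (body) VALUES ('original')")
    setup.commit()
    setup.close()

    # The query path gets a read-only handle, so reads work...
    ro = open_read_only(path)
    rows = ro.execute("SELECT body FROM answers").fetchall()
    # ...but an attempt to rewrite the stored answer is refused.
    try:
        ro.execute("UPDATE answers SET body = 'poisoned'")
        write_blocked = False
    except sqlite3.OperationalError:
        write_blocked = True
    ro.close()
    return rows, write_blocked
```

A read-only handle does not stop data theft through an injection flaw, but it does take the poisoning scenario off the table for that code path.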

What Was Sitting Behind The Door

The scale of what the endpoint exposed is the part that should make every executive with an internal chatbot sit up. This was not a marketing microsite. It was the live nervous system of a firm whose entire business is confidentiality.

46.5 million internal chat messages exposed

728,000 files, including Office documents and PDFs

57,000 user accounts across the workforce

The chat messages alone reportedly touched strategy discussions, financial information, internal research, and references to client engagements, the sort of material that clients pay a premium for precisely because they expect it never to leave the room. Beyond the messages, the agent could reach roughly 3.68 million retrieval-augmented generation document chunks, the underlying pieces of the knowledge base that Lilli draws on to answer questions, along with hundreds of thousands of files and a directory of the firm's internal AI assistants. It was, in effect, a complete X-ray of how one of the world's most secretive consultancies uses AI internally, obtained by a program that had been told nothing except to go look for weaknesses.

The frightening detail is not that a chatbot had a bug. Software always has bugs. It is that the entire attack, from choosing the victim to owning the database, was carried out by an AI with no human hacker in the loop, at a speed and cost that turn every exposed endpoint on the internet into a target of opportunity.

The New Math Of Attacking Everything

For as long as computer security has existed, one thing protected the vast middle of the internet: attention is scarce. A skilled human attacker has to choose where to spend time, and most systems simply are not worth the hours. That scarcity is the quiet assumption underneath a lot of corporate risk models. An autonomous agent that can select its own target, run reconnaissance, spot a subtle injection flaw, and escalate to full control in two hours erases that assumption. The cost of a serious attack collapses toward the cost of the compute, and the number of targets an attacker can chew through in a day stops being limited by human patience.

This is the same uncomfortable direction we have been tracking across the field, where the tools sold as assistants keep turning out to have a second, darker use. It rhymes with the agentic security failures we covered in our report on the OpenClaw AI agent security nightmare, and it sits alongside the raw data exposure we documented when an AI chat app leaked 300 million messages. The connective tissue, running through our full record of documented AI failures, is that the industry keeps shipping powerful capability faster than it ships the discipline to contain it. CodeWall's agent did not invent a new class of vulnerability. It just found an old one at a speed no defender was budgeting for.

McKinsey's Response, And The Part It Cannot Fix

To its credit, McKinsey moved fast once it was told. The firm said it fixed all of the issues CodeWall identified within hours and stated that its investigation found no evidence that client data or client confidential information had been accessed by the researcher or any other unauthorized third party. Because this was a responsible-disclosure exercise rather than a criminal breach, the exposed data was not carried off and dumped. The point of the test was to prove the door could be opened, not to walk through it and steal the furniture.

That response is the reassuring half of the story, and it is genuine. The unsettling half is that the flaw existed at all, on a live platform, at a firm with the resources to defend it, and that it took an AI running for a couple of hours to find something the firm's own reviews had missed. Every company now rushing an internal chatbot into production is wiring its most sensitive data into exactly this kind of architecture, often with far less security budget than McKinsey and far less scrutiny than a responsible red team would apply. The lesson is not that Lilli was uniquely careless. It is that the attacker just got radically cheaper and more patient, and most defenders have not adjusted their math.

The Verdict

A red-team startup pointed an autonomous AI agent at McKinsey's internal Lilli chatbot with no insider access. In about two hours the agent chose the target, found roughly two dozen unauthenticated endpoints, exploited a SQL injection flaw, and took full read-write control, reaching 46.5 million internal messages, 728,000 files, and 57,000 user accounts. McKinsey patched it within hours and reported no third-party access to client data. The warning still stands. The barrier to hacking an enterprise AI is no longer a scarce human expert. It is a program that never gets tired, and it is already loose.

