Most of the AI failures we document leave a victim you can point to: a chatbot that fabricates a court citation, a health tool that misses an emergency, a customer service bot that invents a refund policy. The failure documented here is different because the victim is the software itself, and the harm spreads through every program built on top of it. When an AI coding assistant invents a software package that does not exist, the worst case is not that your code fails to run. The worst case is that someone else registered that invented name first, filled it with malware, and waited for the next developer who trusts their assistant to install it without looking.
This is no longer a thought experiment. It rests on a body of research and at least one demonstrated proof of concept, and it has a name that has already entered the security vocabulary. The mechanics are simple enough to explain in a sentence and serious enough to reshape how careful teams now treat AI-written code. The tools millions of developers lean on every day to move faster are, a measurable share of the time, recommending dependencies that were never real, and that gap between confident recommendation and nonexistent reality is exactly where the attack lives.
The Study That Put A Number On It
The foundational research was presented at USENIX Security 2025, one of the most respected venues in computer security, by teams from the University of Texas at San Antonio, the University of Oklahoma, and Virginia Tech. Their paper carried a title as dry as its findings were alarming: a comprehensive analysis of package hallucinations by code-generating language models. The scope was large enough to silence any objection that this was a fluke. The researchers generated and examined more than 576,000 code samples across 16 different large language models, then checked every package each model recommended against the real software registries those packages were supposed to live in.
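To make that checking step concrete, here is a minimal Python sketch of the idea. It is not the researchers' pipeline. It pulls package names out of `pip install` lines in generated text and asks PyPI's public JSON endpoint whether each name is registered. The helper names (`pypi_status`, `pip_install_targets`, `find_unregistered`) are illustrative assumptions, and the parsing is deliberately simple.

```python
import re
import urllib.error
import urllib.request

PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"  # PyPI's public JSON endpoint


def pypi_status(name: str) -> int:
    """Return the HTTP status PyPI gives for a project name (404 if unregistered)."""
    try:
        with urllib.request.urlopen(PYPI_JSON_URL.format(name=name), timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def pip_install_targets(text: str) -> list[str]:
    """Pull package names out of `pip install ...` lines in a block of text.

    Simplification: option flags are skipped, but flags that take a value
    (for example `-r requirements.txt`) are not handled.
    """
    names = []
    for line in text.splitlines():
        match = re.search(r"\bpip3?\s+install\s+(.+)", line)
        if not match:
            continue
        for token in match.group(1).split():
            if token.startswith("-"):
                continue
            # Drop version pins and extras such as "requests==2.31" or "uvicorn[standard]".
            names.append(re.split(r"[=<>!~\[;]", token, maxsplit=1)[0])
    return names


def find_unregistered(names, status_of=pypi_status):
    """Return the names the registry does not recognise (HTTP 404).

    `status_of` is injectable so the logic can be exercised without network access.
    """
    return [name for name in names if status_of(name) == 404]
```

A caveat that matters for everything that follows: a name that passes this check is only registered, not safe. Once an attacker claims a hallucinated name, the lookup answers like it would for any other package.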
The headline result: 19.7 percent of all recommended packages did not exist. Nearly one in five. The rate split sharply by model class. Commercial systems were the cleaner performers, averaging roughly 5 percent, with the best of them landing under 4 percent. Open-source models were far worse, averaging close to 22 percent, with some of the most popular code-focused open models hallucinating packages more than a third of the time. Across the full run, the study catalogued over 200,000 unique hallucinated package names, a vast invented vocabulary of software that does not exist but that an AI will hand you with total confidence.
Why Repetition Is The Dangerous Part
A hallucination that happens once and never again is a nuisance. A hallucination that happens the same way every time is a target. The most important finding in the research was not the headline percentage but the consistency underneath it. When the researchers re-ran the same hallucination-producing prompts ten times each, 43 percent of hallucinated package names came back in all ten runs, and 58 percent came back more than once. The models were not improvising fresh nonsense each time. They were returning to the same fictional names, predictably, on command.
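The repetition measurement is easy to picture in code. The sketch below is an assumption about how such a tally could be computed, not the study's implementation: given the hallucinated names from each repeated run of a single prompt, it reports what share of names showed up in every run, in more than one run, or only once. The function name `repetition_profile` and its output keys are hypothetical.

```python
from collections import Counter


def repetition_profile(runs: list[set[str]]) -> dict[str, float]:
    """Summarise how often each hallucinated name recurs across repeated runs.

    `runs` holds one set of hallucinated package names per run of the same prompt.
    Shares are fractions of the distinct names seen across all runs.
    """
    counts = Counter(name for run in runs for name in run)
    total = len(counts)
    if total == 0:
        return {"every_run": 0.0, "more_than_once": 0.0, "only_once": 0.0}
    n_runs = len(runs)
    return {
        "every_run": sum(c == n_runs for c in counts.values()) / total,
        "more_than_once": sum(c > 1 for c in counts.values()) / total,
        "only_once": sum(c == 1 for c in counts.values()) / total,
    }
```

From an attacker's point of view, the names worth registering are the ones in the first bucket: they are the names a model hands out nearly every time it is asked.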
That predictability is what converts a quirk into an exploit. An attacker does not need to guess what a model might invent. They can simply ask it the same coding questions a developer would, write down the nonexistent packages it recommends, and register those exact names on a public registry such as PyPI for Python or npm for JavaScript. From that moment on, every developer whose assistant gives the same answer and who runs the install command without checking is pulling the attacker's code straight into their project. The study even broke down where the fabrications came from: a little over half were entirely invented, more than a third were warped versions of real packages, and a small share were simple typos. Each category is a different flavor of the same trap.
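The warped-name and typo categories are the ones a plain string comparison can sometimes catch. The sketch below is an illustration, not the study's classification method. It computes Levenshtein edit distance between an unfamiliar name and a list of packages a team already trusts. The `trusted` list and the distance threshold of 2 are assumptions chosen for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a character from a
                curr[j - 1] + 1,           # insert a character into a
                prev[j - 1] + (ca != cb),  # substitute, free if the characters match
            ))
        prev = curr
    return prev[-1]


def near_misses(candidate: str, trusted: list[str], max_distance: int = 2) -> list[tuple[str, int]]:
    """Return (trusted_name, distance) pairs close to candidate, nearest first.

    Exact matches are excluded: the point is to flag names that are almost,
    but not quite, a package the team already knows.
    """
    hits = []
    for name in trusted:
        distance = edit_distance(candidate.lower(), name.lower())
        if 0 < distance <= max_distance:
            hits.append((name, distance))
    return sorted(hits, key=lambda pair: pair[1])
```

An entirely invented name will match nothing here, which is why a check like this covers only part of the problem.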
Slopsquatting, And The Package That Got Downloaded 30,000 Times
The attack now has a name. In April 2025, Seth Larson, the security developer-in-residence at the Python Software Foundation, coined the term slopsquatting, a fusion of AI slop and the older practice of typosquatting. Typosquatting bets on human fingers slipping; slopsquatting bets on machine confidence, on the names a model will recommend that simply are not real. The label stuck because it described something practitioners were already watching emerge.
The clearest demonstration that this works in the wild came before the term even existed. A security researcher noticed that AI models kept recommending a Python package called huggingface-cli, a shortened, tidier-looking name for a real tool whose actual installation works differently. The package the models kept suggesting did not exist. To test the risk, the researcher uploaded an empty, harmless stand-in package under that hallucinated name to the public Python registry and waited. Over the following three months, that empty package was downloaded more than 30,000 times. Worse, a major technology company copied the hallucinated install command into the public documentation of one of its own repositories, effectively telling its own users to pull a package that, had a malicious actor claimed the name first, could have carried anything at all.
The package was benign by design, and no one was harmed. But the experiment removed any remaining doubt about whether the theory translated into practice. More than 30,000 real downloads of a package that began life as nothing more than an AI hallucination are not a hypothetical. They are a measurement of how readily trust in an AI suggestion converts into code running on real machines.
The selling point of an AI coding assistant is that you can move faster because you no longer have to check everything yourself. Slopsquatting weaponizes exactly that promise. The faster a developer trusts the suggestion, the cleaner the attack, because the entire exploit depends on a human not pausing to ask whether the recommended package was ever real.
The Agent Era Removes The Last Human Checkpoint
Until recently, slopsquatting still required a person somewhere in the loop. A developer had to read the AI's suggestion, notice the package, and choose to install it. That moment of human attention, however thin, was a checkpoint. It was the place where a careful engineer might pause and think, I have never heard of that library, let me look it up. The rise of autonomous coding agents, the kind that read a task, write the code, resolve the dependencies, and run the install commands on their own, threatens to delete that checkpoint entirely.
An agent built to complete a task without supervision has no instinct to be suspicious of a plausible-sounding package name. It will resolve the dependency the same way it resolves everything else, by doing what looks correct and proceeding. If the model that drives that agent hallucinates the same package name it hallucinated for the researchers, and an attacker has registered that name, the malicious code is fetched and executed with no human ever seeing it. The supply chain risk does not scale linearly with the number of agents deployed. It scales with how completely those agents are trusted to act without review, and the entire pitch of the agent era is that they should be trusted to act without review.
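One way to put a checkpoint back is to make the agent's install step pass through a gate that only knows about packages a human has already reviewed. The following is a minimal sketch under stated assumptions: the function names, the idea of an `approved` set, and the decision to return the blocked names are all hypothetical, and the parsing ignores flags that take a value. Name normalisation follows the rule in PEP 503 (lowercase, with runs of `-`, `_` and `.` treated as equivalent).

```python
import re
import shlex


def normalize(name: str) -> str:
    """Normalise a project name as PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def gate_install(command: str, approved: set[str]) -> list[str]:
    """Return the packages in a proposed `pip install` command that nobody has approved.

    An empty list means every requested package is on the reviewed allowlist.
    Nothing is executed here; the caller decides what to do with the result.
    """
    tokens = shlex.split(command)
    if "install" not in tokens:
        return []  # not an install command; outside the scope of this sketch
    requested = tokens[tokens.index("install") + 1:]
    allowed = {normalize(name) for name in approved}
    blocked = []
    for token in requested:
        if token.startswith("-"):
            continue  # option flag; flags that take a value are not handled here
        # Strip version pins and extras before comparing names.
        name = re.split(r"[=<>!~\[;]", token, maxsplit=1)[0]
        if normalize(name) not in allowed:
            blocked.append(name)
    return blocked
```

With an allowlist of `{"requests", "numpy"}`, the command `pip install requests huggingface-cli` comes back with `["huggingface-cli"]`, and the agent can be made to stop and ask rather than proceed. The design choice is that the list is built by people ahead of time, so a hallucinated name fails the check whether or not an attacker has registered it.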
This is the throughline that connects package hallucination to every other failure we document. The promise of AI is the removal of human effort, of double-checking, of the slow verification that used to sit between a suggestion and an action. Each removed checkpoint is sold as a productivity gain. Slopsquatting is the reminder that some of those checkpoints were never overhead. They were the immune system, the moment a human caught the thing the machine got confidently wrong, and a model that hallucinates the same nonexistent package on demand is precisely the kind of error that immune system existed to stop.
The Verdict
Nearly one in five packages recommended across the study's 16 models did not exist, and the hallucinations repeat predictably enough to be registered and weaponized. A benign empty package under one hallucinated name drew more than 30,000 downloads. The failure here is not a bad answer in a chat window. It is fabricated output flowing straight into real software, and the autonomous agents being sold as the next leap forward are designed to remove the last human who might have caught it.