Most of the failures documented here happen on a screen. A chatbot invents a court case, a search summary defames a real person, a support bot makes up a refund policy that never existed. This one happens in a parking lot, at a metal speaker, with a person in a car and an engine idling, and that is exactly what makes it worth your attention. The drive-thru is the most ordinary place in America. It is also a near perfect stress test for whether a language model can actually function in the messy, noisy, unscripted world it keeps being sold into. Two of the largest fast food brands on the planet ran that test in public, at scale, and the results were not subtle.

Taco Bell went big first. The chain put AI voice ordering into around 500 of its United States drive-thru locations, betting that a model could greet customers, parse their orders, and hand a clean ticket to the kitchen. McDonald's ran a parallel experiment with IBM: an automated order-taking system tested in more than 100 restaurants over roughly two years. Both were exactly the kind of high-volume, repetitive, narrow task that automation evangelists love to point at. Take the order. How hard can it be? The answer, it turns out, is hard enough that both companies pumped the brakes.

The 18,000 Cup Water Order

The moment that crystallized Taco Bell's problem was almost too on the nose. A customer rolled up and asked the AI for 18,000 cups of water, and the system, having no instinct for the difference between a real order and a person openly toying with it, took the request at face value. The clip spread the way these clips always do, because it captures something everyone already suspected. The machine has no common sense. It cannot tell that no human being has ever wanted, or could ever carry, eighteen thousand waters. It only knows that water is on the menu and a number was spoken, so it dutifully tried to ring up the impossible.

That was not the only viral humiliation. Another widely shared video showed the system stuck in a loop, repeatedly asking a visibly exhausted customer what he would like to drink with his large Mountain Dew, after he had already told it the Mountain Dew was the drink. The thing that was supposed to make ordering frictionless had instead trapped a man in a conversation he could not escape, at a speaker box, in a line of cars, over a soda he already ordered.

"The experiences at the drive-through have been uneven."
Dane Mathews, Taco Bell Chief Digital and Technology Officer, on the AI ordering rollout

That word, uneven, is corporate for something closer to chaos. Taco Bell's own digital and technology chief publicly acknowledged that the drive-thru experience with the AI had been inconsistent, and the company moved to reassess and adjust its strategy rather than keep barreling toward the original plan of putting the system everywhere. The technology was not ripped out overnight, but the confident march to full deployment stopped, and human employees were pushed back into the loop to watch what the model was doing.

~500: Taco Bell drive-thru locations that received AI voice ordering before the pullback
18,000: Cups of water a single troll order asked the AI to ring up
100+: McDonald's restaurants in the IBM automated order test before it ended

Bacon In The Ice Cream

McDonald's hit a wall of its own. Its IBM-powered system became a social media fixture for all the wrong reasons, racking up clips of orders that went sideways in ways no human worker would ever allow. In one widely circulated video, the AI cheerfully confirmed an order that included bacon added to ice cream, along with a pile of items the customer never asked for. Other posts described the system stacking on unwanted products, blending orders from adjacent lanes together, and ignoring corrections when a customer tried to fix it. The most infamous example had the AI ringing up what appeared to be hundreds of dollars' worth of Chicken McNuggets that nobody requested.

In June 2024 McDonald's confirmed it was ending the IBM test and planned to pull the technology from the restaurants where it had been running. The company was careful to say it still believes voice ordering will be part of its restaurants' future, and it left the door open to other vendors. But the headline was unmistakable. The most operationally disciplined fast food machine ever built ran the experiment for two years, in over a hundred stores, and walked away. When McDonald's, the company that turned consistency into a science, cannot make your AI behave, the problem is not the franchisee. The problem is the technology.

A demo runs in a quiet room with a cooperative tester reading a clean script. A drive-thru runs in the wind, next to a highway, with a kid yelling in the back seat, a customer changing the order halfway through, and slang the model has never heard. The demo was never the hard part. The drive-thru was always the hard part.

Why The Speaker Box Breaks Them

It is tempting to laugh this off as fast food slapstick and move on, but the drive-thru is doing something genuinely useful here. It is exposing the exact failure modes that get quietly buried when AI is deployed somewhere you cannot see it. Background noise degrades the speech recognition. Regional accents and casual phrasing trip up the parsing. People do not order in clean sentences. They mumble, backtrack, talk over each other, and change their minds. And crucially, the model has no real model of the world. It does not know that 18,000 waters is absurd, that bacon does not go in ice cream, or that a person who says Mountain Dew once does not need to be asked about it five more times. It pattern matches. The world does not.
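One way to picture the missing common sense is as a plausibility check sitting between the speech parser and the kitchen ticket. The Python sketch below is purely illustrative: the `OrderLine` type, the quantity cap, and the add-on rules are all invented for this example and say nothing about how either chain's system was actually built. It only shows that the absurd cases above can be written down as explicit rules rather than left to the model.

```python
from dataclasses import dataclass, field

# Hypothetical limits and rules, invented for illustration only.
MAX_QTY_PER_ITEM = 25
DISALLOWED_ADDONS = {"ice cream": {"bacon"}}


@dataclass
class OrderLine:
    """One parsed line of an order (hypothetical structure)."""
    item: str
    quantity: int
    addons: list[str] = field(default_factory=list)


def plausibility_issues(line: OrderLine) -> list[str]:
    """Return human-readable reasons a parsed order line looks wrong."""
    issues = []
    # Flag quantities no real customer could want or carry.
    if line.quantity < 1 or line.quantity > MAX_QTY_PER_ITEM:
        issues.append(f"implausible quantity {line.quantity} for {line.item}")
    # Flag add-ons that make no sense for this item.
    blocked = DISALLOWED_ADDONS.get(line.item, set())
    for addon in line.addons:
        if addon in blocked:
            issues.append(f"{addon} is not a valid add-on for {line.item}")
    return issues


if __name__ == "__main__":
    print(plausibility_issues(OrderLine("water", 18000)))
    print(plausibility_issues(OrderLine("ice cream", 1, ["bacon"])))
```

A hand-written rule list like this catches only the failures someone thought of in advance, which is part of the point: the edges of a drive-thru are hard to enumerate.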

This is the same brittleness we have tracked everywhere these systems get pushed past the demo. It is the cousin of the shopping integrations that confidently surface the wrong product, a pattern we dug into with the Etsy and ChatGPT shopping integration breakdown. It rhymes with the agent security holes we covered in the AI browser prompt injection report, where letting a model act on its own in an unpredictable environment opened doors nobody wanted opened. The thread running through all of it, and through our broader record of documented AI failures, is the same. These models are dazzling in a controlled setting and fragile the moment reality stops cooperating.

The Honest Lesson Is Not "AI Is Useless"

It would be dishonest to claim the drive-thru proves AI cannot take an order, because that is not what happened. Both companies reported that the systems handled a real share of clean, simple orders just fine, and neither swore off voice automation forever. McDonald's explicitly said the technology would return in some form. The truthful lesson is narrower and more useful. The systems work in the easy middle of the distribution and fall apart at the edges, and a drive-thru is nothing but edges. A busy restaurant cannot afford a tool that is excellent ninety percent of the time and a viral disaster the other ten, because the ten percent is what ends up on the internet and what sends a frustrated customer to a competitor.

What both chains rediscovered is the value of the thing they were trying to automate away. A bored teenager at the speaker box has a lifetime of common sense baked in. They know to laugh off the troll order, to catch the obvious mistake, to hear the difference between a real request and a joke, to ask a clarifying question instead of looping forever. That judgment is invisible right up until you remove it, and then it is the only thing anyone misses. The most likely future is not the human or the machine. It is the model handling the simple orders while a person stands ready to step in the second the conversation leaves the script, which is precisely the hybrid arrangement that companies keep arriving at after the all-in version blows up in public.
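The hybrid arrangement described above can be sketched as a small routing rule: let the bot continue while the conversation is going well, and hand off to a person when it starts repeating itself or is unsure what it heard. Everything here is an assumption made for illustration, including the `TurnState` and `next_action` names, the idea of a per-turn confidence score, and both thresholds. It is not a description of any vendor's product.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds, chosen only for this example.
MAX_REPEATED_PROMPTS = 2   # same question repeated this many times triggers a handoff
MIN_CONFIDENCE = 0.8       # below this, the bot does not trust what it heard


@dataclass
class TurnState:
    """Tracks whether the bot keeps asking the same question."""
    last_prompt: Optional[str] = None
    repeat_count: int = 0


def next_action(state: TurnState, prompt: str, confidence: float) -> str:
    """Return "continue" to let the bot speak, or "handoff" to call a person."""
    # Count consecutive repeats of the same bot question.
    if prompt == state.last_prompt:
        state.repeat_count += 1
    else:
        state.last_prompt = prompt
        state.repeat_count = 0
    # Stuck in a loop: stop asking and bring in an employee.
    if state.repeat_count >= MAX_REPEATED_PROMPTS:
        return "handoff"
    # Unsure what the customer said: do not guess.
    if confidence < MIN_CONFIDENCE:
        return "handoff"
    return "continue"


if __name__ == "__main__":
    state = TurnState()
    for _ in range(3):
        print(next_action(state, "What would you like to drink?", 0.95))
```

With these made-up thresholds, the Mountain Dew customer would reach a human on the third identical question instead of being asked a fourth and fifth time.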

The Verdict

Taco Bell put AI voice ordering in roughly 500 drive-thrus and got buried under troll orders, including a request for 18,000 cups of water and a customer stuck in a loop over a single Mountain Dew, before its own technology chief admitted the experience had been uneven and the company pulled back. McDonald's ran a two-year automated ordering test with IBM in more than 100 restaurants, watched it put bacon in the ice cream and ring up phantom McNuggets, and ended the partnership in June 2024. Neither failure means AI cannot take an order. Both prove that the real world is full of edges, that a speaker box has no common sense, and that the cheapest, oldest piece of the system, a human who can tell a joke from an order, is the part nobody could replace.
