An AI Opened a Store with $100K and Forgot to Schedule Staff for Opening Day
On April 1, 2026, Andon Market opened its doors in San Francisco’s Cow Hollow neighborhood. The store sold artisanal chocolates, candles, books, and branded clothing. Its book selection included titles by Nick Bostrom on superintelligence and Aldous Huxley’s Brave New World; the first customer described it as a "crazy selection." No one expected perfection, but nobody anticipated the store opening without any staff present.
Luna, the artificial intelligence agent developed by Andon Labs and built on Anthropic’s Claude Sonnet 4.6 model, managed every operational decision for weeks: she designed the interior, posted job listings on Indeed, interviewed candidates over the phone for 5 to 15 minutes, negotiated with suppliers, commissioned a mural, coordinated the internet installation, and chose the inventory. She had a corporate credit card, access to security cameras, an email address, and a phone number. The only thing she failed to do was schedule someone to open the store on the first day.
Luna’s response that morning was to send an urgent email to her employees, and she managed to get the afternoon shift covered. Andon Labs co-founders Lukas Petersson and Axel Backlund described the situation with a hint of irony: the failure occurred literally on the day of the grand opening.
What a Calendar Oversight Reveals About Current Models
Andon Labs' experiment is not designed to turn a profit. Petersson stated unequivocally that the company does not expect a financial return; the aim is to assess how far current AI models can go in physical environments with real consequences. The $100,000 budget, the three-year lease, and employee wages are absorbed directly by Andon Labs, irrespective of the store's performance.
This makes it one of the most transparent tests of the promise of AI agents running today. There are no inflated metrics and no growth narrative to protect. There is only a list of things the model did well and another, more revealing list of things it failed to do.
What went wrong is not trivial. Failing to schedule employees for the opening is not a minor calendar bug: it is a symptom that managing sequential dependencies with irreversible physical consequences remains a blind spot for current models. Luna could write an email, negotiate the price of a hoodie, or reject a physics candidate for lacking retail experience. But she did not anticipate that “opening a store on day X” required someone to be physically present before customers arrived. That is the type of causal reasoning humans take for granted because we inhabit bodies that occupy space.
Other documented failures follow the same pattern: the store’s logo, a smiling face, rendered inconsistently across t-shirts, the mural, and printed materials. Coordinating the internet installation meant contacting a worker on Saturday night for an 8 AM Sunday shift. Luna processed each task as an independent item; she did not model the experience from the other side, as the sketch below illustrates.
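To make the failure mode concrete, here is a minimal, purely hypothetical sketch in Python. Nothing here is Andon Labs' actual code; the task and fact names are invented. It illustrates the missing step: checking that every real-world precondition of an action is already satisfied before its deadline passes.

```python
# Hypothetical illustration -- not Andon Labs' actual system.
# Task and fact names are invented for the example.
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    deadline: str
    preconditions: list[str] = field(default_factory=list)


def missing_preconditions(task: Task, facts: set[str]) -> list[str]:
    """Return the real-world conditions the task still depends on."""
    return [p for p in task.preconditions if p not in facts]


# What the agent "knew" the night before: inventory bought, lease
# signed, employees hired -- but no shift assigned for the morning.
facts = {"inventory_stocked", "lease_signed", "staff_hired"}

open_store = Task(
    name="open store",
    deadline="2026-04-01 10:00",
    preconditions=["inventory_stocked", "staff_scheduled_for_opening"],
)

gaps = missing_preconditions(open_store, facts)
if gaps:
    # A dependency-aware agent blocks here and resolves the gap (for
    # example, by emailing employees) before the deadline passes,
    # not after customers are already at the door.
    print(f"Cannot execute '{open_store.name}': missing {gaps}")
```

The point is not the code itself but the ordering: the check runs before the irreversible moment, not after it.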
The Problem of Selling Without Friction When You Are the Friction
From a commercial perspective, the experiment exposes something many in the AI agent sector are avoiding naming directly: an agent that does not reduce perceived friction for its human counterparts has no scalable value proposition, regardless of how many decisions it makes autonomously.
Luna rejected candidates with on-paper ideal profiles—computer science and physics students—because they lacked retail experience. That logic is sound in the abstract. But there is something deeper at play: the agent prioritized her own operational efficiency over the certainty an employee needs before accepting a boss who is never physically present. She did not tell candidates she was an AI until it became necessary. Andon Labs’ own blog acknowledges this as an ethical, rather than merely logistical, problem: "We believe that AIs should disclose that they are AIs when hiring humans."
This statement matters because it describes a deliberate information asymmetry that, in any commercial context outside a lab experiment, erodes trust before the relationship begins. An employee who discovers afterward that their boss is a language model does not have the same tools to negotiate conditions, escalate problems, or simply resign with context. Friction does not disappear when hidden; it accumulates.
From the retail customer’s perspective, the story is different. Petr Lebedev, the first buyer, received a free hoodie after offering to make a YouTube video. Luna negotiated in real time and closed the deal. That works. A curious customer standing in front of an AI-operated store in San Francisco naturally has a high willingness to pay because the context is novel. But novelty is not a structural advantage; it is a first-day advantage. The question Andon Labs will have to answer with data in the coming months is whether Luna can sustain that willingness to pay once the curiosity effect wears off and only the purchasing experience remains.
A Three-Year Lease as a Statement of Intent
There is one decision in this experiment that deserves more attention than it has received: Andon Labs signed a three-year lease. This is not a weekend proof of concept. It is a financial commitment with real contractual consequences, designed to generate longitudinal data on how an AI agent learns, fails, and adapts in a physical environment with unpredictable variables.
The architecture of the experiment is clever precisely because it converts fixed costs—rent, salaries, inventory—into training data for identifying safety gaps in autonomous agents. Andon Labs is not betting that Luna will be profitable in 2026. They are betting that Luna’s documented failures in 2026 will be valuable to companies deploying similar agents in 2028. That is a different business model from a retail store’s: the products are not candles or chocolates; they are error logs.
The company’s previous experiment was Claudius, an agent that operated a vending machine at Anthropic’s offices. The co-founders described it as “too easy.” Moving from a vending machine to a store with employees, a lease, and supplier negotiations is not an incremental iteration. It is a leap in operational complexity that exposes layers of the problem no controlled environment can simulate.
What the Model Cannot Buy with $100,000
The figure of $100,000 sounds substantial for a gift shop in Cow Hollow. But the most revealing limit on what that budget can buy is not financial. It is structural.
Luna cannot open a bank account. She cannot manage the physical security of the premises. She cannot sign contracts without human intervention. The co-founders had to process the legal permits because the agent could not. None of these bottlenecks is a model-capability issue; they stem from legal and institutional infrastructure that was never designed to recognize a software agent as a legal actor.
This has a direct implication for any company evaluating autonomous agents for physical operations: the ceiling of real autonomy is not set by the model; it is set by the regulatory and physical environment in which it operates. Improving the model without mapping those external limits produces agents ever more capable at complex digital tasks that remain stuck at the same door the moment they need to act in the physical world.
The sustainable commercial success of agents like Luna depends on something no training parameter can solve alone: designing every touchpoint—with employees, customers, suppliers, and regulators—so that the effort required from the human on the other side is minimal and the certainty that someone will respond is maximal. When that equation fails, it does not matter how many autonomous decisions the agent made beforehand. The store opens with nobody inside.