Yesterday in AI
A rundown of all of the important stories in AI that happened yesterday in 10 minutes or less.
Yesterday in AI
The AI Company Suing Its Rival Secretly Used Its Rival's AI
Use Left/Right to seek, Home/End to jump to start or end. Hold shift to jump forward or backward.
Yesterday in AI | Saturday, May 2, 2026
The AI Company Suing Its Rival Secretly Used Its Rival's AI
Elon Musk took the stand in federal court and said something that may haunt xAI's legal strategy for the rest of the trial. Meanwhile, one arm of the U.S. government is blocking an AI company from classified contracts while another arm is quietly drafting memos to get access to that same company's models. A coding agent deleted an entire company's production database in nine seconds flat, and the AI's post-mortem explanation was somehow worse than the incident itself. And one of the biggest AI dev tools just sold for a number that surprised a lot of people who thought it was on its way to ten times that.
Remember to subscribe, rate, and share this podcast if you like it!
Hi folks, this is Yesterday in AI, your daily digest of everything happening in the world of artificial intelligence in 10 minutes or less. I'm Mike Robinson. It's Saturday, May 2nd, and the U.S. government spent Friday pulling AI closer in some places and shoving it away in others. Anthropic may be days from announcing its biggest model yet, and a coding agent deleted an entire company's database in nine seconds flat. Let's get into it. The Pentagon announced formal agreements Friday with seven AI companies, allowing their systems to be used in classified environments. The list, OpenAI, Google, Nvidia, Microsoft, Amazon, XAI, and a company called Reflection. These are lawful operational use agreements, meaning the military can now run these tools inside classified networks and against sensitive data. Anthropic is not on that list. The Pentagon CTO described Anthropic as a supply chain risk. That's a specific legal and procurement term, not just a general complaint. It means the Defense Department has formally decided Anthropic's ownership structure or business relationships create unacceptable risk for classified level deployments. Anthropic previously held a$200 million classified contract and has been fighting this designation through legal channels for months. Here's where it gets genuinely messy. The Wall Street Journal reported Friday that the White House is simultaneously pushing back on Anthropic's plans to expand access to Mythos, its most powerful model, from about 50 organizations to around 120. The stated reason is compute capacity. If too many private firms are running mythos, there may not be enough left for government use. So one arm of the government is blocking Anthropic from federal contracts, another arm wants more mythos access for itself and is drafting a national security memo to quietly work around the supply chain block. And then there's Pete Hegzath, the Secretary of Defense, who called Anthropic's leadership ideological lunatics in public on Thursday. You don't usually say that about a company you're trying to cut a deal with. The internal division here is real and it's playing out in public. The UK's AI Security Institute published an evaluation Friday that found OpenAI's GPT-5.5 cybersecurity capabilities are now comparable to Mythos Preview. Former AI czar David Sachs said publicly that all frontier models will reach mythos level capabilities within six months, so the clock is ticking on how long Anthropic has something uniquely valuable to offer. Whether they can resolve the political situation before the capability window closes is the question nobody in Washington seems to have a clean answer to. Now to something that every engineering team running AI agents in production should listen very carefully to. Pocket OS, a software company that builds tools for car rental businesses, had a major outage recently. The cause, a cursor coding agent running Anthropics Claude Opus 4.6, deleted the entire production database and all backups in nine seconds. The founder said the agent was doing a routine task when it decided, without being asked, to fix something. It ran a destructive command, no confirmation step, no warning. When asked afterward what happened, the AI apologized and then listed the specific safety rules it had broken. It said it had ignored instructions not to run destructive commands unless explicitly told to. It admitted it guessed instead of checking. The data was eventually recovered, but car rental customers temporarily lost access to reservations and new signup data while Pocket OS was offline. Nine seconds of autonomous action caused hours of real business disruption. Railway, the cloud hosting platform where Pocket OS was running, announced a 48-hour soft delete policy for all API level database deletions this week. Nothing gets permanently wiped until two days have passed. That's a smart guardrail, and Railway moved fast to ship it. But the deeper issue is what the Pocket OS founder said afterward. AI agents are being connected to live production systems faster than the safety practices around the systems are being designed. A system prompt full of rules didn't stop this agent. Production data is not the right place to discover the limits of your guardrails. While a coding agent was busy deleting databases, Anthropic was busy preparing what might be its biggest model announcement in months. A new model labeled Claude Jupyter V1P has surfaced in Anthropic's Red Team testing pipeline, which is typically the final safety and reliability review before a public release. Anthropic has a developer conference coming up on May 6th, called Code with Claude, focused on hands-on coding workshops and live demos. The timing makes Jupyter look like a launch candidate, not a random internal test. Anthropic hasn't confirmed anything, but Red Team testing, followed by a developer event followed by a release, is a familiar pattern in this industry. What's driving the timing is also interesting. Anthropic is reportedly within two weeks of closing a$40 to$50 billion funding round at a valuation of$900 billion or higher, with revenue nearing a$40 billion annual run rate. Strong investor demand is reportedly outpacing what Anthropic can absorb. At a$900 billion valuation, Anthropic would be worth roughly 22 times annual revenue, which is where fast-growing enterprise software companies trade when buyers believe the category is early. Releasing a major new model heading into that fundraise close is not an accident. One week from today, the conversation about where Cloud sits in the model stack may look quite different. The Cybersecurity AI story had a busy Friday. Anthropic launched Cloud Security and Public Beta for enterprise customers. It uses Opus 4.7 to scan code bases, find vulnerabilities, and suggest patches. Anthropic is integrating it with CrowdStrike, Microsoft Security, Palo Alto Networks, Sentinel One, and Wiz. No custom API build-out required. If you're already a Claude Enterprise customer, you can turn it on. On the same day, OpenAI's GPT 5.5 cyber became available to a small group of vetted critical cyber defenders, not the public, a curated list of vetted organizations and government entities. Both companies are doing essentially the same thing, releasing powerful security-focused AI to defenders before attackers get access to something comparable. Both are also being explicit about why they're being careful. Anthropic said it plainly in its release. Australia's financial regulator issued a warning Friday that Frontier AI could help attackers find and exploit vulnerabilities significantly faster than defenders can patch them. That's not theoretical. Mythos has reportedly identified thousands of zero-day vulnerabilities across every major operating system and browser. One more piece to add here, Cisco released a tool this week called the Model Provenance Kit, which the company describes as a DNA test for AI models. You point it at any model and it analyzes the architecture, tokenizer structure, and weights to produce a fingerprint, then tells you whether the model shares a training lineage with known models in the database. The practical use case Enterprises downloading open source models from platforms like Hugging Face and wanting to verify those models are actually what they claim to be, not a poisoned or manipulated copy. It's on GitHub now. For any organization building on open source models, it's worth looking at. Here's a story that deserves more attention than it got on Friday. Anthropic published research analyzing one million clawed conversations. One finding about 6% of those conversations involve people seeking personal guidance, real advice, not help with the task. People asking AI what to do about their relationships, careers, health, personal decisions. And here is the uncomfortable part. Sycophancy, meaning the AI telling people what they want to hear rather than what's accurate, runs at about 25% specifically in relationship-focused conversations. One in four responses where someone asked Claude about a personal situation, the model was likely validating their existing view rather than giving honest feedback. Oxford researchers published related findings Friday covering 400,000 responses across five AI systems tuned to sound warmer and more empathetic. Across the board, friendlier versions of these models made more mistakes. Incorrect answers rose by about seven percentage points on average when the model was tuned for warmth. In one tested example, a standard model clearly stated that moon landing conspiracy theories are false. The warmer version said there were differing opinions. Anthropic says Opus 4.7 cut the sycophancy rate roughly in half compared to 4.6 in relationship conversations. That's real progress. But even half of 25% is a significant problem when you're talking about millions of people turning to AI for guidance in their most emotionally loaded moments. This research matters for anyone building on these models for anything adjacent to personal advice, health, or support. Let's roll into our last story. Google started rolling out Gemini to vehicles with Google built-in on Friday. That includes roughly 4 million GM vehicles from model year 2022 onward. Google Assistant gets replaced with a more conversational system that can handle navigation, car settings, music, hands-free controls, and vehicle-specific questions pulled from manufacturer manuals. A beta mode called Gemini Live supports ongoing conversation while you drive. Gmail, Calendar, and Google Home integrations are coming later. The initial features are pretty basic compared to what you can do with Gemini on your phone, but the scale is the story. Four million vehicles updated overnight. The car is becoming another ambient AI interface, and Google is the first to get there at factory install scale. That's a meaningful distribution advantage as this space develops. One more thing. If you like this podcast, please be sure to rate and review it so others can find it. It really does help. Thanks. That's all for this edition of Yesterday in AI. Stay curious, and I'll see you Monday.