Yesterday in AI

Yesterday in AI - OpenAI just doubled the price of intelligence...and the benchmarks might actually back it up

Mike Robinson



Yesterday in AI | Friday, April 24, 2026

OpenAI just doubled the price of intelligence...and the benchmarks might actually back it up.

Something happened Thursday involving the White House, China, and a technique that lets you steal billions in AI R&D without anyone noticing until it's too late. One AI product has been down for 48 hours while its status page insists everything is fine. A state just passed an AI bill that could reshape how the entire industry operates, and barely anyone is talking about it. Today's episode connects the dots between the model race, the geopolitics, and the governance questions that are starting to move faster than the headlines.


Remember to subscribe, rate, and share this podcast if you like it!

SPEAKER_00

Hi folks, this is Yesterday in AI, your daily digest of everything happening in the world of artificial intelligence in 10 minutes or less. I'm Mike Robinson. It's Friday, April 24th, and Thursday was the day OpenAI doubled the price of intelligence. The White House drew a hard line on Chinese AI theft, and Connecticut passed a bill that may be the most consequential state AI law in the country so far. Let's get into it.

Let's start with OpenAI's GPT-5.5, which dropped Thursday and came fast enough that some people may not have finished forming an opinion on GPT-5.4 yet. The model is live now for Plus, Pro, Business, and Enterprise users in both ChatGPT and Codex. Internally, OpenAI codenamed it Spud. The positioning is unambiguous. This is a model built for agentic work: coding, operating software, web research, and completing multi-step tasks without a human watching every move.

The benchmarks back that up. GPT-5.5 scored 82.7% on Terminal-Bench 2.0, which evaluates command-line workflows. On SWE-Bench Pro, which tests real GitHub issue resolution, it hits 58.6%. And on GDPval, a benchmark that tests model performance across 44 real professional occupations by comparing outputs directly to domain experts, GPT-5.5 matches or outperforms those experts 84.9% of the time. At that level of performance on real-world professional tasks, the conversation about AI in the workplace stops being about when and starts being about where.

The price moves significantly to match the capability. API access will be $5 per million input tokens and $30 per million output tokens, double what GPT-5.4 cost. A higher-performance GPT-5.5 Pro tier runs $30 per million input tokens and $180 per million output tokens. OpenAI says the API will be available very soon after the ChatGPT rollout. It wasn't live at launch Thursday.

The thing we're sitting with here isn't just the benchmarks, it's the tempo. This release comes six weeks after GPT-5.4, which itself came less than two months after GPT-5.
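To make the pricing above concrete, here is a rough back-of-the-envelope sketch. The GPT-5.5 and Pro prices are the ones quoted in the episode; the GPT-5.4 prices are derived from the "double" claim rather than quoted directly, and the model labels and monthly token volumes are hypothetical placeholders, not real API identifiers or usage figures.

```python
# Prices in USD per million tokens: (input, output).
# Dictionary keys are illustrative labels, not official API model names.
PRICES = {
    "gpt-5.5": (5.00, 30.00),       # quoted in the episode
    "gpt-5.5-pro": (30.00, 180.00),  # quoted in the episode
    "gpt-5.4": (2.50, 15.00),        # assumption: half of GPT-5.5, per "double"
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the API cost in USD for a given token volume."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical monthly workload, chosen only for illustration.
INPUT_TOKENS = 200_000_000
OUTPUT_TOKENS = 40_000_000

for model in PRICES:
    monthly = cost_usd(model, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{model}: ${monthly:,.2f} per month")
```

Because both the input and output prices double, the bill for an unchanged workload doubles too, whatever the mix of input and output tokens.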
For enterprise teams making architecture decisions, the cadence is now faster than most procurement cycles. A model decision made in February is already two generations old. The practical pressure this creates is toward model-agnostic infrastructure, rather than betting on a specific provider's release, and that structural shift may be as significant as any individual benchmark from Thursday's announcement. If the next release is also six weeks out, and the release after that, the companies getting ahead aren't the ones picking the best current model. They're the ones building systems that don't depend on which model is best today.

Now to a story that moved from the industry level to the government level Thursday, and one that has context worth understanding. Michael Kratsios, the director of the White House Office of Science and Technology Policy, released a formal memo accusing China of running, quote, deliberate industrial-scale campaigns to distill U.S. frontier AI systems, end quote.

The technique at the center of this, distillation, is worth explaining carefully because it doesn't always get unpacked in the coverage. The way distillation works, you run a frontier model millions of times, collect those outputs, and train a new model on that data. Done systematically at scale, you can approximate the capabilities of a system that took years and billions of dollars to build without doing the original training yourself. The White House memo describes Chinese entities running exactly this process against Claude, the GPT-5 family, and Google's models, using tens of thousands of proxy accounts to evade detection and jailbreaking techniques to extract proprietary information.

This isn't new territory for the AI labs. In February, Anthropic specifically named DeepSeek, Moonshot AI, and MiniMax as running distillation campaigns against Claude. What Thursday adds is formal government weight behind what labs have been filing in their own enforcement actions.
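As a toy illustration of the distillation loop described above (query a model, collect its outputs, train a new model on them), here is a minimal sketch. It is an analogy only: the "teacher" is a hidden linear function standing in for a frontier model behind an API, and the "student" is fit by ordinary least squares. Real LLM distillation trains a neural network on generated text, and every name here is hypothetical.

```python
import random


def teacher(x: float) -> float:
    """Stand-in for a model behind an API. The student only sees its outputs."""
    return 3.0 * x + 2.0


def collect_outputs(n_queries: int, seed: int = 0) -> list[tuple[float, float]]:
    """Query the teacher many times and record (input, output) pairs."""
    rng = random.Random(seed)
    inputs = [rng.uniform(-10.0, 10.0) for _ in range(n_queries)]
    return [(x, teacher(x)) for x in inputs]


def fit_student(pairs: list[tuple[float, float]]) -> tuple[float, float]:
    """Fit y = w * x + b to the collected pairs with ordinary least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    w = cov / var
    b = mean_y - w * mean_x
    return w, b


pairs = collect_outputs(1000)
w, b = fit_student(pairs)
print(f"student learned w={w:.4f}, b={b:.4f}")
print(f"teacher(5.0)={teacher(5.0):.4f}, student(5.0)={w * 5.0 + b:.4f}")
```

The student never sees how the teacher was built, only what it returns, which is why the volume of queries and the accounts used to make them are where labs look for this kind of activity.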
The memo carries a threat of a government crackdown, and its timing, ahead of a U.S.-China summit scheduled for next month, means it won't land quietly on either side. The broader context is worth focusing on. We've covered several Chinese model releases this week, Qwen 3.5 Omni from Alibaba and Kimi K2.6 from Moonshot, both of which benchmark competitively with U.S. frontier models at a fraction of the cost. That compression in capability gaps is real and fast. Thursday's memo is the government asking, in a very public way, whether some of that compression was built on capabilities that weren't originally theirs to use.

Quick update on Meta Thursday, building on what we flagged last weekend. We covered on April 19th that Meta was planning to cut roughly 8,000 employees, about 10% of its workforce, with cuts starting May 20th. Thursday brought the official confirmation, plus the number that puts it in context. Meta's projected capital spending for 2026 is $115 billion to $135 billion, up from $72.2 billion in 2025. That's an increase of roughly 59% to 87% in infrastructure investment in a single year. The primary driver is Meta Superintelligence Labs, the internal AI research unit Zuckerberg announced earlier this year. In January, Zuckerberg said 2026 would be, quote, the year AI starts to dramatically change the way we work, end quote. Thursday's announcement confirmed that sentence is no longer forward-looking; it's now a budget line.

Also on Thursday, Grok has been effectively down for more than 48 hours, and how xAI has responded is a story in itself. Grok started throwing high-demand errors Tuesday. As of Thursday, both free users and paid SuperGrok subscribers were locked out or getting intermittent access at best. Unofficial tracking services flagged the outage within hours of it beginning. xAI's official status page has shown service fully operational throughout. No incident declared, no acknowledgement of any problem. The apparent cause is a feature rollout.
Custom templates and smarter video extensions went live this week, and the surge in traffic exceeded what the infrastructure could handle. That's a fixable technical problem. The pattern underneath it is harder to fix. This is not the first time Grok has had extended reliability issues in 2026, and the gap between what the official status page reports and what users actually experience has been consistent throughout.

The timing makes it pointed. We covered Tuesday that xAI is entering a $60 billion arrangement with SpaceX and Cursor, with Colossus as the compute backbone. The pitch for that partnership is the ability to scale AI capacity at a level rivals can't match. The flagship consumer product has simultaneously been down for two and a half days without so much as an acknowledgement. Enterprise trust is built across thousands of small reliability moments, and this was not one of them.

Let's close with a story that doesn't get the same attention as the model releases, but may matter more in the long run. Connecticut's state legislature passed one of the most comprehensive AI bills in the country this week, and it's worth understanding what's actually in it. The Connecticut Senate voted 32 to 4 on Tuesday for Senate Bill 5, a 40-section omnibus bill that regulates AI across a wide range of areas. The headline provision targets developers of frontier AI models, the high-capability systems we cover in this show every day, and creates new accountability requirements for the companies building them. But the bill goes considerably further than that. It creates a state AI sandbox, a formal testing environment where companies can develop new AI products under regulatory oversight before full public deployment.
It establishes new rules around employment decision-making by AI, meaning a company can't use an algorithm to determine hiring or termination without accountability mechanisms, and it creates new protections specifically around youth-facing AI chatbots and social media. There's a real policy design question embedded in that last provision.

States have been trying to pass meaningful AI laws since California's SB 1047 debate two years ago, with most efforts stalling in committee or dying on a governor's desk. Connecticut's bill actually cleared a full Senate vote. At 32-4, it wasn't close. It now heads to the House. Some lawmakers pushed back on the bill as too broad and too likely to chill the state's economic competitiveness. That's the standard objection to every state-level AI bill that gets real traction, and the concern isn't groundless. But the 32-4 margin suggests it didn't carry enough weight to stop it.

Connecticut's bill isn't federal law, and it isn't California. But when a state Senate passes a 40-section AI bill with that kind of margin in the same week the White House is formally accusing China of industrial-scale AI theft, it's worth noting the governance gap around AI is being felt at every level of government simultaneously. Washington is focused on geopolitics. States are focused on workers, kids, and economic competitiveness. And the companies building the models are operating mostly between those two conversations for now.

That's all for this edition of Yesterday in AI. Stay curious, and I'll see you tomorrow.