AI Mornings with Andreas Vig

Your daily AI news briefing in under 10 minutes. New models, product launches, research breakthroughs, and industry shifts, explained clearly, no hype.

ARC-AGI-3 Resets AI Scoreboard & Inside Sora's $1M/Day Collapse

April 02, 2026

François Chollet's new ARC-AGI-3 benchmark stumps every frontier model below 1%. Plus: inside Sora's million-dollar-a-day burn rate, Anthropic's GitHub takedown blunder, and the startup using AI to design chips.

SPEAKER_00 0:01

Hey, welcome to AI Mornings with Andreas Vig. It's April 2nd, 2026. François Chollet's ARC Prize Foundation just dropped a reality check for anyone claiming AI is approaching general intelligence. They released ARC-AGI-3, a dramatically harder version of their reasoning benchmark, and the results are striking. Humans solve every task on the first try, but the best frontier models are scoring below 1%. Gemini Pro leads at 0.37%, GPT-5.4 High hit 0.26%, Claude Opus 4.6 managed 0.25%, and Grok 4.20 scored 0. The test presents game-like scenarios with zero instructions: agents have to discover rules, form goals, and plan strategies entirely from scratch. Labs spent millions training on earlier ARC versions and pushed scores from 3% to around 50% in under a year. Now the question is whether this new test will reveal genuine reasoning progress or just expensive brute-force pattern matching. There's a $1 million prize backing the challenge, and the foundation says frontier labs are paying far more attention this time around.

We also got new details about OpenAI's abrupt Sora shutdown, and they paint a picture of a product burning cash at an alarming rate. A Wall Street Journal investigation found Sora was consuming roughly 1 million US dollars per day in compute. When OpenAI pulled the plug, Disney found out less than an hour before the public announcement, despite having a $1 billion partnership in the works and an enterprise pilot already running for marketing and visual effects work. That relationship is now effectively dormant. The freed-up compute is going to a model codenamed Spud, targeting coding and enterprise applications as OpenAI pivots to compete with Anthropic.

Speaking of Anthropic, they had a rough week on the execution front. After source code for Claude Code leaked, the company tried to issue DMCA takedown notices. But they accidentally hit about 8,100 GitHub repositories, including legitimate forks of their own public code.
Boris Cherny, who leads Claude Code, called it an accident. The targeted repo was part of a fork network, so the takedown cascaded much wider than intended. They retracted most of the notices and narrowed it down to one repository and 96 forks. Not a great look for a company reportedly preparing for an IPO.

A startup called Cognichip just raised 60 million US dollars to have AI design the chips that power AI. The premise is compelling. Advanced chips currently take three to five years from conception to mass production, with the design phase alone taking up to two years. Cognichip claims its technology can cut development costs by more than 75% and reduce timelines by half. Intel CEO Lip-Bu Tan is joining the board. The company has now raised $93 million total since 2024, and they're competing against both incumbents like Synopsys and Cadence and well-funded startups.

Alright, a few more things worth knowing about today.

Meta is going all in on natural gas for its AI infrastructure. The company announced funding for seven additional power plants in Louisiana, bringing the total to 10 plants that will generate about 7.5 gigawatts for its $27 billion Hyperion data center. That's slightly more electricity than the entire state of South Dakota uses. The plants will emit an estimated 12.4 million metric tons of CO2 annually, 50% more than Meta's entire 2024 carbon footprint.

Google Research introduced TurboQuant, a compression algorithm that shrinks AI model memory by more than six times without any retraining, while delivering up to eight times speed gains on Nvidia H100 chips. It scores perfectly on tests that bury key details in large amounts of text. AI memory company stocks dropped 3-5% on the news.

Google also upgraded Lyria, its music model, to generate full three-minute songs with intros, verses, and choruses, rolling out now across Gemini, Vertex AI, and Google Vids.
A startup called Enrich Labs launched Helena, an autonomous marketing agent that takes a company's URL, researches positioning and competitors, generates a strategy, and posts assets online. The launch video has over 3 million views.

Apple is reportedly testing a standalone Siri app and a new Ask Siri chatbot experience for iOS 27, slated to debut at WWDC on June 8th. The assistant will read across messages, emails, and notes for context and execute actions in third-party apps.

Microsoft released Critique and Counsel, features that let Claude review ChatGPT drafts and run both models side by side to see where they agree or disagree.

Reddit CEO Steve Huffman outlined a plan to label automated accounts with an app tag and use passkeys or World ID for human verification.

And Figure AI founder Brett Adcock launched Hark, a $100 million stealth startup building personalized AI paired with dedicated hardware, with an ex-Apple iPhone Air designer leading the hardware work.

That's it for today. See you tomorrow.