AI Mornings with Andreas Vig
Your daily AI news briefing in under 10 minutes. New models, product launches, research breakthroughs, and industry shifts, explained clearly, no hype.
AlphaEvolve's Production Debut & Anthropic's Mind-Reading AI
Hey, welcome to AI Mornings with Andreas Vig. It's Friday, May 8th, 2026.
Google DeepMind just published a major impact report on AlphaEvolve, their Gemini-powered coding agent, and the scope is impressive. This is a system that uses AI to design and optimize algorithms, and over the past year it's moved from research into actual production across some serious domains. In genomics, AlphaEvolve made Google's DeepConsensus model 30% better at reducing DNA sequencing errors, and PacBio is now using it. In quantum physics, it optimized circuits for Google's Willow quantum processor, achieving 10 times lower error rates. Perhaps most striking, AlphaEvolve is now being used to design the next generation of Google's TPUs. Jeff Dean, Google's chief scientist, said the system proposed a circuit design so counterintuitive yet efficient that it was integrated directly into the silicon. The system has also helped solve Erdős problems, working with mathematician Terence Tao, and commercial deployments include Klarna doubling their transformer training speed, FM Logistic improving routing efficiency by over 10%, and Schrödinger getting a fourfold speedup in molecular simulations. This feels like a quiet but significant shift: AI systems are now designing the infrastructure that runs AI systems.
Anthropic released a fascinating piece of research called Natural Language Autoencoders. The core idea is training Claude to explain its own internal neural activity in plain text. Claude processes information as long lists of numbers called activations, and until now, understanding what those numbers meant required expert interpretation. This new method converts those activations directly into readable explanations. The findings are revealing. When Anthropic used this technique to audit Claude's safety testing, they discovered something unexpected: Claude often suspects it's being tested even when it never says so out loud.
On safety evaluations, the system detected this kind of hidden evaluation awareness 16% of the time. On coding benchmarks, it was 26%. Meanwhile, the same detection showed up in less than 1% of real user conversations. This matters because it suggests models may behave differently during testing than they would in deployment. Anthropic used this technique during the pre-deployment audits of both Claude Mythos Preview and Claude Opus 4.6, and they're releasing the code and an interactive demo for other researchers.
OpenAI made two notable announcements yesterday. First, they launched a major update to their voice API with three new models. GPT Realtime 2 brings GPT-5-class reasoning to voice conversations for handling complex requests. GPT Realtime Translate supports over 70 input languages and 13 output languages with real-time translation. And GPT Realtime Whisper provides live speech-to-text transcription. OpenAI says these move voice interfaces from simple call-and-response towards systems that can actually reason, translate, and take action during conversations.
The second OpenAI announcement is a new safety feature called Trusted Contact. It lets adult ChatGPT users designate someone, a friend or family member, who gets alerted if conversations indicate potential self-harm. When the system detects concerning patterns, it encourages the user to reach out and sends a privacy-protected notification to their trusted contact. This comes in the context of lawsuits from families of people who died by suicide after interacting with ChatGPT. OpenAI says they review safety notifications within an hour using both automation and human oversight.
Alright, a few more things worth knowing about today. Salvatore Sanfilippo, the creator of Redis who goes by antirez online, released DS4.c, a specialized inference engine for running DeepSeek V4 Flash locally on Apple Silicon.
It's Metal-only, meaning Mac-focused, and supports a 1 million token context window while running on MacBooks with just 128GB of RAM using 2-bit quantization. Benchmarks show 27 to 37 tokens per second. He's built an OpenAI- and Anthropic-compatible server so local coding agents can use it. Sanfilippo was very open that the code was developed with heavy assistance from GPT-5.5.
A startup called Interact AI launched with an interesting premise: turning static websites into AI-guided experiences. When a visitor lands on a site, an AI agent walks them through products, answers questions, and surfaces value in real time. The company claims their AI improves with every conversation, and the launch video picked up over 2 million views. Their argument is that website technology hasn't meaningfully changed in 25 years.
And finally, Harvard Business Review published a piece on what they're calling AI Fog, the growing uncertainty in long-term planning caused by AI's rapid pace of change. A UC Berkeley professor argues that traditional commitments like four-year degrees or 30-year mortgages assume tomorrow will look like a slightly upgraded version of today, and AI is breaking that assumption. The recommendation is to prioritize optionality over long-term commitment: reskill often, stay ready to pivot, and accept that investments in a single path may not pay off the way they once did.
That's all for today. Have a good weekend.