Infinite Curiosity Pod with Prateek Joshi
The best place to find out how AI builders build. On this podcast, host Prateek Joshi interviews world-class AI founders and VCs. Visit prateekj.com to learn more about the host.
Diffusion LLMs - The Fastest LLMs Ever Built | Stefano Ermon, cofounder of Inception Labs
Stefano Ermon is the cofounder of Inception Labs and an associate professor at Stanford. Inception is developing a new type of AI model called Diffusion LLMs.
Stefano's favorite book: If on a Winter's Night a Traveler (Author: Italo Calvino)
(00:01) Introduction
(00:38) What are autoregressive LLMs and how do they work
(02:28) How diffusion LLMs rethink generation
(04:02) The ceiling of autoregressive LLMs: cost, latency, reliability
(06:19) Why diffusion LLMs are commercially viable now
(09:12) Parallel refinement: how diffusion models generate text
(12:05) Understanding diffusion steps and efficiency
(13:49) Hardest engineering challenges at Inception
(15:23) From research to production: the power of data
(16:24) Where diffusion LLMs still lag behind
(18:18) Evaluations and benchmarks for diffusion LLMs
(20:20) Developer experience and OpenAI-compatible API
(21:47) Economics and GPU efficiency
(23:38) Hardware and runtime stack
(24:58) Competition and the evolving diffusion LLM landscape
(27:01) Where diffusion will win first — coding and agentic systems
(30:13) How diffusion changes infra, serving, and hardware design
(33:04) What’s next at Inception: reasoning and multimodality
(35:20) Rapid Fire Round
--------
Where to find Stefano Ermon:
LinkedIn: https://www.linkedin.com/in/ermon/
--------
Where to find Prateek Joshi:
Research column: https://www.infrastartups.com
Newsletter: https://prateekjoshi.substack.com
Website: https://prateekj.com
LinkedIn: https://www.linkedin.com/in/prateek-joshi-infinite
X: https://x.com/prateekvjoshi