Infinite Curiosity Pod with Prateek Joshi
The best place to find out how AI builders build. The host Prateek Joshi interviews world-class AI founders and VCs on this podcast. You can visit prateekj.com to learn more about the host.
From 0 to $15M ARR in 3 months | Mukund Jha, CEO of Emergent
Mukund Jha is CEO of Emergent, an agentic vibe-coding platform. They've raised $23M from Lightspeed, Y Combinator, Together Fund, and Prosus. He was previously the cofounder and CTO of Dunzo, a hugely popular ecommerce company in India.
Mukund's favorite book: The Hard Thing About Hard Things (Author: Ben Horowitz)
(00:01) Intro
(00:07) State of vibe-coding and where we are today
(01:42) Emergent in plain English: what the product delivers
(03:07) From prototype to traction: the first 90 days
(06:03) What changed in the last 24 months (models + infra)
(08:13) Early infra bets that enabled speed
(12:07) Precision vs. control: editing and debugging without code
(14:21) One-click to production: the unglamorous infra behind it
(15:55) Points of failure across prompt → plan → code → test → deploy
(17:53) Models division of labor: planning, codegen, tests, commits
(20:05) What “reasoning” means and how they evaluate it
(22:13) Context & memory strategy (beyond naive RAG)
(24:22) Representing large codebases so agents don’t hallucinate structure
(27:03) Orchestration walkthrough: adding SSO end-to-end
(29:40) Agent coordination protocols (how agents talk)
(31:05) Debugging long-running agents and trace observability
(32:37) Company-building lessons from Dunzo to Emergent
(36:10) Philosophy: offloading decisions to models
(36:57) Rapid Fire Round
--------
Where to find Mukund Jha:
LinkedIn: https://www.linkedin.com/in/mukund-jha-a1596413/
--------
Where to find Prateek Joshi:
Newsletter: https://prateekjoshi.substack.com
Website: https://prateekj.com
LinkedIn: https://www.linkedin.com/in/prateek-joshi-infinite
X: https://x.com/prateekvjoshi
Prateek Joshi (00:01.855)
Mukund, thank you so much for joining me today.
Mukund (Emergent) (00:04.706)
Super excited to be here. Thanks for having me.
Prateek Joshi (00:07.687)
Let's start with the state of play for vibe-coding products. Obviously, vibe coding took off big time. A lot of people are building applications; they're building software. So where is the world today in terms of the vibe-coding platforms available?
Mukund (Emergent) (00:26.584)
Right, I mean, vibe coding got started when these LLM models came out. And the whole concept is that you just talk to an agent or an application just like you would talk to a developer, and it goes and builds the software for you. You never look at the code. You just look at the output. You do vibe checks. And that's how the term got coined. And we are building for non-technical users, so non-developers.
I think we are today in sort of the GPT-2, GPT-3 era of vibe coding, I would say. And I think we are at an inflection point. A lot of these apps today are building prototypes, building MVPs. But over the next six months, nine months, you'll start seeing a flurry of these apps running in production and truly getting a lot of usage, and a lot of people building their dream apps and getting them monetized. So the way I view it is that we are just in the beginning innings of this today.
And I think vibe coding will just become coding very soon, and most people will just vibe code. To us, it feels like the GPT-3 era of things, and we are probably a couple of model upgrades away from a really extreme vertical takeoff.
Prateek Joshi (01:42.215)
Yeah, I think that transition is going to be amazing, where vibe coding is just coding. Because at one point, assembly language coders were the real coders and everything else wasn't. But now we don't do that. The definition keeps shifting, abstractions keep going up, and more people come into the ecosystem. So maybe we can start with, in plain English, what is Emergent's offering? Basically, a new user comes to you, what does the product...
Mukund (Emergent) (02:11.49)
Yeah, so we allow anybody on the planet, without any coding knowledge, to come to the platform and just describe what they want in natural language. They can talk to the platform or they can just type and describe what they want: a web app, a website, or a mobile app. And the platform goes ahead and builds it out for them. And they get a full-stack, production-ready app that they can actually launch and monetize and get usage on. And essentially, what happens behind the scenes is we have this
very sophisticated coding agent that works autonomously behind the scenes to put together everything that your app needs, whether it's a backend or a frontend or a database or integrations. And what you get is a real, usable app. And that's the gap in the market that we're trying to fill: instead of just getting a prototype, a UI prototype, you actually get a full production-ready, real app which is secure, which has data privacy, everything built in. So you can actually go and launch this product and have usage on that.
Prateek Joshi (03:07.467)
Amazing. Let's go back to those initial few days. You saw a gap, you had this idea, you launched the company. So walk us through the first 90 days from prototype to traction. What were the maybe two or three decisions that unlocked this huge adoption that you're seeing?
Mukund (Emergent) (03:29.55)
Yes, I mean, our journey has been a bit longer. We launched three and a half months back, but we had been working in stealth for about eight months before we launched the product. And when we started off, we were initially building testing agents, which could go and test your mobile app or web app for enterprises. And then we slowly realized that the problems we were solving were general enough that we could actually build a general coding agent.
So we started building a general coding agent. And there's a benchmark called SWE-bench, which is essentially where all the coding agents are benchmarked. We became world number one on that multiple times. And that allowed us to really spend our time thinking through the problem well, in terms of what it would take for a non-developer to go from an idea to an actual usable app in production.
And those six, eight months really allowed us to be deep in research mode, really figure out how agents work, what it would take to build the world's best coding agent, and then build the entire infra around it to support the app-building process. And our realization was that to really get a production-ready, real-usage app,
what you would need to do is embed all of the best engineering practices within the platform. So our platform acts like the world's best engineering team encoded within the platform and agents. And that takes you to a production-grade app. So we were in stealth for a long time and built this out. And obviously, we were seeing this huge influx of demand from people on these platforms.
And we started testing our app-building platform and just saw that it was working better than most of the platforms. So we decided to launch in June this year. And since then, it's been growing like crazy. Over the last three and a half months, we've grown to almost 2 million users on the platform, with a couple of million apps built. And a lot of these apps are going into production and getting monetized. And in the first
Mukund (Emergent) (05:36.879)
90 days after launch, we really, really focused on getting the message out to users. We wanted to go as broad as possible and get as many users as possible. And what we saw was that the product was working really well, so we started getting a lot of users on the platform, and they started referring each other. Every time an app goes live, it attracts more users to the platform. And that's been the journey so far.
Prateek Joshi (06:03.051)
Amazing. And if you look at the models or the tooling stack, the infra: if you had done the same thing in 2023, two years ago, would this have been possible? Or rather, what got unlocked in the last 24 months such that now you can do this with trust and reliability, and it just looks great, smooth, it works? So what changed in the last 24 months?
Mukund (Emergent) (06:27.278)
I mean, obviously, model capabilities have improved. I think Claude 3.5 Sonnet was an inflection point in agentic possibilities. Before that, the hardest problem we were trying to solve was, how do you get a model to give you a JSON output? And we had built a whole layer around just that. Then 3.5 Sonnet came out, and 99% of the time the model would just give you what you asked for. So obviously, the model capability increase is what is leading to this big inflection point, where now models can actually
Prateek Joshi (06:43.881)
Alright.
Mukund (Emergent) (06:57.166)
almost match human-level coding expertise and can work for hours-long durations without any human assistance. And I think what was also important is that you have these model capabilities, but how do you harness them? How do you build the right infrastructure around them, the right architecture around them, that maximizes what you can get out of these models? And essentially we built out this whole multi-agent architecture
where different sub-agents come in and do specialized work. For example, we have a testing agent which will come and test your app. We have a design agent which will come and design your app for you. We also figured out how to variably give more compute to a harder problem. So when we see an app hit, let's say, a complex bug, we are able to dynamically scale up compute on those problems. And then we built out the whole memory layer, which allows
models, or the agents, to remember what they have done in the past and not repeat those mistakes. So one vector is obviously intelligence going up. And the second vector is essentially how you harness them to their best potential. Building up this whole infra, which is optimized for app building, really allows us to extract the most out of these models today.
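To make the "dynamically scale up compute on harder problems" idea concrete, here is a minimal, hypothetical Python sketch of an escalation ladder: cheap attempts first, then stronger models and parallel samples. The model names, ladder, and the run_attempt callback are illustrative assumptions, not Emergent's actual implementation.

```python
# Illustrative sketch (not Emergent's actual code): spend more compute only
# when a task keeps failing. Model names and the ladder are placeholders.
from dataclasses import dataclass

@dataclass
class Attempt:
    model: str              # which model handled the attempt
    parallel_samples: int   # how many candidate solutions were generated
    succeeded: bool         # did the post-fix tests pass?

ESCALATION_LADDER = [
    ("small-model", 1),     # cheap single attempt for easy problems
    ("frontier-model", 1),  # stronger model, still one attempt
    ("frontier-model", 4),  # fan out: several candidates, keep one that passes tests
]

def solve_with_escalation(task, run_attempt) -> Attempt | None:
    """Try progressively more expensive configurations until tests pass,
    mirroring the 'more compute for harder bugs' idea described above."""
    for model, samples in ESCALATION_LADDER:
        attempt = run_attempt(task, model=model, parallel_samples=samples)
        if attempt.succeeded:
            return attempt
    return None  # ladder exhausted; surface to a human or another strategy
```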
Prateek Joshi (08:13.791)
You know, that's amazing. And when you build infra like this, obviously we could spend forever building the most perfect thing, but in startups, velocity is everything. So what infra bets did you make early that have shaped your velocity? And when I say infra bets, I mean models, context management, memory, evals. What bets did you make that allow you to move so fast?
Mukund (Emergent) (08:39.278)
So there were a few early things that we did which are paying off really well for us right now. One, I think we just spent a lot of time understanding the models. We spent a bunch of time just reading through the traces of every single model output and building a really strong intuition of which models are good at what, and where models are not that great. And today,
if I read a trace, I can tell you, hey, Claude is not behaving as expected. So we built this intuition, and that really allows us to peek into the future a little bit, saying which model capabilities should be solved for today versus which should not be solved for today. That, at a base level, set the right foundation for us. The second thing: we took an asymmetric bet. In order to move fast, most people were relying on third parties, let's say Supabase to
provide a database, or something like E2B for running their code. And we had this notion that the agents are going to be as good as the feedback loops you give them. If an agent can't actually look at an error, if it can't look in your database, it's not going to be able to solve the problem. Because invariably, it'll make mistakes. The important question is, can it actually improve on those mistakes? Can it solve them?
So we took this approach of building everything in-house, the entire stack, completely. We built everything on native Kubernetes so that we are able to scale, keep costs efficient, and in the end provide agents with the same dev environment that a regular developer gets. Everything is packaged: the backend, authentication, frontend, databases. Everything is packaged within the local machine. And that allows for a really superior app-building experience, because agents can now, just like a regular developer, look into a database,
figure out an error, and solve for it. And we have this very strong belief that we need to control every single piece of the infrastructure to, one, iterate really fast, and second, be able to close all of these feedback loops. And that has paid off really well for us. The other big bet that we took was to platformize the hard parts. For example, authentication and payments are things that are very common and that you need to keep secure. We built out almost first-party infra for those,
Mukund (Emergent) (11:00.594)
supporting multi-tenancy, supporting authentication, a lot of those things on the platform. And the third thing was essentially spending a lot of time on agent quality, which is where we invented the whole multi-agent architecture and solved for the memory bits, not just short-term memory but also long-term memory. So our agents actually learn across multiple apps. Every time an app is getting built on the platform, they're able to understand and learn from those mistakes and get better.
And also on the compute side: there are two or three core theses that we have internally, one of them being that compute solves all the problems. So how we leverage more and more compute in an efficient manner is something we focus a lot on. And the second thing we focus a lot on is giving agency to the LLMs. We are betting big that LLM capabilities will continue to increase exponentially, so we want to make sure that all of the agency sits with the LLM and it gets to decide
what the next step is. And that keeps us pointed in the right direction as LLM capabilities grow on the platform.
Prateek Joshi (12:07.293)
Right. And when you think about app building, there's a common, not complaint, but people say: hey, if you describe everything well initially, the one-shot output is great. But after that, debugging takes a hammer approach rather than surgical precision. I just need to change one element, and when I describe that, the whole thing changes. So how do you shape your product to give enough control
to the user, while at the same time not making them write detailed code? How do you think about that product decision?
Mukund (Emergent) (12:44.512)
Yeah, so I mean, I think it's one of the hardest problems to solve today: precision at the edges. And the way we have come to see things, it's no longer the case that if you give a very good prompt upfront, things are going to be better. It's actually much better to build incrementally: start with a smaller thing and then build it out. And so what we have done internally is a few things. One is that
we are very test-heavy. Every time the agent writes the code, it also writes the tests. And every time you ask it to make a small edit, it's going to rerun all of those tests to make sure that the previous functionality is not breaking. It's not perfect, but it works in practice much better than other platforms do. The second thing that we are also incorporating into the product is
allowing users to give a lot more feedback. So when you say, my login is not working, we automatically extract all the logs and the session data from what the user has done on the platform and pass that to the agent, so that it has all the context to solve the problem. And that really helps us narrow down the problems. The other thing we have done is, again, figure out how to scale up compute for tricky bugs. So whenever we see a doom loop, for example a thing is not getting solved
even after multiple chats, we'll fan out a lot more compute across different models and try to isolate the problem. And that really helps us solve some of these problems today. It's not perfect, but I think it's improving every day.
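As an illustration of the doom-loop idea Mukund describes, where a bug that keeps failing the same way across chats triggers a fan-out to more compute, here is a hedged sketch. The failure-signature heuristic and threshold are assumptions for the example, not the platform's real logic.

```python
# Hypothetical sketch of doom-loop detection: if the same failure signature
# keeps recurring across edit/test iterations, escalate instead of retrying.
import hashlib
from collections import Counter

def failure_signature(test_output: str) -> str:
    # Collapse a noisy test log into a stable fingerprint (assumption:
    # the last lines hold the assertion or stack trace that matters).
    tail = "\n".join(test_output.strip().splitlines()[-10:])
    return hashlib.sha256(tail.encode()).hexdigest()[:12]

class DoomLoopDetector:
    def __init__(self, threshold: int = 3):
        self.seen = Counter()
        self.threshold = threshold

    def record(self, test_output: str) -> bool:
        """Return True when the same failure has repeated enough times that
        the agent should fan out to more compute or other models."""
        sig = failure_signature(test_output)
        self.seen[sig] += 1
        return self.seen[sig] >= self.threshold
```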
Prateek Joshi (14:21.183)
And when you think about taking an app to production, one click to production sounds amazing. As a user, I love it, but I'm sure there's a bunch of hard infra stuff you have to do to make it work. So what are the unglamorous parts, the infra parts, that you had to build to make "hey, let's go to production" possible?
Mukund (Emergent) (14:44.428)
Yeah, it's been a journey. I mean, when we started, we supported only one kind of app. You could only build a FastAPI, Mongo, and React app. And we had built the whole pipeline to automatically package your code, take it to production, and run security checks on top of it. What we discovered is that the variety of requests we get is fairly large. People would install a different library that we didn't expect. People would
use a different sort of infrastructure; for example, they'll put a Redis or something in the middle. So what we have now done is what we call agentic deployment. Because we have the lineage of your build process, all the steps the agent has taken, we monitor that and extract infrastructure as code out of it. So basically we now convert all of that into infrastructure as code and then deploy it into production, so your app is scalable, your app is secure. And no matter what the build process looks like, we're able to extract that
and get it to production. It's still, again, not 100% perfect. But we are getting very close to taking almost any kind of app and making sure that it runs in production, at scale, in a secure manner.
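A rough sketch of what "agentic deployment" could look like mechanically: replaying the recorded build lineage and emitting infrastructure as code, here in the form of a Dockerfile. The lineage format, detection rules, and base image are hypothetical; the real pipeline is certainly more involved.

```python
# Illustrative sketch: turn the commands an agent ran while building the app
# (its build lineage) into a deployable artifact. All details are assumed.
BUILD_LINEAGE = [
    {"cmd": "pip install fastapi pymongo uvicorn"},
    {"cmd": "uvicorn backend.server:app --host 0.0.0.0 --port 8001"},
]

def lineage_to_dockerfile(lineage: list[dict]) -> str:
    lines = ["FROM python:3.11-slim", "WORKDIR /app", "COPY . ."]
    start_cmd = None
    for step in lineage:
        cmd = step["cmd"]
        if cmd.startswith(("pip install", "npm install", "yarn")):
            lines.append(f"RUN {cmd}")      # dependency installs become image layers
        elif cmd.startswith(("uvicorn", "gunicorn", "node")):
            start_cmd = cmd                 # the long-running server becomes the entrypoint
    if start_cmd:
        quoted = ", ".join(f'"{part}"' for part in start_cmd.split())
        lines.append(f"CMD [{quoted}]")
    return "\n".join(lines)

print(lineage_to_dockerfile(BUILD_LINEAGE))
```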
Prateek Joshi (15:55.787)
Right. Amazing. Now, going into the architecture of going from prompt, plan, code, test, deploy, keep it running: what are the points of failure here? Because you don't expect anything from the user; they shouldn't have to worry about infra. So for you, what are the points of failure?
Mukund (Emergent) (16:20.686)
Yeah, I think every single step has its own failure points. The question is, how do you recover? It's a general software engineering principle: people spend almost 40 to 50% of their time debugging and testing. And this is something we are educating our users about too, that the typical software process involves 20% design, 30% coding, and the remaining 50% is essentially testing and debugging. So there are various failure points. For example,
Prateek Joshi (16:24.139)
Mm-hmm.
Mukund (Emergent) (16:51.15)
getting the requirement right from a user is hard. And in fact, we don't think a prompt, or text, is the right way to start. We are experimenting with multiple different modalities for how we take input from a user. Second, obviously, is the build phase: is your app getting built in the correct way? Is it maintainable? Does it scale well? All of those points of failure come in. And then, obviously, testing and deployment are big failure points.
But I think what is important is, one, is your agent able to understand the feedback and then sort of self-correct on it? And that's the power of agents, that they can actually look into the database, they can actually look into the logs, figure out what is going wrong, and then sort of self-correct. And it's an iterative build process. So you'll describe it in one way, and then you'll change your idea as you're going along. I would want this to be a different color, or I would want this also to be there. And how do we handle that? That becomes really important.
Prateek Joshi (17:40.234)
Yeah.
Prateek Joshi (17:53.759)
Let's go into models and reasoning capabilities. Obviously, you experimented a whole lot with what's available across all the offerings so that you could build and shape your product. So which models handle planning versus codegen versus tests and commit messages? How do you divide the work? What's the division of labor? And also, what was the decision-making process behind it?
Mukund (Emergent) (18:12.322)
Right.
Mukund (Emergent) (18:19.598)
Yeah, so we experimented a lot with all sorts of models, including open source, including fine-tuning our own models. And today, in production, we have this fine-tuned custom router which figures out which request to route to which model, depending on the complexity of the request, the state of the app, and everything else. And we use a mixture of all the models today. That's one of the reasons the application layer has an advantage today: you can actually pick the best model for each
job and build for that. For example, we use Gemini for all of the multimodal stuff, or wherever longer context is required; when your context length gets really big, we use Gemini to compact the context. And then we use OpenAI's GPT-5 in thinking mode for a lot of the debugging and planning today.
And typically what we do is generate multiple plans and then have a custom fine-tuned verifier model that figures out the best plan to execute, because planning is one of the most complex things. A lot of this happens behind the scenes, so users never really see the complexity. And obviously the workhorse is Claude; most of the code is written by Claude Sonnet 4.5. And then we have a mixture of our own fine-tuned models on Qwen to
take care of simple steps, for example looking into a log file, or parsing a large file and extracting the right information. And increasingly, what we are seeing is that as model capabilities improve, these models are getting more well-rounded. And in the future you'll probably be able to pick and choose any of these models to do the work.
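A minimal sketch of the routing idea described here: pick a model family based on request type and context size. The model labels, thresholds, and request types are placeholders, not the actual router Emergent runs in production.

```python
# Hedged sketch of a request router: pick a model based on request type and
# context size. Names and thresholds below are illustrative assumptions.

def route(request_type: str, context_tokens: int) -> str:
    if context_tokens > 150_000:
        return "long-context-model"     # e.g. a Gemini-class model to handle/compact huge context
    if request_type in {"planning", "debugging"}:
        return "reasoning-model"        # e.g. a GPT-5-class model in thinking mode
    if request_type in {"log_parsing", "file_extraction"}:
        return "small-finetuned-model"  # e.g. a Qwen-based fine-tune for cheap, simple steps
    return "codegen-model"              # e.g. a Claude Sonnet-class model as the workhorse

assert route("debugging", 12_000) == "reasoning-model"
assert route("codegen", 300_000) == "long-context-model"
```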
Prateek Joshi (20:05.256)
And when it comes to reasoning, maybe a two-part question: first, if you had to explain reasoning to a new user who's not into AI, how do you think about reasoning capability? And also, internally for your product, how do you assess the reasoning capabilities of these models so that you can choose one versus the other?
Mukund (Emergent) (20:26.542)
Yeah, I think reasoning, the way I would describe it, is essentially to take observations and facts, generate hypotheses, and then figure out the right hypothesis. Typically, for example, let's say there's a bug. Your agent would hypothesize that the bug could be because
a variable is missing in the code, or because you're calling the wrong function somewhere. So generally, how do we generate hypotheses and then let agents validate them? And usually this is the chain-of-thought reasoning, where the model will internally think of a hypothesis, reason through it with some logical deductions, and then either discard it or pick it to test. And that's where the whole concept of test-time compute is very effective: the more
compute you're able to spend during the reasoning phase, the better answers you get in the end. Essentially, the way we do it today is we largely index on the end output. We don't measure the reasoning part that much, but we index very heavily on the end output of
the whole app-building journey, see what the quality is, and parameterize along those lines. And I think the reasoning capabilities of models are improving exponentially; that's what we have been observing. And what has worked really well for us is, again, using this mixture-of-models approach, where we use multiple models for different operations. For example, GPT-5 can spit out a plan, and then Claude can verify it, and things like that.
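The generate-hypotheses-then-validate loop can be sketched in a few lines. The propose and check callbacks stand in for model calls and test runs; they are assumptions for illustration, not a description of Emergent's internals.

```python
# Hedged sketch of hypothesis-driven debugging as described above.

def debug_by_hypothesis(bug_report: str, propose, check) -> str | None:
    """`propose(bug_report) -> list[str]` asks a model for candidate root causes
    (e.g. "variable undefined", "wrong function called"); `check(hypothesis) -> bool`
    spends test-time compute validating each one against the code and logs."""
    for hypothesis in propose(bug_report):
        if check(hypothesis):
            return hypothesis   # first validated hypothesis becomes the fix target
    return None                 # no hypothesis survived; escalate with more compute
```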
Prateek Joshi (22:13.781)
Let's talk about context management, memory management, knowledge. There are many, many things you may have to do, like RAG over repos and docs, how to make plans, long-context models. So what's your view on context engineering? How do you do it internally for the product?
Mukund (Emergent) (22:34.126)
Yeah, context engineering is super important. It's one of the most important things we have to get right to build extremely reliable agents, especially because our agents run over 10,000, 20,000, 50,000 steps. So context engineering becomes really, really critical for us. We actually started by using RAG initially, but now we have given up on RAG. We think that, especially in the coding space,
it doesn't work really well. What works really well is what we internally call agentic search. Essentially, an agent goes and looks through files just like a human would: you look at a file, you need a function, you go to that function, and then you explore. And this explore-and-exploit approach, done in an agentic way, is what we have seen work really, really well. Of course, as the code base gets bigger, we'll introduce RAG as well. But that's what's been working really well for us right now. And in terms of context,
we do extremely active context management. One principle that we have is: we let the context grow as long as it can grow, so that everything is in context for the agents. But as soon as we see limits being reached, we actively prune out the bad context. So there's a parallel LLM thread that runs almost like an outside observer of the process and figures out
which context needs to be taken out and which context needs to be inserted back in. There are a lot of tricks that we do under the hood to make sure that the context is very, very pristine. And we run micro evals on these things. I think we're doing some of the best work in terms of context engineering today.
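Here is a hedged sketch of that "let it grow, then actively prune" policy. The token estimate, limits, message format, and the observer-produced score function are all illustrative assumptions, not the platform's actual mechanism.

```python
# Minimal sketch of active context management: let the context grow, and when
# it nears the limit, use an outside-observer score to prune the least useful
# messages. Limits, formats, and heuristics below are assumptions.
MAX_TOKENS = 180_000
TARGET_TOKENS = 120_000

def approx_tokens(messages: list[dict]) -> int:
    return sum(len(m["content"]) // 4 for m in messages)  # rough chars/4 heuristic

def prune_context(messages: list[dict], score) -> list[dict]:
    """`score(message) -> float` is an outside-observer judgment of how useful a
    message still is (e.g. produced by a parallel LLM thread)."""
    if approx_tokens(messages) < MAX_TOKENS:
        return messages                         # still under the limit; do nothing
    keep = sorted(messages, key=score, reverse=True)
    pruned: list[dict] = []
    for m in keep:
        pruned.append(m)
        if approx_tokens(pruned) >= TARGET_TOKENS:
            break
    # restore chronological order so the agent still sees a coherent history
    return sorted(pruned, key=lambda m: m["turn"])
```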
Prateek Joshi (24:22.379)
Amazing. Yeah, I think context engineering is becoming more and more important. And as more people use these products, they expect a lot from them, which means you'd better have your context well engineered. And I think the agentic search part is actually a very, very important one. When you look at a large code base, how do you represent it so that agents don't
hallucinate structure? Structure is pretty basic, and as a user you're like, hey, it's all right here, why are you making stuff up? So how do you do this?
Mukund (Emergent) (25:01.73)
Yeah, I mean, there are a few ways that we do things. But again, largely, we fall back to basics. Our general principle is: what would a human developer do at this stage, and can we get the agent to do the same thing? And a human developer has tools. For example, if you're working in an editor, VS Code or something, you can just click on a function and go to its definition, or click on a function and see all of its usages, and things like that.
So we have provided agents the same exploration capability over the code base. Instead of reading a bunch of files, they can actually explore through the code. And then there is some agent-friendly stuff, for example exploring using the abstract syntax tree, the AST. And we also have LSP, the language server protocol, which covers all of the non-LLM, static ways of exploring and
finding the structure in the context of code. That really helps agents explore the code well. And a lot of our focus has been on building these really good general-purpose tools for agents, customized for our use case but general enough that they can be applied to any software engineering task. We spend a lot of time just making sure that these tools are really high quality and work really well. And in general, the principle that we keep is:
OK, what would a smart human developer do at this point, and can we get close to that? And then we start thinking about the properties that LLMs or agents have that are different; for example, they can look at multiple files at the same time. How do you use that to your advantage? For example, how do we do parallel tool calling to get multiple things into the context at the same time? It's still evolving as a process and we are learning along the way.
Mukund (Emergent) (26:56.206)
But so far, I think keeping things simple has been good for us.
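As a small example of the static, non-LLM exploration tools mentioned (in the spirit of AST and LSP lookups), here is a sketch using Python's built-in ast module to index function definitions and call sites, so an agent can jump to a symbol instead of reading whole files.

```python
# Sketch of AST-based code exploration: list where functions are defined and
# where they are called, giving an agent "go to definition" style navigation.
import ast

def index_symbols(source: str) -> dict:
    tree = ast.parse(source)
    defs, calls = {}, []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            defs[node.name] = node.lineno                 # where each function is defined
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            calls.append((node.func.id, node.lineno))     # where functions are called
    return {"definitions": defs, "call_sites": calls}

code = "def login(user):\n    return check(user)\n\ndef check(user):\n    return True\n"
print(index_symbols(code))
# {'definitions': {'login': 1, 'check': 4}, 'call_sites': [('check', 2)]}
```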
Prateek Joshi (27:03.755)
Let's move into orchestration. Basically, you have multiple agents, and you need them to talk to each other without running into clashes. It's a fascinating topic. So as a starting point, let's say I have an app. It's running, it's working well, and now I want to add a simple thing, like, hey, add single sign-on. Once I type that, can you just walk through what happens? I describe what I want to do, and then what tools are invoked?
What does the agent do and how does the work get done?
Mukund (Emergent) (27:36.787)
So the moment you describe a new instruction to an agent, we first run a classifier that figures out what kind of request it is. Is it a complex request? Is it a simple request? Is it a bug? Is it a feature? And based on this, we trigger a different sort of custom agent to take care of your request. So suppose single sign-on is the feature that you want:
we'll probably classify it as a complex feature request. And let's say you say, hey, I want single sign-on, and I want my database to be backed up, and a bunch of other things. We break that down into micro-instructions first, and then we start executing those instructions. The second thing that happens is code exploration. The agent will either re-explore the code, or, if it already has the context, it can probably skip that part,
to build up its picture of what is happening in the code base so far. And then it fans out into a planning phase, where it will actually start planning how to execute this feature for you. Generally, we then come back with a plan and validate it with the user, saying, this is what I'm going to build, does that make sense? Because what we have seen is that, since we are working with people with no coding background, their description of things
might be very different from what a technical user would describe. So we take this confirmation step. Then we figure out what the testing steps for this are, and the agent goes ahead and builds it and tests it autonomously on its own, all the basic tests. And then it passes control back to the user to test again and give more feedback. And the way it works is a lot more iterative, just like you would work with a developer. You'd tell a developer, hey, build this; they'll come back with a
working product, and then you look at it and say, OK, this is not working, or this is something different from what I want. And that iterative process is where most of the software gets built.
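Condensing that walkthrough into pseudocode-style Python makes the flow easier to see. Every method name here (classify, plan, confirm, and so on) is a hypothetical stand-in for the steps described, not Emergent's actual API.

```python
# Simplified, assumed flow for a request like "add single sign-on":
# classify, break into micro-instructions, explore, plan, confirm, build, test.

def handle_request(user_message: str, agent, user) -> None:
    kind = agent.classify(user_message)                  # e.g. "complex_feature", "simple_edit", "bug"
    steps = agent.break_into_micro_instructions(user_message)

    if not agent.has_fresh_context():
        agent.explore_codebase()                         # rebuild its picture of the repo if stale

    plan = agent.plan(steps, kind)
    if not user.confirm(plan):                           # non-technical users describe things differently,
        return                                           # so the plan is validated before code is written

    for step in steps:
        agent.implement(step)
        agent.run_tests(step)                            # autonomous testing before handing control back

    user.review_and_give_feedback()                      # iteration continues, like working with a developer
```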
Prateek Joshi (29:40.371)
And when you have multiple agents working, how do agents coordinate? In simple terms, how is that done internally?
Mukund (Emergent) (29:50.703)
Yeah, it's complex. I mean, we have our own internal agent-to-agent communication protocols. And we have experimented with a bunch of protocols: agents can communicate in natural language, or agents can communicate in a structured language, or agent one will write a file and agent two will read that file. So we have experimented with a bunch of things, and for different
subagents we use different protocols. For example, if it's a simple request, say you're requesting a library document, then it's a search query and it happens in natural language. If it's a testing request, we generally use a more structured output. So there is no common recipe right now, but through experimentation we have figured out what works in which context and are building on that. There is a need, though, to unify all of these protocols and make them more standard.
So far, I think the most common thing that works is essentially writing to a file. You write to the file, and the other agent reads the file. But there are some new, clever ways that we have discovered, and maybe we'll share those next time.
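The write-a-file, read-a-file pattern he mentions is simple enough to sketch directly. The path, JSON schema, and polling behavior below are made-up details for illustration only.

```python
# Toy sketch of file-based agent coordination: one agent writes a structured
# handoff file, another agent polls and reads it. All details are assumed.
import json, time
from pathlib import Path

HANDOFF = Path("/tmp/handoff_test_request.json")

def writer_agent(app_url: str) -> None:
    HANDOFF.write_text(json.dumps({
        "task": "run_regression_tests",
        "app_url": app_url,
        "status": "pending",
    }))

def reader_agent(poll_seconds: float = 1.0) -> dict | None:
    for _ in range(10):                        # bounded polling instead of waiting forever
        if HANDOFF.exists():
            request = json.loads(HANDOFF.read_text())
            if request["status"] == "pending":
                return request                 # hand the structured request to the testing agent
        time.sleep(poll_seconds)
    return None
```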
Prateek Joshi (31:05.267)
And when you look at how an agent runs, how it does a piece of work, how do you debug an agent's chain of thought? What happens?
Mukund (Emergent) (31:16.632)
Yeah, I think it's one of the hardest problems today. Especially because, one, there is non-determinism in LLMs: if you run the same thing twice or thrice, you'll get a different output. And second, these traces generate a lot of data. You'll have millions of tokens, sometimes 100 million tokens, within a run, and then debugging gets very difficult. We have built a lot of internal tools for this.
We first looked outside and tried to find if somebody was solving this problem. We couldn't find anybody solving it for these long-running agents and complex outputs. So we have internally built a lot of tools to analyze the traces of agents, figure out what the tool errors are, and also do micro A/B tests. So we let the agent run to a point and then swap the tools and see whether it performed differently.
A lot of this is instrumentation and analysis tooling that we have built internally to monitor and constantly improve the agents. It's a combination of micro evals and some of the A/B tests that we run in production to figure out what's going to work and what's not. It's a very, very hard problem to solve today.
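One piece of that instrumentation, aggregating tool-call error rates out of a long trace so regressions show up without reading millions of tokens by hand, can be sketched as follows; the trace record format is an assumption.

```python
# Rough sketch of trace analysis for long-running agents: per-tool call counts
# and error rates from a list of trace events (format assumed for illustration).
from collections import Counter

def summarize_trace(trace: list[dict]) -> dict:
    tool_errors, total_calls = Counter(), Counter()
    for event in trace:
        if event.get("type") != "tool_call":
            continue
        total_calls[event["tool"]] += 1
        if event.get("error"):
            tool_errors[event["tool"]] += 1
    return {
        tool: {"calls": total_calls[tool],
               "error_rate": tool_errors[tool] / total_calls[tool]}
        for tool in total_calls
    }
```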
Prateek Joshi (32:37.951)
Let's move into company building and you as a founder, as a leader. This is not your first rodeo. Obviously you built a company before: Dunzo, very well known in India, hugely popular. So what best practices are you bringing into Emergent? And also, what's something that you are doing differently this time?
Mukund (Emergent) (33:06.19)
I mean, there are a lot of learnings that just carry forward from the Dunzo days. The primary one is an insane amount of focus on users. When we built Dunzo, it became one of the most loved consumer brands in the country; it became a verb. And largely that was because we were maniacally focused on, hey, what does the user want, and we'll do everything to make sure that the user gets the result they want. And I think the same kind of focus and user
obsession is what drives us every day today at Emergent. And again, some of these things are in hindsight, but one of the other things that we are very focused on at Emergent is solving really hard problems, because I think that's where most of the alpha is,
whether it's building our entire infrastructure from the ground up, even though it took us a little bit of time and it's fairly complex to manage that infrastructure,
or going very deep into how to build really world-class agents, how to solve hard problems like memory and agent-to-agent communication. So we take a lot of pride in, one, identifying what the hard problem is that we can move the needle on, and then going really, really aggressively at that problem and trying to solve it, because that's what we think will move the needle for us. For example, right now
we are very, very focused on improving code quality. Because what's happening on the platform is that apps are getting bigger and bigger. The typical app on the platform today would be around 35,000 to 50,000 lines of code, and at the top end we have 200,000 to 300,000 lines of code apps.
Mukund (Emergent) (34:53.198)
And just to give you context, Emergent's entire code base is about 100,000 lines of code. So the game has now shifted from, can you get something functioning, to, can you get something that will actually keep working as you add more and more features to it? So we're really focused on that as a hard problem right now, especially in the context of non-technical users: how do you solve it for them?
So I think these are the core principles around which we are building. We're also betting really, really heavily in the direction of AI progress. Every time we discuss something, we try to imagine, OK, what will the models be able to do in six months? Do we need to really solve for that right now, or can we let the models catch up on some of these capabilities? And the third thing is just bringing in people who
are obsessed with problem solving and are really at the top of their field. Most of our team is top IIT rankers, Olympiad winners, repeat founders, so the talent density is extremely high today at Emergent. And that really helps us crack some of the hard problems.
Prateek Joshi (36:10.313)
Amazing. Yeah, there was a thread or discussion earlier about whether you use the foundation models for stuff but keep the decision making to yourself, versus an architecture where you offload the decision making to the model, because the model is going to improve way faster than what you or your team can do, since the entire world is pushing it forward. Some people think control is lost, but net net you'll still gain, because
you have to ride with the model. The model is getting really, really good, really, really fast, so you'll miss the boat if you hold on to the decision making just for pride. So I think that's a great point. All right, with that, we're at the rapid fire round. I'll ask a series of questions and would love to hear your answers in 15 seconds or less. You ready? All right. Question number one: what's your favorite book?
Mukund (Emergent) (36:57.846)
Yeah, let's go for it.
Mukund (Emergent) (37:02.99)
I don't read that many books, but I would say The Hard Thing About Hard Things is probably one of my favorites.
Prateek Joshi (37:08.873)
Which historical figure do you admire the most and why?
Mukund (Emergent) (37:13.614)
I think Steve Jobs, as much as we can call him a historical figure. Growing up, I heard his speeches and saw his presentations, and it had a really deep impact on me, in terms of how one person can change the world so much by bringing the best product out. He's been a true inspiration for me.
Prateek Joshi (37:35.241)
Yeah. Right. What's the one thing about infra that most people don't get?
Mukund (Emergent) (37:46.103)
I think infra is very, very hard to build, but once you build it, it gives you so much leverage that it almost becomes a moat that is hard to compete against. And good infra takes a lot of dedicated effort to build.
Prateek Joshi (38:03.103)
Next question, what separates great AI products from the merely good ones?
Mukund (Emergent) (38:10.158)
I would say attention to detail, and truly understanding where your customer is and meeting them there. For example, maybe the models don't let you do X, so you bridge that gap. And also, I think, just solving the hard problems around those models.
Prateek Joshi (38:28.999)
What have you changed your mind on recently?
Mukund (Emergent) (38:34.906)
I mean, I have sort of reaffirmed my faith that most of the alpha is just building a great product, and that a great product compounds. I think that's something I've flipped my mind on before and then flipped back again, yes.
Prateek Joshi (38:51.979)
What's your wildest AI prediction for the next 12 months?
Mukund (Emergent) (38:59.982)
Okay, I have made a lot of wild predictions, and often AI has actually superseded those predictions. I would say, one, you will have coworker-level companions very soon, I think over the next 12 months. And I also think everybody will have a digital twin; most likely, instead of you and I doing this podcast, our AIs will be doing this podcast.
I would say those two. And the third, probably, is that one of the application-layer companies will break through and compete with the foundation models.
Prateek Joshi (39:38.571)
That's interesting. That would be very interesting, because it's usually been the other way, where infra companies and foundation model companies are taking over apps and killing apps. I'm hoping the pressure also comes from the other side. All right, final question. What's your number one advice to founders who are starting out today?
Mukund (Emergent) (39:59.778)
I mean, I would say that today it feels like every idea is a trillion-dollar idea. I was talking to a friend the other day who is very passionate about toys, and I was telling him that, hey, if you build a great AI toy, the next generation will trust that AI more and you'll be the default AI for them. So even though it seems very competitive, it just feels to me that every single position is open. We are so early right now
that I would just go all in: pick the problems that you really, really want to solve and love solving, and just go hard at it today.
Prateek Joshi (40:35.915)
Amazing. Mukund, this has been a brilliant discussion. Loved your insights, loved your hard-earned knowledge. And personally, for me, it's very exciting to talk to the founder of Dunzo, because obviously it's a very, very familiar name; as you said, it's almost a verb. So thank you so much for coming onto the pod and sharing your insights. All right.
Mukund (Emergent) (40:50.947)
Thank you for having me.