Futureproof by Xano
Futureproof by Xano is a podcast for technical builders, entrepreneurs, and engineering leaders who want to stay ahead of what’s next.
Hosted by Xano’s CEO & Co-Founder Prakash Chandran, each episode features conversations with innovators and industry experts who are shaping the future of technology, business, and product development.
The Right Answer Isn't Enough—with Karthik Narayan (Komodo Health)
If an AI agent gives you the correct answer but took the wrong path to get there, can you actually trust it?
In this episode of Futureproof, Prakash Chandran sits down with Karthik Narayan, Director of Product Management at Komodo Health, where he leads Marmot, an enterprise AI product for life sciences. Marmot's promise is that life sciences companies no longer need to send their data to McKinsey and wait months for an answer—they can ask complex healthcare questions and get answers directly. The catch? The answers aren't binary. Together, Karthik and Prakash unpack why grading an agent on whether it got the right answer is only half the story, how Komodo uses parallel critique agents and friction detection to close the gap between AI confidence and analyst rigor, and what changes when AI makes product leaders more powerful than they've ever been.
Topics covered include:
- Why the path matters more than the answer: How an agent can arrive at the correct number through the wrong query, pass traditional evals, and then fail catastrophically on the next question—and why trajectory evals are the real measure of trustworthiness.
- Steering, not just answering: How Marmot uses research plans, follow-up questions, and full code transparency to give analysts maximum control over subjective healthcare methodology decisions.
- Friction detection over thumbs up/down: Why users rarely use explicit feedback mechanisms, how Komodo infers dissatisfaction from behavioral patterns, and how that drove a complete platform rewrite at the six-month mark.
- Build vs. buy when AI makes prototypes easy: Why a junior engineer's weekend demo isn't the same as a production system with fallback models, context compaction, token optimization, and continuous evaluation—and how to think about total cost of ownership.
Episode ID: 19252007-the-right-answer-isn-t-enough-with-karthik-narayan-komodo-health
Subscribe to Futureproof wherever you get your podcasts.
From Xano - The fastest way to create a production-ready backend for any app or agent. Xano unifies AI speed, code control, and visual clarity, so you never trade reliability for velocity. Sign up for free today.
The path that the AI takes is very important. It is easy for the AI to take a bad path, an expensive path, in tokens and also in user time. The other part is it could have taken the wrong path and gotten to the right answer. And it could have been from its knowledge base and not from the data that you are pointing it to.
SPEAKER_01: Komodo is a 300 million revenue data company with longitudinal claims data covering the U.S. population. Marmot's promise is that life sciences companies no longer need to send their data to McKinsey and wait months for an answer. They can ask complex healthcare questions and get answers directly. The catch? The answers aren't binary. They're subjective, nuanced, and high stakes. And the technology generating them is inherently probabilistic. We're going to talk about this. Karthik is a rare breed, a product leader who's deeply technical, architecting AI systems, not just defining the user experience. This is a conversation about what it actually takes to make AI trustworthy in a regulated industry, why grading an agent on whether or not it got the right answer is only half the story, and what changes when AI makes product leaders more powerful than they've ever been. Karthik, thank you so much for being here today.
SPEAKER_00: Thanks for having me. Excellent topics.
SPEAKER_01: Fantastic. Well, before we dive right into it, I'd love to just get a very high-level, 50,000-foot view of you and your background. So maybe you can share with the audience the last couple of years of your career and what brought you to what you're doing today.
SPEAKER_00: Yeah, definitely. I've spent about the last decade in healthcare, in different parts of healthcare. Healthcare itself is really broad. I've spent a lot of time in the payer space and the digital health space, and now I'm in life sciences. It's great to see the healthcare ecosystem from different vantage points. Currently, like you said, I'm director of product, and I'm leading the product team for Marmot. It's a very unique product, and it's probably one of the most exciting things that I've worked on in a long time, because the product-market fit, the unmet need, and the technology are all aligned now. The problems were always there. People wanted to ask complex questions on the healthcare data, but you always needed an analyst who understood the nuances to answer them. Now the technology is catching up. It's still not exactly there, which is what makes the problem interesting. If it were fully there, then it would be trivial, right? So this is an exciting time for people like me and for Komodo. It's an exciting opportunity.
SPEAKER_01: For sure. So, you know, I want to talk more about that. You were recruited specifically to build out Marmot. I'd love for you to paint a picture. Tell us what problem you are solving and why it is so hard. We touched on a little bit of it, but maybe you can unpack that for us.
SPEAKER_00: Yeah. So healthcare data is inherently complex, and healthcare data itself is broad; it includes genomics, et cetera. I'm mostly talking just about claims data, right? With claims data, take simple things, or seemingly simple things for outsiders, like how many people have gone through this procedure, or how many people have this condition, say lung cancer. Outsiders would think, oh, that should be a simple query, but it's not, right? There are so many ways to code. There are so many rules between the codes. There are different types of codes, like procedure codes or drug codes, and there are medical claims and pharmacy claims. The codes themselves are complex. Finding the right code set is itself complex. And then you have the methodology. How many such claims do you need, within what period of time? What is the lookback period for those claims? All those are actually subjective decisions, right? Constructing the right population is a complex problem, and in academia it is often a publication-worthy problem. Defining the cohort is publication-worthy in many situations. So if a life sciences company wants to ask a question like, hey, I'm launching this drug, who are the top 50 providers I need to target, that becomes a very complex analysis problem. Customers who used to buy claims data would typically go to third-party consultants, or, if they have a big enough analysis team, they would go to them. And it takes multiple weeks, multiple months to actually get to the right answer. Obviously the business cannot wait that long. They want to get the answers fast. And now, with where AI is, Marmot, the product, gives the user an experience where they can ask a question and it gives a very reliable answer.
The challenge is still the methodologies you want to choose, et cetera. That's why we focus not just on giving you the right answer, but on giving the user complete transparency into how Marmot arrived at the answer, and on providing enough steering capability that the analyst can say, oh, I see the decision you made there. I don't want to make that decision, do this instead. I want to use a different lookback period, I want to use a different set of codes, et cetera. And Marmot will accommodate all of that. That's a challenging thing with LLMs.
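To make the point about subjective methodology concrete, here is a minimal sketch of how a code set, a minimum claim count, and a lookback period change the size of a cohort. It is an illustration only, not Komodo's or Marmot's implementation: the `Claim` type, the `cohort` function, the codes `DX1`/`DX2`, and all of the data are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Claim:
    patient_id: str
    code: str            # placeholder code, not a real clinical code
    service_date: date


def cohort(claims, code_set, min_claims, lookback_days, as_of):
    """Return patient IDs with at least `min_claims` qualifying claims
    inside the lookback window that ends at `as_of`."""
    window_start = as_of - timedelta(days=lookback_days)
    counts = {}
    for c in claims:
        if c.code in code_set and window_start <= c.service_date <= as_of:
            counts[c.patient_id] = counts.get(c.patient_id, 0) + 1
    return {pid for pid, n in counts.items() if n >= min_claims}


# Toy data, invented for this example.
CLAIMS = [
    Claim("p1", "DX1", date(2024, 1, 10)),
    Claim("p1", "DX1", date(2024, 6, 2)),
    Claim("p2", "DX1", date(2024, 5, 20)),
    Claim("p3", "DX2", date(2023, 3, 15)),
    Claim("p4", "DX2", date(2024, 6, 25)),
]
AS_OF = date(2024, 6, 30)

# Same question ("how many patients have the condition?"), two defensible methodologies.
strict = cohort(CLAIMS, {"DX1"}, min_claims=2, lookback_days=365, as_of=AS_OF)
loose = cohort(CLAIMS, {"DX1", "DX2"}, min_claims=1, lookback_days=730, as_of=AS_OF)
print(len(strict), len(loose))
```

Neither result is wrong. The two calls encode different choices, which is why exposing those choices to the analyst matters as much as the final count.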
SPEAKER_01: It's really interesting. There are a couple of things to unpack there. One of the things that I want to emphasize is that the way in which a question is asked is very nuanced, right? Being able to, I think you said, steer how the response is given is very important. And things like the lookback period and the nature of what is being returned are something that, yes, you can modify with those filters, but then what gets narrowed down has to be correct. I mentioned this up at the top, and it is something that is hard for me to even wrap my head around: AI is inherently probabilistic, but customers need deterministic answers. So how do you actually go about bridging that gap?
SPEAKER_00: Right. So we do a few different things. One is what happens once you ask a question. An analysis question, unfortunately, is not the same as asking a question on ChatGPT, right? There, it's typically a couple of lines: this is what I want. You can start, and then you can have a back and forth. But that's not how a business stakeholder typically goes to an analyst. They have serious requirements. The analyst asks like 10 questions, and there's a bunch of follow-ups. That's how they actually even write the question, so to speak, how they define the problem, right?
SPEAKER_01: So you're saying even historically, when they do go to a McKinsey or something, they're generally being very opinionated about how they ask the question.
SPEAKER_00: Exactly. And it needs the prodding from the analyst to actually get the right question out. That's a skill, that's an analyst skill, right?
SPEAKER_01: Got it.
SPEAKER_00: How do you get the AI to do that? That's very hard, right? Because LLMs are tuned to give you a quick answer, confidently. So we have introduced a lot of checks and balances. One of the first things we do is a feature we call the research planner. The user asks a question, and then we ask three to five follow-ups, like, just so I understand what you want. It's multiple choice, so they can easily select. And users love it. It's like, oh my God, I didn't even think about all of these things. That helps frame the question a lot better for the LLM. It still doesn't go all the way, but it sets the framework for the analysis really well. They make the choices, and then we go create a plan, and we have independent agents critiquing the plan, so it makes sure that we get all the methodologies right. Then we go back and give a summary to the user and say, hey, this is the plan, this is how I'm going to execute it, are you good? And if they say yes, we go execute. The great thing with Marmot from the beginning has been complete transparency. Every line of code that is written is available to the user to inspect. They can go see the query, they can run it, they can go change individual parameters. So it's not a black box, right? We also have a log of the important decisions that the AI made, so the user can go and say, okay, I don't agree with some of these. That gives maximum steering capacity, and I think that is the best we can provide when there is a lot of subjectivity. I was just looking at an example earlier today where the market size for something could range anywhere from 1,300 patients to 33,000 patients, depending on which choices were picked. That's a large variation.
So providing maximum transparency and maximum steerability is, right now, the best way to solve that problem.
SPEAKER_01: It's really interesting. I think about the world of software engineering that we're in right now, the world of specs and scopes and harnesses. You're doing something similar, but in terms of playing that analyst job of prodding how the question is being asked, then storing that and making sure that it becomes, I guess, the harness or the spec for what then gets returned. I'm curious, as you have these artifacts and you learn with Marmot over time, I'm assuming what you have are a series of artifacts and opinions on how things should be asked, depending on the claims request that is being made. Is that fair? Is that something that you're doing?
SPEAKER_00: By choice, we don't do training on questions, because a lot of the questions themselves are proprietary, our customers' intellectual property. So we can't play too much with that. But we do look for patterns. One of the things that we do, for example, is something we call friction detection, because users hardly ever use the thumbs up, thumbs down, right? That's barely any good. So we look at user responses and see when they constantly question the answer or sound frustrated, and then we see if there are patterns, meta patterns, there. It's like, okay, for these kinds of questions, users are not happy with what they are hearing, and then we look for patterns and say, okay, this is probably what they need, right? That is an important learning. And we've gone through a major re-architecture of the entire platform six months after launch, right? Something that would typically happen at the 18-month or 24-month mark is now happening at six months. It's a complete re-architecture in terms of how the analysis artifacts are produced and stored, and what experiences the user has to navigate them. We have also introduced things like a deep research mode for more advanced analysis. We have the ability to do multiple chats, and advanced methods for compaction when users run out of context. Those are all user problems that we inferred from meta patterns in how users react to answers. So that's how we make the platform better.
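The friction-detection idea above, inferring dissatisfaction from how users respond and not from thumbs up or thumbs down, can be sketched roughly as follows. This is an assumption-laden illustration and not Komodo's method: the cue phrases, the `friction_score` and `friction_by_category` functions, the 0.3 threshold, and the session data are all hypothetical, and a production system would likely use something richer than regular expressions.

```python
import re
from collections import defaultdict

# Hypothetical cue phrases that suggest a user is pushing back on an answer.
FRICTION_CUES = [re.compile(p, re.IGNORECASE) for p in (
    r"\bthat'?s (not|wrong)\b",
    r"\bnot what i (asked|meant|wanted)\b",
    r"\bwhy did you\b",
    r"\bi already (said|told you)\b",
    r"\btry again\b",
)]


def friction_score(user_messages):
    """Fraction of user turns that contain at least one friction cue."""
    if not user_messages:
        return 0.0
    hits = sum(any(cue.search(m) for cue in FRICTION_CUES) for m in user_messages)
    return hits / len(user_messages)


def friction_by_category(sessions, threshold=0.3):
    """Average friction per question category; keep categories at or above
    the threshold. These are the 'meta patterns' worth a closer look."""
    scores = defaultdict(list)
    for s in sessions:
        scores[s["category"]].append(friction_score(s["user_messages"]))
    averages = {cat: sum(v) / len(v) for cat, v in scores.items()}
    return {cat: avg for cat, avg in averages.items() if avg >= threshold}


# Invented sessions for illustration.
SESSIONS = [
    {"category": "market_sizing",
     "user_messages": ["How many patients have condition X?",
                       "That's not what I asked, use a 2-year lookback.",
                       "Why did you drop pharmacy claims?"]},
    {"category": "data_coverage",
     "user_messages": ["Do we have coverage for 2023?", "Great, thanks."]},
]

hot_spots = friction_by_category(SESSIONS)
print(hot_spots)
```

Aggregating by category, not by individual question, fits the constraint mentioned in the conversation: the questions themselves are proprietary, so the signal comes from patterns across sessions.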
SPEAKER_01: Yeah, that's really interesting. This will go into the build versus buy question later, but I'm going to table that for now. I am curious about these patterns, because I totally understand there's the proprietary nature of what you can actually store and train with. How do you get those patterns? The patterns that tell you, hey, the user's not really liking this. What is the shape that you look for? And tactically, how are you looking at that to then ask better follow-up questions in the future?
SPEAKER_00: The great thing is that Komodo has a lot of domain expertise in-house. We have a very large analyst organization who, for the most part, represent the user in one way, shape, or form. We have access to them, and we have access to their questions and their chats. So there's a very robust internal knowledge and user activity base for us to work off of. That's number one. The other thing is that we keep track of failure modes behind the analysis. What kinds of things cause a chat to run out of context? Which models do better? Things of that sort. And we also have evals, and that's a conversation in itself. We have a very robust eval suite that we spend a lot of time getting right, making sure that it is representative of the kinds of questions users ask and also the kinds of answers they expect, right? Those are all the things that we do to find signals on, hey, these are the soft spots, so to speak, and on how we get better at them. I'll give you one example. We found that one of the areas causing a lot of friction was this: we provide a plan and we ask the user to validate it, but many times that doesn't cut it, because the plan itself is complex. It was not easy for the user to engage and say, make these three changes, right? Most often it would be like, yeah, I agree, just execute. And then they are not happy, right? So we introduced this notion of a parallel critique. We had five parallel agents look at the plan and critique the plan, and then the main agent puts all the critiques together and says: this is a union of, like, 10 different critiques.
I'm going to take seven, I'm not going to take three, and here's why, and then it refines the plan. And we saw that the eval scores just shot up when that was done. So that was a friction point we detected and a tool that we used to address it, which in retrospect feels intuitive. But we are probably one of the few that have done it so far.
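The parallel-critique pattern described above (several independent critics review a plan, the main agent takes the union, accepts some critiques and rejects others with reasons, then refines the plan) can be sketched as below. This is a toy under stated assumptions: each critic here is a simple keyword check standing in for an independent LLM agent, and `make_keyword_critic`, `collect_critiques`, `triage`, and the example plan are all hypothetical names and data.

```python
from concurrent.futures import ThreadPoolExecutor


def make_keyword_critic(keyword, suggestion):
    """Build a toy critic that objects when `keyword` is missing from the plan.
    In a real system each critic would be an independent LLM call (assumption)."""
    def critic(plan):
        return [] if keyword in plan.lower() else [suggestion]
    return critic


CRITICS = [
    make_keyword_critic("lookback", "state the lookback period"),
    make_keyword_critic("code set", "list the code set used"),
    make_keyword_critic("pharmacy", "say whether pharmacy claims are included"),
    make_keyword_critic("lookback", "state the lookback period"),  # overlapping critic
    make_keyword_critic("chart", "add a chart"),
]


def collect_critiques(plan, critics):
    """Run all critics in parallel and return the de-duplicated union,
    preserving first-seen order."""
    with ThreadPoolExecutor(max_workers=len(critics)) as pool:
        results = list(pool.map(lambda critic: critic(plan), critics))
    union = []
    for critiques in results:
        for item in critiques:
            if item not in union:
                union.append(item)
    return union


def triage(critiques, out_of_scope):
    """Main-agent step: accept or reject each critique and record why it was rejected."""
    accepted = [c for c in critiques if c not in out_of_scope]
    rejected = {c: out_of_scope[c] for c in critiques if c in out_of_scope}
    return accepted, rejected


plan = "Count patients with >= 2 claims from the code set."
union = collect_critiques(plan, CRITICS)
accepted, rejected = triage(union, {"add a chart": "presentation, not methodology"})
refined_plan = plan + " Revisions: " + "; ".join(accepted) + "."
print(refined_plan)
print(rejected)
```

Keeping the rejected critiques together with their reasons mirrors the "I'm going to take seven, I'm not going to take three, and here's why" step, and gives the user something concrete to disagree with.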
SPEAKER_01: Yeah, I've heard about, you know, in the consumer context, Perplexity doing this kind of model council on certain things, but it totally makes sense within this context as well. You started speaking about evals, and I wanted to frame that a little bit, because one of the reasons we got connected is that you wrote something that really struck a chord. You said that we're obsessing over whether an LLM got the final answer right, but ignoring how it actually got there, and that an agent that gives the correct answer but burns 10x the compute, brute forcing instead of using the right tools, can still pass most evals. So you, I think, coined this term, trajectory evals, and I'd love for you to frame it and walk through what that means.
SPEAKER_00: Right. So, a couple of things. One, the path that the AI takes is very important, because we provide a bunch of tools to the AI, and if you don't set up the right observability, you don't know what exactly the AI is doing, right? In many places, you give a question, you see the answer, and then you evaluate based on the answer. First of all, to see the trajectory, you should have set up your platform to actually capture everything. You have to log every LLM call, and so on and so forth. The thing is that it is easy for the AI to take a bad path, an expensive path, in tokens and also in user time, right? Nobody wants to wait 30 minutes for a simple question. So I think it is very important to keep track of the time it takes and the tokens it consumes. The other part is that it could have taken the wrong path and gotten to the right answer. Even in the life sciences example, for how many people are in this cohort, it could have gotten a ballpark right answer, and it could have been from its knowledge base and not from the data that you are pointing it to, right? So you need to know the path. Or it used your data, but it constructed the wrong query. It used the wrong codes, which somehow got to the same number. What will happen is that it'll pass the eval, but then for the next question that you ask, give me a demographic distribution of that cohort, the result will be like, okay, that's ridiculous. That doesn't make sense. So it's very important to track how the AI got to the answer, and in most cases that's even more important than the answer itself.
In life sciences, in the domain that we are in, it's definitely more important, because the answers are subjective, so the question is whether it is making the right choices. It's also about measuring whether it is following instructions, right? If a user says, I want you to do this, and this is the route I want to take, it's very important to see if the AI is actually doing it. The newer models are getting better and better at following instructions, which is fascinating. Six months ago, that was a huge problem. But with Opus 4.7, et cetera, that problem is getting better and better.
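A trajectory eval in the sense discussed above grades the path as well as the answer: did the agent query the data, did it use the expected codes, and did it stay within token and time budgets. The sketch below is one hypothetical way to express that. The `Step` and `Run` types, the tool name `sql_query`, the placeholder codes, the budgets, and the 5% tolerance are all assumptions made for illustration, not a description of Komodo's eval suite.

```python
from dataclasses import dataclass


@dataclass
class Step:
    tool: str                           # hypothetical tool name, e.g. "sql_query"
    codes_used: frozenset = frozenset()
    tokens: int = 0
    seconds: float = 0.0


@dataclass
class Run:
    answer: int
    steps: list


def evaluate(run, expected_answer, expected_codes, required_tool,
             token_budget, time_budget_s, tolerance=0.05):
    """Return named pass/fail checks covering both the answer and the path."""
    tools = [s.tool for s in run.steps]
    codes = frozenset().union(*(s.codes_used for s in run.steps))
    return {
        "answer_ok": abs(run.answer - expected_answer) <= tolerance * expected_answer,
        "used_data": required_tool in tools,      # not answered from model memory
        "codes_ok": codes == expected_codes,      # built the cohort the right way
        "within_tokens": sum(s.tokens for s in run.steps) <= token_budget,
        "within_time": sum(s.seconds for s in run.steps) <= time_budget_s,
    }


# A run that lands near the expected number while using the wrong code set.
lucky = Run(answer=1310,
            steps=[Step("sql_query", frozenset({"DX9"}), tokens=4000, seconds=20)])

checks = evaluate(lucky,
                  expected_answer=1300,
                  expected_codes=frozenset({"DX1", "DX2"}),
                  required_tool="sql_query",
                  token_budget=10_000,
                  time_budget_s=60)
print(checks)
```

An answer-only eval would pass this run because `answer_ok` is true. The `codes_ok` failure is the warning that a follow-up question on the same cohort is likely to go wrong.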
SPEAKER_01: You said something interesting that I want to unpack a little bit. The traditional way to do this, as you mentioned, was to go to a consulting or an analyst firm, submit a pretty complex query, and then wait weeks or months to get an answer or a data set back. Obviously, AI has compressed things quite a bit, and that also includes our expectations as consumers. I'm wondering, as you talk about observability and figuring out the right path to give you a quick answer rather than waiting 30 minutes, what have you seen in terms of expectations for response time and the windows that you can play within? For example, when someone asks a simple question, how quickly do you have to return that, versus the deep research mode you mentioned, and how much time do you actually have until you frustrate the user, given the unreasonable expectations of today?
SPEAKER_00: Right. It's funny that you asked that. One of our favorite customers, I mean, all our customers are great, but one of the users that I talk to a lot, was talking to his colleague. The colleague was frustrated with the response time, and this person went, do you know how long an analyst takes? Why are we complaining about a few minutes? It was funny that he brought that up. That perspective is very hard to hold, in all fairness, because the user's frame of mind is never what it would have taken an analyst. It's, how long does ChatGPT take? Because it's a similar interface for all intents and purposes, and that's the frame of reference. That's a little bit unfair, right? But that's what we have to deal with. And we are constantly working on it. The deep research mode is one such thing. We are also changing the tooling itself, where we are creating faster tools and APIs to actually browse the healthcare map. There is a category of questions where I'm just interested in exploring the data to see if I have enough coverage, and so on, and there is a slightly wider margin of error, right? There you can get instant responses. So we have applications for data exploration with sub-second responses using LLMs. So there are ways that we cover all of this. There was one more thing that you asked that I was going to hit, but I lost track. It's about user expectation management. There was a second part in your question that I lost track of.
SPEAKER_01: No, that's okay. I think it was largely around how you manage those expectations and what the windows are that you can operate within. There's obviously certain questions that I remember.
SPEAKER_00: So the other thing is basically what they see while they wait, right? That's very important. Users obviously want a fast answer, but they also want engagement. What they don't want is for us to say, hey, wait, we will create the plan and give you the answer in 10 minutes. They want to see, hey, this is what I'm doing now. I'm doing cohort exploration. This is an artifact I created. Do you want to go browse the distribution? So long as users have something that they can keep interacting with and fine-tuning, they are okay. And you see it even with Claude: when you ask really complex questions, it shows you everything the AI is doing, and you can keep clicking around. So there is that engagement. It gives an illusion, but it makes the wait time bearable, right? It's not like hearing the waiting tone when you call.
SPEAKER_01: Yeah, it gives you insight. Like, now I'm doing this, and you can really inspect what's happening. I'm curious, do you also allow follow-ups during think time, and does that then refine the steering of that response?
SPEAKER_00: Not fully. You can obviously pause an analysis mid-step, which was a complicated problem for us, because you have to maintain state on a complex problem. So pause and continue was hard, but that's a problem we've solved, and we do that very well. We can also have multiple chats that all share a common context, right? So while this is running, if you want to ask a parallel question, you can. We are also talking about queuing a bunch of questions. You ask a question, and, you know, I don't want to wait till this finishes, but as soon as this finishes, just run this next one. We've not implemented it, but those are all things that we want, so users can engage in many different ways while the AI is thinking.
SPEAKER_01: Yeah. You said something that leads me to the next question, which is really around build versus buy. You said that there's a frame of reference where people aren't thinking, oh, this is how long it would take an analyst. They're thinking, oh, ChatGPT does this, so why wouldn't you be able to return in the same response time? What is also true is what can be built today in something like Claude Code. I'm sure you might go to a customer and they might say, well, why isn't this something that I can just build in Claude Code? I can just build this on the weekend. I can have one of our engineers do the same thing. Why should we buy Marmot? This is something we certainly encounter, and my contemporaries encounter. I'm curious, number one, do you encounter this? And number two, how do you think about answering it?
SPEAKER_00: Yeah, we definitely do, and everybody wrestles with that. We wrestle with that with our vendors, right? So it flows all the way. But I think the way to think about it is to understand the total cost of ownership, and that's the same whether it's AI or not AI. If I want to own this part internally, what is the investment that's needed? A good understanding of that is what is needed. And since the technology is still nascent in adoption, that's not very clear, right? For example, we spend a lot of effort making sure that the right domain knowledge is inserted the right way, in a form that is palatable for the AI, and, within AI, for this particular model. That's a lot of human effort. Then we have to set up the platform and the observability to see how it's performing. There are many examples. When there's a model upgrade, what do you do? When the model is not available, have you set up a fallback option? You need friction detection, you need protections against catastrophic failures, and so on and so forth. Then there is building and operating the system and continuously evaluating whether it's doing the right thing, because it's a constantly changing system, so the evals have to keep up. You have to take into account all of those things: token cost optimization, context compaction, and so on. So you take all of that into account and then say, okay, this is what it will take for me to fully own this, and then you have to make a call. Is that my business? Is that my core business? In Komodo's case, yes, it is. In many cases it may not be. It's like, okay, I'm better off going to somebody else to solve this problem. So that's still the frame of reference.
At least, that's the calculus that still holds in the AI world. It's just that finding those parameters is not very obvious.
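One of the production concerns listed above, having a fallback option when a model is not available, is simple to prototype and easy to forget. The following is a minimal sketch under stated assumptions: `ModelUnavailable`, `call_with_fallback`, and the two stand-in model functions are hypothetical and do not correspond to any vendor's API.

```python
class ModelUnavailable(Exception):
    """Raised by a (hypothetical) model client when the provider cannot serve a request."""


def call_with_fallback(prompt, models):
    """Try each (name, callable) pair in order and return (name, response)
    from the first model that succeeds. Raise if every model is unavailable."""
    errors = []
    for name, call in models:
        try:
            return name, call(prompt)
        except ModelUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def primary(prompt):
    # Stand-in for the preferred model; simulates an outage.
    raise ModelUnavailable("simulated outage")


def backup(prompt):
    # Stand-in for a secondary model.
    return f"[backup] answer to: {prompt}"


used, reply = call_with_fallback(
    "How many patients are in the cohort?",
    [("primary", primary), ("backup", backup)],
)
print(used, reply)
```

A fallback chain like this is only the first step. Because a different model may behave differently on the same prompt, the evals would need to be run against the fallback as well, which is part of the ownership cost being described.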
SPEAKER_01: Yeah, I think what you just said is spot on. You always want to operate from the point where you have the highest leverage, where you're putting your energy towards something that you are best in class at. Trying to reinvent the wheel on something that you're not going to be able to maintain will very quickly break down. And I think it goes to the fact that people overestimate what they can do in a prototype and severely underestimate the maintenance and the investment.
SPEAKER_00: I see it, right? It's like, hey, my junior engineer brought this up, and it looks very similar to what you're showing. It's not quite the same, right? And I do the same. I build a lot of prototypes. So it's very easy, and that's a great thing. It opens up the aperture so anybody can come and contribute. But then the decision making has to take into account a lot of other things as well, in terms of whether we should proceed with this or not.
SPEAKER_01: You know, one of the things that I also mentioned at the top was that you're a product leader, you're director of product. I know you're deeply technical, but it's very cool to see someone who's very empathetic to the customer experience and has product training also being opinionated about how the AI model should work, and really getting down to the bare bones of how the system operates. And you said something interesting when we were talking in prep for this show. You said you thought product managers would be wiped out. But now that you're leveraging AI more, you think engineers are the ones that might potentially be at risk. I'd love for you to talk about your views there, especially being a product leader in that space.
SPEAKER_00: Right. I think I will put a couple of lenses on it. What makes a product manager a great product manager? One is incessantly looking at value. What is the highest-value thing to solve? What is the highest-leverage problem to solve? The other is agency, being willing to put up with a lot of obstacles to go solve it. If you have those two, and you were great in the pre-AI world, you will be great in the AI world too. Why I was concerned, when AI just came out, was that it was very good at synthesizing complex things. And I thought that was always my differentiation. It's like, hey, I can synthesize something complex and present it and talk about it in a way that resonates with people. And I was like, oh my God, this can do it better than me, and we are gone, we are done. But then I realized, okay, that was not really the differentiation. That's table stakes; it sits even lower than where the differentiation comes from. If you know where the value is, and if you have the agency to go execute against it, AI gives you so many tools to actually go do it yourself. You are now less dependent on other parts to be able to do it. So that's where I came from. This is a very, very big boon for great product managers.
SPEAKER_01: Yeah. Tactically speaking, I'm curious: if another product manager or owner is listening to this, I think all of us feel a little bit like we can't keep up. There's so much that's changing, there's so much news. Have you figured out a good cadence for yourself around how you learn, how you improve, and how you leverage AI to, like you said, get the highest leverage from the time and attention you put towards something?
SPEAKER_00: Right. I'm very fortunate, because it's part of the job with this product. You have to be at the forefront, you have to look at how to solve the biggest customer problem. So for me it's very easy: I'm deep in the middle of the AI problem, I'm not working at the periphery. That makes it a lot easier to stay up to date. But being a user is also very important. Even if it's not Marmot, if I'm using Claude Code or whatever you're using, looking at how different models behave and at their token consumption, asking how we make sure you get a reliable answer in the fastest way possible while optimizing for cost and time, helps you think about the AI harness the right way. That's very basic, but I don't know how many of us pay attention to what it's doing behind the scenes and how much it's costing. Is this model really that much better for the question I'm asking, for the additional value? I keep track of all of this, and that helps with my work too. And then, obviously, you look at the updates that Claude and the AI-first companies are shipping, the approaches they're taking, and, on podcasts like yours, how different industries are using this. There was one company on the legal side that talked about their AI harness, and it was fascinating. One of the points they made was that chat is not going to be the right interface for many AI agents, and whoever comes up with the next best interfaces for users to interact with is going to make the biggest impact. That resonated immediately with me. It's like, yes. Things like that, right? So you look at who's at the forefront and see what they're doing. And basically, if you even try to keep up, you're probably good.
SPEAKER_01: Yeah, that makes sense. I want to ask a follow-up question, but I'm interested in what immediately resonated with you about that legal example, the idea that maybe it's not the chat interface, that there's something else. Where did your mind go? And how do you see that evolving in your environment?
SPEAKER_00: Right. When you think about LLMs, ChatGPT, Gemini, all of them have the same interface, right? It's a box where you type a question and you get an answer. We've not innovated beyond that in most cases. And this company basically said: our users look at documents as structures and tables, so asking a question to construct that is unintuitive for them. So we give them a table, they highlight the things they want, and we make changes, so their output is always a table. Obviously that doesn't apply to everybody else, but it probably works great for that particular use. And from looking at the kinds of questions users ask, I've seen that chat is probably not the right way for many questions. Cohort exploration is an example. Users typically want to tweak the cohort criteria, like the age group, the race, the ethnicity, and see the counts. And they tweak the criteria as they see the counts and the distribution. You can't do that in chat. It's a different experience. So what this looks like a year from now will probably be very, very different from how it is today.
SPEAKER_01: Very interesting. Very, very interesting. I want to talk to you about something related to what you were saying in terms of how you learn and this harness-first mentality. When you're building your team and recruiting, what are the attributes you look for in someone? And how has that changed from, for example, even a year ago?
SPEAKER_00: It's a great question. If you're looking for core AI talent, it's very, very hard in the market, especially if you're a startup. If you are at Meta or Tesla, it's probably easier. But if you are a startup like Komodo, competing in the market is very hard, because real AI talent in engineering or product management can make a very big difference, even with one good hire. So what I look for is hands-on experience. The big delta between the prior world and now is that the brand used to count for a lot. If you had worked at AWS, for example, we'd automatically assume, okay, you come with these skills, you're probably okay. The confidence in that hire goes up. That's not true anymore, and I don't know how to explain it. I've seen candidates from different places, and I was disappointed with how hands-on they'd been, because not all companies, even the tech companies, are as far forward in their AI adoption as I had thought. So even if you've been at a smaller company, if you've been very hands-on in the stack and you can think about AI from different angles, you'll probably stand out. And that's what I look for when I talk to candidates. Whether it's in their job or in an outside project, they've actually gone in and built a harness, and they can talk intelligently about the trade-offs of one approach versus another and how they would evaluate it. That is still a very hard skill to find in the market.
SPEAKER_01: Yeah, I totally hear that. And that's kind of what I tell everyone: you just have to go back to your roots of being a builder again. You have to roll up your sleeves and get your hands on things, and everyone should be doing it. It doesn't matter what role or position you hold. Just in wrapping up, I always like to try to pull out tactical advice. There are a lot of application development leaders and engineering thinkers here who might be starting projects, and with so many tools and so many options, it's hard to know where to begin. I want you to take me back to when you started Marmot, with all the lessons you've learned that have brought you to today. If you were to give a few pieces of advice on how to get started in the age of AI, with all the tools we have available, when you're trying to build something in a regulated, high-stakes, must-be-correct space like the one you play in, what would your advice be?
SPEAKER_00: A couple of areas come to mind. One is, like you said, you have to get hands-on. The setup work was still hard. I had not actually coded in a while, in a decade, so I was skeptical, but an engineer helped me set up the Git repo and Claude Code, connect with AWS, and so on. You have to go through that hurdle first, painful as it may be. Marmot, luckily, is AI-first, so setup was simple. In many cases it's actually very hard, and you also have compliance hurdles to get it up. But if you can set that up, then you go work with the code base. That's the best way. Then you understand the inner workings, what works, what's easy, et cetera. Because for the first few months it was very counterintuitive for me what is easy and what is not. That's what I had always prided myself on: I know exactly the cost of building something. And that intuition did not translate at all to the AI world. I would say, oh God, this looks complex, and the engineer would go, oh no, no, that's trivial, it's one command and it's done. So that's one: you need to go start building yourself. The other thing is to work closely with the harness, even down to the system prompts on how things should work. One small change creates very different ripple effects that you could not foresee. For example, users complain, hey, this is too verbose, make it concise. That's why it's hard, even with ChatGPT and Claude, to make the answers concise: if we try to make it concise, it will miss something important. So you have to be close to the harness to see how the AI works and how different models work. They all have a little bit of a personality. Doing that is also very important for getting a good intuition of what will work and what will not. And then you obviously have to build that intuition for the market and the users, and connect the dots together.
SPEAKER_01: Well, this has been an awesome conversation. I have definitely learned a lot, and I've taken a lot of notes as we've been talking. I really appreciate your time. Is there anything else you would like to share with the audience or leave with them before we close?
SPEAKER_00: No, this has been a pleasure, Prakash. I actually had a lot of fun talking about this. Thanks for having me.
SPEAKER_01: Fantastic. And finally, if people want to learn more about what you're building and if they're interested, where should they go?
SPEAKER_00: Connect with me on LinkedIn, Karthik Narayan. I'm happy to talk through Komodo or AI. Happy, happy to