Infinite Machine Learning: Artificial Intelligence | Startups | Technology

Solver Agent

December 11, 2023 Prateek Joshi

Mark Gabel is the cofounder and CEO of Laredo Labs, where they are building an AI agent for software engineers. He was previously the Chief Scientist at Viv Labs and an assistant professor at the University of Texas at Dallas.

In this episode, we cover a range of topics including:
- Current state of generative AI in software development
- Solver agent
- Characteristics of a good AI assistant
- Context awareness
- Tradeoff between ease of generation and software quality
- The role of programming languages
- Future of Generative AI in software engineering

Mark's favorite books:
- Gödel, Escher, Bach (Author: Douglas Hofstadter)
- On Food and Cooking (Author: Harold McGee)

--------
Where to find Prateek Joshi:

Newsletter: https://prateekjoshi.substack.com 
Website: https://prateekj.com 
LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19 
Twitter: https://twitter.com/prateekvjoshi 

Prateek Joshi (00:01.495)
Mark, thank you so much for joining me today.

Mark Gabel (00:04.566)
Thanks for having me, Prateek.

Prateek Joshi (00:06.495)
Let's get right into it. Can you describe the current state of play of how generative AI is being used in software development?

Mark Gabel (00:19.786)
Yeah, yeah. I like to think that generative AI is a really, really nice match for software development right now. Because with the task itself, you're synthesizing tons of information all the time, and AI is really good at that. And as a software engineer, you're always working in a text-in, text-out way, right? You're reading tickets, you're writing code, even the docs that you check into the Git repository are text. And then the third piece that you need is

a good amount of accessible data to bootstrap what's happening with generative AI. And that's fortunately there with software, and it's not there with other domains. For example, there's a lot of really interesting stuff we could do in the legal domain if the data weren't all siloed. But anyway, when you're looking at the generative AI space, there's a few big categories emerging. One of them is the most obvious: the code completion products.

And there, GitHub Copilot is the king. I think people really, really underestimate the level of polish and refined UX that they have, especially by virtue of being first to market. There's little things in that product that the competitors don't have, like that little logistic regression model that just decides how often to show completions. They built that with a ton of usage data. And that's the thing that

makes the product a great product, and a lot of really neat stuff like that. I also wanted to mention that I like Codeium as well. I think they're very competitive, and I think it's clear from their dev blog that they're developing their product really deliberately, with a lot of thought. I like the way they think. So kind of outside... oh, go ahead.
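
To make that gating idea concrete, here is a minimal sketch of what a logistic regression "should we show a completion?" model could look like. The feature names, weights, and threshold are invented for illustration; Copilot's actual model and features are not public.

```python
import math

def should_show_completion(features, weights, bias, threshold=0.5):
    """Logistic regression gate: show a completion only when the
    predicted probability that the user accepts it clears the threshold."""
    score = bias + sum(weights[name] * value for name, value in features.items())
    p_accept = 1.0 / (1.0 + math.exp(-score))
    return p_accept >= threshold

# Hypothetical features describing the editing context (all invented).
weights = {
    "ms_since_last_keystroke": 0.002,  # a pause suggests the user is receptive
    "cursor_mid_identifier": -1.5,     # mid-word suggestions tend to annoy
    "recent_acceptance_rate": 2.0,     # this user has been accepting suggestions
}
features = {
    "ms_since_last_keystroke": 800,
    "cursor_mid_identifier": 0,
    "recent_acceptance_rate": 0.4,
}

if should_show_completion(features, weights, bias=-1.0):
    print("show ghost text")
```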

Prateek Joshi (02:08.479)
Yeah, no, I just want to ask: you made a very interesting point about how generative AI and software development are a great match. So there are certain things that LLMs are great at, and there are certain things they're still getting better at. So if you look at all the tasks that a developer has to do in an average day, what tasks are

way more likely to benefit immediately from infusing their day with LLMs?

Mark Gabel (02:43.726)
Do you mean within software engineering, or within the entire domain of knowledge work?

Prateek Joshi (02:48.931)
Actually, let's do both. Let's start with the entire domain of knowledge workers and then let's do software developers.

Mark Gabel (02:54.442)
Yeah, I dug my own grave there, didn't I? All right. So I like to think that LLMs are particularly awesome at, and I'm going to really abuse terminology here, anything that's isomorphic to summarization. Anytime you have to take a whole lot of information, infuse a new little insight, and then create a much smaller artifact as the output, they can be really, really good at that kind of stuff. And so

with software engineering, I just think that's an absolute perfect match. In fact, software in a sense is summarizing the real world. Back when I was a professor, I used to describe this to my students as the agony of the software engineer: our job is basically to take the entire messy real world and distill it down into a

kind of broken, but still useful, model that people can use. And that's what software is. So I think that kind of structured knowledge summarization is what they're absolutely perfect at, and software is really, really great for that.

When you take a few steps back and say, all right, knowledge workers as a whole, well, there's a really obvious answer right there in the creative professions. I'm trying to come up with a blurb for a website: give me 10 different rewordings to break me out of my creative block. I actually think there's a lot of neat stuff you can do with software there too, because sometimes we hit the same kind of writer's block, and all we need is, just give me a couple of good variable names and I can get going on my task, which is a little bit odd.

But there's all sorts of other tasks too, where you're summarizing information from multiple streams, and I think LLMs are gonna be very, very good at that.

Prateek Joshi (04:48.091)
Amazing. Maybe it's a good stopping point to talk about Laredo. And I know you're just coming out of stealth. So maybe quickly for listeners who don't know, can you explain what Laredo Labs does?

Mark Gabel (05:04.846)
Sure. Yeah, we're a brand new company; we're just recently funded. I talked about code completion just a few minutes ago, and there's another category in there, a third category, which I like to call the chat, knowledge, and search category. That's everyone's chat-based IDEs and things like that. But let's talk about what we're doing, which is what I'm trying to call the low-touch coding category,

which is one that I like to think we're pioneering, but there's a bunch of stealth companies out there; you never know who did what first. To put it colloquially, we call this the level four, full self-driving software engineering category. And the problem setup is really, really general. You give me a task, like a Jira ticket or a GitHub issue, something like that. You hand me your entire project repository, maybe some of the project documents and stuff like that.

And you say, do it. We have a couple of peers in the space that have emerged over the last year. There's a smaller YC company called Sweep, a company called CodeGen that announced a funding round a few weeks ago, and GitHub is working on something internally now called Workspaces that looks like it's a little in this space, too. But it's still very new and emerging. Our product, which we're calling the Solver,

is a brand new user experience. It's its own app, and it's powered by brand new foundation models that are purpose-built for this level 4 FSD problem. As far as we know, all the competitors, including GitHub themselves, are using GPT-4 or some variant of it. We're completely off of that stack. We purpose-built these models for this problem, which we call solving. And I can tell you, it's been a long,

very, very difficult year, basically trying to build absolutely everything from scratch as an ML company: the data, the models, the team, and the actual user experience around all that. But the freedom to experiment, and the independence from external forces, has been worth it. It's actually been raising the ceiling on what we can do, too.

Prateek Joshi (07:20.011)
Amazing. Earlier in the discussion, and even before we started recording, we were talking about how this is not a code completion tool. It's not an AI assistant that just autofills whatever's coming next. So can you just quickly explain that? Basically, compare and contrast an AI code completion tool versus

the Solver product.

Mark Gabel (07:51.534)
OK. Well, up until very recently with GitHub Copilot, code completion was really more about trying to save your fingers. It was essentially a keystroke-saving thing. Back in the day, when you would build code completion models, like the n-gram models and stuff like that, we would build them like we would build compilers. You filter out all the comments. You abstract away the variable names.

What we learned with LLMs is that if you leave all that stuff in there, there's a ton of really interesting knowledge actually captured in it. And now code completion tools are doing more than just saving keystrokes. They're taking some of the mental load off and keeping you in the flow as you're working. But they are very much focused in the IDE. You have a cursor: this is where I'm working. The task context is usually not given by natural language, although

with Copilot in VS Code, you can actually now just hit a hotkey and give it a little hint of something you wanted to do. And you always could put a little fake comment in there that says, by the way, please rewrite this a little faster, or something like that. But it was a little kludgy. Still, you're in an IDE. You're actively working on a task. The context is very apparent from the tabs you have open and what you've scrolled to. With the solving kind of task, you don't have any of that context. You're working

outside of the IDE, directly from the task specification that you were given. What's kind of interesting, though, is that solving doesn't always have no context. There's a term that we like to use internally at the company, which we call the continuum of specification. I'll try to explain that a little bit without getting too boring about it. The basic idea is that it's sort of just a truism

that all software tasks are underspecified. As a working software engineer, how often did you ever get a ticket where you're like, oh wow, I can completely just do this right now without talking to anyone else, there's absolutely no vagueness to this whatsoever? It's almost 0% of the time. But the funny thing is, if you kind of draw an analogy to the mean value theorem, by the time you make that commit at the end, it's been fully specified.

Prateek Joshi (09:51.547)
(laughs)

Mark Gabel (10:07.35)
Whatever you made, that's the model. That's the task. So somewhere in the middle, between when the bug or the issue was filed and when you actually finished, the task got specified. And it accumulates context as you work on it, through time and through the actions that you're taking. So for example, what we're doing with the solver is

Mark Gabel (10:37.354)
we're directly modeling that type of dynamic context. What's kind of fun with that is that as you start working on a task, and as the AI agent actually works with you, you're leaving breadcrumbs throughout the whole project. I did a little bit of work here. I added a comment to the bug here. I opened up this file. I created this directory structure. That context accumulates like a little snowball, and our models are trained natively on tasks at all sorts of

points along that continuum of specification, and we can get more and more accurate the more that you use it.
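
To make the breadcrumb idea concrete, here is a rough sketch of how that kind of accumulating task context could be represented and turned into model input. The event types and prompt format are invented for illustration, not Laredo's actual design.

```python
from dataclasses import dataclass, field

@dataclass
class TaskContext:
    """Breadcrumbs accumulated as a task moves along the continuum of
    specification, from bare ticket toward fully specified commit."""
    ticket: str
    events: list = field(default_factory=list)  # ordered (kind, detail) breadcrumbs

    def record(self, kind: str, detail: str):
        self.events.append((kind, detail))

    def to_prompt(self) -> str:
        # The more breadcrumbs, the more fully specified the task is,
        # so next-action prediction can get more and more accurate.
        lines = [f"Task: {self.ticket}"]
        lines += [f"- {kind}: {detail}" for kind, detail in self.events]
        return "\n".join(lines)

ctx = TaskContext("Fix: session token not invalidated on logout")
ctx.record("bug_comment", "repro: token still valid after POST /logout")
ctx.record("opened_file", "auth/session.py")
print(ctx.to_prompt())
```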

Prateek Joshi (11:10.839)
Amazing. I love that description. And I want to go deeper. So earlier you mentioned how a task, a ticket, is always underspecified. In all my years being a software engineer, on small and big teams, running a company, it's always been the case. And it's almost like you're describing

the location of an electron: it's always a probability cloud. You'll never be able to pin it down. And this is exactly that. Well, not exactly that, but it kind of feels like that. You live in a probabilistic world. So taking a step in that direction, you mentioned how you capture these little breadcrumbs along the way, and that becomes the basis to create something that's more and more accurate, more and more powerful, more and more useful as you use it.

Can you just talk about this agent-native approach that you've taken to building out this product? Not just for software development, but how do you take an agent-native approach to design and build any kind of software?

Mark Gabel (12:18.423)
Yeah, so in the software world, agent-native adds just one very distinct component. If you don't have a cursor, you need a component that puts the cursor somewhere. My academic background, actually, is that I studied and published within the academic field of software engineering,

which is an interesting little academic field, because it is computer science, but it's not computer science. It has a lot of formal verification, mathematical people, but it also has social scientists that study dev teams. It's really, really interesting. And in that community, the name for finding the place where you need to do your work is change localization.

So in the agent-native approach for software engineering, you've got to have a change localization component, or a big one-component-fits-all thing that does that as a piece of it. You have to say, given this work and these breadcrumbs that I've seen so far, where am I going to be working next? And then the agent needs to be able to navigate there, observe what's going on,

and then implement the most likely next action that you would have taken as a human software engineer there. This is generally writing code, deleting code, inserting new code, all that kind of stuff. But what's really interesting, the stuff that we're just now beginning to put together, is what it's going to mean when we give that agent even more autonomy. What if we allow the agent to, for example, self-orchestrate and say, I'd like to run the tests now.

Or I'd like to run the linter. Or if it's a nicely statically typed language, I'll just run the compiler. I mean, imagine how hard it is right now for AI just to generate Rust code that actually compiles. It's quite difficult. Imagine if the agent-based approach had the ability to just run the compiler front end. A lot of really amazing things are possible there. But really, the essence of the agent approach is change localization plus teaching it to behave like a human.
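
As a rough illustration of the shape of that loop, here is a toy sketch. The `model` and `repo` interfaces are hypothetical stand-ins, not any real product's API.

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str         # "edit", "run_check", or "done"
    detail: str = ""  # edit payload or a command such as "pytest" or "cargo check"

def solve(task, repo, model, max_steps=20):
    """Toy agent loop: localize the change, take the most likely next
    engineer action, and optionally self-orchestrate checks."""
    for _ in range(max_steps):
        location = model.localize(task, repo)         # change localization
        action = model.propose(task, repo, location)  # most likely next action
        if action.kind == "edit":
            repo.apply(location, action.detail)       # write/delete/insert code
        elif action.kind == "run_check":
            result = repo.run(action.detail)          # tests, linter, compiler
            task.observe(result)                      # feedback snowballs into context
        else:                                         # "done"
            break
    return repo.diff()                                # proposed patch for human review
```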

Prateek Joshi (14:14.693)
(laughs)

Prateek Joshi (14:30.043)
Amazing. And this premise of evaluating the state you're in and figuring out the best action to take: obviously that's reinforcement learning, it's the fundamentals. And I'm glad that the premise is becoming more and more useful. For a long time in the machine learning world, people talked about supervised learning and unsupervised learning, and reinforcement learning, even though it was very good and important because

you can build robots with it, that was the most popular application, but in the rest of the world you were using one of the two big pillars. And I'm really glad that premise is finally coming in. And when you think about this premise of, hey, look at where you are, figure out the next best action to take, in the software world, how does that

Mark Gabel (15:02.551)
Mm-hmm.

Prateek Joshi (15:27.759)
translate? Meaning, I am doing my work, let's say I'm working on a ticket, right? What possible actions will you allow it to take? What's within the boundaries, and what's something where, hey, I don't want the agent to even touch that, because that could lead to chaos?

Mark Gabel (15:44.982)
Yeah. Well, you never really need to worry about what actions the agent can take if you build the right user experience around it. As long as you have guardrails, the ability to observe, and the ability to steer, then it's just as innocuous as like, hey, if I'm composing an email to my boss using ChatGPT, it's completely safe because ChatGPT doesn't have access to the Send button. So as long as this thing is not literally hitting Commit and pushing to Main, almost anything is pretty safe to do.

So the field is open. It can do work, it can propose edits, it can read docs, it can open up tabs, all those types of things.
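
A tiny sketch of what that kind of guardrail could look like. The action names are made up; the point is only that destructive operations sit outside the agent's action space until a human approves.

```python
ALLOWED = {"propose_edit", "read_docs", "open_file", "run_tests", "run_linter"}
HUMAN_ONLY = {"commit", "push_to_main", "deploy"}  # the agent's "Send button"

def execute(action: str, payload: str, human_approved: bool = False) -> str:
    """Gate every agent action: observable, reversible actions run freely;
    anything destructive requires explicit human approval."""
    if action in ALLOWED or (action in HUMAN_ONLY and human_approved):
        return f"executed {action}: {payload}"  # stand-in for a real dispatcher
    raise PermissionError(f"agent may not '{action}' without human approval")

print(execute("propose_edit", "auth/session.py: invalidate token on logout"))
# execute("push_to_main", "release")  # raises PermissionError until approved
```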

Prateek Joshi (16:18.083)
All right.

Prateek Joshi (16:24.891)
That's amazing. All right, so let's talk about big companies that are trying to adopt something like this. Let's say you go to a big company; they have a thousand developers. What will make them give this a shot? And also, again, this is more of the founder-CEO hat that you're gonna put on: what do we tell them to go from, hey, give it a shot with a couple of developers, from two to 20 to 100 to wall to wall,

all thousand people using it?

Mark Gabel (16:57.386)
That's a great topic. I wish I could say everything that I know about this, because I've had some really in-depth conversations with some very interesting high-level CTOs recently and heard much more about their problems, but, you know, confidentiality and that kind of stuff. I will say we're starting small. We are launching a pilot program essentially imminently; we're installing it at a first potential customer site right now. But the question is, where do we see this going? What does success look like with this? And

you know, we're building a very, very broad thing. We're building an AI horizontal for software engineering, not a vertical for one domain, like generating front-end code from a Figma. And for horizontals, we hesitate to say what the ideal customer looks like, because we don't want to chop down our market too much. But that being said, a lot of the genesis of this company, and of thinking about the dent that we could make in software engineering,

actually came from our experience at Samsung, which acquired our previous company, Viv Labs, the follow-on to Siri. Our technology eventually became a lot of what underlies Samsung's Bixby virtual assistant and stuff like that. So our startup code ended up shipping and running and being accessed by hundreds of millions of devices. And what we witnessed while building and productizing that was just

an amazing philosophy of software engineering that a lot of American consumers do not realize is going on. We see Samsung, we see some of the ads on TV and stuff like that, and think, oh, it's just another company making phones. No, that's 350,000 employees. There are thousands and thousands and thousands of engineers. We would just watch changes flying through. You make a bug fix, and now you've got to do it for 28 different

phones that you're shipping in India and 32 that you're shipping in the US. You're seeing all these different pull requests being opened, and they're all different, but they're all the same. And that's the exact kind of stuff that this product we're making right now could eat up: repetitive, but still domain-specific work. And the nice thing about it is you don't really need to worry about hurting its feelings.

Mark Gabel (19:19.062)
So it can sit there and run overnight and process that massive intake of tickets at your thousand-engineer company, take its best shot, and basically start on every single ticket that you have in the queue. And the next morning, when the engineers start working, everything's been kickstarted.

Prateek Joshi (19:36.42)
Right.

If you look at what generative AI is doing for software development, do you think it will render certain programming languages obsolete, or maybe certain practices obsolete? And let me tell you where I'm coming from. If a language is very strong and structured, the purists would love it, but then by definition many average developers

may not rush into it, because it's hard to learn. Whereas if a language is very open and loose and simple, then more people will rush into it, but by definition it's not a strong language. So if you look at how GenAI will impact all these programming languages, what do you think will happen here?

Mark Gabel (20:28.95)
That's a good question. It's actually kind of funny, too, because

it was sort of a trite thing for a while that every single GenAI company, including us, probably the demo we're gonna have on our website is gonna be in Python and have to do with machine learning code. It's kind of funny that GenAI does really well with Python and machine learning code, because we're all doing machine learning in Python; it's what we're good at. But yeah, your question is very insightful. And I like to think that certain verbose and difficult-to-grok programming languages are that way for a reason.

My academic background is in software engineering. I went to UC Davis, where they have really, really strong competence in computer security.

Making things like systems programming languages overly easy to use is not necessarily a great idea. All of the security problems that we deal with in native code, rootkits and all that kind of stuff, come from buffer overruns and all these kinds of things. If it's too easy to create software for that type of platform, that's not a good idea. But it is nice when it does reduce the barriers to entry. So for example,

Mark Gabel (21:47.998)
we actually, I'll tell you a little secret, we do have a code completion product. We're just not planning to ship it anytime soon. And it is actually running on PyCharm and IntelliJ, basically the JetBrains stack of applications. Java is very verbose. The plugin architecture in the JetBrains world is absolutely arcane. It is insanely powerful, but you have to write so much code to do the smallest thing.

Using stuff like Copilot and other models, including our own, to lower the barrier to entry for that has actually been instrumental at our own company.

Prateek Joshi (22:26.139)
Interesting. And if you look at the idea that, hey, generative AI will democratize software development, what you're saying is: if everyone's allowed to create software, then by definition the quality will necessarily go down, because you're allowing all sorts of crap to creep in. But what you want ideally is the structure of the strong language; you just

help more people get into that, versus helping people do more crappy stuff with subpar software. I think that's where you're going with this. Yeah.

Mark Gabel (23:04.05)
Yeah. There's a couple of counterpoints there. You were asking me this really insightful question earlier of, what is GenAI good at in general? Well, there's another sub-question to that: if we narrow down to just code completion, what is code completion really good at? If you're working on a brand new algorithm that no one's ever seen before, it actually gets in the way. You want to turn Copilot off when you're doing that kind of thing. But GenAI is really good at creating tests.

Tests are always verbose. They're very structured. And I've actually noticed that developers are having a lot more fun and writing more tests as they're using GenAI. So we might actually see a bit of a quality boom there, or at least a counterbalancing force. But your other point is well taken. We created the solver, and even with the very early versions of the models, back when they were smaller and less thoroughly trained,

we definitely had some internal joking around, and it was only half joking, that you could use the solver to basically fake your way through a remote dev job for at least six months before you got caught. Even when GenAI is wrong, it is startlingly credible in what it produces. You see the hallucinations from ChatGPT, where it cites a paper that no one actually wrote; that could slip right by. So it's a very fair point that you're making.

Prateek Joshi (24:26.955)
Yeah, some of the things it generates... if a spammer or a scammer is trying to email you, you can usually tell: the domain name, the headline is terrible, there are spelling mistakes. But now, with GenAI, whatever it generates, you've got to take a second and do your research to figure out if it's fake or not, because it looks very, very real. Not just in email, but also in some of the

code or paper titles it was generating. Okay, so let's talk about the role of generative AI in startup land, meaning all the startups now have this additional tool that they can use to do more stuff faster. But at the same time, big companies also have access to the same tools. So can startups leveraging GenAI for software engineering

really gain a significant advantage here? Or will it just level the playing field? How do you think this is helping startups, if at all?

Mark Gabel (25:38.206)
Yeah, that's a very good question too. I think it is very much leveling the playing field. And also, I think for this unique period of history we're in, for the next couple of years, big companies are going to be kind of hesitant to adopt all this for the usual security reasons and stuff like that. They're going to wait for the strongest enterprise offering, with all these guarantees, to come out before they really adopt it internally. Startups,

I don't worry quite as much about that. And the thing about startups is you move faster. If you look at the material effects that could have over, say, the first six months or the first 12 months of a company's cycle, that could be the difference between making your next milestone and not. So it's not only leveling the playing field; I think it actually might be tipping it in favor of startups for the near future. It's certainly been amazing for us.

Prateek Joshi (26:37.851)
Amazing. And just one quick question about the foundation model that you're building. Obviously a lot of the details under the hood that make it work are private, but what does it take to build such a foundation model and make it work in practice?

For example, if you're building a text-based model, you capture a bunch of text, you train it, you run with it. And you mentioned the use of breadcrumbs to build the software. So can you compare and contrast this foundation model with the average foundation models we're all familiar with, like GPT-4?

Mark Gabel (27:20.364)
Mm-hmm.

Mark Gabel (27:24.438)
Okay, I'm gonna tiptoe a little bit, so hopefully I won't be too obnoxious trying to hide things. But I'll say, philosophically speaking, although this is sort of a foundation model, that's a term that I'm using very loosely, because in a sense it's not a foundation model. It's as big as a foundation model, and it's had just as much data put through it as any foundation model, but it's not something that is generally good at all tasks. You can think of it as a task-specific foundation model that we've built

Prateek Joshi (27:27.179)
Sure, of course.

Mark Gabel (27:54.114)
to only do really one thing: to act like a human software engineer across a ton of different contexts where a human would be working. That's the biggest difference right there.

Prateek Joshi (28:04.755)
Right. I think that's a great point. I'm strongly in the camp of, call it verticalized foundation models, or specialist foundation models. Basically, if you're building something for software engineers, you don't really care if your model doesn't know what happened in Rome in the second century BC. It can shed that dead weight and be faster, more efficient, and it'll cost way less. It makes a ton of sense.

Mark Gabel (28:33.748)
Yeah. We're not evaluating our HellaSwag score internally or any of that kind of stuff. We only care about our stuff.

Prateek Joshi (28:37.467)
Right. Exactly. All right. So moving on to the future of software engineering. Given all the progress we have made as a community on using generative AI to spur innovation in how we build software, how do you think this will redefine the role of a software engineer in the next five years?

Mark Gabel (29:06.306)
I see. Well, that is where I don't agree with any of the stuff people are saying, whether they're saying, oh, junior developers are going to become obsolete, or that we're all inventing ourselves out of a job right now. Every time you use your iPhone or your Android device, and then you go and use any other piece of software that doesn't have a team like that working on it, you see there's just a vast difference in quality.

I am definitely in the camp that software is absolutely a gas. It expands to fill all available space, demand is still effectively infinite right now, and there's just so much more software, and so much better software, that we want to build and we're not building right now. So my answer to your question about the next five years is that I'm just looking forward to seeing more and more of the software I interact with in my daily life being closer to the iPhone in quality than to the

cheap IoT device that I ordered off Amazon for 30 bucks or something like that.

Prateek Joshi (30:12.079)
Right. And there's a funny, not funny, but an interesting analogy we can draw here: Jevons paradox, where the cost per unit is going down, but because of that, the world will just consume more of it, right? Technically it should be the opposite case. So drawing an analogy here: the effort, or the cost,

of creating software will go down. But because of that, more people will rush into it. So net-net, the world will end up needing more builders and more engineers to do more stuff. The amount of work itself will go up. So I think the world as a whole will see a lot more software being created. Yeah.

Mark Gabel (30:57.538)
That's exactly what I'm seeing right now. And lower barriers to entry are going to help a lot of industries, I think, in that way. Right now, building a competent software team is one of the biggest moats a business can build around themselves, and cutting some channels through those moats will just be a healthy thing for the economy. Although, speaking of the paradox you mentioned, I think one of the famous cases is electric cars:

Prateek Joshi (31:05.112)
Yeah, yeah.

Prateek Joshi (31:16.471)
Yeah.

Mark Gabel (31:24.826)
Oh yeah, I'm going to cut down on pollution by getting rid of my Honda Accord Hybrid, and then I'm going to buy a Tesla because it's green, but now I drive 20,000 miles a year because it's green. It's like, well, I didn't really help much anyway, right?

Prateek Joshi (31:37.617)
Right. All right. One final question before we go to the rapid-fire round, and it's about the ongoing research in generative AI, specifically as it relates to software engineering. So if you had to predict, what's the next big breakthrough that might happen when it comes to

GenAI models for software engineering work?

Mark Gabel (32:07.71)
I think, very squarely, the winking thing I obviously have to say to you is that the next big breakthrough is exactly what we're working on here at my company. But in terms of the research world, what's gonna punch a hole into what we can do next with products is gonna be something that's multimodal. We describe so many bugs in terms of screenshots. We have production companies that live in Stackdriver and are doing root cause analysis. There are all these very

heterogeneous sources of information that are not themselves necessarily text, and I think there will be a domain-specific form of multimodality that will lead to a very large increase in capability.

Prateek Joshi (32:49.755)
Amazing. I strongly believe in that too, because humans, I know we use a lot of text, but we are very visual creatures, and a very, very large chunk of the data that the world has is images and videos. So that's a fantastic point. All right.

Mark Gabel (33:07.402)
Yeah, I mean, look at my company's Slack, just scroll up and down every single channel. All we're ever doing is pressing command-shift-4 and sending little snippets of screenshots, even when the screenshot is of a log, because it's easier; you don't want Slack to auto-format the text and stuff. Visual is what we do.

Prateek Joshi (33:25.107)
Right, right, 100%. All right, with that, we're at the rapid-fire round. I'll ask a series of questions and would love to hear your answers in 15 seconds or less. You ready?

Mark Gabel (33:37.654)
15 seconds, I'll try. I'm a talker.

Prateek Joshi (33:40.114)
Alright, question number one. What's your favorite book?

Mark Gabel (33:44.106)
Okay, I know I'm supposed to say something technical here, so for that answer, I'll give you Gödel, Escher, Bach by Doug Hofstadter. It really influenced the way that I think. But my actual favorite is On Food and Cooking by Harold McGee. It was the first scientific volume on food, and the way that he approached it is the way that I approach problems: I like to understand them very deeply until the solution is obvious.

Prateek Joshi (33:58.376)
Love.

Prateek Joshi (34:10.043)
Amazing. Love that you chose one in each category. And again, on this podcast we love books, so more recommendations are always better. All right, next question. What has been an important but overlooked AI trend in the last 12 months?

Mark Gabel (34:27.022)
I'm going to go slightly over 15 seconds here. We've actually talked about this one internally, so I'll give you a couple of opinions from my team too. Raj on my team thinks it's the saturation of certain benchmarks and the emergence of new task-specific benchmarks. That's really, really interesting and not talked about too much. Nico on my team talks about multimodal a lot, but not for the visuals; it's the fact that it's normalizing the idea of very, very

dense non-text inputs to things like LLMs. And there's a lot of really interesting ways we can compress vast amounts of information that isn't just text. And for me personally, I think the most overlooked thing has been what we've been able to do with extremely small models. Like the Textbooks Are All You Need paper, where they got incredible performance with that tiny LM. I mean, this has been the story of my career. I had a whole team working on this at Samsung:

cleaning data, improving data. It's not fun, it's not easy, but over and over and over again, it's clear that it has this outsized impact.

Prateek Joshi (35:34.587)
Yeah, I think that is a phenomenal viewpoint, and I agree with it. If you wanna make a meaningful, order-of-magnitude impact on anything, data is where you go. Get better data, clean it up, get better at that, and you'll see a step change in whatever you're doing, versus figuring out how to fine-tune and squeeze out the extra three and a half percent. All right, next question. What's the one thing

about software development that most people don't get?

Mark Gabel (36:09.462)
Well, I touched on this earlier, but it is that software is a gas and that the demand is completely limitless. So we're never going to code ourselves out of jobs, at least not as long as I'm alive. We're just going to keep building more and more software, better and better software.

Prateek Joshi (36:26.975)
Right. Next question. What separates great AI products from the good ones?

Mark Gabel (36:33.71)
Okay, this is going to come from the perspective of someone who's been working on largely deployed products. So I'm going to say it's a holistic treatment as an actual product. We really admire GitHub's Copilot, not for the quality of the OpenAI model behind it, but for the level of polish that they added to it, from what I am sure is a relentlessly product-metrics-driven development process. That's exactly what we did for years at Apple and Samsung. It's hard. It's what makes products great.

Prateek Joshi (37:04.227)
Amazing. I love that. And I agree. Next question. As a founder, what have you changed your mind on recently?

Mark Gabel (37:14.262)
Okay, well, as a scientist and engineer, because I'm a technical founder, I could tell you a bunch of boring answers, like, I'm not so enthusiastic about ALiBi positional encodings anymore like I was earlier in the year, and stuff like that. But as a founder, the biggest thing I've changed my mind on recently is actually choosing to build more openly. Our past companies, Siri and Viv, were incredibly secretive, so we were gonna kind of do that too by default.

It was only more recently, with the solver coming together and actually working pretty well, that we decided to open up earlier and start engaging the world.

Prateek Joshi (37:51.451)
Amazing. What's your biggest AI prediction for the next 12 months?

Mark Gabel (37:58.142)
Okay, I'm going to be a little bit of a potential downer here, but my biggest prediction is that I can't make one yet. It feels like we're at a juncture point, and I think that in 12 months it's going to be clear that we're headed for one of two worlds for AI startups, either a happy one or a sad one. The happy one is: a lot of these smaller domain-specific models and projects find all sorts of utility along the long tail, and we have this

Prateek Joshi (38:01.253)
Ha ha.

Mark Gabel (38:26.898)
even more diverse and healthier AI ecosystem. The sad one is that it may be clear in 12 months that we're headed for something like the 2000s with phone handsets, where we see an absolutely massive consolidation. So OpenAI is your iPhone, Anthropic's your Google, Cohere's your Samsung, and that's it. There are hugely increased barriers to entry, and all the small players are dropping out.

I'm not saying that's where we're going to be in 12 months, but I think in 12 months it'll be clear whether or not that's where we're headed.

Prateek Joshi (39:01.595)
All right, final question. What's your number one advice to founders starting out today?

Mark Gabel (39:08.438)
Okay, I can give some perspective on this because we're multiple-time founders, or at least multiple-time early-stage startup people. And I'll say: whatever you aspire your company to do or be, start doing it in your first year. If you're building products, ship a product in your first year. Don't just play around with research for a year telling yourself that you'll ship something after the Series A. If you want to do research, like build new models, start building them now.

Don't hack together something with GPT-4 and then tell your investors you'll build models after the Series A. You've got to build the culture for that, and the competency for that, and that takes way more time than you think.

Prateek Joshi (39:51.436)
All right. That's actually a really good piece of advice, and it's very nuanced. Because in startup land, or at least as a leader of a group of people, as a startup founder, the culture is what you do on a day-to-day basis. And in this case, what you're saying is the culture is what you do in year one, on a day-to-day basis. Whatever you feel the culture or the company

model should be, you do it, and you do it in year one, because that's gonna set the course for where you're going. So that's fantastic. That's really good advice.

Mark Gabel (40:27.026)
Yeah, you know, startups are all unique too, so take that with a grain of salt. But yeah, it's been our experience. The company that you act like in your first year is the company you end up being.

Prateek Joshi (40:32.396)
Yeah.

Prateek Joshi (40:36.635)
Right. Amazing. Mark, this has been a brilliant discussion. I loved your insights, and especially the nuance you have on all these topics. It's been fantastic. So thank you so much for coming on the show and sharing your insights.

Mark Gabel (40:53.166)
Thank you. Thank you for having me. I'd love to do this again. And thank you for all the insightful questions and conversation. I enjoyed this quite a bit.

Prateek Joshi (41:00.153)
Perfect.