Infinite Curiosity Pod with Prateek Joshi

Sovereign AI

Prateek Joshi

Alex Gallego is the founder and CEO of Redpanda, a streaming data platform built for data-intensive applications. They've raised more than $165M in funding from investors such as Lightspeed, GV, and Haystack. He was previously the cofounder and CTO of Concord Systems.

(00:01) Introduction
(00:07) Defining Streaming Data
(04:14) Evolution of Streaming Data Systems Over the Last 10 Years
(09:10) Introduction to Sovereign AI
(14:14) How Sovereign AI Works in Practice
(19:51) Infrastructure Needed for Sovereign AI
(23:48) Sensitive Workloads in Enterprise AI
(28:01) Foundation Models and Streaming Data
(32:41) Breakthroughs in AI Related to Streaming Data
(34:39) Rapid Fire Round

--------
Where to find Alex Gallego:

LinkedIn: https://www.linkedin.com/in/alexandergallego

--------
Where to find Prateek Joshi:

Newsletter: https://prateekjoshi.substack.com 
Website: https://prateekj.com 
LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19 
Twitter: https://twitter.com/prateekvjoshi 

Prateek Joshi (00:01.759)
Alex, thank you so much for joining me today.

Alex (redpanda) (00:04.824)
Thanks for having me, Prateek.

Prateek Joshi (00:07.241)
Let's start by defining streaming data. For listeners who may not know: what characterizes streaming data, and what should a system that handles streaming data be capable of?

Alex (redpanda) (00:24.322)
I think it's easy to explain streaming data in contrast to how the world has been built to date. For most of us programmers, you would start on a terminal, take a file, say cat file, then pipe it to sort and then to uniq, and it gives you the number of unique entries in the file. And that concept of compute and storage linked together is fundamentally what streaming is.

The main divergence from how we have been building tech to date is that the world has been built in batch, which is this idea where you take the time horizon and punctuate it. You say: at the top of the hour in UTC, I'm going to run a job over this file, and let's say the job is to count the number of unique entries. Streaming, in contrast, starts from the idea that the world doesn't behave in these nicely uniform ways, so it's a more natural way to model the real world — as a function of events. There are many ways to think about it, and there are a lot of similarities. I know we're going to get into

the details of event-driven architectures versus this versus that. At a high level, the concept of streaming is that you can take advantage of every single event that goes through your network. Let me give the listeners one example. We were helping a bank that used to run a batch job at the end of the day. You would enter your debit or credit card at an ATM, and then at midnight they would send you an offer that says, hey Alex, would you like to apply for this credit line?

And it had terrible conversions. Then they took what at the time I think was a Spark job and just ran it every three hours — okay, let's just run it faster. That improved conversions a little bit, but not significantly. Then they moved to real time: actually, as the person is inserting the card into the ATM to withdraw money, they immediately prompted the user,

Alex (redpanda) (02:41.87)
do you want to withdraw this money as a credit line instead? They had something like a 300% increase in conversions. So that's the concept: you can take advantage of every single data event that goes through your network. And computationally speaking, if you want one concept for streaming, it's the notion that

everything that you could do with batch, you can do with streaming, right? You can imagine setting a callback for every hour, and then you can still reprocess. And so I think we're at a really interesting point in time, where the future is not going to be built in batch. The future is going to be built in real time, because you and I and everyone listening demand interactivity from our applications, whether it's Uber Eats or DoorDash,

your credit cards, fraud detection, oil and gas pipeline monitoring, et cetera. I think the world behaves in real time. The world doesn't behave in these neatly discretized buckets. And so hopefully that gives the audience a sense, with a couple of examples.
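To make that contrast concrete, here is a minimal Python sketch of the "count unique entries" job in both styles. It is illustrative only — the event values and file handling are made up, not from any particular system:

```python
# Illustrative only: the same "count unique entries" job, batch vs. streaming.

# Batch: punctuate the time horizon -- at the top of the hour, run a job
# over whatever file has accumulated so far.
def batch_unique_count(path: str) -> int:
    with open(path) as f:
        return len(set(line.strip() for line in f))

# Streaming: react to every event as it arrives, keeping state incrementally,
# so the answer is always current -- no hourly boundary.
class StreamingUniqueCount:
    def __init__(self) -> None:
        self.seen: set[str] = set()

    def on_event(self, event: str) -> int:
        self.seen.add(event)
        return len(self.seen)

# The bank example in miniature: react at card-insert time, not at midnight.
counter = StreamingUniqueCount()
for event in ["card_1234", "card_5678", "card_1234"]:
    print(counter.on_event(event))  # -> 1, 2, 2
```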

Prateek Joshi (03:46.063)
That's amazing. I think that's a very good illustration of how the world is moving more and more towards real time. Also, to the average person, a very famous example of streaming is Netflix. Streaming happens in so many different ways, but streaming video is a very popular example. And so we've been building streaming systems for a while.

Prateek Joshi (04:14.135)
And obviously it has improved a lot. So can you talk about how streaming data systems have evolved over the last 10 years? What was the state 10 years ago, and where are we today?

Alex (redpanda) (04:29.102)
Totally. Just one nit, not to confuse the audience: the difference between video streaming and data streaming, which is my expertise. Video streaming is largely what CDNs do — Akamai is really good at delivering video content, same thing with Cloudflare, same with Netflix. Netflix also uses data streaming, which is different from video streaming, so it's just a terminology note for the audience. Data streaming in the context of Netflix would be about real-time user preferences:

people that look like you who just watched this particular video would also want this other series, and so on. So that's just useful context for the audience. To summarize the evolution of data streaming over the last decade: it's been, I think, towards the unification of a protocol — in my view, towards the unification of the Kafka API as a protocol, as a sort of lingua franca, not too dissimilar from

Amazon S3. Their product ended up creating a specification, not because they started with a specification, but because usage and utilization of the product demanded one. The historical artifacts of how we got here: start with the classic on-prem deployments of the TIBCO and Solace of the world, where you get charged per connection — things that were really not scalable and otherwise cost-prohibitive for internet

companies like LinkedIn, as an example, who wrote Kafka. The migration was away from specialized hardware and into making the software smarter. In many ways it was a derived idea from the old-school MapReduce era and Hadoop-style thinking, which is: hey, you can just use cheap hardware and make the software smarter. So that was a big transition. I would say the next

iteration of this was RabbitMQ, which really shifted away from proprietary protocols to an open protocol. Then Kafka came about and said, hey, replayability is a really strong muscle and benefit for companies — that way, if your job crashed, you didn't lose the data, which wasn't the case with previous streaming systems. Pulsar then came about in 2014 or so, and the context was Yahoo; for them it was about the disaggregation of compute and storage.

Alex (redpanda) (06:54.7)
And so that was Pulsar's contribution. In my view, Kafka's biggest contribution is the community and the inordinate amount of connectivity. People tend to refer to Kafka as this all-encompassing thing that gets conflated with the GitHub project of Kafka, right? It's worth understanding a little nuance here, which is that the integration points into Kafka are an order of magnitude larger, in

lines of code and in systems, than github.com/apache/kafka itself. That's 400,000 lines of really valuable code that runs a big part of the world, but the community is one or two orders of magnitude larger. So when people refer to Kafka the protocol, in my view it's the community, the connectivity. Obviously the baseline implementation is really important, but at this point the protocol has transcended the original implementation.

And to finish the evolution of where we are today with streaming, at least in my view: Redpanda — I wrote Redpanda originally — couldn't be here without any of the previous systems. We needed the lessons learned from Pulsar on the disaggregation of storage and compute, and we leverage the Kafka community because we implement the exact same protocol. And so to end users,

it looks exactly like the Kafka protocol, but it has the benefits and the lessons learned from all these big systems. I think in many ways we are today probably a strong part of the Kafka community ecosystem. And that's the evolution of how we got here.
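One concrete consequence of that protocol compatibility is that a stock Kafka client can talk to a Redpanda broker without code changes. A minimal sketch, assuming the kafka-python package and a broker listening on localhost:9092:

```python
# Minimal sketch: a standard Kafka client pointed at a Kafka-API-compatible
# broker such as Redpanda. Assumes `pip install kafka-python` and a broker
# on localhost:9092; topic name is illustrative.
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("events", b"card_inserted:1234")
producer.flush()

consumer = KafkaConsumer(
    "events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",  # replayability: re-read from the log's start
    consumer_timeout_ms=5000,      # stop iterating after 5s of no records
)
for record in consumer:
    print(record.value)
```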

Prateek Joshi (08:41.911)
Amazing. That's a very nice and crisp history. Now, in 2024, we are witnessing massive trends, and I'd like to highlight two. One is what you mentioned: the world is moving more and more towards real time. People expect a lot from the products they use, and companies have to deliver those experiences. So that's on one side. The other is that more and more software products

Prateek Joshi (09:10.177)
are getting infused with AI. People just expect your product to have certain basic AI capabilities, and it's not optional anymore. To that end, you recently published Sovereign AI. Just to get started, can you explain what that is?

Alex (redpanda) (09:30.104)
Great question. Sovereign AI is the ability to leverage state-of-the-art models where your data doesn't leave your network. It is not too dissimilar from the general concepts we've been talking about for a few years around data sovereignty, applied to AI models. So let me give a couple of market insights. It's true that the first tailwind we're seeing is just that the future

for data-intensive applications is not going to be built in batch. People building applications are starting with real time as a superset of batch as a technical architecture. That's the first tailwind. The second tailwind is that the interactivity these AI models give you needs to be powered by some sort of real-time technology, whether you're recording the responses or fine-tuning on the fly or whatever it is.

It would be absurd to say, hey, let me ask this model a question about something I wanted to figure out at lunch, and then wait. Sometimes I'll go on a chat and say, can you summarize this product? I have like five minutes, and my attention span is a few seconds, not hours. If instead the interactivity with the product were that of getting an email at midnight, I would probably forget why I even asked the silly question, right? I think in many ways I'm outsourcing the thinking to a model for simple things.

And so those two tailwinds, I think, are an important part of the case for streaming in general. Now, I want to pair this with the idea that in infrastructure, and in infrastructure companies in particular, there is a new trend called bring your own cloud, and

this concept of data sovereignty contrasts with what we had been doing for many decades, where you would send your data to a third-party vendor — say, any of the lakehouse providers, like Snowflake or Databricks — where you may even be able to negotiate free ingest of data, but every time you query your data, you have to pay for it. So I view BYOC as an architectural paradigm,

Alex (redpanda) (11:50.634)
even absent Redpanda from the picture, as this idea that we have the technology today, basically through cloud APIs, to separate the control plane and the data plane, where the data plane lives inside the customer's network. So now you can experience a fully managed SaaS inside your network where data never leaves. That's a really strong pillar. If you pair it with the tailwind of AI, then the idea is:

now that I went through all this trouble of making sure my data is resident on the hard drives that I control, on the computers that I control, inside the network that I control, so that it doesn't leak, and so that horizontal attacks like we've seen recently with some major data vendors don't necessarily affect your data plane — this concept of data-level atomicity — now that, from an infrastructure standpoint, we went through all that trouble,

what follows is that you would want to leverage a new technology like GenAI also in a sovereign mode, which is: you want the execution of the model to be close to the data. Our contrarian view — and the blog post, announcements, and product that we released today — goes against the grain of the classical wisdom of taking your data and shipping it to a vendor, whether it's OpenAI or Anthropic or Mistral,

and there is nothing wrong with that. I just think that for the enterprise, and for the customers I tend to talk to, which is mostly the Fortune 5000, they never want to share the data, right? If you take, say, America's military bank, they're never going to send the data to any API provider, period. Or if you take a drug discovery company, they're not going to send the formula they spent $6 billion getting approved

to an AI vendor. Or if you take an HFT firm that makes a trillion dollars a year, they're just not going to send it. Those are examples that highlight extremes, but it is more common for people to not want to share the data — the true gold in the business — than it is to want to share it. So in summary, that's what I think sovereign AI is: the ability to own and understand, end to end through model execution,

Alex (redpanda) (14:14.798)
where your data is at all times, and that it shouldn't leave your data center.
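For readers who want a mental model of that control-plane/data-plane split, here is a purely conceptual Python sketch — not Redpanda's actual implementation, and the endpoint names are invented. The point it illustrates is that only metadata crosses the network boundary: configuration flows in, coarse health metrics flow out, and the records themselves never leave the customer's network:

```python
# Conceptual BYOC sketch (hypothetical endpoints, not a real vendor API).
# Only metadata crosses the boundary; data stays on customer-owned machines.
import requests

CONTROL_PLANE = "https://control-plane.vendor.example"  # hypothetical

def reconcile_once() -> None:
    # Inbound: desired state only (topic names, retention, ACLs) -- no records.
    desired = requests.get(f"{CONTROL_PLANE}/v1/desired-config").json()
    apply_locally(desired)

    # Outbound: coarse health metrics only -- again, no records.
    requests.post(f"{CONTROL_PLANE}/v1/heartbeat",
                  json={"healthy": True, "brokers": 3})

def apply_locally(desired: dict) -> None:
    # Reconcile the local cluster, running inside the customer's network,
    # against the desired configuration.
    ...
```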

Prateek Joshi (14:21.143)
Amazing. And the way an AI application gets built today is: you go to an AI vendor, you upload your data, the model gets trained, and then you can use the application that uses the model. But you said you're flipping that by shipping the model to the data. Can you talk about how that works in practice? And also, can you highlight the difference between:

Prateek Joshi (14:50.241)
hey, I'm a big company training a custom model using my own data, versus shipping a model to the data — are they the same thing? Are there nuances?

Alex (redpanda) (15:01.346)
Yeah, so I'm talking about inference in particular when I talk about shipping the model to the data. Fine-tuning and training are a totally separate set of discussions; for the purposes of this audience we could talk about those separately. We power some of the world's largest AI companies, so I have an informed opinion based on those customers. But the interesting piece is:

at the most tactical level, you can take Ollama, right? It's like a Go binary, and basically you can send an HTTP call to this Ollama process that runs on your laptop, just as an example, that effectively downloads and installs, let's say, Llama 3.1. Yesterday Zuckerberg released Llama 3.1, and he talks about how,

in it, there's no real breakout model on the high end, and a fine-tuned model has state-of-the-art performance. So yesterday, like three hours later, the Ollama project released integration with Llama 3.1. And when I say shipping the model to the data, I mean that instead of your data leaving, like, your corporate firewall, if you will,

you send the data, in this particular case, to this Ollama process, and it gives you a result. In the case of computing embeddings, it would just give you back a vector of embeddings that you can then send to your local Postgres database with pgvector or something like that. That's the fundamental difference, in contrast with what we have been doing today for POCs, which is: you take an event and then you send it to OpenAI, let's say, to compute embeddings.

It's a similar output, in that whether you query OpenAI's remote endpoint or you query a localhost endpoint, you still get a vector of embeddings in this example — or text generation or whatever it is. So it is functionally equivalent; the only difference is that your data never exits. And in the case of a cluster — I'm giving Ollama as an example; you can obviously launch this with Redpanda Connect, which is how we launched it —

Alex (redpanda) (17:25.486)
that's a cluster mode, and I would say the usefulness of that is for usage patterns that tend to have spikes or that are rather unpredictable. There's actually a really large set of machine-to-machine programs where usage is fairly steady state. Think of telemetry from computers, right? Every second, computers are just going to send you this data — let's say a Datadog-style

use case. So for Redpanda, when I talk about shipping the model to the data, I mean this: split the idea of models into runtimes and configurations. People tend to conflate configuration and runtime together as "the model," which is true at a superficial level. But split it into two — not using AI jargon — configuration, which is the thing you download from Hugging Face, and runtime, like

a llama.cpp runtime, a TensorFlow runtime, et cetera. You can send those runtimes different configuration parameters. Let's take llama.cpp, so we don't get too down in the weeds: you can ship llama.cpp any GGUF-formatted model, and it'll run it. In the case of Redpanda, when this C++ binary starts, it detects if there's a GPU.

It has the runtime already as part of the binary — we compiled it before you put it on your computer, so it's on your computer. And as data is going through the streams, it can simply ship the data to the localhost GPU to do inference, for either text generation or embeddings. Does that make sense?
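As a concrete illustration of the flow Alex describes, here is a minimal sketch: compute embeddings against a local Ollama HTTP endpoint, then store the vector in a local Postgres with the pgvector extension. The model name, table schema, and vector dimension are assumptions for the example, not anything Redpanda ships:

```python
# Minimal local-inference sketch: data goes to a process on *your* machine
# (Ollama on localhost), and the result lands in a database you also control.
# Assumes `ollama pull llama3.1` has run and Postgres has pgvector installed.
import json
import requests
import psycopg2

# 1. Ask the local Ollama process for embeddings -- no third-party API call.
resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "llama3.1", "prompt": "summarize this mortgage document"},
)
embedding = resp.json()["embedding"]

# 2. Store the vector locally. Assumed schema:
#    CREATE EXTENSION vector;
#    CREATE TABLE docs (id serial PRIMARY KEY, emb vector(4096));
conn = psycopg2.connect("dbname=sovereign")
with conn, conn.cursor() as cur:
    cur.execute(
        "INSERT INTO docs (emb) VALUES (%s::vector)",
        (json.dumps(embedding),),  # pgvector accepts '[0.1, 0.2, ...]' text
    )
```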

Prateek Joshi (19:12.649)
Yeah, yeah. I think that's a good amount of color on how it can work in practice. Another aspect I want to touch upon: in this premise of Sovereign AI, you can leverage the latest models on your own hardware. So can you talk about how you advise customers on the infrastructure they would need to leverage this premise? Because many companies

may have the money, but not the time or the bandwidth to manage the infra needed to run the latest models. So how do you think about that aspect?

Alex (redpanda) (19:51.96)
I think the rubber meets the road on three core principles — I guess two more, beyond shipping the model to the data. The next one is this idea of understandability. Even if you have the money — and with infrastructure, I think today, with the clouds,

it's mostly okay. I still think Nvidia is extremely expensive — maybe a billion dollars for every 20,000 GPUs or so, guesstimating from what we've heard from customers; we have a couple that ordered a couple billion dollars' worth of Nvidia GPUs. So I think people have the money to spend on

a few GPUs for really mission-critical systems. The issue comes with understandability. As I went on a roadshow early this year — we've been building this, by the way, for a year; we just launched it today at 10am Pacific — when I talked to the CTOs of, let's say, the US's largest banks, or the largest pharmaceutical companies, or our largest IoT companies, they said: I don't understand

what generated what, or what was the input. So tracing was actually a fundamental blocker. They didn't feel safe shipping a model and then reporting to the CEO: actually, I have no idea what happened. Think of the stakes in the case of a credit line decline, right? You need to be able to understand the exact inputs to the model that generated a credit card decline

event. So tracing was one. And the last one is the things the enterprise has been chasing for a really long time — they just want the same things, frankly, whether it comes from Redpanda or from a database. You want access controls. You want authorization. You want authentication. You want OpenID Connect. You want to be able to centralize the services and the principals. I would say those are actually the long poles to adopting

Alex (redpanda) (22:10.398)
sovereign AI: you need the basics, and you need to make sure that only the applications that are allowed to access this model and this data can do so, through the plain old access controls that we've had in databases and systems of record for decades. Same thing with authorization technologies and, perhaps more modernly, with OpenID Connect and RBAC, right? You need that, and tracing. Even

if you had figured out how to ship the model to the data, you would still need those two pillars to get to production.
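To make the tracing pillar concrete, here is a hypothetical sketch — not a Redpanda API — of the kind of audit record that answers "what exact inputs produced this decline": every inference gets a trace id tying together the model identity, the input, and the output:

```python
# Hypothetical tracing wrapper for the "understandability" pillar: every
# inference records exactly which inputs, to which model, produced which
# output, so a declined-credit event can be audited end to end.
import json
import time
import uuid

def traced_inference(model_name: str, model_fn, prompt: str, audit_log):
    trace_id = str(uuid.uuid4())
    output = model_fn(prompt)
    audit_log.write(json.dumps({
        "trace_id": trace_id,
        "timestamp": time.time(),
        "model": model_name,   # which runtime + weights answered
        "input": prompt,       # the exact input to the model
        "output": output,      # what it generated
    }) + "\n")
    return trace_id, output

# Usage: wrap any local model call (the lambda stands in for a real model).
with open("inference_audit.jsonl", "a") as log:
    trace_id, decision = traced_inference(
        "llama3.1-8b-finetuned", lambda p: "decline", "credit_line_request", log
    )
```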

Prateek Joshi (22:45.385)
Right. I think you made a very important point with understandability. Many people, even now, even on the more sophisticated end of the spectrum, often don't know exactly how something happens — be it inference, training times, convergence, speed, latency, so many other things — that it's hard to...

That's okay if you are a small company or it's a side project, but if you're a big bank, it becomes very, very important. So when you look at sensitive workloads — one example you gave was any workload related to approving a line of credit; that's an important thing — in the real, practical world of the Fortune 5000, what workloads are important enough, sensitive enough? What are you seeing in the market

that makes people go: I need to fully understand exactly how this happened?

Alex (redpanda) (23:48.686)
I am biased because, ever since I started, as a career choice I only wanted to work on mission-critical systems. And Redpanda, when we first launched in 2019, went on to power some of the world's largest CDNs, one of the world's largest electric car companies, some of the world's largest oil and gas companies for telemetry,

and many such use cases, like banks and ISPs — those tended to be the use cases. So I don't feel qualified to make an overarching statement, because there's just so much: there's retail, there are apps, there are mobile games. The spectrum is so rich. But for the subset of customers that I personally talk to — Fortune 5000 companies —

the average workload is actually quite sensitive. Take checking products, just as an example. We help power, either directly or through partners, some of the largest banks across the world — in Australia, in the US, in Canada, and so on. When you think about those checking products, and the kind of learning you can do on those products, everything about them is sensitive:

the data you need access to is basically your identity — super private: your name, your gender, your demographics, your income, your sexual orientation. There probably isn't a thing that isn't really sensitive; almost everything you touch is of extreme sensitivity. So I would say that for the customers I talk to, the alternative is the anomaly.

The alternative, where you actually send the data to a vendor, is the exception, and those cases tend to be, in my view of the world, for prototyping ideas. An example is semantic search: you want to look for documents that look like X. What we have seen in practice with some of these —

Alex (redpanda) (26:12.586)
again, I'm making general statements about a smaller subset of the market, and even there you'd be able to find exceptions; the caveat is that this is from the conversations I have directly, which tend to be about this core, what I call ring-zero use cases for companies, on which everything tends to be built on top — is that sending your data to these APIs tends to be for auxiliary use cases. So

those companies would land a use case around summarizing some interaction, in a small subset of their applications. Or take mortgage lending: it's standard language, legal language, and a customer would want to send the legal documents and say, summarize these legal documents. It's important for you, particularly if you just signed a loan — it would be nice if you could actually go to your bank's website

and ask it: summarize this mortgage. Because I guarantee you, no one has read the 125 pages, at least in the US, of what a mortgage loan is. It would be really nice to ask, hey, can you summarize what I just signed my life up for? I just signed up for a million-dollar loan — if you're in the Bay Area, plus — and I have no idea what I signed up for, because there's no way you're going to read it. So those kinds of use cases tend to be the ones for which you would leverage

Alex (redpanda) (27:39.438)
these services, and they're all great. But for everything else, the trend I am seeing in practice is wanting to do the same flow, except the model is inside the local network and data is never actually exfiltrated to a third-party API.

Prateek Joshi (28:01.513)
Right. Those are great examples. Now I want to touch upon foundation models for a second. There are foundation models being built to handle various modalities of data: text, images, videos; there are physics-informed neural networks, and models for biology. So there are many different foundation models being built to handle all these use cases.

Do you think there should be a separate foundation model that can natively understand and handle streaming data? Is that different enough? Or do you think all these foundation models should exist separately but have a streaming component to them? How do you think foundation models and streaming data should, or will, come together in the future?

Alex (redpanda) (28:54.648)
Yeah, my thinking is that the enterprise doesn't need foundation models; the enterprise needs fine-tuned, smaller models. Obviously, I'm sure if this podcast reaches enough people in the world, somebody is going to point out enough use cases where I would be wrong. But as a general rule of thumb, I don't see foundation models being useful for the enterprise. The thinking is that

you just don't need a model that understands how to fix the tire of a bicycle and, at the same time, something about drug discovery and being a carver — that's just too, too broad. Now, for end users, that breadth is super useful. I pay for some of these services myself, because I'm not a company, I'm a person, and I want to know about woodworking, or this or that, or about what I'm reading, or I want

this thing to summarize a product. My input into this model is very dynamic, from an input-range perspective. That doesn't tend to be the case for the enterprise. As for streaming — the reason I say that is that streaming tends to deliver the most value as an architectural paradigm as soon as you start to get a little more data than, whatever, one record a second or something like that. Below that point, there are ways to think about the world that are very simple:

a simple database with a single JavaScript frontend, and I think that would be fine. As soon as you start to get a little bit of scale, it's very freeing to architect your applications so you're not setting yourself up for a dead end. As for streaming and foundation models, I think the pieces are already built. To answer the question more directly, if you look at the latest Llama 3 400-billion-parameter model, there is this idea of

things like RAG architectures — retrieval-augmented generation — or callbacks from the model itself, which are super useful. I do think that part has already been solved. To give context for the audience: these models hallucinate, which means they make up things that are not true, but they sound rather authoritative on the subject. The way you eliminate some of those

Alex (redpanda) (31:18.456)
hallucinations is to give the model more context. At the time you ask for an answer, you can enhance the prompt with additional context. And that's really where we're seeing streaming data, as an architecture, pairing really nicely with model inference: you simply give the model additional context so that the results you're getting are better, so that the results you're getting

are less made up. If you just run the basic model, you get some inane answers for things the model wasn't particularly well trained on. Or, if you follow the literature, models tend to remember the first and last parts of the context and not the middle, and so on. There are all sorts of problems with those things. So streaming is a way to enhance

the model's native strengths, perhaps. And this is where you see the rise of vector databases and so on. Does that make sense?
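A bare-bones sketch of that retrieval-augmented pattern follows, with toy stand-ins for the embedding and generation steps; a real deployment would use a local model endpoint and a vector database as discussed above, and all names here are illustrative:

```python
# Toy RAG sketch: enrich the prompt with retrieved context before inference.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy stand-in for a real embeddings call (e.g. the local Ollama
    # endpoint sketched earlier); deterministic per input text.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(8)

def retrieve(query_vec: np.ndarray, store: list) -> str:
    # Nearest neighbor by cosine similarity over (vector, text) pairs --
    # the role a vector database plays in production.
    sims = [float(v @ query_vec) / (np.linalg.norm(v) * np.linalg.norm(query_vec))
            for v, _ in store]
    return store[int(np.argmax(sims))][1]

def answer(question: str, store: list, generate) -> str:
    # Ground the model in retrieved context so the answer is less made up.
    context = retrieve(embed(question), store)
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return generate(prompt)

# Usage with a stand-in generator that just echoes its prompt:
store = [(embed(t), t) for t in ["Payment posted at 9am.", "Card declined at noon."]]
print(answer("When was the card declined?", store, generate=lambda p: p[:80]))
```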

Prateek Joshi (32:19.319)
Yeah, 100%. I have one final question before we go to the rapid fire round, and it's about what's happening in AI right now. What breakthroughs in AI are you most excited about as they relate to handling streaming data?

Alex (redpanda) (32:41.56)
I think time-series models are particularly interesting. I think Datadog just announced that they're building a time-series-specific model, and I think Google is building one. I come at this less from the scientific background of understanding the models deeply and more from the infrastructure space, right? More like the plumbing:

what are the things we need to expose to engineers so they can be more effective? So that's a partial lens. I'm also super jazzed about what Llama 3 is doing, and I say that only because Mark Zuckerberg happens to be a particularly visible figure, and he wrote something that aligns with my thinking: open-source models have now caught up with state-of-the-art models. I find that fascinating, right?

When ChatGPT launched — when OpenAI launched their models — you were just blown away because it was different, or at least I was personally blown away. I was like, my God, this feels amazing; the future feels real. It feels different: I can ask it questions and it answers smart things. And now, with the release of Llama 3, it's just phenomenal to see how fast the world was able to catch up with the best models. So I think those two trends: one, time-series models, because there's a bunch of machine-to-machine, time-series data that we get to see a lot in practice; and two, the rise of open-source models — whether it's Mistral, truly open source, or, in the case of Llama, their definition of open source — models that are on par with the world's best closed-source models.

Prateek Joshi (34:39.831)
Amazing. With that, we're at the rapid fire round. I'll ask a series of questions and would love to hear your answers in 15 seconds or less. You ready?

Alex (redpanda) (34:49.518)
Let's do this.

Prateek Joshi (34:50.613)
All right, question number one. What's your favorite book?

Alex (redpanda) (34:54.062)
I haven't read a full book in I think over 10 years. I only read small chapters of books at a time because I don't have the time to read the whole book. But if I want to learn something, it's very utilitarian, very boring.

Prateek Joshi (35:07.575)
I think learning on demand is actually a really good premise — Karpathy talks about it in different shapes and forms. So that's cool. Next question: what has been an important but overlooked AI trend in the last 12 months?

Alex (redpanda) (35:27.47)
Fine-tuning, I guess. I'm not sure. I think, perhaps — not overlooked by experts, but not well understood by the public — is just how good a fine-tuned model actually is in practice, relative to foundation models, for specific use cases.

Prateek Joshi (35:54.709)
Right, that's actually a very good nuance. Next question: what's the one thing about streaming data systems that most people don't know?

Alex (redpanda) (36:05.974)
It is easier to model the world in its natural way than to force the world, at an architectural level, to look different. In my view, it's easier to start out with streaming than it is to start out with batch.

Prateek Joshi (36:21.643)
Right. What separates a great AI product from a merely good one?

Alex (redpanda) (36:27.63)
The developer experience and the user experience, 100%. I think the developer experience is the last mile.

Prateek Joshi (36:34.537)
What have you changed your mind on recently?

Alex (redpanda) (36:45.186)
I would say I was skeptical of the role that streaming played — the foundational role that streaming plays in AI. And by recently, I mean that as we started playing more and more with the technology over these last six months or so, it was like: streaming tends to see almost a hundred percent of your data, and if you just make it a couple of percentage points smarter, there's a tremendous amount of value for the world.

So it's pretty much a 180 in how I see the world.

Prateek Joshi (37:19.467)
What's your wildest AI prediction for the next 12 months?

Alex (redpanda) (37:28.562)
That it doesn't make any money for people. I don't know — I do think that people struggle to make money with AI. I hope that the things we're doing, or at least that I'm doing, and that the company Redpanda is doing, are here to help the world. I think classical businesses need to figure out how to monetize and transact dollars to continue to fund

the evolution of this technology. I think we're in the really early innings, and I think it would be a shame to see it die off because it didn't make financial sense. I don't know if it will, but I fear that over the next 12 months there will be few companies that actually make money with AI — few relative to everyone trying.

Prateek Joshi (38:27.211)
All right, final question. What's your number one advice to founders who are starting out today?

Alex (redpanda) (38:38.286)
Build for the long term. From a personal-practices perspective, it's very easy to sprint, because people have energy, they have passion, their idea is cool — this is why they started a company. They have intensity; they're different from a lot of people. This is partially why others give them money to go and start their company.

I think what's been different for Redpanda, relative to the first startup I had, is a different way of thinking about my longevity as the CEO of this company, and the fact that I have to pair it with healthy practices. Luckily for me, I don't think I would have learned that without having kids — my kids were a forcing function. But in retrospect, actually having less time, and spending it

voluntarily — I love my kids, so I want to spend more time with them; I want to be the dad that puts my kid to bed every night — has created a really great balance for me relative to how I used to spend my time before I had kids. It's still not balanced on the whole, I would say; if I compare it with other friends, it's very much work and

family for me. I find that to be a great strategy for me personally, but I would say that if I didn't have my kids, I would be even more imbalanced as a human, without having to pause at the end of the day for a couple of hours and reflect. So the short version is: plan for the long term. This is going to be a 10-year journey, and you

should, I would say, optimize for the long term.

Prateek Joshi (40:32.407)
Amazing. Alex, this has been a brilliant discussion. Obviously, so many hard -won lessons. The depth of technical knowledge is fantastic. So thank you so much for coming onto the show and sharing your insights.

Alex (redpanda) (40:45.23)
Thanks for having me, Prateek.