
What's New In Data
A podcast by Striim (pronounced 'Stream') that covers the latest trends and news in data, cloud computing, data streaming, and analytics.
What's New In Data
AI Meets Data Infrastructure: Cost, Performance, and What’s Coming Next — with Barzan Mozafari
Barzan Mozafari, CEO of Keebo and former computer science professor, joins us to explore how AI is changing the way data teams work. We talk about the hidden inefficiencies in cloud data platforms like Snowflake and Databricks, how reinforcement learning can automate performance tuning, and why the tradeoff between cost and speed isn’t always what it seems. Barzan also shares thoughts on LLMs, the future of conversational analytics, and what data teams should (and shouldn’t) try to build themselves.
Follow Barzan on:
What's New In Data is a data thought leadership series hosted by John Kutay, who leads data and products at Striim. What's New In Data hosts industry practitioners to discuss the latest trends, common patterns in real-world data architectures, and analytics success stories.
Hello everybody. Thank you for joining today's episode of What's New in Data. I'm super excited about our guest, and we have some awesome topics today. We have Barzan Mozafari, CEO of Keebo.ai. Barzan, how are you doing today?
Speaker 2:Doing pretty well. Thanks so much for having me.
Speaker 1:Yeah, absolutely, Barzan. You're an expert in this field. There's so much incredible stuff going on in terms of the implementations with Snowflake, Databricks, Data Lakes in general, and it's growing to such a massive scale, and we're going to talk about that today. But first I wanted you to tell the listeners a bit about yourself.
Speaker 2:Sure, happy to. I'm a co-founder of Keebo.ai. Prior to that, I was an academic. I was a professor in computer science at the University of Michigan, Ann Arbor. Before that I was at MIT, UC Berkeley, and UCLA, and I also worked for a number of companies in this space. I pretty much spent the last two decades of my career doing research at the intersection of AI/machine learning and database systems.
Speaker 1:Excellent. Yeah, so you bring an awesome technical research background, and now it's applied in this incredible product that you're building, which really solves some critical problems for data teams that are operating at scale, sort of at the intersection of data engineering and FinOps. But I wanted to first talk about this at a high level. Your product, Keebo.ai, focuses on cost optimization. What are the biggest inefficiencies you see generally in how companies use Snowflake today, and do you see any common mistakes that lead to runaway compute costs?
Speaker 2:I think there are a lot of interesting sources of inefficiency. So, just for your viewers, what Keebo.ai does is that we're a data learning platform. Our models learn from how your own users and applications interact with your own data in the cloud, whether it's Snowflake or Databricks, and then use that to automate and accelerate the tedious aspects of the interactions between the data teams and the cloud data warehouse, and that allows them to actually get more out of the platform. Some companies use that to reduce their spend on their Snowflake bill, on their Databricks bill, et cetera. Some of them use it to enhance their productivity, to get more done with the money they're spending on these powerful platforms, and sometimes people use it to improve performance, for example. You know, I wouldn't necessarily call them mistakes that teams make; it's more about how these platforms have created new patterns, and how some of the traditional data engineering and data pipelines are no longer as effective, right? So, to look at the bigger picture, there's a reason why the likes of Snowflake and Databricks have been very successful in this market. Traditionally, people had to rely on these centralized on-prem data warehouses, go to DBAs, go to a centralized data team, and everything basically had to go through that bottleneck. But what the likes of Snowflake and others in this space have done is they've really lowered the adoption barrier.
Speaker 2:Right. Now it's significantly easier for organizations of any size. You know, a startup of two people, all the way to companies of 50,000 employees, can leverage data. Any team can just tap into data, and that time to insight is drastically cut down. That's all great, right? That's the positive side of the story.
Speaker 2:But the flip side of this is that now you've got more users tapping into more data and doing more things with it, and, because it's so easy, they're also bringing in more data and combining more data sources. So when you have a situation where more data is being queried, more data sources are being combined and joined together, and then you have more users querying that, that means your modern data pipelines are significantly more complex, and it's not just about the mistakes that people make. It's just that there's a human limit to how much you can optimize a very complex pipeline. And now add to this complexity the fact that, with a lot more users querying this data, it's just a natural fact of life that not every user is a database expert. Back in the day you had a BI analyst who knew exactly where to find what kind of data, how to hand-tune and hand-optimize every single query.
Speaker 2:Now you have an analyst from the supply chain department, someone from the marketing team, someone from sales ops, and they're all writing queries. And, to put it in light terms, not everyone who's writing SQL queries has a degree in computer science, right? So some of those SQL queries have some room for improvement. Let's just be honest about this. But now you've got millions of these queries hitting your data store.
Speaker 1:It's just a matter of skill, right? Absolutely, yeah, and I like the way you put it. You know, to use a marketing buzzword, Snowflake and these data platforms have democratized data.
Speaker 2:Exactly, very well put. It's effective democratization.
Speaker 1:And sure, they made it easy for large companies to sort of build this data platform that everyone can use across different levels of expertise.
Speaker 1:So, yeah, you might have people who are non-technical just doing a select star, dumping it to an Excel file, and filtering there. And it is really great, because it's increasing the efficiency of how fast companies can make decisions with data, and what types of teams can make decisions with data. Because, like you were saying, traditionally you would have a very trained BI analyst who's working with a very finite set of resources in an on-premise data center, and they have to write really great, efficient queries, and everything has to be planned, because that capacity was built out by an IT engineer who says, this is running on a 48-core box in our janitor closet, in our data center. But now everything's in the cloud, it's just like a utility, and anyone can use it. So it's great that you're solving this problem. And your product, interestingly, leverages AI to automate some of this query performance tuning. Can you walk us through how AI is being used under the hood?
Speaker 2:Yeah, absolutely. So, if you think about the complexity of what's happening, the way people talk about AI these days is sometimes misleading, right? I sometimes joke that you see people who are basically selling cookies pitching AI. You don't need AI for everything. There are situations where you don't need AI and AI is not applicable. But if you look at the complexity of the data pipelines, the fact that you've got millions of these queries hitting your cloud data warehouse at different times, with different teams trying to glean different types of insights from your data, some of these queries are pretty complex. That's where I'd say it's humanly impossible to make the optimal decisions. It's humanly impossible to make sure that every single query is hitting the optimal warehouse, that every single warehouse has the optimal settings of resources. And it's not just a matter of effort, because things change: let's say today, for example, at 9 am, a medium Snowflake warehouse was optimal for my reporting workload, but two and a half hours later a medium is not enough, I need to go to a large. Maybe after 5 pm I can go down to an X-Small. Maybe a sudden report comes in and I need to boost it up, right, I need to bump up the size.
Speaker 2:So, making these changes continuously, by analyzing millions of statistics and calculations, that's actually the sweet spot for AI. I sometimes call what we do at Keebo building an infinitely patient, infinitely competent DBA. A lot of people have seen what AI can do when they go to ChatGPT and ask a very complex question, or want ChatGPT to research something and distill it down. We're leveraging AI in a very similar way when it comes to optimizing your cloud data warehouse, right?
Speaker 2:So we actually analyze millions of statistics over the last however many days for this warehouse: which queries have run on it, which users have sent what kind of requests, what the performance has been, how long each of those queries spent in queuing. And based on those, we make optimal decisions: okay, right now this is the optimal size for this warehouse; right now, to hit the SLAs the users care about, here's the kind of resources we need; right now it's idle, this is unnecessary, we can get away with a smaller compute size; or right now we decide that, hey, we need more clusters, or smaller clusters, or larger clusters. And then we can also make recommendations for users in terms of low-hanging fruit they can pick to improve their own productivity, improve performance, but also drastically reduce the computational footprint and the overall cost to their company.
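[Editor's note: the continuous right-sizing loop described above — bump a warehouse up when queries queue, step it down when it idles — can be sketched roughly as below. All thresholds, statistics, and function names are illustrative, not Keebo's actual logic.]

```python
# Toy right-sizing policy: pick a Snowflake-style warehouse size from
# recent queue and utilization statistics. Thresholds are made up.

SIZES = ["XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE"]

def recommend_size(current: str, avg_queue_seconds: float, avg_utilization: float) -> str:
    """Scale up when queries are queuing, scale down when the warehouse sits idle."""
    i = SIZES.index(current)
    if avg_queue_seconds > 5 and i < len(SIZES) - 1:
        return SIZES[i + 1]      # sustained queuing -> bump up one size
    if avg_utilization < 0.2 and i > 0:
        return SIZES[i - 1]      # mostly idle -> step down one size
    return current               # otherwise leave it alone

# A 9 am reporting spike pushes MEDIUM up; a quiet evening steps it back down.
print(recommend_size("MEDIUM", avg_queue_seconds=12.0, avg_utilization=0.9))   # LARGE
print(recommend_size("MEDIUM", avg_queue_seconds=0.0, avg_utilization=0.05))   # SMALL
```

The real system makes these calls continuously from millions of observations; the sketch just shows the shape of the decision.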
Speaker 1:That's excellent. Are you using the mainstream foundation models or are you training your own models or a mix?
Speaker 2:So it depends on what the user is trying to do. What we do at Keebo is we're not replacing the data engineering team, we're augmenting them. We're empowering them to get more done. Our mission at Keebo is to empower data teams to take control and drive growth. So the data team defines their own performance guardrails, their own goals. For example, if a team's goal is to say, you know what, this is a mission-critical workload and it needs to finish by this particular time, then they set up those performance guardrails in the system, and we train our models accordingly. And to answer your question, in that situation, for example, we leverage reinforcement learning, which is roughly the third major branch of machine learning.
Speaker 2:You train an agent, much like how you teach your kid to, for example, eat without making a mess. When they do it right, you congratulate them, you clap for them, and every time they make a mess you tell them, hey, you shouldn't do that. It's the same with reinforcement learning agents. Every time that agent makes a decision, for example, it decides to send this query to a different warehouse, or to reduce the size of this warehouse, or to increase the memory on a particular instance, we judge whether it made the right decision. And what's the right decision? It depends on what the user's goal was. If the user's goal is to save money without causing a performance slowdown, then whenever the agent takes an action that leads to that outcome, we reward the agent, and whenever it takes an action that doesn't help with that outcome, we penalize the agent. So very quickly these agents become experts in your own workload. People tend to think, hey, it's going to take days or weeks or months for that agent to learn my workload.
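[Editor's note: the reward/penalty loop described above can be illustrated with a tiny epsilon-greedy bandit. The environment, actions, and numbers here are toys invented for illustration; the real system learns from production query statistics.]

```python
import random
from collections import defaultdict

ACTIONS = ["XSMALL", "SMALL", "MEDIUM", "LARGE"]

class WarehouseAgent:
    def __init__(self, epsilon: float = 0.1, lr: float = 0.2):
        self.q = defaultdict(float)   # running estimate of each action's value
        self.epsilon = epsilon        # how often to explore a random action
        self.lr = lr                  # learning rate for the value update

    def choose(self) -> str:
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)                 # explore
        return max(ACTIONS, key=lambda a: self.q[a])      # exploit best so far

    def update(self, action: str, met_sla: bool, saved_money: bool) -> None:
        # Reward decisions that hit the user's goal (fast enough AND cheaper),
        # penalize everything else -- the "clap or correct" loop from the episode.
        reward = 1.0 if (met_sla and saved_money) else -1.0
        self.q[action] += self.lr * (reward - self.q[action])

random.seed(0)  # reproducible demo
agent = WarehouseAgent()
# Toy environment: pretend SMALL is the only size that both meets the SLA
# and saves money for this workload.
for _ in range(500):
    a = agent.choose()
    agent.update(a, met_sla=(a != "XSMALL"), saved_money=(a in ("XSMALL", "SMALL")))

print(max(ACTIONS, key=lambda a: agent.q[a]))  # SMALL
```

After a few hundred simulated decisions the agent settles on the size that earns rewards, which is the mechanism, if not the scale, of what's described above.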
Speaker 2:But the thing is, we actually accelerate the learning. The models start warm. They have built-in general knowledge about how Snowflake works, but they also pull the metadata from the last three months. So most of our customers actually see significant savings in less than 24 hours from the moment they're onboarded. These models are very, very quick to learn and start delivering value. That's the beauty of reinforcement learning. But we've also started investing in allowing users to leverage generative AI and LLMs to actually rewrite queries. We have papers out there that people can read about how you can build this. The work, the product, is called GenRewrite, where we leverage generative AI to rewrite these queries using an LLM, but in a way that preserves the semantics, makes sure the rewritten query returns the exact same answer, but is also significantly faster.
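[Editor's note: the core invariant of semantics-preserving query rewriting — the rewritten query must return the exact same answer — can be demonstrated with a minimal check like the one below. The GenRewrite work uses far more rigorous verification; this sketch, with made-up table and query, only shows the invariant itself.]

```python
import sqlite3

def same_answer(conn: sqlite3.Connection, original_sql: str, rewritten_sql: str) -> bool:
    """Accept a rewrite only if it returns exactly the same rows as the original."""
    a = conn.execute(original_sql).fetchall()
    b = conn.execute(rewritten_sql).fetchall()
    return sorted(a) == sorted(b)   # order-insensitive row comparison

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 25.0), (3, 5.0)])

# A needless self-join subquery an LLM might flatten into a plain filter:
original = ("SELECT id FROM orders "
            "WHERE amount IN (SELECT amount FROM orders WHERE amount > 8)")
rewrite = "SELECT id FROM orders WHERE amount > 8"

print(same_answer(conn, original, rewrite))  # True: same rows, simpler plan
```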
Speaker 1:Yeah, that's excellent. And on top of the hard cost savings you're providing, just on the utility that we described with Snowflake and other data lakes and infrastructure, there's also an element of total cost of ownership. Because when I'm a data engineer and I get tons of requests from my business team, my first response isn't, oh, how do I make sure this is cost optimal? No, I'm spending 100% of my brain power making sure I'm solving the business problem, and then delivering that and getting acceptance, a thumbs up from business leaders that these new reports or these new data applications are working and providing value.
Speaker 1:If I can just have something that will automatically cost-optimize everything for me, I want to be lazy. I want to just write bad SQL because that's fast. I can optimize anything if you give me extra time, right? But having a tool do that for me just accelerates the time to value and lowers the overall cost of ownership of building and maintaining these data pipelines. So, yeah, it's a great application of AI for that specifically. I love your approach there.
Speaker 2:No, thanks for saying that. I think one way to look at this is, to your point, given enough time and patience everything's possible, but there's a cost associated. One of the biggest problems we see teams face: we were talking to this major insurance company, and their C-suite had decided, we're going to use Snowflake. But then the adoption was really slow. They purchased millions and millions of dollars worth of credits, but then they landed in this problem where they weren't able to onboard those use cases without the worry of, hey, what if someone leaves a 3X-Large warehouse on over the weekend? What if someone writes a really dumb query, and then we burn through the millions of dollars very quickly? We invested $10 million; now we have to protect that investment. So instead of being focused on how to drive ROI from the $10 million, now the data team becomes guardians of that $10 million investment. And the way this works is that they have to build a lot of in-house tooling to make sure they have alerting in place, that they have guardrails in place, that people don't do silly things with that cloud data warehouse.
Speaker 2:And a significant portion of these data teams' time is just spent, and maybe, allow me to say, wasted, on staring at these queries, figuring out how to optimize them, looking at the bill, trying to make sure they stay within budget, and all of that stuff. And that's not value-add, and that's not fun work to do. For example, if you're an insurance company, you grow your business by offering more competitive rates. If your engineers are instead constantly worrying about improving the performance of every single query, making sure they don't run out of budget, making sure nothing bad happens; if you could automate all of that and have your data teams focus on growing your business instead of running and optimizing your infrastructure, you can imagine how many thousands of engineering hours that frees up for actual productive work that drives your business forward.
Speaker 1:Absolutely. And now we're seeing so much evolution and fast-paced adoption of AI in the data stack. For example, you're solving a very targeted use case, using AI to optimize the compute and resources of these data engineering workloads. Beyond query optimization, where do you see the biggest opportunities for AI to improve how companies use and analyze their data?
Speaker 2:Beyond optimization.
Speaker 1:Yes.
Speaker 2:So, I mean, you've probably seen a lot of recent investment in RAG, right, how we can leverage LLMs for better retrieval from your own database. That's an area we're also actively looking into. You can go to ChatGPT, you can go to Claude and other LLMs out there and ask for general information about publicly available data. But companies have these extremely valuable internal data sources where the value is kind of locked up. We've seen about three decades of BI technology trying to make the data accessible; this idea of democratizing data and insight is not a new one, people have been pitching it for the last two decades. But the bitter truth is that it hadn't really happened. John Smith, who's sitting in the marketing department or the ops department, is not empowered to just go and open Looker and magically ask whatever questions they have. There are these legacy internal search systems, hey, go and search Confluence to see if there's anything, but the search is really, really bad. You cannot really get to the data. What we're seeing is that we're transitioning from search to actual conversations. Even five years ago, the primary way of learning something was you would Google it, and Google would do a really good job of showing you the most relevant sources, but then you'd have to go and read those sources, distill the information, decide if that's what you wanted or not, and then go and revise your keywords, and rinse and repeat. Now you just go to ChatGPT, ask exactly what you have in mind, and it gives you the answer. So what we're seeing is that on the enterprise side, and even in the mid-market, people are trying to have that same conversational interface to their own internal data.
Speaker 2:Hey, I'm curious, what happened yesterday? Did we sell more in this department? Or, I see that my customer acquisition has dropped off today; what are the biggest root causes of that? Normally you need a really competent data science team to go and perform those root cause analyses, and it's going to take them a long time to come back with reliable answers. But if you have a RAG system built up internally, you can hook it up with your own database and have an LLM next to it, in a way that addresses cost considerations, so you don't burn through thousands of dollars on LLM invocation costs and API calls; that addresses hallucination problems; that addresses compliance problems, so you're not sharing your internal confidential data with an outside LLM; and that addresses security concerns: hey, am I authorized as a user in this company to look at this particular table? And then you can get quick, meaningful, actionable insights. I think that's where the future lies. To be honest with you, there's an arms race right here; there are a lot of people trying to provide solutions in this space.
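[Editor's note: the retrieval step of the RAG pattern described above can be sketched minimally as below. Real systems use vector embeddings, access controls, and an actual LLM call; the word-overlap scoring and the documents here are deliberately simplistic stand-ins.]

```python
def score(question: str, doc: str) -> int:
    """Count question words that appear in the document (toy relevance score)."""
    q_words = set(question.lower().split())
    return sum(1 for w in doc.lower().split() if w in q_words)

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most relevant to the question."""
    return sorted(docs, key=lambda d: score(question, d), reverse=True)[:k]

docs = [
    "Daily sales by department are in the sales_daily table.",
    "Customer acquisition numbers are tracked in the marketing dashboard.",
    "HR onboarding checklist for new employees.",
]
question = "Why did our customer acquisition drop off yesterday?"
context = retrieve(question, docs)

# The retrieved snippets become the grounding context handed to the LLM,
# so it answers from internal data instead of hallucinating:
prompt = "Answer using only this context:\n" + "\n".join(context) + "\nQ: " + question
print(context[0])  # the customer-acquisition doc scores highest
```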
Speaker 1:Yeah, it's a super interesting area, kind of combining the whole semantic layer with this natural language interface where, like you said, someone just wants to ask arbitrary questions about the data and get very useful answers, kind of like what we're getting today when you chat with ChatGPT. You don't have to be particularly specific about what you want, and obviously the more specific you are, the better the responses, but you can still get great responses just by chatting, without being too precise. So being able to extend that to data, where queries are typically very deterministic, and you have to know the exact table, you have to know the data model, you have to know the right way to query it, et cetera.
Speaker 1:Now you can kind of open it up by saying, okay, just let me chat with the data; you figure out what SQL that's going to generate, or what Python you're going to write, to get that done. That's an exciting area, for sure, and hopefully we'll see some cool stuff continue to come out there.
Speaker 2:No, for sure. I think those three traditional considerations are still going to be there: the cost, the effort, and the accuracy.
Speaker 2:Those are the three things that I think are going to shape all of this. Like we saw with the adoption of Snowflake and Databricks, people very quickly realized these things can get very expensive very quickly. Then there's the effort needed: I've never seen a company that's spending, for example, more than $10 million on their Snowflake, or even north of a million dollars, without also having a significant number of engineering cycles being spent on monitoring and optimizing and worrying about it. And it just makes you wonder: is it actually worth it for a company that's in a different industry to go and build tooling to monitor their own tooling spend? That's why you see all these FinOps solutions coming up, because cost is becoming prohibitive very quickly. But then the way people are trying to address cost is by putting in more effort, which is again expensive. So people have figured, hey, we're spending, for example, $5 million on our Databricks or on our Snowflake.
Speaker 2:So maybe if we have five full-timers just constantly optimizing these pipelines and keeping an eye on it and whatnot, that's five times $250,000. That's not too bad, right? But if you think about it, that's not a really good ROI. This is exactly where you can leverage AI to monitor your own use of these platforms and make sure the wheels don't fall off the bus.
Speaker 1:Yeah, and when I look at what data teams want to do with AI, when it comes to business logic and their own internal applications, you'll see data teams gravitate toward that; they understand it and they want to implement it. But cost optimization, and other things adjacent to solving the core business problem, is such a great place for AI to just solve it for you. And I think solutions like Keebo make that super simple, especially since you've, in your own way, generalized the problem: you've done reinforcement learning on your models constantly, and you're bringing that in with your product. So one thing I want to ask: your users are constantly facing trade-offs between performance and cost. How does Keebo help them strike the right balance between keeping queries fast and keeping them cost-optimal at the same time?
Speaker 2:That's a really good question. I think the biggest learning we've had, and we work with hundreds of data teams, literally hundreds, if not thousands, is that there's no one size that fits all. The particular data team at this particular company at this particular time has a completely different view of what's considered the right trade-off between the two than a different company. Even within the same company, we see very different understandings of what's considered good performance on this particular workload versus this other one. There are situations where the customer says, you know what, as long as this workload finishes before 6 am, so the dashboards are ready for the day, I couldn't care less how long it takes; just keep the cost low, be as aggressive as you want, as long as the jobs don't fail and they finish before the first employee shows up. And then there are other workloads where it's, if this thing slows down, people start complaining, and then I'm going to rip out the solution.
Speaker 2:So we've followed two design principles in our product, and I think that's been incredibly helpful both to us and to our customers, aligning what we're doing with what our customers are looking for. Number one is that we've decided that whenever there's a fork, we're going to go with performance. The reason for it is, when you talk about full automation, the analogy I usually give is autonomous cars. When you're building an autonomous car, your first crash is going to be your last crash, because people will never trust you again. Let's say I could save this customer 30% today without any chance of a slowdown or negative impact on the pipelines, or I could push it and save them 35%, but there's a small chance that some of these queries are going to actually slow down, and the users will complain, and then the customer will complain, and then the trust will be ruined. Whenever there's a situation like this, we go for the 30% savings, not the 35%. The reason is no one gets angry. No one yells at you, why did you only save me 30% instead of 35%? But they will yell at you if you cause their pipeline to break. If you make the users' experience degrade, they're going to complain. It's the same with an autonomous car: if you get home at five, you're not going to complain.
Speaker 2:Why did my car not get me home by 4:58? It's just two minutes of delay. So what, it took a slightly longer route, maybe drove a little slower than it could have. It's tolerable. But if that autonomous car gets you into a crash, you're going to be really upset.
Speaker 2:So that's the first thing we follow: all other things being equal, whenever we have a fork, we go with protecting performance rather than saving more money. The default is, how can we save the most amount of money without impacting performance? That's what we call no-brainer savings. Because if we tell customers, we can cut your Snowflake bill by, let's say, 40%, but 10% of your queries are going to be considerably slower, then they have a trade-off; they have to think about it. Those decisions, believe it or not, are very difficult. They're paralyzing decisions, because you don't know how people are going to react to their queries taking 10% longer. But if you tell that same customer, guess what, I can save you 30% with no impact on your workload, that's a no-brainer. No one has to think about it. So that's the first thing we've done, and the second thing is back to what I was sharing earlier.
Speaker 2:Every customer is unique, every workload, every warehouse is unique.
Speaker 2:We we have the most comprehensive suite of basically flexible uh suites where basic people can go in there and pull their own levers they can.
Speaker 2:We have a slider, for example, where they can say how aggressive we're all around Conservative they want to be with this particular workload. You know, that's the first layer. The second layer they can go in there and actually define SLAs. That's, for example, I want to make sure the Maginot personnel latency stays less than two seconds. I want to make sure the 99% latency stays less than two seconds. I want to make sure the number of queues is less than this and that there's another layer on top of it where they can actually even define rules that the system will protect. So those are all guardrails and then we train the agents AI to actually operate within those guardrails that the users have specified. So short answer to your question is we never have to make that decision. What we have done is that we built a very flexible interface where users can tell us what is it that they consider good performance and what is their goal, and then the agent exactly delivers what they care about, what is productive, what team performance, maximizing savings, et cetera.
Speaker 1:Yeah, it's so interesting. It might be one of those things where, if I ran a data team and I really had to solve for, let's say, the FinOps team or the finance team being on me about my cloud costs, first I would use something like this to optimize my costs, and then see, is anyone complaining about the queries being 10% slower?
Speaker 1:If yes, then I'll pull the throttle back on cost optimization. But, like you said, one size does not fit all, and it's definitely something that every data team needs to evaluate for their own operations and their own business. But it's great that you've figured out how to add the right levers and ways for them to approach that. And I like the autonomous driving example, because it's like, yeah, I need to be home by 5 pm; I can probably sacrifice some time to get there safer and more comfortably, a smoother ride, fewer lane changes. Maybe I'll get there at 5:02 pm, that's fine. But if it's going to get me there at 5:45 pm, because it's going to drive at 10 miles per hour in the right lane the whole time, then I probably won't accept that.
Speaker 2:That's so true, and I really liked your example too. That's exactly the kind of iteration that you, as a data leader, are able to do: try something, wait to see if people complain or notice, and if not, then you know it's okay, and then you try again until you find that sweet spot. That's exactly where humans come in, and we rely on our champions, the data engineers leveraging Keebo, to make those kinds of decisions and have those kinds of conversations. We just give them the levers. If you want to go 10% more aggressive, here's a lever; and if it was too aggressive, you pull it back.
Speaker 1:Yeah, exactly, barzan. You have both an academic background and an entrepreneurial industry background, so I want to get your take on this. Where do you see AI-driven automation going in the next one to three years? Or do you see AI-driven?
Speaker 2:automation going in the next, you know, one to three years. I usually say, like you know, predictions are incredibly hard but also incredibly easy. At least it wasn't the time. Where people make predictions about the future, no one goes back and holds them accountable. I will. It's not going to work. So that's why I said it's hard and easy.
Speaker 2:But joking aside, I think LLMs have really gotten to a point where, and this is the analogy I give people, it wasn't really an overnight thing. Imagine you're staring at a wall, and on the other side someone has stood a ladder up against it. They're climbing that ladder every single day, making progress, getting closer to the top, but from your side of the wall you don't see any progress until their head makes it above the wall line. Then, suddenly, you see a person's head, and then more and more of their body as they climb. To you it feels sudden: for twenty minutes nothing was happening, and now there they are. But for the person climbing, it was continuous. I think that's the difference in perspective between academia and industry. It wasn't like, hey, suddenly now we have ChatGPT. We've been part of that progress, witnessing it and contributing to it, for the past decade and a half, two decades. But finally the compute power and the sheer volume of data we could train these models on got us to a point where the public can see the value.
Speaker 2:So you asked whether it's going to happen one to three years from now. I think we're going to see a lot more real-world adoption of these agents in all walks of life. Right now it's just a cute thing to go and ask ChatGPT, but people are actually building interesting verticals and apps on top of it where you can, for example, have an agent redline your document or review your lease. Say you're hiring someone in a completely different country: you can ask one of these LLMs to check for compliance, because they've been trained on those local laws. So I think we're going to see more adoption over the next year.
Speaker 2:But if I look at a ten-year horizon, I don't want to say software engineering is going to go away. There are people who say no one will be writing code anymore, that everyone turns into a prompt engineer. I don't think that's going to happen. What I do think will happen is that the average engineer, at least the engineers who want to stay relevant, whether data engineers or software engineers, will be a lot more well-versed in machine learning, statistics, and AI.
Speaker 2:Right. And over the next twelve months, and this is something that's been happening for the past year and a half and will probably continue for another year, we're seeing some irrational resistance. You see some engineers or some teams feeling threatened by AI, that "AI is here to take my job away." So what happens is that there are still a lot of teams very locked in on building every tool in-house. We face this with Snowflake too: you'd be surprised how many teams want to build their own optimization tool in-house.

Speaker 1:I think that sounds awful. By the way, I would never volunteer for that.

Speaker 2:No, you shouldn't. Like you said, it's probably a century-old buy-versus-build kind of question. But I would say that's going to change quite a bit over the coming years.
Speaker 1:Yeah, absolutely. And when I hear it thrown around that AI is going to replace engineers, I always just point to Anthropic and OpenAI hiring all these engineers who are supposedly going to get fired in the next year.
Speaker 2:My joke is that if you're afraid AI is going to take your job away, it probably is. So instead, you need to learn how to best leverage AI rather than resist it.
Speaker 1:Absolutely, barzan. This was really great. Thank you so much for joining us today and thank you to all the listeners for tuning in Barzan. Where can people continue to follow on with you?
Speaker 2:They can just go to keebo.ai, that's K-E-E-B-O dot ai, or follow me on LinkedIn.
Speaker 1:Excellent. We'll have those links down in the show notes for the listeners. Barzan, thank you again and thank you to the listeners for tuning in.
Speaker 2:It was a great pleasure, John. Thank you so much for having me.