How I AI
How I AI showcases the people shaping the future with artificial intelligence. Host Brooke Gramer spotlights founders, innovators, and creatives who share not just the tools they use, but the transformations they’ve experienced. Human-centered storytelling meets visionary insights on business, culture, and the future of innovation.
How a CTO and Founder Uses AI to Cut Cloud Waste
In this episode of How I AI, I sit down with Leon Kuperman, CTO and Co-Founder of CAST AI, right after his keynote at the Miami AI Summit during Art Basel Week.
Leon brings over 20 years of experience across software engineering, cybersecurity, and infrastructure at scale. After selling his first company to Oracle, he now focuses on one of the most overlooked forces shaping the AI economy: how efficiently we run what we build.
If you’re building with AI, scaling a product, or trying to understand what’s really happening under the hood of the modern cloud stack, this episode will change how you think about growth.
Topics we cover:
- What Kubernetes actually is and why it matters for AI-powered products
- How workload rightsizing reduces cloud waste, energy use, and spend
- Why cloud overprovisioning has become an industry-wide problem
- How startup velocity is increasing with smaller, more technical teams
- When founders should stop relying on third-party AI APIs
- Why security and compliance must be built in from day one
- What Application Performance Automation (APA) unlocks in 2026
Tools, Platforms, and References Mentioned:
- Coursera (machine learning foundations)
- CAST AI Kubernetes Cost Benchmark Report
- Kubernetes
- AWS startup credits
- Google Cloud Platform startup programs
- Microsoft Azure startup programs
- PCI DSS (Payment Card Industry Data Security Standard)
- Amazon Rufus (AI-assisted shopping agent)
- The Lean Startup (book)
Who this episode is for:
- Founders building or scaling AI-powered products
- CTOs, engineers, and technical leaders running cloud workloads
- Startups navigating cloud costs and infrastructure decisions
- Operators focused on efficiency, margins, and sustainability
- Anyone curious about how AI actually runs behind the scenes
Learn more about CAST AI:
- Website: cast.ai
- Trust & Security: trust.cast.ai
- LinkedIn: https://www.linkedin.com/company/cast-ai/
Ready to cut through the AI overwhelm?
Explore all my resources in one place → https://stan.store/BRXSTUDIO
Free AI Guide • 45-Minute AI Audit • AI Community • How I AI Companion GPT
If you enjoyed this episode, please rate and review the show. Sharing it with a friend who’s AI-curious helps How I AI reach more people.
Support your body against the invisible stress of EMFs with Leela Quantum Tech.
More About Brooke:
Website: brookex.com
LinkedIn: Brooke Gramer
More About the Podcast:
Instagram: howiai.podcast
Website: howiaipodcast.com
"How I AI" is a concept and podcast series created and produced by Brooke Gramer of EmpowerFlow Strategies LLC. All rights reserved.
Leon:Something like 13% of the resources that customers, basically all cloud users, request get used. That means there's something like 85% waste.
Brooke:Wow.
Leon:That waste doesn't come without an energy footprint.
Brooke:Yes.
Leon:And it doesn't come without a cooling footprint, and it doesn't come without a cost footprint, so, mm-hmm, we have to do better across all of those fronts.
Brooke:Welcome to How I AI, the podcast featuring real people, real stories, and real AI in action. I'm Brooke Gramer, your host and guide on this journey into the real-world impact of artificial intelligence. For over 15 years, I've worked in creative marketing, events, and business strategy, wearing all the hats. I know the struggle of trying to scale and manage all things without burning out, but here's the game changer: AI. This isn't just a podcast. How I AI is a community, a space where curious minds like you come together and share ideas, and where I'll also bring you exclusive discounts and insider resources, because AI isn't just a trend, it's a shift, and the sooner we embrace it, the more freedom, creativity, and opportunities we'll unlock. Hello everyone, and welcome back to How I AI. I'm your host, Brooke Gramer. Today's guest is Leon Kuperman. He's the CTO and co-founder of Cast AI. I caught Leon right after his keynote at the Miami AI Summit during Art Basel Week, and we go deep on the part of AI that quietly decides everything: compute. Leon explains why Cast AI focuses on automating the boring, hard, tedious infrastructure work so teams can ship better software with
better performance, stronger security, and at lower cost. In today's conversation, we dig into what founders get wrong, why cloud credits are helpful, when you need to stop relying on third-party endpoints, and what he's building next in 2026. If you're building with AI, scaling a product, or you just want to understand what's under the hood of this whole AI economy, you're going to love this episode. But before we dive in, I wanted to take this opportunity to share a technical term and concept I learned ahead of this interview. Kubernetes, or K8s, is basically a traffic controller for apps that run in the cloud. Think of your app like a restaurant. Your code is the recipes. The containers are like portable cooking stations; each one includes everything it needs to make a specific dish, so it can run the same way on any computer. And the servers are the kitchen space and equipment. Now, here's the problem. Customers show up in waves. Friday night is chaos. Tuesday afternoon is dead. Just think about e-commerce weeks like Black Friday. Kubernetes is the manager for all of that. It starts more cooking stations when demand spikes, so your app doesn't crash. It shuts down extra stations when demand drops, so you're not paying for empty kitchen space. It restarts stations if one breaks. It basically keeps everything running smoothly without humans having to oversee it 24/7. So what does Kubernetes have to do with Cast AI? Most companies use this to run their apps, but they still waste a ton of money because they over-order cloud resources. Think too many servers, too much compute, too much memory. This type of technical setup gets complicated fast. Maybe they pick the wrong type of cloud machines. They leave things running when they don't need to. So teams start to play it safe. And safe usually means expensive. What Cast AI does is sit on top of Kubernetes and help make smarter decisions automatically. Back to the restaurant analogy: Kubernetes is the manager, and Cast AI is the super smart assistant manager with a calculator and a sixth sense. It finds waste, like empty tables you're still paying for. It fits you into the right-sized kitchen, so you're paying less for equipment. It keeps your performance strong while lowering costs. And it does this automatically, so your team doesn't have to constantly fine-tune it. Why does all of this matter in the AI era? Well, AI features often make apps more expensive to run because they use more compute, and when startups start to scale and the bills jump, teams either panic and slow down innovation, or they keep shipping and quietly bleed money. Kubernetes is how modern teams run these workloads at scale, and Cast AI helps Kubernetes run them smarter and cheaper without breaking performance. Alright, now that we've had that quick learning moment, let's dive into today's interview with Leon. Hello Leon. How are you today? Thank you so much for being on How I AI.
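To make the restaurant analogy above concrete, here is a toy Python sketch of the kind of scaling decision Kubernetes automates: add replicas (cooking stations) when traffic spikes and remove them when it drops. The function names, thresholds, and traffic numbers are invented for illustration; this is not Kubernetes' or Cast AI's actual logic.

```python
# Toy sketch of the "restaurant manager" idea: scale replicas up when demand
# spikes and down when it drops. All names and numbers are illustrative.

def desired_replicas(requests_per_second: float,
                     capacity_per_replica: float = 100.0,
                     min_replicas: int = 2,
                     max_replicas: int = 50) -> int:
    """Return how many replicas are needed for the current traffic."""
    needed = max(1, round(requests_per_second / capacity_per_replica))
    return max(min_replicas, min(max_replicas, needed))

def reconcile(current: int, demand: float) -> int:
    """One control-loop step: compare demand to capacity and adjust."""
    target = desired_replicas(demand)
    if target > current:
        print(f"Demand spike: scaling up from {current} to {target} replicas")
    elif target < current:
        print(f"Demand dropped: scaling down from {current} to {target} replicas")
    else:
        print(f"Steady state: keeping {current} replicas")
    return target

if __name__ == "__main__":
    replicas = 2
    for rps in [150, 900, 2500, 400, 80]:  # Tuesday afternoon vs. Black Friday
        replicas = reconcile(replicas, rps)
```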
Leon:Hey, Brooke. Good to be here. And glad we could set this up finally.
Brooke:We are in the thick of Art Basel Miami Week, and we just had the Miami AI Summit, which I had the pleasure of watching you open with the keynote talk, and I would love for you to share with listeners today all about Cast AI. First, let's start with how you started. How did you get into tech? Maybe it's a childhood story. I always love to get behind the person. You know, you're the CTO of Cast AI, but let's get to know you today.
Leon:Sure. So, uh, I'm an immigrant child from an immigrant family. We moved to North America in 1978. I always joke with my dad that the difference between me and Sergey Brin is that his dad was a mathematician and mine was a mechanic. Yeah. But, uh, I come from a family of engineers. Mm-hmm. Uh, my mom is an electrical engineer. My brother is a mechanical engineer. My dad is mechanical. And I was fascinated with computers from a very young age. Mm-hmm. And as you know, 10,000 hours really stacks up when you start coding at a very early age. In my case, I was around nine.
Brooke:Wow.
Leon:Um, so by the time I got to university, I was, uh, really doing something that I loved, and I've always found joy in merging business with technology. Mm-hmm.
Brooke:Uh,
Leon:So my mom found this, uh, very old letter that I wrote. I think I was in grade six or seven. And it says something to the effect of: I love building software, and I hope one day my software will be good enough that people will want to buy it.
Brooke:Tell me about your early projects in AI and tech. What were some of the things that got you into the space? What were you creating in the beginning?
Leon:So early on, I was not, I'm not a trained machine learning or AI expert by education. Mm-hmm. And I think when I was going through university, it didn't really exist. You'd take AI courses, but these were all primitive techniques back in the late nineties. And I self-taught, and I would highly recommend people still do. So Coursera has this amazing, yes, Stanford course. The professor's name is Andrew Ng. It's, uh, I think an eight- or ten-week course that I went through while I was working, and I did it back in 2014, okay, or '15, somewhere right around there. And I started to realize that a lot of problems could be solved, not through standard heuristics, but through predictive models. And we applied that in the first company that I built with this group of co-founders, a company called Zenedge.
Brooke:Mm-hmm.
Leon:Which we ended up selling to Oracle eventually, but we, we were trying to solve a cybersecurity problem through machine learning. And the idea is, could we detect malicious web traffic without having a set of hard-coded rules?
Brooke:Yeah.
Leon:But allow the machine to learn what's the difference between good and bad. That was a very early experience.
Brooke:Yeah.
Leon:Very rudimentary models. But it sparked this, like, 10-to-15-year curiosity in machine learning, which has come so far in the last, I would say, five or six years, and exponentially faster in the last two to three years. Mm-hmm. And if you think about it, the generative AI that we all are getting comfortable with today
Brooke:right,
Leon:is really an extension of machine learning. It's a token predictor system. Mm-hmm. So a lot of the techniques that we learned in the early days still very much apply. Maybe some folks that are deep into it now as users don't understand all of that theory, but if you want to understand the foundational principles, there's nothing better than starting with kind of a machine learning 101 education.
Brooke:So fascinating that you took Coursera. I've looked into that myself. My motto is to always be learning and to be consistently upskilling. And a lot of people use YouTube to learn what it is that they're doing and selling right now. Um, so take us to present day and everything that you're doing at Cast AI. What does your typical client look like? What's the average and most common service you provide?
Leon:So I think, let's start back in 2020, which is when we started the business. Mm-hmm. And we had this very strong belief about the world of software. Our customers are building software, and we believe they deserve, and we want to help them, to build and deliver the best software they possibly can. And in order to do that, we think there aren't enough people currently writing, building, and maintaining code
Brooke:Mm-hmm.
Leon:to allow software's growth to extend the way it needs to. So we need to automate things that are boring, that people don't want to do, things that are hard to do and tedious and mundane. And so we came from a principle of, okay, how can we help our customers deliver the best quality software that they can, at the most secure, best-performing, lowest-cost price possible? And so we took a couple of bets, or created a hypothesis. We said, okay, this is a little bit technical, but I'll kind of tell you why we're starting there. Most software applications are going to be containerized. For listeners who don't know what containerization is, it's just a way of packaging software that's highly portable, so you can put it on my computer, your computer, or the cloud. Okay. And it doesn't matter.
Brooke:Kind of like Canva is, I would say, highly that way; you can access it pretty much anywhere.
Leon:Yeah. So that's software as a service. Mm-hmm. But like, if you actually look at how Canva builds and deploys their software, they package it in this way that's really easy to deliver. So we said, okay, there needs to be an operating system
Brooke:Mm-hmm.
Leon:for that whole containerized world, and there were a number of competing standards back in the day. Docker was one of them. Google had just released and open-sourced their internal system for delivery.
Brooke:Mm-hmm.
Leon:It was called Borg internally, if you can believe it. Okay. Um, you know, resistance is futile. But it came to the market in open source as Kubernetes. Kubernetes is just a Greek word for navigator. So everything you'll notice in our world is navigation-themed, or at least nautical-themed. So we took those two bets and we said, okay, if that's true, all of the work that needs to be done to maintain software and the way it's deployed in this ecosystem is extremely hard, and we're just not gonna have enough humans to do it. So how do we apply smart algorithms to do the work for people? And that's where we kind of started. Now, as we got going, one of the huge side effects of that automation, mm-hmm, was our customers started paying a lot less for their infrastructure.
Brooke:Great.
Leon:As a result of not having to make human decisions that were often suboptimal, people started saving a ton of money. So naturally that became the easy sales motion for our enterprise sales team. I would say that automation is what we deliver. The byproducts are cost savings, better performance, better SLAs, and so forth. And we do that through a combination of algorithms. Some of them are AI-driven, some of them are heuristics. And my personal belief is that you use the simplest algorithm for the job. So if you can solve something with an if-then statement, do that, versus trying to create a very fancy model that will chew up a lot of horsepower for no reason.
Brooke:Yes. Speaking of chewing up too much horsepower: we talked a lot at this summit about, you know, energy resourcefulness and this ethical responsibility for making sure that our output, the power and wattage we're using, isn't just generative AI slop for fun. Could you expand on Cast AI and its ability to be more resource-efficient, and why that's so important right now and in the future we're heading toward with AI?
Leon:Yeah, that's a great point. So you don't get cost savings out of thin air.
Brooke:Mm-hmm.
Leon:You get them by requesting fewer resources, which means those resources take less power, take less cooling, and can be allocated for other applications in a data center. So the cloud doesn't have to provision more and more and more. They can do more with less. Mm-hmm. So that is a really natural byproduct of what Cast delivers, and we have very specific components in our system that take care of that. So, for example, there's this, uh, fancy term called workload rightsizing. Basically, when a developer, a software engineer, says, I'm gonna deploy this application, they also tell the system how many resources they think they need. Now, that's usually a finger-in-the-air hypothesis, like, I don't know, I need four CPUs and 16 gigs of RAM and two GPUs. And rarely do people take the time to benchmark those things effectively, 'cause it's too hard.
Brooke:Mm-hmm.
Leon:And what we do is benchmark those applications in real time and then adjust the resource requirements to exactly what they need. Not more, not less. Sometimes they're underprovisioned; most of the time they're overprovisioned. We actually have a really good Kubernetes cost benchmark report where we talk about the overprovisioning in the industry. Oh, and the numbers are pretty shocking. So something like 13% of the resources that customers, basically all cloud users, request get used. That means there's something like 85% waste.
Brooke:Wow.
Leon:That waste doesn't come without an energy footprint.
Brooke:Yes.
Leon:And it doesn't come without a cooling footprint, and it doesn't come without a cost footprint, so, mm-hmm, we have to do better across all of those fronts. What exacerbates the problem with AI is that GPUs are very power-hungry.
Brooke:Mm-hmm.
Leon:They're very hot machines. They need to be cooled. Yes. And so when you start putting hundreds of thousands of H100s, or whatever the next generation is, into a data center, you are gonna produce a lot of heat and consume a ton of power.
Brooke:Hmm.
Leon:And that was that gigawatt conversation we were having. Yes.
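For readers who want to see the rightsizing idea in numbers, here is a small, hypothetical Python sketch: it compares what a workload requested with what it actually used, suggests a request based on observed usage plus headroom, and computes the waste. The percentile choice, 20% headroom, and sample values are illustrative assumptions, not Cast AI's real algorithm.

```python
# Hypothetical sketch of workload rightsizing: compare requested resources to
# observed usage and recommend a request closer to reality plus headroom.
from statistics import quantiles

def recommend_request(usage_samples: list[float], headroom: float = 1.2) -> float:
    """Recommend a resource request (e.g. CPU cores) from observed usage."""
    p95 = quantiles(usage_samples, n=100)[94]  # roughly the 95th percentile
    return round(p95 * headroom, 2)

def waste_percent(requested: float, usage_samples: list[float]) -> float:
    """Share of the requested capacity that sat idle on average."""
    avg = sum(usage_samples) / len(usage_samples)
    return round(100 * (1 - avg / requested), 1)

if __name__ == "__main__":
    requested_cpus = 4.0  # the "finger in the air" guess
    observed = [0.3, 0.5, 0.4, 0.6, 0.8, 0.5, 0.7, 0.4, 0.6, 0.5]  # cores used
    print("Recommended request:", recommend_request(observed), "cores")
    print("Waste at current request:", waste_percent(requested_cpus, observed), "%")
```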
Brooke:And I'm gonna be recapping the whole conference on an episode coming up soon. Amazing. So I'll add some more of your comments there. Um, I would love for you to speak directly to any startups and founders that are in that position where they're starting to scale. What do you feel like are some of the biggest mistakes you see startups and new tech founders making when it comes to cloud usage and compute usage?
Leon:Well, I have some general advice for entrepreneurs. I think people in general spend way too much time on the technical idea without validating product-market fit.
Brooke:Mm-hmm.
Leon:People are better educated now than they were 15 years ago.
Brooke:Yes.
Leon:I think The Lean Startup, which was a great book, was an eye-opener for a lot of folks.
Brooke:Mm-hmm.
Leon:You should get to your product-market fit answer much more quickly, spending as little money as possible, and, as we were talking about at the conference, it's very cheap to get started. You can create a business and a company, market it, get interest, and see if people are willing to pay, for almost no money. Right. I think it'll actually be a VC problem eventually, because VCs have a certain amount of capital that they need to deploy.
Brooke:Mm-hmm.
Leon:And founders aren't gonna want to take that much money once that efficiency train starts rolling.
Brooke:Interesting.
Leon:So it's gonna be the opposite of that. I think, uh, there is going to be a revolution in startup velocity. Mm-hmm. And you already see it with companies like Lovable. Yes. You already see it with Cursor. You already see it with Windsurf, where with a very small number of people, these companies are scaling massively. Now, back to your point about resource efficiency: if they're going to use generative AI in their software stack, mm-hmm, a lot of startups are gonna start using the APIs that are provided to them. And this was the hot topic of the conference that we opened with the Hugging Face CEO, around the importance of open source.
Brooke:Yes,
Leon:It's fine to start with OpenAI endpoints or Anthropic endpoints, but at some point, you are gonna need to take control of your destiny. Mm-hmm. Those models are gonna have to be closer to home than arm's length, running in some data center at OpenAI. Why is that important? Well, I think there are kind of three core reasons. The first one is there are gonna be some applications where you just can't take the information and pass it across a boundary like that without ensuring security and privacy. So, right, what's an example? Like an insurance application. You're not gonna be able to take someone's personally identifiable information that has been entrusted to that insurance company and just ship it and then hope for a response back. You're gonna have to bring those models closer to you, in your kind of private network. The second reason is cost and efficiency. So we have this tokenomics model where, when you're using APIs, you pay per token. Like, X number of cents for a million tokens in and out. Anthropic and OpenAI have to make money on that, right? So they're providing you software as a service. The closer you get to the infrastructure, the more money you're gonna save.
Brooke:Mm-hmm.
Leon:So often what I see is startups have a great idea. Google and Azure and AWS will lure them in with startup credits; you can get tens of thousands of dollars in startup credits. And if you guys haven't done that, go and ask, go and sign up for those incubator programs on GCP, Azure, and AWS. They're great.
Brooke:Yes.
Leon:But once you chew up those credits, you're so heavily tied to that ecosystem, that's where they've got you. That's where you're stuck. And your cost of goods sold, the thing that defines your gross margin as a business, is gonna be tied to the economics of that provider.
Brooke:Mm-hmm.
Leon:And we're proponents of saying, no, no, no. Take control of your destiny. Mm-hmm. Deploy those models in your own infrastructure, even if it's in the cloud. And then you control how efficiently that infrastructure is used. And that's where we're trying to help, uh, our customers with generative AI. Run those models locally, run them in your clusters, run them on your infrastructure, and we will help you scale efficiently.
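To illustrate the tokenomics point with rough numbers, here is a hedged back-of-the-envelope calculation in Python comparing per-token API pricing with an amortized self-hosted GPU cost. Every price, throughput, and utilization figure below is a made-up assumption for the sketch, not a quote from any provider.

```python
# Illustrative arithmetic only: hosted-API token pricing vs. renting your own
# GPU capacity. All numbers (prices, throughput, utilization) are assumptions.

def api_cost(tokens: int, price_per_million: float) -> float:
    """Cost of pushing `tokens` through a hosted API at $X per million tokens."""
    return tokens / 1_000_000 * price_per_million

def self_hosted_cost(tokens: int, gpu_hour_price: float,
                     tokens_per_second: float, utilization: float = 0.6) -> float:
    """Cost of serving the same tokens on GPUs you rent or run yourself."""
    effective_rate = tokens_per_second * utilization  # real-world throughput
    gpu_hours = tokens / (effective_rate * 3600)
    return gpu_hours * gpu_hour_price

if __name__ == "__main__":
    monthly_tokens = 2_000_000_000  # hypothetical 2B tokens per month
    print(f"Hosted API:  ${api_cost(monthly_tokens, price_per_million=10.0):,.0f}")
    print(f"Self-hosted: ${self_hosted_cost(monthly_tokens, gpu_hour_price=4.0, tokens_per_second=1500):,.0f}")
```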
Brooke:Hmm. That's incredible. I can see that trend happening: people really wanting to take more control of their data and their profit. If you're tuning into this podcast, you're most likely an AI advocate, and you may have also wondered how to support your body against the invisible stress of EMFs. Think wifi, cell towers, or hours in front of your laptop. Leela Quantum products are lab-tested in triple-blind studies and are proven to help harmonize and neutralize EMF signals. Their products are among the few things I've felt a real energetic shift from. I personally wear their quantum energy necklace daily. And if you're someone who cares about optimizing your energy and nervous system like I do, you can explore their offers with my affiliate link in the show notes below. Which brings me organically to my next topic: cybersecurity. So you mentioned you worked a bit in that in the earliest stages. How do you keep everything that you're doing in your company advancing security while keeping costs low? I can imagine that's a real concern and threat with a cloud-based service such as yours. Could you speak about the precautions you're taking around cybersecurity?
Leon:So this company, Cast, is the first infrastructure company that I've built. Everything before this has been cybersecurity, so I'm very passionate about the cyber topic.
Brooke:Mm-hmm.
Leon:I was a little burned out a few years ago. But, uh, it's a topic near and dear to my heart.
Brooke:Mm-hmm.
Leon:When you build with a security-first perspective, there are a few principles that are really important, and there are frameworks around those principles now that are fairly mature. Mm-hmm. So one of the things that anybody looking for any service wants to do is go to their trust page. Every company of roughly our scale has a trust page. Ours is trust.cast.ai. And what you'll see is security certifications. So what are some of the common ones? SOC 2 Type II is an important standard. ISO is an important standard. The most stringent one that we work with is something called PCI DSS, the Payment Card Industry Data Security Standard.
Brooke:Mm-hmm.
Leon:And what that ensures is that our customers can process sensitive credit card data through systems that we manage, with a security guarantee all the way upstream to our customer base and to their customer base, 'cause they in turn have to be compliant with that data standard as well. Mm-hmm. So when you line up all of those frameworks, there's a set of tools, tactics, techniques, and procedures that you structure from the ground up. And, you know, I would say 30% of everything we do is making sure that it's not just the application; it has to be secure. So for every single proof of concept that we go through with enterprise customers, mm-hmm, there's a massive security checklist, which, by the way, AI is great at helping to fill out these days. But that checklist is there to protect those enterprises, because they have compliance, they have tactics, techniques, and policies, mm-hmm, and they have a cyber policy and an insurance policy that they have to adhere to in order to maintain their coverage. So it's a very tight chain, and if you don't think about that in enterprise software from the very beginning, it's a very uphill battle.
Brooke:It's great to hear that you're putting that security mindset first. I would feel safe, uh, putting my product and service on your servers. Yeah.
Leon:And our, you know, enterprise customers demand it. So yes, customers like Mercedes-Benz and Hugging Face and many others, at all stages in between, require that level of security in order to feel comfortable running mission-critical applications on, mm-hmm, the infrastructure we provide.
Brooke:So right now it's December 2025. We just had the busiest e-commerce shopping week of the year, and a lot of advancements are being made with agents having purchasing power and buying within ChatGPT. What do you see Black Friday 2026 looking like? How different do you feel e-commerce will be, and what advancements do you feel there will be in AI from this time of year forward?
Leon:I think it's gonna increase volume. Mm. So for us, that's good news. I mean, we just get busier and busier every year, and this week, and the paranoia around getting the infrastructure required, mm-hmm, to run e-commerce gets higher and higher every single year. Mm. Like, some of our first customers were super paranoid about, can we get the compute we need, yes, to run our business during these busy times? Like, what if the cloud runs out of resources? What's our disaster recovery strategy for that? Mm-hmm. So a lot of our product is designed to handle those immense spikes. Like, we're working with an oil and gas research company right now. Mm-hmm. And they're spinning up compute, something like in the tens of thousands of computers. Wow. To run one single research batch job. And you can imagine, if you're just kind of 10 cents off on a unit, it blows the whole budget. So everyone is super tactical about getting exactly the right resources for the right time. But back to your question about the shopping experience. I've read statistics, and I haven't validated this a hundred percent, but that the propensity to buy from users that shop using AI-assisted tooling, mm-hmm, is way higher, 'cause there's more confidence. Like, when you're using ChatGPT or any other AI agent to help you compare products, you're getting a third-party, somewhat unbiased view, yes, of the market. As a result, you feel better about pulling the trigger more quickly, and maybe price isn't the only thing that you're shopping for.
Brooke:Mm-hmm.
Leon:So I think it takes a lot of uncertainty outta the process. Amazon has done a really good job with Rufus. Have you tried it?
Brooke:No.
Leon:There's an agentic workflow right on amazon.com.
Brooke:Oh, I'll have to check that out.
Leon:Uh, Rufus is pretty good at like helping you sort through stuff.
Brooke:Cool.
Leon:And Amazon has said that customers using Rufus have a higher propensity to buy. So I think intentionality is there now. I think people are gonna be more confident, and that's gonna raise volume versus going to a store, where physically you really need to take a lot of extra steps to get that digital assistant to help you. Like, you have to take pictures, you have to scan barcodes. Mm-hmm. There's a lot of extra work to do. I actually went through that experience in Home Depot, buying a tool, okay, and having ChatGPT assist me. It's a lot more cumbersome than doing it online. So next year, more volume, but higher intent.
Brooke:Wow. I'll have to try Rufus tonight. I have a sibling birthday coming up, so maybe Rufus can help me, yes, pick out a present.
Leon:I mean, I'm not a shopping guy, but from what my wife tells me, there are some pretty cool tools out there.
Brooke:Amazing. So what's next for you? What projects are you working on? Of course, I don't want you to disclose any secrets, but I'd love to hear. You know, sometimes I ask my guests a fun question: if you were to wave a magic wand or vibe code something tonight... Do you have any passion projects or side tech companies you're building, or anything coming up? What's next for you?
Leon:So at Cast, we're building some really interesting things. All of 2026 is gonna be about this new capability that we've introduced in the market called APA. It's a little bit technical, but I'll kind of explain it to you. It stands for Application Performance Automation. Mm-hmm. In general, in software, enterprises buy this thing called APM, which stands for Application Performance Monitoring.
Brooke:Mm-hmm.
Leon:But monitoring is a problem; like, I feel it's the emperor's new clothes. Like, okay, you've told me I've had problems.
Brooke:Mm-hmm.
Leon:Now I have more problems. Like, you haven't helped me. And your recommendations are only as good as my ability to implement those recommendations. Look at it from a human perspective: if you get a system that recommends making a change, it's still on you to make that change correctly. Mm-hmm. And so people fall back on conservative approaches, taking those recommendations slowly or never taking them at all. And our goal with APA is to create a human-at-the-end experience. So what do I mean by that? Right now, a lot of your chat experiences, even when you're vibe coding,
Brooke:mm-hmm.
Leon:are human-in-the-middle. Yes. You are working with an agent. The agent is doing some stuff. You're giving it feedback, but it's a very back-and-forth process. So your time is taken. Like, when I use Claude Code, for example, it has a five-hour wall clock, and it's very annoying; I'm gonna send the CTO a text. It has a five-hour wall clock, so the maximum I can work with Claude Code is five hours, kind of in a window. Oh. And then it's a forced break. But that just emphasizes the human-in-the-middle approach.
Brooke:Hmm.
Leon:What we're trying to create with APA is an agentic workflow that's a human-at-the-end approval process. So you have a very complex problem, and we're gonna let these agents that have runbooks, and a runbook is just a set of scenarios it needs to solve for.
Brooke:Okay.
Leon:Security is a great one that we just released. How do I make my application secure? It will go through various hypotheses, looking at the vulnerabilities that you have today, and systematically run a bunch of experiments until it comes up with the best possible resolution with what's available today, and then produce an answer for you that's fully baked into your code base. Wow. And it'll raise what's called, uh, in technical terms, a pull request, which is just: hey, I've made all of these changes for you. I've done all the research, I've tested it on your platform to death. Here's a bundle, Mr. Engineer, will you approve it? And then all the engineer has to do is say, I've looked through it and I approve, and you can push it to production. Or an engineer can say, no, I looked at it and it's terrible. Here's my feedback. Go learn for next time, you know, and fix this. So I don't wanna remove humans from the equation. I want to move them into a higher-order-thinking approval workflow that lets the agentic workflow do all of the work upfront.
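Here is a generic Python sketch of the human-at-the-end pattern Leon describes: the agent does the research and testing autonomously, bundles its proposed change like a pull request, and the human only approves or rejects at the end. The class names and the runbook step are placeholders for illustration, not Cast AI's actual APA product.

```python
# Generic sketch of a "human at the end" workflow: the agent works unattended,
# then a human approves or rejects the finished bundle. Names are placeholders.
from dataclasses import dataclass, field

@dataclass
class ChangeBundle:
    title: str
    diff: str                                            # proposed code/config changes
    evidence: list[str] = field(default_factory=list)    # test results, benchmarks

def run_security_runbook(vulnerabilities: list[str]) -> ChangeBundle:
    """Stand-in for the agentic work: try fixes, test them, bundle the winner."""
    evidence = [f"patched and re-tested: {v}" for v in vulnerabilities]
    return ChangeBundle(
        title="Automated security remediation",
        diff="# ...generated patch would go here...",
        evidence=evidence,
    )

def human_review(bundle: ChangeBundle, approve: bool, feedback: str = "") -> str:
    """The only human step: approve to ship, or reject with feedback to learn from."""
    if approve:
        return f"Merged '{bundle.title}' with {len(bundle.evidence)} checks attached."
    return f"Rejected '{bundle.title}'. Feedback recorded for the next run: {feedback}"

if __name__ == "__main__":
    bundle = run_security_runbook(["CVE-2025-0001", "outdated TLS config"])
    print(human_review(bundle, approve=True))
```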
Brooke:Wow, that's incredible. And you still code a lot on your own?
Leon:I do. My team doesn't like me coding, but it still happens.
Brooke:Well, you shared so much today about what's coming in the future and a lot of really great use cases for Cast AI. Do you have one more key takeaway that you wanna share? I always like to leave space open for anything you'd want to leave listeners with, who are maybe starting their tech or AI journey, or just starting to build themselves.
Leon:Yeah. So for those new builders of the world, and I think we talked about it in a sidebar a couple of days ago, I really want to emphasize the chaining of tools for people who are starting to build. Like, you can use the best tool for the job, but let them feed each other. So I'll just give you one example, right? Okay. If you're designing a new application, you're doing market research, and you've got a great marketing plan to start validating market fit, maybe you're using GPT for that. You can ask ChatGPT to produce a Lovable prompt that will then go build the front end of that application as you've envisioned it, and it'll do a really good job. And then maybe you can chain Claude Code to do it for a backend service. So by bringing these agents together and having them know about each other and play off their strengths, you're acting as the conductor, but really still leveraging each model to become the input of the next chain of thought.
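As a rough illustration of the tool chaining Leon describes, here is a minimal Python sketch where each model's output becomes the next model's input and you act as the conductor. The call_model function is a placeholder for whichever API or CLI you actually use (ChatGPT, Lovable, Claude Code); it is not a real client library, and the steps are assumptions drawn from his example.

```python
# Minimal sketch of tool chaining: one model's output feeds the next model.
# `call_model` is a placeholder, not a real SDK call.

def call_model(tool: str, prompt: str) -> str:
    """Placeholder: send `prompt` to `tool` and return its text output."""
    return f"[{tool} output for: {prompt[:60]}...]"

def build_product(idea: str) -> dict[str, str]:
    # Step 1: use a general-purpose chat model for market research and planning.
    plan = call_model("chatgpt", f"Research the market and draft a plan for: {idea}")

    # Step 2: ask the same model to write a prompt for a front-end builder.
    frontend_prompt = call_model("chatgpt", f"Turn this plan into a Lovable prompt: {plan}")
    frontend = call_model("lovable", frontend_prompt)

    # Step 3: hand the plan and front end to a coding agent for the backend.
    backend = call_model("claude-code", f"Build a backend service for: {plan}\nFront end: {frontend}")

    return {"plan": plan, "frontend": frontend, "backend": backend}

if __name__ == "__main__":
    artifacts = build_product("an AI-assisted gift-finder app")
    for step, output in artifacts.items():
        print(step, "->", output)
```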
Brooke:I love that. That's really like systems and design thinking, going into the whole operation, because so many times we just jump in. We're not really thinking of the whole SOP of what it is that we're doing. I think that's such a key takeaway and last point. Thank you so much for your time today. If listeners wanna reach out or connect and learn more about Cast AI, how do they find out more?
Leon:Cast.ai. We actually have a free sign-up process and, uh, a freemium product that people can try. Oh. If you're running any type of infrastructure in the cloud, this is a great place to start.
Brooke:Mm-hmm.
Leon:Um, and I think we have a great LinkedIn profile as well, which we can share in the show notes.
Brooke:Wonderful. I'll be sure to link all that out as well. So thank you so much for your time today.
Leon:Thank you, Brooke. It was great. Thank you.
Brooke:Wow, I hope today's episode opened your mind to what's possible with AI. Do you have a cool use case on how you're using AI and want to share it? DM me. I'd love to hear more and feature you on my next podcast. Until next time, here's to working smarter, not harder. See you on the next episode of How I AI.