Future Ventures: Scaling with Clarity

Future Ventures: Clarity at Scale is the podcast for founders, operators, and investors who are building companies worth owning for the long term — and who need to think clearly about capital, structure, strategy, and growth to get there.

Each episode cuts through the noise around scaling: how to structure a deal, how to position a business for institutional capital, how to build operational leverage without losing control, and how to make the high-stakes decisions that compound in value long after the moment has passed.

Hosted by Maxim Atanassov — a four-time founder and the Managing Partner of Future Ventures Corp. Since 2018, FVC has invested in, incubated, and scaled companies across sectors — with a focus on platform opportunities that compound in value. Maxim's background spans executive leadership inside Canada's largest energy companies and senior advisory at Deloitte and EY. He's a CPA-CA who has sat at the table where capital gets deployed, governance gets built, and hard decisions get made. Now he helps founders get there faster.

New episodes every week. Subscribe wherever you listen.

Stan Christiaens — The Billion-Dollar Problem Behind AI | Future Ventures Podcast Episode 37

May 20, 2026 • Maxim Atanassov

Stan Christiaens is the Cofounder and Chief Data Citizen of Collibra, one of the companies that helped define the modern data governance category. What began in 2008 as a spinoff from a semantics research lab at the Free University of Brussels has grown into a global platform used by some of the world's largest enterprises to manage data trust, lineage, governance, and increasingly, AI oversight. Eighteen years in, Stan has a vantage point most operators do not — he has watched governance get sidelined every 5 to 10 years by the next shiny technology, and he has watched the same data problems resurface every time.

This conversation matters because AI has changed the math. Companies that treated data as exhaust instead of as an asset are now discovering that their AI ambitions are bottlenecked by foundations they never built. Stan and Maxim get specific about what those foundations actually look like, why the Chief Data Officer role is at an inflection point, and why the iceberg of unstructured data — roughly 80 to 90 percent of an organization's information — is suddenly the biggest question on every data leader's desk. If you are building, advising, or selling into enterprises right now, this is the conversation about why data discipline is no longer optional.

Key topics covered

Data as asset vs. data as exhaust — why most organizations are still "growing up" to treat data as an asset, and why fragmentation from every shiny new technology makes the problem worse.
The foundations of governance done well — find it, understand it, trust it, with assigned responsibility, a repeatable process, and a system of record sitting underneath.
The evolution of the Chief Data Officer role — Gartner's five versions from defensive posture to data products to "startup person," with version 6 due, and why the AI moment is the CDO's biggest opportunity.
Data confidence and the AI brain — why five-nines reliability for agents is achievable but requires scaffolding around the model, and why asking an LLM not to hallucinate misses the point.
Selling into enterprise as a founder — why enterprises are slow by design, why they are never greenfield, and how to find the innovation pockets that let you accelerate.

Three key insights

You cannot fix data after the fact. Organizations that treat data as a byproduct for years and then suddenly need AI cannot retroactively make that data useful — they have to start treating it as an asset first, which is a cultural shift before it is a technical one.

The model is not the problem. No matter how smart the frontier LLM gets, it will not make the right decision without the right context at the right time. The work is building the harness around the model — the responsibility, processes, and curated context — not chasing the next model release.

Patience is the entrepreneur's hardest skill. Plan for a 10-year journey, not a three-year sprint, and make sure your business ambitions and your life at home stay in harmony — because the overnight successes everyone admires were a decade in the making before they looked obvious.

Links

Collibra: https://www.linkedin.com/company/collibra
Stan Christiaens on LinkedIn: https://www.linkedin.com/in/stijnchristiaens/
Future Ventures on LinkedIn: https://ca.linkedin.com/company/future-ventures-corp

About Stan Christiaens

Stan Christiaens is the Cofounder and Chief Data Citizen of Collibra, which he helped spin out of a computer science research lab at the Free University of Brussels in 2008. Over 18 years, he has built Collibra into one of the defining companies in the data governance category, working with many of the world's largest enterprises on how they organize, trust, and use their data. He is a recognized voice on the intersection of data governance, AI, and enterprise transformation.

SPEAKER_00 0:02

Welcome to the Future Ventures Podcast on Scaling with Clarity. On today's show, we have Stan Christiaens, cofounder and Chief Data Citizen at Collibra, one of the companies that helped define the modern data governance category. What started out as academic research in semantic technologies in Belgium has grown into a global platform used by some of the world's largest enterprises to manage data trust, lineage, governance, and increasingly AI oversight. On today's episode, we're going to explore why AI is forcing companies to rethink their entire data architecture, what most organizations still misunderstand about governance, and why the future of enterprise AI may depend less on models and more on context. Stan, welcome to the stage.

SPEAKER_02 0:55

Hi Maxim, this is Stan.

SPEAKER_00 0:58

It's a pleasure to talk to you today and find out more from you about how you're thinking about this and what led you to it. I mean, data governance has always been an issue. I used to work with EY and Deloitte, and I remember us having data governance engagements as far back as I can remember. But now with AI, this has risen in prominence. So walk us through the origin story. Why did you decide to focus on the research, and how did that evolve into the company that Collibra is today?

SPEAKER_02 1:39

Well, Maxim, you're taking me back to the early years, right? Because we've been doing this now for about 18 years. So we have a pretty good view of how the governance requirements of large organizations are evolving, and they're always subject to technology changes, of course. But going back to 2008, indeed, we were at that time a spin-off from a university in Brussels, the Vrije Universiteit Brussel, which translates to Free University of Brussels. But if you translate that back to French, for example, you would have Université Libre de Bruxelles, which is a different university on the same campus. But yeah, potato, potahto. We came from a computer science lab that was doing research around semantics back in that time. This was also the day when Tim Berners-Lee's Semantic Web was big. The lab had two research topics: how to apply semantics in computer systems, and how to collaboratively create, curate, and manage context as people. And then we said, okay, let's take this into the wild world of the market and see how far we get. And that's how we discovered that larger organizations had data problems, have data problems, and that with AI today those data problems are becoming significantly larger.

SPEAKER_00 3:22

And I completely agree with you on that point. From your vantage point, Stan, what are the foundational elements around data that companies are missing, the ones preventing them from actually unlocking the full value of AI?

SPEAKER_02 3:48

Well, I think it has to do with how you look at data specifically. I think a lot of organizations treat data more like an exhaust than like an asset. If you think about data as an asset, just like you think about money as an asset, or IT technology as an asset, then you organize for that asset, right? So there are responsibilities assigned, there are processes set up, whether ceremonious or more agile, and there's organization around it. So essentially, the organization treats the thing, data in this case, like an asset or like an exhaust. If they've treated it as an exhaust, and then they're like, oh, now we need to get value from this, then it's gonna be hard, because the first thing you have to do is start treating it as an asset. And I think that's the biggest problem that many organizations still have today: they're still, let's say, growing up to treating data as an asset. And in the process of growing up, Maxim, what happens is every five or ten years they're almost sidelined by a new technology: self-service BI, the big data days, what I call the puberty phase of data management software. And now, of course, the huge gorilla in the room is AI. But essentially, every five or ten years, another shiny widget technology pops up, and then they're like, okay, some of the data gets distracted in that new platform. Ultimately, these larger organizations just end up with a whole bunch of those shiny widgets, right? And that fragmentation also makes treating the data as an asset a bigger problem, because the more fragmentation you have, the bigger the challenge of taking care of it.

SPEAKER_00 5:49

For sure. I mean, I can provide real-life examples of what I've seen with the companies I've worked with. We were doing an SAP S/4HANA implementation, and either the data didn't exist or it wasn't usable. So if we're talking about work streams, the one that was always behind, the one that was always creating dependencies, was the data stream. We've worked on building out data lakes, and as you know, garbage in, garbage out. We've worked on building out ESG platforms to be able to calculate scope one, two, and three emissions, and we spent an exorbitant amount of time and resources building the data, because AI is compute, but it has to compute something. The other data point that really stands out for me is that I live in Calgary, and Calgary is very much an energy resource economy. Oil and gas; we're kind of like the Texas of Canada. And there were studies, led by a university in Australia, I think it was Adelaide University, that showed that oilfield services companies like Schlumberger (Schlumberger changed its name to SLB) became ten times more valuable than the actual oil and gas companies, because they took the data that was being generated at the oil and gas companies and turned it into assets. So to your point, a lot of companies were generating a lot of data, whether from IoT devices or whatever it was, but they were not using it in any meaningful way.

SPEAKER_02 7:42

Yeah, exactly, Maxim. And if you are the Texas of Canada, then I hope that also includes the barbecue effect. I'd be more than happy to come join you and discover Calgary barbecue. But exactly to your point, you've seen it in these large companies: getting anything done that depends on data always hits that wall of, oh, we need to take care of our data. So organizations very often re-encounter the same problem, and always make brave endeavors to try and make it better. And I think what will happen with AI is it will just accelerate the need for solving that problem properly. I think the ones that do it properly will benefit more from AI than the others, and we all know that AI acts as an accelerant today. So the ones that do accelerate with AI will pull away further and faster than the others.

SPEAKER_00 8:43

So Stan, Calgary. Have you been to Calgary before?

SPEAKER_02 8:48

I don't think so.

SPEAKER_00 8:50

Oh my gosh, you're missing out. It's one of the most dynamic cities in North America. It's one of the top 15 in terms of size and population, with a population of two million. But we're talking about barbecue and tailgates. Calgary has the biggest outdoor show in the world. It's called the Calgary Stampede, and it runs for 10 days. Calgary Stampede, yeah.

SPEAKER_02 9:16

Oh, this sounds interesting.

SPEAKER_00 9:18

So it's a cowboy show, with chuckwagon races, with barrel racing, with bull riding. It's incredible. The city shuts down for 10 days, and it's nothing but hard. It starts on the first Thursday in July and goes for 10 days. Concert after concert after concert. I mean, I love Calgary. I've lived here for 20-plus years. I love Calgary, it's an amazing city.

SPEAKER_02 9:47

Sounds like a wonderful experience. I might have to free up one of my first weeks of July, either this year or next year, to experience it.

SPEAKER_00 9:55

Absolutely, absolutely. Joking aside, I think you should come to Calgary. Just to give you context, the ESG data platform build-out alone was about 30 million dollars. On the customer and enterprise data platforms, I think we were spending about 20 million dollars, and the overall platform was 60 million dollars just in building out the data platforms. So these are big, big dollar amounts, but they are foundational to being able to do anything. You mentioned business intelligence, we're talking about AI; this is the foundation that everything else rides on. So if you overlay that as a lens, what are the foundational elements that enterprise companies should have in place to ensure that they're developing a data quality standard and data governance, that they're ingesting the data, and that they're doing everything in the right way to have quality data on a continuous basis?

SPEAKER_02 11:04

Well, I think the essence of it is not rocket science, Maxim. Now, the way in which you establish it is another part of that answer, but let me start with the what. The essence of what you need to do is relatively straightforward, I think. Anything that you treat in a large organization as an asset, you need to be able to do some things with that asset, right? You need to be able to find it, so that people know the asset exists. You need to be able to understand it: what is this asset, what can I do with it, and what can I maybe not do with it? What is some of the context of it? If you have a customer data set, for example, what kind of customers are they? You can use it for these marketing purposes, but not for those purposes. That's a simple example of understanding the context. And you need to be able to trust it, of course. To do those things, you need a place to do them, what we call a system of record. Collibra is such a system of record for data assets. And then you also need to establish some responsibility around it. If you have the customer data set and I have questions about it, do I talk to Maxim or do I talk to Stan? Then multiply that question by 10,000 or 100,000, the number of employees. So for that specific asset, who do you go and talk to if you have questions about it of any sort? They could be small questions, they could also be very big questions, but you need some kind of responsibility established. And as you know, in a larger organization, responsibility is not just established through individuals, but also through organizational hierarchy. Typically people refer to that as the data office.
This is just like finance is responsible for the money asset, or HR is responsible for the talent asset, people. The data office is responsible for the data asset and for figuring out what needs to be done to make this a sustainable practice in the organization. So you have some responsibility, individual as well as organizational. And then you need some process as well. I'll give you an example, Maxim. Say you look inside Collibra and find there's a data set for customers, or a hundred of them. You narrow it down, whether it's structured or unstructured data, to the essential data set that you're looking for, but you don't have access yet, because your access to that data set depends on what purpose you have in mind. If you say, well, I'm leaving this company tomorrow, so I'm gonna take this data set with me on my USB stick and give it to the competition, obviously that's not gonna happen, right? But if you say, well, I'm gonna use this data set for some sandbox-type experimentation, some rapid insight into whether there's value in the data set for my purpose of marketing, or improving our products or services, or whatever else you're planning to do, then you need to put in that request, which could be a simple form. Then, either by automated policy or by a manual okay or not okay from the owner of that domain of data, you get access to the data. But there might be multiple steps in that process. If your organization is more regulated, or the data is more sensitive in some way, or the purpose you have in mind for the data is, then maybe there are more steps in the process. Maybe somebody from privacy needs to take a peek, or maybe somebody from security needs to do a review, for example.
Or maybe somebody from the business wants to talk to you and say, hey, what do you have in mind? Can we talk about this a little bit? Because I do believe there might be value, but it also comes with costs, so let's see if we can get that balance going. So you might have a process with many steps involved, or it could be very agile: you just see that data set, say you want it, and because you want it for sandbox purposes, just to do some rapid investigation, maybe you get an automated approval and there are no steps in between. Either way, you need responsibility assigned, and you need process established so it becomes repeatable and it becomes normal for people to do it this way, rather than going through hundreds of Slacks and Teams messages and emails until, via a game of find a friend who finds a friend, you find the one to talk to. You can just do it in a consistent and repeatable way. So: responsibility, process, and that system of record in place to enable the people and the process to work this problem at scale.
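
[Editor's sketch] The purpose-based access flow Stan describes here (a request form, then either automated approval or a chain of reviews) could be modeled roughly as follows. This is only an illustration under stated assumptions: the class names, the `purpose` values, and the `review_steps` function are hypothetical, and none of it represents Collibra's actual product or API.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    owner: str               # hypothetical: the accountable owner of this data domain
    sensitive: bool = False  # hypothetical flag for regulated or personal data


@dataclass
class AccessRequest:
    requester: str
    dataset: Dataset
    purpose: str             # hypothetical values: "sandbox", "marketing", ...


def review_steps(request: AccessRequest) -> list[str]:
    """Return the ordered reviews a request must pass.

    An empty list means the request is approved by automated policy.
    """
    # Low-risk case from the conversation: sandbox experimentation
    # on non-sensitive data gets an automated approval.
    if request.purpose == "sandbox" and not request.dataset.sensitive:
        return []

    # Otherwise the owner of the data domain gives a manual okay or not okay.
    steps = [f"owner:{request.dataset.owner}"]

    # Sensitive data adds more steps: privacy takes a look, security reviews.
    if request.dataset.sensitive:
        steps.extend(["privacy", "security"])
    return steps


if __name__ == "__main__":
    customers = Dataset(name="customers", owner="stan")
    print(review_steps(AccessRequest("maxim", customers, "sandbox")))
    print(review_steps(AccessRequest("maxim", customers, "marketing")))
```

The point of writing the rules down in one place, rather than resolving them over chat and email, is the repeatability Stan mentions: the same purpose and the same data set always produce the same path to an answer.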

SPEAKER_00 16:19

Based on your work to date, how many companies actually do have that in place, and what are the biggest misconceptions that executives have about data governance today?

SPEAKER_02 16:35

Well, our market is the enterprise, Maxim, right? If you look at those, I would argue that over the last 10 to 15 years, all of them have established some responsibility around data. So all of them will have a data boss or even multiple data bosses. They might have a group-level, firm-wide, or corporate data boss with a corresponding data office, which might or might not also own the data infrastructure, like the data platform, and then you might even have data bosses or data offices in every one of the lines of business, or by geography. If you go back to, let's say, 2012, the total count of chief data officers would have been, I don't know, 10 or something like this. Whereas now Gartner estimates that there are probably over 150,000.

SPEAKER_01 17:34

Oh, wow.

SPEAKER_02 17:35

So I think we've made a lot of progress in the industry, and more of that responsibility is needed. It needs to mature further, because taking care of your data is gonna be business as usual, especially because AI is really demanding to eat high-quality data. But the challenge that you have with it, well, I'll talk about the word governance itself. Governance is always a little bit seen as, oh, it's gonna slow us down, or, oh, it's important, but somebody else has to do it, or we'll do it later. Which is also doing governance, just the kind with the highest risk involved and no consideration of risk whatsoever. So you're always doing some form of governance, essentially. But what I've seen is that the chief data officer, because it's such a new role, you can see that Gartner has already developed four or five levels of it. And I believe that today's AI situation is a unique opportunity for those data bosses if they play their cards in the right way. For that you need to go back a little bit in history. Gartner identified version one of the chief data officer as the one that has a defensive data posture. Version two is all about offense, version three was all about supporting digital transformation, version four was all about data products, and version five was more about the chief data officer as the startup person, I think. And version six is about to come out, if I'm not mistaken, this year, so we'll see what it is. But it will have to do with AI and it will have to do with transformation, of course. Now, because the function started from that regulatory nature, that version one, the defensive data posture, there is this perception of the role that it doesn't score goals, right?
Because if you only have a defensive data strategy, you will never score a goal. I mean, the times that a keeper in soccer, Maxim, scores a goal by kicking the ball really hard in defense and actually scoring on the other side of the field, those are unique times, right? So a lot of those data programs or data offices still carry that old history, the legacy, with them like an anchor, and they really still need to move or mature into those other versions, the evolution of the role, I would say. And that's where the opportunity now lies, because with AI, data will be needed, inevitably. You know this, I know this, but the challenge that we have in the market, Maxim, is that not everybody understands this yet. For a lot of people, AI is this chatbot in the cloud that they can use today. So why would I need to invest any kind of time or resources in my data foundations? They don't always see the connection. And our people, the data people, don't always help, because oftentimes they'll lean back and say garbage in, garbage out, and sort of wait for the AI people to come to them. Or worse, they'll throw up all sorts of objections to the people doing AI. They say it hallucinates, it needs good-quality data, so just give us three years to invest in that data foundation, and then you'll be able to do your AI. Now, whether that's true or not, Maxim, obviously that is not a message that anyone doing AI today wants to hear. Because AI is here now, and I've got to do something here and now. So those experiments that people do with AI will happen, and many of them will hit walls.
And I think the best posture you can take as a chief data officer or a chief data and AI officer is a constructive, collaborative approach to those AI projects, because they will need data, and you want to be seen as a person that's helping. You don't want to be seen as a person that's hindering, I think. So that's our opportunity, and I believe that in that way the chief data officers can become chief data and AI officers or chief data and analytics officers. Otherwise, a chief AI officer might arise and just subsume the data responsibility.

SPEAKER_00 22:21

I mean, from what we're seeing on the ground when working with enterprise companies, one mental block that's preventing companies from advancing is that perfection is the enemy of done. If you're waiting for the data to be perfect, that will never be the case. But if you build out processes that build out the data, and if you assign different levels of confidence to the accuracy of the underlying data, you're far more likely to start developing use cases for what you're trying to accomplish. When we do maturity assessments, quite often enterprise companies, and I'm talking big enterprise companies, score quite low. It doesn't really matter if we use Gartner's maturity models or the Corporate Executive Board's; with the varying degrees of maturity that we're seeing, quite often companies score in the one and a half to two and a half range. We rarely see leading data governance functions within companies. At least that's the perception that we have based on the companies we've worked with. But I'm curious, from your perspective, Stan, are there industries and verticals that are on the leading edge, the companies that we should be looking at? For example, when I'm working with more traditional industries, they're saying, well, we should go and hire people from tech to come work with us. What is your perspective? Where are the pockets of really leading-edge companies?

SPEAKER_02 24:04

So I think they do exist, but I think you have to look for pockets inside industries on the one hand, and then for pockets inside the organizations on the other, Maxim. I'll give you an example. Remember that fourth level that I mentioned earlier, the chief data officer version four according to Gartner, all about data products. What you're seeing a lot in the market right now, I mean, I was at Gartner last week in London, and you could see it all over, is talk about data marketplaces and data products, with the data marketplace being used as an instrument to share and socialize data products. And you're seeing a lot of organizations embark on those data marketplaces. So that shows that people are active on it, and some have progressed further than others. You mentioned people in technology spaces; industries where caring for data is more in their nature will typically be first. I agree. But then it depends a little bit, like I said, on the pocket inside the companies, because when you establish a data program that is successful, you establish that through use cases, Maxim. With every use case, in my opinion, you win new hearts and minds of data champions, and those data champions become advocates inside the organization to find your next batch of data champions, until the culture has matured into, well, we're taking care of data because that's how businesses run. For example, if you go look in banks, you can see a lot of activity around taking care of your data and making sure that your AI models know where the data is coming from and how it's being computed, the hops along the way, and the responsibility assigned to each hop. You could see a lot of activity in the large banks if you go back to 2012 and 2014, right? They were very active.
If you then go to the second half of the 2010s, you would see that activity fragmented or fizzled a bit. And then you go back two or three years, the BIS came out with the report for BCBS 239, and boom, now there's a lot more activity again. So if your use case around governance and data and AI is more regulatory in nature, then you have to look for that activity in those regulatory pockets, in the time frame that the regulation is coming at them. If you go elsewhere, you might not see the same thing. I'll give you another example, Maxim. Privacy is another use case for taking care of your data. For example, under GDPR rules, you have to establish what you are using this data for: this data is used in business process X, Y, or Z, and it has been subject to these assessments, whether personal data had no risk of leaking out, or whatever. So if you go into the pocket of the privacy teams inside organizations, you would see them perform certain types of governance functions. So at this moment, I would say governance is very active, taking care of your data is very active, but typically it's still in a more fragmented state, with certain companies leading the way that are trying to make this more of an organizational, enterprise initiative. Those, for example, you'll see leading with what I would call more offensive use cases, like data marketplaces where anybody in the organization, as a citizen data scientist or a citizen AI person, can really tap into that data as a democratization initiative.

SPEAKER_00 28:13

Makes sense. I have a question for you: how are leading organizations organizing themselves in order to propagate this as a capability? What I mean by this is that some companies adopt a bit of a centralized approach, and some adopt more of a community of practice or a community of excellence, where there is somebody who stewards and champions it, but essentially they're trying to embed the capabilities within the different pockets of the organization and just provide the rails upon which this can be built. What are you seeing leading organizations do that is actually taking root?

SPEAKER_02 28:56

Yeah, I like how you're calling it a living organization, Magazine, because indeed um an organization is always subject to change. I mean, you've probably experienced it multiple times, and maybe many of your audience as well, but every three or five years there's an organizational change happening, right? Especially in large organizations. So oftentimes you see that they swing like a pendulum from a centralized approach to a decentralized approach to a federated approach, and then we're swinging all the way back again. Um, so I think any program that you have to that you're putting up has to take that reality into account that your plans or strategies and your execution on the program, however beautiful or successful, is always going to be subject to those dynamics. Um, so you need some resiliency built in. Uh, is there a right answer? Uh well, should it always be centralized? Should it always be federated? No, there is no right answer. I always say, Maxime, that just like CRM or ERP, uh, there's no single way of doing this. Uh, it really depends on a number of variables, uh, like, for example, your organization's culture, uh, your organization's uh business strategy, uh, your organization's structure itself, right? How the company is organized, uh, but also you know, more invisible things sometimes, like your organization's uh power play and politics, um, and of course, uh your organization's technology ecosystem. All of these variables make it so that um uh setting up an operating model for governance, be that data governance, AI governance, or both, which it has to be in my opinion, um, always is going to be a moving target within which there's no single way of doing it. Uh, so what's the way you you described it as a living organization, I think that's the most important. Just like with any process you put in place, you need to be ready for iteration. Because even if all the variables remain the same, uh Maxine, right? 
And you're the same organization from one year to the next, when you're successful at doing this, you're raising the data and AI maturity in your organization. You're more data literate, you're more AI literate. So one year later there is a higher maturity level, and that higher maturity level might again require a change to your operating model, right? Because you're now more mature, you can do more things, or you can do more things differently. So I think that iteration, that living and learning how to do this, is so important. What I've seen organizations do is set up data days or data and AI weeks. The CEO comes in and speaks about how important data is to them, or what their AI plans are, and so on and so forth. So you see that a lot of the smart organizations are really tackling this as something that needs to be carried by their community of practice, the data champions all across the organization.

SPEAKER_00 32:14

Okay, okay. One topic that's emerging quite a bit amongst enterprises, well, business in general, is what some companies refer to as the AI brain. But the AI brain is really predicated upon some kind of data set or data sets existing, whether they're architected via RAG or something else. And Collibra talks about data confidence now. What does this actually mean in terms of operationalizing it inside a Fortune 500 company or a bigger company, like the ones that you work with? I'm a CPA by background, so obviously I always want the highest level of assurance with the lowest level of risk, and as close to real time as possible. How do you guys think about it at Collibra?

SPEAKER_02 33:12

Well, data confidence is the balance between what you just described, right? Highest quality data, lowest level of risk: it's all about that balance and knowing where you are on it. Because, for example, Maxim, you could have a pretty shitty quality data set, but for your purpose, maybe a sandbox AI experiment, it might be good enough, right? But then you say: well, we were learning in that chatbot experiment, right? Because our boss has to do something with AI. So we've created this chatbot, and it used the less high-quality data set, but it was good enough for our experiment to learn. Now we have these learnings, and we want to scale it up to the larger organization, but we know that's only going to be possible if we have a version of that same data set which has, I don't know, duplicates removed from it, or the quality elevated on accuracy, and so on and so forth. So that's also an example of data confidence: just knowing what you've got and knowing what you can use it for. And knowing that if you have a challenge, like "I don't know if I can use it" or "I don't know how to understand and interpret it," you have somebody to talk to who is an authoritative source on that subject matter, right? Who comes maybe with some business knowledge and background. All of those are examples of data confidence. And the AI reality that we live in today is very transformative, Maxim. The way that we're doing things today versus the way that we will be doing them in 12 or 18 months, and definitely in 26 months, is going to be significantly different, I think. But ultimately, data confidence in the AI world means that when you do speak to a chatbot, your company's brain, if you will, or when you do have an agent operating on your behalf with a lot of autonomous execution capability, right?
Which makes the problem a lot bigger, right? If you have a chatbot where you ask a question, you still have your own critical thinking capability that kicks in, right? So you can still make a judgment call. Do I ask more questions, or do I send the chatbot to investigate a little bit further, and maybe I'll take the answer and process it before I do something. But when you go to an autonomous agent, you're having those decisions made automatically a thousand times per second, right? So what data confidence means in that AI reality is that you know you can trust what the chatbot says and you know how you can process that answer, and more importantly, that you know what agents are out there and what they're doing, and that they're operating within, let's say, certain safety parameters. Just like we would with a self-driving car, right? We accept a self-driving car on the road if we've defined and established safety boundaries within which it can operate; then we trust it. Otherwise, we're like, well, let's maybe leave it on certain roads, but not on all the roads, right? We need to collect some more data. So I think that data confidence means we have AI running, we have agents running, and we trust what they're doing.
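
The "good enough for a sandbox, not for production" idea above can be sketched in a few lines of Python. This is only an illustration, not Collibra's method: the duplicate-rate metric, the threshold values, and the names `MAX_DUPLICATE_RATE`, `duplicate_rate`, and `fit_for` are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical thresholds: the maximum share of duplicate records
# tolerated for each use case. The numbers are invented for illustration.
MAX_DUPLICATE_RATE = {"sandbox": 0.20, "production": 0.01}


def duplicate_rate(records):
    """Return the share of records that repeat an earlier record."""
    if not records:
        return 0.0
    counts = Counter(records)
    duplicates = sum(n - 1 for n in counts.values())
    return duplicates / len(records)


def fit_for(records, use_case):
    """Return True if the data set meets the (assumed) bar for the use case."""
    return duplicate_rate(records) <= MAX_DUPLICATE_RATE[use_case]


# Ten customer records, one of which ("bob") appears twice.
customers = ["ann", "bob", "bob", "cy", "dee", "eve", "fay", "gus", "hal", "ivy"]

print(duplicate_rate(customers))         # 0.1
print(fit_for(customers, "sandbox"))     # True
print(fit_for(customers, "production"))  # False
```

The same data set passes one bar and fails the other, which is the point being made: confidence is about knowing what you have and what it is fit for, not about a single absolute quality score.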

SPEAKER_00 36:41

So, Stan, I just want to unpack this idea a little bit more. Are we able to achieve such a high level of data confidence that we can get to zero or near-zero hallucinations? Can we move from a probabilistic to a deterministic model, so that the response we're getting from, as you mentioned, a chatbot, the inference or whatever gets created, is free of hallucinations? And then get to a point where we have near certainty about whatever actions agents are taking? In data centers, we talk about the five nines in terms of uptime. Can we get to a point where whatever actions agents are taking on behalf of organizations or people have this kind of five-nines level of confidence?

SPEAKER_02 37:43

I think we will, yes, we can do that. What it requires is that we are clear about what those five nines mean, right? And that we have the elements in place to make sure that those five nines can be guaranteed, which brings us back to earlier in the conversation, right? Where you need to have some responsibility established, some process established, and so on and so forth, so that you can actually know and measure, and when things do break down, take action and learn so that it doesn't happen again. When you have all of that in place, yes, you can work toward that five-nines kind of metaphor for your AI systems and your agents. And I think that requires, obviously, the right context. No matter how smart the agent is, or what version of the brain is inside the agent, the LLM, whether you're using a frontier model or an open source model from last year, no matter how smart it is, if that brain doesn't have the right context, which could be structured data or unstructured data that is relevant to make its decision at the right time, then it also won't be able to make the right decision, of course, right? Because it's just missing certain information to determine what's right. So I think if you have those elements in place that allow you to systematically and consistently increase the confidence level, and if you have that mechanism in place, what I call context engineering being a business process, right? Then you know your agents have the right context and you have the right scaffolding in place, and then yes, you can get to those five nines consistently and keep them there. But Maxim, if your question is different, if your question is, hey, how can I make the LLM not hallucinate? Well, that's sort of like asking a wheel not to turn, right? The way that an LLM functions is by nature generative, right?
So, by nature, the way that this tool works is that it just generates information; it doesn't necessarily have a mechanism to know whether that information is right or wrong. That's why you need the harness around it, which is the agent or the multi-agent system around it, and the business context in which that scaffolding lives, right? And the context that is provided to those agents. That whole system, yes, you can get that to five nines. But that requires putting other things in place rather than just a thin layer of a chatbot around an LLM.
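
To make the "five nines" figure from this exchange concrete, here is a small arithmetic sketch in Python. It assumes the usual reading of five nines as a 99.999% success rate, and it borrows the "a thousand times per second" decision rate mentioned earlier purely as an illustrative number; the helper name `allowed_failures` is made up for the example.

```python
def allowed_failures(nines: int, events: int) -> float:
    """Failures permitted among `events` at a success rate of `nines` nines.

    Five nines means 99.999% success, i.e. a failure rate of 1 in 10**5.
    """
    return events / 10**nines


MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600
SECONDS_PER_DAY = 24 * 60 * 60     # 86,400
DECISIONS_PER_SECOND = 1000        # illustrative rate, not a measurement

# Classic uptime reading: minutes of downtime allowed per year.
downtime_minutes = allowed_failures(5, MINUTES_PER_YEAR)

# Agent reading: wrong decisions allowed per day at the assumed rate.
bad_decisions = allowed_failures(5, DECISIONS_PER_SECOND * SECONDS_PER_DAY)

print(f"{downtime_minutes:.2f} minutes of downtime per year")  # 5.26
print(f"{bad_decisions:.0f} wrong decisions per day")          # 864
```

Under these assumptions, even a five-nines agent running at that rate would still make hundreds of wrong decisions a day, which is one way to see why being clear about what the five nines mean, and measuring against it, matters.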

SPEAKER_00 40:30

Yeah, I mean, I completely agree with you. I think you answered the question I had perfectly. In consulting, we have the phrase, and we use it already: garbage in, garbage out. So if you have the right framework in place to continuously elevate confidence around the data, and you pointed out some of the foundational elements, data ownership, data stewardship, data governance processes, processes that can continuously enforce that, the output is naturally going to be of higher quality. But I have a slightly different question that I wanted to ask you, Stan. In terms of data, my personal belief is, and I think you said it earlier in the conversation, that data will become, and already is, a competitive advantage and a moat. So from a moat perspective, I see data, proprietary data in particular, and velocity as kind of the two most important moats at the moment. First question: do you agree? And the second question is, where do you think the bigger value will come from, structured data or unstructured? I know that they're both important, but what do you believe is the less utilized asset at the moment?

SPEAKER_02 41:58

Well, first, I do agree, right? Data is a moat, and the one who has that moat can use it in this transformational time that we're in, right? Consider a car maker that has collected data about roads and driving behavior versus a car maker that hasn't. One can deliver a self-driving car and make cars a very different market, and the other just has to suffer the consequences. So that's an example of how data is a moat versus not a moat. On your second moat, velocity, I only partially agree, Maxim, because velocity can be a moat. You can have a business strategy that says we're going to be first, right? Whatever AI means to our business, we're going to be the first in doing that, where you're a category creator, and then you really have to experiment the hardest, right? And accept the most failure. Because you have to learn faster and better than everybody else what the new position is and what the new way of working is, right? So you can have a strategy whereby you say, I want to be first, right? Or in the top three, which means you have to go hard now, and velocity is important. You could also have a business strategy that says, well, I'm a category follower, and whatever the lessons are that the other guys are learning, I'm just going to copy them cheaper afterwards, right? Because they've learned the lessons, I can just copy them. Both are valid strategies; it's just different types of businesses that you set up. But I do agree that in AI, velocity right now is what matters, because it's so transformative. I think we're looking at another three years of significant transformation in the industry, in the world as a whole, I would say. This transformation, by the way, Maxim, is probably equal to the internet coming, right? So imagine pre-internet versus post-internet times.
While the world looks the same in many ways, it also looks very different in many other ways. So I think we'll definitely see another three years of transformation for society at large, I would say. But when it comes to the data aspect, and which is more important, we can chunk it all under the big banner of context, right? All data is context. So you have structured data, which has already been subject to some type of curation, right? Some effort was already put into it to make it better, to make it cleaner, and so on and so forth. Whereas with unstructured data, it's almost like no curation whatsoever has taken place on it, right? Maybe the minimum level of curation is that people put some retention policy on it, to make it disappear. And maybe people were given the freedom on SharePoint to tag it however they want, right? So just put some tags, and of course, nobody really puts a lot of tags. So the level of curation that you would see on unstructured data is minimal to nothing. But it's also probably more than 80% or 90% of an organization's data. They always use the metaphor of the iceberg, right? There's a tip of the iceberg which is above the water, but the majority is below the water, below the visibility level. So I would say the structured data is the part on top, where the organization has visibility. What have I got? Where is it? And even that visibility is a little bit in the fog. But the bigger portion of data is that unstructured data. And those worlds, Maxim, until three years ago, would almost be completely separate worlds. But now, because of AI, which allows you to get more value out of unstructured data, the question on the minds of all chief data officers and AI bosses is a very big one. Help me with this unstructured data problem.
How do I get that big bulk of data and provide it as you know curated context to my agents? That's a big challenge in the industry right now.

SPEAKER_00 46:30

I completely agree with you. And I think that unstructured data hides a lot of nuances that, if explored intentionally, can unlock massive opportunities for companies. One other question that I have for you, Stan, is what advice would you give technical founders who are brilliant builders but are struggling to translate deep technology into enterprise value?

SPEAKER_02 47:10

Well, first of all, if you're a founder in the enterprise market, congratulations on starting your company. Second of all, congratulations on identifying your ICP, right? Is enterprise your market, or is down market your market? Both are fine. And you can live in the illusion that you can handle both of those segments of the market at the same time. You can live there for a while in that illusion, and maybe some outliers can actually handle both of those segments. But in practice, you're better off really focusing and choosing one of those yourself. Now, if your market is enterprise, know that enterprises, by their very nature, are slow, right? They have slow buying processes, which typically span multiple quarters, sometimes even multiple years. And they have those processes in place for very good reasons. Sometimes I have founders asking me, how do I shorten that sales cycle? Well, you can do that in SMB, but you're going to have a really hard time doing that in the enterprise, because it's their process, not yours. The only thing you can do is be ready quicker, know their process, understand it, and try to navigate through it as fast as possible, which typically is a little bit of a game of herding cats sometimes. And cats can be herded, but it's not always that easy. But do know that these large organizations also typically have ways to circumvent certain processes. For example, if you're in some type of innovation bucket, then it becomes easier to get something done quicker. There are probably boundaries on what you can get done, but at least it gives you a way to get in faster. But then again, you're probably going to have to be tied to some sort of innovation bucket. Like if the CEO of a large company says, I need agents now, right?
And you are associated with that initiative, you're probably going to have an opportunity for acceleration. If you're not, then it's very likely that you're stuck in the regular buying process. But ultimately those organizations will bring you in, fast or slow, because you have a unique expertise and you're solving something that is a reality in their organization. And the one thing that I would advise, again, if your market is enterprise: enterprise is never a green field, right? So yeah, it's great if you can, with your technology widget, latch on to the latest clouds or the latest OpenAI wizardry. That's great. But realize that for these large organizations to get value out of that new shiny widget, they have to somehow associate it with their older legacy, right? That might be older technologies, older data platforms, existing workflows, whatever it may be. So that legacy is also a reality in these large organizations. Don't discard it, but potentially see it as a business opportunity.

SPEAKER_00 50:38

Completely agree with you. And the enterprise companies we work with do have innovation functions. Essentially, the question is how do I get into a pilot as quickly as possible? In some cases, their goal is to go from pitch to pilot within 24 hours. But it's not as common. So, Stan, do you mind sharing what your approach was in terms of getting into enterprise?

SPEAKER_02 51:05

Well, I think in our case, the problem of data was just biggest in enterprise. And if you look at the origins of the CDO, again, the research from the Gartner Research Board is pretty good on that. They have a function dedicated to investigating that profession, and it shows very well how the chief data officer role really originated in the enterprise. In other words, Maxim, there was no market for us in the other places at that time. So that's where we just started finding our groove, I would say.

SPEAKER_00 51:47

Yeah, yeah. And I think that advice applies universally to any founder. Find the problem that can be described as hair-on-fire, the kind that is driving the enterprise to make buying decisions. Because what we've seen after working with a lot of enterprises is that yes, the IT budget or the data budget may be growing, but it's a finite budget, and in order for you to find budget, it's typically either displacing something that already exists, or maybe it's incremental, but the incremental gain is small. Stan, just one last question, because I know we're at time. What was the best advice anybody has ever given you in your career?

SPEAKER_02 52:38

Well, I've received a lot of great advice, you know, Maxim. And mostly I've been stupid enough not to always listen to it, but sometimes I have been wise enough to listen to it and act on it. So maybe I'll throw out a few random ones that first come to mind. But remember, there are many other things, many other learnings. It's been 18 years, and you know how in a startup or a scale-up every year counts as multiple lifetimes. So I feel like a dinosaur sometimes. But essentially, one piece of advice to founders that I would give is, please, and I say that with the word please in front, hoping that you'll pay more attention, look at your entrepreneurship journey as a 10-year journey. Don't plan it as a get-rich-quick scheme within the next three years, because you'll only end up fooling yourself. So really look at the next 10 years, the next decade: what will my life look like? Because you'll have your business to think about, but you'll also have your situation at home to think about, and those two need to be in harmony with one another; otherwise, they both end up hurting, I would say. So look at it like a 10-year journey. And if you don't believe me, that's fine. But look at even OpenAI or Facebook, all these big overnight successes: they had been at it for 10 years as well before they started to really be visible on the world stage. So that's one piece of advice. And a second piece of advice is from our chairman at the time, who is still a mentor for us today. He always said, sleep on it. So if you have a big decision, or you're getting an email in that's making you angry, or whatever it is, and you're typing away, like, oh, let me... why don't you write it down?
And he would always say, on a half piece of paper or one side of a page, because brevity brings its own clarity; it forces you to think clearly and communicate clearly. But sleep on it, right? Because the person that wakes up the next morning is a different person from yourself yesterday. If you then look at what you've written down, you'll see, oh, what was I thinking? And then you just edit it a bit and make it better. So: 10-year journey, and sleep on it.

SPEAKER_00 55:09

Completely. I mean, patience and endurance are probably the two values that every entrepreneur should enshrine or adopt, because without them it's really hard to persevere. Everybody sees the success, but to your point, they don't see the hard work that led to that point.

SPEAKER_02 55:27

Of course, Maxim. The word that you use, patience, is a hard one. As an entrepreneur, a sense of urgency is in your DNA, right? Of course that makes it hard to sometimes have patience, but you know what? Do some yoga or something like that. You learn, right?

SPEAKER_00 55:44

I agree, Stan. It was an absolute pleasure having you on the pod.

SPEAKER_02 55:48

Same. Bye bye.

SPEAKER_00 55:50

Thanks. Bye.