AI Proving Ground Podcast

The Data Traps That Are Killing AI Initiatives

World Wide Technology

Nearly half of today’s AI initiatives are dead on arrival. The culprit? The data. From disjointed systems and undefined use cases to cultures that overestimate readiness, WWT data strategists Bill Stanley and Jonathan Gassner break down why so many organizations stall, and detail how you can build real momentum for AI transformation.

Support for this episode provided by: Dataminr

Learn more about this week's guests:

Jonathan Gassner is a Technical Solutions Architect at WWT specializing in data engineering. He leads pre-sales efforts, aligning sales and delivery through consultative engagement. Jonathan crafts executive briefings, workshops and GTM strategies, supporting clients across sectors. He advances AI R&D, mentors future leaders, manages partner relations and drives innovation by staying ahead of data and AI trends.

Jonathan's top pick: Data Maturity Model

William Stanley is a Chief Technology Advisor with nearly 30 years in IT, specializing in data strategy. With an MBA and BS in Computer Science, he aligns data, technology and business goals to drive outcomes. A trusted advisor and innovative leader, his expertise spans IT strategy, architecture and analytics. A lifelong educator, he also teaches a graduate big data course and brings deep, cross-industry experience to every engagement.

Bill's top pick: A Practical Playbook for AI: Driving AI Adoption in the Enterprise

The AI Proving Ground Podcast leverages the deep AI technical and business expertise from within World Wide Technology's one-of-a-kind AI Proving Ground, which provides unrivaled access to the world's leading AI technologies. This unique lab environment accelerates your ability to learn about, test, train and implement AI solutions.

Learn more about WWT's AI Proving Ground.

The AI Proving Ground is a composable lab environment that features the latest high-performance infrastructure and reference architectures from the world's leading AI companies, such as NVIDIA, Cisco, Dell, F5, AMD, Intel and others.

Developed within our Advanced Technology Center (ATC), this one-of-a-kind lab environment empowers IT teams to evaluate and test AI infrastructure, software and solutions for efficacy, scalability and flexibility — all under one roof. The AI Proving Ground provides visibility into data flows across the entire development pipeline, enabling more informed decision-making while safeguarding production environments.

Speaker 1:

Right now, nearly every major enterprise is talking about AI: how to deploy it, scale it, trust it. But according to a new report from S&P Global, more AI projects are failing this year than last year. In fact, nearly half are being scrapped before they even show results. It's not for a lack of ambition or investment. It's the data: the condition it's in, the strategy behind it and the culture that supports it. On today's episode, we'll talk with two of WWT's leading data strategists, Bill Stanley and Jonathan Gassner. Both of them work directly with Fortune 500 companies, navigating the messy, complicated path from legacy systems and siloed spreadsheets to data maturity and meaningful AI outcomes. We'll explore why so many organizations stall on their journey, especially in that awkward middle stage where things are better than before, but still far from optimized. We'll also get into what being AI ready actually means and why building a culture that trusts the data is just as important as the tools that process it.

Speaker 1:

This is the AI Proving Ground podcast from World Wide Technology: everything AI, all in one place. In today's episode, we'll get to the heart of every AI investment right now. Can you really scale AI without getting your data house in order first? Let's get to it. Okay, well, Bill, Jonathan, thanks for joining us on the AI Proving Ground podcast today. How are you guys doing? Doing well? Thanks for having us here, spectacular. We are here to talk data maturity and some of the initial things that our clients can do to get themselves along that path of data maturity. I did read an interesting article a couple days ago. It came from S&P Global Market Intelligence that said more AI projects are actually failing this year than last year. I think it was 42 to 45% of organizations are scrapping AI initiatives. How much of that percentage, you know, without knowing the exact results, do you think might be attributed to data?

Speaker 2:

Who wants to go first? I would guess most of them are related to data and not having the outcome in mind before they get started.

Speaker 3:

Yeah, I think a lot of it is. When you don't have your data ready for any AI initiative and you just throw AI at it, it's going to magnify any data problems that you have. You know, we had mentioned during our AI Day, we had a presentation that says, once you put in an AI program, or whatever software you decide to use, it's going to magnify any data security holes that you have, anything missing. And I think that's probably a good reason why these initiatives are failing. They want that end result, and everybody wants the end result, but there's some prep work that needs to happen to get to that point.

Speaker 1:

Yeah, and we'll get into what that prep work actually is. But I am curious. I hear so many leaders, so many organizations talk about we're AI ready, or we're going to be AI first. Where is that disconnect coming from? Where perhaps the C-suite or executives are saying we are data mature, we are AI ready, but maybe the rest of the organization is saying, well, let's hold on here a second.

Speaker 2:

I don't hear a lot of folks saying that they're data mature. When we look at the data maturity curve and we ask folks to point out where they are, it's usually maybe a two or two and a half. I've really never encountered anyone who says that. But to your point, though, maybe executives aren't aware. That I can't answer.

Speaker 3:

If I was to take a stab at it, I wonder if it's more of a fear of missing out. You know, kind of like everybody else is doing it. So if you do not have, you know, an AI initiative in your company or something, you will be left behind the competition, and, you know, therefore it's going to take even longer to catch up. Yeah, I always go back to data strategy.

Speaker 2:

Right, because that's where I like to focus. But if you think about data maturity, we often think about it as linear, right? It's one step and then the next step and then the next step. But you're always going to encounter disruptors like generative AI. It's a huge disruptor. It disrupts everything, especially your data maturity journey. So often you don't want to wait until you're, you know, stage four or five to take advantage of this new technology. As long as you have a strategy and you have the North Star, that guiding light that you can align those activities with, you're not going to get too far off the journey. But if you don't have a solid data strategy, yeah, generative AI can take you to a place you really don't want to go, and that has to do with the data, right, and not having some end goal in mind as you develop your generative AI solutions. But when you say AI now, most of the time, like the article you mentioned, I'm imagining they're probably talking about generative rather than applied AI.

Speaker 1:

Yeah, yeah, no, absolutely. Well, I'm glad you mentioned the data maturity model that WWT has come out with. A lot of what we talk about here today will be either included in or complementary to that data maturity model. Bill, can you just give us an overview of that tool, why we developed it and just a pitch, so to speak, on what it is?

Speaker 2:

Yeah, it's absolutely necessary, in my opinion. I always have opinions and I'm usually happy to share. It's very important to understand where you are and where you want to go. Then you can define the steps to get there, and this tool that we worked on really helps you do that. It's easily consumable. You can look at it and say, oh yeah, that's where I am. Now, you might not fall perfectly at one level, you might be in between, you might be like a two and a half, but it's easy to see then what is the next step, and we actually broke it all out. These are the next steps.

Speaker 3:

Yeah, and I think it provides a good common language between, you know, anyone that might be coming in or doing an assessment, and also within your organization. You might have one person saying, I believe I'm here, let's say I'm at a four, but when you start kind of pulling back the layers, you realize you're a little bit further behind. At least we have this baseline tool for that as well. Yeah.

Speaker 2:

And there are a lot of tools out there, right, really cool tools, and you have to be cautious, because there are lots of neat things to do and we always want to explore and try new tools. But if you don't have a roadmap, so to speak, or some kind of plan...

Speaker 3:

And I like that. One more thing on the tool, we mentioned this in the article: it all depends on the organization applying it. You might have a really small organization that may only have just a few sources, and they can apply it organizationally as a whole. But if you get a really large company, maybe, you know, a Fortune 500 or something, you might have different business units at different levels, depending on what their individual goals are and what they're trying to do, and that's okay. You can use it that way as well. It's very, you know, multifaceted in that way.

Speaker 1:

Yeah, well, I'll just articulate it a little bit. So we have, you know, levels one through five, one being kind of the lowest and five obviously being optimized, and that's actually what it's called. Level one, initial: siloed data sources on disparate systems. Level two, developing: initial efforts to integrate data from multiple sources, beginning to standardize data. Level three, defined: centralized data platforms that integrate data from various sources. Level four: comprehensive data integration, automation and management capabilities. And then obviously level five, that optimized stage, where you have an enterprise-level data fabric or data mesh. So that's one through five. Bill, where do we see, and I know it's going to be kind of all over the place, but are we seeing a bulk of our clients or a bulk of the industry in those lower stages? Are they starting to creep up?

Speaker 3:

Two and a half is the most common. The most common, yeah. Why? I think a lot of it.

Speaker 2:

You know, if you look at stage one, it's like everyone's kind of getting past that, right, because they have all of these different data sources and, as you were saying earlier, some of them may be more mature than others. But to do that at an enterprise level becomes very difficult, especially going back to data strategy, and that's the last time I'll try to say data strategy, but if you don't have that established, it's very difficult to bring them all up together.
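
For readers following along, here is a minimal sketch, assuming nothing beyond the level descriptions listed above, of how those five levels and the in-between self-assessments like that two and a half can be written down:

# Not WWT's official tooling; a quick reference built only from the
# level descriptions given in the episode.
MATURITY_LEVELS = {
    1: "Initial: siloed data sources on disparate systems",
    2: "Developing: initial efforts to integrate and standardize data",
    3: "Defined: centralized data platforms integrating various sources",
    4: "Comprehensive data integration, automation and management",
    5: "Optimized: enterprise-level data fabric or data mesh",
}

def describe(score: float) -> str:
    """Allow in-between self-assessments like the 2.5 mentioned above."""
    lower = int(score)
    if score == lower:
        return MATURITY_LEVELS[lower]
    return (f"Between level {lower} and {lower + 1}: "
            f"{MATURITY_LEVELS[lower]} -> {MATURITY_LEVELS[lower + 1]}")

print(describe(2.5))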

Speaker 1:

Yeah, but why is it so hard to go from that two to three if you're stuck in two and a half? Where is the stall happening?

Speaker 3:

Yeah, I was trying to think about that last night, actually. I think what happens, you know, when you're in that level one, we'll start at level one, you're excited. You're like, okay, I don't like this, I don't like this, I don't like how this is working, I want it to be better. So you put that effort in and you get to level two, and your immediate needs are taken care of and things are fixed. And then you become complacent, because it's really kind of hard to future plan, and I think that might be just a natural roadblock that people experience. Like, okay, things are working now, it's better than what it was, I could see the benefit from going to level three, but maybe I don't need that right now. And they might kind of get stuck in that cycle. That was kind of something I was thinking about last night.

Speaker 2:

I think another perspective is, when they don't have a business outcome to align their IT activities to, it becomes very difficult. It becomes a technology exercise, and without the outcome defined first it's really hard to meet the objective, because you don't know what you're designing toward. And then sometimes there becomes this overload of, I'm going to get all the data, I'm going to fix all of the data. And, yes, you can't boil the ocean, you have to eat the elephant one bite at a time. All those expressions are true, and that's why we use them. But I think that's it.

Speaker 2:

And if we can bring folks into a methodology like we leverage, where we define what is the business outcome, what are we trying to accomplish with this data, now let's get the data to accomplish that. What do we need to do with that data? And then, and I'm kind of stealing Jonathan's thunder, it really gets into the data engineering. What do we need to do? First of all, does it have the quality, the timeliness? Are there governance risks associated with using the data the way we want to? And then, what do we need to do to combine, refine, enrich that data and prepare it to answer these business problems? So having that and making it a repeatable process to deliver business value, I think that's really the best way to leverage the maturity curve and to step through that process in iterative fashion, rather than, I'm going from step two to step three. But why, yeah, why am I going there?

Speaker 3:

And you know, I think without that North Star initiative, that's where you get stuck at two. You fix, kind of what I was saying, the immediate needs. So it's like, okay, well, that was my goal, or maybe my business unit's goal, and we accomplished it. There was no further thinking. And I'd hate to say top down, because you kind of want that change to happen from within the organization, but sometimes you might need that higher-level pressure, like, okay, as an organization we are going this way, how do we enable that? And that's kind of where each business unit, or whoever's working on that, might have a different approach, but they all know that they're trying to work to that common goal.

Speaker 2:

So, you know, definitely that's a top-down thing in an organization, and that's a good common goal to go for, anything data, really, right? A center of excellence or, depending on the size and shape of the organization, it could just be a team, a team of people, right? But, like you said, you need some kind of executive sponsor or a leader to champion those efforts, data efforts or AI efforts, and then you need business people from across the enterprise, right, from the different business functions, to collaborate, and then you need the delivery folks included in that as well. So when you have a team like that in place, it's a lot easier to iterate through that and really define things that you can execute and deliver on.

Speaker 1:

Well, we talked about kind of working backwards from, or, I guess, working forwards from, the business outcome that you want. How specific do you have to be? Do you have a prescription on how organizations should think about those outcomes, or is it really just thinking about what's best for the business, and then do you go find the data that can help enable that?

Speaker 2:

So for me, I think in terms of performance management. What are the metrics? How will we measure business success? And that's usually very, very specific, right. What is the number? How is the number calculated? What are the variables we need to include to calculate that? That's where it comes from, and yeah, it's usually very, very specific.

Speaker 3:

Yeah, and I think when you're talking along those lines, in customer engagements, that's the thing, but you might not know. So you do a brainstorming session. Okay, here's really kind of what I want. Let's sit down a couple hours, let's just hash it out on a whiteboard, and then you kind of build your little prioritization matrix. Like, okay, I've got some high-priority ones, but these are kind of high effort, so these are later. But you kind of want to identify that really strong, impactful one that's kind of low-hanging fruit, that gets that initiative, that momentum going forward. And in all the customers that I've worked with, that's how they get that success that continues on those multi-phase projects. And you get that data buy-in. It's like, okay, this works, and it really works. I solved a problem, it actually impacted the business in a positive way. Now we can tackle the other ones because we kind of have a process that works.

Speaker 2:

Yeah, that's a good point, and I didn't mean that they come to the table knowing what all of those metrics are and how they're calculated. That's a process. That's working with folks and iterating through that and really asking a lot of questions. Well, why is that important? Or what numbers do you look at to figure that out? Is there a way to bring all those together? People usually don't walk into the room with those answers.

Speaker 1:

Yeah, well, in my head I was thinking to myself, once the business has an outcome defined, I would assume that they would want to then move, or think that they're in a position to move, rather quickly to go tackle that. But there's probably another part, or maybe your team saying, well, we have to go find the data, centralize the data, make sure it's all here, digestible, clean. So what has to happen to that data, from going to find it to making it usable, so to speak?

Speaker 2:

Well, if we could back up just a minute, I like to think of it in terms of five steps, right? There's the data strategy, there's management and governance, there's engineering, then there's applied and generative AI, and then there are analytics and visualizations. Those things don't happen necessarily one after another; they overlap to a certain degree. So often you start at the end, with the outcome in mind: the analytics and visualizations. If you start with a wireframe for the visualization, you say, well, what is the data that you need to see to make the decision? And then I can work backwards. Then data engineering: where is that data? And then the analysis of it. Exactly.

Speaker 3:

Usually in customer engagements that we have worked on, it starts with, you know, we have a critical business report. We'll use analytics because that's a common thing. I know the big topic is AI, but businesses always need reporting. Yeah, and this process is horrible and it needs to be fixed. I need a faster turnaround time on the report, or I need it to refresh quicker, whatever. And you start working backwards and identifying everything that impacts that report. And more often than not, you know, how do you find the data and stuff, that's an exercise usually in itself. If we kind of look at that level one, that level two, there are a lot of maybe ad hoc processes. There are people using Excel sheets. They're using the tools that they had at the time to build this report. They're trying to make it better, but you may not have that complete picture on who's who, you know, names on jerseys. Essentially, who owns the data? How can you access it? And how can you do it securely?

Speaker 1:

We talked about how it's an exercise, a process. How long are we talking here? Days, weeks, months?

Speaker 3:

I would say it depends on the size of the report. Usually when we do engagements, 12 to 14 weeks. Maybe you can go up to 16 or 18, depending on complexity, or the requirements it takes to get into a customer's environment. I know that with several of our public sector and federal customers, things are already locked down; there are background checks and stuff that have to happen, so onboarding can take that extra time. But usually that initial phase, to really kind of wrap your arms around it, is about 12 weeks, and that's how we try to scope our engagements.

Speaker 2:

And that doesn't mean that the next ones have to run consecutively. They can run concurrently to a certain degree, and you do build muscle memory, right? After you do that first project and you get familiar with that methodology of how you quickly deliver value, you should see some return on that investment and shorten the timeline. But again, it depends on complexity, because some efforts are just more complex and they're going to run the full time that he described. But that's probably one of the most frequent questions we get asked: well, how long does it take?

Speaker 3:

Yeah, and I think with that first phase, one thing to keep in mind for anybody listening out there, and internally, is just that onboarding. It's like you're cutting teeth. You're getting those initial things out there, and then once you get that process established, and we're into the customer's environment and we kind of have a good picture of what it is, anything else that we can pipeline along with it does significantly speed up, and time to value improves. You can usually get faster results there because you've done a lot of that initial legwork up front. It's kind of like a really steep cliff you have to climb, and then you can go up the gentle hill from there.

Speaker 2:

So yeah, and the first efforts are foundational anyway.

Speaker 3:

Yeah, exactly, that's always the hardest. Why is it the hardest? You can run into, you know, people might be on PTO. You're trying to get things kicked off. You may not have the right resources identified. Sometimes, unfortunately, you might run into individuals within an organization who just don't want to embrace that change. And it can sometimes take some convincing, like, you know, we're not here to replace you or take your job. We're here to actually make your job better, so you can actually use your individual talents and contribute to the organization, things like that. But more often than not, it's just trying to get the right people in the room and find those names on jerseys.

Speaker 2:

I think sometimes there are even tools and hardware that need to be put in place, yes, that they don't already have set up. So that could be part of that initial foundational exercise.

Speaker 1:

Well, yeah, dive in there a little bit, because, you know, beyond just the data, there's the fact that you need to put in tools, potentially hardware, whatever else it may be. What types of considerations are there, and any details, or maybe not recommendations, but just how you should think about it in terms of implementing?

Speaker 2:

I'll give you the consulting answer: it depends. It's a favorite, and I say that a little in jest, but really it varies greatly depending on the organization you're working with and what they're dealing with. Sometimes it starts with tools rationalization. They have every tool there is and they need to reduce that tool footprint and get to just what's manageable. Sometimes it's out of control, but sometimes there's nothing in place, right? Sometimes you're starting from scratch. Sometimes it's tool rationalization and reducing all that overhead, and sometimes they don't even know all the tools they have.

Speaker 1:

Yeah, yeah. So it's not just a get your data ready to feed the beast, it's get the beast ready to eat the data, yes.

Speaker 3:

I like that. That's good. Get that foundation in place. Yeah, we've done a lot of tool rationalization, you know, and more often than not, you'll get something like, you have an individual or a team of people saying, we're going to do it this way, we'll do it in Azure, but the company's an AWS shop. So now you have two different cloud environments, right, and which one takes precedence? That's, you know, real specific.

Speaker 1:

Yeah. So I guess that would speak also to making sure that you have all stakeholders represented. Because, I mean, can you work both tracks at the same time? Can you be working to get quality data while you're getting your systems and IT stack in shape to digest that data?

Speaker 2:

Absolutely, and there are some great tools out there. You know, I'll mention NeMo, right, that NeMo framework. You can run that in the cloud and you can develop up there, and then when you get your hardware in place and get it all installed and configured, you can bring that code back down and run on-prem. So you can start your development in different areas. There are tools like that that allow a lot of flexibility, absolutely, and folks can take advantage of that. Some tools kind of lock you in a little bit, but yeah.

Speaker 1:

Are you looking for flexibility? Are you looking for power?

Speaker 2:

Most customers are looking for flexibility and they don't want to be locked in.

Speaker 3:

Yeah, I hear it all the time: no vendor lock-in. They try to avoid that at all costs.

Speaker 1:

Yeah, so, well, I do want to touch on the data part too. Right before we started recording, Jonathan, we were joking around: what's your favorite type of data? You said clean data. Clean data, yeah. What is clean data? How clean does it have to be? What does that even mean?

Speaker 3:

I would say clean data, in my view, is something that you can use and that does not cause you friction. Something that's beneficial to you or the organization, whatever it is, that doesn't cause any friction. Because without it you're just spinning your wheels. Without clean data, you don't know: is it right, is it trustable? If I make a decision based on this, especially, let's say, a financial decision, is it right? If you give a report that has two conflicting numbers, which one's correct? And that's why clean data is important to me. Well, how do you clean it?

Speaker 2:

Oh, for me it's consistent data. That's like a pet peeve of mine and has been for a long, long time, the consistency, and a lot of times it goes all the way back to application design. I feel like there's a disconnect a lot of times between different steps within software development and just IT in general. Right, we have software developers, then we have data engineers and then we have data scientists, but we don't often have a good feedback loop. And so you may have a source system, an application that's designed, and when they create one of the input forms, it's free-form. You could type anything in there you want.

Speaker 2:

Now, that's great. It allows a lot of flexibility for the end user. But when you go to do analytics on that, it's extremely difficult to make sense of what was in the mind of that end user when they typed into that box. And sometimes, if it's a required field, they might type just about anything in there. So having that feedback loop from analytics and engineering to say, you know, well, let's restrict it to these couple of values, because these will at least provide some insight into the activity. So consistency for me is probably the big one.
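
To make that feedback loop concrete, here is a hypothetical sketch, the field name, allowed values and aliases are all invented for illustration, of constraining a free-form field to a small controlled vocabulary so downstream analytics can rely on it:

# Instead of a free-form text box, constrain (or at least normalize) the
# field to a controlled vocabulary agreed on with the analytics team.
ALLOWED_REASONS = {"billing", "outage", "feature request", "other"}

# Common free-text entries mapped back to controlled values; the mapping
# itself would come from reviewing what users actually typed.
ALIASES = {
    "bill": "billing",
    "invoice issue": "billing",
    "down": "outage",
    "asdf": "other",   # the "typed just about anything" case
}

def normalize_reason(raw: str) -> str:
    value = raw.strip().lower()
    if value in ALLOWED_REASONS:
        return value
    return ALIASES.get(value, "other")

assert normalize_reason("Invoice issue") == "billing"
assert normalize_reason("DOWN") == "outage"
assert normalize_reason("something unexpected") == "other"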

Speaker 1:

Yeah, well, we talk about how we want clean data, we want consistent data. I think, Bill, I've heard you say a bunch of times that AI fails silently and confidently. Or maybe that was Jonathan. It was Jonathan.

Speaker 2:

Well, it was Jonathan, and I love when he says it. Every time it puts a smile on my face.

Speaker 1:

Yeah. So if that's true, if that's what happens, if it fails silently and confidently, what are some indicators that organizations can look for to at least see that happening as close to real time as possible, so they don't get down the road and have a big fumble?

Speaker 3:

I think, honestly, it's keeping the human in the loop and asking, does this make sense? You can use the common hallucination example: how many R's are in strawberry? You can physically see three R's and it'll say no, there's two. That was a common argument that everyone knew about. I think it's kind of just, you know, does this make sense? Does this sound right? And if you're using something like ChatGPT, for example, you look at the sources it's pulling from.

Speaker 3:

You know, I had this happen the other day. I was doing some research and it made a sound argument, but I went to click on some of the links, the research, the articles that it pulled from, and I was like, this doesn't line up. So I think that when we use the term "it fails silently and confidently," that's what happens: you might just blindly trust it. It is getting better, but I think it's about keeping that, I'd say, fact-checking, making sure that you've got eyes on the final output, like, does this make sense?

Speaker 2:

It's so confident.

Speaker 3:

Yeah, it's easy to fall for it too.

Speaker 2:

Very recently, and I love Google Colab so I experiment with it a lot, I had a little data set, a little CSV, and I created quite a lengthy Python script to go through and analyze this data, and I put it all into ChatGPT. I gave it my data source and my file and I said, validate all of my recommendations and findings on this data. And it came back and was saying all these things, and it was quoting attributes that weren't in my data, and I called it out. I'm like, you're making stuff up. And it was kind of like, yeah, you got me, I'll use your data this time. I'm like, wow. I mean, why not just put some disclaimer in there? And seriously, I was so surprised at the response: oh, you caught me, yeah, I made stuff up. Wow. So yeah, you absolutely have to have a human in the loop and validate it.

Speaker 1:

Yeah, well, one of the questions that I was going to ask was, can AI help get our data in shape to make a more data-mature organization? But if we do still have that trust issue potentially with AI, how much can AI contribute to making this quicker?

Speaker 3:

There are tools out there that can get you started. You know, there are tools that can do identification, like, what is this thing, what does this look like, is this the right format? How do we think about maybe building our strategy and governance around this? Maybe it gets you 75 to 80% of the way, to review, and then you can take it further if you're feeling confident about it and you really trust the AI. But I would venture to say they can get you started, kind of like a template, almost, that gets you started. Like, okay, this is kind of what we're looking for. It accelerates that initial time of the data discovery, building the governance framework, the whole nine yards. I'm like, okay, I've got a pretty good picture of my data. Let's start here, yeah, and then go from there.

Speaker 2:

I feel like a lot of those tools are just on the edge, though. They're just starting to appear. They're not very mature yet, but, yes, the future is extremely bright with using AI tools to discover and categorize and label and help us define our data asset and organize it. This episode is supported by Dataminr. Dataminr delivers real-time alerts on high-impact events to help you stay informed and respond promptly. Gain actionable insights with Dataminr's AI-driven alerting platform.

Speaker 1:

Well, as we start to implement some of those tools that'll help us theoretically move faster, faster isn't always necessarily better. If we start to see progress, or what we perceive to be progress, is there a caution that needs to take place with the business side of things? Like, oh, we've got progress, let's keep moving as quick as we can. What's the balance there with speed versus solid maturity?

Speaker 2:

Yeah, I think you always have to be cautious. You have to be careful what you feed it. So have it do some discovery, but even what you're going to let it do discovery on, you probably need some manual review of what you're going to let it consume.

Speaker 3:

And I would say manual review, probably from any of your stakeholders, business leaders, someone that intimately knows the problem, the business thing you're trying to solve. So let's say you're trying to do a financial report. Okay, from a tech side, it's really easy with these tools to feel like, yes, we're making progress, whew, this saved me a lot of time, I can go forward faster, like you said. But a data engineer may not understand the end reporting. So having that person there that's like, no, this is wrong, and here's why it's wrong. I'm like, ah, I can see that, and we can go back and fix that. The data engineer would go back and fix the pipelines, and we would also work hand in hand with strategy to make sure anything in the strategy and framework that needs to be updated gets updated. But usually for your reporting it'd be a BI analyst, an executive, those kinds of people.

Speaker 2:

I think there's a delineation between applied and generative AI, and I put data analytics and applied AI kind of in that same general area. There's a lot more data engineering that needs to happen, right, to prepare and clean and prep the data. Where generative AI focuses more on unstructured data, you still have to be cautious with what you feed it, but that's more document review and making sure that the documents that you're letting it have access to have quality. So there really is a different approach depending on where you're headed, if you're focused on applied or generative AI.

Speaker 1:

Yeah, I mean, it seems rather intuitive about structured versus unstructured data, but can you give me a little bit of why each one might be important or what the value is moving forward?

Speaker 2:

Sure. There are absolutely use cases for structured data, and that's going to be more of your analytics and your applied AI, machine learning, deep learning, things like that. Large language models are going to focus more on your semi-structured data, or unstructured data, if we wanted to really segregate that. Semi-structured being documents, and the definition really being that the data defines the structure, right? Whereas with structured data, we build a data warehouse, we build tables and create relationships, and then we conform the data to that structure; we make it fit. Where with semi-structured data, the data itself defines the structure, and large language models lean toward that, right, because those are more based on language and the way we speak and communicate. So when we feed it a document, really all it's doing is looking for the relationships between the words. That's why it fails with such confidence, because if you really dig deep down in there, it's just saying, oh yeah, this word should come next. It makes sense.
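
As a quick illustration of that distinction, here is a minimal sketch, the table and document contents are invented examples, showing data conformed to a predefined schema versus a document that carries its own structure:

# Illustrative only: the same record as structured vs. semi-structured data.
import json
import sqlite3

# Structured: we define the schema first and conform the data to it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, total REAL)")
conn.execute("INSERT INTO orders VALUES (?, ?, ?)", (1001, "Acme Co", 249.99))

# Semi-structured: the document itself defines its structure, and two
# documents in the same collection are free to differ.
doc = json.loads("""
{
  "order_id": 1001,
  "customer": {"name": "Acme Co", "segment": "enterprise"},
  "notes": "called about delivery window"
}
""")
print(doc["customer"]["name"])  # structure is discovered from the data itself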

Speaker 1:

Yeah. How should organizations approach their data estate in general? I mean, we're producing more and more data every single day; the number seems to only be going up. Do we need to prioritize maturing all of that data, or a certain section of that data, and get it mature in a prioritized fashion? How should we be thinking about going about the whole thing?

Speaker 3:

I mean, I would still tie it back to the whole, you know, use case. It's a favorite word of everybody.

Speaker 3:

What is your end goal? Because, like I said, we generate a lot of data. Some of it might not be useful or even needed. So if you apply the whole concept of, we're going to get all my data mature, which is an admirable goal, you can, and you might use that data for something else. But if you're trying to get to, I'm trying to train a model or do something specific, you want to kind of trim it as you go along, from raw to curated, and that way anything feeding your models is clean. It keeps the size down, it keeps your cost down, because you're also having to pay to store it, generally. So, yeah, I would go as far as, and it might be controversial, but not all data has value.

Speaker 2:

Yeah, I would absolutely say that. So focus on where the data value is. And, you know, applied AI, machine learning, that's been around for a while, and I remember when we first started having this data explosion a while back, there was this thought that, well, all of my data has value, so I'm going to just start sifting through my data, have my data science team do that and look for the value. Where are the patterns we don't recognize? And a lot of those projects didn't end well. So again, having the outcome in mind is critical, and not all of the data has value.

Speaker 1:

Well, yeah, what data doesn't have value? A couple of examples, maybe, that come to mind?

Speaker 2:

So if the data doesn't have quality or consistency, I would say, yeah, you would hope that you could get some value out of that data, but you might not be able to. And then there may just be additional attributes that you really don't perform analytics on, that just don't add a tremendous amount of value. Like, if you think about the data science process, really the first step is EDA, exploratory data analysis, and the first thing you do is trim off those excess attributes that aren't going to add value to the model, and that's more in the machine learning context.
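
As a small worked example of that trimming step, here is a minimal sketch, using a made-up DataFrame and assuming pandas is available, that drops constant and mostly-missing columns before modeling:

# EDA-style trimming: drop columns that can't add signal to a model.
import pandas as pd

df = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "region": ["east", "west", "east", "south"],
    "record_source": ["crm", "crm", "crm", "crm"],  # constant: no signal
    "legacy_code": [None, None, None, "X"],          # mostly missing
    "annual_spend": [12000, 8500, 15200, 9900],
})

constant_cols = [c for c in df.columns if df[c].nunique(dropna=True) <= 1]
sparse_cols = [c for c in df.columns if df[c].isna().mean() > 0.5]

trimmed = df.drop(columns=list(set(constant_cols + sparse_cols)))
print(trimmed.columns.tolist())  # ['customer_id', 'region', 'annual_spend']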

Speaker 3:

I think if you look for a good concrete example, let's say you're trying to do AIOps or some type of IT asset monitoring and you're gathering logs. Most of those logs probably aren't going to provide any value. You might need to know how many times someone logged in, but a lot of system events that pop up are just normal. You could imagine if you captured all the logs and had to sift through everything on your screen every time you logged into a computer, signed into a website, plugged a USB flash drive in, loaded a Steam game, you know, whatever. That's really daunting and wouldn't provide any value to you. But if you had a hardware crash, that's important and you want to catch that. So I think it's just filtering the noise.
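
In code, that noise filtering can be as simple as the hypothetical sketch below, where the severity levels and event names are invented for illustration:

# Keep the events worth acting on, drop routine activity.
KEEP_LEVELS = {"ERROR", "CRITICAL"}
KEEP_EVENTS = {"hardware_crash", "disk_failure"}  # assumed event names

events = [
    {"level": "INFO", "event": "user_login"},
    {"level": "INFO", "event": "usb_device_attached"},
    {"level": "CRITICAL", "event": "hardware_crash"},
    {"level": "INFO", "event": "app_launched"},
]

signal = [e for e in events
          if e["level"] in KEEP_LEVELS or e["event"] in KEEP_EVENTS]
print(signal)  # only the hardware crash survives the filter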

Speaker 1:

Who's determining whether or not it has value? Is that where that center of excellence kind of comes in, having all the stakeholders able to plug in when they need to?

Speaker 2:

Yeah, absolutely. And what are we trying to do? What is the end goal? What are the metrics? What are we really trying to measure? So, as you were talking, I was thinking of sensors, right? We were talking the other day about sensor data, and you're collecting all this data. Do you really want to save every second of data coming from that sensor, or, if there's a failure or an anomaly, have a certain amount of time leading up to that and a certain amount of time afterwards? Otherwise, it's herculean trying to process through all of that regular data.

Speaker 3:

I mean, you might find a nugget of gold in there, but even if you're trying to capture something like sensor data like that, so much of it is going to cost you. Those sensors are producing readings every few milliseconds; just the sheer volume of it. If you just kept it all in raw form with an all-data-is-useful-to-me mindset, that's going to eat up your storage costs real fast. And then trying to process all that data, yeah.

Speaker 3:

And that's a specific example, but it's a great example of something where things might be going fine. You might just look for the anomaly, and you just need to kind of maybe capture five minutes before and five minutes after. I'm not personally in manufacturing, but that kind of idea.

Speaker 2:

Yeah, like your dash cam. Exactly, that's a good point. The dash cam is a good example of that. What does the dash cam do? The dash cam in your car just records over itself continuously, and then if it senses an impact or something, it saves the five minutes before and the five minutes after. Yeah.
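
That dash-cam pattern translates directly to sensor pipelines. Here is a toy sketch, with invented readings and an assumed anomaly threshold, of keeping only a rolling window in memory and persisting it, plus a short tail, when something interesting happens:

# A toy version of the dash-cam pattern for sensor data.
from collections import deque

WINDOW = 5   # readings kept "before" (stand-in for five minutes)
TAIL = 3     # readings captured "after" the anomaly

buffer = deque(maxlen=WINDOW)   # old readings fall off automatically
captures = []
tail_remaining = 0
capture = None

def is_anomaly(reading: float) -> bool:
    return reading > 100.0      # assumed threshold for illustration

for reading in [12, 14, 13, 15, 16, 140, 18, 17, 16, 15, 14]:
    if tail_remaining:
        capture.append(reading)
        tail_remaining -= 1
        if tail_remaining == 0:
            captures.append(capture)
    elif is_anomaly(reading):
        capture = list(buffer) + [reading]   # the "before" window + the event
        tail_remaining = TAIL
    buffer.append(reading)

print(captures)   # [[12, 14, 13, 15, 16, 140, 18, 17, 16]]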

Speaker 1:

Yeah. Is that starting to get into, like, an ROI on data maturity, or are we talking about ROI in terms of the actual use case of what the data is feeding?

Speaker 2:

I think it's both.

Speaker 1:

How do you articulate data ROI on the data maturity, then, when it's just kind of sitting around ready to be used?

Speaker 2:

Well, that's the key, right, that not all data has value. I feel that data ROI can be difficult to measure, and again, I feel like I'm starting to repeat myself, but going back to that use case: what is the business value that I'm providing with that data and the solution I'm developing? If I can really point to it and say that metric is impacting the business, it's impacting our place in the market, it's making us a leader in the market, and we can equate that to real business value, then that's almost what you have to do these days, rather than just say, oh, I've got some cool tools, I'm going to build something and see if it adds value.

Speaker 3:

Yeah, I think that's how you get ROI on your data: because it supports the initiative. I need ROI on this initiative, and I have ROI on my data because it supports that.

Speaker 2:

That's another statement people like to debate, but I like to say that data is the most valuable asset, and it's something you have to leverage to be competitive in the market and remain relevant. There are so many companies that pop up, so many disruptors, and they're new, they're nimble, they're agile, they grow up in the cloud, they're able to leverage their data assets. So you want to make sure you're competing with that.

Speaker 1:

Well, another aspect of the data maturity model, at least to get past those levels one and two: there are a couple of statistics that say, you know, perhaps organizations are not as data-driven as they may like to think. Where do we see kind of the industry in terms of a culture of data, and how do you start to drive that adoption?

Speaker 3:

I think organizations are starting to realize that early on. You know, we've talked to key customers, especially, where it's like, okay, my data is just not ready for AI, but they're realizing that. So, like, okay, to make this work, we need to change from within and we need to have a different approach to it. You've been collecting this data for years, processing it, working with it, but to really make it useful, we have to kind of change that, the data-driven culture. And I think having, you know, your champions, some of those wins that we talked about earlier, kind of showcases the importance of it and why it works. Like, okay, whew, this made my life a bit easier, my boss is happy, I want to do this again, kind of mentality.

Speaker 1:

And is Gen AI enough to push that forward?

Speaker 2:

Oh, I don't know that it's Gen AI. I think it's just data maturity in general. It's the analytics, it's the applied and the generative AI. You have to be able to trust the data. That's where I think generative AI is especially tricky. It really can help with your productivity, but you still have to check it. But for just data analytics and applied AI, to become a data-driven company and have a data-driven culture, people have to understand they can trust the data, they can trust the numbers, they can trust the data source. And it doesn't have to be all of the data immediately, but when they understand, I can trust that data source, I can trust this one, and that kind of world of data is expanding for them, that's for me the most critical first step: that I can trust that data and I can rely on it to make decisions. It's not going to come back on me.

Speaker 3:

Yeah, and I think if you were to expand that just a bit, you take that first step and then you kind of start building on, I can trust the data, and, as we talk about later in the article, it gets a bit more self-service. So anyone in a large organization who's like, I need to find some information, I don't really know where it is, well, maybe now I know I have a spot for it specifically, so they can do kind of their one-off reporting or something that's necessary without having a formal process. But again, they trust the data, the data-driven culture, and they're like, okay, this is how we're going to do it. This is important.

Speaker 1:

Yeah, well, we're getting up the maturity model here. Bill, describe for me that stage five, the optimized state. I'm going to think that it's probably a ways in the future still, but what does that look like within an organization, and what's possible when you're at that state?

Speaker 3:

Oh my gosh. I call that our data nirvana.

Speaker 2:

It is. Exactly, yeah. And I think it's a journey; it's not like you're going to get to stage five and say, oh, we're done, the journey's over, we've reached nirvana. It's continuous and it just doesn't end, right? There's always care and feeding and there are always new tools. But that's when you're optimized to leverage your data assets and make informed decisions. You can trust the data. People have access to the data when they need it, in the way they need it. You have a lot of automation that's providing that data in the timely fashion that it's needed, and you're able to leverage the most cutting-edge tools, the AI tools, to help you find additional value from your data assets and look more toward leading rather than lagging indicators.

Speaker 3:

Yeah, and I don't think Bill means that in a bad light, where it's like, you know, you're never done. It would be nice to hit that final goalpost of, ta-da, we've made it. Touchdown, yeah. But at that stage, your processes work. You've ironed out a lot of the kinks. As technology changes, because it will, you're ready to ingest the new data. You've already got the teams in place, you've got the frameworks, you have it all nailed down. So there's communication and it just kind of comes in naturally, and that enables you, like I said, to use leading indicators, to adopt new technologies that come in, because your data's ready for it.

Speaker 3:

So in a sense, the goalpost is unfortunately always moving at level five. You know I wish it was just an aha moment. We made it, we can relax and take a vacation. Maybe we can do that for about a month and just enjoy ourselves that we've worked so hard to get there. But you want to make sure that you kind of stay vigilant and your organization's data stays ready for any new technologies that come down the pipeline.

Speaker 1:

Yeah, well, that's it. I mean, I love that you say that, you know, as new technologies come down the line, whether it's something as near term as agentic AI, or AGI or quantum computing. Yeah, exactly. How does that change how data strategists think about data readiness?

Speaker 2:

Tough question. How does that change how we think about it?

Speaker 3:

I can chime in if you want; I have an idea. I think it prepares their mindset, and bear with me for a minute. You know, if we look back at the 80s and 90s and how data was managed, it's a lot different than today. But if we've got something like quantum computing coming down the pipe, way down the pipe, the data might look different. But you've already, as an organization, been exposed to what that process looks like, so you're really not starting from scratch. The data structure may be different, I have no idea, but you know at least that you are prepared to handle those challenges, because you've been through a lot of that legwork beforehand and you've kind of made a well-oiled machine, essentially.

Speaker 2:

I was trying to envision how the data strategy might change. I guess that's the way I took your question, and I don't know that necessarily it would. I think there are really two main choices in terms of data strategy, at least right now, and that could change. But it's fabric or mesh, and I don't think that would change. There'd be new technologies to help you get there, but the ultimate goal is that the data is available, it's ready to be consumed. Yeah, there may be new tools, but I don't know that the strategic direction necessarily would change. Either it's the fabric approach, where we're getting everything in one place, or it's decentralized in a mesh.

Speaker 1:

Yeah, because I know this is something that we've brought up before during AI Day, which, you know, for listeners out there, AI Day is kind of a traveling roadshow where we're offering expertise around AI and practical AI and how you can start your journey. But from a data fabric, data mesh standpoint, walk us through some of those details, why it's important and what it will offer potentially, you know, that optionality.

Speaker 2:

Yeah, really, they both seek the same thing, and that's to bring the data to one place where it's governed and secured and controlled, and people can come and access the data. It's discoverable, it possesses the quality and consistency and timeliness. The difference in the approaches is that with data fabric, you seek to bring all of the data together in one place and curate it and process it and deliver it that way, where data mesh is distributed, it's decentralized. Think of a multinational organization with very different business operations. Those businesses are probably mature; that company's grown through acquisition over time. They have the knowledge of the business assets internally, they manage their own data. But at the enterprise level, you still need to be able to do analytics. So if they're serving the data up as a product to a mesh layer, there you can perform analytics across those disparate businesses.
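
To picture the mesh side of that, here is a hypothetical sketch, every name, field and path is invented for illustration, of a business unit publishing its data as a product that an enterprise-level catalog can discover:

# Each business unit publishes its data "as a product" with an owner and a
# contract; the enterprise performs analytics across those products.
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    name: str
    owner: str                 # the "name on the jersey"
    location: str              # where the serving layer exposes it
    schema: dict = field(default_factory=dict)
    freshness_sla: str = "daily"

catalog: list[DataProduct] = []

def publish(product: DataProduct) -> None:
    """Register a domain-owned product with the enterprise mesh catalog."""
    catalog.append(product)

publish(DataProduct(
    name="emea_orders",
    owner="EMEA commerce team",
    location="s3://emea-commerce/orders/curated/",  # illustrative path
    schema={"order_id": "int", "total": "float", "currency": "str"},
))

# Enterprise-level analytics can now discover products across business units.
print([p.name for p in catalog if "orders" in p.name])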

Speaker 1:

Yeah, that makes sense. We only have a couple more minutes left. It's been a great conversation. Based on all the things that we have talked about, are there any, not necessarily small, but quick and very impactful moves an organization can make right now to make sure they're at least on the right track, if not making very tangible progress?

Speaker 3:

If I had to pick one, it would be extremely helpful to get everyone started just asking the data questions: who owns it? Where is the data located? Maybe what technology stack it's supported on. Just getting a lot of that preemptive information up front, if you're at that stage and you just don't have that information, can save you a lot of time on an engagement, and it can also get you to time to value a lot faster. And I would lean toward starting on a data governance and data dictionary framework as well.
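
Those data questions can literally be the first columns of a data dictionary. Here is a bare-bones sketch, with made-up dataset names, owners and locations, of what capturing that information up front might look like:

# The seed of a data dictionary: answers to "who owns it, where does it
# live, what stack supports it" captured before any engineering starts.
inventory = [
    {
        "dataset": "customer_master",
        "owner": "Sales ops",                     # who owns it
        "location": "postgres://crm/customers",   # where it lives (illustrative)
        "stack": ["PostgreSQL", "dbt"],           # technology supporting it
        "access": "request via data governance team",
        "notes": "two conflicting revenue fields; needs reconciliation",
    },
]

def missing_owners(entries):
    """Flag datasets where the 'name on the jersey' is still unknown."""
    return [e["dataset"] for e in entries if not e.get("owner")]

print(missing_owners(inventory))  # [] once every dataset has an owner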

Speaker 2:

I would say just get started with the use case. It's so very important, and I can't say it enough, that you have to know what the outcome is, what you are trying to achieve, and with that you can get started. But yeah, that's it. I love IT and I love tools and I love technology. But just to go explore with tools, or build it and they will come, the Field of Dreams approach, I don't think necessarily works out very well.

Speaker 3:

Unfortunately, it doesn't. I've been on a couple of engagements where, you know, it's kind of like, we don't have what you mentioned, nor do we have the things I just talked about, we don't have any idea. It was like, well, we've bought this tool, we bought this platform and we're going forth regardless. So we're building the train track as we're riding the train down the line, and eventually you just run out of track. You hit some show-stopping roadblock, and you have invested all this time and money and resources and you just don't know what to do. And usually, unfortunately, projects like that just get scrapped, and that not only kills momentum for that project, but it likely kills a larger AI initiative within the organization.

Speaker 3:

Yes, because they're left with that bad taste in their mouth. They're like, this didn't work for this small thing. How is this going to work for AI?

Speaker 1:

Yeah, yeah. Well, Jonathan, Bill, thanks so much for joining us here on the show today. We'll have you back again sometime soon. This was a great conversation. Awesome, thanks for having us. Okay, thanks to Bill and Jonathan.

Speaker 1:

Here's what we learned today. First, data maturity is a journey, not just a checklist. Most organizations aren't as far along as they think, and while the temptation to rush forward with AI is strong, without a clear data strategy it's easy to lose your footing and fall behind. Second, success starts with focus. A single, well-defined use case, one that ties directly to a measurable business outcome, can spark the momentum needed to go further. Without it, even the best tools can lead nowhere. And third, culture matters. Becoming a data-driven organization means building trust in the data itself, in the teams preparing it and in the processes that govern it. The bottom line: AI doesn't fail because it's not powerful enough. More often than not, it fails because the data feeding it isn't ready.

Speaker 1:

If you liked this episode of the AI Proving Ground podcast, please consider sharing it with friends and colleagues, and don't forget to subscribe on your favorite podcast platform or check us out on WWT.com. This episode of the AI Proving Ground podcast was co-produced by Naz Baker, Cara Kuhn, Mallory Schaffran and Stephanie Hammond. Our audio and video engineer is John Knobloch, and my name is Brian Felt. See you next time.

Podcasts we love

Check out these other fine podcasts recommended by us, not an algorithm.


WWT Research & Insights

World Wide Technology

WWT Partner Spotlight

World Wide Technology

WWT Experts

World Wide Technology

Meet the Chief

World Wide Technology