IpX True North Podcast

Unlocking AI's Potential Through Data Readiness

IpX - Institute for Process Excellence Episode 44

Tony Cahill, the mind behind Crystal Onyx, dives deep into why data readiness is the critical bottleneck preventing successful AI implementations and shares how his platform transforms months of data preparation into weeks.

• Organizations struggle with data readiness when implementing AI projects 
• The AI project cycle consists of five steps: problem identification, data acquisition, modeling, output, and evaluation
• 42% of organizations identify data availability and quality as a top challenge for AI implementation
• Poor data quality costs enterprise companies approximately $10-12 million annually
• Crystal Onyx creates a virtual representation of data across various storage systems without physically moving files
• The platform can scan millions of files per hour to create a comprehensive map of available information
• A customer reduced their data preparation time from 10-12 months to just 8 weeks using Crystal Onyx
• The system allows for custom metadata, data classification, and governance without disrupting existing workflows
• "AI isn't a solution, it's a force multiplier. If your data goals and processes are broken, AI will amplify that chaos"
• Organizations should align technology decisions with business objectives and start with small, low-risk pilots

For more information on Crystal Onyx, visit crystalonyx.com or email info@crystalonyx.com.


Stay in touch with us!

Follow us on social: LinkedIn, Twitter, Facebook

Contact us for info on IpX or for interest in being a podcast guest: info@ipxhq.com

All podcasts produced by Elevate Media Group.

Speaker 1:

Welcome to the IpX True North Podcast, where we connect people, processes and tools. Hello everybody, and welcome back to another episode of the IpX True North Podcast, where we break down the biggest challenges facing organizations in the digital age. Today we're talking about one of the most frustrating hurdles companies face when trying to adopt AI: data readiness. If AI is only as good as the data it's trained on, getting that data in order is extremely critical. But why is it so difficult? To answer that question, I'm joined by Mr. Tony Cahill, the mind behind Crystal Onyx, a platform that removes the bottlenecks in data management and helps organizations take control of their data and AI strategies. Welcome back to the show, Tony.

Speaker 2:

Thank you very much for having me, Brandy. Great to see you again, and I really appreciate the time and the invitation to join you.

Speaker 1:

I love it. I love having you back on. So, Tony, to help our listeners, walk us through the AI project cycle steps a little bit. It'll get us grounded for the rest of the conversation.

Speaker 2:

Absolutely. You know, typically the AI project cycle is defined in five steps, and it starts with the problem: problem identification. What are we trying to solve? What's our issue? And then, more importantly, what's the scope, and how do we break that down into a use case that's solvable and also creates an outcome we can track? Next is data acquisition. Where's the data we have? Is it in a silo?

Speaker 2:

How do we start to go through it, explore it, analyze it and curate data sets that we can submit into either the AI or the machine learning operations side, which is really your modeling? From that comes the output: what do we get out of it? And then there's the evaluation, the fifth and final step: evaluate what we've done. Did we hit the mark? Do we need to make changes? Do we have to make tweaks? At that point it becomes just an iteration.

Speaker 2:

We iterate and repeat, and you could iterate all the way back to step one, okay, we have to redefine our problem or our use case, or back to step two. You can iterate through any of those steps, tweaking all of the elements to figure out your optimum outcome. This is a challenge I see facing a lot of companies, and as you go through this there are going to be breakdowns. One of the things I thought might be interesting to talk about today is data readiness, because I see that as one of the most challenging issues organizations struggle with. How do you get the data structured? Where is it? Are we getting everything we need? Are we getting the right information and the right data? And we see that this applies both to generative AI and to agentic AI projects.

Speaker 1:

You know, you and I have talked in the past about the data value chain. Do you want to talk a little bit about that? I think it lines us up well for this discussion.

Speaker 2:

Sure, thank you for asking. The data value chain really becomes part of the whole AI project, because it starts with how you capture and identify this data. Every organization has been storing all of their files, all of their data, for years. There's value to it, right, value to the organization. If there wasn't value, why are you storing it?

Speaker 2:

So with that, how do you start to identify how we can convert this from something we just store into an asset that can actually be monetized or has an ROI component?

Speaker 2:

And as you're going through this and applying either machine learning or AI on it, you're creating inference, you're creating outcomes to better understand the data, so you can then get strategic value from it: how to create automation, how to create analysis, how to do different modeling and predictions.

Speaker 2:

So with that, the data itself, those files, can keep metadata with them, information that enriches them and starts to tell the story about what a file actually means. And how it can be applied is where we start to look at the data analytics, the data insights, and how those create value for the organization. It's an interesting concept, because everyone has all of the data; it's how you go through that process that really creates the value chain. One of the challenges is: how do we know we have everything that's relevant? Over 20, 30 years, I have silos and silos of different data that we're either actively using or have used in the past, with a lot of intrinsic value that can be pulled from it. How do you make sure you can easily get access to it? I see that becoming a challenge for a lot of larger organizations.

Speaker 1:

This is perfect. So, talking about the data silos, with the acquisitions companies make, or just going global with multiple sites, it's easy to see how quickly this data can grow and how complicated your environment can get. And it leads us right into the problem statement we want to talk about today: getting back to AI, why do organizations struggle to get started with AI adoption? It's really this core issue: why is it so difficult for companies to get their data ready for AI? Tell me a little bit about that.

Speaker 2:

Yeah, it's really interesting, because AI has this big mysticism, right? It's this shiny thing, and it's really cool. Everyone wants it: we need AI, we want to do it, we know we need it. Okay, so then what? But what we're finding is that once you get your mind around it, the biggest challenge isn't AI itself, or implementing the AI project itself. It's making sure you've got the right data that you're feeding into it. So, that whole idea of data readiness.

Speaker 2:

For an enterprise business, it's typically a high-stakes environment. There can be a lot of good things that come from it, but there can also be a lot of caveats, issues, challenges and possible mistakes that you want to be aware of, so that as you go into it you've got your eyes wide open. But we see that at the heart of the AI project cycle, those five steps really come down to three key things. First off: what is the problem you're trying to solve? If you don't know the problem, you're not going to get a good output.

Speaker 2:

It's the whole idea of prompting an AI: the clearer your prompt, the better the outcome, and that's really important here. The next thing is: what are the data sets you're going to use? And within that data set, how do you know it's everything you need in order to be successful? More importantly, are we missing anything? Or are we doing too much? Because you can throw everything in and then realize, well, we only needed maybe 50% of what we did. So it's really the question: is this what we needed? Is this the right data set? How do we curate it and get our arms around it? It's easier when you're smaller and have less data to go through, but it compounds. Once you have multiple systems, multiple disparate locations and different types of technologies, it can get really challenging to work through. Then the last important piece is: how are you going to monitor, how are you going to test the outcomes, and then make the changes, iterate and tweak, so you get to the optimum outcome you're looking for? What's interesting is that I've been researching this for a long time now.

Speaker 2:

Of course, a long time in the AI project world is probably four months, maybe six months. But what I find in all the recent studies, articles from Forbes, IBM and different consulting groups published on the web, is that 42% of respondents across the board identified data availability and data quality as one of the top five challenges for implementing a successful AI project, and usually it's in the top three, depending on who you talk to. Once you start adding transparency, data governance and oversight, combining those together, it's 100%: these are the challenges everyone is dealing with, and if you don't have a good answer or a good approach for them, you're not going to have a successful outcome for your AI project, and the result is going to be lost time and lost cost. So it behooves any company to create a plan for a successful implementation.

Speaker 1:

Makes sense. So many organizations struggle because they have old repositories that may be hard to access, or data in the cloud that they struggle to utilize if they're trying to pull from it on a daily basis, or many other potential challenges. So, to be clear, AI isn't the roadblock itself in any way. It's the data: the disorganization, the silos, the access, and the understanding of what data they have. That's what companies have to get their hands around before AI can really be effective and prove the value it's promising. Cleaning up your house, per se, right?

Speaker 2:

Basically, yeah. Because if you look at it, data is the foundation of any AI system. If you don't have it, you're not going to get anything out of it, and it doesn't matter if it's structured, unstructured or third-party. But one of the biggest challenges we see is dealing with that unstructured data, the big data. Companies have been buying a bunch of storage, whether it be cloud, libraries, archives, silos, legacy systems. And the AI technology, I'm just amazed at how fast it's been developing. It has gotten to the point where it has the potential to deliver some incredible insights, incredible automation, and some of the new things coming out with agents blow me away. I remember it would take us months to do something that now, with five or ten lines of code, you can get the same result. So it's going to have a very significant effect on how we use technology, how we operate on a daily basis within a company, and how we deliver value to customers. And it comes back to this: if a company doesn't have access to a clean, well-organized data set, you're going to be sitting there trying to figure out how to deal with all the fragmented records, how to get access to them, and it can get costly just on that aspect alone. That's where we identified we could provide value and be of help and service.

Speaker 2:

We also identified another significant challenge customers are facing: how do you integrate the data from various different sources? Again, siloed data stored in disparate systems makes it difficult to compile a cohesive data set. Then you start adding IoT devices, remote systems, remote sites, and all of that adds to the complexity of data integration. Some organizations will say, well, just give us everything and we can sort it out later. You could do that, but it's kind of like saying, hey, I want to clean up my house by buying another house and moving all our stuff over. You end up with the exact same problem.

Speaker 2:

So with the issue of data quality alone, you start to see a cost develop, and you start to ask: geez, as we're going through this, how much revenue have we been losing? How many operational setbacks have we been having? How many gyrations have we had to do to make this work? Just that alone, not even including AI: it's estimated that enterprise companies are typically losing around $10 to $12 million annually due to poor data quality and how they're accessing it.

Speaker 2:

So, kind of cool: taking on an AI project actually gives us an opportunity to go back and clean the house, to really start to make things more effective. And we see that with Crystal Onyx we have the ability to significantly improve the outcome, not just for the AI project cycle, but for the organization in general. The value proposition we're delivering is that we can ensure data quality, significantly decrease the time and cost it takes to do this, and enable the creation of the data value chain. Because look at just that decrease in time and cost: if we go through and take everything and upload it to the cloud, I don't care what provider you want to use, you're basically saying, okay team, go find it, copy everything, move it up there and then we'll figure it out. Kind of kicking the can down the road, in some respects.

Speaker 1:

For sure. And I know some of the numbers we've talked about are quite staggering with regard to what people are paying today, or willing to tolerate in terms of time, if they're using, say, a third-party company or additional resources to try to sort through their data problem. It's just super prohibitive: the timeline and the cost just to understand what data we have, before we can even get to this point, is crazy staggering. I love talking with you about this, and looking at your website and reading about some of the companies whose AI adoption you've really streamlined, getting them there very quickly, potentially going from 10 to 12 months of data assessment and organization down to maybe even weeks for certain organizations. So talk to me: how does Crystal Onyx eliminate these obstacles?

Speaker 2:

Well, it really comes down to the fact that we cut our teeth on this early on, working with high-performance computing centers. We saw there was going to be a challenge: in those organizations, which were kind of a precursor to AI, you had all of these different silos of data that never talked to each other. You had to have one system pull something out, find it, transfer it over to another system, which would then transfer it to another system. It wasn't complex, but it was very manual, right? And when you start doing this across large data sets, it becomes very unruly. So we got an early view of what we saw coming. We worked with organizations like Bosworth doing AI autonomous vehicle testing, and we saw the German Climate Computing Center, the second largest supercomputing center in Europe, being challenged by all the different types of storage environments and what they were trying to do to make them work. We were able to come in and, first off, just scan everything. What's great, and what applies directly as an advantage within an AI project for a company, is that we can go to any storage system they have and scan it very quickly, a million or more files per hour, and start to create a global namespace. That global namespace is just a virtual representation of everything we're seeing within their environment. Nothing's moved, nothing's touched; we just have a virtualization of it. Once that's done, we know everything about the files, we know everything about the storage, it's all there, and you can start running reports. You're able to query and, using our intelligent query system, analyze and identify everything you have without ever having to touch the files. And it's not just a one-time shot; it's not like we upload a snapshot somewhere once. The system goes back, based on the business rules set by the customer, is it once every hour, is it once every day, and keeps updating, making sure we have the latest files, so you're continuing to work with a live model. The great thing about that is we can start using custom metadata, create curation sets of everything that's there, really customize it and break that data down into what's primary, what's most critical, what's second-level critical, what's third-level critical, and even create iteration groups, so when you're testing you can do A/B testing on outcomes. It's a really easy way to do this, and quickly.
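
To make the global-namespace idea concrete, here is a minimal Python sketch of a metadata-only scan. Everything in it (the SQLite catalog, the table layout, the function name) is an illustrative assumption rather than Crystal Onyx's actual interface; the point is only that a scan can record facts about files without ever opening or moving them.

    # Illustrative sketch only -- not Crystal Onyx's actual API.
    # Walk a storage mount and record per-file metadata in a local catalog
    # (a "virtual representation"); the files themselves are never read or moved.

    import os
    import sqlite3

    def scan_into_catalog(root: str, db_path: str = "catalog.db") -> int:
        """Record path, size, owner uid, and mtime for every file under root."""
        con = sqlite3.connect(db_path)
        con.execute(
            "CREATE TABLE IF NOT EXISTS files "
            "(path TEXT PRIMARY KEY, size INTEGER, uid INTEGER, mtime REAL)"
        )
        count = 0
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                try:
                    st = os.stat(full, follow_symlinks=False)
                except OSError:
                    continue  # unreadable entry: skip it, never stall the scan
                con.execute(
                    "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                    (full, st.st_size, st.st_uid, st.st_mtime),
                )
                count += 1
        con.commit()
        con.close()
        return count

    if __name__ == "__main__":
        print(scan_into_catalog("/mnt/archive"), "files cataloged")

Re-running the same scan on a schedule would refresh the catalog in place (the INSERT OR REPLACE), which is the "live model" behavior described above.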

Speaker 2:

And with customers who had been trying to go through and do it themselves: one great example use case is a customer, as you alluded to earlier, that had around three petabytes they were trying to create a data set from. It took them about 10 to 12 months, and they didn't really know what they got. On their second pass, a different project but using the same data set, we were able to go through and get everything completely, and in fact we found they had been missing files because they couldn't get access to them. We found another petabyte's worth, so we actually had four. We were able to go through, get a complete, curated data set, and do it within eight weeks. Their whole effort went from 10 to 12 months with, I don't know, something like 12 people involved, down to eight weeks with three people.

Speaker 2:

So think about that cost alone, and what it means for a successful impact. Does it mean we could do two or three or four different projects in the same time period it would otherwise take us to do one? How much more impactful can we be? It's taking a different approach, thinking outside the box, and realizing that if I have all the data, I don't necessarily need to work with all those files and do all that translation first. Let's just take a snapshot of it, virtualize it, work with the virtualization, the reporting, the queries, take that curation set, get to what we need, and then take our next step, which is: okay, let's approach the modeling. We enable that too, but that's a separate conversation. The first thing, the most important thing, is getting your arms around data readiness.

Speaker 1:

It's just a huge shift, right? I want to reiterate that: it's a huge shift from the experience where people are usually doing this manually, where they've pulled in 10, 15, 20 people, sequestered, to manage it. That's what I typically see whenever there's a project to migrate to a new tool, move a bunch of data, or acquire a company, whatever it needs to be. They sequester a large group of people, and it takes months for them to sort through this stuff. So what I hear you saying, again, is that you've taken an example, and you've lived this, from a 10-to-12-month manual scenario down to eight weeks, with just a couple of people managing the process.

Speaker 2:

And it's unheard of, right, how fast you can do it if you have the right tool sets.

Speaker 1:

These are the kinds of efficiencies organizations are really searching for, right? We usually spend so much more effort for much less improvement, so this one, to me, is quite grand in scale. I'm just really excited we're talking about it.

Speaker 2:

Thank you, and I really appreciate the opportunity and the invitation. What's cool about it is that even though it's a new tool, you're not actually changing anything. We're not saying you've got to do something completely different; it's just going about it a slightly different way. What's really nice is that we're not forcing anybody to change the way they do business or change their organization or their approach. We're just going to help make it easier. And what we've found is that because Crystal Onyx is able to connect all the dots, we create the virtualization abstraction and start to enable that flow of data, and this extends across all these silos.

Speaker 2:

You've got your traditional NFS and SMB data shares, NFS typically for Linux environments, SMB typically for Mac and Windows environments. You've got all the legacy storage environments that go back years, even to mainframes. Also parallel file systems, so Lustre and GPFS, the very fast file systems usually used for supercomputing, for AI, and for any tier-zero critical storage.

Speaker 2:

Then we also extend across to S3 object stores, clouds, multi-clouds, and we're even able to manage tape libraries directly. We had one customer where we were able to go in and, over the course of four days, migrate access to their entire 350-petabyte library archive and make it available to an open system. It was going from a legacy environment to an open-systems environment, and it took us four days. Everyone else was coming in and saying it would take 18 months, 20 months. Well, no: think about it differently, and using what we can do with the virtualization, we can make it available within four days.
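
One way to picture how such different back ends (POSIX shares, object stores, tape) can feed a single namespace is a connector abstraction. The sketch below is a guess at the general shape of such a design, not Crystal Onyx's code; the S3 connector is a stub so the example stays self-contained and offline.

    # Illustrative sketch only: different storage back ends behind one
    # metadata interface, so catalog code never cares where data lives.

    import os
    from abc import ABC, abstractmethod
    from typing import Iterator, Tuple

    class StorageConnector(ABC):
        @abstractmethod
        def list_entries(self) -> Iterator[Tuple[str, int]]:
            """Yield (path_or_key, size_in_bytes) for every object -- metadata only."""

    class PosixConnector(StorageConnector):
        """Covers NFS/SMB mounts and parallel file systems exposed as POSIX paths."""
        def __init__(self, root: str):
            self.root = root
        def list_entries(self):
            for dirpath, _dirs, files in os.walk(self.root):
                for f in files:
                    full = os.path.join(dirpath, f)
                    try:
                        yield full, os.path.getsize(full)
                    except OSError:
                        continue  # skip entries that vanish or are unreadable

    class S3Connector(StorageConnector):
        """Would wrap an object-store listing API; stubbed to avoid network calls."""
        def __init__(self, bucket: str):
            self.bucket = bucket
        def list_entries(self):
            return iter(())

    def build_namespace(connectors) -> dict:
        """Merge every connector's listing into one global namespace."""
        return {path: size for c in connectors for path, size in c.list_entries()}

    if __name__ == "__main__":
        ns = build_namespace([PosixConnector("/mnt/nfs_share"), S3Connector("archive")])
        print(len(ns), "entries in the global namespace")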

Speaker 2:

So those are the kinds of things we can get involved with to really help make a successful and impactful difference. And the thing I like about it is that this is not a one-shot thing. You don't just upload it once. You keep everything where it is, you keep doing your business; we'll come in and just do a scan, and it's a touchless scan.

Speaker 2:

It's very easy, and it's deployed within the environment, so you don't have to rely on a third party or anything external. It can be deployed within a company, within their data center, in their environment: on a bare-metal server, in a VM environment, or even, if you need to, in a cloud compute environment, or all of the above working in tandem together. And it's a multi-threaded, multi-process system, which is really the key here, because it's very fast and we're doing everything in parallel. It's not doing one thing at a time; it's doing 10 or 15 different processes at a time, and, as I mentioned earlier, we typically see a three-node cluster go out and do this at a million or more files an hour.
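
As a rough illustration of why this kind of scan parallelizes so well, here is a toy single-machine version in Python: each top-level directory is walked by its own worker thread, and the run reports an approximate files-per-hour rate. It's a sketch of the principle under simplified assumptions, not the multi-node cluster described above.

    # Illustrative sketch only: metadata walks are independent per subtree,
    # so they fan out naturally across threads (or, in a real system, nodes).

    import os
    import time
    from concurrent.futures import ThreadPoolExecutor

    def count_files(subtree: str) -> int:
        total = 0
        for _dirpath, _dirnames, filenames in os.walk(subtree):
            total += len(filenames)
        return total

    def parallel_scan(root: str, workers: int = 16) -> int:
        # One task per top-level subdirectory.
        tops = [e.path for e in os.scandir(root) if e.is_dir(follow_symlinks=False)]
        start = time.monotonic()
        with ThreadPoolExecutor(max_workers=workers) as pool:
            total = sum(pool.map(count_files, tops))
        elapsed = max(time.monotonic() - start, 1e-6)
        print(f"{total} files in {elapsed:.1f}s "
              f"(~{total / elapsed * 3600:,.0f} files/hour)")
        return total

    if __name__ == "__main__":
        parallel_scan("/mnt/archive")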

Speaker 1:

What I love about what you just said, and I think it's worth reiterating, is that Crystal Onyx gives people the ability to bring this in-house. It's a utility they use themselves. You don't have to have a third party do this. You bring the tool into your own facility, your own people are doing it, and they're able to use and access the data the same way. You get them up and running, and then the intent is that they've got this on their own.

Speaker 2:

Yeah, yeah, and that's the thing, right? There's no reason, once you have it, not to do it on your own. If you want to have somebody else come in and do it, great, it's a great tool. It's a force multiplier. You're not looking to replace anybody; we're just looking to help make things run easier and faster, because really that's what it comes down to: how fast can you do something, with the least amount of effort, in a way that benefits the organization?

Speaker 1:

Always, always.

Speaker 2:

What that means, really, is that you don't come out and scrub it; it's not that you have to clean it up because it got dirty, it's that the data you have applies to what you need, so there's not a lot of fodder in it. The way we go about it, when you're looking at everything, trying to dig through these haystacks and figure it out: you start looking at trends, you start looking at patterns. We get the interesting ability to run these intelligent queries, and part of the capability is not just looking at the file system information or the extended metadata, which is powerful to start with. Let's go even deeper: you start introspecting a file, okay, I'm going to open it, see what's in here, start looking at the trends, and we start identifying what's relevant. It becomes pretty interesting what you can identify.

Speaker 2:

Then we look for everything that has a given attribute: the owner, who created it, what department it came from, what the time frame was, what the project was.

Speaker 2:

All of that becomes relevant to how you classify the data. And that really is it: how do you classify, and then, with that, apply governance? How are we going to manage it? Is this in band, going to be part of our project, or is it out of band, something that shouldn't be part of the project because it includes personal information? We don't want anything that includes people's Social Security numbers, or their credit card information, or HR information that can sometimes get included unknowingly. If you have the ability to use filters, like we do, you can get very granular in how you create this data set, and as we identify it, we group it, using the custom metadata fields, into a curated set, and then fine-tune that curated set down even further.
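
Here is a hypothetical sketch of such a governance filter, with deliberately simplistic patterns for Social Security and credit card numbers. A production classifier would be far more careful; the names and patterns are illustrative assumptions, not Crystal Onyx's.

    # Illustrative sketch only: flag files whose contents match crude PII
    # patterns so they can be excluded ("out of band") from a curated set.

    import re
    from pathlib import Path

    SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
    CARD = re.compile(r"\b(?:\d[ -]?){13,16}\b")

    def classify(path: Path) -> str:
        """Return 'out-of-band' if the file appears to contain PII, else 'in-band'."""
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            return "unreadable"
        if SSN.search(text) or CARD.search(text):
            return "out-of-band"  # excluded from the project data set
        return "in-band"

    def curate(root: str) -> list:
        """Keep only the files that pass the governance filter."""
        return [p for p in Path(root).rglob("*")
                if p.is_file() and classify(p) == "in-band"]

    if __name__ == "__main__":
        for p in curate("/mnt/archive/project"):
            print("curated:", p)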

Speaker 2:

And it's all done easily. It's using the power of metadata, creating unique schemas and fields, so we can build these granular, curated data sets very easily. Again, it only takes one or two people to go through this and fine-tune it, and when you apply it, you're not applying it one file at a time; you're applying it across hundreds to thousands of files at a time to create that set. That's where the quality really comes in: you're getting something that is part of answering the question, or solving the problem you've identified, to create the best possible outcome.
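
Mechanically, "applying metadata across thousands of files at a time" can be as simple as one bulk update against the catalog. This continues the toy SQLite catalog sketched earlier; the schema and field names are assumptions for illustration, not the product's actual design.

    # Illustrative sketch only: tag every row matching a query as members of a
    # named curated set -- one statement touches hundreds or thousands of files.

    import sqlite3

    def ensure_column(con: sqlite3.Connection, name: str = "curated_set") -> None:
        cols = [row[1] for row in con.execute("PRAGMA table_info(files)")]
        if name not in cols:
            con.execute(f"ALTER TABLE files ADD COLUMN {name} TEXT")

    def tag_set(db_path: str, set_name: str, min_size: int, uid: int) -> int:
        """Bulk-apply a custom metadata field to every matching catalog entry."""
        con = sqlite3.connect(db_path)
        ensure_column(con)
        cur = con.execute(
            "UPDATE files SET curated_set = ? WHERE size >= ? AND uid = ?",
            (set_name, min_size, uid),
        )
        con.commit()
        tagged = cur.rowcount
        con.close()
        return tagged

    if __name__ == "__main__":
        n = tag_set("catalog.db", "project_alpha_primary", min_size=1024, uid=1000)
        print(n, "files tagged into the curated set")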

Speaker 2:

Once you've got your final curated data set and know you've got the quality and everything you need, you can use the workflow engine of Crystal Onyx to say, okay, what are we going to do? We're going to start moving files: copy them from where they are to somewhere else. Then we'll go in with what we call an outgest, which is really kind of a checkout, a copy out, and you can copy out to any system. If you're going up to Snowflake or Databricks, which are popular choices, other GPU providers, other cloud providers, or you have something you're going to move into a parallel file system and run through GPUs locally, it gives you all the flexibility you need to take that curated data set, within weeks, and apply it to your solution of choice. It really streamlines the whole project from start to finish.
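
And a minimal sketch of the checkout, or copy-out, idea: materialize a curated list into a staging directory for whatever loads it next, while the originals stay exactly where they are. The paths and function names here are hypothetical.

    # Illustrative sketch only: copy (never move) a curated set to a staging
    # area -- e.g. a directory a cloud loader or parallel file system ingests.

    import shutil
    from pathlib import Path

    def copy_out(curated, dest: str) -> None:
        """Copy each curated file into dest; sources are left untouched."""
        dest_root = Path(dest)
        dest_root.mkdir(parents=True, exist_ok=True)
        for src in curated:
            target = dest_root / src.name  # flat layout; a real tool keeps structure
            shutil.copy2(src, target)
            print(f"{src} -> {target}")

    if __name__ == "__main__":
        copy_out([Path("/mnt/archive/project/report.csv")], "/staging/warehouse_load")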

Speaker 1:

Got it. So you scan and identify all the data extremely quickly, in any repository whatsoever, and then you're using that metadata: you can change the metadata, categorize it, protect certain critical data. You can use that as your mechanism to do anything you need, and what that does is give you the power to make really good decisions with your data: where you want to move it, what you want to do with it, how you want to categorize and organize it for AI or for any other tool, any kind of product you need. And it gives you the ability to automate things, too. If you need to constantly move data coming in from one source, or from one site to another site, you can do that continually, and it's a lot less manpower than the way organizations are doing this today.

Speaker 2:

Correct, and that even includes really sensitive information.

Speaker 2:

Right? Say we've got a bunch of different sensitive data and files and we're creating a data room, where we isolate things because we want to run an AI project without creating any impact. Wonderful: you can actually use the virtualization power of Crystal Onyx to export that as a file system, so you can create a whole virtual data room within your own environment without having to actually build a separate one. It gives you a lot of capability to do things a little bit differently.

Speaker 1:

You said it right there: think differently. That's one thing I really want people to take away from our conversation today, why organizations need to rethink their AI approach. So for companies looking to adopt AI, or starting to dive into this, what's your biggest piece of advice for leaders, or even just managers within an organization, who are looking to use AI?

Speaker 2:

The biggest thing is to make decisions that align with what you're trying to accomplish. Align to the business. Every piece of technology, I don't care what it is, takes a certain approach, and that approach is the way it tackles a problem, which may or may not align with the way the company operates. That includes us, too. The way we approach things is very specific, and we see it aligns for a lot of companies and customers; that doesn't mean it's the de facto standard for everyone. So what I do very quickly, when we first talk to somebody, is try to understand: what are your top three challenges? How can we help resolve them? And very quickly I can tell whether we're going to align very well or merely intersect, and if it's just an intersection, it doesn't make sense. I see this over and over again: trying to force a round peg into a square hole. It takes a lot of effort, you break things along the way, and it might not be the best way. So take a step back and approach it a little differently. The great thing is to ensure alignment, and that includes not just the culture and the operation, but also the people. And then, how do you remove unnecessary steps?

Speaker 2:

One of the big things we look at, part of this whole idea around virtualization, is: why do we need to copy everything out first? Why don't we figure out what's needed, create the data set, curate it and classify it, before we start the move? That saves so much time. It gets away from this idea of data lakes, moving everything up into one lake so you can get away from silos and disparate systems by putting it all in one place. The problem is, when you do that, your lake starts to become a swamp. So what we decided is that we have the ability to virtualize it, create a virtual data lake without ever having to move anything, get very clear on what's there using the tools we have, and then move it up.

Speaker 2:

It's like crystal-clear waters at that point. And we've found that just by doing it this way, companies actually start to move faster, start to leverage AI and all the different projects they're wanting to do, and they don't even have to outsource their data to third parties anymore. If you're really sensitive and worried that these kinds of processes, moving data to third parties and clouds and what have you, always carry the potential for a security breach, or for intellectual property or trade secrets to get out, this keeps it within the four walls. You have a lot of power and control over where and how you implement these third parties. And I'm not saying don't use them; I'm saying let's use them, and use all of this, better, to really get to the outcome everyone wants.

Speaker 1:

Yep, yep. Use each one for its strong suit.

Speaker 2:

Exactly.

Speaker 1:

And make sure you're focused on that. So, we've spent a lot of time on the front side of this, right? Data cleanup and all of that, getting ready to use AI. As we come to the close of this discussion, let's talk a little bit about the final step: closing the AI project cycle loop and starting the modeling step. Now that Crystal Onyx is involved, within a matter of weeks data readiness is complete, the data classification steps are complete, and a quality data set, or data sets, has been curated, and now AI training can begin. So tell me, what's next?

Speaker 2:

Well, before we go to what's next, I know we touched on this, but I would say: don't assume AI will solve all of your problems.

Speaker 1:

Okay.

Speaker 2:

Right. If you think it's a panacea, it's not. If your data is not ready, not quality, not set up and ready to go, and you're just focusing on the AI tools and the cool shiny things, it's going to fail, because it really has to start with smarter data management, and that's really what we were just talking about. The faster you can efficiently clean up, classify and curate, the better your AI outcome and the better your insights will be. And there's a consultant, her name's Leah Pinzer, I came across her about a month ago. She's an AI advisor, a former CDP of Microsoft, and she had a great quote.

Speaker 2:

I loved it. She says AI isn't a solution; it's a force multiplier. And if you look at what's going on, it really is. It's amazing what outcomes you can create if you do it right. She goes on to say: if your data goals and processes are broken, AI will amplify that chaos.

Speaker 2:

And the number one item on the AI project checklist she preaches is data readiness: make sure your data is clean, make sure it's structured, make sure it's accessible. Those are the things we focused on originally. We cut our teeth on high-performance computing, working in some of the most challenging environments in the world, scanning hundreds of petabytes, even exabytes, in a single pass, and then taking action: copying, migrating, doing all the cool things with it, but doing it at scale. So I think, as you start to apply this from an enterprise standpoint, there's a lot we've learned over the last number of years that makes for perfect lessons learned for creating a successful AI project.

Speaker 1:

You know, now you're speaking my language: business processes first, and then tools follow, right? Make sure you've got good ways of working, good best practices, all of that good stuff, and then focus, like we've talked about, on your pain points and your problem statements. Make sure we really understand those well before we go and expect any kind of magic wand to do anything for us. We need to treat it like a true project, scope it out properly, and have clear expectations.

Speaker 2:

Absolutely. Completely, 100% agree.

Speaker 1:

I love it, Tony. We've talked about a ton of things here today and had a lot of good takeaways. If people are interested in learning more about Crystal Onyx, or in having a conversation with you about anything we've talked about today, or just the capabilities of your tool and your solution, where can listeners learn more?

Speaker 2:

Absolutely. Thanks, Brandy. I'd invite them to come visit our website. It's crystalonyx.com, C-R-Y-S-T-A-L-O-N-Y-X dot com, or send an email to info@crystalonyx.com, again, C-R-Y-S-T-A-L-O-N-Y-X. That comes right to my email; I'll see it and respond back. Let us know what we can do to help: any questions, anything we can do, we're here to help. And I just love talking with different people and different organizations, because every conversation is a chance not only to share what we're doing, but also to learn more.

Speaker 1:

Yeah, I highly encourage everyone to do so. I know Tony's been working on some rapid-start programs as well, to help organizations get a taste of the benefits Crystal Onyx can provide. You can do that very quickly and get a little trial, a little pilot, running before you dive deep. That's a great way to start.

Speaker 2:

I mean, it's low risk, right? Because we have this great thing that we've spent years working on, that we've really fine-tuned, and we wanted to open it up.

Speaker 2:

So, whenever you start with any type of new technology, the implementation goes through an evaluation. You're not going to just risk everything you're doing, put something in and say, well, okay, I hope it works.

Speaker 2:

So, thinking through this, I thought, let's come up with a different approach: a rapid start. For a very nominal fee, you get three months: download the software, install it on your system in your environment, whether it's a VM, a bare-metal server, or even cloud compute, whatever works for you, and you'll be up and running within two to three hours. Within another hour or two after that, you've scanned probably a couple million files within your system and can actually start taking steps. So within half a day you're already on your way to data readiness and starting that process of, okay, how do we apply it, how do we take this and start applying our model, and even start playing with it. I encourage everyone to try it out. The Rapid Start program is on the website, or send me an email at info@crystalonyx.com and I'll send you a link to get started, and then we can talk further.

Speaker 1:

I love it. All right, Tony, thank you so much. Appreciate the time today, as always, and we look forward to talking with you again here on the podcast.

Speaker 2:

Great, Brandy.

Speaker 1:

Thank you so much. Great to speak with you. Take care, bye. Thank you for tuning in today.

Speaker 2:

Don't forget to subscribe and review the show, and for more information on IpX, visit ipxhq.com.