SNIA Experts on Data

Introducing SNIA Storage.AI

SNIA Episode 25

SNIA Chair, J Metz, introduces SNIA Storage.AI™, an open standards project for efficient data services related to AI workloads. Storage.AI will focus on industry-standard, non-proprietary, and neutral approaches to solving AI-related data problems to optimize the performance, efficiency, and cost-effectiveness of AI workloads. Hear about the industry leaders who have combined forces to solve AI-related data challenges.
Learn more at snia.ai

About SNIA:

SNIA is an industry organization that develops global standards and delivers vendor-neutral education on technologies related to data. In these interviews, SNIA experts on data cover a wide range of topics on both established and emerging technologies.

Speaker 1:

All right, everybody, welcome to the SNIA Experts on Data podcast, and we've got an exciting day today because it's launch day. It's a big announcement, and one that I think is both surprising and not surprising. We've seen a lot of buzz around the subject, and really what is most important is why we need to go back to the foundation, revisit this from the bottom up, and really think about what the net effect is. So we're going to jump right to the fun part, and I'm very happy to be joined here by J Metz. For folks that are brand new to me, my name is Eric Wright. I'm the co-host of the SNIA Experts on Data podcast. So, J, let's jump in and let's talk about Storage.AI, which is super cool. Tell me what it is. When in doubt, we always have to start there. So what exactly is Storage.AI?

Speaker 2:

Storage.AI. I love this conversation. One of the greatest things about what I do, and this is the fun part of my job, is that I get the chance to talk about the really new, cool things that we're doing. One of the things that has been an interesting development over the last couple of years, as anybody knows, is the fun, fancy stuff we can do with AI. Making AI videos, making AI music, all that kind of stuff is fun on the user side. On the back end it's a little less fun, because on the back end somebody has to actually create that stuff, create the videos, and make people happy.

Speaker 2:

The underlying architecture is not quite so straightforward. As a matter of fact, one of the biggest problems we have in the AI world is getting information into the places where it needs to be. So what Storage.AI is, it's a way of doing that. It's a way of getting all that data into the processors and into the right places at the right time for the right moment. And so what we're looking to do is have an open ecosystem where all the different companies that are working on these problems can come together, talk it all over, and have a solution that everybody can sign on board with, and effectively make life more efficient and more useful, quite frankly. But ultimately, what we're trying to create is a nice, ubiquitous solution for getting data to where it needs to be at the right time for the AI workload, and that is not as easy as it sounds, and certainly wasn't as easy as just saying it out loud.

Speaker 1:

Yeah, it's always interesting because we see there's a ton of early innovation that just drips off the tongue now. We refer to so many different companies and brands that, if you had talked about them five years ago, no one on the street would even know who they are. But now it's really kind of ubiquitous and part of what we do every day. But it is funny; I think the industry got there way before a lot of the vendors and the platforms, hardware, software, cloud, et cetera, that are going to be running this stuff. And looking at how SNIA has always been so important in really getting, I'll say, a cooperative way of doing it. It's cooperative, but it's also competition. It's an interesting thing, because these are folks that are industry competitors, but we come together as a community because we all move faster when we move towards and with a standard. So, looking at AI in general, how have you found where it is in the world versus where the standards bodies are coming to the fore now?

Speaker 2:

Yeah, let's do that. Let's take a step back and realize what this stuff really is, right? Because the problem is that if you don't already know the problem, you're not going to be able to understand why this is a big deal. And realistically, no one individual can have a good scope of understanding of all the different things that are going on. When we talk about industry standards, or the organizations or the companies that are putting all this stuff together, it's all very nice fluff, but the reality is that there are problems, and one of the biggest problems is that the understanding of the problem is a problem in and of itself. So if you talk about an AI workload, for those people who don't really understand the AI workload, they don't know that it's not actually one giant monolithic workload, right? It's not one thing. If you're talking about your laptop, it's not Microsoft Word, it's not Mail, it's not Outlook, it's not your web browser. That's an application, that's a workload.

Speaker 2:

When we talk about the workload for AI, it's considerably more complex, and it actually winds up being a series of workloads rather than one big thing. Whether you're talking about training or you're talking about inference, it's broken up into multiple different workloads, and the problem is that getting the data where you need it to be means that sometimes you have to do it in certain ways, and sometimes it's in different formats, and sometimes it's in different structures and different locations, and it's complicated. And then you zoom in even further and you start to realize, well, dang, man, that's a hard thing to get from point A to point B without going through X, Y, and Z, right? Oh, I'm sorry, you're Canadian: X, Y, and Zed. So the issue here is that when you break it down and say I need to get the data from here to here, it then has to go from there to there, and there to there, and there to there before you could ever start doing the work, and that's the thing that people don't normally get. There's no one way to do that. There are a thousand ways of doing that, and they're all unique, oftentimes they're proprietary, they're all difficult, and it's getting to be overburdensome, right? So when we talk about this stuff, what we're trying to say is, how do we line up the dots? How do we make sure that we're not doing all these detours for every single thing that has to go along the line? Because when we start talking about AI, people will talk about GPUs until the cows come home, right? They'll talk about networks.

Speaker 2:

I talk about networks, right? We talk about all these things that are component parts of the architecture as if it's this nice compressed box, and all you have to do is make the box really nice and efficient. It doesn't work that way. That's not what winds up happening. So let me just give you an example.

Speaker 2:

I need to have a GPU process the data, but the problem is that the GPU doesn't talk the same language as the data that's stored on a drive; let's just say it's on an NVMe drive. I have to convert that data into a structure that the GPU can use. Ah, but it's not just that: the GPU doesn't have a direct connection into that data. It's got to go through a CPU, and that's assuming that you've got it locally inside of a server. What if it's not locally inside of a server? You've got to go through a CPU, then out through a network interface card, then out over the network into another network interface card, and then into another storage unit, oftentimes object or file or something along those lines. So you've got a control plane and a data plane just to start the communication in the first place. And, depending upon where you are in the process, you could be doing a lot of reading, you could be doing a lot of writing, it could be sequential, it could be random; it's not all the same all the time. And so you have these multiple personality disorders going on inside of a server with these different processors and the networks and the storage.
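
To make those detours concrete, here is a minimal back-of-the-envelope sketch in Python. The hop names and latency figures are illustrative assumptions, not measurements from any real system; the only point is that every extra control-plane and data-plane stage adds time before the GPU can start working.

# A rough sketch of the data path described above. All latency numbers
# below are assumed, illustrative values (seconds), not real measurements.

HOPS_LOCAL = {                    # data on an NVMe drive inside the same server
    "nvme_read": 80e-6,           # assumed NVMe read latency
    "cpu_bounce_buffer": 10e-6,   # staging through host memory via the CPU
    "pcie_to_gpu": 10e-6,         # copy across PCIe into GPU memory
}

HOPS_REMOTE = {                   # data on a remote file/object store
    "nvme_read": 80e-6,
    "cpu_bounce_buffer": 10e-6,
    "nic_out": 5e-6,
    "network_transit": 30e-6,
    "nic_in": 5e-6,
    "remote_cpu_and_fs": 50e-6,
    "pcie_to_gpu": 10e-6,
}

def time_before_gpu_can_start(hops: dict) -> float:
    """Sum the per-hop latencies paid before any compute can begin."""
    return sum(hops.values())

if __name__ == "__main__":
    local = time_before_gpu_can_start(HOPS_LOCAL)
    remote = time_before_gpu_can_start(HOPS_REMOTE)
    print(f"local path : {local * 1e6:.0f} us per transfer")
    print(f"remote path: {remote * 1e6:.0f} us per transfer")
    # Multiply by the millions of small reads in a training run and the
    # detours start to dominate the time the GPUs actually spend computing.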

Speaker 2:

And orchestrating this stuff is non-trivial, right? Excuse me. So what we're trying to do here is saying, look, how do we get this to work properly in the first place? That's number one. Number two, how do we catalog all this stuff to make sure we understand exactly where the inefficiencies are? That's also non-trivial, because I could do some of the processing in a CPU to make sure that I get everything where it's going. I can move some of that into a DPU on the network side, so now I've got to split that processing just to route the work. I could offload things into other accelerators for memory movement, and so on and so forth.

Speaker 2:

And that becomes really problematic, because as you start to increase in scale, you have other problems you have to deal with. Power is a big problem, right? They're talking about creating nuclear power plants just to run AI. When you're talking about that much power to run a workload, a single workload, you have to make sure that it is as efficient as you possibly can get, because that's a lot of power. One watt of power per CPU across up to a million nodes is a big deal. That's a million watts of power, right?

Speaker 2:

But then what happens at scale when it comes to errors, right? One of the things that's really noticeable is that the bigger you go, the more inevitable it is to have failures. That's why we do checkpointing, right? There are so many errors in a normal system at that scale that people have to checkpoint on a very frequent basis.

Speaker 2:

Well, if you're checkpointing those very expensive processors, which can run up to $50,000 apiece, and you're talking about thousands, tens of thousands, and even hundreds of thousands of these things, that's an expensive proposition if they're not actually being used, right? If they're waiting for all of this movement to happen before they can actually get started, that's an inefficiency. It's a waste of effort, it's a waste of time, it's a waste of energy, and it increases the likelihood of error before you actually get to a completed workflow. So these are the problems that we're trying to solve, and there's no one way to do it. Everybody's got an idea of the way it should be done, but not a way of making sure that everybody's playing by the same rules so that you can take a single plug, plug it in, and know that it's going to work.
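
To put rough numbers behind that, here is a small Python sketch of the idle-accelerator arithmetic. Every figure below is an assumption chosen only to illustrate the shape of the problem, not a vendor price or a measured utilization.

# Back-of-the-envelope cost of accelerators sitting idle while data
# movement and checkpointing happen. All inputs are assumed values.

gpu_count = 10_000                   # assumed cluster size
gpu_price = 50_000                   # assumed cost per accelerator (USD)
amortization_hours = 3 * 365 * 24    # assume a 3-year useful life
idle_fraction = 0.20                 # assume 20% of wall-clock time waiting
                                     # on checkpoints and data movement

cost_per_gpu_hour = gpu_price / amortization_hours
fleet_cost_per_hour = cost_per_gpu_hour * gpu_count
idle_cost_per_hour = fleet_cost_per_hour * idle_fraction

print(f"fleet capital cost per hour: ${fleet_cost_per_hour:,.0f}")
print(f"capital burned while idle:   ${idle_cost_per_hour:,.0f} per hour")
# Power tells the same story: at roughly a kilowatt per accelerator,
# 10,000 idle GPUs still draw megawatts while they wait for data.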

Speaker 2:

What we're doing at SNIA is trying to accomplish the really Herculean task of saying, you know what, it's very important that we all get together and have a good conversation about what's actually required and how we need to solve these problems. Because sometimes I need to move memory, sometimes I need access to the data from the GPU or a TPU, sometimes I need good communication over the network, sometimes I need direct access from one memory location to another memory location, and so on and so forth. You have to have a lingua franca to be able to have that conversation. That's what we're looking to do, and that's where we're tapping into the enthusiasm of a lot of different companies right now. Wow, that's a long way of saying what the problem is.

Speaker 1:

Yeah, well, that's exactly why this is so important, because just out of that section of challenges we could unpack ten podcasts. And we will. Well, that's the beauty part. Would you ever have imagined that we would just assume that someone else is handling that next thing? Like, oh, we've got data movers that are already happening at the controller layer, where they've got CPU and OS optimizations. You're like, yes, but that's for a particular type of workload that doesn't act the way this software does, right?

Speaker 1:

Well, now the fundamental part is no longer fundamental. We don't have assumptions that this is going to run on a single architecture. We have multiple chip architectures: GPUs, TPUs, DPUs. That's a lot of PUs. And let me tell you, P-U, when it comes to trying to guess which one you're going to be using.

Speaker 1:

If you're writing the application, it has to make assumptions that this stuff will be where it needs to be when it needs to be there, or it's going to be punishingly slow. And remember the latency thresholds; the latency cost that we're paying is so much larger. Because before, it'd be like, ah, you've got a web transaction, you're literally doing CRUD operations. Let's face it, 90% of applications were CRUD operations. We were doing very simple reads and writes and updates to databases.

Speaker 1:

Yeah, but now we've got combinations of structured data and unstructured data. We've got RAG, so we've got combinations of the existing model plus additional external structured and unstructured data. It's going to live in a thousand different places. It's going to be plugged in by APIs, and we're going to have MCP servers all over the place. Great, now we know where everything is, but how do we get it to where we need it, in a way that's the most efficient and effective to perform this task, at the lowest cost, at the highest efficiency? And yeah, it probably would look like an intractable problem, or at least a stack of intractable problems. This is the Intracta-Stack. There's a lot to sort out here. Sounds like a new startup.

Speaker 2:

Well, I mean, I think one of the things is that the traditional architectures of compute, network, and storage are what we've always been working with. That's what we've been playing with. We've been trying out different topologies, trying out different ways of handling the network, handling different data structures. Key value is a new possibility that people are trying to approach, but the difficulty is, how do you get this to work in existing traditional environments? Key value and POSIX don't go well together. So, at the same time, if you take a step back and you think about it for a second, we have an opportunity here to really rethink the problem, right? Because we're talking about a completely different type of cluster, a different type of networking, a different layout of the way that the problem has to be resolved. You've got your data, you've got your compute, and you've got the transit ways of getting there. Up until recently, we've not had a lot of choices in the granularity between any of these things. You either had compute here, you had storage there, you had network there, and that's what you had to work with. But now I've got compute and processing and I can put it anywhere. I can put it on the network, I can put it right next to the data, I can split it out into multiple different areas, I can have multiple cores or compute units solving different problems simultaneously. There's a parallelism issue that is not inherent in the traditional mode of compute, network, and storage, right? So if I have this workload, sometimes it's better if I do the processing at the data location itself, so that I don't have any I/O at all, right? Well, that's not good for the entirety of an AI workload, but it is very good for some parts of it. I may want to do all my pre-processing and pre-training modeling using that compute near the data, right? Or I might want to have additional processing on the network endpoint. That's a possibility.

Speaker 2:

So if I open up my imagination for a second, I can say, well, what if I put the processing where I need it, as opposed to putting the data where I need it? Because right now I'm just pushing all that data around all the time, and you've got bottlenecks. But what if I put processors where they're supposed to be, closest to the data? Well, that means I don't have to worry about the time it takes to move data from one place to another, and that means I don't have to worry about the idleness, and I can reduce my worry about the errors and how they're going to affect my workload.

Speaker 2:

Now I'm becoming much more efficient because I'm rethinking the nature of the problem. But what do I need to do that? I need better data movement. I need better data locality. I need better processing near the data.

Speaker 2:

Am I going to be doing it in high-bandwidth memory, or am I going to be doing it in normal memory? Am I doing it with a shared pool? Because now I've got clusters of GPUs acting like one big one, right, that have to be able to communicate across some sort of transport network. And where do I actually put the tools where I need them? There's no one way to do that right now, and I'm not saying that the way Storage.AI is supposed to address that problem is going to be the one way to do it. It's not the one ring to rule them all. But what it is, is a way for people to actually get together and have a conversation about at least some way they can line up their shots, and that's what Storage.AI is supposed to be able to do.

Speaker 2:

It is a thematic approach to solving a workload problem from the data perspective, from the lifecycle of data: where does the data need to be, and when? And let me rephrase that: where does the data need to be, not where do you want it to be, right? Because if the data needs to be here, I can move a processor there now, whereas before the gravity was going the other way around. And so I think, ultimately, that's one paradigm. It's not the only paradigm, right? There will still be a lot of opportunity for more traditional approaches, because the workloads are built that way, and we don't want to ask the software guys to rewrite their software, right? So that's a big thing. We want the workloads to continue to work as intended.

Speaker 1:

Right, right, we don't want to. It's sort of this classic thing. We saw this with observability. People are like, oh, easy, and you had a team of developers going, no, no, we've done what we were supposed to do.

Speaker 1:

The sprint said to do this user story. It didn't include observability, or data placement. It's just like we make a set of assumptions that all of you are sorting this problem out. We've got SREs, you know; I assume that kid has the knowledge. Meanwhile that kid is just praying that everything stays up.

Speaker 1:

We've programmatically solved a lot of these things with automation, but none of them are really latency-critical. There's always this idea that we're going to send it out there, we'll fire a few containers out, so they may take a little while to spin up. No big deal; we'll handle it in the queuing, we'll handle it in the caching, just so we don't drop the session, we maintain state. Okay, cool, lots of assumptions that don't care about sub-millisecond latency. The stuff we used to think about, like synchronous data connections, you know, sub-10 milliseconds, metropolitan area networks; those are baby problems compared to what AI introduces as the frequency of change of where data needs to be. As you said, not where we want it to be, but where it needs to be to be optimal. And all this other stuff of access to CPU, GPU: they assume uninterrupted access to everything, because interruption could mean the process stops.

Speaker 1:

Start over, you know. And when you're training a model that takes days to do, you can't just go, oops, we rebooted a couple of systems on Thursday night by accident, so we've got to start over. You're like, no, no, that's not an option.

Speaker 2:

Yeah, no, you're spot on. The real question about this is that when you get into what people have normally considered to be black boxes, the opacity of not being able to see what's going on inside your compute cluster, if you're a compute person or a software person, you don't typically think about how the processor does what it does. Most of the time you don't care; you just sort of want it faster and better and bigger. That's what you want, right? However, the people who really care about this want to eke out every single possible bit of performance they can, because the nature of the game is different, right?

Speaker 2:

So now we've got all of these different things working in parallel, and the bottlenecks are in the relationships between devices. Let me give you an example. I have a CPU that, let's just say generously, has 200 cores. That's a good, beefy processor, that's a good big CPU. I've got a GPU that has 15,000 cores. Now, that 200-core CPU and that 15,000-core GPU are not going to have a one-to-one relationship between the communication of those cores. But if I've got a GPU that needs to access that data, and I have to go through the CPU to be able to do it, I have to pin, at some level, the relationship between what the cores are going to be asking for and what the cores can provide. The transport is going to do the best it can, right? But if I need parallel ways of communicating with the storage itself, the data itself, do I want an ever-narrowing valve that has to be opened to suck in the data? And then, of course, you have to go into the kernel and you have to come back out.
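
As a quick illustration of that mismatch, here is a tiny Python calculation. The core counts and the queue depth are assumptions used only to show why a host-mediated path turns into the narrow valve described above.

# Illustrative core-count mismatch. The counts below are assumed values,
# not specifications for any particular CPU or GPU.

cpu_cores = 200        # assumed "good beefy" host CPU
gpu_cores = 15_000     # assumed accelerator core count

fan_in = gpu_cores / cpu_cores
print(f"each CPU core fronts ~{fan_in:.0f} GPU cores' worth of data requests")

# If every I/O bounces through the host CPU and kernel, those 200 cores
# become the valve: the wider the gap, the more requests queue behind the
# CPU-mediated path instead of reaching the data in parallel.
queue_depth_per_cpu_core = 32          # assumed outstanding I/Os per core
parallel_ios_possible = cpu_cores * queue_depth_per_cpu_core
print(f"host-mediated path caps out at ~{parallel_ios_possible:,} in-flight I/Os")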

Speaker 2:

You know, we have all kinds of magic that we do inside of the operating system to handle the I/O part of it, and so what we're looking to do is solve those problems. Right? How do I handle a non-uniform way of accessing data? We talk about NUMA all the time, but what's a non-uniform way of accessing data that is not part of the same chip cluster, right? That's a non-trivial problem, and you need the cooperation of the processors and the data and all the vendors of the above, right? And what happens if you don't have this locally? You've got to go over a network to do it. How do you do that? Well, that's why we're going to be working with the partners for SNIA, right?

Speaker 2:

Ultra Ethernet is a really good example of this. If you're going to do file-based storage or object-based storage over RDMA, that implies you're going to be able to do it over Ultra Ethernet too. Right? So that's not SNIA's job. SNIA does the file and object part, but the RDMA part, the Ultra Ethernet part, that's a networking group, right?

Speaker 2:

So it's very important that SNIA work with these other organizations to build an end-to-end way of solving these problems, and fortunately, SNIA and Ultra Ethernet have a very good working relationship. So it's not just where does the data need to be, but where does the work need to be. That's why having an organization like SNIA, with its long-standing reputation and its established work streams, is a really good place to handle the storage and the data services, and then work with groups like UEC and OCP, and DMTF for management, and NVM Express for the protocols, and so on and so forth. Right tool for the job, right organization for the job. And so that's why we're doing what we're doing, because if we succeed at what we're setting out to accomplish, everybody wins. It's the rising tide lifts all boats, right?

Speaker 2:

Now, obviously, some of this stuff is going to take a long time. Some of the stuff is already there. Many of the projects we're working on have been established inside of SNIA for a long time, but we haven't looked at them from the perspective of being part of a bigger workload. That's the new part: creating an entire community of vendors and people and partners and academics and organizations to solve a workload-specific problem. That's new. That's a new thing for us, and so we're very excited about that in particular.

Speaker 1:

Well, and it's really funny that, as we started off, we thought of this idea that this is brand new, it's an announcement, it's a big thing. What it really is, is the culmination of decades of collaboration, effort, and building standards-driven innovation. And the reason why it's so important that it lives in this SNIA world is that between Ultra Ethernet, SDXI, CXL, and all these different folks that are going to be involved, we've got what's around data storage, we've got data security. You talked about the idea: where do we put the data? Okay, cool. When do we move it? Okay, second problem. Where do we move it to? Okay, another problem. How do we secure the data, or protect the data, or provide resiliency and multi-path, you know?

Speaker 1:

Like you said, we've had multi-CPU for decades at this point, yet it took us until not too long ago to really use all those CPUs and cores. How many applications have you seen written in enterprises where it's like CPU one, 98%; cores two, three, four, five, six, seven, eight, all the way up to 16, 0%? Congratulations, you've just underused the hardware in the worst way possible. And now, when we think of what we saw with stuff like TPM, when stuff was moving, what we believed to be securely, between memory, cache, and CPU, it was like, oh wait a second, somewhere in the middle that could get hijacked. And so how do we make sure that we can actually protect it and understand the path of transit, securing the path, securing at rest, securing... it's many, many things.

Speaker 1:

And security: it's why we always asked, why do you have to call it DevSecOps and not just DevOps? Because when we called it DevOps, no one invited the security team. We had to be overt about it. And again, it's one of those things we just assume. We've made this assumption that someone's taking care of security.

Speaker 1:

Well, now we've got all of these multi-purpose groups in a single working community, who then also can go, hey, you know, we're also working on this. Right? It's no longer your storage, your CPU, your GPU, and your security. Now it's your storage, and we're working with security and network. Everybody's got to collaborate together, and the only way you can do that is to have, like I said, a lingua franca. You have to have some kind of a Little Orphan Annie decoder ring, for those older folks like myself. Drink your Ovaltine, yeah. So it's a great time to see this, and while it may seem out of the blue to many, this has actually just been lying in wait, and now it's becoming visible: the work that's happening, the collaboration that has been occurring, and now that can occur at a much faster rate in public.

Speaker 2:

Yeah, it's like the actor who's taken 15 years to become an overnight success.

Speaker 1:

Right, exactly. And that's such a perfect example, because then you see that person. You're like, remember when Mr Robot came around, and suddenly the guy from Mr Robot is in, like, 10 different movies? Because, you know, when Mr Robot was written and produced, it was four years earlier, so they knew it was going to be good, and he's got an agent who's been shopping him around. So this is what's been going on while AI is this brand-new thing. You're like, well, AI has an agent, not that kind of agent, and it's been out there shopping around.

Speaker 1:

So AI now has a little bit of everything, just like the guy that sells cranberries. Whoever the cranberry salesman is, that person has been getting cranberries into every drink known to personkind. I hope somebody's on commission for that. Making it efficient, secure, and not negatively affecting the users and the overall ecosystem: we're really creating a protection scenario as well, because we've got power, we've got data protection, we've got security, storage, retention, all this stuff, like what data stays and where it can go. There are so many questions being re-asked that we thought we had just solved in enterprise computing, but everything changed.

Speaker 2:

Well, you also put your finger on something, maybe inadvertently, but I think it's very important to zero back in on it. You said people think, well, somebody should do this, right? Somebody should solve this particular problem. And why isn't somebody doing that? Why isn't anyone thinking about this? Why isn't anybody working on these problems? And the short answer is, because sometimes the problems are very big and very complex, and no one somebody can do it. The data problem is a very real problem, and somebody should do it, and we are that somebody. We are the ones who are going to be working on a very complex and very ambitious project to make sure that the data does not get overlooked or become an afterthought, or something along those lines. And so SNIA is the somebody who's looking at that problem.

Speaker 2:

I've been personally reaching out to as many companies as possible: the hyperscalers, the storage vendors, the processor vendors. I've been on all kinds of calls with people, and I hope, if they see this or listen to this, they realize I've been consistent in my message here. But I honestly do think that we have a moral and ethical responsibility to do things correctly, not just from the content perspective, but also from the end user's perspective. And I think that when we start trying to take shortcuts in getting these things out the door so that various companies can make the money that allows them to operate (nothing wrong with that, I'm a big capitalist), sometimes we forget that we build in problems, we build in technology cul-de-sacs for the future.

Speaker 2:

Now, I don't think that's a good thing when you get to a certain stage. I don't think that when you're talking about a million inputs, when you're talking about building nuclear power plants to solve one AI workload problem, it is sufficient to say, well, oops, we kind of backed ourselves into a corner. The waste and the inefficiency become very, very real, and it's on us as architects, as technologists, to at least attempt to address that problem, so that in the future we don't have to completely undo all of this work, and so that all the money that's spent and all the effort and man-hours that have been put into this aren't wasted. I certainly want all the work that I do to actually mean something in the future, and I think a lot of people do as well.

Speaker 2:

But even more importantly than that, I do think that when we come up with a solution, it's not just about the hype cycle. It's really about solving the problem. And so, from a personal perspective, it's very important to me that we actually do something that has import.

Speaker 1:

Yeah, because long after the memes, the standards will remain. And while we, culturally, are latching onto whatever's current, active, and most front of mind, and the local problem that we have to solve, I think Jim Keller said it well. He says, as developers we solve extremely hard problems so that we can then introduce the next really hard problem. That's the goal. The goal is to solve something so well that it feels solved, but then you find a new problem. And this is it. Now we understand how these systems work. LLMs are everywhere, we've got all this innovation going on, but this is the next hard set of problems to solve, and the fact that we're solving them as a community matters. When you're in a room of people that are like-minded, with like goals, as parent supporting companies, all these member organizations win when someone in the room goes, hey, I just figured something out, and they solve a problem that we've been looking at for a long time. That is the value of innovation as a group, and of standards-driven innovation, because sometimes standards come about because we find them, and sometimes we create the standard and then we develop into it. I'd say that AI has effectively started.

Speaker 1:

You know, we saw MCP just show up out of nowhere on a Thursday. MCP was defined by Monday. It was just like, dude, are you even running? If you're not running an MCP server, what are you even doing? It didn't take long before it was, what's wrong with you, man? Geez, really. And then you think, do I really throw myself at this? Because in two weeks are people going to be like, oh, MCP was so last month, we've got MCPI now? So some things get discovered, and we're building in public, which is kind of cool, but then people are losing sight of what's actually being built on. Did you ever see... oh?

Speaker 1:

I'm sorry, I interrupted, please continue. Oh no, no, it's just that you really said it: all the outcomes are already being solved. We know what we're solving, and we're solving the problems that make those outcomes faster, secure, reachable, extensible.

Speaker 2:

Yeah, yeah. When we were talking, you reminded me of an old clip of Steve Jobs talking about the laser printer. Did you ever see that clip?

Speaker 1:

This one sounds familiar now, but run it through me.

Speaker 2:

So he was doing a town hall, as he often did. This was back in the mid-nineties, back when OpenDoc was a big thing, and one of his engineers got up and braced him.

Speaker 1:

I think I know this one. Oh, here it comes. It's one of the famous ones, yes, yes.

Speaker 2:

And he said, okay, look, why are we not following the industry standard on OpenDoc? I did not expect the answer that Steve gave. Steve was like, one of the things about being in charge is that you're going to get really good, difficult questions like this, and when we start to make decisions, you have to make very difficult decisions. There are going to be really good technologies, and they're going to solve a lot of different problems. And he was talking specifically from Apple's perspective of selling a product, and I'm thinking more in terms of selling an idea, but the principle is the same. So he said, look, when we started working on the laser printer, the Apple laser printer, which, for those people who may not be aware, was the first affordable laser printer for businesses that didn't cost hundreds of thousands of dollars, he said, we've got really good technology in there. We've got PostScript in there, we've got a lot of hardware stuff, they've got chips in there, all really good technology. Were there better ones? Possibly, right. But when you saw the output of the laser printer, he said, this I can sell.

Speaker 2:

I can sell this. I can't sell OpenDoc. I can't sell PostScript. What I can sell is this. This is what's important. I believe Storage.AI is something similar. I can go through each and every one of the technical working groups that we've got, and I can talk until the cows come home about how this particular solution is great and is going to do everything you want it to do, plus make you lunch, right?

Speaker 2:

The reality of it is that if it's divorced from the workload, or if it's divorced from solving a problem, I can't sell the PostScript. I can sell the laser printer. I can sell, conceptually, the idea of Storage.AI. I can sell the fact that these things are working together to bring a result to people who have a problem; that goes beyond whether or not this bit moves to that register. And I can say consistently, repeatedly over time, reliably, that if you do this, you will get the result that you want. And I think that's the thing people want more than anything else.

Speaker 2:

They don't want to have to reinvent the wheel every single time I come up with a new processor, every single time I come up with a new memory format, every single time I come up with a new NAND format or a new networking protocol. I want to make sure that the thematic approach, just like that laser printer, is going to solve the problem again and again and again. That is what we're bringing to the table here. I'm not just bringing SDXI, I'm not just bringing computational storage. I'm bringing the conceptual analogy to the laser printer in Storage.AI. And me, I'm okay with that. To me, that's a win, 100%.

Speaker 1:

Yeah, and this really is it. You couldn't do it without being able to tap the existing, active working groups. It is such a natural extension of what already existed, so it couldn't have landed in a better community as far as I'm concerned, and it also couldn't have come at a more important time. Every day is a great day to be in technology, and an incredibly challenging day to be in it as well.

Speaker 1:

So, yeah, all of this stuff we were trying to solve with software-defined storage; then we griped over the terms and all these things, and then we've got network storage and network transports, and all of a sudden we've got containers running on disk controllers that are actively doing compute near your data.

Speaker 1:

And that's been around for decades. It's just that it's never been done consistently, at scale, knowing that we have the potential to reach the outcome. Like you said, if you went to the drawing board and said, if we design a laser printer, it's going to be able to do this amazing thing; here's a mock-up of what it is. Just go to all these different tools, and you can go there today. Just do a Google search (didn't mean to mention a brand name), just do a generic search and say AI generation, whatever tool it's going to be, and it's there. It's working today, because they've tried to individually solve a really strong problem. But how are they solving it? Cash flow; that's what they have to solve. So, what's the best place for folks to get more information about what just came out, how to get involved with Storage.AI as an initiative, and how to dig in and get their happy little AI-loving feet wet on this fantastic working group?

Speaker 2:

What a visual you just put in my head. I'm going to let the AI filmmakers get away with that one. So obviously, snia.org is the best way to go for all things SNIA. We are going to have a lot more information over the coming weeks and months, especially as we go through and out of conference season in the fall of 2025. We also have snia.ai as a new landing point for where the Storage.AI work is going to be. Over time, you'll see that fill out with more and more information. Obviously, the SNIA Developer Conference is going to be a really good resource, and if you can't make it live, then over time the videos will be available on YouTube. So there's a lot of material you can find from SNIA and the snia.org family of sites, and of course, there'll be more posting on LinkedIn and on X, formerly Twitter. I still call it Twitter as well, without question.

Speaker 1:

And I think the people that are in the organizations are obviously aware of the collaboration capability, but I love that more and more smaller companies and startups have a chance to participate. It's not onerous or incredibly costly to be involved. Everybody wins, at the very least by going in and tapping the community for questions, to understand what's being done, because they really are an amazing group of fantastic humans, like yourself, who are spending their hard time and hard yards on that early innovation and discovery and research, so that we know, when we go live, that we're working towards that optimized goal. And security folks, get in, because this is one of the best times to be in there.

Speaker 2:

The security folks, the energy efficiency folks, the power folks all have a very strong play in this.

Speaker 1:

Yeah, and this is not a Bill Hicks-style, oh, let's throw some cost savings on top of it, let's put some sustainability on there. You're like, no, we don't want the marketing people to get ahold of this. We're building sustainability in reality. I mean, as a marketer, I can safely say that, but no marketers were hurt in the context of this conversation. But that's it. People need to get in.

Speaker 1:

Every aspect is being discussed, and these are the people doing the work. These are not (God bless my fine friends who are pundits) the pundits who are having fireside chats on every AI conference stage about the ethics of AI for the 438,000th time and saying nothing, coming to the same conclusion, which is, we should really talk about this more. Wow, good luck with that. So, while everybody's talking, stuff's getting done, and you've got to be at SDC to make sure that you're there doing it with them. We'll have links, of course, in the show notes. And of course, don't forget, there are other amazing conversations like this one with J and other amazing humans who are part of the SNIA organization and other contributing and partner orgs. We've met with folks from all sorts of organizations in the past. So, SNIA Experts on Data; J Metz, thank you so much for taking the time, and if people do want to find you, what's the best way to do that?

Speaker 2:

Probably through LinkedIn. My name, J Metz; my Twitter handle is @DrJMetz. Those are probably the two best ways. I haven't been posting recently, but my website, jmetz.com, also talks about some of these things, although, like I said, I haven't done much there recently. But those are the best ways to reach me.

Speaker 1:

The LinkedIn is a definite must-follow. I love your newsletters, and your takes on stuff are really good to see; sometimes appropriately biting, but honest discussions about some really interesting challenges that we're seeing in the industry. So with that, Dr. J Metz, thank you so much. And folks, of course, check it out; make sure you get to snia.org and now snia.ai, now with 100% more AI. Sorry, that is the marketer in me. All right, folks, have a good one, and we'll see you all on the next podcast.