
SNIA Experts on Data
Listen to interviews with SNIA experts on data who cover a wide range of topics on both established and emerging technologies. SNIA is an industry organization that develops global standards and delivers vendor-neutral education on technologies related to data.
Webinar Preview - Everything You Want to Know About RDMA But Were Too Proud to Ask
Our SNIA Experts on Data are here to explain the basics of RDMA and discuss why now is the right time for an "Everything You Wanted to Know About RDMA But Were Too Proud to Ask" webinar. Our expert panel features Michal Kalderon, a Distinguished Engineer from Marvell, and Rohan Mehta, a Senior Software Engineer from Microsoft, who have over a decade of experience developing RDMA solutions. They are joined by Erik Smith, Chair of the SNIA Data, Storage & Networking Community and a Distinguished Engineer at Dell. Together, they share insights on this critical communication technology that enables direct memory-to-memory transfers without CPU involvement. This podcast is a quick preview of the SNIA webinar, "Everything You Wanted to Know About RDMA But Were Too Proud to Ask," which you can watch here: https://bit.ly/RDMATooProudtoAsk
All right, thank you everybody for joining. This is a great chance to change the way that we're doing the SNIA Experts on Data podcast, because this is a preview of something fantastic. If you haven't seen it, even if it's on demand, if you haven't had a chance to attend yet, you've definitely got to sign up for it: it's talking about what we can do in the world with RDMA. And you're saying to yourself, what do I need to know about RDMA? Well, I'll tell you what: the goal of this webinar is to tell you everything you need to know about RDMA, and you don't have to be afraid to ask. This is a chance for us to really dive into, number one, what is RDMA doing, what has it done, and also what's next, because when we think about the number of use cases that are coming up, it's a fantastic chance to take a look at the different ways that technologies work and, especially, how new use cases are arriving. So thank you for joining.
Speaker 1: My name is Eric Wright. I'm the co-host of the SNIA Experts on Data podcast and also the co-founder (a lot of co's) of GTM Delta, and I'm joined by an amazing group of folks who are going to be on this really, really cool webinar, so I'm going to get you to do a quick intro. So, Erik, if you want to get us started.
Speaker 2: Sure, I'm Erik Smith. I'm a Distinguished Engineer working for Dell Technologies.
Speaker 1: Excellent. And Rohan?
Speaker 3: Hello, I'm Rohan Mehta. I'm working as a Senior Software Engineer at Microsoft.
Speaker 1: And Michal, certainly last but not least, if you want to introduce yourself as well, then we're going to jump right in.
Speaker 4: I'm Michal Kalderon, Distinguished Engineer at Marvell.
Speaker 1: Fantastic. Now there's a ton we want to cover. I would love to, but I also don't want to give away too much of the good stuff that we're going to see in the webinar. But let's maybe get started. Erik, what do you see as the opportunity? What's your goal? Because I know you're going to be driving the conversation, so what do you see as the reason why people really should attend this?
Speaker 2: Yeah, so thanks for asking, Eric, and thanks for having us on. I'm going to be moderating the session, and my interest in the topic really goes back a couple of years. AI has been sort of driving the need for RDMA, and so, as a person who's involved with networks and fabrics, especially as they relate to AI, getting to know how RDMA works has been very important.
Speaker 2: It's critically important to my field, and a little over a year ago I started really trying to dig into the details of it, especially looking for training information, and I was frustrated that there wasn't a basic overview of RDMA. What I found was, even if I went onto a website like the OpenFabrics Alliance, for example, they had training information there and it went really deep, right into the verbs, right into the stack, all the way down to as deep as you wanted to go, and it was just a little too much to get started, so I struggled a little bit. So what I'm hoping we can do with this webinar is to give people a framework on which to build, and so, yeah, that's why I'm... and I think we've hit the mark.
Speaker 1: Clearly, you've certainly got some amazing folks contributing to this discussion. So, Michal, based on that, what's your view, you know, sort of a nutshell view of RDMA, and what do you look to bring to the session from your own, you know, previous experience?
Speaker 4: Yeah, so I was exposed to RDMA about 10 years ago and really feel connected to the technology. So, like, in a nutshell, right, RDMA stands for Remote Direct Memory Access. It's a technology that enables direct memory-to-memory data transfers between computers without involving the CPU and the cache and the OS, making it super efficient: you know, having high throughput, low latency and, of course, low host CPU usage. And, you know, a lot of people know just the buzzword RDMA, but they don't really know how RDMA works, how the RDMA NIC actually writes to or reads from memory without involving the CPU, and so on. And so this is the reason why I also wanted to join this webinar and show people, you know, how this actually works and what makes RDMA super ideal for different applications and use cases, which we'll also go into in the webinar.
Speaker 1: Included. But sometimes people just say, "I kind of know what that means, so I'm not going to ask." So I do love that you're going to be able to dive in, and it'll be interactive so folks can participate and figure it out, you know, and especially getting involved in the SNIA community. This is such a fantastic group of folks, and all the contributing companies that helped to make it happen. Rohan, based on your background and experience, what's your goal and what do you want to bring to the session?
Speaker 3: Yeah, so, like Michal said, we do want to give an overview of what RDMA is. When I joined Microsoft, I actually started working on RDMA right away, and so for me that was like being thrown in at the deep end, and I know what it feels like to ramp up on a topic that, you know, is completely new, while also contributing to something that's, you know, production level, deploying with actual customers on it and actually using RDMA out in the real world. So we want to simplify this topic as much as possible, in a way that somebody who doesn't know what RDMA is can also join this webinar and learn about it. Then there's somebody who already knows RDMA but needs to know more details of how it actually operates at various layers of the networking stack: if I want to write an application that leverages RDMA, what steps should I follow? How should I go about writing such an application? Even such a person can join this webinar and learn low-level details like that.
Speaker 1: That is one of the things that I really struggle with a lot of times with webinars: we never get a chance to go deep enough, because we often don't have the experts on the call or it's tough to explore in the time given. So I've been able to get a preview of what you've got ahead, and for folks that are watching this even after the event, it's a must-attend, just because of the depth you can reach and also the fact that this is a community of people that we can continue to connect to after the event and keep asking questions. It's fantastic as an opportunity. Now, why now? This is always the question. We've had RDMA in our world for a while. It's been a couple of years, perhaps. So, Michal, based on that, what is the importance of this RDMA discussion today?
Speaker 4: Right. So, like you said, yeah, RDMA has been here for a while, since the 1990s, right, but it's gaining traction again today more than ever, you know, due to the challenges that are imposed mainly by the AI and ML training loads that are getting, you know, much, much larger and have really crazy demands from the network. So, yeah, it's definitely always been significant, you know, due to its ability to enhance data transfers and its low latency and high throughput and so on. But I mean now, with AI and ML and, you know, these workloads, really the computation is so complex and the GPUs are so expensive, and you don't want any idle time on the GPU. You have to enhance the network, right? You don't want the network to be the bottleneck, and RDMA is close enough, right, to provide that, I mean, compared to other traditional, you know, networking protocols. So I mean this is the main reason why we feel it's back and it should be discussed again, helping professionals stay informed on the technological advancements and the benefits that this brings.
Speaker 1: Team RDMA. I love it. We often joke: CPU, storage, network, all those core artifacts, and this is truly the chance to revisit, because everything changed once the workload changed. And if we don't adjust how we leverage the capabilities or, especially, explore what's coming in, you know, future growth and other innovations that could happen... this is really cool. Some will say it's RDMA or RDMAI. So who's ideal to attend this? Erik, you know the audience well, given that you are the audience. So who's a person that's really going to get value from sitting for this session?
Speaker 2: Yeah, really anybody who's interested in RDMA. What I like about what Rohan and Michal have done is the breadth of the information. It starts at a very high level, an introductory, sort of "you are here" point of view, and then it gets all the way down into providing wire traces, so, you know, actual packets as they show up in Wireshark. And Michal will stitch those two together and show you basically how, in a conceptual model of RDMA, endpoints communicate with one another and what that looks like on the wire. So it's really anybody who's interested in RDMA.
Speaker 1: And I'll close it up, just because I know we don't want to overshare the goodness that's in the full session. So, Rohan, based on your experience in looking at these AI and ML use cases, because I know it's not just dripping off the tongues of marketers and engineers alike, but we are genuinely seeing use cases come forward: where do you see the importance of what this session is going to bring to folks that are starting to dabble in, or are even well underway with, an AI/ML, you know, adoption and transformation?
Speaker 3: Yeah. So the key problem, or the key bottleneck, in the AI/ML models today is the movement of data, like the data transfer itself, and that is where RDMA comes into play with all its benefits. There's a lot of data already existing in multiple storage clusters around the world. We need to move it as fast as possible from there onto a GPU. Maybe the GPU is sitting somewhere else, so maybe across the network, then onto the GPU. So in both scenarios, RDMA is coming into play and playing a crucial role in making this data transfer as fast as possible, as efficient as possible, bringing all those benefits of bypassing the kernel, bypassing the operating system, and bypassing CPU utilization on the host where the data is actually going to land. And then, you know, this further facilitates the speed-up of the process of training as well as inference of all the AI models.
Speaker 1: Well, this is going to be amazing. So thank you all for giving a quick preview, and I'm looking forward to everybody giving feedback when they watch the full session, because it's such a fantastic deep dive, and it is just that, right? We're touching the tip of the iceberg. The deep dive that you're going into in the full session is going to be amazing for folks that really want to see where it's happening.
Speaker 1: This isn't just marketing. This isn't buzzwords. This is real opportunity to optimize and build fantastic things. So it's "Everything You Wanted to Know About RDMA But Were Too Proud to Ask," and I always like to say make sure you check out everything else at snia.org. We've got lots of other amazing sessions like this and other podcasts and, of course, all the webinars that are coming together. 2025 is going to be the year of distributed systems getting better because of distributed knowledge, and this is a great venue to do so. So thank you to Erik, to Rohan and to Michal for all of this. We're looking forward to the session, and we'll see you all on the webinar.
Speaker 4: Thanks, Eric, thank you.