SNIA Experts on Data
Listen to interviews with SNIA experts on data who cover a wide range of topics on both established and emerging technologies. SNIA is an industry organization that develops global standards and delivers vendor-neutral education on technologies related to data.
Next-Gen SSD Performance: The Power of Flexible Data Placement
This episode of the SNIA Experts on Data podcast features Bill Martin who discusses flexible data placement in the context of storage devices. Bill highlights the importance of efficiently placing data to minimize write amplification factor, which can improve the lifetime and performance of SSDs. The discussion delves into the origins of flexible data placement, its benefits in reducing garbage collection, and the significance of industry standardization through SNIA to drive adoption and innovation in data storage technologies.
Bill emphasizes the role of flexible data placement in optimizing performance across diverse workloads by associating data with specific reclaim units, thereby enhancing endurance and reducing write amplification. The conversation underscores the value of standardization in facilitating broader adoption of flexible data placement, enabling consumers to benefit from improved performance and efficiency in managing data. Bill also mentions ongoing efforts within SNIA to explore computational storage, CXL adoption, and data management strategies to meet the evolving needs of modern application architectures and workloads.
About SNIA:
SNIA is an industry organization that develops global standards and delivers vendor-neutral education on technologies related to data. In these interviews, SNIA experts on data cover a wide range of topics on both established and emerging technologies.
Welcome to the SNIA Experts on Data podcast. Each episode highlights key technologies related to handling and optimizing data.
Speaker 2: My name is Eric Wright, I'm the host of the SNIA Experts on Data podcast, and I am extremely pleased to be welcoming Bill Martin today. So, Bill, for folks that are brand new to you, give us an introduction, and then today we're going to talk about flexible data placement. Or flexible "dah-ta" placement; I'll probably get busted more for saying "day-ta" versus "dah-ta" by more people than anything else.
Speaker 3: Thank you, Eric, it's good to be here with you. Among other things, I am co-chair of the SNIA Technical Council, a member of the NVMe Board of Directors, and I chair the INCITS SCSI Technical Committee. For my day job I work for Samsung Semiconductor in their memory systems lab, working on all of their standards requirements for future product development directions. I've been working on storage for almost 30 years, and I started out with SNIA back in the early 2000s, working on Fibre Channel interoperability in the storage networking interoperability lab.
Speaker 2: Now it's interesting, because you've got a really deep bench of what you've done with SNIA, as well as your background and how long you've been involved. How did you get started, the first time you learned about SNIA and began contributing to it?
Speaker 3: I was working on Fibre Channel interoperability, leading interoperability events for the industry back in 1997 or so, and as part of that SNIA was doing Storage Networking World. I got involved there and, moving forward, had different projects with different companies. That led me into leading a variety of technical work groups in SNIA, eventually joining the Technical Council, and becoming co-chair of it almost a decade ago.
Speaker 2: So today we've got a fun topic to talk about, because this has a really interesting history and, I think, a lot of effects that may not even be realized by folks. We're going to dive in on both the how and the why of flexible data placement. Take us back, Bill: give us a bit of history on what flexible data placement is, and then we can talk about how it plays out and why it really matters.
Origins of Flexible Data Placement
Speaker 3: We've had a variety of attempts to determine where data is physically placed on a storage device. This comes from the fact that you provide a logical address to the device; the device then translates that and places the data in a physical location. There are a number of reasons, which we'll get into as we talk further today, why it's important for the host at times to know where data is physically placed, and some real benefits to that in terms of increasing the lifetime of your SSD and also increasing its performance. Flexible data placement came into the NVMe world early last year, I believe; it may actually have been late in 2022. It was an effort that started because Google and Meta had both come to their suppliers and said, "we need this capability." They each had a slightly different approach, and those in turn got merged together and turned into flexible data placement.
Speaker 2: What was interesting to me was that it came out of this rapidly growing disaggregated, single-app-focused space, so they generally had consistent, predictable read and write patterns, as predictable as they can be. What was neat was seeing the move to... I'm literally wearing a shirt showing the design of a bicycle. It seems simple, but it has to come from a fundamental design, a standard. So when did the standard really take shape, how did SNIA get involved, and where did other development partners begin to jump in on helping out?
Speaker 3: Flexible data placement got completed in NVMe last year. However, there's a desire to make it something the whole industry uses, so SNIA got involved through a white paper. We did a white paper talking about zoned storage and how to use it most effectively, and that group said, well, we've started doing flexible data placement, and there are really multiple technologies that have been developed out there to help with placement of data. So SNIA is now working on a white paper called Storage Data Placement. It will talk about how to maximize your benefits from flexible data placement, how that compares to zoned storage, and how that compares to something I brought back into NVMe a number of years ago called Streams, which was yet another mechanism. All of these come down to how you place data on your storage.
Speaker 2: When these come up, how much is driven by prolonging the life of the device versus extending its capabilities? It seems like a bit of a teeter-totter: in order to extend capabilities, there's always a risk to extending life. Maybe that goes to the origins of this, the idea of write amplification factor: let's throw more capacity at it, and that will spread out the writes and allow us to extend the life cycle. But then there are efficiency challenges. So this is a real tough balance, finding the equilibrium between the two.
Speaker 3: I think the real thing here is that if you can place data appropriately to minimize write amplification factor, then by doing that you automatically improve the lifetime of your device. You also have a side effect of potentially improving performance, because where your writes are going and where your reads are coming from are segregated based on the source or the nature of the data: whether it's all data associated with a particular virtual machine, or all data associated with a particular file. A virtual machine tends to either be reading or writing at any given time, and if you associate all of its data in one place, you get a performance benefit, because another virtual machine isn't utilizing that same physical area of your media.
Speaker 2: So how do the standard, and the technologies that use it, take hints in order to make choices about placement, given that dual-sided concern? Number one, we have to have efficiency in writes, but there's also the potential for read patterns to affect where this data is optimally placed.
Speaker 3: Basically, it is up to the application, the host system, to determine what things it wants to group together. The number one purpose of the grouping was really the write amplification, the life expectancy of your SSD. The performance benefit is kind of a side effect of that. But the host has knowledge of virtual machines and can say: this virtual machine's data all needs to be in this particular area, and this other virtual machine's data all needs to be in another particular area. It manages that by, first off, knowing that it has a certain number of handles it can use to associate with writes. The handles are only associated with writes, because they determine where you put the data, not where you read it from; once it's put there, you read it from wherever it happens to be. So the host has to know how many handles it has available, and then it can allocate its write traffic appropriately to handles where it says: I know that these are all related to each other in some way.
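The per-handle grouping Bill describes can be sketched in a toy host-side model. This is a hedged illustration only, not the NVMe FDP interface: the names `FdpModel`, `RECLAIM_UNIT_PAGES`, and the integer handle numbering are all invented for the example, and a real device exposes handles through NVMe commands rather than a Python class. The point it shows is simply that tagging each write with a handle keeps each workload's data together in its own reclaim units.

```python
from collections import defaultdict

RECLAIM_UNIT_PAGES = 4  # assumed reclaim-unit size, tiny for clarity


class FdpModel:
    """Toy model: each placement handle fills its own reclaim unit."""

    def __init__(self, num_handles):
        # one open (partially filled) reclaim unit per handle
        self.open = {h: [] for h in range(num_handles)}
        # handle -> list of completed (closed) reclaim units
        self.closed = defaultdict(list)

    def write(self, handle, lba):
        """Tag a write with a handle; close the reclaim unit when full."""
        ru = self.open[handle]
        ru.append(lba)
        if len(ru) == RECLAIM_UNIT_PAGES:
            self.closed[handle].append(ru)
            self.open[handle] = []


# Host maps each VM to its own handle, so their writes never interleave
# on the media even though they arrive interleaved in time.
model = FdpModel(num_handles=2)
for i in range(8):
    model.write(0, ("vm_a", i))  # VM A's traffic
    model.write(1, ("vm_b", i))  # VM B's traffic
# Every closed reclaim unit now holds data from exactly one VM.
```

Because each reclaim unit holds only one VM's data, a whole unit goes stale together when that VM overwrites its data, which is exactly what keeps garbage collection cheap.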
Speaker 2: How would this come into play with multi-tier caching and other things that could help with write amplification? I'm just curious what other ways efficiencies and performance might be found in different layers, rather than at the final write.
Speaker 3: The caching really doesn't affect the write amplification factor; it's a place where you hold things before you actually physically write them to the disk. But I think you really have to understand what causes write amplification, so let me dig down a little bit into that.
Speaker 2: Try not to go too deep on that.
Speaker 3: So what happens with write amplification... excuse me, I've got to grab a water.
Speaker 2: No problem. Yeah, and regarding this, this is one where I've seen a lot of folks struggle, especially folks higher up at the application layer. They may not necessarily understand the impact of the patterns in which they create data on downstream systems. More and more, I've talked with folks in SNIA and learned about other things that are happening, different caching layers, but in the end it comes down to the raw metrics. Write amplification factor is a thing.
Speaker 3: Right. So basically what happens is the host wants to write data to the storage device, to your SSD. In NAND you have to write in pages, which are normally bigger than a logical block. But once I write data onto the NAND flash, before I can use that same NAND flash to write something else, I have to erase it. And I have to erase it in what's called an erase block, which is even larger than a page. So say I've written a whole bunch of things to a particular erase block: if I wrote logical block number one and then I write it again, I have to put the new copy someplace else; I can't just overwrite that area. When I do, the original location is marked as no longer valid, no longer in use. Eventually I get to the point where maybe three-quarters of my erase block is no longer in use, but I want to reuse that three-quarters. So what I have to do is take the one-quarter that is still valid, move it to another erase block, and then erase that complete block.
Speaker 3: What happens then is I've had an extra write to the drive that uses up some of its lifetime, and that's what becomes write amplification. Write amplification factor is the total number of writes from the host... actually, sorry, let me flip that around: it's the total number of times that a logical block is physically written to the device, divided by the total number of writes from the host. So if, for every block I write from the host, I only put it on the media once and never have to move it, my write amplification factor is one. If I actually have to move it, then my write amplification factor goes up. If for every single block I write I have to rewrite it a second time onto the media, because of what's called garbage collection, which is what I was really describing earlier, then my write amplification factor becomes two. So the goal is to drive write amplification as close to one as possible.
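The mechanics Bill walks through, out-of-place writes, erase blocks, garbage collection, and the WAF ratio, can be captured in a small simulation. This is a hedged toy model, not a real flash translation layer: block counts, page counts, and the greedy victim-selection policy are all assumptions chosen for clarity. It does reproduce the behavior he describes: sequential overwrites let whole erase blocks go stale before reclaim (WAF stays at 1), while random overwrites force garbage collection to relocate still-valid pages (WAF rises above 1).

```python
import random


class ToySSD:
    """Toy log-structured flash model: out-of-place page writes plus
    greedy whole-block garbage collection. Sizes are assumed values."""

    def __init__(self, num_blocks, pages_per_block):
        self.ppb = pages_per_block
        self.blocks = [[] for _ in range(num_blocks)]  # programmed LBAs per erase block
        self.loc = {}              # lba -> (block, slot) of the current valid copy
        self.host_writes = 0       # writes requested by the host
        self.physical_writes = 0   # pages actually programmed (includes GC moves)

    def _free_block(self):
        for i, blk in enumerate(self.blocks):
            if len(blk) < self.ppb:
                return i
        return None

    def _valid(self, i):
        # pages in block i whose LBA still maps here (not overwritten elsewhere)
        return [lba for s, lba in enumerate(self.blocks[i])
                if self.loc.get(lba) == (i, s)]

    def _gc(self):
        # greedy victim choice: the erase block with the fewest valid pages
        victim = min(range(len(self.blocks)), key=lambda i: len(self._valid(i)))
        survivors = self._valid(victim)
        self.blocks[victim] = []   # "erase" the whole block at once
        for lba in survivors:      # relocating valid data costs extra physical writes
            self._program(lba)

    def _program(self, lba):
        i = self._free_block()
        if i is None:
            self._gc()
            i = self._free_block()
        self.blocks[i].append(lba)
        self.loc[lba] = (i, len(self.blocks[i]) - 1)
        self.physical_writes += 1

    def write(self, lba):
        self.host_writes += 1
        self._program(lba)

    @property
    def waf(self):
        # WAF = physical page programs / host writes, per Bill's definition
        return self.physical_writes / self.host_writes


# Sequential overwrites: each erase block goes fully stale before GC runs,
# so garbage collection never has to move anything and WAF stays at 1.0.
seq = ToySSD(num_blocks=8, pages_per_block=16)
for i in range(5000):
    seq.write(i % 96)              # 96 logical pages on 128 physical (overprovisioned)

# Random overwrites: GC victims still hold valid pages that must be moved,
# so physical writes exceed host writes and WAF climbs above 1.0.
random.seed(0)
rnd = ToySSD(num_blocks=8, pages_per_block=16)
for _ in range(5000):
    rnd.write(random.randrange(96))

print(f"sequential WAF = {seq.waf:.2f}, random WAF = {rnd.waf:.2f}")
```

The gap between the two runs is the cost flexible data placement tries to eliminate: by grouping related data so it goes stale together, the random workload starts to look like the sequential one from the device's point of view.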
Speaker 2: Now this is the interesting thing too, about pre-write and post-write, where there's continuous cleanup and reorganization. Of course, I'm an older gentleman, so I've been around since the days of defragging spinning disk and the idea of optimally using the entirety of the platter. It's a different world, obviously, with NVMe. But what work is done up front and what work is done after, and is there value in the same way that we used to literally reorganize things?
Speaker 3: There is some value to that. What's really going on with flexible data placement is that we're moving some of that logic back up into the host. The host says: I know what I've used, I know what I've written into these particular reclaim units, as we call them in flexible data placement. By being aware of that, it knows what it may need to rewrite or move around in order to optimize its utilization of the actual physical space on the media. And that may be a little bit similar to what was done with defrag.
Speaker 3: One of the things that happened when we did defrag years ago is that we did all of this movement in order to free up larger blocks of space where you could write things contiguously, which made it easier to get better performance. Today with NAND flash, because every time you rewrite something you have to wait for an entire block to clear up, defrag kind of happens automatically. But flexible data placement helps to remove some of the write amplification, and it also helps to improve performance when you haven't yet had to do that clearing up of an erase block.
Speaker 2: Right. So, especially when you've got extremely high-throughput workloads, that's less time spent on maintenance, so to speak, on garbage collection and other reorganization, so you're actually doing production writes instead of rewrites for lifecycle. Correct. Now, do we have any metrics on the actual effect? I remember the early days, when people would say, hey, don't write stuff on a USB the way you would on high-end SSDs. Then we moved into commercial use and the idea that we could have commoditized hardware, which I think is what Meta and those folks were trying to use. What are the before and after numbers once flexible data placement is put in place?
Speaker 3: It's difficult to put it in those terms, because what you're talking about is the lifetime, and the lifetime is basically limited by the number of writes a particular technology can support; there's a lifetime number of writes for any given technology. What flexible data placement does is attempt to reduce the number of physical writes to the device.
Speaker 3: Now, in the white paper that SNIA is currently working on, which we hope to have out in the second half of this year, there are actually a number of graphs that will show the impact of reducing write amplification factor and how close we can get it to one by utilizing flexible data placement. So that's something that's coming. I don't have the numbers in front of me at the moment, but there are some initial numbers already out there; other white papers published by individual companies show some of that data, including one done by Samsung. I'm just trying to stay in a more vendor-neutral vein here: we're working on a global discussion of how this is done across the industry.
Speaker 2: That really goes to the value of this being brought to SNIA and the work you do with the technical committees and the other folks. You've seen this over the years: I believe it greatly accelerates things to solidify on a standard, because then everybody is at least moving toward a common goal, especially when there are operating system requirements that have to meet up with this. VMware, Linux, Windows, the Microsoft community: everybody needs to be able to write kernel updates and kernel modules to take advantage of it. If we didn't have a standards body able to work together collectively and stay vendor-neutral in the solution, we'd end up with vendorized implementations instead. So for me this is a perfect, beautiful storm of opportunity, because we've got your capabilities, Bill, and your team, and also all these other contributors who seem pretty happy to move this forward and have it standardized.
Speaker 3: Absolutely. And you know the other factor: by standardizing it, you get implementations from multiple vendors, so that a customer, a consumer, can take a product from any of those suppliers and plug it in and it works. The other thing, one of the big things I've dealt with for years, is that most of my customers say: I will not utilize your nifty new whizzy functionality if you don't have a second source for it. Standardizing is the mechanism by which you get those second sources: everybody has the same definition of exactly how to do it, and that makes it possible for people to second-source it.
Speaker 2: That's a funny thing too. You and I and a lot of the folks we chat with, our peers, are always chasing the neatest and latest; we're doing a lot of experimental stuff. But when we get to the actual consumer side, we know about the laggards' end of the adoption curve and how it works: enterprise buyers, as you say, want predictable stability. They want to see that something is going to last, which is ironic, because it has to last first before they'll even look at it. But that again is the advantage of coming at it as a collective, as a standards organization. Everybody really comes together, it cuts down those fences, you meet other experts, and I think it accelerates the speed at which everybody can innovate, versus going it alone and sitting in your own R&D team, where a lot of time could be lost and a lot of advantage to the customer lost because of it.
Speaker 3: Absolutely. The standardization is really the important key here, and it needs to be explained well so that people can utilize it, which is where the SNIA Storage Data Placement white paper comes in: not just providing something that's standardized, but also education on how to use it.
Speaker 2: That's perfect. A reminder for folks that, beyond the audio version, you can watch this on YouTube as well; there's a video version. We'll have links, and we'll update those as new papers come online, because I know some are pending publication. As soon as we get links, we'll share them back to the SNIA site, and people can take a look and also get involved, which is the more the merrier. A rising tide lifts all boats, particularly when it comes to this type of work.
Speaker 3: Absolutely.
Speaker 2: Let's talk about the raw consumer side, that enterprise buyer. What's the advantage, the business outcome, for that buyer because of all this technology innovation?
Speaker 3: The raw output is this: if you have a layer within your stack that will identify streams of data that belong together, say, all of my I/O for a particular file should go to a particular area on my storage device, or all of my data for a virtual machine should go to a particular area, and you can simply implement that layer in your stack to look at the source of the data and allocate it to a particular reclaim unit handle, then you can improve your lifetime.
Speaker 3: You improve your performance. One of the things I talked about earlier in terms of performance improvement was simply that your writes and reads aren't conflicting. But another really big performance improvement is that you don't have as much garbage collection; you may even get to zero garbage collection on your device. Garbage collection is a background operation that affects both read and write performance, so by eliminating it you improve performance that way also. The investment by the consumer is in modifying their software stack in a way that directs their data output; the benefit is improved performance and improved lifetime of the device.
Speaker 2: Luckily, that's the other distinct thing we've got with the work you've done. We've got tons of presentations that we talked about leading up to this, Bill. There are a lot of good code-level samples and other experts in the SNIA community, so other vendors who want to get involved and take advantage of this can do so. Again, we'll make sure we share the different links and such. Now, what's exciting to you about what's next, as we see the new Storage Data Placement white paper coming to the world, and what's ahead in 2024 for you and SNIA, Bill?
Speaker 3: With flexible data placement, our hope is that by developing this, we encourage broader adoption, allowing the industry to actually reap the benefits of it. There are a lot of different activities going on within SNIA. I work not just on flexible data placement; I'm working on computational storage and computational memory within the broader community, looking at how you utilize CXL for things, and we're looking at places SNIA can get involved in helping to accelerate adoption of CXL. SNIA is also involved with SDXI. There are just a lot of things we're doing related to data, related to how we move that data and how we store it. And that is one of the things about being experts on data: it's not just storage of data, but how we move it, how we manage it, all of the things related to utilizing and storing data.
Speaker 2: There's definitely a distinct advantage the more we think about innovations in lifecycle at the hardware layer alongside software innovations. We're seeing vastly different patterns of application architectures with AI and machine learning, and obviously Kubernetes introduced a lot more abstraction from the application. It really hits home that with a common way to implement this, flexible data placement, regardless of what that application layer is going to be, there are advantages that can be felt down below. And we've got incredibly diverse workloads. Maybe one quick thing: when we have very diverse workloads, what's the advantage of flexible data placement in that more enterprise pattern of multiple workloads sharing a common storage subsystem?
Speaker 3: The big advantage is that, without doing anything else, you are going to improve your WAF and improve the endurance of your device simply by knowing that you have different workloads and associating each workload with a particular reclaim unit. If you don't have enough reclaim units, you may overlap them and say, well, these two workloads I'm going to put on this particular reclaim unit. You're still going to reap the benefits, because you'll still get some amount of that reduction in your write amplification, and improved performance.
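The overlap Bill mentions, more workloads than reclaim units, can be sketched as a simple host-side assignment policy. This is an illustrative assumption, not anything from the FDP specification: the round-robin `handle_for` function and the counts are made up, and a real host might group workloads by access pattern instead. The point is only that folding several workloads onto one handle still bounds how much unrelated data can mix in any reclaim unit.

```python
NUM_HANDLES = 4      # assumed device limit on reclaim unit handles
NUM_WORKLOADS = 10   # more workloads than handles, so some must share

def handle_for(workload_id: int) -> int:
    """Assumed round-robin policy folding workloads onto handles."""
    return workload_id % NUM_HANDLES

# Deterministic workload-to-handle map: workloads 0, 4, 8 share handle 0, etc.
assignment = {w: handle_for(w) for w in range(NUM_WORKLOADS)}

# At most ceil(10 / 4) = 3 workloads ever share one handle, so each
# reclaim unit mixes data from at most 3 workloads instead of all 10.
sharing = max(list(assignment.values()).count(h) for h in range(NUM_HANDLES))
print(f"max workloads per handle: {sharing}")
```

Even this naive policy preserves most of the benefit Bill describes: partial grouping still reduces write amplification relative to letting all ten workloads interleave freely.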
Speaker 2: Fantastic stuff, and thank you for going into it. There's definitely a lot more deep-dive technical material; I know we could go a lot deeper, Bill, and I want to thank you for jumping in and covering a lot of great stuff today. If folks want to reach you, Bill, what's the best way to do that? Obviously, SNIA is a great way to meet you, and I hope to see you at SDC and some of the other events. But where can folks find you online?
Speaker 3: Folks can definitely find me online at billmartin@samsung.com.
Speaker 2: Excellent. Well, thank you again, Bill. Folks, do tune in, check out the YouTube channel, and you can also find the SNIA Experts on Data podcast in all of your syndicated podcast places of choice, whether it's iTunes or Spotify. Thank you very much, Bill, and thank you, folks, for listening.
Speaker 1: Thank you for listening. For additional information on the material presented in this podcast, be sure to check out our educational library at snia.org/library.