What's New In Data

Building for Scale: AWS’s Marc Brooker on Distributed SQL

Striim

In this episode of What’s New in Data, AWS VP and Distinguished Engineer Marc Brooker joins us to break down DSQL, Amazon’s latest innovation in serverless, distributed databases. We discuss how DSQL balances consistency, availability, and scalability—without the headaches of traditional relational databases. Tune in to hear how this new approach simplifies architecture, eliminates operational pain points, and sets a new standard for high-performance cloud databases.

Follow Marc on X, Bluesky, LinkedIn, or his blog for more insights on distributed systems, databases, and the future of cloud computing.

What's New In Data is a data thought leadership series hosted by John Kutay, who leads data and products at Striim. What's New In Data hosts industry practitioners to discuss the latest trends, common patterns for real-world data architectures, and analytics success stories.

Speaker 1:

Welcome back to What's New in Data. I'm your host, John Kutay. Today I'm joined by Marc Brooker, VP and Distinguished Engineer for Databases at AWS. Marc has been at the forefront of building some of AWS's most critical infrastructure, and today his focus is on their distributed SQL database services. In this episode, we'll dive into the architecture behind AWS's distributed SQL offering, namely DSQL, and we'll also talk about the future of cloud databases and lessons from building resilient, high-performance systems. Let's get right into it, Marc.

Speaker 2:

How are you doing today?

Speaker 3:

Yeah, I'm doing great. Super, super excited to be talking about DSQL, and databases more generally. Always a fun thing to talk about.

Speaker 2:

Yeah, absolutely. I know the listeners are excited about this one. You've written a lot of awesome technical blogs on this topic, and we're going to dive into some of what you discussed. But first, Marc, just tell the listeners about yourself.

Speaker 3:

Yeah, so I work at AWS and I've been at AWS for 16 years, engineering the whole time.

Speaker 3:

I've done a little bit of management on and off, but I've been primarily an individual contributor throughout that career, and I've worked on a whole lot of different services at AWS, most recently in our databases and AI businesses, and storage before that. I joined the team that built AWS Lambda soon after the preview launch of that product, and I worked on the EC2 team in the early days, when folks were still saying, why would I buy my IT infrastructure from a bookstore? And so it's been a super fun journey of just seeing the cloud grow and this whole industry grow around it. Before that, I joined AWS from a completely different background. I was in grad school, building radar systems, doing electrical engineering, kind of a distributed background, but not distributed computing, not the cloud at all. And so it was a really fun career pivot back in 2008, and one that, in retrospect, I've been pretty happy with.

Speaker 2:

Yeah, excellent, and definitely so much value has been delivered with services like EC2 and cloud computing in general. It's changed the entire landscape for how you build and scale applications, whether you're a mom-and-pop small business or one of the world's largest and most sophisticated enterprises and tech companies. So very cool to be chatting with you about what you've been working on recently, which is DSQL. So I wanted to ask: who is DSQL designed for, and what key challenges were you aiming to solve for your users with this service?

Speaker 3:

Yeah. So there are kind of two things there; it's almost two products in one. One of them is we have this set of customers who have been building on serverless, they've been building on containers, and what they've noticed is that, well, they've continued to love the relational data model, love SQL for all of the reasons people do, want transactions and all of that good stuff. But what they've noticed about most relational databases is they require a lot more operational care. They require a lot more attention to scaling. They don't have the kind of scale and dynamism that folks expect from containers and serverless. Often they have things like connection limits which get in the way.

Speaker 3:

So one set of customers we were really trying to help out with DSQL is people building on that kind of modern, cloud-native compute architecture: to give a fully relational database experience, to give SQL, to give transactions, to give ACID and all of that good stuff to folks building on that kind of infrastructure, in a way that can keep up on scale, on simplified operations, and so on. And then almost at the other, I wouldn't say the other end of the database market, but in a different part of the overall space of transactional databases, there's a set of folks building extremely highly available applications, building critical applications, often applications that are critical to their country's or the world's economic or transportation infrastructure. And these are folks who often have internal requirements or regulatory requirements to run in multiple regions, to have geographic diversity, to have the ability to move workloads away from entire regions.

Speaker 3:

And one of the patterns that folks love, and for good reason, is active-active: they want to build in a way that their application is running concurrently in region A and region B. They want to have, again, SQL, ACID, the relational data model and all of that good stuff. They want strong consistency, they want transactions, and DSQL is also being built for that set of folks, with multi-region active-active, high availability and durability across three AWS regions, the ability to remain available even when an entire region fails, and no data loss or consistency loss on failure. So, two fairly different use cases. There's some intersection between them, but not an awful lot. But those are the two big things that we were trying to go after with this new Aurora flavor, Aurora DSQL.

Speaker 2:

Yeah, and DSQL is serverless, which is one of its big advantages, and it's distributed. But it seems like, based on how you designed it, you don't lose the availability and consistency of having a massive, dedicated, maybe clustered, database. When I traditionally worked in building cloud applications, if you wanted to quickly build a demo or something, sure, use serverless. But then you'd need runbooks for failover, and you're probably going to need some dedicated infrastructure for that, and significant planning just to maintain the infrastructure for an application, in addition to the databases and the software stack. But what you're saying is you get all the advantages of serverless and the advantages of a clustered, highly available, distributed database.

Speaker 3:

That's been our goal, and as we get into the technical details, we can talk about some of the trade-offs. But one of the things that we do at Amazon when we're building a big product like this, not always, but especially when we're building a big new initiative like this, is spend a lot of time talking to customers and prospective customers about their needs. And one of the words that I heard a lot in those conversations was regret. Hey, I want to start with something that's really easy to get going, that's low risk, that I can build a meaningful application with, with a small engineering team. But I don't want to regret that choice as I grow, as I'm successful, as my business or this internal product or whatever moves up and gets bigger and gets more critical. And so that's really been our core focus here: this is going to be a great choice at small scale, it's going to be easy to get going, easy to build that sort of V1 of your application, but then you're not going to regret that choice. Over time it's going to grow with you, it's going to meet your most critical operational requirements. It's going to grow with you on scale, on availability, on durability, on enterprise integrations and all of those other pieces. And so that's a big part of that vision.

Speaker 3:

You know, it's great to have serverless products that make it easy to get going with something, easy to build a V1 of something. That's valuable in itself. But it's even better if you can build things that way and then slowly evolve them and adapt them into being huge and meaningful businesses. Because the worst day in any application team's world, or nearly the worst day, is when they have to go to the business and say, well, for the next year you have to turn customers away while we do a big re-architecture, while we replace our data store. Nobody wants to have that conversation, but too many people do have it. I've had that conversation in my career, and I hope to never have it again. And so that's been a big part of the drive behind this vision: folks can pick DSQL and then it will grow with them, grow with their requirements, to businesses of literally any size.

Speaker 2:

Absolutely, and that's a great point: teams build that V1 version of their product on serverless, or however they design it, but it may not scale once they've become successful and have tens of thousands of customers. And, like you said, migrations can take years. Infrastructure migrations can certainly take years, with a lot of additional planning and cost for zero downtime in those scenarios. So I do want to get into some of the trade-offs you mentioned, but specifically you talk about active-active and multi-region architecture, and I'm sure even across availability zones, that's all there. Can you elaborate on how you balance the trade-offs between availability, consistency, and latency in this design?

Speaker 3:

Yeah, yeah. So I think probably the first law of physics there is the consistency-versus-latency one. At the moment that you say to the database, commit, I want you to make this transaction durable for me, it can make that durable locally. But in the multi-region setting, it can either send the data over and then, while it's in flight, say it's committed, or it can send the data over, wait for it to have been sent, wait to get acknowledgement, know that it's durable on the other side, and then return that commit. And so the difference between those can be no impact, zero RTTs of communication, versus having to wait for at least one round trip between a pair of regions. And so that's trade-off number one.

Speaker 3:

And so what we've done with DSQL, in its current guise, and we might provide more control over this later as the product grows, is we have chosen synchronous: when you call commit in that three-region setup, we're going to make sure it is durable in storage in two of those three regions before we return that commit to you.

Speaker 3:

That does increase commit-time latency, not write-time latency, crucially, but commit-time latency. But it does ensure that you can lose an entire region without losing any data. And almost secondary to that, as an upshot of doing that durability work, now that we've moved the data over, we can, with very little additional latency cost, provide strong consistency. And so that's the other part of it: can I assume that if I have done a commit in region A and I immediately go to region B, I can see the effects of that commit immediately? If you do this replication asynchronously, the answer to that would be no. That replication might still be in flight, might still be being applied. But what we've chosen is consistency, and that's related to the synchrony.
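
To make the shape of that trade-off concrete, here is a minimal Python sketch of a synchronous, quorum-acknowledged commit. The region names and the replicate() helper are illustrative stand-ins, not a real DSQL API; the point is only that commit returns after a majority of regions acknowledges durability.

```python
import concurrent.futures

REGIONS = ["us-east-1", "us-west-2", "eu-west-1"]  # placeholder region names

def replicate(region: str, txn_record: bytes) -> str:
    """Stand-in for sending the transaction record to one region's
    durable storage and waiting for it to be persisted."""
    return region

def commit_synchronously(txn_record: bytes, quorum: int = 2) -> None:
    """Return only once `quorum` regions confirm durability. An
    asynchronous commit would return before any acknowledgement,
    trading region-loss durability for lower commit latency."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(replicate, r, txn_record) for r in REGIONS]
        acked = 0
        for done in concurrent.futures.as_completed(futures):
            done.result()  # propagate any replication error
            acked += 1
            if acked >= quorum:
                return  # durable in a majority; safe to ack the client
    raise RuntimeError("could not reach a durability quorum")

commit_synchronously(b"example transaction record")
```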

Speaker 2:

And when you say synchronous versus asynchronous, you mean when you are ensuring the durability, sending the commit out to the other nodes or whatever pieces of the stack exist. Are you sort of holding the world, in a sense, and pausing the database at that point before you've ensured that it's been acknowledged by the other nodes?

Speaker 3:

Yeah, so that's a great question. The answer, literally, is no. You don't need to pause the whole database, but you do need to potentially wait on other writes to the same keys. And so if I have a million rows in my database and I update row A and row B, and those are in separate transactions, they don't have to wait on each other. They can be fully parallelized. However, if I update row A in region one and I update row A in region two, there needs to be some handshake. That handshake is there mostly to ensure the other important property here that we haven't spoken about yet, which is isolation, transaction isolation, and providing the transactionality guarantees in this multi-region active-active setup. And so you certainly don't want to be in a world where you're slowing down an entire data set, and you don't have to from a kind of law-of-physics perspective. But once you have a single key which is being accessed from multiple places at the same time, there does need to be some, at least one and a half, I think, is the physical limit, rounds of communication to ensure isolation and to ensure that consistency.

Speaker 3:

I wanted to jump quickly back to availability, because you asked about that. I think what's important in our model is that this is a three-region model.

Speaker 3:

We have two primary regions and a witness region, and if one of those regions fails or becomes partitioned off and unable to communicate, the database is not available in that region, but it will continue to be available and strongly consistent in the two remaining regions. That's kind of the majority side of a partition. The alternative would be to be available on both sides, but to do that you would have to give up consistency, and once that partition healed, you would need to be able to merge the data after the fact. And so if you look at another product like DynamoDB Global Tables, that is the decision that we made there: be available on both sides and then merge afterwards. That's got great availability properties. What we've decided for DSQL is to be available only on the majority side and then, when that partition heals, copy the data over, repair the cluster fully deterministically without giving up any of the ACID properties, and get back to three running regions.
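
As a toy illustration of that majority rule: after a partition, only the side that can still reach a majority of the three regions keeps serving strongly consistent traffic. The region names here are placeholders, and real membership and failure detection are far more involved.

```python
REGIONS = {"us-east-1", "us-west-2", "witness-region"}  # placeholder names

def serves_traffic(reachable: set) -> bool:
    """A side of a partition stays available only if it can reach a
    majority of the cluster's regions."""
    return len(reachable & REGIONS) > len(REGIONS) // 2

# us-east-1 gets partitioned away from the other two:
print(serves_traffic({"us-west-2", "witness-region"}))  # True: majority side
print(serves_traffic({"us-east-1"}))                    # False: minority side
```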

Speaker 2:

Yeah, absolutely incredible. The other thing that's impressive and notable about this is that you fully leaned into the AWS software stack to deliver this solution. One example, which you wrote about, is using EC2's precision time infrastructure for synchronized clocks. People who are familiar with distributed systems engineering know the significance of clock synchronization, and even the precision with which the operating system perceives time, making sure that's all in sync. So can you explain how you use EC2 and AWS's clock synchronization support?

Speaker 3:

Yeah. So, as your listeners might know, we rolled out the TimeSync service on EC2 a few years ago, and we've been rolling out a big improvement on that over the last year or so to bring time synchronization globally to EC2 instances down to tens of microseconds. That's available to regular EC2 instances today and over time will be available to all EC2 instances in all regions. Taking advantage of that, one of the things we use it for is ensuring that reads are strongly consistent without needing any coordination between regions, or even within a region, for reads. We might touch on this in more detail later, but the core idea is this: you send a read in a transaction, and that could be a SELECT; it could be an UPDATE, because an update in SQL is a read-modify-write; it could be an INSERT, because inserts on unique keys also require some amount of reading; or any combination of those things. At the time your transaction starts, we look at that local clock from the EC2 TimeSync service and read its current value. Then, throughout the rest of the transaction, every time the query processor running that SQL goes to storage, it says: give me this data I'm looking for, as of that time. We call it tstart.

Speaker 3:

And by saying "as of tstart", combined with the fact that storage keeps track of multiple versions of data and can answer that as-of question with perfect accuracy, what we can do is give that transaction a fully consistent snapshot across the entire data set in the database, with no locking, no latches, no locks, and even though we don't know, and in SQL in general you don't know, which keys are going to be accessed. And this is a big partitioned database, a big partitioned data set. And so you'll go to partition C and say, give me the row for cat as of this time, and you'll get that back. You do some thinking, maybe you run some of your application code, you go back to the client, and then, oh, actually, I also need the row for dog. You go to partition D and say, give me the row for dog as of that tstart. And what you'll be sure of is that the row C for cat and the row D for dog that you pulled up will be at a consistent point in time in the database. One won't be a transaction ahead or a transaction behind, or whatever the case may be. And that's really great for isolation.

Speaker 3:

But because of the use of time and the use of multiversioning at the storage layer, that doesn't require any coordination between shards, or even between replicas in a shard. It doesn't require us to have a single primary replica for a shard, it doesn't require cross-region or cross-AZ coordination, and so there, in an availability zone, between one pair of machines, we can do those reads, do those reads to another shard or partition, pull that data together, and still have that strong snapshot read isolation, which is a very cool property. And then doing that with a clock that's well synchronized to physical time allows us to make sure those responses are also ordered in physical time, which provides this additional property of linearizability, or strong consistency, where you can be sure that you're going to read your most recent writes and you're going to see all of the effects as of a real point in time in the database.
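
A minimal sketch of what that multiversion, as-of read looks like, assuming a toy in-memory version store (this is not DSQL's actual storage format): each key keeps (commit_time, value) versions, and a reader picks the newest version at or before its tstart, so readers never block on concurrent writers.

```python
import bisect
from collections import defaultdict

class MultiVersionStore:
    """Toy multiversion store: key -> [(commit_time, value), ...] sorted."""

    def __init__(self):
        self._versions = defaultdict(list)

    def write(self, key, value, commit_time):
        bisect.insort(self._versions[key], (commit_time, value))

    def read_as_of(self, key, t_start):
        """Return the newest value committed at or before t_start."""
        versions = self._versions[key]
        times = [t for t, _ in versions]
        i = bisect.bisect_right(times, t_start)
        return versions[i - 1][1] if i else None

store = MultiVersionStore()
store.write("cat", "v1", commit_time=5)
store.write("cat", "v2", commit_time=9)
# A transaction with tstart=7 sees the snapshot as of time 7: "v1",
# no matter how many commits land after it starts.
assert store.read_as_of("cat", t_start=7) == "v1"
```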

Speaker 2:

And is most of this happening in a layer you built? There's an underlying database engine behind this, but is this synchronization happening in a software layer abstraction that sits ahead of the database, or separate from the database itself?

Speaker 3:

Yeah, it's kind of separate. So our literal SQL engine, the thing that is running SQL, that is running those transactions, doing query planning and optimization and so on, is Postgres. And we can talk about how we made that choice; I think it was a really great choice. I'm a huge Postgres fan.

Speaker 3:

But then below that, when Postgres says, hey, I need some data to return to the client or to process this query, that's where our custom implementation starts. And so Postgres will say, I need this row as of this time, and that's where we'll have an implementation that goes off and finds the right partition or shard. It figures out which of the SQL operations it can push down to be processed at storage, which is a big latency win. It finds a healthy replica, it finds a nearby replica, maybe it talks to the control plane to say, hey, we need some more replicas of this data. It does that whole thing, and then it gets back to Postgres and says, here are the rows that you need to process this transaction. And then Postgres will do things like the join logic and all of those kinds of things that relational engines are just really great at.

Speaker 2:

Absolutely, because what you're describing sounds like something beyond what any database engine itself has offered. There have been databases that clearly can scale in the cloud, and there are cloud databases offered, mostly using sharding and partitioning, and there are clustered databases as well. But to do this with a serverless offering is truly unique, and it's the first I've seen of it at this level: disaggregation of compute and storage, strong consistency and isolation, and these very sophisticated multi-region setups with high availability. So that's why I was asking; I don't think there's a database engine where you could just pull some knobs and set something up like this yourself.

Speaker 3:

Yeah, I'm not aware of one, although there are some really interesting distributed SQL engines in the world and some really cool technology there. One of the choices that we made as we designed DSQL was to try and disaggregate the database into multiple concerns, and so we tried to separate out the distributed-systems concerns from the core relational concerns. Now, there's some leakage across that boundary. The relational engine can't be entirely oblivious to the fact that it's running in a distributed context, because it has to know some of that to make good query plans and do good query execution. But it doesn't have to know anything about how many replicas there are. Are there replicas? Where are they situated? How do I find them? Have they failed?

Speaker 3:

And so by taking all of those hard distributed-systems concerns and separating them out into a component separate from the database engine, it's allowed us to keep that database engine quite simple, quite stock, and that's useful operationally. But it's also organizationally allowed us to have the folks working on that part of the system be deep database experts rather than deep distributed-systems experts. And that's a really big win, just organizationally and as an engineering team, to be able to have a level of specialization like that. And so what we've tried to do is make that Postgres layer as unconcerned, I wouldn't say oblivious, but unconcerned, with the realities of being in a distributed system as we can.

Speaker 2:

Absolutely. And I'd love to hear about how you approach offering strong consistency and isolation in multi-region setups, because that's also very interesting and unique, and you wrote about this: the use of optimistic concurrency control and multiversion concurrency control. How do those contribute?

Speaker 3:

Yeah. So that's, let's say, a really deep question. But let me get into some of that. Multiversion concurrency control is the first part, and that is this idea that on the storage node we have multiple versions of each piece of data, versions spanning a time window, and so that allows the query processor to go and ask for data as of a particular time. So that's a data structure at the storage layer with multiple pieces of data in it. And the benefit of that, sort of touching on what I said earlier, is that the query processor can form a consistent snapshot of the data in the database without having to do any coordination with other query processors. It doesn't even have to be aware that other query processors exist, it doesn't have to go to a leader for a shard, and so that multiversion concurrency control step on the read side pretty much eliminates the need for distributed coordination in the system. And then we get to the write side, and that's where things get really interesting. I would sort of break that down according to the ACID properties.

Speaker 3:

And so, as you commit a transaction, you need to do two things. One of them is isolation, and that simply means: given a particular isolation level, can this transaction commit while keeping a certain set of rules about the transactions that it ran concurrently with? At the isolation level that we currently support, which is called strong snapshot isolation, the rule is fairly simple: a transaction can commit as long as, given the set of keys that it's updating, the keys that it's writing, nobody else has written those keys between when that transaction started and when it's trying to commit. And so we have another separate piece of the database called the adjudicator. The transaction looks at the set of rows that it wants to commit, goes to the adjudicator, and says: my transaction start time was five, it's now time seven, I want to commit keys A, B, and C. Are there new versions of these keys in the database since I read them? And the adjudicator can say one of two things. It can say: no, there are no new versions, that's fine, you can go ahead. Or: yes, there are new versions of these keys, you need to abort this transaction. And that is all the coordination that is required to meet this strong snapshot isolation level.
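
Here is a toy version of that adjudicator check, assuming the simplest possible bookkeeping (a map from key to its last commit time); the class and method names are illustrative, not DSQL's real interfaces.

```python
class Adjudicator:
    """Toy strong-snapshot-isolation check: a transaction may commit only
    if none of its write-set keys were written after its start time."""

    def __init__(self):
        self.last_commit_time = {}  # key -> most recent commit timestamp

    def try_commit(self, write_set, t_start, t_commit):
        for key in write_set:
            if self.last_commit_time.get(key, -1) > t_start:
                return False  # conflicting concurrent write: abort
        for key in write_set:
            self.last_commit_time[key] = t_commit
        return True

adj = Adjudicator()
# "My start time was five, it's now time seven, I want keys A, B, and C":
assert adj.try_commit({"A", "B", "C"}, t_start=5, t_commit=7)
# A later transaction that started at time 6 conflicts on A (written at 7):
assert not adj.try_commit({"A"}, t_start=6, t_commit=8)
```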

Speaker 3:

The next step is the most important one, and that is atomicity and durability. This is where a commit actually becomes a change to the database. And so what we do, once we've passed those isolation checks, is we package up the set of changes to the database, we choose, based on the isolation checks, a version number for it, a point in time for that transaction to take effect, and we write it to a service called Journal. This is an internal component we've been building at AWS for over a decade. We use it in all kinds of places: in S3, in DynamoDB, in Lambda, in Kinesis. And what Journal provides to the database is two things. It provides durability: in the single-region mode, that means on storage across multiple availability zones, and in the multi-region mode, it means on storage across multiple regions. And it provides atomicity, this core database property of: I'm either going to accept this whole transaction or I'm going to accept none of it. And so once we've handed that off to Journal, we know that we've passed our isolation checks, that the commit is atomic, and that the commit is durable, and we can go back to the client and say: your commit has committed, congratulations. Then, at that point, in parallel, what we're going to do is apply that change to the relevant storage nodes.
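
Putting the two steps together, the commit path as described might be sketched like this. The journal, storage nodes, and transaction object here are duck-typed stand-ins (a plain list works as the journal), not the internal AWS components themselves.

```python
import time

class TransactionAborted(Exception):
    pass

def commit_transaction(txn, adjudicator, journal, storage_nodes):
    """Sketch of the described commit path: adjudicate, journal, ack,
    then apply to storage."""
    t_commit = time.monotonic_ns()
    # 1. Isolation: optimistic check against concurrent writers
    #    (see the Adjudicator sketch above).
    if not adjudicator.try_commit(txn.write_set, txn.t_start, t_commit):
        raise TransactionAborted("write-write conflict since t_start")
    # 2. Atomicity + durability: the whole transaction is one journal
    #    record, durable across AZs (or regions) before anyone is told
    #    it committed.
    journal.append({"commit_time": t_commit, "writes": txn.write_set})
    # 3. The commit is now safe, so acknowledge the client.
    txn.on_commit(t_commit)
    # 4. Apply the change to the relevant storage nodes afterwards,
    #    in parallel with the acknowledgement.
    for node in storage_nodes:
        node.apply(txn.write_set, t_commit)
```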

Speaker 3:

And then, sort of looping back to your question about optimistic concurrency control: that check with the adjudicator is optimistic concurrency control, right?

Speaker 3:

We have allowed the transaction to run in its entirety, to do a bunch of reads, to spool up a bunch of writes, to communicate with the client. We have not coordinated with other transactions up until that point. And then at commit time, we go and check: have we met the rules? And one of two things can happen: yes, or no, your transaction needs to be aborted. Some database academics will notice that this is not a pure optimistic concurrency control scheme, because it's mixed with multiversion concurrency control in that way; the core write path is optimistic. And so we can go into the trade-offs between optimistic approaches and pessimistic approaches, if that would be interesting, but the big thing we get out of this is that it minimizes the communication between regions and AZs in these distributed settings.

Speaker 2:

Yeah, excellent. There are always trade-offs you're going to make in distributed systems, and the most popular framing is the CAP theorem, between consistency, availability, and partition tolerance. But really there are hundreds of thousands of trade-offs that you'll make when you're actually in the weeds of implementing a service, especially at this scale. I personally can't imagine, but based on what you're describing, it's certainly very sophisticated. And so, coming back to your earlier point: you're relying on Postgres as your main query engine, let's call it that, which is great, because so many applications are built on Postgres. There's a lot of development happening there, whether it's building relational database applications, and now there are recent AI-based applications with pgvector, and it has a great open-source community of extensions. We had Gwen Shapira from Nile come on the pod and really dive into just how rich the extension community is too. We really appreciated learning about that.

Speaker 2:

Now, of course, Postgres has its own logging as well. You spoke about journaling, which is a little higher level, but Postgres the database itself, a single instance, will have its own write-ahead log, and it'll have its own buffer cache. It'll have all these components that come with a database off the shelf. So are you also piggybacking on the logging of Postgres, or is that something you've also built into your own layer of abstraction?

Speaker 3:

Yeah. So that whole durability and storage layer of Postgres we aren't using, because we wanted to build in this fairly unique set of distributed properties, which isn't Postgres's core concern. It is designed to be durable to storage on a single machine. There's obviously some replication machinery there too, but we needed to replace all of that, both for scale-out reasons and for fault-tolerance reasons. We wanted to be durable synchronously across multiple AZs or across multiple regions. We wanted to be able to transparently partition data into multiple write partitions and multiple read partitions, and those were things that that layer of Postgres, as great as the properties it offers are, doesn't do. And so we replaced those pieces of the engine.

Speaker 2:

Yeah, wow, that's incredible to hear about, and certainly a very sophisticated design. It's really focused on serverless, but also high availability and scale, which is a truly unique combination, and it makes it easier and less costly for end users to run databases than ever before. So what would you imagine are some of the new types of use cases this architecture can enable that just weren't practical before?

Speaker 3:

Yeah, well, before I answer that, and that's a great question, I did want to say one of the reasons that we chose Postgres is that it has this really great, extensible, modular architecture. For its age and complexity, it's a beautifully architected piece of software, and so it very naturally allows the kind of deep surgery that we've done on it. And that's exciting, and I think it's one of the reasons that Postgres is having this kind of moment across the industry. It's just super popular right now, for good reasons, and one of those reasons is that it provides a huge amount of extensibility and flexibility and lets folks build all kinds of cool things with it and around it in interesting ways. Jumping to your question about use cases: one of the things we're really trying to do here is simplify architectures. If you look at an architecture today based on a traditional relational database, there are often a lot of blocks. You will have your primary database. You'll maybe have some failover databases in multiple data centers. You'll have machinery for detecting failures and managing those failovers. You'll have replication machinery keeping those replicas in sync. Often, because you only have a single consistent primary, you'll have a caching layer on top of your database. You'll have some kind of change data capture or replication mechanism keeping that cache up to date, or you'll have logic in your application keeping that cache up to date. Often you'll have a bunch of plumbing out to infrastructure for doing reporting and analytics. You'll have plumbing to infrastructure for doing backup and restore, point-in-time recovery, audit, and so on. And so these architectures, as you try and take a database into being a reliable, whole, end-to-end application solution, become quite complex.

Speaker 3:

And so what we've been trying to do, both with DSQL and across the whole AWS data family, is really simplify that. With DSQL, what we've said is: hey, this is already multi-AZ, you don't need to worry about those multiple replicas; we'll handle that replication and moving of data if there are failures. You don't have to worry about caching in the same way; we'll scale out for reads and for your read load, we'll keep your reads local within an AZ or within a region and optimize that latency for you. You don't need those blocks in your block diagram. Backups are built in, restore is built in. These pieces are built in.

Speaker 3:

And so what we've really tried to do is take all of those use cases that folks have, all of the things that add complexity to architectures, and really simplify them. And if we've succeeded there, I think there's a cool set of applications we're going to unlock, which I'll talk about in a second, but really it's about simplifying those things. And then, what do we unlock? Well, on new architectures, we're hoping that it becomes much easier and much more cost-effective to run active-active architectures, especially in the multi-region setting.

Speaker 3:

Folks tend to go to active-failover architectures mostly for operational simplicity reasons, mostly because their database doesn't do active-active in a nice way, and those architectures can be hard to reason about. They can be hard to make sure they actually work on the days that you need them. They can be more expensive to run. And so we've really tried to make it as easy as possible to run active-active. We can have both sides of your application running. You know they work, because they're running all the time; both are handling customer traffic all the time. You're able to keep that infrastructure paying its way on both sides; you don't have infrastructure that's just sitting there gathering costs and adding no value for your customers. You can send traffic to the region that's closest to the customer to get better latency. There's a whole bunch of upside, and we're just trying to make those architectures easier to achieve.

Speaker 2:

Yeah, that's incredible. So, based on what you're saying, you'll have an even higher-level API, in addition to what Postgres offers out of the box, to store and query your data.

Speaker 3:

And you were talking about reach and locality and things along those lines; those are just built in, right? You just connect with your Postgres client: you look up the DNS name, you connect from your Postgres client, and all of the locality stuff will be handled. If you're running in EC2, even the AZ and data-center locality will be entirely handled by the infrastructure, without you having to worry about it at all as an application programmer or as a system operator.
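
In practice that might look like the following, a minimal sketch using the standard psycopg2 driver. The endpoint hostname is a placeholder, and the credential handling is simplified: Aurora DSQL authenticates with short-lived IAM-based tokens rather than static passwords, so consult the AWS documentation for the exact token-generation call.

```python
import psycopg2  # ordinary PostgreSQL driver; DSQL speaks the Postgres protocol

conn = psycopg2.connect(
    host="your-cluster-id.dsql.us-east-1.on.aws",  # placeholder DNS name
    port=5432,
    dbname="postgres",
    user="admin",
    password="<generated-auth-token>",  # stand-in for an IAM auth token
    sslmode="require",
)
with conn.cursor() as cur:
    cur.execute("SELECT 1")  # plain SQL over a plain Postgres connection
    print(cur.fetchone())
```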

Speaker 2:

That's absolutely incredible. Developers who have worked with Postgres before, and especially optimized it, might use shared_buffers or pg_prewarm to govern which data stays in the buffer cache. Would those features be abstracted away, or is that also something a native Postgres developer who's experienced with them would be able to access?

Speaker 3:

Yeah, there's a sort of mixed answer to that. As a Postgres developer, you'll still need to think about the way that your queries and schema come together into execution plans in the database. You're still going to want to pay attention to that EXPLAIN and that EXPLAIN ANALYZE: how much work am I asking the database to do on my behalf?

Speaker 3:

Because anyone who's written a lot of database code or operated database systems knows that the difference between the database doing O(1), O(log n), or even O(n²) work for you matters, and it's hard to look at a query and know how much work the database is going to do. That's where EXPLAIN comes in: hey, what are you going to do when I ask you to run this piece of code? And so that remains just as relevant. The stuff that we don't think is going to be as relevant is that preheating of the cache, that tuning of buffer and cache sizes, and so on. We've tried to abstract those away and automate them so the operator doesn't need to worry about them.

Speaker 3:

The knobs still exist, but they are not exposed to the operator or the developer. They're things that are handled in our system.
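
For anyone who hasn't leaned on it before, the EXPLAIN habit Marc describes looks like this, reusing a psycopg2-style connection as in the earlier sketch; the table name here is hypothetical.

```python
with conn.cursor() as cur:
    # EXPLAIN shows the plan without executing: is the engine planning a
    # sequential scan or an index scan, and over roughly how many rows?
    cur.execute("EXPLAIN SELECT * FROM orders WHERE customer_id = 42")
    for (plan_line,) in cur.fetchall():
        print(plan_line)

    # EXPLAIN ANALYZE actually runs the query and reports real timings
    # and row counts, so only use it on statements that are safe to run.
    cur.execute("EXPLAIN ANALYZE SELECT count(*) FROM orders")
    for (plan_line,) in cur.fetchall():
        print(plan_line)
```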

Speaker 2:

Oh wow. And just to clarify, if I can ask from my side: is that with some sort of per-tenant caching design?

Speaker 3:

Yeah, so there is some caching, especially around very frequently accessed tables like the catalog, which has a per-tenant isolated caching design.

Speaker 3:

But most of what we've tried to do is avoid the need for aggressive caching at the query-processor layer by pushing down work to storage. And so the storage interface is not key-value.

Speaker 3:

It's actually quite a rich pushdown interface, where you can push down filters, you can push down projections, you can push down aggregates and those kinds of operations. And what's super valuable about that is it makes the interface between the query processor and storage so much less chatty.

Speaker 3:

If you think about the interface that Postgres itself would have to storage, it's very chatty, it's very low-level. It's like: hey, get me that B-tree page, get me that B-tree page, give me that page. Whereas in DSQL it's a logical interface: get me all of the rows matching this predicate. All of that low-level data-structure wrangling happens locally on the storage node, where it can be done in memory or locally against fast storage, which saves a whole lot of back and forth, and then allows us to scale out that query-processor layer horizontally without the challenge of keeping a large, coherent cache. And cache coherency is a classic computer science problem for a reason, and the reason is that it's really hard to get right, especially in the distributed setting.
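
A toy contrast makes the "less chatty" point concrete. Instead of fetching pages one at a time, the request below carries the predicate, projection, and aggregate to the storage node, which evaluates them locally and ships back only results. All names here are illustrative, not DSQL's real interface.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PushdownScan:
    table: str
    predicate: dict                                 # e.g. {"species": "cat"}
    projection: list = field(default_factory=list)  # columns to return
    aggregate: Optional[str] = None                 # e.g. "count"

class StorageNode:
    def __init__(self, rows):
        self.rows = rows  # this node's local slice of the table

    def execute(self, scan: PushdownScan):
        """Filter, project, and aggregate locally, against memory or fast
        local storage, so only final results cross the network."""
        matched = [r for r in self.rows
                   if all(r.get(c) == v for c, v in scan.predicate.items())]
        if scan.aggregate == "count":
            return len(matched)  # one number travels, not B-tree pages
        return [{c: r[c] for c in scan.projection} for r in matched]

node = StorageNode([{"id": 1, "species": "cat"}, {"id": 2, "species": "dog"}])
print(node.execute(PushdownScan("pets", {"species": "cat"}, projection=["id"])))
# -> [{'id': 1}]
```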

Speaker 2:

Yeah, absolutely, and it's so interesting to hear about the implementation and the design choices you made; it's truly remarkable what this can enable. And we talked about use cases. On a completely separate, tangential point: one of the most popular and revenue-generating AI use cases right now is code generation. Developers can write code super fast, and looking ahead, we're going to have AI agents write more code and accelerate that process even more. One of the things AI is good at is generating procedural code and logic. One of the things you wouldn't want AI to do is provision a bunch of infrastructure for you. So having a service like this, where you can just have a Postgres API or Postgres interface, with the scaling and high availability all abstracted away from the developer, whether that's humans or AI agents, couldn't that be a future software development approach that we see becoming more popular?

Speaker 3:

Yeah, I think that's very astute. In a lot of ways, AI and serverless are just really deeply complementary: the ability to have agents, or human-in-the-loop code generation, generate code and generate applications that then run against compute services, against databases, against storage services that handle all of the operations, without having to push that complexity onto the AI to solve. Those are going to be hugely complementary technologies. Now, there are also going to be a bunch of AI-powered things that run on more classical architectures, and serverless things that are not AI-powered; I think we fill in the whole matrix of options there. But I do think that combination of AI agents and serverless is going to turn out to be a really popular and really productive one for a lot of folks building systems.

Speaker 2:

Yeah, it's really an exciting time to be a software engineer and a data engineer. I know there's some pop science fiction writing out there saying AI is going to replace all engineers. I don't know; if you go look at OpenAI, they're still hiring a lot of engineers, or any AI company, even AWS. So it's certainly very cool to see some of the new, like you said, complementary technology come in, like serverless. It does at least simplify the deployment process for AI, especially if there's a commonly used API that's built, at least foundationally, on Postgres, which most LLMs are already trained on, and that leverages open standards. I want to get your thoughts: you also mentioned that, in terms of use cases, simplifying architectures is the first one. Is there anything else radical that you see being enabled by DSQL?

Speaker 3:

Yeah, that's a really interesting question. I don't have anything particularly radical that I can share; obviously this is something that we think about super deeply. But maybe one hint that I can drop is this: having this journal at the core of the system, having this commit log at the core of the system...

Speaker 3:

...enables us to do some really cool things with the history of transactions and so on that are hard to do in classical systems, and I think we'll be able to build some very cool data features around that over time. And then having this serverless, scalable infrastructure gives us a lot of flexibility in the ways that data is accessed over time. Right now it's Postgres and SQL, but it could be a whole lot more options than that. So, very fun. And to your larger point about the industry: I think this is honestly one of the most exciting times in the data and computing industry, certainly that I can remember. Just the last year in the data space has been extremely exciting. There are so many interesting trends and so many cool new technologies coming along. It's just a really, really cool time to be involved with this stuff.

Speaker 2:

Absolutely. Every day there are new announcements. It seems like we're making these big categorical advancements in technology on a much more frequent basis now, since the start of the AI wave. So it's been very fun to follow along with, and, you know, I have a distributed systems and data background, so especially with the work that you're doing, I've had a lot of fun following along with your blogs. I just love how generous you are with your insights and the work you're doing, because there have been companies that built incredible IP and databases but don't really get into the depth of their implementation as much; you have to go find it. The way you just blog about it is really cool, and I definitely recommend the listeners of this podcast, if you haven't yet, subscribe to Marc. Follow him on X, follow him on Bluesky. I'll let you actually answer: where can people follow along with your work?

Speaker 3:

Yeah, X and Bluesky are great options. You can follow my blog; I tend to put most of my long-form writing there. I've been doing a little bit more on LinkedIn recently, but still not a whole lot there, although there's a really interesting technical community that seems to be building around that platform. And so any of those four options is great and should keep you up to date on the things that I've been doing. And I really appreciate your comments on my blog; great to hear that folks enjoy it.

Speaker 2:

Absolutely, they do. Me and my friends who also work in the industry talk about it, and it's just really interesting to read and see what's going on. It's almost like watching a movie and then watching the behind-the-scenes footage; it's just very captivating and cool to learn about the process and the thought process too.

Speaker 3:

Yeah, that's really cool. Thank you.

Speaker 2:

Marc Brooker, thank you so much for joining this episode of What's New in Data. And thank you to the listeners for tuning in. Really appreciate it, thank you.