The Sourcegraph Podcast

Creating the GitHub of databases, with Sugu Sougoumarane, co-founder and CTO of PlanetScale

January 18, 2022 Beyang and Quinn Season 2 Episode 13

Why is using PlanetScale a mind-altering experience? In this episode, Sugu Sougoumarane, co-founder and CTO of PlanetScale, shares how one email got him a second job interview with Elon Musk, tells the story of how he became one of the elite engineers at PayPal by solving the company’s most painful process, and explains why database administrators are shifting from managing machines to managing fleets of machines. Along the way, Sougoumarane explains why so many developers have told him they’ve felt like they’ve waited their whole lives for self-serve schema deployment.

Show notes & transcript: https://about.sourcegraph.com/podcast/sugu-sougoumarane/

Sourcegraph: about.sourcegraph.com

Beyang Liu:

Hey everyone, welcome back to another edition of the Sourcegraph podcast. I’m here today with Sugu Sougoumarane. He’s the CTO and co-founder of PlanetScale, the serverless database platform. It gives you a horizontally scalable SQL database that’s super developer-friendly and just works. It’s built on top of this really cool open source project called Vitess, which Sugu also created. He started it when he was at YouTube, working on scaling the YouTube backend to serve the billions of users and viewers that it handles today. And chances are, if you’re watching the video form of this podcast, then you’re probably using a lot of the infrastructure that Sugu wrote and created. So Sugu, thanks for being on the show today.

Sugu Sougoumarane:

Thank you. Excited to join you.

Beyang Liu:

Awesome. So I always like to kick things off by asking people… We have a lot of accomplished guests. They’ve done amazing work. But if you go back to the beginning of your programmer life, what was it that initially got you into coding and computers?

Sugu Sougoumarane:

I’m a little older. First, I went to college. I actually took EE. And when I joined college, I did not know what a computer was. We are talking 1981. I’m probably revealing my age there, but I did not know what a computer was. My school, BITS Pilani, did have a computer science degree, but I had no idea what that was. I didn’t even choose that option when I applied, because I didn’t understand. What is this “science” thing? I wanted to be an engineer.

And there was one mandatory programming course that everyone had to do. That was actually enforced only at BITS at that time probably. And that was my first exposure to computers. And when I saw it, it completely blew my mind–the kinds of things you could do. The first language they taught me was Pascal.

Beyang Liu:

Nice. How did you like that?

Sugu Sougoumarane:

Actually, it’s one of the better languages because it makes you organize your thoughts. You don’t just write code. You have to think about what you want to do and how you want to express it. So, that early discipline carried over. I haven’t looked back since I did that course. After that, I was secretly regretting not having taken computer science.

Beyang Liu:

So you graduated as an EE.

Sugu Sougoumarane:

I graduated as an EE, but I took as many optional computer science courses as I could and got myself a computer science job. That’s how it started.

Beyang Liu:

What was that first computer science job that you got?

Sugu Sougoumarane:

The first computer science job was with computer graphics. It was computer graphics, but based on geo. So we were trying to build a geographical information system. Actually, I don’t know if that company still exists. It’s a company called Esri, E-S-R-I.

They were kind of the leaders at that time. They called it GIS, geographical information systems. But the reason why I liked it was because it was essentially computer graphics. You had to actually take maps and render them on the screen. It was a miracle those days. This is 1985. In ‘85, being able to display a map on a computer screen was unbelievably awesome.

Beyang Liu:

Because in those days, most people ran something like DOS.

Sugu Sougoumarane:

Yeah. This was on MS-DOS. C was older than that, but it had just become popular in India. And we were the first ones. There was something called the Glockenspiel. It was only a translator at that time. There were no C compilers. No, this was C++. C++ had the Glockenspiel. But C did have a compiler. So it was in C, but then you directly wrote to the monitor. There were no drivers or anything.

Beyang Liu:

Wow. That’s crazy.

Sugu Sougoumarane:

Yeah. So, it was essentially computer graphics for a few years. And then I went into C++ and other things.

Beyang Liu:

Awesome. And I understand somewhere along the line, you ended up at PayPal somehow.

Sugu Sougoumarane:

Yes. How did I end up at PayPal? Actually, before PayPal, one relevant piece of experience is the time I spent at Informix. This was the early ’90s. That’s how I actually came to the US. I came to the US on a contract job to work at Informix.

And strangely, at that time, I was not directly working on the database. I worked on IDEs. Informix was developing an IDE for 4GL. I think that was the buzzword those days. Fourth-generation language is what they called it. Like, what is beyond C and C++, right? These fourth-generation languages. It was really not the fourth generation.

Beyang Liu:

What generation was it?

Sugu Sougoumarane:

It was more of a lower-level programming language than even C, but where you could embed SQL. It’s basically as if SQL and C were first-class citizens. So there was no need to escape out into a different paradigm.

Beyang Liu:

Got it. So Informix was kind of this database system with this so-called fourth-generation language?

Sugu Sougoumarane:

Fourth-generation language. And I was developing an IDE for that.

Many people may not remember this history, but a big scandal happened where Informix cooked the books and got caught, so the company barely recovered, and finally IBM just bought them out.

Beyang Liu:

Oh no.

Sugu Sougoumarane:

But I had left Informix by then. That was actually in the ’90s when the internet started booming. And I realized that, no, what I’m doing now is not the future. Essentially, for me, coming over to PayPal was a career change. And there’s an interesting story there because I had a run-in with Elon Musk on my way into PayPal.

Beyang Liu:

Yeah, tell us that story. What happened between you and Elon Musk?

Sugu Sougoumarane:

So, this was 2000. Elon had just founded X.com. He had sold off Zip2, which was his previous company, and he was considered one of those up-and-coming entrepreneurs. He wasn’t as popular as he is today. But when I read X.com, if you read the description then, it said that it was an online bank. So, basically, there’s Wells Fargo, there’s Bank of America, and now there’s X.com, which is an internet-first bank.

That’s what it looked like. I said, “Wow. If it’s a bank, I better be formal.” So, I wore a tie and suit.

Beyang Liu:

Oh no. And I’m assuming the dress code was a lot less formal.

Sugu Sougoumarane:

That turned him off instantly. So, I’m sitting there, he walks in and says, “Who the hell are you?” “I’m here for the interview.” “Oi, why are you wearing a suit?”

Beyang Liu:

That’s hilarious.

Sugu Sougoumarane:

So it turned out that this was 2000, where the boom was happening–‘99, 2000. And there were so many “me too” companies. “I’m going to change the world” kind of companies.

Beyang Liu:

Sure. A lot of copycats.

Sugu Sougoumarane:

And I had been interviewing with a bunch of them. And none of them had a real story. A lot of them didn’t even have code written, but they had raised money from VCs because that’s how desperate VCs were to give you money. Anybody that said “I have an idea” got money.

Beyang Liu:

It was the height of the bubble.

Sugu Sougoumarane:

Yeah, it was the height of the bubble. I was completely disillusioned saying that, “My God, this is a disaster. There is not a single company that has anything viable.” So I show up there and I’m kind of low energy because I don’t expect anything from this interview. All previous interviews were disasters and I had just walked out. And this one, I was like, “Oh, okay. Let’s hear it out. Let’s see what you got.” And so that came across to Elon as a guy who has no energy. And he was kind of disappointed. And I was wearing a suit. That’s even worse.

Beyang Liu:

Two strikes.

Sugu Sougoumarane:

Two strikes. And as I’m hearing his story, by the time the interview ended, I was so excited by what he was building. I just couldn’t believe it. I said, “Oh my God, this will change the world.” That’s basically how I ended up leaving. But that’s not how he ended up seeing me. I went through a recruiter. And the recruiter says, “Hey, they’re going to pass on you. They’re not interested. They think you have low energy.” I said, “What? Give me Elon’s email now. I’ll send him an email.”

So I sent him an email. Essentially, paraphrasing what I did, I basically sold him back his own company. I told him why this is going to be huge. And it turns out that, apparently, he was struggling to convince his own employees of the greatness of his company. And so he saw that and he was impressed that I was able to see what he saw. He even sent that email to everyone and told them, “See, there’s a guy outside who believes in what we are doing.” And so he called me back for an interview.

And now, looking back, I can see everybody was smiling. Everybody was passing me by, giving me second looks, and smiling.

Beyang Liu:

You were the guy.

Sugu Sougoumarane:

Yeah, I was that guy.

Beyang Liu:

That’s awesome.

Sugu Sougoumarane:

The rest is history. Then I joined X.com. And then soon after that, X.com and PayPal merged, and that’s how I ended up at PayPal.

Beyang Liu:

Do you remember what it is you said in that email? Because I’m sure everyone who’s listening, who’s ever bombed an interview, is like, “What do you say? What do you say in an email that gets you that second chance with Elon Musk of all people?”

Sugu Sougoumarane:

I think actually I relate to what I said even now, because what I said was, “The most important thing is that I am sold on this vision and I will do anything to be part of it. I don’t care what you offer me. This vision is awesome. This is going to change the world. So I want to be part of this. I don’t care. My low energy is situational. Don’t read into that.” I need to dig up that email. I don’t think I was as eloquent as I am now.

Beyang Liu:

If you dig it up, we’ll link to it in the show notes.

Sugu Sougoumarane:

Yeah, I’ll find out. I’ll see if I can find it. This is literally 21 years ago.

Beyang Liu:

I was just trying to think of email systems back then.

Sugu Sougoumarane:

Yeah, exactly.

Beyang Liu:

But do you even remember what email you used in those days?

Sugu Sougoumarane:

I think it was Yahoo.

Beyang Liu:

Oh, Yahoo. Okay.

Sugu Sougoumarane:

So it may still be there. I should go look for it.

Beyang Liu:

Yeah, fascinating.

Sugu Sougoumarane:

He may not remember me, but I think if I reminded him of the story, he would remember me if I ever met him again. If I told him this.

Beyang Liu:

Yeah, it seems pretty memorable. I mean, a guy comes in and sells your company back to you so well–probably better than he could describe it. He shared it with the entire company. That’s awesome. So, talk about joining X.com and then PayPal. What was that experience like?

Sugu Sougoumarane:

So the X.com and PayPal merger was equal. It was actually a difficult merger because X.com was a Windows shop and PayPal was a Linux/Unix shop. PayPal was even pre-LAMP. PayPal was running Oracle on Linux. And X.com was running Windows with COM. This is pre-C#. Windows with COM and SQL Server.

Beyang Liu:

Okay. Old school.

Sugu Sougoumarane:

Old school. And when the two companies merged, something had to give. We had to choose a platform. And since Elon was the resulting CEO, he said we were choosing Windows.

Beyang Liu:

Elon is a Windows guy, huh?

Sugu Sougoumarane:

Elon was a Windows guy. I don’t know what he is now. But he was definitely a Windows guy then and the entire team was Windows. And the funny part was that they hired me thinking that I’m a Linux guy that has to be trained on Windows. But I was also a Windows guy.

Even though I was at Informix, I was one of the few people at Informix who was a Windows person. Because I was developing an IDE and IDEs existed only on Windows. There was no GUI for…

Beyang Liu:

Microsoft pioneered the IDE.

Sugu Sougoumarane:

Visual Studio.

Beyang Liu:

Which makes total sense.

Sugu Sougoumarane:

Yeah. And Linux didn’t even have a GUI then. It was mostly command line. So there was no chance of an IDE in Linux in ‘99, 2000. I started being productive right from day one because I knew all the things. I knew the entire stack. So all I had to do was learn the code and start writing code. But this tussle continued for a while, and there was eventually a reorg and everything happened, and the final decision was: we are staying with PayPal.

Because what had happened during that time was, paypal.com had gathered momentum. And it would have been hard to retrofit Windows underneath there. So the decision was to actually go back to PayPal and then build on that. So I moved over and kind of had to learn Linux, without telling anyone, and start writing features for the new system. I might have spent about six months churning and writing code that never went to production.

Beyang Liu:

That’s painful. Was there any work that you did back in those days that ended up being useful when you started working on Vitess and PlanetScale?

Sugu Sougoumarane:

Oh yes. So the big thing for me, which was really hard, I don’t know if you know about the PayPal gang there, the founding PayPal members. They were all geniuses from UIUC.

Beyang Liu:

This is Peter Thiel, Elon Musk, and…

Sugu Sougoumarane:

It’s the Max Levchin crowd.

Beyang Liu:

Okay. The Max Levchin crowd. Okay.

Sugu Sougoumarane:

Yeah, Max Levchin is from UIUC. And at UIUC, a bunch of those people were from a school called IMSA, the Illinois Mathematics and Science Academy, which is kind of a school for gifted children or something.

Beyang Liu:

I see. So, super high IQ.

Sugu Sougoumarane:

Super high IQ. I was completely intimidated by the amount of brainpower that was around me. And I had to figure out a way to prove my existence and justify that I’m useful, considering these people used to produce the kinds of features that they were producing. And so what I did was this: there was a problem that everyone hated. And it was actually the most problematic part of PayPal. There’s an interesting story behind this.

I don’t know if you’ve read the old PayPal business model. Originally, when PayPal was founded, the way we raised money was we told them that we will allow people to send money to each other, and we will make money from the float. The fact that people are sending money to each other means that we’ll have some float money, and we’ll make money off of that. So that was the business model that was proposed. You can probably find it if you search for it.

But this was engineers writing code that moved money. They were inserting rows into a transaction table, updating the user balance, doing a commit, and moving on. And obviously–guess what happens–there are bugs in the software. So at some point in time, auditors came and said, “Okay, show us your books.” And so they add up the transactions, they look at the user balances, and there’s a few million dollars missing. And they’re like, “Where’s this money?” And we’re like, “Oh, that’s all in the float. It’s float money. It’s being invested somewhere.”

Beyang Liu:

They didn’t accept that answer.

Sugu Sougoumarane:

That kind of hand-waving doesn’t go well in the financial industry. So, they basically asked us to account for every penny. And so somebody hacked up a tool. But because developers were writing software and there were bugs, that tool used to report discrepancies every day, or every other day. And every time it posted, somebody had to go in and investigate. And in millions of transactions, how do you find out where things went wrong? And so there was a victim engineer who was selected every once in a while as the activity report reported conflicts. The guy would say, “Okay, my life is gone today.”

Beyang Liu:

He had to go through and manually resolve discrepancies?

Sugu Sougoumarane:

Yeah, figure out where the difference came from.

So then I said, “You know what? Give me that tool. I will own this.” It was the most painful thing. Everybody hated it. So I said, “I will own this problem.” And I completely rewrote that tool to the extent that if it reported a discrepancy, it would find out the transaction that caused the discrepancy and the user’s account and how much they have found and say, “This is the culprit that caused the discrepancy.”

Beyang Liu:

Interesting. And it was completely automated?

Sugu Sougoumarane:

Yeah, so it would run every night, look at all the transactions, and look at every user’s balance. And it had to be done efficiently, because you can’t loop through users. So I had to run some smart queries that did diffs in different formats. And then that was used to identify the offending transaction. And so, people loved me for it. They said, “Wow. Sugu solved a real problem.”
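
The nightly check described here can be sketched as a set-based reconciliation: one query compares each stored balance with the sum of that user's transactions, instead of looping over users. This is a minimal illustration, not the actual PayPal tool. The schema (`transactions`, `balances`, amounts in cents), the function names, and the "amount equals the difference" heuristic for finding the culprit are all assumptions made for the example.

```python
import sqlite3


def demo_connection():
    """Build a tiny in-memory ledger (hypothetical schema, amounts in cents)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE transactions (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL,
            amount_cents INTEGER NOT NULL
        );
        CREATE TABLE balances (
            user_id INTEGER PRIMARY KEY,
            balance_cents INTEGER NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO transactions (user_id, amount_cents) VALUES (?, ?)",
        [(1, 1000), (1, -300), (2, 500), (2, 250)],
    )
    # User 2's balance was never updated for the 250-cent transaction (the "bug").
    conn.executemany(
        "INSERT INTO balances (user_id, balance_cents) VALUES (?, ?)",
        [(1, 700), (2, 500)],
    )
    return conn


def find_discrepancies(conn):
    """Return (user_id, stored_balance, ledger_total, diff) for mismatched users.

    A single aggregate join does the comparison for all users at once.
    """
    sql = """
        SELECT b.user_id,
               b.balance_cents,
               COALESCE(t.total, 0) AS ledger_total,
               b.balance_cents - COALESCE(t.total, 0) AS diff
        FROM balances AS b
        LEFT JOIN (
            SELECT user_id, SUM(amount_cents) AS total
            FROM transactions
            GROUP BY user_id
        ) AS t ON t.user_id = b.user_id
        WHERE b.balance_cents <> COALESCE(t.total, 0)
        ORDER BY b.user_id
    """
    return conn.execute(sql).fetchall()


def suspect_transactions(conn, user_id, diff):
    """Assumed heuristic: a transaction whose size matches the gap is a likely culprit."""
    sql = """
        SELECT id, amount_cents
        FROM transactions
        WHERE user_id = ? AND ABS(amount_cents) = ABS(?)
        ORDER BY id
    """
    return conn.execute(sql, (user_id, diff)).fetchall()
```

In this toy data, `find_discrepancies(demo_connection())` flags only user 2, and `suspect_transactions` then points at the single 250-cent transaction. A real tool would also need to handle users that have transactions but no balance row, and gaps caused by more than one bad transaction.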

Beyang Liu:

Sounds like you’re a hero. I mean, every single day when an engineer would be on call to handle these things.

Sugu Sougoumarane:

Otherwise, I was not one of the elite engineers. So after that, they said, “No, Sugu is cool. If there’s a problem, we can count on Sugu to solve it.”

Beyang Liu:

That’s fascinating. So they respected you.

Sugu Sougoumarane:

So I kind of earned the respect. It was very hard to earn that type of respect. You had to show something for it.

Beyang Liu:

So how did all this relate eventually to the work on Vitess?

Sugu Sougoumarane:

After that, when they said “Sugu solves problems,” I kind of went into solving… So, at that point, the biggest challenge for PayPal was always scaling. I became part of that core team that was responsible for scaling PayPal. And this is another story you may not know, we tried to shard PayPal.

Beyang Liu:

You tried to shard it? Like split the databases into pieces?

Sugu Sougoumarane:

Split the database into smaller parts. And that experiment failed, not for technical reasons, but nevertheless, the experiment failed. And we had to roll back.

Beyang Liu:

So all PayPal was literally on a single giant database.

Sugu Sougoumarane:

Yes.

Beyang Liu:

What was the database, just out of curiosity?

Sugu Sougoumarane:

Oracle.

Beyang Liu:

Oh, Oracle. Okay. Oracle. So one giant Oracle database instance.

Sugu Sougoumarane:

It was Oracle, and we had the biggest hardware money could buy.

Beyang Liu:

Which was what in those days? Do you remember?

Sugu Sougoumarane:

It was called the Sun E15K, I believe. It was the biggest machine Sun ever produced. If Sun made a new, more powerful machine, we bought it. So I think we had an E10K before that. And then after that we bought something called the E15K.

Beyang Liu:

It was one box? Literally one box.

Sugu Sougoumarane:

One box. We had two E15Ks because there was a hot standby.

Beyang Liu:

Okay. Got it. Yeah, for fallover.

Sugu Sougoumarane:

So that’s how Oracle ran.

Beyang Liu:

Makes sense.

Sugu Sougoumarane:

The only hint I can give you is, after that outage, we migrated out of E15K into Fujitsu. That happened after I left, but there is a hint in there.

Beyang Liu:

What is the hint?

Sugu Sougoumarane:

There was an operating system/hardware problem combined together that caused the outage. We thought it was the sharding, but it was not the sharding. Nevertheless, we had to roll back the sharding after that happened. I didn’t quite immediately leave PayPal because after that, I went to India to open the PayPal India office. And that was when YouTube was formed here. But I came back and joined YouTube when I decided to come back.

Beyang Liu:

I see. And YouTube was where Vitess was born, right?

Sugu Sougoumarane:

YouTube was where Vitess was born, and it’s not a coincidence that that is in some way the reason why Vitess was born. In some respects, the fact that the sharding experiment in PayPal failed remained with all of us. And we were all motivated. Because we believed that sharding was the way to go, that it was the best way to solve this problem. And, at YouTube, we redid that sharding and made it work. And the night we succeeded, the first sharding, this was pre-Vitess. Vitess was born after our first time sharding. We had a huge celebration. And we were all like, “We stand cleansed. No one can tell us that we don’t know sharding.”

Beyang Liu:

It works.

Sugu Sougoumarane:

It works. Sharding works and here is the proof.

Beyang Liu:

So did you have a bunch of ex-PayPalers from that team?

Sugu Sougoumarane:

Yeah. Most of early YouTube were all ex-PayPalers.

Beyang Liu:

Oh, interesting.

Sugu Sougoumarane:

YouTube was founded by Steve and Chad. They were both PayPal engineers. Steve, Chad, and Jawed, and we all worked together. Jawed actually was sitting right next to me at PayPal. When they started YouTube, I went to India because I wanted to see if I could go and move back there permanently. But then, after a while, we decided to come back. We gave it some time, we tried it, and no, we decided to come back. All kinds of reasons. All personal stuff. And while I was there, Steve and everybody else, all the other engineers at YouTube were saying, “Dude, what the hell are you doing there? We are exploding. You should come and join YouTube. It’s out of control.” Then finally when I decided to come, I said, “Okay, I’m coming.” So then I came back and joined, and soon after that, Google bought YouTube.

Beyang Liu:

So what was the problem that you were working on at YouTube that necessitated scaling out MySQL? What was that? Was it storing video data, or something else, or?

Sugu Sougoumarane:

Oh, no. So, video data was not stored in MySQL. Video data was stored in CDNs.

Beyang Liu:

That makes sense.

Sugu Sougoumarane:

I actually don’t know. I believe we used a CDN service. In other words, I don’t think we stored the data ourselves.

Beyang Liu:

It was just like Akamai or something like that?

Sugu Sougoumarane:

I think it was Akamai. Most likely Akamai because that name does ring a bell. So we stored the video data and Akamai would spread it around the world and take care of that part. But it was metadata. Metadata was growing too fast. We couldn’t keep up.

Beyang Liu:

What does metadata encompass? Is that comments and descriptions and titles?

Sugu Sougoumarane:

Comments, videos, user information. Apparently, the insert rate in the video was so high that it could not keep up with the inserts. Because we could not insert fast enough.

Beyang Liu:

There were so many comments that people were leaving on videos, or?

Sugu Sougoumarane:

So many videos being created.

Beyang Liu:

Oh, wow. Okay. So just the number of videos being created.

Sugu Sougoumarane:

So we’re talking hard disks, right? This is 2005, 2006.

Beyang Liu:

So, spinning disks.

Sugu Sougoumarane:

Spinning disks. You can only insert so fast.

Beyang Liu:

Wow. Okay, so it was too big for even the largest single box money could buy. Too big for one MySQL instance. And so you started working on sharding it. At what point did sharded MySQL become Vitess? Was there a single point where you decided to give a name to it? Or was it even the same project?

Sugu Sougoumarane:

So, we had sharded the MySQL database. We had, I believe, eight shards if I remember correctly. Because eight shards should be more than enough, right?

Beyang Liu:

Eight shards should be enough for anyone.

Sugu Sougoumarane:

I mean, we went so far with one shard. With eight shards, we can scale 8X. Here we are all set for the rest of our lives, is what we thought. And what was happening was that, at eight shards, the shards themselves were okay. We had not reached their capacity. But what had happened was that managing those eight shards became a huge pain for a number of reasons. And there were very frequent outages. A couple of outages a day kind of thing.

Beyang Liu:

Because one shard would go down?

Sugu Sougoumarane:

There was no peace in our organization. There was no peace. Because there was always an outage going on, and every time it was the database. So it was almost like, “Oh, there’s an outage. Let’s start looking at the database because that’s where the problem is.” And most of the time, that was the case. And so the thing that my co-founder, co-creator, and I did–actually, most of that work was done by him. What he did was, he took himself out and wrote a spreadsheet. There was no postmortem process those days.

Beyang Liu:

No Five Whys.

Sugu Sougoumarane:

Yeah, none of that stuff. But he kind of went through that. He made a spreadsheet of every failure that we had ever faced. Every one of those outages. What caused it and what can we do to prevent it? And so he wrote–it was a huge spreadsheet. I may even be able to dig it up again. I should have it somewhere.

Beyang Liu:

Like he manually compiled?

Sugu Sougoumarane:

Yeah, he sat and wrote every outage that he had seen that he remembered. So he wrote it down and asked: what could solve it?

And so, by the end of that spreadsheet, when we looked at that spreadsheet, the goal was to actually do something that would leap ahead of all these problems, that would not only solve the problems of today, but also future problems that we were likely to have. So when we looked at the spreadsheet, it was obvious that we had to build new software to handle all these cases.

Beyang Liu:

Can you give me a sense of what the common pain points were in the spreadsheet?

Sugu Sougoumarane:

There were a bunch of them. The first thing that Vitess did for YouTube was connection pooling. As the system was scaling, the number of front-end servers was scaling. Each one of them had an open connection to MySQL. And beyond about 5,000 connections, MySQL would start to crawl.

Beyang Liu:

How many front-end instances were there?

Sugu Sougoumarane:

We had thousands.

Beyang Liu:

Oh, okay. Thousands. All talking to the same eight shards.

Sugu Sougoumarane:

Yeah. All connected to the eight shards. So we had moved some of those out to replicas, for example. We were managing that. We were managing those connections, but even there, if a master went down, what would happen is these 5,000 connections would instantly try to reconnect and instantly bring down the MySQL, so it was a thundering herd cascading failure. So that was, for example, the first problem that we had solved.

Beyang Liu:

Yeah, I see.

Sugu Sougoumarane:

So we wrote a proxy that can take 5,000 connections, but will only make 20 connections to MySQL.
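
The idea of many client connections sharing a small, fixed set of backend connections can be sketched as a blocking pool. This is a minimal illustration under stated assumptions, not Vitess code: `FakeBackendConnection` is a hypothetical stand-in for a real MySQL connection, and `ConnectionPool` and `run_clients` are names invented for the example.

```python
import queue
import threading


class FakeBackendConnection:
    """Hypothetical stand-in for a MySQL connection; no network is involved."""

    opened = 0  # how many backend connections have ever been created
    _lock = threading.Lock()

    def __init__(self):
        with FakeBackendConnection._lock:
            FakeBackendConnection.opened += 1

    def execute(self, sql):
        return f"result of {sql}"


class ConnectionPool:
    """Opens `size` backend connections up front and lends them out one at a time."""

    def __init__(self, size, connect=FakeBackendConnection):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    def execute(self, sql):
        conn = self._idle.get()  # blocks until a backend connection is free
        try:
            return conn.execute(sql)
        finally:
            self._idle.put(conn)  # always return the connection to the pool


def run_clients(pool, n_clients):
    """Simulate n_clients front ends, each issuing one query through the pool."""
    results = [None] * n_clients

    def client(i):
        results[i] = pool.execute(f"SELECT {i}")

    threads = [threading.Thread(target=client, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With `ConnectionPool(size=20)`, any number of clients can call `run_clients` and the backend still sees only 20 connections. Because the pool holds its backend connections regardless of how many clients connect or reconnect, a burst of client reconnects queues inside the proxy rather than reaching the database.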

Beyang Liu:

Got it. And so was that the beginning of Vitess?

Sugu Sougoumarane:

That was the beginning of Vitess. So that was the first problem that we solved. But then we solved other problems, where if you see a query that did not have a limit clause, we would add one.

Why is this query not limiting the number of rows it’s retrieving? Because what if you end up scanning the entire table?
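
A guard against unbounded result sets can be sketched as a function that caps any SELECT lacking a LIMIT clause. This is a deliberately naive illustration: it assumes the proxy appends a default row cap, the names `cap_rows` and `max_rows` are hypothetical, and a real proxy would parse the SQL rather than pattern-match on text.

```python
import re

# Naive check for an existing LIMIT clause; a real proxy would use a SQL parser.
_LIMIT_RE = re.compile(r"\blimit\s+\d+", re.IGNORECASE)


def cap_rows(sql, max_rows=10000):
    """Append a LIMIT to SELECT statements that do not already have one."""
    stripped = sql.strip().rstrip(";").rstrip()
    if not stripped.lower().startswith("select"):
        return stripped  # only reads are capped in this sketch
    if _LIMIT_RE.search(stripped):
        return stripped  # the query already bounds its result set
    return f"{stripped} LIMIT {max_rows}"
```

For example, `cap_rows("SELECT * FROM videos WHERE user_id = 42")` returns the same query with ` LIMIT 10000` appended, while a query that already carries a limit passes through unchanged.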

Beyang Liu:

And of course, that’s not always top of mind for the application-level development.

Sugu Sougoumarane:

When we are saying “who would upload more than 100 videos?”

Beyang Liu:

Who in their right mind?

Sugu Sougoumarane:

Who in their right mind would upload more than 100 videos? And then you discovered there was one particular case where there was someone who had uploaded 250,000 videos. Only. That’s actually a low number by the way.

Beyang Liu:

Oh, were they actual videos, or was it-

Sugu Sougoumarane:

Yeah. I mean, who cares? There were 250,000 video records.

Beyang Liu:

And they were all rows in the database.

Sugu Sougoumarane:

They were all rows in the database. So if I selected all the videos of this user, I got 250,000 rows back. But that wasn’t the problem. The problem was that some admin that was curating the index page of YouTube decided to feature that user in YouTube’s homepage.

So it’s on the front page. Everyone is getting it.

Beyang Liu:

It’s on the front page.

Sugu Sougoumarane:

Which means that every hit on the front page ended up fetching 250,000 videos.

Beyang Liu:

Got it. Got it.

Sugu Sougoumarane:

So one of the things that we wrote was, well, how do we protect against this? So, the way we did this was again, we had this proxy, right? So this proxy, every time it got a query, it would check if the same query is already executing.

If it is already executing, it won’t send it to MySQL, it’ll just wait for that result.
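
The consolidation described here can be sketched as follows: the first caller for a given query text runs it, and every identical query that arrives while it is still running waits and shares that one result. This is a minimal threading illustration, not Vitess code. `QueryConsolidator`, `_Pending`, and `demo` are hypothetical names, and the backend is a fake function that blocks until released so the behavior is repeatable.

```python
import threading
import time


class _Pending:
    """State shared between the caller running a query and those waiting on it."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class QueryConsolidator:
    def __init__(self, run_query):
        self._run_query = run_query
        self._lock = threading.Lock()
        self._in_flight = {}  # query text -> _Pending
        self.consolidated = 0  # callers that reused an in-flight query

    def execute(self, sql):
        with self._lock:
            pending = self._in_flight.get(sql)
            leader = pending is None
            if leader:
                pending = _Pending()
                self._in_flight[sql] = pending
            else:
                self.consolidated += 1
        if leader:
            try:
                pending.result = self._run_query(sql)
            except Exception as exc:  # waiters should see the same failure
                pending.error = exc
            finally:
                with self._lock:
                    del self._in_flight[sql]  # the next identical query runs afresh
                pending.done.set()
        else:
            pending.done.wait()
        if pending.error is not None:
            raise pending.error
        return pending.result


def demo(n_clients=10):
    """Fire n_clients identical queries; return (backend_calls, results)."""
    calls = []
    release = threading.Event()

    def slow_backend(sql):
        calls.append(sql)
        release.wait()  # pretend the query is expensive
        return ["row-1", "row-2"]

    consolidator = QueryConsolidator(slow_backend)
    results = [None] * n_clients

    def client(i):
        results[i] = consolidator.execute("SELECT * FROM videos WHERE user_id = 42")

    threads = [threading.Thread(target=client, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    # Wait until every other client is parked behind the one running query.
    while consolidator.consolidated < n_clients - 1:
        time.sleep(0.001)
    release.set()
    for t in threads:
        t.join()
    return len(calls), results
```

In `demo`, ten clients issue the same query but the backend is called once, and all ten receive the same rows. Removing the entry from `_in_flight` when the query finishes means results are shared only among concurrent callers; nothing is cached afterwards.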

Beyang Liu:

It’s kind of batching the requests.

Sugu Sougoumarane:

Or consolidating.

Beyang Liu:

Consolidating.

Sugu Sougoumarane:

Yeah. So this was an expensive query. At any given point of time, only one of those can run. If the site got spammed with the same query, it won’t end up spamming the database. At any given point… if that finishes another one will start, but all other thundering herd requests will wait and will just share the result. So stuff like this, we started adding. And at that time, Vitess had only VTTablet. Its only job was to protect MySQL, the single MySQL instance, that was-

Beyang Liu:

What is VTTablet? I’m actually unfamiliar with that.

Sugu Sougoumarane:

So VTTablet was the proxy that we first wrote. That was essentially the Vitess project.

Beyang Liu:

I see. Got it.

Sugu Sougoumarane:

Now, it is a project that has many other components. So when we wrote it, it was not actually meant to manage sharding. It was only meant to manage your cluster.

Later that evolved into, “Hey, is it ever possible to make the system so generic that someone other than YouTube who wanted to shard could use it.” And that was the beginning of what you see as Vitess today.

Beyang Liu:

So, how did you convince Google, the company, that this was worth open-sourcing? I mean, my understanding was, it was originally for YouTube, the internal system specifically. Did you have to fight?

Sugu Sougoumarane:

Not at all. Actually, it was extremely easy and straightforward. I believe it is still that type of company where if you want to start a project, if you’re starting a project and think that this is worthy of being open sourced, the only thing they want to make sure is that you are not exposing any company secrets that are critical to Google’s business. As long as that is not the case, you got approval. Our own decision to open source was a very selfish decision at that point. If we build this system, I don’t want to have to rebuild this if I went somewhere else.

Beyang Liu:

Yeah, of course.

Sugu Sougoumarane:

And there is nothing proprietary. And so we are going to open source this. And so we decided to open source it. So Vitess, actually, I don’t know how many rules of Google we broke by doing what we did, but we created a Subversion repository in our home directory, and checked in all the code into that repository.

Beyang Liu:

Just on your local machine.

Sugu Sougoumarane:

The home directory was a mounted drive. So it was a Google-mounted network drive. It was not checked into Google’s monorepo.

Beyang Liu:

You took it out of the monorepo.

Sugu Sougoumarane:

It was not in the monorepo. It was developed all in our home directory.

Beyang Liu:

Got it.

Sugu Sougoumarane:

Because we had-

Beyang Liu:

Oh, you did that intentionally, because it would be easier to kind of-

Sugu Sougoumarane:

So that it was easier to open source it. Initially, we didn’t want to start off open sourcing.

So we developed it in the directory and we deployed it within YouTube. So our deployment scripts used to read off of our home directory.

Beyang Liu:

And no one called you on that? You were just able to kind of get away with that?

Sugu Sougoumarane:

It was okay. Well, the fact of the matter is we were the callers. So we were the enforcers.

Beyang Liu:

It was all in the family.

Sugu Sougoumarane:

It was all in the family. So it was well-understood and Google’s home directory is pretty secure that way.

Beyang Liu:

Yeah. That’s awesome.

Sugu Sougoumarane:

And even when we went to the Google open source team and told them this, they were not shocked or anything. They said, “Yeah, sure.” It’s no different from you developing some code. So yeah, because it was backed up, it was secured, we were not going to lose that file system. It had all the security, so there was no fear of…

Beyang Liu:

I like that. So, give me a sense of timeline. When did Vitess as a project… when was the beginning of that, and then when was it open sourced for the first time?

Sugu Sougoumarane:

So I believe we started the project in 2010. It went into production in 2011, and we open sourced it in 2012.

Beyang Liu:

2012, okay, got it.

Sugu Sougoumarane:

2012 I think is when we first open sourced it. So initially we had named it Voltron.

Beyang Liu:

Voltron. Is that a Pokémon or-

Sugu Sougoumarane:

It’s a Japanese cartoon. I think it is something where five different pieces come together to form this big giant thing. And the idea was that Vitess is a bunch of pieces that come together too. So that model still makes sense because Vitess is a distributed system where a bunch of components come and work together. So we named it Voltron, and obviously we could not open source with that name.

Beyang Liu:

Because of copyright.

Sugu Sougoumarane:

Copyrights and stuff. So we had to come up with a new name, and the problem was that because we had named it Voltron, all our files and things were named VT something.

Beyang Liu:

Okay, so you had constraints.

Sugu Sougoumarane:

I had constraints. I don’t know if you know Strong Bad. Not many people know-

Beyang Liu:

Oh, Homestar Runner?

Sugu Sougoumarane:

Homestar Runner, yes.

Beyang Liu:

Oh, yeah, come on. Yeah. Yeah. People who were on the internet in the 2000s era, right?

Sugu Sougoumarane:

2000, yes. 2000.

Beyang Liu:

The Flash era.

Sugu Sougoumarane:

We used to watch every release, by the way. We used to wait for it and…

Beyang Liu:

Doing the email and…

Sugu Sougoumarane:

Yeah, yeah. Baleeted, and all that.

Beyang Liu:

Yeah. Oh, God.

Sugu Sougoumarane:

So in Homestar Runner, there is one episode where somebody asks him, “I have this project, I want to give it a cool name. How do I come up with a cool name?” And he replies, “All you have to do is take an already existing name and completely mess up the spelling. Now you have a cool name. Say, limousine, put some Zs in there.”

Beyang Liu:

I actually remember this video.

Sugu Sougoumarane:

So, Vitess is a misspelling of the French word, vitesse, which means speed.

Beyang Liu:

Ah, okay.

Sugu Sougoumarane:

So vitesse in French ends with an E, and Vitess the project has no E at the end.

Beyang Liu:

Got it.

Sugu Sougoumarane:

And then it had VT in it, so it was perfect.

Beyang Liu:

Yeah. That’s awesome. So Vitess started in 2010 and open sourced in 2012. At what point did you start thinking there might be a company that we could build on top of this technology?

Sugu Sougoumarane:

For the longest time, I never intended to start a company, because why?

Beyang Liu:

Yeah. Don’t start a company for the sake of starting a company.

Sugu Sougoumarane:

For the sake of starting a company. I was actually previously approached by VCs, who said, “Oh my God, what are you sitting on? You should start a company.” I said, “No, not interested.” But I guess what the VCs saw–they saw the writing on the wall. They were trying to even convince me and I wasn’t sold, but I guess I also started seeing the writing on the wall.

Beyang Liu:

Which VCs? Do you feel comfortable sharing them? Were there any really good ones in the beginning?

Sugu Sougoumarane:

Yeah. Actually, Amplify Partners was one of them.

Beyang Liu:

Oh, nice. Amplify is awesome. We work with them as well.

Sugu Sougoumarane:

Our initial round was not by them, but they did plant the idea. And we did talk to them. We told them “You told us three years ago and here I am.”

Beyang Liu:

Yeah. Okay. So that got you kind of thinking about the…

Sugu Sougoumarane:

Kind of. It was in the back of my mind.

I was even willing to encourage other people to start a company and say, “If you start a company, I’ll help you.” From within YouTube. I don’t care.

Beyang Liu:

Wow.

Sugu Sougoumarane:

But nobody did. We felt that for YouTube to continue to use Vitess, it had to become a mainstream project in its own right. So that was one motivation.

Beyang Liu:

Yeah. Keep it around.

Sugu Sougoumarane:

Yeah. But then there were other motivations. The other motivation was eventually what is happening is, YouTube was also changing as a company, where the newer and newer talent that was coming in, they were all product-oriented.

Where does YouTube make money? YouTube does not make money by building Vitess.

YouTube makes money by producing videos, publishing them, and stuff. That’s where, naturally, all the effort was going. And at some point in time, the newer management in YouTube was starting to ask, why are we funding this project? But then this project is gaining adoption outside. Everybody is using it. We need this to be funded.

And then there is Spanner, and Google internally was saying, “We don’t know what this Vitess is.”

Beyang Liu:

Just put it all in Spanner.

Sugu Sougoumarane:

Put it all in Spanner. That is what is blessed within Google. You should move to that. So I started talking to people. I talked to Joe Beda and Craig McLuckie, who were the founders of Heptio. So I started talking to them, because they kind of went through the same thing. They created Kubernetes, and ended up leaving Google to start Heptio as a company. And they also founded the CNCF.

Beyang Liu:

Oh, I did not know that. Interesting.

Sugu Sougoumarane:

They had done all that, so we kind of chatted with them. “So Vitess is in this stage where YouTube doesn’t want to spend more money on it. How did you manage Kubernetes?” That’s when they said, “Vitess should join CNCF. It’s the perfect fit.” They kind of encouraged us to do it. So I went back and proposed to YouTube that we should donate Vitess to CNCF. YouTube liked the idea. Said it’s perfect. And the kind of understanding was that after that I will go and start a company on Vitess. That’s kind of how this all happened. So this happened during 2017. And in 2018, we started the company.

Beyang Liu:

So, the company is quite young, even though Vitess has been around.

Sugu Sougoumarane:

Yes. Yes. That was actually what the community was missing. Because Vitess is not a tooling that you try today, you don’t like it, you throw it out tomorrow. It stores your core data, your core infrastructure.

Beyang Liu:

It’s foundational.

Sugu Sougoumarane:

It’s foundational, you cannot migrate out of it for any reason. Once you’re into it, you are in it forever. And so companies were hesitant to adopt Vitess.

Beyang Liu:

There’s no one to pay. They needed someone to pay.

Sugu Sougoumarane:

So when I started the company, people said, “Okay, now we can count on Vitess being here to stay.” So that definitely upped adoption of Vitess.

Beyang Liu:

Tell me about the key selling points of PlanetScale. So the front page says serverless database platform. What does that mean?

Sugu Sougoumarane:

So here is what is changing in the industry. Vitess was in some respects, the first step towards what PlanetScale became. At YouTube, we literally ran tens of thousands of nodes. That’s how big the Vitess deployment was at YouTube.

Beyang Liu:

Wow.

Sugu Sougoumarane:

And there is no DBA team that can manage anything of that size.

Beyang Liu:

It’s just too many nodes.

Sugu Sougoumarane:

Just too many things, too many nodes. And those tens of thousands of nodes were running on Borg.

Beyang Liu:

And Borg is the Google internal predecessor to Kubernetes.

Sugu Sougoumarane:

It’s the predecessor to Kubernetes. It was in the Borg cloud. And the shutdown of a pod is something that is not human-generated. Previous to that, even today, with MySQL deployments, there is a DBA watching over that machine. Right?

Beyang Liu:

Super manual.

Sugu Sougoumarane:

It’s super, super manual.

And when we deployed Vitess in Borg, we had to make it developer-friendly. Make it such that a DBA was not needed to be watching over those nodes, so we had to actually automate a large part of those things away, so that we could deploy things at scale. The DBAs’ role actually changed, such that they would not manage a single instance. They will be doing things for the cluster. I think that trend is now more prominent. They are called DBEs or DBREs–there’s lots of names.

Beyang Liu:

They’re going from minding individual pets to being cattle herders.

Sugu Sougoumarane:

To managing fleets. So that was what Vitess did at YouTube, which became extremely relevant with Kubernetes. But it still had an administrative mindset, where you still had to know a few things.

The way I would describe it and now what Vitess is evolving into is, I kind of see two types of developers in the community. There are the tinkerers and there are the builders. There are application builders. They are kind of non-overlapping. They want to do different types of things. The tinkerers like to install software, configure it, play with it.

Vitess is a tool that relates to that type of person. They like to learn.

Beyang Liu:

Nitty-gritty, get your hands dirty.

Sugu Sougoumarane:

Yeah. How do I configure this relationship in this data?

Beyang Liu:

Diving into Unix internal… yeah.

Sugu Sougoumarane:

Yeah. They’re actually this one level above Unix. It’s more like understanding data, modeling relationships with them.

Doing cool things with the data. Like when you shard… optimizing your sharding algorithm. That’s what Vitess relates to. And then there’s the other category, which is the application builder. The biggest turn-off to an application builder is that they say, “I want something…” and you say, “How much CPU do you need?” You’ve lost them. “How much disk?” “I don’t know.”

Beyang Liu:

I don’t want to care. Just figure it out.

Sugu Sougoumarane:

I don’t know how much I’m going to need. Whatever it takes. Just give me something to get started. Those should be taken away. If the developer comes and says, “I want a database.” You say, “Here’s a database.” So PlanetScale is for this person. The person that says, “I want a database and we shouldn’t ask any more questions.”

Beyang Liu:

Yeah. It should just work.

Sugu Sougoumarane:

You want a database, here’s your database. “How do I connect to it?” “Here’s your connection string. Go.”

And another big problem that PlanetScale solves is the problem that’s been plaguing the database industry, which is managing schema deployment. A large number of developers who like databases, who like SQL–many of them are using key value stores.

Beyang Liu:

They like SQL, but they’re using KV stores for some reason.

Sugu Sougoumarane:

Yeah. They decide to use KV stores, say, “Why did you do it?” “It’s because of schema. I just don’t want to deal with the headache of schemas.”

It’s almost a universal story. But then what the database industry has been saying, “Well, it’s necessary. Schema is necessary.” Yeah, it’s needed.

Beyang Liu:

And you’re talking specifically about schema migrations or conversions, right?

Sugu Sougoumarane:

That’s where the disconnect came, right? Nobody tried to go in and find out why schema is such a big headache.

Beyang Liu:

I mean, it’s almost like a type system, right? It imposes certain constraints, but then this thing can change too.

Sugu Sougoumarane:

Yes. And not only that, it’s not only a type system. It’s a disruptive system. The moment you deploy schema and you make a small mistake, you have an outage that goes into many hours.

Beyang Liu:

Yeah. I’ve been there.

Sugu Sougoumarane:

One small fat finger, and you have an enormous outage of epic proportions. So that’s very scary. You cannot just say, “Here’s a schema change. Boom done.” This danger exists, right? How do companies solve the problem? They install gatekeepers. People to review your change.

Beyang Liu:

Only certain people can directly access the database and you have to go through them.

Sugu Sougoumarane:

Only they can directly access data, and they will review your code. And I have a schema that is ready to be deployed, but this DBA is on vacation, now I have to wait three days for them to come back. Or, they are busy fixing, taking backups, so they are not available. Because a DBA’s job is not to wait. That’s not their priority to review your schema change. So the entire process becomes a huge headache. And so Sam, who is our chief product officer, he kind of developed that system within GitHub. He came, he said, “I know what the developers want, and this is what we’re doing…” So this whole thing was his idea.

And you can see how the developer community is responding to these features. “I’ve been waiting for this all my life,” is what the developers are saying about self-serve, safe schema deployments.

Beyang Liu:

The full power of SQL, but with none of the hassle.

Sugu Sougoumarane:

With none of the hassle. And also, the way we deploy the schema is underlying… I don’t know if you’ve heard of gh-ost, which is the MySQL schema deployer.

Beyang Liu:

No.

Sugu Sougoumarane:

Basically, it allows you to deploy schemas without downtime.

So if you have a billion rows in your table and you want to add a column, MySQL is going to lock that table, and you’re not going to be able to write to it. And gh-ost actually allows you to do the deployment without locking that table. It’s called gh-ost, which stands for GitHub Online Schema Tool. But it’s also gh-ost, because it creates a ghost table for you.

Beyang Liu:

Oh, I see. That’s the trick. It just copies stuff over.

Sugu Sougoumarane:

And basically it performs an atomic switch. And it turns out that Vitess has the same technology, called VReplication, as a core primitive that you can use.
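The ghost-table idea described above (build a copy of the table with the new schema, fill it gradually, then swap names atomically) can be illustrated with a toy sketch. This is not how gh-ost or VReplication are implemented: both work against MySQL and also keep the copy in sync with writes that arrive during the backfill, which this sketch ignores. The table names `users`, `_users_gho`, and `_users_old`, the function names, and the use of SQLite are all assumptions made for illustration.

```python
import sqlite3

BATCH_SIZE = 2  # deliberately tiny so the batching is visible


def add_email_column_online(conn: sqlite3.Connection) -> None:
    """Add an `email` column to `users` via a ghost table.

    Assumes `conn` is in autocommit mode (isolation_level=None) and that
    no other writer touches `users` while this runs (a real tool must
    also replay concurrent changes into the ghost table).
    """
    # 1. Create an empty ghost table that already has the new schema.
    conn.execute(
        "CREATE TABLE _users_gho (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )

    # 2. Backfill in small primary-key-ordered batches, each in its own
    #    short transaction, so the original table is never held for long.
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, name FROM users WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, BATCH_SIZE),
        ).fetchall()
        if not rows:
            break
        conn.execute("BEGIN")
        conn.executemany("INSERT INTO _users_gho (id, name) VALUES (?, ?)", rows)
        conn.execute("COMMIT")
        last_id = rows[-1][0]

    # 3. Cut over: swap the two names inside a single transaction, so
    #    readers see either the old table or the new one, never neither.
    conn.execute("BEGIN")
    conn.execute("ALTER TABLE users RENAME TO _users_old")
    conn.execute("ALTER TABLE _users_gho RENAME TO users")
    conn.execute("COMMIT")


def demo():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("ann",), ("bob",), ("cy",)]
    )
    add_email_column_online(conn)
    cols = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
    rows = conn.execute("SELECT id, name, email FROM users ORDER BY id").fetchall()
    return cols, rows


if __name__ == "__main__":
    print(demo())
```

The old table is kept under `_users_old` rather than dropped, so the swap could be reversed if the new schema turned out to be wrong.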

Beyang Liu:

That’s convenient.

Sugu Sougoumarane:

And the author of gh-ost coincidentally happens to work for PlanetScale, and coincidentally decided to implement that same feature in Vitess. So the same online schema deployment now is available through Vitess.

Beyang Liu:

I feel like every organization or company that operates a database, they over time accumulate a bunch of scripts that automate certain tasks, whether it’s schema deploy, or pulling certain pieces of information. It sounds like you’re just productizing all of that and turning it into an out-of-the-box, non-hacky, non-duct-tape solution.

Sugu Sougoumarane:

Exactly. I mean, the writing is on the wall. GitHub kind of showed this model for source code already as a viable model, right? Where they took away what each enterprise had internally and productized it as a generic uniform method of managing source code. And now enterprises are all moving towards that, right? Instead of having each one implementing their own way of managing source code. We’re kind of doing the same thing for databases.

Beyang Liu:

Why is it that the databases themselves didn’t do this? Or why did it take someone external to the database communities and companies to go and build a system like this?

Sugu Sougoumarane:

Well, I consider myself part of the database community. I feel like I am the database community. I’m part of the database community that is responding. And if you look at Shlomi’s work, these are all people I would say are awesome to interview. Sam, Shlomi.

Beyang Liu:

Okay. Cool.

Sugu Sougoumarane:

You should get them when you can.

Beyang Liu:

Yep. Yep.

Sugu Sougoumarane:

Shlomi is an angel, so you should totally get him. You should see some of his presentations. They’re really awesome. That’s essentially what Shlomi has been doing. That has been his personal mission. His tools are some of the best tools for MySQL. Gh-ost is one. Orchestrator is another one, which actually manages failovers. Previous to Orchestrator, a DBA or a sysadmin had to go in and manually do this. So he automated. He has that empathy, that developer empathy. And he developed many of these things. It was amazing when we talked to him and he said, “You know what? Vitess is putting all this together. I want to do this.”

Beyang Liu:

That seems very serendipitous. And you all just launched recently. And from what I understand, it was a wildly successful launch.

Sugu Sougoumarane:

It was exciting. It was a little more successful than we expected.

Beyang Liu:

A little too successful?

Sugu Sougoumarane:

Yes.

Beyang Liu:

What happened?

Sugu Sougoumarane:

So internally we run on Kubernetes, by the way. We run Vitess on Kubernetes. We are running on EKS. There are limits. There are all kinds of limits that exist, if you’re running in the cloud. And they’re not all explicit. You have to know what they are. You have to find out what they are. When you hit a limit, the error doesn’t always tell you, “You hit this limit, go and change it.” That’s not how things fail. You have to figure it out. You have to reverse engineer from the error.

Beyang Liu:

What sort of limits are we talking about here?

Sugu Sougoumarane:

The obvious ones are machine quota limits. But there are underlying limits of how many EBS volumes you can provision. Right? So that sometimes exhibits as a timeout.

Because we are like many stacks higher. Right? We are EKS, there’s PVCs, there’s PVs. There’s so many layers, and you see the outer layer. So, I think we hit every limit on day one of our launch.

Beyang Liu:

That’s awesome.

Sugu Sougoumarane:

Yeah. It was interesting. Fortunately, we have an awesome team of engineers. They quickly moved to fix them. And believe it or not, credit for this goes to Nick, our VP. He knew that this was possible. I think it had something to do with the fact that he had been at GitHub and he knew… so when we did all the features, we had implemented read-only mode.

Beyang Liu:

Good call.

Sugu Sougoumarane:

And we had to use it.

Beyang Liu:

And everyone else got a “join the waitlist” page?

Sugu Sougoumarane:

Yeah. It said, “Okay, wait ’til we are available again, and we’ll let you sign up.”

Beyang Liu:

That’s great. That’s great. So we’re almost out of time, but as a last parting note, if you were to ask the people listening or watching this, if there’s one thing you want them to do after this episode, what would that be?

Sugu Sougoumarane:

I say, just go try PlanetScale. The way I see it, it’s a mind-altering experience. It changes the way you look at things. The way I see it is once you’ve seen this, you cannot go back to something else. That’s what I feel like.

Another cool thing, for example, is you request a database. And then you have to go have coffee, right? But we will give you a database now. You request a database. When we’re doing our Alpha thing, the person will click the database, request a database, and it says, “Here it is.” He says, “What? Do I have to go look somewhere else, or is this done?” No, it is done. You can use it. So things like that. These are subtle things, but these are things that people value. Those are the kinds of things that we have done with PlanetScale.

Beyang Liu:

I’m definitely going to try that out. I was actually hacking on just a little side project with my brother last weekend. And it got to the point where we needed to stand up some sort of backend data store. And we’re like, “Let’s just install MySQL. That’ll be easy.” And we didn’t get through. We probably spent at least an hour trying to go through the motions, get it all set up right in the way that we had to, and then it was time for dinner. So, if we had only used PlanetScale we might have a working application by now.

Okay. All right. I’m going to check that out. Well, Sugu, thanks so much for taking the time. This was awesome. I feel like we could spend multiple hours. There’s so many things I want to hear about PayPal, YouTube-

Sugu Sougoumarane:

I would say talk to these other people that I mentioned.

Beyang Liu:

Okay, cool.

Sugu Sougoumarane:

They’re definitely worth talking to.

Beyang Liu:

We’ll get them on the phone.

Sugu Sougoumarane:

They’ll give you a very different perspective than mine.

Beyang Liu:

Awesome. Well, thanks so much.

Sugu Sougoumarane:

Cool. Thank you.

This transcript has been lightly edited for clarity and readability.