On Rails

Tom Rossi: Staying as Rails as Possible

Rails Foundation, Robby Russell Season 2 Episode 4

Tom Rossi, co-founder of Higher Pixels and the team behind Buzzsprout, joins Robby to talk about what it really looks like to stay "as Rails as possible", purely out of pragmatism. With over 472,000 podcasts on the platform and a team of fewer than ten people, Tom explains how sticking to vanilla Rails has been the foundation of Buzzsprout's ability to move fast, stay lean, and keep up with a rapidly evolving industry.

In this episode, Tom walks through Buzzsprout's migration from Paperclip to Active Storage (including what broke spectacularly in production), their recent shift from hand-rolled summary tables to ClickHouse for analytics, and how Hotwire made building the Buzzsprout mobile app surprisingly manageable. He also shares how Buzzsprout's real-world scale, including the monkey patching that resulted from it, led directly to contributions back upstream into Rails itself.

  • Higher Pixels: https://www.higherpixels.com
  • Buzzsprout: https://www.buzzsprout.com
  • Buzzsprout Blog: https://www.buzzsprout.com/blog
  • Tick (time tracking): https://www.tickspot.com
  • Donor Tools: https://www.donortools.com
  • StreamCare: https://www.streamcare.com
  • Higher Pixels joins the Rails Foundation: https://rubyonrails.org/2025/2/18/higher-pixels-joins-foundation

On Rails is a podcast focused on real-world technical decision-making, exploring how teams are scaling, architecting, and solving complex challenges with Rails. 

On Rails is brought to you by The Rails Foundation, and hosted by Robby Russell of Planet Argon, a consultancy that helps teams modernize their Ruby on Rails applications.

[00:00:05.07] - Robby
Welcome to On Rails, the podcast where we dig into the technical decisions behind building and maintaining production Ruby on Rails apps. I'm your host, Robby Russell. I run Planet Argon, and for over 20 years we've helped teams maintain and evolve long-lived Rails apps, so I tend to approach these conversations through that lens. On this episode, we're joined by Tom Rossi, co-founder of Buzzsprout. Buzzsprout is one of several SaaS products built and operated under the Higher Pixels umbrella. In our conversation, we talk about what it looks like to stay as Rails as possible as a company scales. We dig into their recent migration from Paperclip to Active Storage, what went sideways in production, and how they adapted Rails to fit their needs. We also touch on analytics at scale, including a recent move to ClickHouse and how Rails helped make that transition quite manageable. We also explore how their infrastructure has evolved over time, along with why Higher Pixels chooses to give back to the Rails ecosystem. Tom joins us from Jacksonville, Florida in the United States. All right, check for your belongings, all aboard. Tom Rossi, welcome to On Rails.

[00:01:09.02] - Tom
Thanks for having me. I'm excited to be on the show.

[00:01:11.22] - Robby
I've been looking forward to this conversation as well. So Tom, I have to ask, what keeps you on Rails?

[00:01:19.01] - Tom
I love, I love Ruby on Rails. It makes life, uh, like I could be focused. At the end of the day, I feel like I'm a pragmatist. Like, I, I didn't do a lot of shopping. I wasn't looking for the best technical stack. I was looking to build a business, a SaaS business on software, and Ruby on Rails just provided all the guardrails I needed to be able to quickly build a business on it. And since then, like, I just, there's no reason for me to shop. So I stay on Rails because it works for me, and I'm very practical. So I've never, I've never chased any of the shiny objects. When, you know, everyone was talking about switching to different, uh, technologies that were out there, I just, I wasn't interested because I was more interested in growing my business.

[00:02:04.06] - Robby
Hmm. Do you feel like there's anything about the wider community within the Ruby on Rails ecosystem that tends to align well with that particular goal you have of building your business, versus maybe geeking out about the technology? Do you feel like you still get to geek out about the technology and get super excited about that? Or do you feel like there's always that, like, well, I'm also kind of running a business here and I want to just get that part done?

[00:02:28.20] - Tom
I think especially when you're getting started, it's so great to inherit opinions from people that are smarter than you. And I had no issue recognizing that people were smarter than me and they've made decisions and this is the way they did it. And, uh, it's funny because even still, I remember the first couple projects I did with Rails where I overrode the defaults. I basically spent time writing configuration because I didn't wanna follow them. And sure enough, those are the things that come back and bite me later.

[00:03:00.17] - Robby
Sure, sure.

[00:03:01.15] - Tom
But for the most part, it provides so many opinions and so many guardrails for me. It makes it easy to get going, to get, you know, out the door with a product.

[00:03:11.04] - Robby
Do you recall approximately what version of Rails you started playing with?

[00:03:16.01] - Tom
Oh, it was pre-1.0.

[00:03:18.22] - Robby
Okay.

[00:03:19.05] - Tom
So, that's good.

[00:03:19.21] - Robby
You're back in the 2004, 2005 range or so?

[00:03:23.00] - Tom
Yeah.

[00:03:23.04] - Robby
Give or take?

[00:03:23.12] - Tom
Yeah, I think it was 2005. It was not long after the infamous Build a Blog video, which is really how I got into it. Actually, it was a designer that I worked with, who now I'm a partner with, but he sent it to me and he was like, look, because we were, we were a big .NET shop before that. And he's like, look, this guy's talking about all the things that we ran into when we were building apps using .NET. Just watch the video and see what you think. Right? So he's a designer, not a programmer at all. And I watched the video and was just blown away. And then started playing with Rails and haven't looked back.

[00:03:58.05] - Robby
So it's similar timeline to me as well. So, but to give our listeners a bit of, uh, context and grounding, could you give us a quick overview of specifically Buzzsprout, but then you also mentioned your business partner, so talk a little bit about Higher Pixels as well.

[00:04:11.06] - Tom
Yeah. So originally we started off when we were doing client services work back in the day, people would hire us, we would build product for them, but then around 2001, we started to pivot into building our own product. We built our first product, it was .NET, and, um, just tasted what it was like to have a SaaS product and loved it. And so I ended up partnering up with, uh, one of the designers I used to work with, Kevin. Kevin came on and I was like, look, let's just, let's just partner up and build another SaaS product. And that's when he said, look, watch this video first. And we built our next product, a product called Tick, which is time tracking for people that track time against budgets. And we built that product on Rails and it was an incredible experience. And then after that, it was just, we just wanted to build more products. So for a long time, it was just the two of us. We had more products than people because we had built the original product in .NET, then we rebuilt it in Rails, and then we had Tick and we launched a product called Donor Tools.

[00:05:11.00] - Tom
We, we have a product called StreamCare, which is in the medical space. So we have all these different products that we built, but it's just kind of, we, we used to refer to it as spinning plates. You know, you're doing a little bit of work over here, you're doing a little bit of work over there. And our big, our big products, our big, um, moneymakers were Tick on time tracking and a product in the medical space called StreamCare. But we had this little product that we built called Buzzsprout that was in the podcasting space, but we built it in 2008. I think we launched it 2009. And, um, it was not, I mean, it, it didn't make a lot of money, but it was super fun to build and we wanted it. We needed it because we had another product that kind of complemented it. And that product ended up just exploding as podcasting started to take off. When we first launched it, not a lot of people were, were podcasting. It was mostly technical people and, um, there's a lot of DJs. Um, but now everybody, everybody podcasts. And so Buzzsprout was positioned really well and being built on Rails, we were able to continue to adapt and make it, add features and functionality and keep up with the industry as it was evolving.

[00:06:20.23] - Tom
As podcast hosting as a, as a, a service was evolving, Buzzsprout was able to evolve with it.

[00:06:25.00] - Robby
Nice. You know, I also work in the, you know, the client services area. And so you were able to make that transition from working with clients. Do you still work at all with any sort of clients at this point anymore? Or you pretty much really just have your portfolio of SaaS platform products and things that you're selling.

[00:06:43.05] - Tom
Right. No offense, but I have no desire to go back to working with clients. Like it was brutal. I mean, you know, it just gets really hard and Um, once you have— well, I'm kind of the client now, so I, and I love it. Like, I get to make the decisions as opposed to having to pitch a client on why they should or shouldn't do something.

[00:07:03.04] - Robby
No, I can appreciate that. Well, I think the, the, the ingredient there is not every— I realize that I'm not necessarily great at selling the product. Um, so I feel like I can build things and we experimented over the years with building several products and trying to figure out how we're gonna bring those to market. And it was that part of it. So you, you or your partners or people you work with have that skillset to know how to bring the thing to market and, and sell it and do that part. But I'm like, oh, I'm actually really good at helping people with their projects. So I realized that that part of distinction, at least with how I think about software development. And so I enjoy the client services, but yeah, it's—

[00:07:37.19] - Tom
it's fun too. Client services. What's fun about client services is there's so much variety. And so you get to go solve, you get to see their eyes light up when you solve a problem, you know, you see it get deployed and this problem that they've had gets fixed. And there is, there is a lot of enjoyment that I miss from that. Occasionally I'll do it more, uh, for like friends and it's not even, I don't even wanna get paid for it. I just wanna come in and help you solve this problem. Um, I'm doing it right now with, uh, a friend of mine where he's, his technical team just, they just hate Rails, but they don't know anything about it. And so I, I've convinced him, I said, let me build it. I could build this thing in probably less than 6 weeks to do all the things that you wanna do. And so anyways, I like doing stuff like that. Yeah. To play with. Um, but not, not as a business. I don't, I don't want my, to have to pay people's salaries based on my ability to sell, you know, large projects. It's fair.

[00:08:30.11] - Robby
It's, it's, I mean, having been doing that for a couple of decades now, I can appreciate that, uh, not wanting to have to deal with that part of it because it's definitely not for everybody. And I don't even know if it's for me, to be honest. But, uh, but I've been doing it and I figure it out. So the ebb and flows of running that.

[00:08:44.08] - Tom
But you do touch on a really good, uh, thing that I learned, which was the, the importance of marketing and selling because we used to think you just build this incredible product and that's all you needed to do. But you still need to— yeah, you still got to— you still have to go get those customers. And so what happens a lot of times is you're like, okay, we're not growing the way that we want to. What should we do? Let's build more features. The problem isn't the features. The problem is the marketing element. And so it's great. My partner Kevin, he's really good on the creative side and coming up with different ways to connect with audiences and be able to get the word out there and stuff like that. And since when we launched our medical product, I took on another partner, Marshall, and Marshall's on the finance side. And so he's really good at helping balance Kevin and I in terms of, you know, we'll put all of our chips on the table all the time and Marshall be like, yeah, maybe we should, you know. So it's good to have partners, partners that you trust.

[00:09:39.13] - Tom
But it's like marriage. Yeah, you probably know the whole thing.

[00:09:42.16] - Robby
Yeah, I got a business partner. It's, it's a whole thing. But also having people on my team that I need to bounce ideas off with, you know, try to play off their strengths, play off my strengths and somehow move things forward and, and still have to deal with the challenges of marketing a software consultancy just as much as it would be a software product. Maybe I need to rethink this. Anyhow, uh, let's talk more about this. So out of curiosity, how many engineers approximately are actively working on the, so let's say particularly the Buzzsprout codebase on a regular basis?

[00:10:10.16] - Tom
So we, right now we only work on Buzzsprout and StreamCare, and they have separate teams. StreamCare, pretty simple. There's only one developer on StreamCare. I occasionally, I'm, I'll pop in and help out on that, but mostly it's, it's one person, Ron. Um, he's part of the, um, Boulder City Ruby group in, in Colorado. Uh, Ron, uh, does all the development for StreamCare. On the Buzzsprout side, we have 2 SREs, 3 programmers, and 2 designers. So pretty small team compared to our competitors.

[00:10:47.18] - Robby
And approximately how many podcasts are you hosting at this point, if you don't mind me asking these questions? At least one.

[00:10:54.09] - Tom
So yeah, yeah. Right now there's about 472,000 podcasts on Buzzsprout, but of those, I would say probably 125,000 of them are active, meaning that people are continuing to upload episodes and do things.

[00:11:11.18] - Robby
Getting past that first. What is it?

[00:11:13.07] - Tom
The—

[00:11:13.17] - Robby
what's the— you might know the number, but how many episodes do podcasts usually not get past? Is it like 5 or 3 or something? One.

[00:11:21.06] - Tom
One. Like, I tried to look at this because I've heard this before. Oh, you know, you got to get over the 7-episode hump and things like that. But I, I looked at our own statistics and I couldn't, I couldn't find anything like that other than you need to launch. There's tons of them that never upload their first episode, and it's hard. Our numbers are hard to look at because we do start with a free plan. And so somebody that signs up for a free plan makes it, eh, you know, are they really into podcasting? Um, by the time they're paying, they are, they've got at least hopefully one episode. Yeah.

[00:11:51.09] - Robby
Yeah. That makes sense. And you could be maybe evaluating different platforms or whatever, just testing something out or like, oh, I got this really excited energy one afternoon. Like, I'm going to start a podcast with my friend, do this thing. And then you're like, I'll get back to that. And then 6 months goes by and you're like, oh yeah, I forgot.

[00:12:06.05] - Tom
January is a huge month for podcasting. It's funny because you wouldn't think it's seasonal, but a lot of people are like, okay, I'm finally going to start that podcast in January, just like they join the gym. They're going to go launch that podcast. And so we see, we see a big uptick in January.

[00:12:21.21] - Robby
Specifically around the number of engineers that you've got. You know, you mentioned 3— is that 3 programmers actively work, a couple of SREs and some other roles around that as well. Has it relatively been about that size for at least the last several years now, or—

[00:12:34.22] - Tom
Yeah.

[00:12:35.11] - Robby
Is it? Okay.

[00:12:35.20] - Tom
Yeah, it's been that size for a long time.

[00:12:39.01] - Robby
And you think there's an SRE?

[00:12:40.05] - Tom
I think the SREs, the SREs are, are relatively new. The SREs, thankfully, was 2019 that, um, we hired our first SRE and we just recognized this was a competency we had to bring in-house. I had to have somebody, I mean, I, I know how to write code. I lo— I love writing Rails code, but understanding the configurations on the servers and being able to scale and things like that and all the tweaking that's required once you get to that kind of size. Um, we were relying a lot on our vendor to do a lot of that work. And at some point we just recognized, man, our whole business is built on this. And if we don't have that competency in-house, we could go down. Thankfully, we hired our first SRE. Then 2020 hit. When the pandemic hit, podcasting exploded because everybody's locked in their house. So it was like you're holding on by your fingernails as it was just exploding. And there's so many people that were signing up. And thankfully, we had an SRE that was able to scale us up quickly. And so we ended up moving into a Kubernetes-type environment in 2020 to be able to scale up with everything else that was happening.

[00:13:48.00] - Robby
Interesting. And just for everybody listening that might not know what SRE means, it's a Site Reliability Engineer, correct?

[00:13:54.00] - Tom
Yeah.

[00:13:54.18] - Robby
Okay. And so, and what are their responsibilities? Are they primarily, like within the context of your organization, are they primarily keep an eye on monitoring metrics, scaling, keeping your— I don't know, are you running this on AWS or where do you run your cluster there?

[00:14:11.23] - Tom
Right.

[00:14:12.03] - Robby
So you're still using Kubernetes as well?

[00:14:14.12] - Tom
Right. We're still on Kubernetes. Um, and so Brian Trawick is our SRE and he is exceptional. And I think it's hard to call him an SRE because he, he'll go do development work too. So he's, he floats back and forth. So he's not just about keeping the server powered on and running, like he'll go in and, and write code. So one of the things that you and I had talked about was our recent shift that we did where we moved, we introduced ClickHouse into our environment. Well, he's not just spinning up the ClickHouse server, he's actually figuring out how do we get the ActiveRecord configuration working and how do we configure our parallel tasks to run. So he's an SRE, but he's kind of an SRE plus. And then we have another SRE who is strictly an SRE. He is basically watching the servers, making sure that memory usage is in line with what it needs to be, that it's scaling up and scaling down whenever it needs to, able to work on jobs and things like that. And so he's more of a traditional SRE, and he's relatively new to the company. I think he's only been with us for maybe a year, year and a half.

[00:15:16.15] - Tom
And he was really to give some relief to the one SRE we had so he could go on vacation. I would, I would always tell Brian, I was like, dude, I want you to go to Italy and not take your laptop. Like, that would be amazing if you did that. Because whenever he goes out, he's always got a little laptop because he's always nervous, you know, he's going to get paged.

[00:15:33.06] - Robby
That's the— I can understand that. There's always been a little bit of— I remember there was someone a long time ago gave me advice like, don't hire for roles until you can kind of afford to hire 1.5 to 2 of them. That way you have some redundancy because you also have that, like, it's really great that people can step up and take on that role, but then also if you can't give them some relief, then that can be, you know, they don't get to enjoy their vacations or, and things like that, or Tom's going to have to try to figure out how to modify the Kubernetes cluster and redeploy that. So that's fun.

[00:16:02.07] - Tom
We would talk about the bus factor, like how many busses would it take to wipe out your company? If it's one bus, you're in trouble.

[00:16:09.06] - Robby
Well, yeah, I think that's, that's, that's not good. Uh, so let's, let's talk a little bit more about Rails in particular. So in our, in one of our prep conversations, you know, you've been, you, you specifically said you described Buzzsprout being as Rails as possible, but not out of purity, but pragmatism. Could you talk about what do you mean by that and what that looks like in day-to-day, say, decision-making?

[00:16:32.22] - Tom
Sure. Yeah. I think from a pragmatist standpoint, it's: I want to help podcasters start podcasting, keep podcasting. For that product, like, that's, that's our mantra. That's what we're, we're trying to accomplish. And so I want to use whatever means possible to deliver on that value prop. And so from a practical standpoint, Rails is really good at that because it doesn't, it doesn't have opinions about podcasting, but it has opinions about things that probably don't matter: what type of columns you should use, uh, how you should name your tables, and, you know, things like that that you can spend so much time debating and working on. For all those decisions, there's tons of opinions that are out there. And so it just helps you to move fast. And Rails is very agile. I can just go change things. You know, recently we just introduced annual plans to Buzzsprout and it was painful, but it wasn't painful from the Rails perspective. It was just painful because it was such a big change, right, to be able to communicate. It was more on, like, the English that we used to communicate what we do, but actually doing it with Rails was so much simpler.

[00:17:39.23] - Robby
Yeah, that resonates. You know, you talk about, say, like defaults. And so what sort of decisions do you think it saves you from having to make? You mentioned like maybe column names, things like that. Like people might be like, oh, that's nice. That doesn't feel like the biggest lift necessarily.

[00:17:55.22] - Tom
Yeah.

[00:17:56.02] - Robby
But I also remember, I remember pre-Rails and I remember being like, there was like, we would have kind of conversations as teams, be like, we're gonna map out our whole database schema and what are we gonna call that? Like people kind of—

[00:18:06.06] - Tom
Yeah. And remember like different naming conventions on our columns and oh, it'd be str as, as a prefix. And then, yeah, I mean, yeah, all, all kinds of stuff. I don't think about any of that anymore.

[00:18:16.06] - Robby
Yeah.

[00:18:16.19] - Tom
But yeah, it's not the heaviest lift. I think, uh, a better example would be Hotwire. So Hotwire has saved us a ton. We were able to come up with what the mobile app for Buzzsprout would look like. And develop it all within 6 months. Like, you couldn't. And then we, and then we built the Android app. The Android app launched, it took us 3 months to build it. We wanted to launch them both at the same time. It took us about 3 months to build Android. Um, just because it was, was already, we had already done all the design work. And so the actual implementation itself wasn't that bad. Well, why? Well, because we're web developers and we're on Rails, and because we're on Rails, we can take advantage of all the things that we know how to do. We can just use Hotwire to connect it up. Those, there's those elements that need to be native. I don't know if you've used the Buzzsprout mobile app, but it feels like a native experience because the areas that need to be native, we can connect in with, with, um, Stimulus and to do different types of native, native things.

[00:19:12.00] - Robby
How was the—

[00:19:12.12] - Tom
that's a great example.

[00:19:13.23] - Robby
No, I think it is a good example. How have you approached, like, payments? Is that all happening pretty much on the web interface right now? Or have you needed to work around anything to deal with, like, Apple or Google's payment purchase?

[00:19:29.13] - Tom
Yeah, we avoided it completely with, um, with both of them by not putting any payment information into the app. So they call it a companion app. We had to, we had to make the argument, um, we had to make the argument. They originally, they approved everything, and when we were ready to launch, they denied it. And we're like, whoa, whoa, whoa, this is, this is, you've already approved it in the past. We've told you it's a companion app there. So somebody is using this that's already put their credit card, they're already paying us, whatever. But we had to fight it, and you never know how long it's going to take. Meanwhile, we've got all this marketing that we've prepared that we're holding off. It, it, it was brutal, but we still haven't added anything. I know there's been some changes, but we don't want to risk getting in trouble.

[00:20:12.09] - Robby
Yeah, go through that process again. Yeah. And, and you think that that argument around the companion app holds? I haven't played that extensively with the app. I think there's some, like, content management type features and stuff like that. Maybe seeing some, some data in there, but you're not recording the episodes in the app or anything at this point, right? No. So I understand.

[00:20:36.03] - Tom
But, but when we get into those type of features, we have this wonderful toolbox to be able to reach, reach into and use.

[00:20:46.11] - Sponsor
What if I told you your model already knows who it is? Introducing Enum Chart. The ancient practice of assigning your records a fixed identity based on a small integer column they'll never truly understand. With Enum Chart, your post doesn't just have a status, it is its status. Are you a draft? Creative, misunderstood, not ready to be seen. Published? Confident, public, living your truth. Archived? You've done the work, now you rest. Simply declare your values in order, and Enum Chart will assign each one a sacred number. 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. A number you must never speak of and never reorder, or your entire database will enter retrograde. Need to change your status? Just call .published! and feel the transformation. Need to ask who you are? .draft? The answer is always boolean. The answer is always certain. Enum Chart. You're not just a row, you're a status. Enum Chart is not responsible for adding a new value in the middle and watching 10,030 records suddenly become archived. Do not gaze upon the integer column directly. If you must use underscore prefix, you were never meant to have two enums in the same model. Accept this.
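
For readers who want the mechanics behind the joke, here is a minimal plain-Ruby sketch (no Rails required) of why reordering positionally declared enum values is risky. The helper `positional_mapping` and the `in_review` status are hypothetical names chosen for illustration; they are not Rails APIs and do not come from the episode.

```ruby
# Hypothetical helper: mimics an enum declared as an ordered list,
# where each name is assigned an integer based on its position.
def positional_mapping(names)
  names.each_with_index.to_h
end

original = positional_mapping(%i[draft published archived])

# A new value inserted in the middle shifts every later integer.
reordered = positional_mapping(%i[draft in_review published archived])

# Rows store only the integer, so a row saved as "published"
# is read back under whatever name now owns that integer.
stored_value = original[:published]
misread_as = reordered.key(stored_value)

# Explicit integers keep existing rows stable when values are added.
explicit = { draft: 0, published: 1, archived: 2, in_review: 3 }
still_reads_as = explicit.key(stored_value)

puts "stored #{stored_value}, positional reads #{misread_as}, explicit reads #{still_reads_as}"
```

In a Rails model, the usual way to get the stable behavior is to pass a hash of names to integers to `enum` instead of an array.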

[00:21:41.09] - Robby
Your schema has spoken. So you mentioned also in our conversations in the past that, you know, you, you lean heavily on the, like the, the Rails and the Ruby ecosystem. So can you tell us a little bit about some of the open source libraries and how it helps carry your platform?

[00:21:58.15] - Tom
Every, everything that we do is built on Rails. And so that's why, that's why we're members of the Rails Foundation. That's why we wanna support what you're doing with your podcast. We wanna support the Rails community because everything that we built is on top of Rails. And, uh, it's kind of like we were talking about before, the difference between being the practical business that uses Rails versus the contributor to the open source community, to writing Rails code. It's very different. You know, it's been challenging. We, we've tried in the last year, year and a half to start contributing to Rails. And it's, it's like you gotta put a different brain in your head. Mm-hmm. Because it's just, you have to think so meta about the work that you're doing. Um, it's very different. So we've, we've been longtime consumers and users of open source code, um, but only recently have started to dabble in, could we actually contribute? Like, could we actually do this?

[00:22:50.23] - Robby
Was there, have you actually started making any contributions to like Rails itself at this point or got any PRs you've sent as a team?

[00:23:00.06] - Tom
Yeah. Yeah. Um, that's exciting. A couple for sure. The most recent one was in Active Storage. And so we think Active Storage is something that Buzzsprout can really help benefit the Rails community, um, because we do so, we have so much storage, we're so concerned about bandwidth, like that's a big cost of goods sold for us. And so because of that, we're very invested in how active storage works. And so that was the, the, the first PR that I kind of worked on in active storage. Just got, I, it took about a year, but it's out.

[00:23:35.18] - Robby
So I feel like we, when we met up at a conference couple, we had maybe a hallway track conversation, maybe it was like Sin City Ruby or something. And I feel like you were, I don't know if you were already through the process or you were working on a migration to Active Storage. What were you using? Was that like Paperclip before that or something? Or what were you, how were you managing that stuff?

[00:23:54.09] - Tom
I think Active Storage is an example of where our lack of participation, like in other areas, like we just went with Rails opinion. So we are always trying to follow Rails opinion or vanilla Rails. So as soon as Active Storage came out (we weren't on Rails Edge, so as soon as it was general release), we're like, okay, it's time to ditch Paperclip and go to Active Storage. And that was the wrong move. Like, it was, it was not, Active Storage was not built for what we use it for. And so that led to a whole series of monkey patching to really get it to do what we needed it to do, which we should have really just held off and waited and helped maybe Active Storage become what we needed it to be. But that was years ago whenever Active Storage rolled out. And so we've just had that monkey patching out there for a while. And so that's why we're like, okay, can we take that monkey patching and actually incorporate it into Rails? You know, push it upstream the way that 37signals, you know, generously has provided for us.

[00:24:55.05] - Robby
Yeah.

[00:24:55.07] - Tom
Can we, can we do something like that?

[00:24:57.09] - Robby
Do you recall what was, what types of things you were needing to monkey patch? Do you, can you remember much of the lower level detail?

[00:25:03.16] - Tom
Oh yeah.

[00:25:03.19] - Robby
What was, oh yeah. And what was, what was so different about how you're approaching things with Paperclip versus Active Storage. And if I recall the timeline, you know, ThoughtBot also kind of said they're going to stop maintaining Paperclip. So everybody kind of needed to figure something out at that point unless you forked Paperclip and try to keep it going long term. So I believe there's still a fork of that that's still being maintained.

[00:25:25.07] - Tom
So this is pretty—

[00:25:26.11] - Robby
but tell us more.

[00:25:28.01] - Tom
Yeah, before Brian Trawick, our SRE. So literally it's just some Rails coders. We don't know much about servers. We're not thinking about that kind of stuff. We just write code. And so we just switch it out. So we deploy it and all of a sudden Buzzsprout goes down and we're like, huh? Like what happened? Why did it go down? We're trying to figure out, we can't figure it out. We roll it back. We're looking at our code. Did we do something wrong? Like what's like, it just crawled to a stop. Like it just, it wasn't responding. And so like any good developer, we just rolled it out again and the site goes down.

[00:26:04.03] - Robby
Try it again.

[00:26:04.14] - Tom
So we roll it back and we reach out to our vendor and we're like, okay, can you give us any clues as to why this is happening? And they're like, well, whatever you're doing, the number of requests that your server is handling is going up exponentially and you don't have the resources available to be able to service that many requests. Well, we start to uncover what's actually happening. When we were using Paperclip, every asset that you link to, they're public assets. So they don't actually touch our server at all. They don't touch the Rails server at all. But with Active Storage, every request...

[00:26:41.07] - Robby
So just for context for everybody, if you were using Paperclip, it would return like an S3 URL, maybe a CloudFront URL, whatever, but it was just pointing to the assets. So it completely bypassed your own Rails server, right? It wasn't part of the equation at that point. It's like, well, here's the URL, go serve it up from your browser. Right?

[00:27:01.05] - Tom
Yes. And so that URL would go directly to the asset. It was still in a Buzzsprout domain. So we had public resources, public assets. You can link to your public assets in such a way that they don't actually tax the server. Well, anyways, when we rolled it out, every RSS feed includes, you know, hundreds of links for the artwork, for the episodes, for the MP3s, everything. And so all of those URLs were all of a sudden just getting slammed. And so the first monkey patch that we had to do was to override that and actually link directly to the asset when it's a public asset like that. But it was just in our RSS feeds and, you know, a couple other places. So that's the one that I can think of that's the most important for us. Then the next wave was when we started to really rely on content delivery networks. So CDNs help. When you're starting off, you don't need to sweat it. You don't need to worry. I mean, you can do caching and you can do CDN caching, you know, server-side caching with a CDN, but you don't have to, right?

[00:28:05.23] - Tom
Like you can get things going. Well, we hit, we hit that place where we're like, okay, we need a CDN. We need something in front of our server because there's so many requests that we don't need to service that doesn't change. We could just use a CDN to do that. And when we implemented CDN again, we had to touch that. We had to monkey patch the Active Storage to include links to go to the CDN rather than redirecting them to the, the S3 assets.
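
A minimal sketch of the idea behind that first monkey patch, in plain Ruby rather than Active Storage itself. The `Blob` struct, the `CDN_HOST` constant, and the `/files/` path are hypothetical names chosen for illustration; they are not Buzzsprout's code or the Rails API. The point is only that a public asset gets a URL that never reaches the app server, while a private one still routes through it.

```ruby
# Toy model of choosing which URL to publish for a stored file.
# Blob, CDN_HOST and the /files/ path are hypothetical names for illustration.
Blob = Struct.new(:key, :filename, :public, keyword_init: true)

CDN_HOST = "https://cdn.example.com"

def asset_url(blob)
  if blob.public
    # Public asset: link straight to the CDN so the app server never sees the request.
    "#{CDN_HOST}/#{blob.key}/#{blob.filename}"
  else
    # Private asset: route through the app, which can check access and then redirect.
    "/files/#{blob.key}/#{blob.filename}"
  end
end

artwork = Blob.new(key: "abc123", filename: "cover.jpg", public: true)
invoice = Blob.new(key: "def456", filename: "invoice.pdf", public: false)

puts asset_url(artwork)
puts asset_url(invoice)
```

With hundreds of links per RSS feed, the difference between the two branches is the difference between zero and hundreds of app-server requests per feed fetch.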

[00:28:29.14] - Robby
That makes sense. I'm curious, I didn't know that. I'm also thinking about when Active Storage came out, in the context of, let's say, something like a Basecamp, where most of those assets are in a project. You have to log in, and then you have access that's very dependent on your role. Whereas you're serving up public assets that are supposed to be, you know, the open web of RSS and podcasts. You don't necessarily even know who's pulling these things down. It's just people on the internet and machines and caches. And that's harder to predict, but it also kind of makes it seem a little simpler. So I think that was always an interesting thing, what it's like when Active Storage...

[00:29:11.21] - Tom
Active Storage was built based on that domain model. It was thinking, hey, these are assets that are going to be behind a password, so they need to be private. We don't want to just drop in links to these assets. But that's not the case for everybody. And certainly for Buzzsprout, the majority of our assets are public assets. And so we want to be able to serve them up, and we want to be able to control serving them up through a CDN and even have different controls over how we do that. Whereas right now, if you look at Active Storage, there's some support for CDNs, but it doesn't make any sense for us, because what it does is download the assets to the web server so that they can then get cached on the CDN. But we need it to cache on the CDN. That'll be the next big push: looking at how we can get Active Storage to work with CDNs. But right now there's a ton of monkey patching that we've done in there.

[00:30:04.23] - Robby
Are you doing anything like image resizing and things like that? In the olden days, if someone would upload an image, maybe a high-res image, we'd have to make like 10 different versions of that and push them to S3 into a bucket of all the different versions. And then, if I recall, when Active Storage came out, it was like, oh, it'll kind of do that dynamically. And there have been other ways you can do that kind of dynamically. So are you doing anything like that when someone uploads an asset, to have different versions, like a thumbnail, so that way you could optimize the images? And where's that happening?

[00:30:35.22] - Tom
Yeah, this is a great question, because this is what the PR that just made it through is about. The reason that we pushed on that was because when somebody uploads, let's just say, episode artwork, so they've uploaded an episode and they want to upload the artwork for that episode. When they upload the artwork, we resize it to 2 or 3 different sizes. We do it immediately because we know they're going to get requested. I can't remember what the option was, but what would happen is it would schedule a job to go create those images. But the problem is, as soon as that artwork is uploaded, it will get requested thousands of times, depending on how popular the podcast is, because that podcast is getting hit all the time. And they're saying, hey, is there anything new? Is there anything new? Oh, there's an episode. Oh, there's artwork. Let me download the artwork. And so what happens is it starts making those requests. Well, then every one of those requests triggers the job to go create the variant, right? And so the original name for the PR was like immediate variants.

[00:31:33.23] - Tom
The idea that we need to create these variants of the art, but we need to do it immediately before it actually begins to serve up the asset. We need to have those variants ready to go because as soon as the, as soon as the variant is, or as soon as the asset is created, it will be requested and it will be requested, you know, thousands of times.
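
The stampede described above can be shown with a toy simulation. This is not Active Storage and does not use its API; `VariantStore` and its methods are hypothetical names, and the model assumes all 1,000 requests arrive before any background job has finished.

```ruby
# Toy simulation, not Active Storage itself: count how many resize jobs get
# enqueued when variants are created lazily versus at upload time.
class VariantStore
  attr_reader :jobs_enqueued

  def initialize
    @ready = {}          # variant name => true once the resized file exists
    @jobs_enqueued = 0
  end

  # Eager path: build every variant before the asset is ever served.
  def create_now(names)
    names.each { |name| @ready[name] = true }
  end

  # Lazy path: a request for a missing variant enqueues a job to build it.
  def request(name)
    return :served if @ready[name]

    @jobs_enqueued += 1
    :enqueued
  end
end

lazy = VariantStore.new
1000.times { lazy.request(:thumb) }   # feed readers arrive before any job finishes

eager = VariantStore.new
eager.create_now(%i[thumb medium])
1000.times { eager.request(:thumb) }

puts "lazy:  #{lazy.jobs_enqueued} jobs enqueued"
puts "eager: #{eager.jobs_enqueued} jobs enqueued"
```

In this model the lazy store enqueues one job per request, while the eager store enqueues none, which is the behavior the "immediate variants" idea is after.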

[00:31:52.01] - Robby
Yeah, that's, that's, that's a, that's a fun problem to kind of deal with. Are you, are there any other kind of fun things you're like, you also have like audio files you're dealing with and are you needing to do much on those a, after they get uploaded and like, are you re— like resizing them or anything? Like, I mean, maybe not recompressing them or, cuz how, cuz not everybody, cuz you can, I know like in different platforms you can have like a, you know, a WAV file versus an MP3 or whatever, an AIF file. And like, is there a standard?

[00:32:21.04] - Tom
We had all that tooling done before Active Storage. You know, Active Storage has some capability to do that. They even talk in the documentation about being able to process audio and video and things like that. But we already have all that tooling done, so we still use our own tooling. What happens is, when somebody uploads an audio file, they can upload anything they want, and then we're automatically going to process it. We're going to turn it into an MP3. We're going to use industry standards for everything, to make it so that they don't have to worry about, how do I prepare my audio to be able to get it out to the world? When we first launched Buzzsprout, I'll never forget, it was probably in the first couple weeks. I did not anticipate it: somebody uploaded a 600-meg WAV file for, you know, a 30-minute podcast. It was a 600-meg WAV, and I just hadn't thought about it. You need a lot of disk space, you need a lot of processor to be able to do that. And so I was trying to figure out again, because I'm not the server guy, why is the server struggling?

[00:33:22.20] - Tom
Well, it was cuz the disk space was used cuz it, it had taken up all the space with this massive WAV file. So from the beginning, we wanted to be able to help people that don't, they're not geeks. They're, they don't understand the difference between an MP3 and a WAV. And so it was proof that we definitely hit that audience, but it does mean that we need to be able to do that post-processing.

[00:33:42.01] - Robby
I'm also thinking back to an era when we had to change, you know, our Apache or NGINX settings to allow it to upload files of a certain size.

[00:33:50.18] - Tom
Yes.

[00:33:52.01] - Robby
What does that world look like now these days? Are you, is that even needing to go through anything like Apache or Nginx at that point? Or like if someone uploads like a pretty large file, what does that kind of look like behind the scenes? Where does Rails come into that? Or do you, have you needed to work with or kind of around Rails for things like that?

[00:34:07.21] - Tom
That works pretty well with Active Storage, because it does a direct upload. Now, we do have some issues with direct upload, but I think that's more of an Amazon issue, because they're uploading directly to Amazon. And so sometimes they'll experience slowdowns. There are a lot of podcasters outside of the US too, and so you hit those edge cases: where is the bucket located that they're uploading to, and all that kind of stuff. But that is actually a really, really nice feature of Active Storage, that they're directly uploading to S3. They're not even touching our server until it's done uploading, and then it's just, you know, a key in your S3 bucket, which makes it so much easier to process. In the old days, like you said, we had to fight the web server to let them upload these massive files.

[00:34:56.13] - Robby
We have to upload it and then we'd have to then move it ourselves. And then, you know, there's a lot of levels of different transportation that involved. And we would try to put like a nice little AJAX interface in front of it and like, here's like little progress and we like, oh, something went wrong. And like, like, is where did it break in the process?

[00:35:12.01] - Tom
And yeah, so much of that we get for free now, right?

[00:35:15.00] - Robby
Yeah, it's, it's, that's so true. You know, I think about, you know, you mentioned those types of projects where you've done some monkey patching and you've made some contributions back to the PRs. Are there any other like open source gems or other aspects within the Rails ecosystem outside of Rails, like Rails provides? I know you try to keep it as vanilla as possible, but where do you tend to gravitate towards? Like Are there some gems you really, really depended on and you've been really appreciative of? And like payment processing, anything like that?

[00:35:44.04] - Tom
I mean, yeah, we use the Stripe gem to be able to interact with Stripe, but I am just a big believer that I want to be in control of my credit card processing and things like that. I want to be able to switch vendors. And so I don't want to be beholden. And we've all seen it, right, where you've had a gem that you depended on and then that gem went belly up. I'm trying to remember, there was actually an FFmpeg one. We do our audio processing with FFmpeg, and there was a gem that we were dependent on that just stopped getting support. And so you get burned like that and you're like, you know what, I'm just gonna do it myself. And so credit card processing, I mean, we launched credit card processing before Stripe was even around. We were using Authorize.net. So we had already built a lot of that functionality for things like subscriptions, and what do you do when the card is declined, and how do you reprocess it?

[00:36:40.03] - Tom
So all that code already lives in a library for us. So we don't have a dependency there. I was trying to think of which gems we would install if we were to start a new project, and it's just not many. Maybe VCR, because VCR records your web interactions so that you can play them back without actually hitting APIs. And that's been great for both Tick and Buzzsprout, to be able to record those interactions and run those tests.

[00:37:09.18] - Robby
You know, you brought up Stripe, and that you started back when you had Authorize.net. And I'm assuming back then, on that timeline, ActiveMerchant was around, there were things like that around, right? I think Tobias from Shopify initially released that, and we contributed some payments to that. But I always think about how easy and alluring it is to just gravitate to, oh, we'll just use what that gem provides, right? And so you're using their SDK, and then you also end up with a situation where, I don't know if this is the case in your code base, but do you have a lot of column names like Stripe ID, you know, that are very specific to a vendor, versus, okay, what's our payment processor ID? Because you're using Stripe behind the scenes right now for handling payments and everything, but there are also plenty of companies out there that work in different markets, and not everybody has a credit card, but they need to accept, let's say, ACH or PayPal, because the customer has a PayPal account and they're somewhere in, you know, the Middle East or something, and they're like, well, I don't have a credit card where I'm at, so I can't integrate.

[00:38:20.14] - Robby
And so like, there's not like a lot of coupling. It's really complicated when you become real, really reliant on being able to take money from people, right? And obviously you want to make that— remove as many barriers as, as you can there. So you said, mentioned that you have written your own code to kind of manage that yourself. Have you needed to bring in other payment gateways, or have you come up with some clever— do you have like a philosophy around that? And maybe I just want to get people thinking about that when they're listening, you know, watching this, like to think about like how easy is it not just to swap vendors, but to maybe add vendors if you need to for different types of things.

[00:38:57.02] - Tom
Yeah, I think that, that's a good way to think of it is I want to be able to swap vendors. I want to, I don't want to be beholden to one. I mean, I, it's hard because Stripe is so good. Their API is so good. They make it so hard.

[00:39:11.06] - Robby
But maybe documentation is so good too.

[00:39:13.14] - Tom
Great documentation. But I mean, they're expensive, and they know they own you, and so they can kind of do what they want. And so it's hard. And so I'm glad that I don't have all my subscriptions through them, so that I can manage that on my own. I could easily switch credit card processors, but what's the reality that I'm going to do that? Now, we launched a product called Donor Tools, and Donor Tools is kind of a fun project for us. It's to help nonprofits basically track donations. And with Donor Tools, I was like, okay, we are not going to use Stripe. I'm going to resist the urge. And so we ended up using WePay. I don't know if you remember WePay, the credit card processor. So we launched it with WePay and it was great, but then Chase bought WePay. And so we're like, oh, this is going to be amazing, because Chase is going to turn it into something incredible. And then like 2 years later, Chase shut it down. But because of that, we ended up having to go from Chase WePay to Stripe.

[00:40:14.12] - Tom
And in the process, it was the latest code we'd ever written. Buzzsprout was written in 2008, 2009, and Donor Tools was written in like 2015 or something. And so the code was so much easier for me to make more abstract. Like you were saying, there were not a lot of columns about Stripe ID, or the columns about Stripe were in a delegated type table for that credit card processor. And delegated types weren't even a thing back then. But it was a great case study for us, to be able to make it so that I could easily swap out different merchants to process payments. But we don't have that in Buzzsprout. In Buzzsprout, it would be painful.
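
One way to picture the vendor-neutral design discussed here is a small adapter sketch in plain Ruby. The class names (`FakeStripeGateway`, `FakeWePayGateway`, `Subscription`) and the shape of the returned hash are hypothetical, and the `charge` bodies are stand-ins rather than real vendor API calls. This is not the delegated-types setup Tom mentions, only the general idea that the subscription code talks to a generic processor.

```ruby
# Hypothetical gateway adapters; the charge bodies are stand-ins, not real vendor API calls.
class FakeStripeGateway
  def charge(cents, token)
    { ok: true, processor: "stripe", reference: "st_#{token}", cents: cents }
  end
end

class FakeWePayGateway
  def charge(cents, token)
    { ok: true, processor: "wepay", reference: "wp_#{token}", cents: cents }
  end
end

# The subscription only knows about a generic gateway, never a vendor-specific column.
class Subscription
  def initialize(gateway:, price_cents:, payment_token:)
    @gateway = gateway
    @price_cents = price_cents
    @payment_token = payment_token
  end

  def renew
    @gateway.charge(@price_cents, @payment_token)
  end
end

sub = Subscription.new(gateway: FakeWePayGateway.new, price_cents: 1200, payment_token: "tok1")
receipt = sub.renew
puts "#{receipt[:processor]} #{receipt[:reference]}"

# Swapping vendors means passing a different adapter; Subscription does not change.
moved = Subscription.new(gateway: FakeStripeGateway.new, price_cents: 1200, payment_token: "tok1")
puts moved.renew[:processor]
```

The renewal and retry logic stays in application code, so changing processors touches one adapter rather than every place a charge is made.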

[00:40:53.05] - Robby
When you go back to and think about the Paperclip to Active Storage migration, was there anything you, you recall needing to do? Did you need to write any one-off scripts to like migrate assets into a different format for that? Or, cause you mentioned like, oh, we swapped it out and we flipped this, we've deployed it and things, but did you need to roll that? And was it, was it complicated to kind of roll it back and still have everything kind of work in the meantime?

[00:41:17.20] - Tom
Oh, well, it was broken. It was broken for a while. Um, the way that we, the way that we did it, and I can't totally remember, but I know that we had millions and millions of assets. I think we duplicated them into a new bucket that we were gonna use for Active Storage so that we could run both. You could run Paperclip or you could run Active Storage. So the only, the only problem was if somebody uploaded something new since you last ran your script or they deleted something since you last ran your script. So there was like this true up that had to, to happen. Yeah. And all of that was done through scripts that you're running from the console.
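
The "true up" between the two buckets amounts to a set difference over object keys. A minimal sketch, assuming the key lists have already been fetched from each bucket; the sample keys and variable names are made up for illustration.

```ruby
require "set"

# Keys present in each bucket; in practice these would come from listing the buckets.
old_bucket = Set["a.mp3", "b.mp3", "c.jpg", "e.mp3"]   # still the source of truth
new_bucket = Set["a.mp3", "b.mp3", "c.jpg", "d.mp3"]   # copy made by the last script run

to_copy   = old_bucket - new_bucket   # uploaded since the last run
to_delete = new_bucket - old_bucket   # deleted from the source since the last run

puts "copy:   #{to_copy.to_a.inspect}"
puts "delete: #{to_delete.to_a.inspect}"
```

Running this repeatedly until both sets are empty is what lets either storage backend be switched on without missing files.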

[00:41:52.03] - Robby
So it wasn't something you could easily do in a phased way, like, all right, we're going to start incrementally doing this with 10% of our customers at a time. It was kind of an all or nothing: it's going to work or it's not going to work. I love that when you can do that and it works out, but sometimes you hit a couple of things. But obviously, in the context of the type of product, people want their podcasts to work and be listened to, and to get their ads served and all the other things. They'll upload new things and don't want to break their publishing schedule. So I'm not saying it's not important. But it's not like you're migrating a bunch of confidential HR details from one system to another and people are like, where's my taxes? You know, there's a little bit more wiggle room.

[00:42:35.07] - Tom
A lot of the requests we're servicing are not from users. I mean, Apple hits our RSS feeds a ridiculous number of times. They'll hit one RSS feed thousands of times in an hour. It doesn't change that often, but they're hitting it thousands of times. So the kind of requests that hit the RSS feed are typically feed readers, and they're just polling, essentially, but they're not respecting the goodness that we get with Rails, right? We can have it so you can do conditional GET requests, but they have to respect the conditional GET request. Otherwise you're downloading the whole RSS feed and parsing it every time. But anyways, all that to say that there's a little bit more flexibility for us, where if we run into an issue with the RSS feed for a little while, users aren't actually being affected by that. Now, the feed readers might be affected by it. I did get an email, I can't remember what we broke, but I got an email from Amazon, because Amazon ingests podcasts, and they're like, hey, did you guys change something?

[00:43:38.10] - Robby
I'm at a— you know, we've, we've worked on some projects for some of our, like, one of our largest clients is Nike and we've done a lot of stuff that's public-facing for them, like, like their news and RSS feeds that feed into like Apple News and things like that. And so we've dealt with a lot of fun Apple-specific, uh, hitting RSS feeds a little too aggressively and having to kind of like massage that in some ways. Uh, I'm also curious, like, have you been able to lean on things like what other, like, it's like, what does the RSS feeds have? Is it ETags or something so that, you know, like if something's changed or do you? Yeah.

[00:44:10.03] - Tom
Yeah. So Rails gives it to you out of the box. You can set your ETags so that it can respond to a conditional request, which basically says, hey, here's the last ETag that I had. If you have a new ETag, send me the response. Otherwise, just send me a not modified, a 304, I think. And it just keeps going. And so some feed readers respect that, and it works great, because then they can just basically hit our CDN and say, has it been modified? And they don't actually have to pull it down. But Apple, they have two different ingestion engines that run, I'm not sure why. And one of them uses it and one of them doesn't.
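
The conditional GET exchange can be sketched in plain Ruby. This is a simplified illustration, not Rails internals: the `serve_feed` helper is a hypothetical name, and deriving the ETag from a SHA-256 of the body is an assumption made for the example. In a Rails controller, helpers such as `fresh_when` and `stale?` handle this comparison.

```ruby
require "digest"

# Returns [status, headers, body] for a feed request, in the spirit of a Rack response.
# if_none_match is the ETag the client sent back from its previous fetch, or nil.
def serve_feed(feed_xml, if_none_match)
  etag = "\"#{Digest::SHA256.hexdigest(feed_xml)}\""
  if if_none_match == etag
    # Nothing changed: answer 304 with an empty body, saving the bandwidth.
    [304, { "ETag" => etag }, ""]
  else
    [200, { "ETag" => etag }, feed_xml]
  end
end

feed = "<rss><channel><title>Example</title></channel></rss>"

first_status, first_headers, _body = serve_feed(feed, nil)
second_status, _headers, second_body = serve_feed(feed, first_headers["ETag"])

puts first_status    # first fetch downloads the full feed
puts second_status   # repeat fetch with the saved ETag gets "not modified"
```

A reader that ignores the `ETag` header always takes the first branch, which is the wasted bandwidth described above.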

[00:44:44.07] - Robby
If anyone works at Apple in these areas, uh, you should listen to this episode and maybe have someone put a ticket in to get that sorted out for us.

[00:44:52.06] - Tom
I've gone back and forth with them about, um, you know, cause Apple is so green and I'm like, I don't think you understand how much energy you're wasting processing these RSS feeds.

[00:45:03.03] - Robby
You know, it's funny. I'm not going to name names, but there is a competitor to Buzzsprout that I use for one of my other podcasts. And one of my biggest gripes is, I tried to build a static site version of it, to add some other features on top of the website for the other podcast that I have, and they don't send an ETag. And so every time I pull, I have to literally fetch every single episode and just see if anything's changed. If I want to go fix some content on 2 episodes, because there's a broken link, I've got to rebuild the whole site again. And I'm like, this is so ridiculous. And they've not...

[00:45:41.18] - Tom
Think about it. I'm not a genius for doing that. It just came with Rails. If you stick with vanilla, you get this great business benefit. I remember having this conversation with Brian, my SRE, and I'm like, what can we do? Because these guys are just hammering the RSS feeds and it's costing us a fortune because of all the bandwidth we're spending, blah, blah, blah. And he's like, oh, you know, 2 lines to add an ETag. And as soon as we deployed it, all of a sudden we saw the 304s, we saw people respecting the ETag and using conditional GETs. So it's a great example of, yeah, okay, somebody smarter than me figured that out.

[00:46:16.19] - Robby
I appreciate that. Thank you, Rails. Yes. So speaking of large volumes of bandwidth and data and collecting and downloading statistics, so I know that you've been downloading statistics. I think you meant maybe in our prior conversation you mentioned maybe since 2014 you've been collecting a lot of data about all these podcasts. I think it was, I think it was maybe you said 2014. So, and I believe you, you, I think you mentioned earlier as well that you've, you've integrated with ClickHouse. So how has that come into play and how has it kind of shifted how you think about collecting and storing data? Like, and maybe for anyone that's listening that's not that familiar with ClickHouse, like, because it's come up on a number of conversations in the last several rounds that I've been part of. And, but what is ClickHouse? Not here to advocate for them or against them necessarily, but like, how does that kind of fit into what you folks are working on over there at Buzzsprout?

[00:47:04.08] - Tom
Sure. So it's funny because, you know, I talk to other SaaS founders and I always tell them, you know, don't, don't solve problems you don't have. And I'm sure I'm stealing that from David or Jason or somebody smarter than me. But it's something that we've always said, you know, don't solve problems you don't have. But then when 2020 hit and Buzzsprout and podcasting really exploded, we hit all those problems that we said, don't solve problems you don't have. And one of the problems that we, that we ran into was scalability on our stats because we have data that's just growing all the time. So we started collecting stats when we launched 2009. But when we first— the only data we were collecting was a single number. How many downloads did this episode get? Right? That was the only number that we were storing. Incremental number. Yeah, incremental numbers. Pretty easy.

[00:47:52.06] - Robby
But is that like in the episodes table itself or were you like have a different table for that or?

[00:47:56.18] - Tom
I can't remember. I think it was a different, I think it was a separate table, but it was, it was a simple piece of information. But now let's fast forward to 2014. So we're like, okay, we're going to, we're going to now offer advanced analytics for our customers. And so now we're going to tell you based on the IP address, we're going to tell you the location, we're going to tell you what kind of podcast player was it, you know, Apple Podcasts, was it Spotify? We're going to tell you all this different information about it. But in order to do that, we need to store that information. So I had a table that was filled with all of this data and then we would just run queries against it. But that table is never going to slow down, like it's never going to stop. You're always going to query the data. And, you know, there's like the concept of hot and cold data, like nobody cares about what your analytics were on your server from a month ago, right? For a 5-minute window, you might care about it over a larger window, but you don't look at the same.

[00:48:52.04] - Tom
But that's not true with podcasting. A podcaster cares how many downloads did they get in 2015 when they launched that episode? You know, they talked about this thing and they want to be able to see those, those numbers all the time. So anyways, that just makes it more complicated because that data has to be fresh. It has to be available to be able to serve it up. And so as it grew, that data got harder and harder to serve.

[00:49:16.11] - Robby
That sounds painful. Do you, you know, like with that amount of volume of data, like, so what were you able to do then? Were you able to entirely move that over into something like ClickHouse or was that like a phase?

[00:49:28.20] - Tom
So the first iteration, I'm literally going to RailsConf, I'm talking to people, I'm like, what do I do? How do I solve this problem? And I'm hoping there's a magic bullet for it. I'm hoping there's something, and I don't get anything. And it's funny, that's why I want to pitch a talk for Rails World on this topic, because I went to so many Rails conferences looking for this topic. But anyways, so this database is growing and I need to be able to make it accessible. So what I did was I came up with a solution where I would create these summary tables. Essentially, every hour it would summarize all the data that had most recently come in. So it would take all the data from the last hour and it would populate tables that showed episodes by date, episodes by location, episodes by user agent. So that way each individual table could just summarize it. And oh, magical, it worked out great, except that those tables continued to grow and grow and grow. And eventually they got to the place where I'm like, I need another solution.
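
A minimal sketch of an hourly roll-up like the one described, using in-memory structs instead of database tables. The `Play` struct, its fields, and `summarize_hour` are assumptions made for illustration; the real summarizer's schema is not given in the conversation.

```ruby
# Each Play stands in for one row of raw download data.
Play = Struct.new(:episode_id, :country, :played_at)

plays = [
  Play.new(1, "US", Time.utc(2024, 5, 1, 10, 5)),
  Play.new(1, "US", Time.utc(2024, 5, 1, 10, 40)),
  Play.new(1, "BR", Time.utc(2024, 5, 1, 10, 59)),
  Play.new(2, "US", Time.utc(2024, 5, 1, 9, 30)),   # outside the 10:00 hour
]

# Summarize only the plays that arrived within one hour window.
def summarize_hour(plays, hour_start)
  hour_end = hour_start + 3600
  window = plays.select { |p| p.played_at >= hour_start && p.played_at < hour_end }
  {
    # One "summary table" per dimension, keyed by [episode, dimension value].
    by_date: window.group_by { |p| [p.episode_id, p.played_at.strftime("%Y-%m-%d")] }
                   .transform_values(&:count),
    by_country: window.group_by { |p| [p.episode_id, p.country] }
                      .transform_values(&:count),
  }
end

summary = summarize_hour(plays, Time.utc(2024, 5, 1, 10))
puts summary[:by_date].inspect
puts summary[:by_country].inspect
```

Each run only reads the newest hour of raw data, but the summary tables it appends to keep growing, which is the limit Tom goes on to describe.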

[00:50:26.19] - Tom
I need something else. Now you can throw more CPU, which got us all the way through 2020, was, you know, just getting into the Amazon ecosystem with Kubernetes. It allowed us to kind of scale a little bit horizontal with our, with our database. But we needed, we needed something. That process to create those summary tables, I mean, at one point it almost took an hour and it takes an hour, like it runs every hour. So if it takes more than an hour, you're in trouble. Yeah. And so we were constantly battling with the, the summary tables and the summarizing process that runs. So That's where ClickHouse comes in. And really, it's more about understanding OLAP databases, and ClickHouse is just the one that we chose. But I, I've always thought about relational databases. I'm a relational database guy, and it just doesn't make sense when you start doing OLAP stuff. Like, in a relational database, you, you want your stuff in different tables, and you want joins and indexes on the joins and things. But with an OLAP database, you think more in terms of columns of data and you just shove it in there and it's super efficient at bringing back columns of data and summarizing columns of data, counting, maximizing, adding all those kind of things.

[00:51:41.17] - Tom
It's just really good at. But you have to understand how to write your queries in such a way to make them performant. So that's what we, that's what we began doing was implementing an OLAP database with ClickHouse and we, we ran it in parallel with our summarization process. Actually, our summarizer is still running today. Because we still have some queries that are still using those summary tables. But in the next, in the next work cycle, the next 6 weeks, we'll get completely off of it. So now we'll be totally on ClickHouse for all of those analytics.
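
To make the row-versus-column distinction concrete, here is a toy illustration in plain Ruby. It says nothing about how ClickHouse is implemented; the sample data and field names are invented, and the only point is that an aggregate over one field needs to read just that field when data is laid out by column.

```ruby
# Row-oriented: one record per play, with every field stored together.
rows = [
  { episode_id: 1, country: "US", app: "Apple Podcasts" },
  { episode_id: 1, country: "BR", app: "Spotify" },
  { episode_id: 2, country: "US", app: "Apple Podcasts" },
]

# Column-oriented: one array per field, with positions lining up across arrays.
columns = {
  episode_id: rows.map { |r| r[:episode_id] },
  country:    rows.map { |r| r[:country] },
  app:        rows.map { |r| r[:app] },
}

# Counting plays per app touches a single column and ignores the others.
plays_by_app = columns[:app].each_with_object(Hash.new(0)) { |app, counts| counts[app] += 1 }

puts plays_by_app.inspect
```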

[00:52:13.19] - Robby
Nice. It's definitely a mental shift, I think, for folks there to think about the relationship, relational databases versus thinking about an OLAP. I'm curious about, you know, and for those listening, because like it's like this common pattern that we get called into doing some consulting stuff where you'll try to pre-optimize. Like, all right, we've got— we have some slow areas. Maybe it's like a dashboard or some pages where there's some reporting and things like that. Like, okay, this is really slow. So there's this very natural progression. We're like, okay, well, how can we speed this up? Well, maybe we could pre-cache this. And so maybe you're putting on like a summary table, or you'll have a different table, or we'll have some cache columns or whatever. There's a lot of different patterns people do. And we'll— so maybe that moves to a background job. And then, but they're— then they're doing it for for everybody, you know, they're doing for all the, all the podcasts or whatever. Like they're going through and cycling, oh, does this start anything from this? So they're having to check and then they're like, oh, like maybe we can do that just for the ones that have uploaded podcasts in the last hour or data from the last hour.

[00:53:08.07] - Robby
So you're not cycling through everything, because that number will just keep growing as you add new customers and more users. And then they're like, well, this isn't working either. This is taking too long. And so they keep fanning out the problem in an interesting way. So we'll come in sometimes, and we've seen people come up with really clever tricks. Like, what about the fact that you're optimizing a lot of data for people that are not even logging in to look at the data, you know? But it's there in case they do show up today. But it's a Sunday at 11 PM and you've got an hourly process recaching all this stuff. I'm like, that's a bunch of CPU time. Yeah, whatever, let the computers do the work, but nobody's ever going to see the data, or very, very few people. And so they're like, all right, well, how do we only optimize it for the people that are showing up? And it's like, well, if they hit the web page to log in, we can see that they're logging in.

[00:53:54.22] - Robby
I've seen all these clever things that people have done, and we've helped implement some of them, maybe overly clever ones. Oh, someone hit the web page, we got a cookie. They haven't even logged in yet, but we're going to fire off a background job and pre-cache things before they hit the dashboard. We can do this in 20 seconds, so let's start doing it as soon as we notice that they're hitting our website. Seemingly like...

[00:54:15.12] - Tom
And I feel like we reached the extent of all of those things. We reached the end of it, and that's when we said, okay, there needs to be a new tool in our tool belt to accomplish what we need to do. That's where OLAP fit in. It was super intimidating, but when you get into it, Rails has made it so, so easy to have multiple databases. Some of the things that it does, I just can't believe the way that it works. It feels so natural to have a second database that you're able to just reach into and pull ActiveRecord objects from, but they're coming from an OLAP database.

[00:54:52.03] - Robby
For anyone listening out there, could you describe how the data gets into the OLAP database? You mentioned using ActiveRecord to fetch it. Is it extracting data from your relational database itself, and you've configured it that way? Or do you have something like ActiveRecord writing to the OLAP database directly, or a bit of both?

[00:55:17.02] - Tom
So this was something that we talked about a lot. Really, a lot of this came together at Tropical Ruby. I went there and I was like, once and for all, I'm going to answer the question. I'm going to figure out what the solution is, and I'm going to ask all these intelligent people, how would you solve the problem that we have? And I kept hearing OLAP, OLAP. But a lot of them fixated immediately on something like Kafka, where you're putting information in and then streaming it into the OLAP database. And maybe that's where we're going to end up eventually. But what we ended up doing was much simpler, because we already ingest the play data from the logs. We're processing logs all the time. We're constantly adding records to our relational database, adding play data like, okay, this episode was played. But there's a lot of Ruby processing that has to happen to make sure that the stat is a real stat: that it wasn't a bot, that it wasn't multiple requests from the same IP address within a certain time period, all these kinds of things.
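
The validation Tom describes could look something like the sketch below. It is plain Ruby and only illustrative: the `Play` struct, the bot pattern, and the 24-hour window are assumptions, since the episode does not say which rules or time period Buzzsprout actually uses.

```ruby
# Hypothetical play record parsed from a log line (not Buzzsprout's schema).
Play = Struct.new(:episode_id, :ip, :user_agent, :played_at, keyword_init: true)

# Assumed, illustrative bot markers; a real list would be much longer.
BOT_PATTERN = /bot|crawler|spider/i

# Assumed window in seconds: repeat requests from one IP for the same
# episode inside this window count as a single play.
DEDUPE_WINDOW = 24 * 60 * 60

# Returns only the plays that should count as real stats.
def valid_plays(plays)
  last_counted = {} # [episode_id, ip] => time of the last counted play

  plays.sort_by(&:played_at).select do |play|
    # Drop anything whose user agent looks like a bot.
    next false if play.user_agent.to_s.match?(BOT_PATTERN)

    key = [play.episode_id, play.ip]
    previous = last_counted[key]
    # Drop repeats from the same IP for the same episode inside the window.
    next false if previous && (play.played_at - previous) < DEDUPE_WINDOW

    last_counted[key] = play.played_at
    true
  end
end
```

Only the rows that survive this step would be written to the relational table.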

[00:56:20.05] - Tom
That's all Ruby processing. So we do all that Ruby processing and then we put the row into the table. Then we have that job that right now does the summaries, summarizing into summary tables. Instead, it just puts together a massive, massive update and then, boop, pushes it into the OLAP database, and that's it. So it inserts like 50,000 rows an hour and just drops them into the OLAP database. And the OLAP database is very performant. It can handle that request while still handling all the select requests that are coming in, so it's still giving reporting data while it's updating. That process has worked out really well for us. I'm not saying that we won't go to something like Kafka, some type of pipeline process, but for right now, it's working really well for us.
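
As a rough sketch of that batch push, assuming nothing about Buzzsprout's actual job: validated rows are grouped into large batches and each batch goes to the analytics store in a single call. The `writer` object and the batch size are hypothetical. In a Rails app the writer might wrap a bulk insert such as Active Record's `insert_all` on a model connected to the analytics database, if the adapter in use supports it.

```ruby
# Hypothetical batch size; the episode only mentions roughly 50,000 rows an hour.
BATCH_SIZE = 10_000

# Pushes rows to the analytics store in large batches.
# `writer` is any object that responds to #call(rows); it stands in for
# whatever performs the actual bulk insert.
# Returns the number of batches sent.
def push_in_batches(rows, writer, batch_size: BATCH_SIZE)
  batches = 0
  rows.each_slice(batch_size) do |batch|
    writer.call(batch) # one bulk insert per batch, not one insert per row
    batches += 1
  end
  batches
end
```

The design choice here is that the OLAP store sees a few large inserts rather than many small ones, which matches the "massive update" Tom describes.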

[00:57:04.19] - Robby
That's interesting. I was also talking to someone else recently, and they mentioned that the strategy they took as they scaled up was to use something like BigQuery, where you have something else sit on top of your production database, mirroring that data, and it's way more performant. So basically the queries to pull that data out are way more efficient because of however that magical beast of a database works. But there are all these different strategies that people can have. So for anyone listening, if you've got interesting ideas or solutions for how you navigate these things, definitely reach out to us and let us know, because I think we're always curious to keep iterating on our...

[00:57:44.08] - Tom
And don't be intimidated by multiple databases, given the work that the open source community has done for Rails. And I think specifically, wasn't it GitHub who pushed multiple databases into Rails? All that work.

[00:57:59.09] - Robby
I think that was Shopify.

[00:58:01.22] - Tom
Shopify. Yeah. But they saved us. They took all that complexity and compressed it to a point where you can do multiple databases very, very simply. It was like 2 lines of configuration to get it connected, and it felt really natural from then on. So whether you're using BigQuery, or ClickHouse, or some variation of all these others (I'm trying to remember some of them... Pinot), there are all these different databases out there that are essentially OLAP databases, designed to do the kind of thing that we're talking about.
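
For readers who have not used the feature, here is a minimal sketch of what a second database can look like in a Rails app. It is a configuration fragment, not Buzzsprout's setup: the database names, the `AnalyticsRecord` and `PlayEvent` classes, and the `clickhouse` adapter name (which assumes a ClickHouse adapter gem is installed) are all assumptions for illustration. `connects_to` and the three-tier `database.yml` layout are standard Rails.

```ruby
# config/database.yml, three-tier form (all names below are placeholders):
#
#   production:
#     primary:
#       adapter: postgresql    # placeholder for the existing relational database
#       database: app_production
#     analytics:
#       adapter: clickhouse    # assumption: provided by a ClickHouse adapter gem
#       database: analytics_production

# app/models/analytics_record.rb
# Abstract base class; every model that inherits from it uses the
# `analytics` connection instead of the primary one.
class AnalyticsRecord < ApplicationRecord
  self.abstract_class = true

  connects_to database: { writing: :analytics, reading: :analytics }
end

# app/models/play_event.rb
# Hypothetical model backed by a table in the OLAP database.
class PlayEvent < AnalyticsRecord
end
```

With something like this in place, a call such as `PlayEvent.where(episode_id: 1).count` would run against the analytics database while the rest of the app keeps using the primary one.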

[00:58:33.17] - Robby
Yeah. It sounds almost like it's a boring change to make, in some ways. And that's maybe a good thing. You know, it's not...

[00:58:39.08] - Tom
I wish. It's not boring yet. It's not boring yet.

[00:58:44.03] - Robby
But, well, a 2-line change, or a coding change like that, it's not intimidating.

[00:58:48.20] - Tom
It was 2 lines to configure the database, but you had to rewrite all your queries. You had to rethink how you would write them, and you're doing things that don't make sense from a relational database standpoint. I'll give you a perfect example. In a relational database, you're trying to normalize the data, right? You're trying not to repeat yourself, so you do joins to other tables. But in OLAP, it's one big table. So there's a column that is the user agent, but then there's another column right next to it, which is the podcast app. Well, the podcast app is just a function of the user agent. In a relational database, that would be in a separate table. But in an OLAP database, you want to put them right next to each other. You want to denormalize the data so that you can run these incredible queries. I mean, literally, there are queries that we're running now that we couldn't have run based on summary tables. They just weren't performant. Locations are denormalized too.

[00:59:40.20] - Tom
There's a location ID, but then there's also city, state, country, you know.
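
To make the "one big table" idea concrete, here is a small plain-Ruby sketch of building a denormalized row. The column names follow the ones Tom mentions (user agent, podcast app, location ID, city, state, country); the lookup patterns and hash shapes are hypothetical and far simpler than real user-agent parsing.

```ruby
# Illustrative patterns only; real user-agent detection is much more involved.
APP_PATTERNS = {
  /Spotify/i => "Spotify",
  /Overcast/i => "Overcast"
}.freeze

# Derives the podcast app name from a user agent string.
def podcast_app_for(user_agent)
  APP_PATTERNS.each { |pattern, name| return name if user_agent.match?(pattern) }
  "Unknown"
end

# Builds one wide row for the OLAP table. Values that a relational schema
# would keep in separate tables (podcast app, city, state, country) are
# copied onto every row so queries need no joins.
def denormalized_row(play, location)
  {
    episode_id: play[:episode_id],
    played_at: play[:played_at],
    user_agent: play[:user_agent],
    podcast_app: podcast_app_for(play[:user_agent]), # derived, but stored anyway
    location_id: location[:id],
    city: location[:city],
    state: location[:state],
    country: location[:country]
  }
end
```

The trade-off is repeated data on every row in exchange for queries that can filter and group on any of these columns directly.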

[00:59:44.16] - Robby
You know, something you touched on that I don't feel like I hear about often is that you're comfortable staying small even as Buzzsprout is growing as a business. So how do you feel like Rails helps support that decision?

[00:59:59.12] - Tom
Yeah, I would say I'm more than comfortable. I desire to stay small, right? I enjoy the small business. I enjoy what we get to do. And Rails makes that possible because you can be so productive. And now you throw AI into it, and that's a multiplier on top of a multiplier. So I feel like it's cheating, because we're able to do so much with such a small team. We talk to competitors that are much larger than us in terms of number of people. We have more podcasts than they might have, but they have more people. And they don't believe us. Like, no, there's no way you do that. StreamCare, our medical product, has one developer working on it. Competitors in that space are like, no way. They have hundreds of people, you know? So I think it's Rails. I think it's a superpower.

[01:00:57.04] - Robby
Are there any problems you feel like you have, though? You mentioned maybe needing to hire an additional person so that people can enjoy their vacations, or for the bus factor, but are there any other things you think are maybe a slight downside to staying small?

[01:01:14.17] - Tom
Yeah. I think it's hard to do what Basecamp has done in terms of around-the-clock support, where people are actually supporting in their own work time zone. It's one of the things that we've learned. You talk about marketing a SaaS product. I did not know this, but having an incredible support team is a massive feature for your product. And there's no code involved. It's just having incredible people that you invest in, that you allow to be good advocates for the customer. And man, it's been such a benefit for Buzzsprout, because people love our support team. They love interacting with them. But because we're small, I really don't want to have people spread out all over the world to cover all the different time zones. So that's a bit of a challenge with a small team.

[01:02:05.23] - Robby
Yeah, I appreciate that. I'm also curious: do you have an ethos about when you decide you're going to pay, say, monthly for something versus building it yourself, like tapping into some platform or SaaS product as an add-on for your product?

[01:02:25.21] - Tom
I don't know that I have a specific ethos about it, other than I don't like being beholden to other companies. So if it's something that's mission critical, like subscriptions for who's paying us monthly, I don't want that outsourced. I want to be able to do it in-house. Now, that being said, there's nothing wrong with it. If I was starting a SaaS product today, I would absolutely use Stripe, and I would absolutely use Stripe subscriptions, because you can always bring it in-house later. But for where we are as a business now, we can absolutely do those things in-house. A great example is when we launched AI inside of Buzzsprout to listen to people's podcast episodes. When they upload their episode, the AI will listen to it and give them ideas for titles, descriptions, chapter markers, social media posts, a blog post, all this kind of stuff that AI is really good at. It's really good at coming up with those drafts, right? And what it does is help podcasters finish that last stretch: they've recorded the episode, but now they need to go live and they've got to give it a title.

[01:03:30.22] - Tom
They've got to do all these things. So anyway, when we originally did that, we did it with a vendor. We would send them the audio file, and they would send us back all the AI assets. But then, as that feature was adopted and we saw that our customers really liked it, we brought it in-house. And when we brought it in-house, we had control. We could make it better. We didn't have this dependency on them. We could actually lower our pricing, things like that.

[01:03:55.02] - Robby
So do you know if there's ever going to be a conversation about when we can make the episode descriptions longer?

[01:04:05.14] - Tom
You know, that's driven by Apple. Apple somehow still has the stranglehold on podcasting because, I mean, they were the benevolent dictator of podcasting for the longest time.

[01:04:19.00] - Robby
It's such an interesting thing. I'm like, why is this so short? I want to add more details, I want to add more links, and it's just like, all right, well, I guess there's a limit to this.

[01:04:27.20] - Tom
I know there's talk about just extending it and then letting Apple essentially catch up.

[01:04:35.14] - Robby
Yeah. And as a podcaster, most of what I want that for is the benefit of what we're displaying on the websites, right? Maybe in a podcast app too, but I'm thinking about the AI bots and Google indexing, and people browsing and finding useful links. That's where I want the richness to be. I don't know if people are spending as much time in the apps clicking around all the links. Maybe they are. I don't know how to track that stuff.

[01:05:05.01] - Tom
It depends on the type of podcast, right? There are some podcasts that lend themselves to supporting information, where I want to be able to go read it, follow links, or see things.

[01:05:16.23] - Robby
Yeah, I get that. So we've definitely covered a lot of ground so far with you, Tom. When you step back from all this, and you've kind of alluded to it already, but maybe for a good soundbite: how has Rails actually been part of, let's say, Buzzsprout's secret sauce?

[01:05:34.03] - Tom
I think it's everything that we've talked about in this call. Buzzsprout has been able to move faster and be more nimble as a result of sticking to vanilla Rails. When features and functionality are exposed in Rails, we get to take advantage of them immediately. You know, we didn't talk about caching. Caching is a great example, like fragment caching. That wasn't something I could do when I originally started coding in Rails. Well, now it's so easy. And with Solid Cache, we've gotten to the point now where there are very few evictions in our cache, so much of our content is just served directly from the cache. We got all of that for free, all of that benefit of very smart people putting together an easy way to create these caches. That's just one hyperfocused example, but Buzzsprout has benefited from Rails by doing things like that, where we can just move quickly. We can take advantage of the latest technology, and we can do things that we couldn't have done without the framework that we have.

[01:06:36.23] - Robby
That definitely resonates, and I can definitely concur with that. I'm curious, Tom, is there a technical book that was very foundational for you, one that you find yourself still recommending to peers?

[01:06:50.23] - Tom
Sandi Metz. I still go back. I've read her book multiple times, Practical Object-Oriented Design, the POODR book, and watched her videos of different talks that she's given. I think Sandi has probably impacted me the most in terms of the way that I code. So anything that she's done, and then everything that 37signals has ever put out, that's really had an impact. I feel like we are our own version of it. We've kind of settled into what it looks like for Higher Pixels, for our company, to embrace some of these things. But we've read and listened to all of the REWORK stuff.

[01:07:36.10] - Robby
Awesome. I'll definitely include links to Sandi's book, and also to an episode where I interviewed her on my other podcast, Maintainable, a couple of years back.

[01:07:45.05] - Tom
She is an incredible asset to the Ruby community.

[01:07:53.08] - Robby
I know she's not been at any of the conferences in a few years though, right?

[01:07:55.22] - Tom
I know, I know. Yeah, we miss you, Sandi.

[01:07:58.22] - Robby
We miss you, Sandi.

[01:07:59.23] - Tom
If you listen to this, you've made a huge impact on our business, and we'd love to see you again. We'd love to see you at a conference. I'd love to hear a talk. Likewise.

[01:08:07.12] - Robby
You briefly touched on AI, and I hadn't planned on digging into that, and we've kept you long enough. But I'm curious: is there something that you believe about building software and SaaS products today that you think more people will agree with 5 years from now?

[01:08:29.02] - Tom
I feel like we're all in agreement that AI is making our life better from a coding standpoint, right? I don't know what's going to happen with our phones and stuff at home, but as a developer, being able to have a partner in AI to go over ideas and strategies and work together with, I feel like it makes it fun. But I don't think that's controversial. It used to be controversial, but I think everybody's kind of on board with it now, right?

[01:09:02.16] - Robby
I want to agree with that. I was a bit of a slow comer to it, a little bit of a skeptic, and I'm happy to say that I was a skeptic. But I think it depends on which bubbles you're hanging out in. You can definitely go to some social media sites right now and it feels like a completely different world, in a weird way. And I'm rubbing up against that a little bit with running open source projects myself, where I'm a proponent of using AI as long as we're open and transparent about it. And some people are like, I don't want to use your software project anymore because you're using AI. And I'm like, oh, I didn't know that it had to be this or that, like it's binary.

[01:09:43.06] - Tom
I think that has to go away, right? In 5 years, there's no way that anybody's going to say something like that, because you're not going to be able to. That would be like, oh, you can do it, but you can't use Google. What? You have to tell us if you used Google. Well, of course I used Google. I had to look something up, you know?

[01:09:57.05] - Robby
And we got into a big conversation about this, where I'm like, well, what is AI, even? In the context of, say, using VS Code, is it AI only when you're interacting with the agent in a prompt, or is it also when it autocompletes the end of your line? Where's the distinction there? I don't know what actually happened behind the scenes. So do I need to disclose that? I don't know that we can hold people accountable to that. It's an interesting time to be doing this. There are definitely a lot of things moving here. So I'm not going to hold you to it, but we'll check back in 5 years, Tom. Okay. Deal. Well, Tom, this was great. Thanks for walking us through the long arc of building Buzzsprout on Rails, for supporting the Rails ecosystem much more broadly, and for helping host this podcast. So thank you so much for coming on On Rails, Tom.

[01:10:46.16] - Tom
All right. Thanks for having me.

[01:10:51.03] - Robby
That's it for this episode of On Rails. This podcast is produced by the Rails Foundation with support from its core and contributing members. If you enjoyed the ride, leave a quick review on Apple Podcasts, Spotify, or YouTube. It helps more folks find the show. Again, I'm Robby Russell. Thanks for riding along. See you next time.

