Relicans host Rachael Wright-Munn talks to Dmitry Pashkevich, Application Architect at Calendly, about his experience breaking up a Rails monolith and his recent talk, "Contract-Driven API Development."
Should you find a burning need to share your thoughts or rants about the show, please spray them at [email protected]. While you're going to all the trouble of shipping us some bytes, please consider taking a moment to let us know what you'd like to hear on the show in the future. Despite the all-caps flaming you will receive in response, please know that we are sincerely interested in your feedback; we aim to appease. Follow us on the Twitters: @PolyglotShow.
Jonan Scheffler: Hello and welcome to [Polyglot](https://twitter.com/polyglot), proudly brought to you by [New Relic's](https://newrelic.com) developer relations team, [The Relicans](https://therelicans.com). Polyglot is about software design. It's about looking beyond languages to the patterns and methods that we as developers use to do our best work. You can join us every week to hear from developers who have stories to share about what has worked for them and may have some opinions about how best to write quality software. We may not always agree, but we are certainly going to have fun, and we will always do our best to level up together. You can find the show notes for this episode and all of The Relicans podcasts on [developer.newrelic.com/podcasts](https://developer.newrelic.com/podcasts). Thank you so much for joining us. Enjoy the show.
Rachael Wright-Munn: Hello and welcome to the [Polyglot](https://twitter.com/PolyglotShow). Today I'm here with [Dmitry](https://twitter.com/dpashkevich), who's an Application Architect at [Calendly](https://calendly.com/), focusing on their API platform, workflows, and integrations. Hi, how are you doing?
Dmitry Pashkevich: I'm doing good. How are you, [Rachael](https://twitter.com/ChaelCodes)?
Rachael: I'm good. So I heard that [you recently gave a talk about Contract-Driven API Development](https://docs.google.com/presentation/d/1em4GHBrTt7TFLO4D-wZuYEg2yc7vbAwMl-w0_19Cw8M/edit?usp=sharing). Could you tell me a little bit about what that is and where the value is?
Dmitry: Sure. And there are actually two parts to the subject matter. I actually struggled to put them together in one title without making it very wordy. I think it's worth starting...the first part is a design-first approach to APIs, where you spend more time designing how your API is going to work up front and then proceed to the implementation, and of course, it's cyclical. So sometimes, you come back and revisit the design.
And then the other part that goes hand in hand with this practice is what you mentioned, contract-driven development, which essentially means you design your API. You implement it according to this design. But you also have a piece in your process and in your stack that continuously ensures that your implementation never strays away from that design. It essentially keeps the two in sync. So it's not that you design once, then you implement. And then, as you evolve your API, you fix bugs, you add features, you forget about this design that you created at some point.
Rachael: It ensures that your future APIs that you develop are consistent with the original ones you created.
Dmitry: Right. And any time you need to make adjustments, you go back to the design, which in this case is not just some text document. It is a machine- and human-readable definition of an API. It's a design; it's a contract; it's documentation. It's all in one. There are tools to automatically test conformance of the implementation to this design spec, which is called a contract in other contexts.
Rachael: So you're talking about this contract, and you're talking about designing essentially a specification. You're saying that this is verified by an automated system, that it produces the documentation consumers see, and that it's also the design you're working from. So it sounds like it's serving three different purposes at once.
Dmitry: Correct. So let's jump to specific technologies. At [Calendly](https://calendly.com/), this was in the context of specifically building Calendly's next-generation API, which is a [REST API](https://www.redhat.com/en/topics/api/what-is-a-rest-api). And in the past maybe five years, it's become apparent that there is a mature, stable standard for describing REST APIs, and that is the [OpenAPI](https://swagger.io/specification/) specification. There were several competing standards for a while, and eventually some of them joined forces, which resulted in what we today know as the OpenAPI specification. So as the name suggests, it's a specification for describing API behavior: what your paths, endpoints, methods, parameters, and responses are, all that stuff.
And I mentioned that it's human- and machine-readable, so it's relatively easy to write. People usually use the YAML format for this. And this serves as your design document as you're building the API. But you can also take the same document, this YAML file, and generate full API documentation from it. Or you can generate a mock server that will mock the API behavior before it's even implemented, which is also useful for consumers.
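For illustration only (the endpoint, fields, and response shapes below are invented, not Calendly's actual API), a minimal OpenAPI document in YAML might look like this:

```yaml
# Hypothetical OpenAPI document: one endpoint, its parameter, and two
# possible responses, all declared as machine-readable structure.
openapi: 3.0.3
info:
  title: Scheduling API (example)
  version: 1.0.0
paths:
  /events/{uuid}:
    get:
      summary: Retrieve a scheduled event
      parameters:
        - name: uuid
          in: path
          required: true
          schema:
            type: string
            format: uuid
      responses:
        "200":
          description: The event
          content:
            application/json:
              schema:
                type: object
                required: [uuid, name, status]
                properties:
                  uuid: { type: string, format: uuid }
                  name: { type: string }
                  status: { type: string, enum: [active, canceled] }
        "404":
          description: No event with that UUID exists
```

The same file can then feed a documentation generator, a mock server, or a contract-testing tool.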
Rachael: That's interesting. It ensures that consistency by using [OpenAPI](https://swagger.io/specification/) standards.
Dmitry: It's just the common language that many developers who work with APIs are familiar with today. And what's more important is the ecosystem around it. So a standard is only useful and powerful when there's adoption and when there's an ecosystem of tools that consume or produce the standard. And this is true in the case of [OpenAPI](https://swagger.io/specification/). There are all these documentation generators, parsers, mock servers, linters, validators, all kinds of things that actually enable practical applications of the standard.
Rachael: That makes sense. Could you talk a little bit about the specific technologies in place and how this is built?
Dmitry: Sure. I can talk about our development process that we have adopted in the past year to 18 months at [Calendly](https://calendly.com/) as we were building this new API platform. So we decided to use [OpenAPI](https://swagger.io/specification/) standard for some of the benefits that I've already mentioned. It's not hard to write by hand, but even better, there are visual editors where a person can use a point and click interface to generate this API design specification. We use a tool called [Stoplight Studio](https://stoplight.io/studio/). There's an online version. There's a downloadable desktop version that anyone can use. It's free. It's like a WYSIWYG editor for [OpenAPI](https://swagger.io/specification/) specification. So that's where things start.
And what typically happens...so before we get to the design phase, we usually have an idea of an API that we want to develop or maybe an adjustment to an existing API endpoint. At that stage, it's not very different from developing end-user features. It goes through whatever process you have in your company. We do some product research, user research. The product manager works with the team to identify these opportunities for delivering value.
But from there, once there's a rough idea, a sketch, if you will, of a potential future API, an engineer takes it and designs this API in more detail. What we're doing is converting a sketch, maybe a bullet-point list saying there's this endpoint that takes this input and returns this output, into a more detailed specification of the actual paths, the names, the methods, the types of input data, the types of output data, the possible errors, all of which can be defined through [OpenAPI](https://swagger.io/specification/).
Rachael: This is where design-first starts to come into play: you're sitting there creating this specification first, and then you're going through and doing the design for the APIs that come out of it.
Dmitry: Right. What has often happened, and I've experienced this too in my career, is that this design step is missed. Someone writes a short Google Doc or a [Jira](https://www.atlassian.com/software/jira) ticket with a rough description of how this endpoint has to work, and then the design decisions are made during development. As the developer implements, they make these decisions on exactly the shape of the data and all the names, and the paths, and validations and all that stuff. But the thing with APIs is, as we know, they are hard to change once you've launched them.
Rachael: Yeah, and those inconsistencies.
Dmitry: Yeah, there are inconsistencies.
Rachael: That's very frustrating for people who are working with an API: they're like, I worked with this endpoint, and then I'm touching a second one that works completely differently.
Dmitry: Right. It makes it hard to learn an API, to expect how it works. And knowing as much about how an API works from the consumer perspective is so important because you're not an internal developer of that other system you're integrating with. You don't have that knowledge of all of its idiosyncrasies, but you want to deliver a great user experience to your end-users that will be consuming your integration. You want not just the bright, happy path, but you want to handle all the edge cases. You want to handle all the errors in order to deliver this great integrated experience between two products. And for that reason, it's extremely important to give the consumers of your API as much information as you can about how it works so that they know what to expect and they can code it into their implementation.
Rachael: Yeah. What sorts of problems begin to crop up when developers are just building the API endpoints one by one? We've talked about inconsistencies. But what inconsistencies could people expect to see?
Dmitry: A common one is just a lack of detail. You might visit some API documentation, and it may describe roughly here's the input, here's the output, but maybe there's a JSON example of one possible way of calling this API and one possible response that this API can return. But in real life, there will be variations. Some fields may be missing. Some fields may have different data representations. There are often huge variances in how the errors are returned. And part of the problem, coming back to [OpenAPI](https://swagger.io/specification/), part of the problem is there hasn't been a standard, a language for expressing these important details. So people have been handwriting this documentation. And when you handwrite it without a certain framework, you inevitably forget to mention some important details.
I like to compare this with designing and delivering end-user features, where typically, when a company delivers a new feature in their product that is used by end-users, you have a designer on your team. A designer creates maybe first a sketch, a prototype. But at some point, a designer gives you a high-fidelity mockup with all the measurements that the engineer takes and converts into the implementation.
Imagine writing a feature as a developer just based on the rough wireframe. This is the exact same thing that has been happening in API development. We've been trying to deliver APIs based on some back-of-the-napkin descriptions, based on sketches, wireframes, instead of basing our implementation on solid, concrete mockups. So [OpenAPI](https://swagger.io/specification/) is this mockup language for APIs.
Rachael: Oh, I love that. I love that so much.
Dmitry: It wouldn't occur to an engineer to implement things based on the wireframe unless maybe you're a two-person startup [laughs] and you don't have a designer. But once you go beyond that, all engineers understand, hey, we need high-fidelity mockups if engineers are going to spend a lot of time on the front end. So they ask follow-up questions. They call out missing states that the designer maybe forgot. They bring up responsiveness because they have this vocabulary that has been established. They know what to look out for. It's great that today we are equipped with this rich toolset.
Rachael: That's so true. When you mentioned issues with the error states, it reminded me of a transit API that a friend of mine was working on where he had built a front end for it. And what he found out was that a couple of the endpoints would normally return JSON. But if there was an error in fetching the data, they would return an HTML page. [laughs]
Rachael: So he had to deal with balancing between those two. And when I think back, I think I've looked into REST API specifications in the past. I'm not sure which specification document I was using. I don't think it was [OpenAPI](https://swagger.io/specification/). But I remember noticing that if you failed to destroy an object, there wasn't a specified response code for it. And so I think you're right. I think, for the most part, when we've been thinking about API design, there are certainly people in the past that have thought it through and come up with something good. But now it sounds like we have some tools and some new techniques that can be used to make this a lot more structured. So there's one thing that I'm still lost on. So we've talked a little bit about design first. We've talked a little bit about the [OpenAPI](https://swagger.io/specification/) specification. I want to know what words like contract-driven mean to you.
Dmitry: So I started talking about our process where an engineer on a team delivering the API writes the specification in [OpenAPI](https://swagger.io/specification/). This is the design phase. Because the specification is code, it's worth noting that it can go through the familiar development process: version control and code reviews, so this design can be reviewed and reasoned about before any implementation happens. And then, at some point, once this is approved, we can implement our API. But then the question is, once we've done this, how do we ensure that our specification, which we typically generate the documentation from, is in sync with our implementation? This is where contract-driven development, or contract testing, comes into play. And there are tools to help us achieve that with [OpenAPI](https://swagger.io/specification/) specifically.
Basically, there are open-source tools that can take these [OpenAPI](https://swagger.io/specification/) specifications that you authored and plug them into your existing testing framework. Or there are standalone testing tools that will just take this document and call your application as a black box and ensure that all the endpoints specified in the document exist, that they take the data in the shape as specified, that they return the data in the shape that is specified. And you can take this toolkit and put it on your CI server so that any code change that is not compliant will fail to build.
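As a toy sketch of the idea, not any particular tool: a contract check takes a declared shape and flags any response that drifts from it, including undocumented fields. The spec and field names here are invented.

```ruby
# Toy illustration of contract testing: check that an actual API response
# conforms to the types declared in a (greatly simplified) spec fragment.
# Real OpenAPI-aware tools do far more than this; names are illustrative.

SPEC = {
  "uuid"   => String,
  "name"   => String,
  "active" => [TrueClass, FalseClass]
}.freeze

# Returns a list of human-readable violations; an empty list means conformant.
def contract_violations(spec, response)
  violations = []
  spec.each do |field, type|
    unless response.key?(field)
      violations << "missing field: #{field}"
      next
    end
    allowed = Array(type)
    unless allowed.any? { |t| response[field].is_a?(t) }
      violations << "#{field}: expected #{allowed.join(' or ')}, got #{response[field].class}"
    end
  end
  # Fields the spec doesn't know about are also contract drift.
  (response.keys - spec.keys).each { |f| violations << "undocumented field: #{f}" }
  violations
end

good = { "uuid" => "abc-123", "name" => "Intro call", "active" => true }
bad  = { "uuid" => "abc-123", "active" => "yes", "color" => "red" }

puts contract_violations(SPEC, good).inspect # []
puts contract_violations(SPEC, bad).inspect
```

Wired into CI, a non-empty violation list would fail the build, which is the enforcement step being described.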
Rachael: It sounds like this would also help with providing additional examples. So earlier, you mentioned that a lot of times in the API documentation, we'll have like, here is one thing that you can pass into this API endpoint, and here is one result. And a lot of times, the results that we're getting back have different data based on the type that it's returning. So I wonder, does this help with providing those additional examples because it's part of that CI suite?
Dmitry: Absolutely. I'll talk about...people often call this governance: the enforcement of not just the fact that your implementation conforms with the spec but also that you put certain details in the spec in the first place. It's worth noting that the [OpenAPI](https://swagger.io/specification/) specification not only helps you not forget to describe parts of your API behavior to the consumers, but it challenges you to think about all these aspects because you're operating within this established design framework. You have this rich language that is more effective than just a bullet-point list in Jira. That challenges you to think about all these things: about the type of every field, about the exact shape of the object, about the errors.
So simply having this design language already helps development teams not forget things. But again, because it's a machine-readable format and there's contract testing, and there are other tools that can understand this format, there are linters for [OpenAPI](https://swagger.io/specification/) documents, so you can plug in a linter. And there are common rules, and then you can write your own rules. So you can say every endpoint has to have an example for every response code. You can enforce that through a linter that, again, you can enable on your CI server, and thus you can enforce your standards for building APIs across the organization.
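With [Spectral](https://stoplight.io/open-source/spectral), Stoplight's open-source OpenAPI linter, a rule like "every JSON response must have an example" could be sketched in a ruleset file. The exact JSONPath and rule syntax can vary between versions, so treat this as a sketch:

```yaml
# .spectral.yaml (sketch; exact functions and JSONPath may vary by version)
extends: ["spectral:oas"]   # start from the built-in OpenAPI rules
rules:
  response-must-have-example:
    description: Every JSON response body must include an example.
    severity: error
    given: "$.paths[*][*].responses[*].content[application/json]"
    then:
      field: example
      function: truthy
```

Running the linter in CI then turns the organization's API standards into a failing build rather than a review comment.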
Rachael: That's amazing. That is truly amazing. So we've learned a lot about your work with API design at [Calendly](https://calendly.com/). Are you working on anything else interesting lately?
Dmitry: Yeah. So another thing that has been an ongoing project at Calendly, and I think it happens a lot at companies of roughly our stage, is migrating from a monolithic architecture to a service-oriented architecture. It's a big cross-team project that we've been chipping away at.
Rachael: Yeah, that's really common. I hear a lot of companies are going from a monolith to a series of microservices. I was in this really interesting interview with [Maria Gomez](https://twitter.com/mariascandella). And she said that microservice architecture is more about working with the team and the people that you have than it is necessarily the needs of the application. Because a lot of times, what you're doing is moving that complexity from your application into the infrastructure around that application, and you're making it easier for individual teams to manage a microservice.
I can think about one company I worked at where we ended up microservicing a little too early, and we had three different microservices, and we had three team members. [chuckles] And managing the deploys and the issues and understanding the status of each of them was a huge drain. So I know that that transition can sometimes be very difficult and has to be made at the right time and size for the organization. I wonder what you're doing to make that a little bit safer.
Dmitry: Sure. And I'll preface this by saying monolithic architecture is great. It makes sense to start with, in most cases, and you can get great mileage out of it. [DHH](https://twitter.com/dhh) from [Basecamp](https://basecamp.com/) has a fantastic article called [The Majestic Monolith](https://m.signalvnoise.com/the-majestic-monolith/) where he makes this point. And at [Calendly](https://calendly.com/), we only started seeing the limits of the monolithic architecture maybe a couple of years ago. We are talking about growing the company for a few years to have an established place in the market with millions of users and millions in revenue. We were able to take advantage of that architecture for a very long time. And at [Calendly](https://calendly.com/), in general, we are very practical. We steer away from creating technical challenges just because it stimulates us intellectually.
Dmitry: I was actually amazed by how practical the engineering culture at [Calendly](https://calendly.com/) is. So we only started a conversation about that when we clearly saw that for the size of the codebase, the number of features that we have, the size of the company in terms of people, it's just slowing us down to work on one central thing with lots of interdependencies.
Rachael: Yeah. What does hitting those limits look like in an organization?
Dmitry: Sure. In the spirit of being very practical, there are actually two concrete things that come to mind. One way this issue manifested is simply increased build times. So at [Calendly](https://calendly.com/), we do continuous integration and continuous delivery, which means we don't stage a bunch of changes to the product. We don't have these week-long or two-week-long regression testing cycles. Every code change, every pull request gets reviewed and tested. And once it's merged, it goes out to production. And we focus on quality a lot on multiple levels, not just in engineering, but we certainly have very high test coverage.
It is a norm, a standard, to cover your code with tests unless it's something like a little CSS change or a copy change. A change is expected to come with tests. What this means is as the application has been growing, entire new areas of the application have been created beyond probably the core scheduling experience that most people are familiar with in [Calendly](https://calendly.com/). It's becoming more and more expensive to run these tests. But we still have to run all the tests before a code change is delivered. So it actually has become the limiting factor to how many times we can deploy per day.
Rachael: And by expensive, you mean in terms of time, right?
Rachael: Interesting. So you're saying that one of the main reasons that you're looking at a microservice architecture is you've hit a wall when it comes to testing, which is throttling the speed of deploys. So really, that's where that wall has hit. And it's gotten to the point where you basically need to separate these out into separate microservices.
We talked a little bit about design-first earlier. I wonder how does that come into play when it comes to separating out this application into different microservices?
Dmitry: Sure. And we had some interesting thoughts on that front too. But I will also say that with build times growing, there are all kinds of factors. There are all kinds of reasons why build times keep creeping up. We have addressed various issues there in an attempt to keep build times low. And it's really not a...I wouldn't call it a reason to migrate to a service-oriented architecture. It was more of a symptom, one of the ways in which a growing codebase with lots of tight coupling between various domains has manifested itself.
Another aspect of this was just simply multiple teams stepping on each other's toes as they touch parts of the product that are supposed to be only loosely coupled. But instead, you make this change, and that affects a lot of other things that it shouldn't affect that you didn't expect to affect.
Rachael: It's difficult to test that because you don't necessarily know what's connected to what in an application until you get to almost that feature level. So if you're touching it somewhere in one place where you're like, oh, I'm going to mess with scheduling, and that's going to influence something all the way over here, that coupling is very difficult to test. You almost have to know how each of those features work and have coverage there. Meanwhile, if you break them apart, then you can say, "Okay, this is the specification that these two are communicating. If there are any changes at this level, we know it's going to impact other things."
Dmitry: Right. And we do have, as I mentioned, a great automated test suite that has saved us numerous times. But, I guess, first, the automated test suite prevents you from shipping bugs to production. That doesn't mean that it is okay for you to make one change in the product and see 20 other tests in seemingly completely unrelated parts break.
Rachael: You're saying the coupling itself is the problem.
Dmitry: Right. The test here is often just the canary. Of course, it's great that it prevents us from shipping bugs to production, but that doesn't mean we are operating in the most efficient way at our size.
Rachael: That makes sense. So I think we were about to talk about the design phase of breaking things out into microservices and how [Calendly](https://calendly.com/) has some interesting ideas around that.
Dmitry: So we threw these terms around: service-oriented architecture, microservices. The path that we have chosen, and I think what you're alluding to, is...it's extremely hard to make this big, extremely detailed design upfront of exactly what kinds of domains we're going to have, what kinds of services. What makes this whole exercise even more challenging is that our product keeps evolving. We keep hiring people and spawning off new initiatives that didn't exist a year ago. So our code, and our data model, and our product are constantly evolving. So it's also not always easy to predict what our business needs are going to be three years from now, five years from now.
So the path that we've chosen in this decomposition of the monolith is we're actually not trying to jump straight to microservices, as in we're not taking a piece of our codebase that looks like it could be a separate domain and immediately packaging it up as a standalone deployable service. What we're doing is we are going through the exercise of logically splitting up the monolith into what we simply call modules internally. We refer to this as a modular monolith.
So we're going through this exercise of logical segregation where you still go through the exercise of eliminating undesired tight coupling. But you're not worrying about the risky step of actually physically splitting things out. And then, maybe further down the road, you'll learn that the split that you did is not quite right, and now you need to redo it. And with things being separate services, it's more expensive to change.
So we are going with this logical modularization as also a learning exercise. It's like training wheels before flipping the switch and actually going to a service-oriented architecture. And I think it's been very helpful because it gives us this flexibility to change things around.
Rachael: I love the idea of taking the time to separate out these modules inside of the same codebase before breaking them out into services. That just sounds like a fantastic way of really figuring out where those lines are in the application and figuring out where all the issues are. But I can't help but wonder if these are just modules. Isn’t it possible for that tight coupling to still exist unnoticed?
Dmitry: Right. I think the devil is in the details here. That depends on what kind of modularization framework you've established there. Our goal is to prevent these kinds of tight coupling. So we are establishing boundaries, and contracts, and rules for how these modules are supposed to be made, how they're supposed to interact with whoever consumes these modules. This is where the enforcement really happens.
Rachael: That's amazing. So that's enforced through the technology. What tools are you using to do that?
Dmitry: We use [Rails engines](https://guides.rubyonrails.org/engines.html), and then we wrote some tooling around that to essentially enable all the...Rails engines aren't really built for this kind of modularization. Usually, Rails engines are for people to write what you can consider a Rails plugin. For example, [Active Admin](https://activeadmin.info/) is implemented as a Rails engine, and there are a number of other open-source projects out there where there's a whole pluggable piece, a mini-application that you include in your project, maybe point to your domain models, and it'll work. It'll have standalone pages, standalone UI that is usually implemented as a Rails engine.
But in order to make it work for our purposes of making these modules that are still part of our project, we wrote some custom tooling to track things like dependencies between modules, and to selectively rebuild only the things that changed in our CI. Because one of the benchmarks, and one of the tangible benefits that you get through splitting this thing apart, is that as you modularize your project, you're supposed to gain the ability to not run absolutely all automated tests but only the automated tests for the thing that changed and the things that depend on what you changed. So as you separate out these dependencies, you also gain benefits in build times.
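The selective-testing idea can be sketched in plain Ruby. This is an illustration of the concept, not Calendly's actual tooling, and the module names and dependency graph are invented: given a change to one module, run tests for that module and everything that transitively depends on it.

```ruby
# Sketch of dependency-aware test selection (illustrative, not a real tool).
# DEPS maps each module to the modules it depends on.
DEPS = {
  "scheduling"   => ["core"],
  "billing"      => ["core"],
  "integrations" => ["scheduling", "core"],
  "core"         => []
}.freeze

# True if `mod` depends on `target`, directly or transitively.
def depends_on?(mod, target, deps, seen = [])
  return false if seen.include?(mod) # guard against cycles
  deps.fetch(mod, []).any? do |dep|
    dep == target || depends_on?(dep, target, deps, seen + [mod])
  end
end

# A module's tests must rerun if it changed, or if anything it
# depends on (directly or transitively) changed.
def affected_modules(changed, deps = DEPS)
  deps.keys.select { |mod| mod == changed || depends_on?(mod, changed, deps) }
end

# Changing "core" forces everything to retest; changing "billing" only billing.
puts affected_modules("core").sort.inspect
puts affected_modules("billing").inspect
```

The payoff described above falls out directly: the smaller the blast radius of a change, the fewer tests CI has to run.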
Rachael: That makes a lot of sense, and that's really interesting.
Dmitry: It's definitely been an exercise in adapting the Rails mechanisms for what we wanted to accomplish. Splitting out the data model is one of the areas where Rails engines don't give you everything you need out of the box. And our modules aren't always Rails engines. We have some modules that are not Rails; they're just Ruby modules. Because these modules are potential candidates for future microservices, and we're trying to follow this domain-driven design approach with bounded contexts, each module usually represents a bounded context.
So it has to have its own data model that only that module can access directly, which if you have a typical Rails application, that's not the case. Any code can access any model any time. Everything's global. Everything's accessible. But we're trying to now establish these boundaries where if you extract a piece of your application domain into a module, according to the bounded context principle, only that module needs to have access to its internal models, internal data structure. And the outside code is only supposed to talk to this domain through this public interface that it establishes.
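One way to approximate such a boundary in plain Ruby, as a sketch of the convention rather than Calendly's implementation, is to hide a module's internal classes with `private_constant` and expose only a narrow public interface. All names here are illustrative.

```ruby
# Sketch of a bounded-context module: internals are hidden, and outside
# code can only go through the public interface.
module Scheduling
  class Event # stand-in for an internal "model"
    attr_reader :name
    def initialize(name)
      @name = name
    end
  end
  private_constant :Event # outside code cannot reference Scheduling::Event

  # The module's public interface: the only supported way in.
  def self.create_event(name)
    Event.new(name).name
  end
end

puts Scheduling.create_event("Intro call") # works through the interface

begin
  Scheduling::Event.new("direct access")   # violates the boundary
rescue NameError => e
  puts "blocked: #{e.message}"
end
```

It's only a language-level approximation of the bounded-context rule, but it makes boundary violations fail loudly instead of silently coupling modules together.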
So there was definitely some tweaking and hacking in places that we had to do to enable each module, which is typically a Rails engine, to have its own models and its own migrations that are logically separate. But then, in the end, they plug into your root application, and physically, it's all still one database, but logically, they're all separate.
Rachael: That's interesting. And that probably helps out with database performance. I was talking with [Corey Haines](https://twitter.com/coreyhaines), and we were talking a little bit about application performance and the choke points that people have at the moment. We were joking about single-quote strings versus double-quote strings because in Ruby, those used to have different performance implications. And he was saying that if the issues you're having with performance in your application come down to strings, then you've got a pretty good application.
Dmitry: I agree.
Rachael: Because normally, you hit limits with your database, [laughs] and your API requests and everything else, and that is what is really hurting your performance. So you mentioned that being able to separate and have those database connections and I guess not waiting as long to request data from the database because you've got independent ones probably helps that a lot.
Dmitry: We're setting ourselves up for this. As I mentioned, physically, it's still one database, however many shared database connections exist. But in the future, this kind of logical segregation will enable things like...we use [Postgres](https://www.postgresql.org/) for our database, and Postgres supports using multiple database schemas, where you can get some benefits from actually keeping different tables in different schemas. And ultimately, they can become their own databases running on different servers.
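In Rails terms, the Postgres adapter's `schema_search_path` setting is one way to point an application at specific schemas while everything stays in one physical database. A sketch, with illustrative schema names:

```yaml
# config/database.yml (sketch): schema_search_path lets a module's tables
# live in their own Postgres schema within the same physical database.
# The schema names here are invented for illustration.
production:
  adapter: postgresql
  database: app_production
  schema_search_path: "scheduling,billing,public"
```

Because each schema is a self-contained namespace of tables, moving one later into its own database is a much smaller step than untangling a shared `public` schema.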
Rachael: That makes sense. And just to go back, why has [Calendly](https://calendly.com/) decided on this approach? So you mentioned that it's safer. You mentioned that it's a lot easier to get these boundaries correct. But can you start to see benefits from breaking apart the application into engines before essentially you've got the full microservice application?
Dmitry: Absolutely. And I think there are three main reasons for taking this approach. One is just the general handling of the unknowns. It's virtually impossible to get this right in the beginning when you're breaking up a large monolithic application.
Second is simply getting rid of tight coupling. That's already a big enough exercise, a big enough lift that can take you a year or more to tackle, without adding on the operational concerns of, okay, now I have to deploy separate services and manage dependencies between them. Now I need some kind of message bus to enable talking to them.
Dmitry: There are so many operational concerns that you need to solve once you actually physically split out things into services that it's also about managing scope.
And third is we need to be able to make incremental progress because we cannot just stop delivering new features and solely focus on breaking up the monolith and not do anything else. We need to rebuild this plane while it's in-flight and try to get incremental gains from that. So CI build times are one of the benchmarks that we use to gauge our progress. The more modules we create, the more flexibility we have in CI to rebuild and retest only the things that need it.
And the other thing is just developer productivity. As you extract things into the domains, now there are these nicely defined isolated contexts that ideally should map to different teams owning the corresponding domains. So, if a team happens to be working on something that’s, say, modularized, it is easier for them to...there's just less scope for engineers on that team to consider.
Rachael: That's so, so true. Also, "We need to rebuild this plane while it's in-flight" may be one of my new favorite quotes. [chuckles] That's amazing. I want to thank you so much for joining me, for talking me through how to design an API so that it's well-specified and consistent, for talking me through how [Calendly](https://calendly.com/) is breaking up its Rails monolith into engines so that it can support microservices in the future. And in general, just for sharing your experience with all of us.
Dmitry: Thank you, [Rachael](https://twitter.com/ChaelCodes). It's been a pleasure.
Jonan: Thank you so much for joining us. We really appreciate it. You can find the show notes for this episode along with all of the rest of The Relicans podcasts on [therelicans.com](https://therelicans.com). In fact, most anything The Relicans get up to online will be on that site. We'll see you next week. Take care.