Real World Serverless with theburningmonk

#6: Serverless at TotallyMoney with Nick Blair and Steve Westwood

April 08, 2020 Yan Cui Season 1 Episode 6

I caught up with Nick Blair and Steve Westwood to talk about the state of Serverless adoption at TotallyMoney and their transition from running .NET microservices on containers & EC2 to a Serverless-first approach with Node.js/TypeScript functions. We talked about the changes to how they do local development and testing, as well as platform limits that they have run into.

Get your free credit score report at and check out their careers page to get in touch about open positions.

You can find Nick on Twitter as @nickblair and find Steve on Linkedin.

For more stories about real-world use of serverless technologies, please follow us on Twitter as @RealWorldSls and subscribe to this podcast.

Opening theme song:
Cheery Monday by Kevin MacLeod

spk_0:   0:12
Hi, welcome back to another episode of Real World Serverless, a podcast where I talk to real-world practitioners of serverless and get their stories from the trenches. Today I'm joined by two guests, Nick and Steve from TotallyMoney. Welcome to the show, guys.

spk_1:   0:28
Hi.

spk_2:   0:28
Hello. Good to see you.

spk_0:   0:30
Guys, we worked together for a while at TotallyMoney, and for the audience who are not familiar with what you guys do — do you want to spend a few minutes and tell us, what is TotallyMoney and what do you guys do there?

spk_1:   0:41
Sure, I can do that. Hi, I'm Steve. I'm an engineering manager here, so that means I think about our team, who's on that team, and I'm also down in the trenches writing code as well. Both Nick and me work at TotallyMoney. TotallyMoney is a provider of free credit scores. You can sign up, find out about your credit score, find out the factors that affect your credit score, and generally it helps you to improve your financial situation by knowing a bit more, right?

spk_2:   1:16
My name's Nick. I'm head of technology at TotallyMoney. So I spend some of my time thinking about the tools and the platforms that we use to deliver our products.

spk_0:   1:27
Great. And how are you guys using serverless at TotallyMoney?

spk_1:   1:32
So we, um, heavily bought into AWS — that's our cloud platform, right? And serverless was something that sounded quite interesting. The initial interest was having on-demand services that didn't need to be always alive and didn't need to have a more expanded provisioning framework — that interested us. And the possibility of not worrying about infrastructure so much, and also a bit more "developers as DevOps" — that kind of change was also something that interested us. Do you want to add anything, Nick?

spk_2:   2:12
From my perspective, we're also a growing business. We're busy hiring engineers and investing in new verticals, expanding the range of things that you, as a customer, can get from TotallyMoney. That translates into technology and software that's built in a distributed way. We have lots of different teams because we have lots of engineers, and for those engineers to be productive and to enjoy what they do, we're trying to build microservices that talk to each other and can communicate across our ecosystem, and that works well. We've landed on Lambda because functions as a service are inherently isolated and inherently distributed, and that fits nicely with that. And we use AWS services like SNS and SQS quite heavily to wire up those systems and allow us to be productive.

spk_0:   3:04
Okay. And you say you've got a lot of engineers in different teams. Are we talking about hundreds of engineers, or are we talking about maybe 40-50 engineers? What's your current tech stack, and where did you kind of move from?

spk_1:   3:17
Yeah, I'll start with this. So I think traditionally our tech stack was probably more along the lines of .NET applications — APIs and services written in .NET. Our team here, it's not a massive team; we've got a growing number of developers now, and that's been growing steadily — we'll probably be 40 by the end of this year. We started out with C#, that's our .NET tech stack. We've got some F# in there, some functional stuff that Nick's very interested in. And equally, we've kind of moved away from the paradigm of having .NET microservices. Still, functions as a service allows you to just use whatever you fancy on that day, really. We do loads of stuff with JavaScript and TypeScript. So yeah, that's kind of where we are at the moment.

spk_2:   4:12
We were traditionally coming from a place with a more monolithic, three-tier enterprise application, and that involved running our own EC2 machines, using RDS and other managed services, but without the kind of promise of scale, and without the ability to really scale the team very well. So we kind of slowly started to break up the application, and we started to use varying technologies as well — using React on the front end, leveraging that to build better, responsive and more modern web products. And on the back end, we've kind of innovated more along the lines of integrations and being able to communicate with third parties. That's where we started to introduce Lambda, as an easy way to spin up new services and integrate with third parties in a kind of offline, async, non-customer-facing way — that's where we saw the least amount of risk to begin with. But today, increasingly, we're also using it to build and serve our front-end applications and customer-facing parts of the stack, because of the promises of things like scale and reliability, and simply because it's a way that has much lower friction than the traditional deployment methods and things that we've been used to.

spk_0:   5:29
Okay, so you've gone through quite a big transition in terms of the technology stack — in terms of the languages you're using, but also the services you're using to run your applications, from EC2 and containers to now Lambda. What are some of the pain points that you experienced along this transition?

spk_1:   5:45
Well, we've just finished a project that is entirely serverless. With this project we're now offering at TotallyMoney the ability to switch your energy provider — a service to the customer. For us, the kind of back end, the plumbing of that, is done all in serverless, and it's been really great — being so much more rapid, being able to add new endpoints and integrate with API Gateway, et cetera. It's been so good that as it's grown, we've actually come up against the problem of CloudFormation only allowing 200 resources per stack. We're bumping up against that at the moment, and trying to figure out how to architect it going forward.
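The 200-resource CloudFormation limit Steve mentions can be watched for in CI before a deploy fails. A minimal sketch in TypeScript, under some assumptions: the template path in the comment is where the Serverless Framework writes its packaged template, and the 90% warning threshold is illustrative.

```typescript
// Sketch: count the resources in a compiled CloudFormation template and warn
// when a stack is approaching the per-stack resource limit discussed above.
// The template shape is standard CloudFormation JSON.
interface CfnTemplate {
  Resources?: Record<string, unknown>;
}

export function resourceCount(template: CfnTemplate): number {
  return Object.keys(template.Resources ?? {}).length;
}

export function nearStackLimit(template: CfnTemplate, limit = 200, headroom = 0.9): boolean {
  // Warn once we're within 10% of the limit (both numbers are assumptions).
  return resourceCount(template) >= limit * headroom;
}

// Example usage against the template the Serverless Framework emits on package:
// import { readFileSync } from "fs";
// const tpl = JSON.parse(
//   readFileSync(".serverless/cloudformation-template-update-stack.json", "utf8")
// );
// if (nearStackLimit(tpl)) console.warn("Stack is close to the CloudFormation resource limit");
```

Running a check like this as a CI step gives an early warning before the stack has to be split.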

spk_0:   6:30
So when you say things are more rapid, I guess you're talking about the whole development cycle. Do you have a rough sense of where you were before and where you are today?

spk_1:   6:41
You know, I can only really be anecdotal about that — it's not the type of thing we typically measure here — but it does feel like it's a bit more responsive. Spinning up a stack is kind of like one line in your terminal, and the ability to have a project contained like that is very powerful. The ability to update quite easily and immediately roll things out just means that your feedback loop is a lot shorter. We've just found it particularly easy to add new resources — like, oh, you want an S3 bucket? No problem. You want a new topic on SNS? It's so easily configurable and rapid to deploy.

spk_0:   7:27
Yeah. So one of the things that I keep hearing from people is that there's no easy way to do local development, and I guess if you're coming from .NET, you're also quite used to that kind of local feedback, where you run your code in a container or inside the IDE and you can debug the code. Have you found similar challenges as you moved from your more traditional stack to now running on serverless, using Node and TypeScript instead of .NET on a container?

spk_1:   7:54
Yes. I think, you know, initially that was just a different way of working for a developer. Weirdly, I don't think it's so much of a problem now — we're just kind of used to it. When we develop something at the micro level, we just make sure that it's thoroughly tested, and then, because of the ability to rapidly prototype and put up new stacks so effectively, the old process of chucking a project down the line from, like, a dev stack to a UAT stack and all that sort of stuff is a lot more fluid now, so the feedback loop on that is a lot faster. But the local development paradigm of having, you know, Visual Studio running your .NET application — that kind of immediacy, I guess, is gone, and it's replaced with something a bit more thought-through, more based upon testing. Because obviously we have tried running our Lambda functions locally — we've used LocalStack and stuff like that — but I think in a certain sense it's just easier to push the whole thing up and then see how it is. That's how we've been developing, and actually, now that's kind of the new norm.

spk_0:   9:11
So pushing stuff to AWS and then testing it there, rather than trying to use LocalStack to simulate everything. Is that what you mean?

spk_1:   9:19
That's exactly right, yeah. So we did try to experiment with LocalStack, tried to get a sort of local development thing going, but ultimately we found that it was no longer really necessary. When we develop a new feature, we just take the feature, think about how to do it, write it, test it, push it up, and see how it works.

spk_0:   9:44
And in terms of how you write tests, has that also changed as well, with the way you're now doing development — with more focus on testing the real thing once it has been deployed?

spk_1:   9:54
Yeah, it has. I mean, functionality-wise, I think unit testing is kind of how you prove that your business logic works — that's fine. There was a bit of research about how, in this particular project, we'd actually go about doing something a bit more end-to-end, or integration tests. What we found was that, actually, our CI pipeline now just spins up a test stack, and we run our integration tests against that stack and make sure they're all fine before we deploy to our pre-production environment.

spk_0:   10:35
Right, so that's a temporary stack that you spin up just for the purpose of running tests as part of your CI/CD pipeline, right? That's

spk_1:   10:41
exactly right. And again, it's just that flexibility — a single line spinning up a new stack as you need it is fantastic.
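Steve's ephemeral test-stack pattern — deploy a throwaway stage per CI build, run integration tests against it, then tear it down — can be sketched roughly like this. CIRCLE_BUILD_NUM is CircleCI's real build-number variable; the `sls` invocations and the `integration-tests` npm script name are assumptions about the project's setup.

```typescript
import { execSync } from "child_process";

// Derive a unique, CloudFormation-safe stage name from the CI build number.
export function testStageName(buildNum: string | undefined): string {
  // Fall back to "local" so the script still works outside CI.
  const suffix = (buildNum ?? "local").replace(/[^a-zA-Z0-9-]/g, "-");
  return `test-${suffix}`;
}

// Deploy an ephemeral stack, run integration tests against it, tear it down.
export function deployTestRun(buildNum?: string): void {
  const stage = testStageName(buildNum ?? process.env.CIRCLE_BUILD_NUM);
  try {
    execSync(`npx sls deploy --stage ${stage}`, { stdio: "inherit" });
    execSync(`npm run integration-tests -- --stage ${stage}`, { stdio: "inherit" });
  } finally {
    // Always tear the temporary stack down, even if the tests fail.
    execSync(`npx sls remove --stage ${stage}`, { stdio: "inherit" });
  }
}
```

The per-build stage name is what keeps parallel CI runs from clobbering each other, and the `finally` block avoids leaking stacks.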

spk_0:   10:52
Yeah, that's really cool. It's definitely a pattern that I'm seeing people employ more and more, I guess because, like you said, that flexibility makes it really easy to do, and you don't have to worry about taking care of all the test data you end up having in your staging or your shared environment. Another question I have is in terms of some of the platform limits you've run into. You mentioned that you've run into the 200-resources limit for one CloudFormation stack. What about any other limits you've run into as you're working with serverless and AWS more and more on a day-to-day basis?

spk_2:   11:24
I think one that comes to mind is the invocation limit that you get with Lambda, as it is a kind of account-wide thing. We haven't approached that limit, but that's something that we're keeping an eye on, and we know we can ask AWS to increase it for us if need be. But I think we'd rather let that be a reason why we favour having more granularity in terms of accounts. In the scenario where we do need more bursty, larger overhead in terms of Lambda invocations, then we know exactly why that's needed and what's causing it, as opposed to just having a knee-jerk reaction to expecting that we're going to reach that limit at some point. It's not a place where you want to be operating near to, ending up with throttled invocations. I think there are other limits, like the number of endpoints on API Gateway, but they're not something that we practically run into day-to-day, I'd say. We're a business that operates only in the UK — our customers are all UK-based — and that lowers the sort of complexity of being globally distributed, of having redundancy and everything else. We have a lower target to aim for in that sense. So as customers of AWS, you know, we're not often the ones pushing the boundaries; we're just benefiting from the low-hanging fruit that people before us have paved the way to achieve.
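A rough sketch of "keeping an eye on" the account-wide concurrency limit Nick mentions. The decision helpers below are plain logic; the comments note where the real numbers would come from (Lambda's GetAccountSettings API for the limit, CloudWatch's ConcurrentExecutions metric for current usage). The 80% alert threshold is an assumption.

```typescript
// How much concurrency is left before Lambda starts throttling invocations.
export function concurrencyHeadroom(limit: number, inUse: number): number {
  return limit - inUse;
}

// True once usage crosses the alert threshold (default 80% — an assumption).
export function shouldAlert(limit: number, inUse: number, threshold = 0.8): boolean {
  return inUse >= limit * threshold;
}

// With the AWS SDK v2, roughly:
// import Lambda from "aws-sdk/clients/lambda";
// const { AccountLimit } = await new Lambda().getAccountSettings().promise();
// // AccountLimit.ConcurrentExecutions is the account-wide limit; current usage
// // would come from the CloudWatch "ConcurrentExecutions" metric.
// if (shouldAlert(AccountLimit!.ConcurrentExecutions!, currentUsage)) {
//   // page someone, or raise a limit-increase request with AWS
// }
```

Alerting well before the limit gives time to request an increase or split workloads across accounts, as discussed above.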

spk_1:   12:49
I'd also add that, of course, when we started off using serverless there was a lot of angst around cold starts, and for that reason we didn't really fully invest in customer-facing functions. But our solution to that was — if you remember, we talked about being a kind of .NET house and having lots of expertise in the back end with C#, etcetera — we've kind of transitioned away from that towards more Node-based and TypeScript-based functions. And actually, there's been quite good buy-in on that from our C# back-end developers.
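For listeners coming from .NET, the kind of Node/TypeScript function Steve describes moving to looks roughly like this minimal sketch. The event shape follows API Gateway's proxy integration; the endpoint, field names, and response body are made up for illustration.

```typescript
// Minimal API Gateway proxy-style Lambda handler in TypeScript.
interface ApiGatewayProxyEvent {
  pathParameters?: Record<string, string> | null;
}

interface ApiGatewayProxyResult {
  statusCode: number;
  headers?: Record<string, string>;
  body: string;
}

export async function handler(event: ApiGatewayProxyEvent): Promise<ApiGatewayProxyResult> {
  const userId = event.pathParameters?.userId;
  if (!userId) {
    return { statusCode: 400, body: JSON.stringify({ error: "userId is required" }) };
  }
  // In a real function this is where you'd fetch the credit score from a data store.
  return {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ userId, creditScore: 720 }),
  };
}
```

Because the handler is a plain exported function, unit tests can invoke it directly with a fake event — which is part of why the lack of local emulation matters less, as discussed earlier.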

spk_0:   13:29
One other thing I remember from our time working together was that we were doing a lot of work on the logging infrastructure, and one of the limits that we ran into was the fact that you can only have one subscription filter per log group. The other day I learned that you can actually have more than one, but the only way to do it right now is to raise a support ticket — not a service limit raise, but a normal support ticket — and ask AWS, well, specifically the CloudWatch team, to allow more than one subscription filter for the account. So with all this transition you guys are doing — and it's definitely a transition that I'm seeing quite a lot of — are there any things that you think AWS can do better? It could be documentation, or maybe some services that are just not as easy to work with. Have you guys done much work with Cognito since we last spoke?

spk_2:   14:24
No, I don't think we've done much more since we last spoke. A couple of things that I had in mind, to deal with the limitations — things that AWS could improve upon. One is just further development of products like the CDK, the Cloud Development Kit. We've done some probing in that direction, and it feels like a really sensible way to actually describe infrastructure — using a fully fledged programming language and having actual programmatic control over your infrastructure, treating it like data and being able to operate on it, keeping it source-controlled, in the same ways as you would with a YAML file — but it's just a lot better than a YAML file. But the difference at the moment between, say, the Serverless Framework or SAM versus CDK is that the community and the library of plugins around the Serverless Framework is far superior to anything that's in CDK at the moment. So you'd actually lose quite a lot of the benefit by moving directly over to CDK right now — there are going to be a few barriers towards that adoption. So I would say AWS could just invest in creating something approaching parity with the wealth of plugins that the Serverless Framework has versus what CDK has. I guess, in addition to what AWS could do better, part of what I found to be quite limiting, or at least inflexible, with AWS Lambda as a product, was the ability to control versioning of the deployment artefacts — in the sense that there is a concept of versions and aliases within Lambda, however they don't behave in the way that an organisation would like them to. You're not able to describe your own version number, nor do aliases provide the same ability to quite clearly describe what version of the Lambda is running.
The way that we would like to define the version of what is running on a given Lambda at any given moment is by taking a build number from our CI pipeline. Our CI pipeline is part of a different system — we use CircleCI — and that's how we choose to identify our build artefacts. But there's no way to naturally pull that into Lambda, so we've had to navigate around that problem by just using environment variables, as an easy way to make it clear what's running and then to be able to get that into our logs. So for me, that would be something that AWS could improve upon. Additionally, we didn't have a great experience with provisioned concurrency recently. We tried it out not long after it was announced a few months ago. We did that with a .NET solution, and we think perhaps that might have been part of the reason why we didn't have a great experience. We were able to set a greater provisioned capacity for a given Lambda, but that had the negative impact of making it immediately more difficult to understand which Lambda was the one that was provisioned and had the capacity — it was difficult to intuit from the UI, and to navigate the UI to understand which Lambda was being invoked and how you could see its logs. Furthermore, it didn't actually solve the problem — it didn't reduce cold starts. We still found, even in production-like load testing, that it really wasn't reducing the cold start time of the .NET Lambda. So we struggled to see the benefit of it. Perhaps it's just very early days, and we're willing to give it more time, but that's been a challenge for us recently.
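The environment-variable workaround Nick describes — stamping the CI build number onto every log line — can be sketched as follows. BUILD_NUMBER is an assumed variable name, wired in at deploy time (e.g. from CircleCI's CIRCLE_BUILD_NUM through the Serverless Framework's `environment:` section).

```typescript
// Expose the CI build number to the function via an environment variable,
// then include it in every structured log line so logs identify the build.
export function buildVersion(env: Record<string, string | undefined> = process.env): string {
  return env.BUILD_NUMBER ?? "unknown";
}

export function logWithVersion(
  message: string,
  env: Record<string, string | undefined> = process.env
): string {
  const line = JSON.stringify({ version: buildVersion(env), message });
  console.log(line);
  return line;
}
```

With the version on every log line, a CloudWatch Logs query can filter by build, which is the traceability Lambda's own numeric versions don't give you.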

spk_0:   18:00
So I think that's probably a case of bad documentation, or at least the user guides are maybe not as clear, because the cold starts — essentially the initialization time — still happen and are still reported in the logs, but they happen ahead of time. So even though you still see those being reported in the X-Ray traces, as well as in your logs, they're actually not going to be on the user-facing path. If you look at the actual end-to-end response time for a function, then you won't see those cold starts, if the provisioned concurrency has been created. Because when you add provisioned concurrency, it gets added to a particular version, not the $LATEST alias. If you're deploying with the Serverless Framework, every deployment is going to create a new version, and you have to have something that shifts the provisioned concurrency to that latest version — and you can't allocate provisioned concurrency onto the $LATEST alias. This is probably where you're seeing some of the confusion as to why it's not working. Whether you're actually using provisioned concurrency or on-demand concurrency is something you can see in the metrics for the particular version or alias that you are using. As for aliases: if you want to get your build version number into your logs, then the only way to do it right now is through environment variables. Even if you have it as your alias, there's no easy way at runtime to figure out what the alias is. And you can create aliases with any arbitrary names — to your point about version numbers, we're all used to semantic versioning, where version numbers follow the semantic version structure, but with Lambda, and with layers, they all just follow 1, 2, 3, 4, 5, so that's not particularly useful. The idea is to use aliases, and if you want semantic versioning, then you have to put that in your alias name instead.
For the provisioned concurrency stuff, I do think there are quite a few things that they need to iron out in the developer experience so that it is less confusing — and I do think right now it is quite confusing. Also, I don't think it's intended for everybody to use, but if you're running .NET, then I do think you guys should be able to take advantage of it and use your existing .NET code, and not have to worry about some of the cold start costs of running .NET. So maybe that's something we can take offline and talk more about in detail — maybe I can give you a hand figuring out what was going on when you were running with provisioned concurrency. Also, with provisioned concurrency, if you've got a spike in traffic, you can still get to the point where there's just not enough provisioned concurrency, so that you end up using on-demand concurrency, at which point you're going to see cold starts again.
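The version/alias mechanics Yan walks through can be illustrated with a small sketch: provisioned concurrency attaches to a published version or to an alias pointing at one, never to $LATEST. The params builder below matches the shape of Lambda's PutProvisionedConcurrencyConfig API; the function name, alias name, and count are illustrative.

```typescript
export interface ProvisionedConcurrencyParams {
  FunctionName: string;
  Qualifier: string; // a published version number or an alias name — never "$LATEST"
  ProvisionedConcurrentExecutions: number;
}

export function provisionedConcurrencyParams(
  functionName: string,
  qualifier: string,
  executions: number
): ProvisionedConcurrencyParams {
  if (qualifier === "$LATEST") {
    // Lambda rejects this; surfacing it early captures the constraint Yan describes.
    throw new Error("Provisioned concurrency cannot be allocated to $LATEST");
  }
  return {
    FunctionName: functionName,
    Qualifier: qualifier,
    ProvisionedConcurrentExecutions: executions,
  };
}

// With the AWS SDK v2, roughly:
// import Lambda from "aws-sdk/clients/lambda";
// await new Lambda()
//   .putProvisionedConcurrencyConfig(provisionedConcurrencyParams("my-fn", "live", 10))
//   .promise();
// After each deploy, the "live" alias must be repointed at the newly published
// version, so the provisioned instances serve the new code.
```

This is why a plain deploy to $LATEST silently bypasses the warm instances: traffic has to be routed through the qualified version or alias that holds the provisioned concurrency.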

spk_2:   20:34
Yeah, understood. I think largely our measurements weren't just from the measurements of the start time exclusively from the AWS CloudWatch logs — it was actually extended monitoring, using external systems to measure the full latency of a request that should have been served by provisioned capacity. So we're reasonably sure that it's not just a problem of reporting, but we'll be happy to take that offline and look at it in more detail.

spk_0:   21:03
So in that case, did you have API Gateway pointing to the right alias?

spk_2:   21:08
We believe we did all that correctly. We were using the recent updates to the Serverless Framework that support provisioned concurrency, and we were still able to get valid responses — the system works and behaves as expected in terms of input and output. It was just simply a problem of the latency that we still experienced.

spk_0:   21:29
There are a couple of things that you have to make sure of, I would say. When you do that, it's best to check in the console and make sure the configuration is exactly as you expect, because by default the Serverless Framework doesn't configure API Gateway endpoints to talk to anything other than the $LATEST alias, and you can't stick provisioned concurrency against $LATEST. And also, after you deploy — even though the deployment has finished — the provisioned concurrency itself can take a few minutes to get provisioned, so if it's hit right away, those requests are not going to be hitting the provisioned concurrency. But let's definitely take this offline and talk more about it afterwards. So I guess we're coming up to the half-hour mark, and I want to take the time to thank you guys very much for joining me today. If you've got anything else you want to tell the listeners about TotallyMoney — maybe you guys are hiring? What roles are you looking for right now?

spk_1:   22:24
Yeah. So we are hiring. We're going to try and add a bunch more developers to our development team. So if you're listening and you like the sound of that, then just get in contact.

spk_2:   22:38
We're in London.

spk_1:   22:39
It's a really cool company to work for — come and work here.

spk_2:   22:42
In 2019 we were voted number 52 in the Sunday Times rankings of small to medium enterprises to work for, and we'll be hiring for front-end roles and back-end roles, you know, in our engineering team.

spk_0:   22:55
And you guys are still based near Old Street, right?

spk_1:   22:57
That's right, near the Magic Roundabout.

spk_0:   23:00
Excellent. So you heard the guys — if you're looking for an interesting role and you're looking for a serverless company, then check these guys out. Thank you guys again very much, and I'll see you next time. So that's it for another episode of Real World Serverless. To find the show notes and the transcript for this episode, please go to realworldserverless.com. I'll see you guys next time.

how TotallyMoney uses serverless
on moving from .NET on EC2 to FaaS with Node/TypeScript
on the benefit of going fully serverless
on lack of local development
on testing
on platform limits
on the need for better versioning control of Lambda functions
provisioned concurrency is confusing