Real World Serverless with theburningmonk

#1: Serverless at Zoopla with Ricardo Espirito Santo

February 18, 2020 Yan Cui Season 1 Episode 1

I caught up with Ricardo Espirito Santo to talk about Zoopla and how they're using serverless, their experience with it so far and the challenges they have experienced with serverless technologies.

You can find Ricardo on Twitter as @ricardoespsant and check out Zoopla's open positions here.

For more stories about real-world use of serverless technologies, please follow us on Twitter as @RealWorldSls and subscribe to this podcast.


Opening theme song:
Cheery Monday by Kevin MacLeod
Link: https://incompetech.filmmusic.io/song/3495-cheery-monday/
License: http://creativecommons.org/licenses/by/4.0/

Yan Cui:   0:13
Hi. Welcome to another episode of Real-World Serverless, a podcast where we talk about the real-world use cases and challenges of serverless technologies with engineers who are working with them day to day. Today we have a special guest, Ricardo Espirito Santo, who is a lead software engineer at Zoopla. Welcome to the show.

Ricardo Espirito Santo:   0:33
Hi. Thank you for having me.

Yan Cui:   0:35
Yeah, Ricardo, we met about two years ago at a meetup in London. I think at the time you were working for a company called Evrything where, if I remember correctly, you were building an IoT platform on top of Lambda. And at the time you guys had the highest usage of Lambda in Europe.

Ricardo Espirito Santo:   0:55
Yep, that's it.

Yan Cui:   0:57
Since then, I think you've moved on to Zoopla and, based on our earlier conversation, you have continued to use serverless heavily.

Ricardo Espirito Santo:   1:05
Yeah, that's pretty much it.

Yan Cui:   1:07
Cool. So tell us about Zoopla and yourself. What do you guys do?

Ricardo Espirito Santo:   1:11
Okay, so we're really trying to reimagine intelligent home decisions for all. That's essentially helping people find out what their home is really worth, find their next place, or sell their current one, everything around that problem space.

Yan Cui:   1:29
And what do you do on a day-to-day basis? And how is Zoopla using serverless technologies?

Ricardo Espirito Santo:   1:34
Right. So at Zoopla in particular, I'm looking after two teams, and we're trying to retarget users both off-site, in social networks and different channels, and also on-site, with relevant, highly personalised content that's gonna help them conclude some user journey that they have started but haven't finished. So serverless, for us, is quite relevant. We use it throughout, but I guess at Zoopla we're still very much in an infant state of learning and using serverless. We have different implementations of different patterns of it here and there. But it's definitely becoming the new direction, the new norm; it's very much being expected. And it's just something that we're starting to focus on more and more right now.

Yan Cui:   2:25
Okay, maybe talk us through some of the things that you guys have been building using serverless technologies. I guess you're running on AWS with Lambda?

Ricardo Espirito Santo:   2:33
Yes, that very much is the case. We use a plethora of AWS services and products, starting from API Gateway. We obviously use CloudWatch alarms, events and logs. We try to use a bit of their monitoring system with the logs and the alarming. Obviously the Lambdas, and we do have some things in Fargate as well. That's off the top of my head. I think that's pretty much the ecosystem that we have used for serverless at Zoopla right now.

Yan Cui:   3:55
So for the stuff that you're building with API Gateway and Lambda, are you doing many user-facing APIs where you're concerned about cold start performance and latency, or is this more background processing that you're doing?

Ricardo Espirito Santo:   3:55
So we're definitely in the early phases, as I mentioned earlier on, and we're experimenting with some business-to-business integrations. I've been developing with API Gateway, with Lambda processing the payloads in the background and maybe dropping some of those payloads somewhere as a secondary backup. So we're experimenting with some of those patterns. It's a very distant reality from having everything running serverless, but it's something we're exploring. We're implementing a few new patterns.

Yan Cui:   3:56
I actually see quite a lot of customers who are using a combination of both: serverless, with Lambda and API Gateway and so on, along with things like containers on ECS and Fargate. And one of the common questions that comes up quite a lot in this particular space is, "Well, how do you decide when to use one versus the other? And also, how do you make the two of them work nicely together, both from an infrastructure point of view and in terms of the tooling you use?" Maybe talk us through some of the challenges that you guys have experienced in this particular space. What is your thinking around when to go serverless and when to go Fargate? Or do you consider them to be the same thing?

Ricardo Espirito Santo:   4:33
Sure. So those are very good questions, and I guess they are the million dollar questions, right? We tend to judge everything on a case-by-case basis and then try to eke out of each decision a pattern, something that works for Zoopla, a bit of learning experience that we can leverage in future implementations. So you've touched on some very key points in there. Having serverless side by side with legacy systems, or with non-serverless systems, for example, is always a bit of a tricky one. There are patterns that helped us evolve generically in that direction, where you perhaps front your existing APIs with an API Gateway and get a Lambda in the background, or just the API Gateway with infrastructure as code, to route your newer calls to your microservices or to your newer Lambda functions. And then the rest, everything else that you haven't migrated yet, goes to your existing ELBs or ALBs, and from then onwards to your existing systems. So that's a pattern that we're looking to leverage at some point. We've definitely put one in place in some internal experimentation, and it's been quite successful.
When it comes down to how we make a choice, which I think is something else you were touching on, it's on a case-by-case basis. New implementations tend to have a bias towards going serverless, for a lot of the benefits that we get out of it. For the stuff that already exists, it perhaps comes down to expectations of delivery, delivery times, roadmaps, etc., so the conversation gets very much business focused there. If we have the time to fully replace a particular piece of the system, then we consider doing it with a serverless approach. Where we don't, and it's a quick change or an easy win to make the change on the existing system, or maybe we don't have the full scope to update that system into a serverless model, then we still make that change where the system is, where it has been running for the past 10 years or however long it's been in production. So yeah, it's case by case, and we try to get the best of both worlds. There is a roadmap, but we're also trying to deliver value to our customers as soon as we can, in the best and most efficient way.

Yan Cui:   7:13
So that resonates with a lot of things I've heard, and certainly with my personal experience as well. Some components are just not worth migrating to serverless, because to get the most out of it you have to rethink and rearchitect your system, and even though serverless gives you a lot of benefits, sometimes it's just not enough benefit for you to take on all that work to migrate an existing system. So something like Fargate is a really good middle ground, where you still get some of the serverless benefits in terms of not having to manage the underlying cluster and all of that, but you can lift and shift existing applications into the cloud, into an auto-scaling environment, and all of that.

Ricardo Espirito Santo:   7:51
Yeah, yeah, that is my experience as well.

Yan Cui:   7:55
And you also touched on the benefits of serverless. I know that people talk about how scalable it is, how cheap it is. But from your point of view, where do you think the main benefit is for Zoopla, and also for you as an engineer?

Ricardo Espirito Santo:   8:11
Right, so that's a question that we could spend the next couple of hours discussing. There's definitely a lot to be said about the different levels of cost cutting, which you touched on. For example, if you just consider the execution side, the amount of time that a particular Lambda isn't going to be running, while the equivalent ECS cluster would have to run to provide the same level of functionality, then there is a value to that, right? But what we're seeing nowadays is mostly about being able to deliver the same level of functionality from a non-functional requirements perspective, if that makes sense. So with all of the niceties that you get out of the box with the serverless products from a good cloud provider, like AWS, Azure or Google Cloud, what we see is that you are able to achieve the business value, and a lot more around it, with a lot less investment in terms of development time, potential bugs, or having to have difficult conversations about aligning the entire company to a particular implementation technology, etc. It's just provided for you in the cloud. And so just from a cost perspective, that's definitely a big benefit for us. But then we also touch on the other nines, as they're known. So the resiliency, the reliability, the scalability, all of those are actually a big, big focus for us as well, and they get mentioned more and more frequently in our architectural discussions and considerations.

Yan Cui:   10:02
And I think you just briefly touched on some of the challenges there, in terms of aligning the entire organisation towards one particular set of technologies. What are some of the challenges that you have faced with the transition to serverless, taking a company like Zoopla, which has got quite a lot of engineers? What are some of the organisational but also technical challenges that you guys have faced?

Ricardo Espirito Santo:   10:23
So again, very good questions. I guess that at some point there's a fine line between trying to optimise for an entire engineering department to have a consensus around how to do things, how to implement and develop things, versus forcing everyone to stick to the same platform, to the same implementations. So we try to keep that somewhat flexible, and we allow a lot of responsibility and flexibility. So if you're gonna own something that makes sense, you're gonna have it reviewed by your peers, you're gonna have it commented on, you're gonna present the work. And if that works, then you're gonna go ahead with that particular implementation, and that naturally cross-pollinates to others. That is how we try to develop and naturally get a sense of alignment from the work that's being developed, not necessarily with top-down or bottom-up rule writing to start off with. So I guess that we're generating the culture and the best practices as we go along, through the work we do. So if we're developing something and that something seems to work really well, then we present that, and other teams naturally try to follow the same way unless they have a reason not to. And if they do, then that becomes the new norm, or maybe not, and there are just two different implementation patterns that come up.

Yan Cui:   11:52
So in that case, have you had cases or examples whereby some teams just refused to consider serverless for whatever reason, because "we've always known how to run things on servers, and all I want to do is spin up a server and run some code"?

Ricardo Espirito Santo:   12:09
Not necessarily refused, but at the same time you have people that have done things in a certain way for a very long time, and that's just the way that they're most productive. And certainly there's only so much effort I can push to try to convince everyone around me. But Zoopla is a pretty particular place where people are generally trying to evolve and learn constantly. And I find that if you try to have the serverless conversation, most people will be aligned culturally. Yes, you will have some challenges, and some teams are just wired to develop things faster in a more traditional way; they rely a lot more on the infrastructure as code that was provided and the frameworks that exist in the business. That's not to say that they're necessarily opposed to doing things the serverless way, let's call it. It may be more of a comment that they feel that the work that they're delivering now is more valuable to be out the door quickly and earlier, rather than spending a lot of time on revamping the entire way of developing and deploying things in a serverless way, a totally different approach. I guess that that's where the distinction is right now.

Yan Cui:   13:27
Okay. And what about in terms of platform limitations? Or are there any tooling challenges that are actually making your life difficult when you're working with serverless from a day-to-day point of view?

Ricardo Espirito Santo:   13:39
Yeah, absolutely. So there's definitely something to be said about how difficult it still is to be truly cloud agnostic. I know some of the tools that claim to do this do it with relative ease and relative capacity, and achieve it to some degree. But that for us is still very much a hard point, to be fully cloud agnostic. AWS, for example, is very much focusing on developing the right products and listening to their users for that. But then I would guess that when they develop things, perhaps they could develop them from a more customer-centric view. Sometimes some of the APIs that they put out feel much more like a tick-box exercise to say, "Yeah, I've got an API", rather than coming at it with a UX frame of view: how would these APIs be best served? And that's just to mention the APIs. But then if you think about the tooling developed on top of that, we're very much focused on delivering things with IaC; infrastructure as code is very big for us. We use Terraform for the majority of it. But if you think about what Terraform is doing, it is literally built on top of the APIs and mimics the APIs, mimics what CloudFormation delivers. And I feel that that's where it started. The UX of the SDK, the API for AWS, for example, wasn't properly UX'd before it was developed. And definitely if you have to do something directly with CloudFormation, if you like, it's not the easiest tool to handle. Terraform, for example, helps a bit with that, but it still has the limitations of the underlying infrastructure it relies on. So you see that when you're creating IAM roles, for example, at some point you're just dumping JSON, because that's the most descriptive and easy thing you can do, because realistically there isn't a better solution out there.
So, yeah, I think serverless is still very much an infant. Maybe infant is a bit of an unfair comment, but it's very much a new technology, and we're going to see a lot of improvement, a lot of benefits out of all the effort being put into developing it further in the next few years.

Yan Cui:   16:23
Yeah, I definitely agree, and certainly from the tooling point of view, I constantly run into limitations around tooling. It's funny you mentioned the sort of problems with CloudFormation and Terraform. I have run into those same problems so many times, and it's so frustrating. There are so many times when AWS announces a new feature or a new service, and then you check: oh, it's not supported in CloudFormation. Just the other day I was using SSM, and I tried to store a SecureString in SSM Parameter Store and then realised that, oh wait, it's not even supported by CloudFormation. How did that happen? And then I figured out: okay, right, the only way that someone has got it to work is to use a custom resource. You have to write a Lambda function just to plug a gap in CloudFormation, which really should be supporting creating a SecureString in SSM Parameter Store. But yeah, I understand that the team is under a lot of pressure, and they're doing some pretty good things now. They've opened up their roadmap; it's public now. You can go to GitHub and vote for the features that you want. So if you are like me, waiting for features to be implemented in CloudFormation, then definitely go and check out their GitHub page and vote on the feature request that you're looking for. There's one comment you made earlier that I just caught: you mentioned being cloud agnostic. Usually that's a shtick that you get from a lot of, I guess, private cloud vendors who try to sell you containerised solutions. So it's odd to hear you mention it, since you guys are also using a lot of stuff like Fargate and Lambda, which are, you know, not exactly cloud agnostic. Those are not exactly the components that I would have thought of.

Ricardo Espirito Santo:   18:17
Yeah, I mean, I think someone way cleverer than me said one day that it's not a matter of whether or not you're going cloud agnostic, it's a matter of when you're going cloud agnostic. We're not necessarily considering other clouds right now at Zoopla, but it's something that I would like to prepare myself for as an engineer, to employ best practices and best code. If I had a better way to do it, tooling to leverage right now that would allow me to choose the actual implementation at a later point, I would think that's a lot closer to how I was brought up as a developer. As a junior engineer, it was very much the interface-first kind of mentality: you draft the interface, and the implementation comes next. I guess that's where that mentality comes from. Right now, we don't have a need for it at Zoopla. That's not to say that that's not going to change. It's just that if I could, I would be doing things a bit more cloud agnostic. But we don't have a need for it, and at the same time, I don't think the tooling is right for it right now. So it's a bit of both, I guess.

Yan Cui:   19:24
I guess what always fascinates me is less the question of how we do cloud agnostic, and more the question of why. In this particular case you said that Zoopla doesn't have a need for it. So in your experience, in your past employment, have you come across cases where you would say, "Right, okay, we do need to be cloud agnostic because of X"?

Ricardo Espirito Santo:   19:44
Yeah, for sure. In one of the other companies I've worked for, we had one of the very practical examples. We had a client that was very much tied up with businesses in Asia, and they felt that there the market for Azure was way stronger, and that's what they aligned themselves with. So we actually had a known software customer asking us for our software to be deployed in their region, but to be deployed in another cloud. So there was a particularly specific requirement from their client to just have this set of functions delivered in Azure: "Our data is going to flow through your system too, so we kind of feel that we have the right to ask for that." And yeah, it became an overnight challenge for us to go cloud agnostic for a particular part of our system. But it highlighted how not ready we were for it. And I think if you put a lot of companies through that challenge, they're all going to be facing the same sort of challenges that we were.

Yan Cui:   20:53
Yeah, that's an interesting one. I've heard that a few times, that "we need to be cloud agnostic because our clients say so, because a client asks us to use a particular cloud provider". I guess that is fair, and sometimes you just have to do what the client asks you to. I'm still yet to be convinced by any product company that tells me they need to be cloud agnostic. Whenever I have that conversation, it always just goes back to, "Well, because we want control, because we need to be able to move cloud quickly." I've been talking to some local banks in Holland; I'm based in Amsterdam these days. And apparently when you are a bank in the Netherlands, you have to fill in a questionnaire, and one of the questions is, "What's your plan for moving from one cloud provider to another?" And apparently a lot of companies, a lot of banks, interpret that as "you need to be cloud agnostic". But I was talking to the guys at this company, Moneyou, who are doing some really amazing things with serverless and also on AWS as well, and apparently that's not the case. That's not what's meant by that particular question. All they wanted to hear was, "Okay, this is something that we've thought about." I think at a macro level, the central bank's position really is, "We don't want all of our banks to be running on AWS, on one cloud. We want to mitigate the risk of one cloud provider causing problems to our entire banking system." But it's very interesting how a single question often drives people towards this illusion that they need to be cloud agnostic because that's a requirement, even though the question is just a really open question about, "Well, have you thought about what you're going to do?"

Ricardo Espirito Santo:   22:28
Yeah, for sure. I mean, as you said, it is quite interesting. From a disaster recovery point of view, I think that conversation has somewhat limited expressiveness if you think about, again going back to it, the number of nines you get after the 99 point x percent of reliability, resiliency and scalability that some of the big players provide. So it has a limited expression in that sense. But you can think about it from the perspective that some clouds are better at some things than others. I don't know this from experience, but what I hear is that Azure is particularly strong in its AI and ML offerings, whereas perhaps AWS was first to market with a few other tools, if you like; RDS is more developed, for example. And so those comparisons get more and more interesting the more details you bring into the conversation.

Yan Cui:   23:31
Yeah, that I can definitely agree with. I think that's great. I've been using Google BigQuery for many years, and comparing the two, I probably just lean slightly towards Google BigQuery in terms of being a better product. I guess that comes back more to the multi-cloud conversation as opposed to the cloud agnostic conversation. But I do think there is a danger as well with multi-cloud, because it's gonna be really hard to hire anybody who is familiar with multiple clouds, and also in terms of the cost overhead. If you think about enterprise support and all of that relationship building that really helps you get out of really deep trouble when things are going wrong, you'd have to build that relationship and that support with multiple cloud providers. And also from a reliability point of view, if on a particular vertical you are integrating with multiple cloud providers, then that means if one cloud provider goes down, your entire stack goes down, which kind of flies in the face of using multi-cloud for greater reliability; you've created multiple single points of failure. But yeah, in terms of cloud providers, is there anything that you think they should be doing better, in terms of improving a critical area that's going to really help customers that are looking to adopt serverless?

Ricardo Espirito Santo:   24:57
So I think I touched on that earlier. Just applying some of the UX principles to APIs and SDKs, developing with user-centric views first. Think about coming out with products that serve an end-to-end use case, rather than, "Oh, here is another product; it does more or less the same thing as API Gateway, but it does it slightly differently." And then you're left wondering which one to choose and which one's best. And especially if you're an early adopter, that takes quite a lot of energy to try to figure out. "Okay, so I'm gonna wait until I have read a bit more about these, because I'm really not sure if I'm using this right." And sometimes in this field it is still quite hard to come across good sources, and you've got to know where to pick your sources from to have reliability in what you're learning. And some of the documentation could be a bit better. I find that a few of the examples are very useful for getting the hello worlds out there, but then once you're past that point, you're really stuck without a lot of help after that. I think some support, maybe listening to what's going on in Terraform, and I'm very biased towards Terraform, right? But looking into what's going on in their forums, listening to their users, from that perspective would be very useful. But yeah, I think that's about it. I think all of the big companies in this space are doing an excellent job; this is almost nitpicking some finer details.

Yan Cui:   26:42
Yeah, I think those finer details are very valuable feedback for AWS. So anyone who's listening from AWS, now you've heard it over here. But I really like the things you mentioned. Certainly from the documentation point of view, one of my constant, I guess, complaints about the documentation is that AWS is always telling us to use infrastructure as code, but then all the documentation examples are "go to the console, click this button". Talk about consistent messaging here; we're getting mixed messages. You also touched on the UX side of things as well. One of the things that's quite peculiar about AWS in terms of how it operates is that internally they've got this way of building a new service where they write the release statement first, to make sure that, okay, we really understand the use case for this particular service, before they go ahead and build it, which I think is great. And I think they also get a lot of feedback from customers early on, when they are thinking about a use case for this new service. But there's a massive gap in terms of when they are implementing the service. Often there's just no feedback until right at the very end, when they're about to release something; that's when they start asking customers to go and do a private beta or similar, with people such as myself who are really, I guess, ingrained in the community, to give feedback. But at that point there's already a massive amount of work that has been done, and then your feedback often comes quite late, especially when you've got new services that are about to be launched for re:Invent. So people are ready to pull the trigger, and then you try to get the feedback in to change things at the last minute, and oftentimes the change doesn't get in there because of time pressure. So I think in terms of the product development cycle, that's something that they really should be improving: getting more intermediate feedback from customers who are interested in the products that they're building.

Ricardo Espirito Santo:   28:37
Yep, absolutely. I couldn't agree more. I think you put it quite well.

Yan Cui:   28:42
And so I think that's everything I wanted to cover with you. So tell us, how can people go and find you, find out about you, and how can people read about the stuff you're working on?

Ricardo Espirito Santo:   28:55
Okay, so as we said in the beginning, I'm working for Zoopla right now. I'm an engineering lead here. You can find me on Twitter; there's a handle somewhere, ricardoespsant. I'm pretty sure I've got a Stack Overflow profile as well, and a GitHub profile, and they all follow the same sort of pattern. And that's pretty much it, I think.

Yan Cui:   29:26
Okay, so we'll include those in the show notes, and thank you very much for joining us today.

Ricardo Espirito Santo:   29:33
Thank you. Just one last comment on Zoopla. If you've liked what you heard me saying about it, and if you'd like to join us, we're always looking out for good engineers. We've got plenty of difficult challenges to tackle and interesting problems to solve, so we need help with that.

Yan Cui:   29:54
You heard the man, so go and apply if you want to work with AWS, serverless and Fargate.

Ricardo Espirito Santo:   29:59
Yeah, absolutely. Thank you very much for having me.

Yan Cui:   30:01
My pleasure. Take care, everyone.

How does Zoopla use serverless?
How does Zoopla choose when to use Lambda vs Fargate?
What are the main benefits of serverless for Zoopla?
What are the challenges Zoopla faced when transitioning to serverless?
What are the platform and tooling limitations that make life difficult for you when working with serverless?
Let's talk about cloud agnostic
On multi-cloud
How can AWS do better? Where can they improve?
Wrap up