#46: Serverless at Polestar with Anders Quist

Real World Serverless with theburningmonk

A podcast where we talk about real-world use of Serverless technologies from engineers who work with them day-to-day. We will discuss use cases, why they chose serverless and the pain points and challenges they face. If you want to know what it's REALLY like to work with serverless, this is the show for you.

January 27, 2021 • Yan Cui • Season 1 • Episode 46

You can find Anders on Twitter as @ea_quist and on LinkedIn.

Here are the links to things we discussed in the episode:

Polestar, a Swedish car company where Anders works
GraphQL Mesh, an interesting technology similar to Apollo Federation

To learn more about opportunities at Polestar, please get in touch with Anders at anders@qinfo.se.

For more stories about real-world use of serverless technologies, please follow us on Twitter as @RealWorldSls and subscribe to this podcast.

To learn how to build production-ready serverless applications, check out my upcoming workshops.

Opening theme song:
Cheery Monday by Kevin MacLeod
Link: https://incompetech.filmmusic.io/song/3495-cheery-monday
License: http://creativecommons.org/licenses/by/4.0

Yan Cui: 00:13

Hi, welcome back to another episode of Real World Serverless, a podcast where I speak with real world practitioners and get their stories from the trenches. Today, I'm joined by Anders Quist. Hey, Anders.

Anders Quist: 00:25

Hi, Yan. Thanks. It's great to be joining you in this podcast.

Yan Cui: 00:33

Yeah, so we've actually, I guess, crossed paths a few times now. You actually took my workshop a few months ago. You were a great participant, by the way, very active on the chat. And also, you offered help on some of the open source issues that, I guess, we highlighted during the workshop. So I guess maybe let's start by you telling us about yourself and what you're doing nowadays?

Anders Quist: 00:59

Yeah. Well, as you said, my name is Anders. And I've been working as a system developer and architect for well over 20 years now. That's the majority of the professional part of my life, I would say, in various roles and various contexts, in everything from developer and architect to chief architect and engineering manager, in consultancy firms and product companies, in lots of different domains as well, both logistics and payments and accounting. And now, I'm in the automotive industry. I'm currently working at a company called Polestar; they produce and sell electric cars. And I'm part of something called the Polestar Devhouse. The Devhouse is kind of a virtual development department that spans several companies in a joint effort, and works closely with the business both to build the Polestar digital presence and to continuously perform digital business development. And I'm part of something called the platform services team. We do a lot of things, but the short version is that we handle the cross-cutting concerns that span all the development teams that are part of the Devhouse. We help them with tools and also assist them so that they can focus on delivering value for the majority of their time, and we also continuously revise the architecture that everything is built on, so we can reap potential benefits from, for instance, updated or new services from our cloud providers, amongst many other things. That's what I'm doing right now. Yeah.

Yan Cui: 03:32

Okay, so Polestar sounds like quite an interesting company. Certainly the whole electric car space is going through a big boom right now, I guess in part because of what Tesla has been doing. So I guess, in this case, can you tell us a little bit about how Polestar is using serverless technologies? I guess when people think about electric cars, they probably think about Tesla and think about auto driving and all of that stuff. Where does the serverless technology fit into that puzzle?

Anders Quist: 04:03

Yeah. I don't see the total extent of everything software related that Polestar does, so to speak. But the aspect that we are focusing on is, shall we say, from where you actually want to buy a car, and you go, using either the app or a web browser, and configure what options you want, and then you perform a payment or place an order. And you can follow and track that order and see your car being built and delivered, etc., all the way to where you want to look up something in your manual or things like that, and also the business-to-business parts like handling leasing and fleets, etc. So that part or aspect of Polestar is what is on our horizon in the Devhouse. And more or less all of that is realised through the Serverless Framework. So the business logic that handles all this is deployed using that technology.

Yan Cui: 05:32

Okay, so you talked about the Serverless Framework, and I guess you are using that to deploy all of your APIs as well as Lambda functions. Can you maybe touch on some of the other services that you're using? I guess, you know, you mentioned the e-commerce aspect of the Polestar website, and the B2B systems are done using serverless components. So I'm guessing API Gateway, Lambda, DynamoDB. Anything else that's in the mix?

Anders Quist: 06:00

Yeah, of course, there are actually quite a lot of services. But you touched on the most common ones, of course. And there's everything from Kinesis to Neptune in some cases. Elastic is also used in some cases, both for things like log aggregation, but also for more of the generic search functionality that is enabled for the end user. There are some constraints on what services we can use, because we also have the requirement to exist in China. So what is provided in AWS in China kind of puts a lowest common denominator on the services that are possible to use without too much extra work to get around when it comes to China. So that's an interesting and sometimes challenging aspect of this whole venture, so to speak.

Yan Cui: 07:25

Right. Yeah, I've had some experience with AWS China, and that is a notoriously difficult thing. Let's circle back to that. I want to touch a bit more on your architecture and understand some of the key highlights. For example, you mentioned Kinesis, and you mentioned a few other things there as well. How many microservices do you have? And in terms of how the teams are organised, how do you go about ownership of these microservices? And also, how do different microservices communicate with each other?

Anders Quist: 07:59

Yeah, that's an interesting question. Well, looking at the services we have, on a higher level I would call them islands, and they are mapped to the development teams and what is called a value stream. That's a business name for the domain that they are focusing on delivering. And almost all of them consist of a group of Lambdas that are often fronted by an API using GraphQL, and in most cases a static front end that consumes the GraphQL API, which only exists for that particular micro or mini service or island, if you will. Well, it has been quite an accelerated development. I didn't join this from the start, so I wasn't there when the initial decisions were made, and I can't really bring any insights into that, but I can tell you it has gone really fast. And the onboarding of new developers, etc., has, based on my experience, gone extremely well. There is now kind of a transition from having delivered these value streams into something more like a product-oriented part of the life cycle. So in terms of how many teams and how many value streams currently exist, it's kind of a moving target. But looking at what has been produced, I would say roundabout 25 of these microservices or miniservices have been developed, give or take. And, well, for the most part they are kind of autonomous. So we have a common pattern with SQS and SNS that is used both for some asynchronous communication within these micro or mini services and also between them when there are dependencies, of course, and also to handle integration cases with external systems like your payment service provider, etc. And, well, yes, I looked up the recent numbers: it currently consists of well over 700 Lambdas, a bit over 80 SQS queues, 50-plus DynamoDB tables, and some SNS topics going with that. So that's, on a higher level, what's going on behind the scenes, so to speak.
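
For readers less familiar with the SNS-plus-SQS pattern Anders mentions: when an SQS queue is subscribed to an SNS topic without raw message delivery, each queue message body is a JSON envelope whose Message field carries the published payload as a string. The sketch below shows how a consuming Lambda might unwrap it. The OrderPlaced event and its fields are hypothetical and are not taken from Polestar's system.

```typescript
// Hypothetical domain event; the field names are illustrative only.
interface OrderPlaced {
  orderId: string;
  market: string;
}

// Subset of the envelope SNS writes into an SQS message body
// when raw message delivery is disabled.
interface SnsEnvelope {
  Type: string;
  TopicArn: string;
  Message: string; // the published payload, serialised as a string
}

// Subset of an SQS record as delivered to a Lambda function.
interface SqsRecord {
  messageId: string;
  body: string;
}

// Unwrap the SNS envelope, then parse the inner payload.
function parseOrderPlaced(record: SqsRecord): OrderPlaced {
  const envelope = JSON.parse(record.body) as SnsEnvelope;
  if (envelope.Type !== "Notification") {
    throw new Error(`unexpected SNS message type: ${envelope.Type}`);
  }
  return JSON.parse(envelope.Message) as OrderPlaced;
}
```

The producing service only needs to know the topic, and each consuming service owns its own queue, which is one way to keep the "islands" loosely coupled.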

Yan Cui: 11:44

Okay, that's really cool. And I guess, with the GraphQL APIs you are building, are these Lambda functions running something like GraphQL.js or Apollo Server? Or are they using AppSync?

Anders Quist: 11:57

We've been looking at AppSync. But this was initiated back in May 2019, and AppSync was not viable in China then, which, as I mentioned, put some constraints on us early on. So the choice was made to go with Apollo Server Lambda, hosting that as kind of a monolith, if you want, in front of other Lambdas acting as resolvers. And I think it has served us well going that path. But in some aspects, being autonomous is also kind of a challenge when it comes to consolidating all these APIs, because they kind of live in isolation. You would have domain aspects being handled in multiple instances, and that could prove challenging to consolidate further on. And while AppSync is pretty powerful, I think it can possibly prove hard to migrate to AppSync from where we are today. But I have actually been tasked with looking into that. I think it's possible in some cases to do it. But in others, we might still be, well, stuck where we are, but it's not a bad place to be, I need to add.
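
A minimal sketch of the shape Anders describes, in which one GraphQL Lambda sits in front of other Lambdas that do the resolving. Everything here is an assumption for illustration: the invoke function is injected so the sketch stays self-contained (in a real service it would wrap an AWS SDK call), and the function names and fields such as orders-getOrder are hypothetical.

```typescript
// Injected function that invokes another Lambda by name and returns its
// parsed response. In a real service this would wrap the AWS SDK.
type Invoke = (functionName: string, payload: unknown) => Promise<unknown>;

// Build a GraphQL resolver map in which each field delegates to a Lambda.
// The function names ("orders-getOrder", ...) are hypothetical.
function makeResolvers(invoke: Invoke) {
  return {
    Query: {
      order: (_parent: unknown, args: { id: string }) =>
        invoke("orders-getOrder", { id: args.id }),
    },
    Mutation: {
      placeOrder: (_parent: unknown, args: { carConfigId: string }) =>
        invoke("orders-placeOrder", { carConfigId: args.carConfigId }),
    },
  };
}
```

The resolvers use the usual (parent, args) calling convention, so a map like this could be handed to a GraphQL server as its resolvers. Keeping the invoker injectable also makes each value stream's resolvers easy to test without deploying anything.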

Yan Cui: 13:54

Okay. And in terms of consolidation, have you guys looked into anything like, I guess, Apollo Federation, so that you can stitch a schema together from all of your 25 different value-stream-based GraphQL APIs into one coherent schema, to manage, I guess, some of the duplicated domain-specific operations that you mentioned earlier?

Anders Quist: 14:18

Yeah. Federation... It was up on the table early on. But we actually did a proof of concept with something called GraphQL Mesh. I can provide the link, but it's kind of a tool, or a facade, that acts like a proxy in front of multiple GraphQL APIs, or any kind of API for that matter, but we focused on the GraphQL parts. And you can make transformations so that the schemas can, well, the composite result can live together and be exposed as its own API in front of the others. So if you think of Federation as two-way, this is more like a one-way federation. We now have that as a POC, and we will see where it ends up. It's early on, but it looks very promising.
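
To make the idea of "transformations so that the schemas can live together" concrete, here is a toy sketch of one-way composition: several sources expose query fields with clashing names, and a facade renames them with a per-source prefix and remembers where to route each one. This is only an illustration of the concept. It is not GraphQL Mesh's API or configuration format, and the source and field names are hypothetical.

```typescript
// A downstream API, reduced to its name and root query field names.
interface Source {
  name: string;
  queryFields: string[];
}

// Where a facade field should be forwarded to.
interface Route {
  source: string;
  field: string;
}

// Compose a facade in which every field is prefixed with its source name,
// so that identically named fields from different sources cannot collide.
function composeFacade(sources: Source[]): Map<string, Route> {
  const routes = new Map<string, Route>();
  for (const source of sources) {
    for (const field of source.queryFields) {
      routes.set(`${source.name}_${field}`, { source: source.name, field });
    }
  }
  return routes;
}
```

The composition is one-way in the sense Anders describes: the facade knows about the sources, but the sources need no knowledge of the facade or of each other.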

Yan Cui: 15:41

Okay, that sounds interesting. I've not heard of GraphQL Mesh before, so I'll definitely check it out. And I'll include a link in the show notes as well, so if anyone else wants to look at how this works and how it differs from Apollo Federation, you guys are free to go and check it out yourselves. I want to circle back a bit to Polestar's decision to go serverless in the first place. It looks like Polestar is quite a big company with lots of different teams. And in this sort of enterprise environment, it sounds like you guys adopted serverless and it's gone incredibly well for you in terms of delivery speed. How did Polestar decide to go serverless in the first place? And can you maybe quantify a little bit how well it has been going for you?

Anders Quist: 16:26

Yeah, of course. Well, I wasn't part of this when the decisions were made initially. But from what I've heard, they wanted something that was state of the art in terms of building, delivering and running applications. And so they had kind of a competition, I think, between two main actors, and they were very pleased with what they saw in the serverless way of doing things. And in terms of that, I think it has been very successful for them. Well, of course, there have been some bumps along the road; there always are, I would say, in any development project that I have been in. And that's part of the job, I think, solving the unexpected. And also, as I mentioned, scaling this: we have this development initiative mainly in a town called Gothenburg here in Sweden. It's not a very big town, so you could say that development resources are scarce. So it's pretty impressive to see that you can onboard new developers, and they catch on pretty quickly and become productive very fast. That has enabled both the scaling aspect and meeting all the really important deadlines. Having those met was, well, kind of a mind-blowing thing for me, because if you're working with hard deadlines, it's usually quite hard to meet them, because it all goes back to it being hard to have a common picture of what is actually going to be developed. That's, I guess, a totally different discussion. But in terms of what Polestar is doing, I think that's the future of system development, because it's very efficient. It's very lean going the serverless way, and it goes back to a bigger question about validating hypotheses, etc. I believe if you're a bigger company today and want to do a major development initiative or project, I don't see any other way. Of course, serverless is not the optimal solution for everything. But if it fits your use case, I don't see a really strong argument not to go that way, at least before you know if it's a valid idea and you need to prove the hypothesis and all that. I think Polestar is quite satisfied with where we are today and what has been done. And there's a lot of work left to do, of course.

Yan Cui: 20:15

Yeah, I totally agree. I mean, the amount of work that you can save yourself by going serverless is just a no-brainer for most companies, unless you've got really specific needs, or you've got challenges that right now just don't fit very well with the, I guess, event-driven model that serverless applications have. But otherwise, I do think for most applications out there, it should be the default way of doing things. Unless you've got some other specific requirements, in which case you may still go back to containers for just those workloads. And I'm glad you said Gothenburg. It has been on my list of places to visit for a while now. It's known for seafood. And I'm a big seafood fan.

Anders Quist: 20:58

Yeah, let me know. And I'll give you some tips. If you ever get here.

Yan Cui: 21:03

Oh, yeah, yeah, as soon as the lockdown is finished, it is one of the places I'm going to go and visit. Okay, let's go back to the challenges of working with AWS China. There are numerous challenges. I've worked with some customers who, like Polestar, have a presence in China and therefore have to use AWS China, and you find problems like some services just not being available. Or maybe even more basic problems, like you can't sign up to AWS China unless you have a legal entity in China or you're partnering with a local company, with all the business challenges that you have to solve. And then there's also the fact that AWS China is not really operated by AWS; it is operated by AWS partners. And they offer, I guess, an API-compatible implementation, but that's not really the same as the rest of the AWS infrastructure. So sometimes you get weird error behaviours, or some of the scaling behaviour or error handling behaviour is just slightly different. Is that similar to the experience that you've had with AWS China? And how do you guys go about consolidating this very different AWS experience you have inside and outside of China?

Anders Quist: 22:24

Well, yeah. I can only agree with all that you have said. But it has gotten better, I think. Because when it all started out, there were a lot of services missing. For instance, Certificate Manager wasn't there, so generating SSL certificates had to be done some other way. Now it's there. I haven't tried it out, but it seems to work mostly like it does in AWS in the rest of the world. So there have been some bumps, and today I think it works pretty well. But historically, we have had some issues. And, well, it's more like acknowledging small pains and big stuff, like not having single sign-on. And having lots of accounts is not perfect, in my opinion. It just makes it more of a hassle for me and everyone that needs to work with China. And it also adds up, even if it's quite small things like Lambda@Edge. You need to solve these things some other way. And that builds up a kind of inherent complexity that you can either contain in China or have as a holistic solution for everything. What we are having issues with today is, well, it's not big stuff, or, well, in some aspects it's big, but it's things like deploying to China. Yes, they have CodeBuild, but that's not the CI/CD pipeline that we have chosen. So we are deploying from the outside, and that's not always optimal. So these things can prove quite a task at times, of course. And also the whole user experience thing. It's… We have teams that are sitting inside of China and developing for the Chinese market as well. But the majority of us, in this part of the Devhouse, are sitting outside of China. And that's not the easiest place to be if you are talking about user experience within China. For instance, how do we improve the user experience for someone using 4G in China? That's hard to accomplish. So that brings us to having other layers above the AWS solution. And I believe that's where the added complexity and handling comes in, so to speak.

Yan Cui: 26:11

Okay, so that brings up a really interesting business-related question. Being Chinese, I do kind of follow what's happening in China in terms of some of the technology trends, and it is very much a mobile-first society. A lot of people just don't have laptops; they just have their mobile phones. And, you know, you've got platforms like WeChat, which is more like a mini OS on its own. It's more than just a chat application. You can do everything on that. So I guess, is this what you were talking about in terms of a different layer that you have to build, so that you can, I guess, get integration with platforms like WeChat, and focus on mobile-first users, as opposed to someone who's most likely going to be buying and configuring cars on the desktop?

Anders Quist: 27:03

Yeah. Yeah, of course, it's about being where the users are, but also, finding a CDN that actually operates and is well established within China is very crucial, I believe, to accomplish a mobile-first user experience that is worthwhile. So we are currently looking at the alternatives. There aren't many CDNs that operate both globally and inside China. And also, GraphQL queries are hard to cache and handle in an efficient manner. You can cache if they are GET queries, but then there are other constraints pertaining to that, of course. So it's high up on the agenda to solve these, or to continuously work with these, because it's such an important task. You mentioned it earlier: for us to be able to be in China, we need a legal entity, and Polestar is actually a Chinese-owned company. So that's not an issue for us to solve. But on the other hand, it's very important for us how we perform there, how the application fits the real user.
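
On the point that GraphQL can be cached when queries are sent as GET requests: the usual convention is to put the query text and the JSON-encoded variables into the URL's query string, so that a CDN can use the URL as its cache key. A minimal sketch follows; the endpoint is a placeholder, and sorting the variable keys is an assumption of this sketch, made so that equivalent requests map to the same URL.

```typescript
// Build a GET URL for a GraphQL query so that a CDN can cache the
// response by URL. Intended for queries only, not mutations.
function graphqlGetUrl(
  endpoint: string,
  query: string,
  variables: Record<string, unknown> = {},
): string {
  // Sort variable keys so equivalent requests yield the same cache key.
  const sorted: Record<string, unknown> = {};
  for (const key of Object.keys(variables).sort()) {
    sorted[key] = variables[key];
  }
  const params = [
    `query=${encodeURIComponent(query)}`,
    `variables=${encodeURIComponent(JSON.stringify(sorted))}`,
  ];
  return `${endpoint}?${params.join("&")}`;
}
```

Some of the constraints Anders alludes to plausibly include URL length limits for large queries and the need to keep mutations on POST, which is why this approach tends to suit read-heavy, public data best.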

Yan Cui: 28:59

Okay, that's actually another interesting point then, because AppSync itself offers caching. One of the, I guess, advantages of using AppSync is that it gives you very flexible caching options. So are you guys going to potentially implement something similar in terms of resolver-level caching, and maybe integrate with something like Memcached or ElastiCache as part of your caching strategy for your GraphQL APIs?
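
As an illustration of what resolver-level caching could look like outside AppSync, the sketch below wraps a resolver in a time-to-live cache keyed by its arguments. It is a sketch under stated assumptions, not anyone's actual implementation: the in-memory Map stands in for an external store such as Memcached or ElastiCache, and the injectable clock exists only to keep the example testable.

```typescript
type Resolver<A, R> = (args: A) => Promise<R>;

// Wrap a resolver with a per-arguments TTL cache.
// The in-memory Map stands in for an external cache service.
function withCache<A, R>(
  resolver: Resolver<A, R>,
  ttlMs: number,
  now: () => number = Date.now,
): Resolver<A, R> {
  const cache = new Map<string, { expiresAt: number; value: Promise<R> }>();
  return (args: A) => {
    const key = JSON.stringify(args);
    const hit = cache.get(key);
    if (hit !== undefined && hit.expiresAt > now()) {
      return hit.value; // still fresh: reuse the cached promise
    }
    const value = resolver(args);
    cache.set(key, { expiresAt: now() + ttlMs, value });
    value.catch(() => {
      // Do not keep failed lookups in the cache.
      const current = cache.get(key);
      if (current !== undefined && current.value === value) {
        cache.delete(key);
      }
    });
    return value;
  };
}
```

Caching the promise rather than the resolved value means concurrent requests for the same arguments share one downstream call. In a Lambda, a Map like this only lives as long as the execution environment stays warm, which is one reason an external cache is the more likely choice in practice.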

Anders Quist: 29:29

Yeah, well, we have talked a bit about that. And I think the AppSync way of handling caches is a very nice way to have it, rather than trying to solve it on our own. It's something we will have to work out when we do the proof of concept for AppSync. It's on the roadmap. And since it has also been one of the services quite recently added inside China, that makes it a potential candidate to try out. But I'm not that knowledgeable about how AppSync as provided in China might differ from what is provided in the rest of the world, so maybe you are,

Yan Cui: 30:40

No, I… No, I'm not either. I haven't used AppSync in China. I think the service I did use quite a bit before was Kinesis in China. And we had to look at Lambda@Edge as well in China, which wasn't available at the time. I don't know if it is now. But yeah, I think quite a few of the companies I worked with before actually ended up using the AWS region in Hong Kong, which is still close enough but falls outside of the AWS China bubble. So it still gets you close enough in terms of geographical location, and therefore latency and performance and all of that, but it gives you the same consistent developer experience when it comes to using AWS. But yeah, the few times I had to dip my toe into AWS China, it has always been like what you're experiencing now: okay, how does this service actually differ from what I know from using AWS outside of China?

Anders Quist: 31:44

Yeah, yeah, exactly. Well, it will be something that we have continuously with us, so.

Yan Cui: 31:55

Okay, so one last thing I also wanted to ask about is that you talked about how you're using SNS and SQS, and you've got quite a lot of those, as a means of doing asynchronous communication between different value streams. Have you looked at EventBridge? I'm seeing a lot of people moving from using lots of SNS topics and SQS queues to having a centralised event bus in EventBridge, because of the fact that you can do content-based filtering. And since the last re:Invent they have added better support for having a centralised event bus in its own account, so you've got better support for this kind of cross-account communication through EventBridge. Is that something that you guys are thinking about in the back of your heads as well?

Anders Quist: 32:43

Absolutely. That's also high up on the backlog to look at next. We have had discussions on it. It's obviously the best alternative for us. And also, it exists now in China as well, so it's a viable option. I think it's something that also needs to be done pretty soon, because the more we add to the current setup, with more or less point-to-point communication between the various value streams, the faster it can grow, and then this kind of migration will just demand more work to perform. So yeah, it's absolutely on the table, but we haven't really done anything yet. And it looks like the right tool for what we need, absolutely.
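
To illustrate the content-based filtering Yan mentions: an EventBridge rule carries an event pattern, and in its simplest form each field in the pattern lists the values it accepts. The sketch below shows such a pattern as a plain object together with a toy matcher that mimics only this exact-value case. EventBridge itself evaluates patterns and supports more operators than shown here; the matcher, the event source and the detail fields are hypothetical and exist only to show the semantics.

```typescript
// The parts of an EventBridge event used in this sketch.
interface BusEvent {
  source: string;
  "detail-type": string;
  detail: Record<string, string>;
}

// A simplified event pattern: each listed field must equal one of the
// given values. Fields that are not listed are not checked.
interface Pattern {
  source?: string[];
  "detail-type"?: string[];
  detail?: Record<string, string[]>;
}

// Toy matcher for exact-value patterns only, for illustration.
function matches(pattern: Pattern, event: BusEvent): boolean {
  const oneOf = (allowed: string[] | undefined, actual: string | undefined) =>
    allowed === undefined ||
    (actual !== undefined && allowed.indexOf(actual) !== -1);

  if (!oneOf(pattern.source, event.source)) return false;
  if (!oneOf(pattern["detail-type"], event["detail-type"])) return false;
  const detail = pattern.detail || {};
  for (const key of Object.keys(detail)) {
    if (!oneOf(detail[key], event.detail[key])) return false;
  }
  return true;
}

// Hypothetical rule: only orders placed in the Chinese market.
const chinaOrders: Pattern = {
  source: ["orders"],
  "detail-type": ["OrderPlaced"],
  detail: { market: ["CN"] },
};
```

With filtering done on the bus, each consumer subscribes with a pattern instead of needing a dedicated topic and queue per producer, which is the reduction in point-to-point wiring that Anders is after.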

Yan Cui: 34:04

Right. Right. And I guess another downside of having lots of point-to-point communication between microservices is that now you have more chance of, I guess, cascading failures. So when one service fails, anything that needs to make a point-to-point call to that service can potentially be impacted as well. So I guess that's another benefit of having this sort of more asynchronous communication pattern.

Anders Quist: 34:30

Absolutely. Yeah. Well, I think it's a clear case for us, but as a platform team we are kind of in the midst of everything. So we rarely have the time; we need to make the time to be able to do this kind of work, because it's definitely a cross-cutting concern, so it doesn't really belong in a particular value stream. That's a bit of a challenge when it comes to these kinds of architectural changes in the setup that we have. But we're working on our roadmap, and it's on that roadmap. And hopefully this year, we will start working on it.

Yan Cui: 35:25

Okay, I guess the best of luck in that case. And hopefully you guys get all the changes that you want to get in there in terms of AppSync, in terms of EventBridge. And I want to thank you for taking the time to talk to me today, and sharing your experience with the audience we have here. Before we go though, is there anything else that you'd like to share with us? Maybe, is Polestar hiring in Gothenburg?

Anders Quist: 35:54

Sure, they are continuously growing. And I think, and this is my personal experience, there's no other place working with this technology that I would rather be right now. It's because I believe this way of doing things is definitely the... If people don't think it's the future of development, they should, I think. And it's really fun to work with these things. I've learned a lot of things in the past year that I've been with Polestar. And that's, for me, very important, and really, really fun. I have a lot of fun working with this. Oh, yeah. And thanks for having me, Yan. It's been really, really interesting.

Yan Cui: 36:54

Cool. And if people want to ask you questions, potentially about what you guys are doing at Polestar, and maybe about opportunities at Polestar, how can they get in touch with you on the internet?

Anders Quist: 37:08

Oh, I can provide my contact info to you, and maybe you can add it to this podcast. Then I can get them in touch with the right people if they're interested in what we're doing.

Yan Cui: 37:23

Okay, sure, I will make sure those are in the show notes. So anyone who's looking to explore opportunities for working with serverless technologies in a really interesting and, I guess, cutting-edge business area, then yeah, check out Polestar. So once again, thank you so much, Anders. And I hope to catch up with you soon and maybe even see you in person when I get to visit Gothenburg.

Anders Quist: 37:48

Thank you. Yeah.

Yan Cui: 37:50

Okay, take care.

Anders Quist: 37:51

The same. Bye bye.

Yan Cui: 38:05

So that's it for another episode of Real World Serverless. To access the show notes, please go to realworldserverless.com. If you want to learn how to build production ready serverless applications, please check out my upcoming courses at productionreadyserverless.com. And I'll see you guys next time.