Real World Serverless with theburningmonk

#11: Serverless at M&G Investments with Daniel Bass

May 13, 2020 Yan Cui Season 1 Episode 11
Chapters
00:01:45
Selling serverless to the enterprise
00:03:04
Selling serverless to security teams
00:06:07
On M&G's journey into the cloud with Azure
00:08:14
What are Durable Functions?
00:10:46
How are you dealing with cold starts?
00:14:05
What has been the benefit of going serverless?
00:16:18
Are you using Azure DevOps?
00:17:49
How are you testing your functions?
00:20:32
On the lack of an IAM-equivalent in Azure
00:24:44
On the tooling support for Azure

To see the open positions at M&G, please go to https://global.mandg.com/careers.

You can find Daniel on Twitter as @danbassdev and read his blog at https://www.danielbass.dev. He has also published two books on building serverless architectures with Azure:

For more stories about real-world use of serverless technologies, please follow us on Twitter as @RealWorldSls and subscribe to this podcast.

Opening theme song:
Cheery Monday by Kevin MacLeod
Link: https://incompetech.filmmusic.io/song/3495-cheery-monday/
License: http://creativecommons.org/licenses/by/4.0/

spk_1:   0:00
In this episode of Real World Serverless, I spoke with Daniel Bass, who works as a senior developer at M&G Investments, a large financial institution with nearly 10,000 employees. We spoke about their journey to serverless with Azure and Azure Functions and what they have learned over the last three years, including security and some of the pain points that exist with serverless development today. Hi, welcome back to another episode of Real World Serverless, a podcast where I speak with real-world practitioners and get their stories from the trenches. Today I'm joined by Daniel Bass from M&G. Hi, Daniel, welcome to the show.

spk_0:   0:52
Thank you very much. Good to be here.

spk_1:   0:54
So let's start by talking about your background and what M&G does, and how you guys came to be using serverless.

spk_0:   1:02
Sure. So M&G is a savings and investments company. Essentially, what the company does is help people save or invest money to achieve financial goals. That could be retiring, or sending a child to university. To do that, as you'd expect, we have very high security requirements around our technology, because there's a lot of money moving around. I'm a developer within the private credit section of the business. What they do is find small companies who want to borrow money and investors who want to get a good return. Essentially, we build products that fit the investors.

spk_1:   1:45
OK, that's quite interesting, because as a financial company, as a financial enterprise, you guys are not known for risk-taking, especially given the security requirements. Did you have a hard time selling serverless to higher management so that they would let you use serverless technologies running fully in the cloud?

spk_0:   2:05
Yeah, so we started our journey about three years ago, and there was definitely an education piece when we first started. But one of the great things, particularly around security for serverless, is that the story is actually pretty good, particularly compared to rolling your own solution, because the runtime is managed and basically all of the risk and attack surface sits with the cloud provider. That was quite a potent message for us to sell: rather than us doing it, we'll let the people who have a massive cloud security team, who really know their stuff, do it instead. There was certainly some pushback in terms of it being a new way of architecting things as well. I think that's where combining serverless components with PaaS components has really helped us, because PaaS components operate more like your conventional applications. That allows us to take the best of both, whatever suits us best.

spk_1:   3:04
So with a lot of the security teams that I've interacted with in the past, one of the common stumbling blocks when it comes to thinking about serverless and deciding whether or not serverless is a good fit for the company, from a security point of view, has been that they just don't know how to reason about the security implications. And also, I guess, there's that uncertainty around how we manage security going forward in this new world where we have no control over the infrastructure. Do you remember some of the conversations that you had to have with your security team in order to convey some of those messages and make them feel confident about security going forward?

spk_0:   3:44
Yeah, so I think our security team is very open to new ideas and things like that, which is good. Some of the issues have definitely been around networking. I think that's a particularly common one and a particular stumbling block with serverless. I think AWS and Microsoft have only just recently really sorted out using serverless within private networks. So the conversation I had was basically around rephrasing the question. Usually when engaging with security, what they want, and it's an interest I share as well, is to stop people hacking into whatever we're building and publishing it on Twitter or whatever. So they want to minimise that attack surface, and one of the ways they're used to doing that is by locking down networking. I always try to flip the conversation and say: rather than relying on networking, why don't we make every single component of the network hardened? There was a Google paper on it a while ago, on zero trust networking, about how they ditched the traditional VPN and instead made every component within the network an auth-controlled component that could withstand attack on its own. I found that by rephrasing the approach and saying, right, I want this app, on its own, to be able to authenticate its users, and I want it to have its own services in terms of DDoS protection, etcetera, that was definitely a good way to show both that there's a different way of doing things and that I'm not just a developer trying to get away with it and get into production as quickly as possible. We've actually thought about the security issues with what we're doing.
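
Illustration (not from the episode): the zero trust idea described above, where each component checks every request itself instead of trusting its network location, might be sketched roughly as follows. This is a toy in plain Python; the shared secret, the `sign` helper and the caller names are all hypothetical stand-ins for a real identity provider and token format.

```python
import hashlib
import hmac

# Hypothetical shared secret; a real system would use an identity provider instead.
SECRET = b"demo-secret"


def sign(caller: str) -> str:
    """Issue a signature for a caller name (toy stand-in for a real token)."""
    return hmac.new(SECRET, caller.encode(), hashlib.sha256).hexdigest()


def handle_request(caller: str, signature: str, allowed: set) -> str:
    """A component that authenticates and authorises every request on its own."""
    if not hmac.compare_digest(sign(caller), signature):
        return "401 unauthenticated"  # identity could not be proven
    if caller not in allowed:
        return "403 forbidden"  # identity proven, but not permitted here
    return "200 ok"


allowed_callers = {"reporting-app"}
print(handle_request("reporting-app", sign("reporting-app"), allowed_callers))
print(handle_request("reporting-app", "forged", allowed_callers))
print(handle_request("other-app", sign("other-app"), allowed_callers))
```

The point of the sketch is only that the check lives inside the component, so a caller that is already "inside the network" gains nothing without a valid identity.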

spk_1:   5:25
Yeah, I'm so happy that you mentioned zero trust networking. It's one of those things that just makes so much more sense when you think about it, but it's such a challenge to communicate that to many of the traditional security teams, who are just so used to doing things in a particular way. I'm really glad that you guys had much better luck with your security team and were able to convince them that this is the way forward, and that zero trust is much better, much safer, compared to just hiding behind your VPN or your VPCs, where you can still get compromised from within if the application is compromised through your dependencies and whatnot. Just because you're hiding behind a network firewall doesn't mean that you're actually totally protected from external actors. So you guys are running on Azure and using Azure Functions heavily in your stack. You said you started your journey three years ago. Was that when you first went into Azure, or were you guys already running other stuff on Azure?

spk_0:   6:22
So that was the start of our journey into Azure. In fact, we started off with what's now a Google Cloud product called Apigee. We started an internal REST API push, trying to make sure all of our systems are available as internal APIs, which has really paid dividends. That was my first taste of cloud. Then we moved on to Microsoft Azure and started to basically work out, learning as we went along, what the most appropriate architectures are, all of that sort of stuff.

spk_1:   6:57
So now your serverless stack involves a lot of Azure Functions. What are some of the workloads that you're running on that? It sounds like you guys are doing a lot of API stuff. What about any big data processing, any machine learning stuff?

spk_0:   7:11
Yeah, so there's a lot of REST API things. There's also some stuff where, well, it's always difficult to know the definition of big data these days. But being a financial company, some of the elements we run in Azure Functions would count as big data elsewhere, though we probably don't think of them as such. So it's a lot of Azure Functions which basically have some parallelised workload put onto them. That might not be megabytes and megabytes; it might just be very complex calculations. In terms of other types of workloads, like machine learning or big data processing proper, we tend to use Azure Databricks; that's our preferred platform there. Although I have been doing some experiments with Python Azure Functions for the smaller workloads where Spark is a bit too heavyweight but normal Azure Functions aren't quite good enough. There's Azure Durable Functions, which sort of sit in the middle, and I've been playing with that. But yeah, it's mostly REST APIs, timers, queues, that sort of stuff.

spk_1:   8:14
And for the listeners who are not familiar with Azure and the difference between normal Azure Functions and Azure Durable Functions, can you maybe explain to us what Durable Functions are, why they are better, and how you are using them in this case?

spk_0:   8:28
Yes, Durable Functions I find really interesting. They're basically an approach to stateful serverless, which I believe is similar in concept to AWS Step Functions. The main thing that Durable Functions introduce that's fundamentally different is the orchestrator function, which is basically a single stateful Azure Function that can call out to normal Azure Functions to do work. Where that's really helpful is this: in a normal serverless architecture, there's a lot of talk around doing event-based processing, which is great, but inherently, as part of that, what most people start to do is use choreography rather than orchestration as the solution to the events model. In choreography, you effectively have a function here that puts a message onto a queue, then another function picks it up, and so on. Basically, each individual component, as a result of its local action, gives rise to a global workflow. Where that can get tricky is if there's an error or something and you want to roll back; that can get quite hard. And if there are bugs, it can get quite hard to understand what state the workflow is in. So what Durable Functions do is introduce an orchestrator function. Under the hood, it uses all the same queues and stuff that you would have set up yourself, but instead it's a single function in code, and you can debug it using normal development tools: VS Code, Visual Studio, whatever. But you still get the scale and the benefits of being distributed.
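
Illustration (not from the episode): the difference between choreography and orchestration described above can be sketched in plain Python. This is a minimal toy, assuming each step comes with an undo action; it does not use the Durable Functions SDK, and the step names are invented. It only shows why a single orchestrating function makes rollback and workflow state easier to follow.

```python
from typing import Callable, List, Tuple

# (name, action, undo): every step knows how to reverse itself.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def orchestrate(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order from one place; on failure, undo completed steps in reverse."""
    done: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
            log.append(f"done:{name}")
            done.append(step)
        except Exception:
            log.append(f"failed:{name}")
            for prev_name, _, undo in reversed(done):
                undo()
                log.append(f"undone:{prev_name}")
            return False
    return True


def fail() -> None:
    raise RuntimeError("downstream service unavailable")


def noop() -> None:
    pass


log: List[str] = []
ok = orchestrate(
    [("reserve", noop, noop), ("charge", noop, noop), ("notify", fail, noop)],
    log,
)
print(ok, log)
```

In a choreographed version, the same rollback logic would be spread across every function that reacts to a failure message, which is the difficulty described in the answer above.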

spk_1:   10:03
Right. Now, are you guys, say, a traditional .NET house? Do you write most of your functions in Node, or do you write them in C#?

spk_0:   10:11
Yeah, so we have a strong .NET background, but also a Java background, so there are two strong language backgrounds at M&G. I'd say most of our functions are written in C#. My team, personally, at the moment is using TypeScript, mostly Node.js, and that's because we're doing a lot of work with Slack, which has a very well supported Node.js SDK and framework. So the local teams' approach is, you know, choose the best language for the job. But as a broad rule, I'd say it's C# and Java.

spk_1:   10:46
OK, so on AWS, one of the common complaints about using languages like C# and Java is that your cold start is just going to be so much worse. And this, of course, makes it difficult for you to use those languages on APIs, where you're more concerned about user-facing latency. From the numbers I've seen, Azure has even less predictable cold start latency compared to AWS Lambda. So how are you guys dealing with that cold start issue when it comes to functions that are written in C# and Java?

spk_0:   11:19
Yes. So there's been a lot of good work from Azure Functions in terms of improving that. It has definitely got a lot more reliable. I think the benchmarks could almost do with rerunning. I don't know if they've got internal benchmarks or whatever, but some of the ones I see around don't reflect what I see in the wild, from our production logs, for example. But at the same time, inevitably, you're loading a virtual machine, you know, the CLR, so it's always going to be a bit slower than an interpreted language like Python or JavaScript. So we have a range of approaches. Sometimes it doesn't matter that much. If it's user-facing, then obviously it does matter more, but quite a lot of our APIs aren't, so we just leave them. The other approach comes from the fact that Azure Functions are deployed slightly differently from AWS Lambda: they share a function app underneath, so you deploy multiple functions to the same scaling unit. That's very useful, because if you've got a whole bunch of Azure Functions you want to deploy, and you've got a queue that you know is going to be going off all the time, then if you just attach your HTTP trigger, your REST API function, to the function app that the queue function is in, more often than not it's going to be warm. So that helps too. But mostly what we do is make use of premium functions, which is a little bit like reserved concurrency in AWS Lambda, as far as I'm aware. So basically you always have an instance running. Or we deploy to Azure App Service, which is the PaaS instance of a similar sort of thing. So, yeah, it's a range of approaches we'll use, and it depends on the teams and how comfortable they are with serverless as well, because we're quite a big organisation, so the maturity varies from team to team.
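
Illustration (not from the episode): the "share a function app with a busy queue function" trick described above can be pictured with a small toy model. This is a sketch under loud assumptions: the `FunctionApp` class, the single idle timeout and its value of 300 seconds are all invented for the example and are not documented platform behaviour.

```python
class FunctionApp:
    """Toy model: functions deployed together share one scaling unit and its warmth."""

    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout  # assumed seconds of idleness before unloading
        self.last_invoked = None

    def invoke(self, function_name: str, now: float) -> str:
        """Report whether this invocation would hit a cold or warm unit."""
        cold = self.last_invoked is None or now - self.last_invoked > self.idle_timeout
        self.last_invoked = now
        return f"{function_name}:{'cold' if cold else 'warm'}"


# A queue-triggered function fires every minute, keeping the shared unit warm.
shared = FunctionApp(idle_timeout=300)
results = [shared.invoke("queue_worker", now=t) for t in range(0, 900, 60)]
results.append(shared.invoke("http_api", now=905))

# The same HTTP function deployed on its own has nothing keeping it warm.
alone = FunctionApp(idle_timeout=300).invoke("http_api", now=905)
print(results[0], results[-1], alone)
```

Only the first queue invocation is cold in this model; the rarely called HTTP function benefits from the queue traffic when it shares the unit and starts cold when it does not.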

spk_1:   13:07
Right, right, that's cool. So with premium functions, that's the thing that Azure launched last year, if I remember correctly, where you pay for having a certain number of containers always running, so you switch your payment model so that you're also paying for uptime as well as the invocations. Funny that the wording you mentioned there was reserved concurrency. Lambda has a feature called reserved concurrency, which is something different. The similar feature to premium functions is called provisioned concurrency. A lot of people confuse them: they think reserved concurrency is the number of concurrent containers that are always reserved and always available. But that wasn't how the thing worked at all; because the name was bad, people had that confusion for a long time. And then at the last re:Invent, AWS announced provisioned concurrency, which is equivalent to saying I want to keep a certain number of containers always warm so that I don't see any cold starts. Yeah, naming is hard. So you guys are three years into your journey into serverless. What are you seeing as some of the benefits that serverless is giving you as a company? Do you find that the teams that are doing serverless are just moving a lot faster? Are they seeing fewer outages in production?

spk_0:   14:21
Yeah, I'd definitely say the teams that are making heavy use of serverless tend to be... I mean, it's always a difficult thing to evaluate with any software team, right? But I'd say they tend to be moving quicker and tend to have happier customers. Being a big company, we have a mix of customers, being actual customers, you know, people giving us money, versus our internal customers. Yeah, I'd say there's definitely a benefit there. There's also the benefit in terms of lack of outages, etcetera. I'd say serverless systems are definitely a lot more resilient. Part of this comes from the fact that at the same time we moved over to infrastructure as code and fully automated deployment pipelines, etcetera. But in general, we're requiring a lot less maintenance in production. You basically deploy your functions and, as long as it's not switched off for some reason, it works. We also use PaaS services, lots of Azure App Service, and that has a similar effect. It definitely has a few more issues than serverless, but that's also contributed a lot to increasing velocity and reducing outages.

spk_1:   15:27
So you mentioned infrastructure as code there. On AWS, we'd be using CloudFormation or some other third-party tool, such as the Serverless Framework. What do you guys use at M&G?

spk_0:   15:41
So the recommended tool, and the one we use, is something called ARM templates. That would be roughly analogous to CloudFormation, in that it's built in: it's a Microsoft product with first-class support from Microsoft. They're built in JSON. I'm not a great lover of them, to be honest, but yeah, it's your classic sort of stateful "declare what resources you want, pass in some variables", and Azure goes off and creates those resources in the state you want them.
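
Illustration (not from the episode): to give a feel for the "declare what resources you want, pass in some variables" shape of an ARM template, here is a minimal sketch that builds one as a Python dictionary and prints it as JSON. The parameter name `storageName` is hypothetical, and the `apiVersion`, SKU and kind values are illustrative and should be checked against the current ARM template reference before use.

```python
import json

# A minimal template declaring one storage account, parameterised by its name.
template = {
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    # Values passed in at deployment time.
    "parameters": {"storageName": {"type": "string"}},
    # The desired state: Azure works out how to get there.
    "resources": [
        {
            "type": "Microsoft.Storage/storageAccounts",
            "apiVersion": "2019-06-01",
            "name": "[parameters('storageName')]",
            "location": "[resourceGroup().location]",
            "sku": {"name": "Standard_LRS"},
            "kind": "StorageV2",
        }
    ],
}

print(json.dumps(template, indent=2))
```

Even for a single resource the JSON is fairly verbose, which gives some sense of why templates for whole environments can become hard to work with.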

spk_1:   16:18
And for CI/CD, those sorts of things, are you using Azure DevOps, which I've heard quite a lot of things about?

spk_0:   16:25
Yes, we make heavy use of Azure DevOps. We used to use Jenkins, and when we moved to the cloud we switched over to using Azure DevOps for pretty much everything except issue tracking and stories; Jira does that. And yeah, Azure DevOps really is very good. The particular benefit I find with it, particularly with junior developers on my team, is that previously people would fear, you know, the PowerShell scripts or whatever scripts that ran on the build server. It definitely took a long time to get people comfortable with actually editing a release. It was treated as something that you learned when you got more senior, whereas with Azure DevOps the barrier to entry is just so much lower. You can get people productive with it very, very quickly. Yeah, it really is very good. The only gripe I do have with it is that, because you're moving away from having a script checked into your repo, although we still do that to some degree, some of your release pipeline lives in Azure DevOps only, which can make creating local integration tests, etcetera, a little bit more challenging. But yeah, in general, it really is an excellent tool. I'd say it's a major part of why M&G's automated CI/CD pipeline stuff has progressed as far as it has.

spk_1:   17:49
So you mentioned testing there. How do you guys go about testing your application? With Lambda, the approach I'd normally take is to invoke my function locally and talk to the real AWS services to make sure that the integrations with, say, DynamoDB and other services are working correctly. So I get some confidence before I deploy, and then I test the whole thing end to end by hitting the API endpoints. I'm not sure what sort of things you can do with Azure Functions. Are there tools that allow you to simulate things more easily locally? Like you said, it's harder to run integration tests for Azure Functions. So what's your approach there?

spk_0:   18:25
Yeah, so Azure Functions run locally, in fact with the exact same runtime that they run on in production, which is pretty helpful. So when we do integration tests, it will be a mix. Some of them will run locally, and that's pretty helpful, connecting up to real Azure services. I'm not generally a fan of mocking cloud services, even when they do silly things, so we'll tend to connect up to remote services. But then what we'll also do is have a step in our release pipeline which is a full environment that we spin up in the cloud. Then we'll just run a whole bunch of tests on that, and they tend to be end-to-end tests, you know, controlled by Selenium or something similar. We find that's definitely a much more reliable way of doing what you want to do, which is preventing user-facing bugs.

spk_1:   19:18
So when you say your CI/CD pipeline spins up a new environment, does that mean that every time you run your CI deployment pipeline, you run your integration tests, probably locally, and then you bring up a new environment to run the end-to-end tests, and then afterwards you tear down the environment?

spk_0:   19:36
Yeah, so what we'll do sort of depends on how long those tests take. Obviously, for some of them, particularly end-to-end tests, spinning up the entire environment and tearing it down can take quite a while. So for integration tests which are just depending on cloud resources, you know, maybe a storage account, which is roughly analogous to AWS S3 but with a few other extras on it, we might keep one of those deployed all the time. Then those integration tests run on every single build, because they can run very quickly. But for things like end-to-end tests, yeah, there'll be a step in the pipeline which deploys the entire environment, spins it up, runs the tests and then spins it down again afterwards. That obviously depends on the scale of the project. Big projects will do that far more; for small projects, sometimes it's not necessarily worth the outlay and effort up front. It's case by case, I guess.
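
Illustration (not from the episode): the two-tier approach described above, with fast integration tests against a long-lived shared resource on every build and a temporary environment for end-to-end tests, might be sketched like this. It is a toy in plain Python; the environment and resource names are hypothetical and the "deploy" and "teardown" steps only record events rather than calling any real pipeline or cloud API.

```python
from contextlib import contextmanager
from typing import Iterator, List

events: List[str] = []


@contextmanager
def ephemeral_environment(name: str) -> Iterator[str]:
    """Deploy a full environment, and always tear it down, even if tests fail."""
    events.append(f"deploy:{name}")
    try:
        yield name
    finally:
        events.append(f"teardown:{name}")


def run_pipeline(run_e2e: bool) -> None:
    # Fast integration tests run on every build against a long-lived resource.
    events.append("integration-tests:shared-storage")
    # Slower end-to-end tests get their own short-lived environment.
    if run_e2e:
        with ephemeral_environment("e2e-env") as env:
            events.append(f"e2e-tests:{env}")


run_pipeline(run_e2e=True)
print(events)
```

The `run_e2e` flag reflects the case-by-case decision mentioned above: a small project could skip the costly stage and keep only the fast one.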

spk_1:   20:32
so. One of the things that found that when I saw experimented with your function before was that this was one of the pain point that I came across pretty quickly was that there's nothing similar to thes I Am permissions model that you have for land, where you can specify what every function can ask us through the I am permission in model and we found your functions. If I remember correctly, you have to do a lot of work yourself from the function to authenticate against, I think was your 80. And then from there you can control what that particular function can ask us. Is that still the case?

spk_0:   21:05
Yeah, I'd definitely say it's a place that Microsoft are working on. It is definitely the harder part of using it, and it's not just Azure Functions; I'd say it's across the platform. But they have introduced managed service identities, which has made that significantly easier. That's where, basically, whenever you deploy a function, it auto-creates an Azure Active Directory app, and that lives with the life cycle of your function or your App Service or whatever it is that you're using. Then you can grant that permissions, in a similar way to how you would on AWS, to, for example, key vaults. That's quite a common thing that we'll do. We'll stick keys to get into services, or whatever, in these key vaults and then give permission into those key vaults using the managed service identity. We've also got an internal piece of software which basically helps us manage the life cycle of Active Directory service principals, which are app identities, basically, and it will also rotate keys for us on a regular basis. So if we want to access some resource, we'll set up a job with the service, which will rotate that resource's keys through the key vault. Then we just give the managed service identity access to the key vault, and that works quite nicely. I'm looking forward to those managed service identities, over time, becoming available on absolutely everything and being fully first class. But yeah, it's definitely a more difficult point at the moment.
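
Illustration (not from the episode): the pattern described above, where a rotation job updates keys in a vault and an app's identity is simply granted read access, can be modelled with a small toy. This is plain Python, not the Azure SDK; the `ToyVault` class, the identity name and the secret name are all hypothetical.

```python
import secrets
from typing import Dict, Set


class ToyVault:
    """Toy key vault: identities are granted read access, keys are rotated in place."""

    def __init__(self) -> None:
        self._secrets: Dict[str, str] = {}
        self._readers: Set[str] = set()

    def grant_read(self, identity: str) -> None:
        self._readers.add(identity)

    def rotate(self, name: str) -> None:
        """Replace the stored key with a fresh random value (the rotation job)."""
        self._secrets[name] = secrets.token_hex(16)

    def get(self, identity: str, name: str) -> str:
        if identity not in self._readers:
            raise PermissionError(f"{identity} may not read {name}")
        return self._secrets[name]


vault = ToyVault()
vault.rotate("storage-key")
vault.grant_read("function-app-identity")

first = vault.get("function-app-identity", "storage-key")
vault.rotate("storage-key")  # rotation happens without touching the app
second = vault.get("function-app-identity", "storage-key")
print(first != second)
```

Because the app always reads the current key through its identity, rotating the key never requires redeploying or reconfiguring the app itself.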

spk_1:   22:41
OK, that's good to know. At least they're working on something that improves things gradually. What about other platform limits or lack of tooling that you find are making it difficult for you to work with Azure Functions on a day-to-day basis?

spk_0:   22:57
Yes, it's tricky. I'd say there are definitely issues around ARM templates. It's not Azure Functions directly, but it sort of contributes to it, because you basically need to use them. Obviously you don't have to, but it would be very inadvisable not to use infrastructure as code with serverless, and they are very tricky and unfriendly. I think they definitely need some better tooling, which I know work is being done on, but yeah, that's a particular source of pain. In terms of Azure Functions itself, there are probably some issues around version management. We're on version three now, porting over from version two, and that journey has been a bit mixed, with us customers realising that version two was built on a version of .NET that was going to go end of life with only a few months' notice. Now, obviously, Microsoft immediately said, right, we'll support it beyond end of life, it'll be fine. But I think things like that can be a bit challenging. The other side is the Linux support. Azure Functions started on Windows, and Linux is now fully supported, but the experience is still definitely a lot better on Windows than it is on Linux, I'd say. And languages like Python are only available on Linux, so you sort of have to fight that if you want to use those languages. So yeah, things like that, and feature parity as well. There are some issues around features: all features will be available on C# on Windows, but it can sometimes be hard to tell which Azure Functions features are available on, for example, Python. Those would be the main things. Generally, the tooling itself for developing is very good.

spk_1:   24:44
OK. And one of the things I found when I played around with Azure Functions before, or with Azure in general, is that a lot of the tooling is tied to Visual Studio. If you're using Visual Studio, everything looks great, but if you're not, there's a clear absence of support for CLI tools and other things. Has that been improved now, or is it still very much the case that you have to do a lot of things through Visual Studio?

spk_0:   25:14
So I'd say I actually prefer developing Azure Functions in VS Code. The support has changed drastically. Visual Studio's support is obviously still very good, and given that it's one of Microsoft's premier products, there are benefits there. But I wouldn't say there's a meaningful difference between the two now. The Azure Functions extension for VS Code is absolutely excellent. You can do everything you can do in Visual Studio, and I can do that on Linux or on my MacBook. The whole Azure Functions platform is now fully cross-platform, so it will work on your Mac, on Ubuntu, whatever. So I'd definitely say Azure Functions aren't tied to Visual Studio, which is good.

spk_1:   25:58
All right, that's good to know. And thank you very much, Daniel, for joining us today. Before we go, is there anything else that you'd like to tell us about M&G? Maybe you guys are hiring, maybe you have some interesting projects and stuff coming up.

spk_0:   26:13
Yeah, so we are hiring. If you'd like to search for M&G careers in the search engine of your choice, we're hiring mostly in our London office at the moment. Please do go and apply.

spk_1:   26:26
OK, I will put the job links in the description below. And what about you personally? Is there anything else that you're doing currently, and how do people find you on the internet?

spk_0:   26:37
Yeah, so I've just rebuilt my blog, and I've got a few bits and pieces coming along. I want to do some work around globally deployed Azure Functions, so rather than deploying to a single region, deploying to the entire Azure cloud globally. That's a work in progress at the moment. Hopefully I'll get it sorted and come out with that soon. And I was hoping to speak at some events, but obviously, given recent global events, it doesn't look like that's happening until September.

spk_1:   27:10
Yeah, all the events that I'm supposed to be speaking at in the next couple of months are kind of cancelled as well, although they've gone virtual. So maybe that's where we're going to find you, at some of these virtual conferences. So again, thank you very much, Dan. I'm going to make sure that your blog and your Twitter handle are available in the show notes so that people can find you and go read your brand new blog. Stay safe, and I hope to see you again sometime, either virtually or in person.

spk_0:   27:37
Yeah, Well, thank you very much for having me. Yeah, it's been good.

spk_1:   27:40
Take care, bye-bye. So that's it for another episode of Real World Serverless. I hope you enjoyed this conversation with Daniel Bass from M&G Investments. To access the show notes and transcriptions, please go to realworldserverless.com, and I'll see you guys at the same time next week. Take care.
