Real World Serverless with theburningmonk

#4: Understanding risk and vendor lock-in at Moneyou

March 24, 2020 Yan Cui Season 1 Episode 4

This is part 1 of my conversation with Olaf Conijn and Tjerk Stroband at Moneyou, a Dutch bank based here in Amsterdam, Netherlands. We discussed Moneyou's journey towards serverless over the last 2 years and why serverless makes sense for Moneyou. We talked a lot about risk and many of the misconceptions around risk when it comes to serverless. And of course, vendor lock-in was a big part of that discussion. It's refreshing to see a bank approach serverless and the vendor lock-in debate with such thoughtfulness rather than just following the mainstream narrative without understanding the underlying risks first.

In part 2, we will talk about the challenges Moneyou has faced with AWS Organizations and the difficulty of managing a complex AWS environment using the built-in tools. We will also cover why they built and open-sourced org-formation as a way for you to manage your entire AWS organization using infrastructure as code (IaC).

You can find Olaf on Twitter as @OConijn and check out org-formation.

For more stories about real-world use of serverless technologies, please follow us on Twitter as @RealWorldSls and subscribe to this podcast.

Opening theme song:
Cheery Monday by Kevin MacLeod
Link: https://incompetech.filmmusic.io/song/3495-cheery-monday/
License: http://creativecommons.org/licenses/by/4.0/

spk_0:   0:00
Hi, welcome back to another episode of Real World Serverless, a podcast where I speak with real-world practitioners and get their stories from the trenches. This is part one of my conversation with Olaf and Tjerk from Moneyou, a local bank based here in Amsterdam, Netherlands, and we've got some great talking points around vendor lock-in and how Moneyou, as a financial institution, got on the path of going fully serverless. Hi, guys. Hi, thanks for having us. I've met you guys a few times and I'm really impressed by what you're doing with serverless, and also by the fact that Moneyou, being a financial company, took this really brave decision to go all in on serverless. Maybe let's start by talking about what you guys do at Moneyou and your roles here.

spk_1:   0:59
Okay. Moneyou is a bank. We are a daughter bank of ABN AMRO, which is one of the top three banks in the Netherlands, and we have a number of financial products that we offer our customers in the Netherlands and in Germany. Most people here in the Netherlands would know us from our savings product, which is completely online. We don't have any physical branches. At this moment we are focusing a lot on setting up our payments product, which is an app. You download the app and you get onboarded, so we get to know you: we need to know information about you, you need to scan your passport, you need to take a selfie, and at the end of the process you get a fully functioning bank account and you get a card. You can use that for your payments and your savings. This is what we do. Yan is, of course, here with us, but for people who are not here: we're nothing like what you would think of as a classical banking office. We are situated in Amsterdam East, on the university campus. We have three floors here. We develop this proposition and we run all the operations and everything that comes with it from three floors, with a relatively small number of people working here. I work here as the principal architect, so I get to think and plan and come up with a strategy on how we build this proposition and how we continue to evolve it. And next to me is Tjerk.

spk_2:   2:26
Yeah, I'm here as a cloud architect. Together we're setting up our serverless platform to facilitate the workloads that make up the banking app.

spk_0:   2:37
Okay. So in terms of the work you guys are doing, what are some of the things that you're actually building? Is it APIs, or is there also going to be a lot of background data processing, or even machine learning?

spk_1:   2:50
At the moment, we are not fully serverless. We have a core banking system that doesn't run serverless, at least not at the moment. And we actually come from a background where we used to work with partner organisations that would write the software and host the software for us, and this was typically done on VMs on mission-critical clouds. So this is a private cloud that we use as a corporate data centre. Over the last two, two and a half years perhaps, we have been adopting the strategy to insource more of this software development and also the operational responsibility that comes with that. And we have chosen AWS serverless as the target platform to do so. As you can imagine, we've been providing financial services on our systems for 10 years now, and that means that two and a half years ago we already had a lot of legacy. So we started to incrementally move these workloads to AWS, and specifically to AWS serverless, because we really wanted to stay away from all the overhead that comes with having to manage your own operating systems, for instance, or your VPCs. We started with small but, as it turned out, significant steps. We started with, for instance, the address book in the payment app that we have. It's a very simple function. It's easy to think about, and there's little risk associated with it. So we started by creating a proof-of-concept AWS serverless service in which you are able to manage the contacts that you use when you make payments to someone else. It was the first workload that we put on AWS serverless, and in effect it was a DynamoDB table to store the contacts in, and a couple of Lambdas that allow you to read, update or delete these contacts. And we have an API Gateway with a custom authorizer, where we can use the authentication services that we use for the application to verify the identity of the user.
This was our starting point, close to two years ago, and from there on we started to build new functionality on AWS. Over time our serverless services grew in number, and then also in complexity and criticality. Looking back, we have developed a lot of different systems on AWS. Most of it is still supporting our payment proposition, so it is front-end facing, typically runs behind API Gateway, and typically serves to support the mobile application that is used by our customers.
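
As a rough illustration of the address-book setup described above (an API behind a custom authorizer, with functions that manage contacts), here is a minimal Python sketch. It is not Moneyou's code: the language choice, the `verify_token` helper, the demo token and the in-memory `CONTACTS` dictionary (standing in for the DynamoDB table) are all assumptions made for the example. Only the general shape of the policy document returned by an API Gateway token authorizer follows the AWS convention.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical token table standing in for the existing authentication service.
_KNOWN_TOKENS: Dict[str, str] = {"demo-token": "user-123"}


def verify_token(token: str) -> Optional[str]:
    """Return the user id for a valid token, or None if it is not recognised."""
    return _KNOWN_TOKENS.get(token)


def authorizer_handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Token-based custom authorizer: returns an Allow or Deny policy."""
    user_id = verify_token(event.get("authorizationToken", ""))
    return {
        "principalId": user_id or "anonymous",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": "Allow" if user_id else "Deny",
                    "Resource": event["methodArn"],
                }
            ],
        },
    }


# In-memory stand-in for the contacts table, keyed by (user id, contact id).
CONTACTS: Dict[Tuple[str, str], Dict[str, str]] = {}


def save_contact(user_id: str, contact_id: str, name: str, iban: str) -> Dict[str, str]:
    """Create or update one contact for one user."""
    contact = {"contactId": contact_id, "name": name, "iban": iban}
    CONTACTS[(user_id, contact_id)] = contact
    return contact


def read_contact(user_id: str, contact_id: str) -> Optional[Dict[str, str]]:
    """Return the stored contact, or None if it does not exist."""
    return CONTACTS.get((user_id, contact_id))


def delete_contact(user_id: str, contact_id: str) -> bool:
    """Remove a contact; return True if something was actually removed."""
    return CONTACTS.pop((user_id, contact_id), None) is not None
```

In a real deployment each of the contact functions would sit behind its own Lambda handler and talk to a database rather than a dictionary; the point of the sketch is only how little code such a first, low-risk workload needs.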

spk_0:   5:33
So I think that's quite a classic migration pattern, where you start with something not so business critical and then gradually use the strangler pattern to take more and more of your monolith and put it into the brave new world of serverless. Another question I had, and I guess it's more for you, Tjerk, is how many teams do you guys have working with serverless today, and how do you go about handling the operational overhead? Do the developers own everything end to end, and are they on call for their services? How did you guys go about changing the culture and enabling developers to own more of the application?

spk_2:   6:14
So our strategy was not to lift and shift workloads into AWS, because that's basically just moving stuff along with its old complexity. Instead we focused on developing new features, and we currently have around five teams that are working with serverless. We do want them to own it, right: you build it, you own it. We are doing DevOps. Not everybody is participating in it because our teams are mixed: some people are on our payroll, some are external. So we try to get the talent together that we need to quickly develop and move forward. But yes, basically what they do is develop the features more or less end to end, and we deploy fully automated, using CI and CD processes.

spk_0:   7:02
So you said the teams own it. Can you explain what that actually means, "you own it"? I guess part of that means you own the architecture decisions on what services to use, how to deploy things and when to deploy them, but do they also become responsible for uptime as well?

spk_2:   7:19
Yes. So basically, we do have a design process where Olaf and the architects are able to make sure that we don't take decisions that would take away from our availability or resilience, because we do have a lot of constraints there. Luckily, using AWS services, that is typically very easy. It's a simple process of just making sure you use all the best practices and really think about, let's say, the edge cases. So that design process is done in collaboration with the architects. But yes, the team develops the software, they're in charge of testing, and they can basically decide when to deploy new features to production. We're using monitoring tools, so we know almost instantly when stuff is going wrong. Our experience thus far is actually that it doesn't break, more or less. There are hiccups, obviously, but most of the errors we get are actually connectivity errors to back-end services that are hosted on-prem. That's actually the majority of issues. Lambda, DynamoDB, SQS, SNS and Step Functions have all proved to be extremely reliable and have never caused errors that were not our own fault. So it's either a programming error or some back-end service not being available, and that's on-premises. So providing operational support on DevOps duty is kind of a breeze.

spk_0:   8:56
Okay, and are the people on duty coming from the development teams themselves, or do you have, like, a centralised support team?

spk_2:   9:00
Interestingly enough, we both participate in the DevOps duty, so we have a duty once every week and, well, I haven't been missing any sleep, obviously, which is very good. Or at least not because of that. Olaf, do you have anything to add to that?

spk_1:   9:18
We do give our teams a lot of responsibility, and also autonomy in design decisions. Still, because we do have to conform to all sorts of risk and compliance frameworks, we have invested a lot in making sure that we have all the right guardrails in place. We invested a lot in our multi-account setup. We know exactly what permissions are needed. We have a lot of operational tasks that developers might need to do as part of the DevOps duty, but we have alarms on these, and there is always a retroactive review from within the platform team: we need to be able to go back and understand what somebody did and why. So yes, we do give our software development teams a lot of autonomy and a lot of responsibility, but staying in control is, to us, even more important, and a lot of work has gone into that.

spk_0:   10:06
Okay, let's touch on the multi-account organisation a bit later, because what you guys have built here is quite impressive. For now, you mentioned risk and risk frameworks. Many people would equate being in finance with risk-averse decision making: "I don't want to touch anything new, because it's new, we don't understand it, it's dangerous." How did you guys, being a bank in the Netherlands, come to go on this brave journey into serverless, when so many of your, I guess, competitors are still saying, "We don't want to go into the cloud because it's not safe, it's not secure"? How did that decision come about for you guys?

spk_1:   10:48
Actually, luckily, we are part of ABN AMRO. ABN AMRO is, I think, one of the banks in the Netherlands (and I must say I don't know the other banks that well) that has been quite advanced in adopting the public cloud. And this is, of course, very helpful for us, because being part of ABN AMRO and planning our software on AWS allows us to work together with the departments within ABN AMRO that have already done so, and it also helped us to adopt their thinking and their approach to risk and compliance when it comes to the public cloud. We were looking for an application development platform that fits our company and that fits the assignment that we have from ABN AMRO. For us, keeping an effective and efficient organisation is very important. We are three floors, which is to say we're not that big: maybe 150 people work here. We have huge ambitions given the number of people we have, and there is an efficiency we would like to keep in the organisation, which means we can't build our own data centres, we're not going to start messing around with VMs, and we don't want to get into vulnerability scanning on hosts or manual steps in remediating these risks. So what we've been looking for is a platform where we can effectively outsource most of the complexity and benefit from the services that AWS offers as a service. And that became our thing, right? We would like to outsource the complexity that comes with building or maintaining infrastructure, and where we see our value is in building on top of these services.

spk_2:   12:31
And on the traditional approach to risk: the risks for a bank, and the risk frameworks, are not changing that frequently, right? So given the long history of IT within banks, a lot of the traditional solutions that address the risks, or the controls that the risks require, have become more or less standard solutions, and banks have been building on top of those to develop new features. We have been given the room to actually go back and look at the original risk, and then try to see how we can essentially mitigate that risk without taking the traditional approach of building a castle defence, building seven walls, but really look at the risk and at how we could, in a new way, define controls that address it. And serverless has essentially given us a very good solution, built on top of all the security that AWS offers and their very rigorous internal process of making sure that they're compliant with all the standards.

spk_1:   13:40
So true. It always starts, and people forget this, with understanding the risk, because people tend to want to jump ahead and look at ways to mitigate the risk

spk_0:   13:52
or solutions

spk_1:   13:53
that are used to mitigate a certain risk. But you always need to understand the risk, understand the problem you're solving, and have a framework or a model that allows you to reason about it: either mitigate it or, sometimes, accept it.

spk_2:   14:10
But that requires a lot from your CISO, basically, in terms of being able to actually reason about the risks and actually understand the risks. A lot of CISOs are basically just following the playbook as prescribed and saying, "Well, this is the risk, so this must be the control that addresses the risk." So it takes a special kind of CISO, and we were fortunate to have a good CISO who gave us this room. I think that's key. If you don't have that, then it's going to be a struggle, and you're probably going to get very frustrated trying to explain something to a person who just is not susceptible or open to it.

spk_1:   14:54
To all the people listening who work in security: you're probably doing a good job. But this world is changing, right? If your approach to risk is to want to control the lot of it, you're becoming a dinosaur. This is not going to work, and even less so in the future. You don't need control to mitigate risk. And somehow people get this with SaaS, right? It doesn't seem counterintuitive for a large enterprise to contract a CRM in the cloud as SaaS. But then, when it becomes about building software on top of these IaaS or PaaS services, people somehow get nervous. It's very interesting.

spk_2:   15:37
So our CISO was actually open to this, because they also realised that by using serverless their life would be much easier, right? Less work, less headache. Because, yes, AWS does manage all those risks, and they comply with the SOC 2s and the ISO standards. All that complexity is actually moved into vendor management. That is still important, right, but it is no longer giving you a huge headache as a CISO, because most of the controls deal with operating system security, network security, network penetration testing, monitoring users logging into servers and restricting that while still allowing them to do their job, logging all of that, and then trying to distil intelligence from all that log data. There's so much complexity there. And with serverless, I mean, try to log into a Lambda, right? So we have none of that, and that allows us to stay lean and mean and really focus on delivering business value. And ultimately, for us, being a small but ambitious bank, that is where it counts.

spk_0:   16:55
That's the really key message: you have to understand the problem rather than just copy a solution from everybody else. There's a really good line on that from Simon Wardley, who is known for Wardley mapping, the value chain mapping.

spk_1:   17:09
I love Wardley.

spk_0:   17:10
He's got this really fun example: if you're a general and you want to be successful, you bomb hills, because 80% of other successful generals also bomb hills. It doesn't matter why they do that; you just copy and hope that the causality is there. But that's not the case. You have to understand why they were bombing hills in the first place, and understand what the core problem is, before trying to apply the same solution and hoping to get the same results. You have to understand the context. Context is king. It's really great that you guys mentioned that. I guess in this case, when you guys decided to go on this journey of going serverless, have there been any major challenges, tooling limitations or problems that have made life difficult for you?

spk_1:   17:58
Yes, plenty. It starts with a make-or-break moment, when we had just had our first tiny successes. I was in a programme meeting, and it was about where we would build the next feature. Of course, you have different people at the table. Our CEO was there, people from the programme or the product organisation were there, and I was there. And of course there is this tension, because you want to be able to quickly develop your next feature or next increment, and using a new platform entails risk, right? Luckily, and this is not even my doing, our CEO was really set on allowing ourselves to take the time to build up our own development platform. He banged his fist on the table and said, "This is going in the direction of serverless, period." That's just amazing, right?

spk_2:   18:52
And it makes a lot of sense, because I think right now, especially if you're in the direct-to-consumer market, the successful companies of the future will be software companies. If you're not a software company and you want to bring an innovative proposition to consumers, or even business to business, you are relying on others to build that for you, so either you need a big bag of money or a lot of patience, because you'll be last. So if you want to stay in control of your destiny, become a software company. That's what Moneyou has been doing by insourcing this delivery capability. I think that's a key strategic decision, and our CEO, rightfully so, made that decision at the right moment and said, "This is how we're going to go forward." And frankly, given a company of this size and these ambitions, there's no alternative, and I think that applies to a lot of companies.

spk_0:   19:51
You just need to look at the top, I guess, top 100 companies in the world today, by revenue or by any metric: something like 80% of the top 10 are going to be software companies, I think, apart from the oil companies.

spk_2:   20:07
And defence, probably, and maybe...

spk_0:   20:09
Probably. Even defence companies are becoming more and more technology focused, because the future of war is also going to be technology and AI.

spk_1:   20:19
We are a fintech, right? We used to work with partners, interestingly enough. So this has been quite a journey over the last year and a half, building this digital delivery organisation.

spk_0:   20:27
Now, in the

spk_1:   20:27
last couple of months, we've added quite a few developers to our payroll, and by that time there was already a bit of buzz, like: these guys are building a bank on serverless, and if you understand how cool that is, you should come work for us. And the people we now get are really talented. They really understand why they chose to work here, but they also bring great energy. Over the last couple of months things have really been taking shape, because it brings so much good vibe, so much good energy, to have your own people build your own software and to be able to manage it top to bottom and take the operational responsibility. Looking back, we did a lot of great things here.

spk_0:   21:07
Yeah, as a software engineer, there's nothing more frustrating than not being in control of the decisions. I've been in so many companies, and I also know of many companies, that work in a way where the software developers are essentially stifled of creativity and autonomy, where you are just being paid to write software, when it should be a really creative job where you can bring a lot of your own initiative to the table and create business value very rapidly. I think a lot of the things you guys talked about today, and a lot of the things I've seen in my own career, show that when you give developers that autonomy, and give them guidance and context rather than micromanaging every decision, you're going to get much happier people and much more productive people. You're also going to deliver so much more value, more quickly and more cost efficiently as well. So, going back to the question I had around challenges: what are some of the main pain points, platform limitations and tooling limitations that you've experienced to date working with serverless?

spk_1:   22:13
That's a great question. One of the things we ran into a lot is a perceived problem. I don't believe it's a big problem, but it's perceived to be a big problem, which is code base portability. We had I don't know how many talks with people who were building their software on Kubernetes because it's supposed to be more portable, and this is also to mitigate the risk of vendor lock-in, right? Our problem has not been code base portability so much as, more than anything, the perception of our lack of ability to mitigate that risk. And to be honest, risk is impact times likelihood. The likelihood of us needing to move from one public cloud to another is probably not that big. And then the impact: honestly, if you think about it, we use the Serverless Framework, we use runtimes that are very default, and the services on AWS are pretty much on par with the services that any other public cloud offers. I don't think there's a lot of risk there, but you run into the perception of it.
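
The "risk is impact times likelihood" rule of thumb mentioned above can be made concrete with a small scoring sketch. Everything numeric here is invented for illustration: the 1 to 5 scales, the two example risks and their scores are assumptions, not figures from Moneyou or from any risk framework.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Risk:
    name: str
    impact: int      # illustrative scale: 1 (negligible) to 5 (severe)
    likelihood: int  # illustrative scale: 1 (rare) to 5 (almost certain)

    @property
    def score(self) -> int:
        """Risk score as impact multiplied by likelihood."""
        return self.impact * self.likelihood


def rank(risks: List[Risk]) -> List[Risk]:
    """Order risks from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Hypothetical entries, only to show how a low likelihood shrinks a risk.
risks = [
    Risk("forced move to another cloud provider", impact=3, likelihood=1),
    Risk("unpatched self-managed hosts", impact=4, likelihood=3),
]

for risk in rank(risks):
    print(f"{risk.score:>2}  {risk.name}")
```

With made-up scores like these, a risk that people talk about a lot can still rank well below a mundane operational one, which is the point being made about perceived versus actual risk.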

spk_2:   23:16
And it's interesting, because this is one of the key controls that the regulators basically require banks to at least think about: what is your exit strategy? The question is quite literally on the form: "Do you have an exit strategy?" But that's it. And that is somehow translated into: we need to be able to move from one cloud provider to the next within a limited timeframe. But that's not the question. The question is: do you have an exit strategy? Because if you don't, you should probably get one. And the exit strategy doesn't have to be: we're going to take the revolutionary approach and just move everything, lift and shift, within a couple of weeks, and therefore we need to run on Kubernetes. No, we can basically say that we're going to use the evolutionary approach and just move workloads gradually to another cloud, and probably new possibilities will be there by then to make that easier. We are using the Serverless Framework, and we make an effort to separate the handlers from the business logic. That in itself means that all the business logic, including interacting with our back-end systems, is essentially portable, because that code is not tied to how Lambda executes. For instance, if we're talking about a Lambda, and we use standard software design patterns like DAOs and services, then the impact of actually moving to another cloud provider and having to use, I don't know, Table storage instead of DynamoDB is very low, because basically the services are more or less drop-in replacements. AWS's DynamoDB service is almost the same as the Azure equivalent, even if you look at the actual service limits, like maximum row size or record size. So yes, cloud providers are closely monitoring and copying each other, I feel. And besides that, there is only a low risk that AWS at some point would double their prices.
So the trend has been quite the opposite.
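
The idea of separating handlers from business logic, with data access behind a DAO-style interface, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the team's actual code: `ContactRepository`, `InMemoryContactRepository`, `ContactService` and `make_handler` are hypothetical names, and the event shape assumes an API Gateway proxy event in which the authorizer's `principalId` is passed along in `requestContext`.

```python
import json
from typing import Any, Callable, Dict, Optional, Protocol


class ContactRepository(Protocol):
    """DAO-style interface: the only thing the business logic knows about storage."""

    def get(self, user_id: str, contact_id: str) -> Optional[Dict[str, str]]: ...

    def put(self, user_id: str, contact: Dict[str, str]) -> None: ...


class InMemoryContactRepository:
    """Stand-in storage; a DynamoDB or other cloud table adapter would expose the same methods."""

    def __init__(self) -> None:
        self._items: Dict[tuple, Dict[str, str]] = {}

    def get(self, user_id: str, contact_id: str) -> Optional[Dict[str, str]]:
        return self._items.get((user_id, contact_id))

    def put(self, user_id: str, contact: Dict[str, str]) -> None:
        self._items[(user_id, contact["contactId"])] = contact


class ContactService:
    """Business logic: no knowledge of Lambda, API Gateway or any specific database."""

    def __init__(self, repo: ContactRepository) -> None:
        self._repo = repo

    def add_contact(self, user_id: str, contact_id: str, name: str, iban: str) -> Dict[str, str]:
        if not contact_id or not name.strip() or not iban.strip():
            raise ValueError("contactId, name and iban are required")
        contact = {"contactId": contact_id, "name": name.strip(), "iban": iban.strip()}
        self._repo.put(user_id, contact)
        return contact


def make_handler(service: ContactService) -> Callable[..., Dict[str, Any]]:
    """Thin, provider-specific adapter: translate the event, call the service, shape the response."""

    def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
        user_id = event["requestContext"]["authorizer"]["principalId"]
        body = json.loads(event.get("body") or "{}")
        try:
            contact = service.add_contact(
                user_id,
                body.get("contactId", ""),
                body.get("name", ""),
                body.get("iban", ""),
            )
        except ValueError as err:
            return {"statusCode": 400, "body": json.dumps({"error": str(err)})}
        return {"statusCode": 201, "body": json.dumps(contact)}

    return handler
```

Under this split, moving to another provider would mean rewriting `make_handler` and supplying a different repository implementation, while `ContactService` stays untouched. That is the sense in which the business logic is described as portable.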

spk_0:   25:30
Yeah, just looking at some data over the last six or seven years, there has not been a single price hike. There have been a lot of price reductions. I'm not sure exactly how many, though, because sometimes they count one price reduction for one service in one region, and sometimes it's for the whole service. Still, there's just no data to suggest that they're ever going to just spike the prices. Also, given that it's a competitive market that's getting more competitive and getting bigger all the time, there's no business incentive for them to jack up the price. And I think what Olaf mentioned as well, that whole point about it being a much bigger perceived risk than an actual risk: I think we've reached the point where if you asked 100 companies what the biggest challenges they face today are, no one is going to put portability near the top 10 or even the top 100. And yet they're making all these technical decisions up front because of this perceived risk around portability, which I think is just crazy. It doesn't make any sense. Again, it goes back to the point of understanding what the actual risk is and what a proportional response to it is, versus just saying, "Well, everyone is talking about this vendor lock-in thing, maybe we should think about it too."

spk_2:   26:41
Yeah, we've had examples of sister companies, which shall remain unnamed, so daughter companies of ABN AMRO, that were actually saying that they were kind of reluctant to use AWS because of vendor lock-in, but then had no trouble building everything on another vendor's platform, right?

spk_0:   26:58
Yeah. And also, this whole lock-in thing... I hate this word, "lock-in", because it's never lock-in. It's coupling: there's a cost to moving, but you can always move, given enough time. It's never really a true lock-in per se, and you're always locked into something anyway.

spk_1:   27:14
Yeah, and there's a huge opportunity. Imagine having to build yourself all the services and abstraction layers that you get from a public cloud vendor. If you go all in, look at the benefits you get from there. One of the things I love about AWS is that everything is integrated with Identity and Access Management. The work that you would need to put into implementing fine-grained, least-privilege identity and access management over different services, from perhaps different third-party providers with different models: that cost is not there, which becomes your benefit. There's a huge opportunity benefit in modelling your IT landscape towards a single cloud provider.

spk_0:   27:56
So this is part one of my conversation with Olaf and Tjerk at Moneyou. Please come back next week for part two of this conversation. If you want to access the show notes or the transcript for this episode, please go to realworldserverless.com. I'll see you guys next time.