Real World Serverless with theburningmonk

#14: Serverless data streaming with Anahit Pogosova

June 03, 2020 Yan Cui Season 1 Episode 14
Chapters
00:00:54 Using serverless to process 500M events/day
00:08:29 Doing ML and serving real-time predictions
00:11:57 Trade-offs for using API Gateway service proxies
00:15:56 Use cases for Step Functions
00:17:25 What's the biggest paradigm shift with serverless?
00:24:54 What are the biggest challenges you face while working with AWS?
00:32:02 AWS needs to improve documentation
00:37:22 On cost mistakes

You can find Anahit on Twitter as @anahit_fi and check out Solita's website at solita.fi and their blogs at dev.solita.fi and cloud.solita.fi.

Also, check out Anahit's talk on Serverless data streaming at the AWS Community Day Nordics 2020.

For more stories about real-world use of serverless technologies, please follow us on Twitter as @RealWorldSls and subscribe to this podcast.

Opening theme song:
Cheery Monday by Kevin MacLeod
Link: https://incompetech.filmmusic.io/song/3495-cheery-monday/
License: http://creativecommons.org/licenses/by/4.0/

spk_0:   0:13
Hi. Welcome back to another episode of Real World Serverless, a podcast where I speak with real-world practitioners and get their stories from the trenches. Today I'm joined by Anahit. Welcome to the show.

spk_1:   0:25
Hi, Yan. Great to be here. Thanks for having me.

spk_0:   0:29
It was really nice to meet you at the AWS Community Day in Stockholm.

spk_1:   0:33
It was really nice to meet you as well. I was looking forward to your talk and was hoping that my talk wouldn't overlap with yours, so I could listen to you as well. And it didn't, so...

spk_0:   0:43
Yeah, that worked out quite nicely. And I really enjoyed your talk about all the things you've been doing with Kinesis, at pretty high scale as well. That was quite impressive.

spk_1:   0:52
Thank you.

spk_0:   0:54
So let's go straight into that. Can you tell us a bit about Solita and what you've been building with serverless?

spk_1:   1:02
Sure. So, as you said, I'm Anahit. I'm a software engineer from a company called Solita. It's a Finnish company, originally founded in Finland about 24 years ago, and at the moment we have over 1,000 professionals working for us in six different countries: Finland, Sweden, Denmark, Estonia, and also Belgium and Germany. We do a lot of different things: software development, cloud and integration services, analytics, data science, and some strategic consulting and service design as well. I personally have been working as a software engineer, as I said, doing full-stack development for over a decade. I've done it all: backend, frontend, integrations, data engineering, all that kind of stuff, and I've done on-prem development as well as AWS cloud development. For the past several years I have been working with one of our customers, Yle, the Finnish national public broadcasting company. You can think of it as the BBC of Finland. I've been working as part of a data and AI team over there, so we have been dealing with different data solutions, and we have been working on AWS. The entire Yle organisation is using AWS as its cloud provider, and our team in particular focuses on data: user interaction data mostly, but really all kinds of data. One thing we're responsible for is, for example, the recommendations for our streaming service, which are similar to the Netflix recommendations that are probably familiar to you, plus a lot of different article recommendations and audio recommendations, and also a lot of internal tools using machine learning. There are a lot of POCs we try, things like automatic image extraction from video, extracting a video clip from a bigger video, and things like that.
So a lot of interesting things, really. And we have a data pipeline that streams about half a billion requests per day, which makes about half a terabyte of data per day, which is quite a considerable amount for Finland if you think about it. For that we have been using mostly serverless and fully managed services. We have a pipeline of Kinesis and Lambdas, with Kinesis Firehose, Kinesis Data Streams and Kinesis Analytics. I think that pipeline has been pretty much fully serverless for a couple of years. We use it for loading data and also for some real-time data analytics. And I must say, I think we haven't had a single problem with the pipeline itself or with the AWS services per se. All the problems we have had were mostly related to our own code and our own mistakes. So I think that's pretty good for the scale that we are at. We're also using serverless for smaller things. We use it for A/B testing our recommendation algorithms, and our data scientists have been using it for some POCs that they are doing. We have a user feature store that is also fully serverless and has a low-latency API in front of it. At the moment we're building a newsletter management system. It's also meant to be pretty much serverless, with Step Functions, Lambdas, SNS, SQS and stuff like that, and there are numerous other smaller things. The nice thing for me is that Yle, the broadcasting company, is actually an excellent customer. They allow us a lot of freedom as developers to choose the technology that we think is most suitable for the use case we're working on, so we have a lot of freedom to try things and see if they're going to work. Also, a lot of our things were greenfield projects, so we didn't have to think about migrating from on-prem monolith architectures to serverless, cloud-based ones.
A lot of things have been done from scratch, and we always try to evaluate the possible architectures when we start something new. I personally try to think serverless first, of course, but it doesn't always work like that. If serverless seems like the correct choice and the best choice for the task at hand, we go with that. But we also try to explore the serverless scene and try new technologies that come up. Maybe we can't use something at the moment, but we can have it as a tool to use later when the appropriate use case comes up. So that's been pretty fun, I would say. And of course, if you think about the benefits, innovation speed is one that everybody brings up, so I must bring it up as well. It helps, for example, to roll out pretty quickly the new POCs that our data scientists come up with and try them out, but also to kill them pretty quickly if things don't work. So that's pretty good. And as a developer, I personally think the less code you have to write, the fewer errors you are going to make. So I really go for existing solutions and existing built-in integrations. I think Eric Johnson has this saying: use Lambda to transform, not to transport. I try to follow that, in the sense that if we don't need a Lambda, I try to avoid writing one. For example, API Gateway has direct integrations with other services, which not many people are using, but I think they should. That eliminates the need for the Lambda function altogether, and the less code you have, in my opinion, the better. So that's about the things that we are doing.
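To put the quoted pipeline volumes in perspective, the short calculation below converts "half a billion requests" and "half a terabyte" per day into average per-second rates. It is a back-of-the-envelope sketch only: it assumes decimal units (1 TB = 10**12 bytes) and perfectly even traffic, neither of which is stated in the conversation, and the function name is made up for illustration.

```python
# Back-of-the-envelope averages for the daily volumes quoted in the episode.
# Assumptions (not from the episode): decimal terabytes and evenly spread traffic.
SECONDS_PER_DAY = 24 * 60 * 60


def average_rates(events_per_day: float, bytes_per_day: float) -> dict:
    """Convert daily totals into average per-second rates and mean event size."""
    return {
        "events_per_second": events_per_day / SECONDS_PER_DAY,
        "bytes_per_second": bytes_per_day / SECONDS_PER_DAY,
        "bytes_per_event": bytes_per_day / events_per_day,
    }


rates = average_rates(events_per_day=500e6, bytes_per_day=0.5e12)
for name, value in rates.items():
    print(f"{name}: {value:,.1f}")
```

Under those assumptions the averages come out at several thousand events per second with a mean size of about a kilobyte per event; real traffic will peak well above the average.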

spk_0:   7:49
That's quite a nice range of different things you've been working on with serverless, and I can't agree more with what you said there at the end about less code being better. I think one of the things that we as engineers often get wrong is that we feel the value we bring to the company is in the code we're writing, rather than seeing the code as a cost. The company doesn't really care about having code as output. It cares about having business value and impact, and writing code is just what we often have to do to deliver that value and impact. One question I do have, though: you mentioned you're also building low-latency APIs. Is that for doing prediction serving, or for running your machine learning models?

spk_1:   8:43
The one that I mentioned is actually a pretty new one. We have built this user feature store, which we can query for particular features of a user, like gender or age prediction. The requirement was to have as little latency as possible, because it's going to be used by online machine learning algorithms down the line, so we don't want to pile up the latencies. That was a big concern, but I think we have managed to keep it under 15 to 20 milliseconds at the 95th percentile. So I think it's pretty good at the moment. I don't know if it qualifies as low latency in your books, but for me it's pretty low.

spk_0:   9:33
Yeah, I think that's always a bit of a grey area: how low do you need to go? Even for real-time systems, sometimes a couple of seconds is considered real time. And when you're building live, multiplayer mobile games, you really want the round trip to be less than 150 milliseconds. So, low latency... I think it's good enough.

spk_1:   9:58
Yeah, me too. We had this 100 milliseconds as a kind of hard limit for the entire machine learning part, because after that users might start noticing the latency. So the smaller the latency for the API, the better, of course.

spk_0:   10:17
So for that, have you ever had trouble with cold starts with API Gateway and Lambda? Do you do anything special to keep your containers warm?

spk_1:   10:26
Yeah, well, as I said, this is a pretty new thing. We started it before Lambda provisioned concurrency came out, but at the moment we don't use anything in particular. We still manage to keep it, as I said, in the 20 millisecond latency area for at least p95. We tried different options, but the one that seems to work best is the API Gateway to DynamoDB direct integration. So at some point we removed the Lambda from the pipeline altogether.

spk_0:   11:05
Yeah, that makes sense, because the cold start performance issues all come from having a Lambda in between. If you go straight from API Gateway to DynamoDB, you just sidestep that, right?

spk_1:   11:16
Of course. But one different thing that I've noticed with that is that when we had the Lambda in between, we could regulate the timeouts for DynamoDB, which is one way to tweak the latencies, so the DynamoDB connection would be retried faster than with the default setting. With the API Gateway direct integration, you don't have much control over that, and I think the response times from DynamoDB were affected by that in a way. They went up a bit compared to the Lambda integration, which was quite interesting to see. But overall it's still lower since, as you mentioned, we don't have the Lambda cold start issues anymore.

spk_0:   11:58
Yeah, with DynamoDB, as you mentioned already, that's because you don't have the HTTP keep-alive that you can enable in the Lambda function in between. Another thing I think you also lose is that when you're using Lambda to call DynamoDB with the SDK, you get built-in retries and exponential backoff, which API Gateway doesn't do for you. But given your requirements, I still think that's a really reasonable compromise to make. And you can still do some retries on the client side anyway. Especially with DynamoDB, if you don't get that many errors, maybe you just don't need it as much. So yeah, that's a nice one.
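The client-side retries mentioned here can be sketched as a small wrapper with exponential backoff. This is a generic illustration, not the AWS SDK's own retry logic: the function names, the default delays and the "full jitter" strategy are all assumptions chosen for the example.

```python
import random
import time


def backoff_delays(max_attempts, base_delay=0.05, max_delay=1.0, rng=random.random):
    """Yield the wait before each retry: exponential growth with full jitter."""
    for attempt in range(max_attempts - 1):
        cap = min(max_delay, base_delay * (2 ** attempt))
        yield rng() * cap  # random wait between 0 and the current cap


def call_with_retries(operation, max_attempts=3, is_retryable=lambda exc: True,
                      sleep=time.sleep, **backoff_options):
    """Call operation(), retrying failures until max_attempts is reached."""
    delays = backoff_delays(max_attempts, **backoff_options)
    while True:
        try:
            return operation()
        except Exception as exc:
            delay = next(delays, None)
            # Give up when attempts are exhausted or the error is not retryable.
            if delay is None or not is_retryable(exc):
                raise
            sleep(delay)
```

A caller would wrap the HTTP request to the API in `operation` and pass an `is_retryable` predicate that only accepts throttling or server-side errors, so that client mistakes are not retried.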

spk_1:   12:42
Exactly. And it's always a consideration of what the requirements really are. Of course, if you ask the client, they will always say: as fast as possible, no errors, nothing. But the reality is, we know that we will get some percentage of errors from DynamoDB, for example, and some latency. So it's about the compromises: what are the actual limits that we have? In this case, I think we pretty much met the limits that we had.

spk_0:   13:12
And it's also a matter of cost as well, because when we talk about reliability and uptime, it's going to cost you a certain amount to get to 99%. But if you want to go to three nines or four nines or five nines, then the cost of actually achieving that goes up exponentially, not linearly. The same goes for performance as well: once you get to a certain level of performance, every millisecond you shave off is going to cost you so much more than the previous millisecond. I used to work in banking, and on some of these ultra-high-frequency arbitrage trading systems they were going as far as building their own dark fibre cable that was going to cost tens of millions, just to improve latency by a single-digit millisecond. In some cases people were doing even crazier things, like using microwave transmission, which is faster than fibre-optic cables over a short distance. So they built massive, kilometres-long relays for microwave transmission just so that they could get, I think, a couple of milliseconds faster. And then they ordered FPGAs so that they could execute code in hardware rather than having to write software. The overhead, complexity and cost go up drastically when you need to optimise those couple of milliseconds to be faster than the exchanges themselves. But again, it's about cost and how much you're willing to invest in those milliseconds.
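One way to see why each extra "nine" is so much harder is to look at the downtime budget it leaves. The sketch below only computes that budget; it says nothing about actual cost, and the 365-day year is an assumption made for simplicity.

```python
def allowed_downtime_minutes(nines: int, period_days: float = 365.0) -> float:
    """Downtime budget for an availability of N nines (2 -> 99%, 3 -> 99.9%, ...)."""
    unavailability = 10 ** (-nines)
    return unavailability * period_days * 24 * 60


for n in range(2, 6):
    availability = 100 * (1 - 10 ** (-n))
    print(f"{availability:.3f}% -> {allowed_downtime_minutes(n):,.1f} minutes per year")
```

Each additional nine shrinks the yearly budget by a factor of ten, from a few days at two nines to a few minutes at five, which is the intuition behind the cost growing faster than linearly.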

spk_1:   14:44
Sure. And in your example, I guess the return on investment was pretty high, so they were willing to go that extra mile to gain those extra milliseconds. But most cases are luckily not like that, in my opinion.

spk_0:   14:57
No, not at all. For most web traffic, you're happy with just one second or three seconds at p99, right? Exactly. So for the machine learning stuff, are you building your own custom models and serving them yourself? Or are you using some of the built-in AI services from AWS?

spk_1:   15:16
We actually build our own models at the moment. That's not my area of expertise per se, so I can't elaborate on it too much. But the guys are building their own models and running them themselves. They use SageMaker only for the POC or prototyping stage. If they want to run through a big amount of data quickly, to do some analysis and such, they use SageMaker. But I think they mostly build the models themselves and run them themselves as well.

spk_0:   15:53
Okay, that makes sense. And you also mentioned Step Functions earlier, which is something that I'm really interested in personally. Can you tell us a bit about what you're building with Step Functions?

spk_1:   16:05
Well, it is pretty new, so it's still a work in progress. I can't tell yet if it's going to be a success or not. But we are trying to rebuild the newsletter management system that we have, for the newsletters that we send to customers about, for example, new programmes or shows that are coming up. We are trying to rebuild it based on Step Functions. So we will have some workflow management there, with the possibility to enrich the newsletters or personalise them based on the preferences of the given customer. That's the idea, but it's still a work in progress, so let's see how it pays off. We are trying to use the dynamic parallelism feature that came out around re:Invent. It looks pretty promising at the moment, so I'm keeping my fingers crossed that it works.

spk_0:   16:58
Yeah, that's a pretty sweet feature. I'd been waiting two years for them to announce it, so I'm really glad that it finally came out. It opened up so many different use cases around map-reduce and other things where I thought Step Functions was going to be a good fit but just couldn't do it. Before dynamic parallelism, I had to do those things myself: push things into SNS and listen for some response or some signal from Lambda.
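Dynamic parallelism in Step Functions is expressed with a `Map` state, which runs a sub-workflow once per item in an input array. The sketch below builds a minimal state machine definition as a Python dictionary. It is not the newsletter system described in the episode: the Lambda ARN, the state names and the `$.subscribers` input path are all hypothetical, and it uses the `Iterator` field name from when the feature launched (newer definitions may prefer `ItemProcessor`).

```python
import json

# Hypothetical Lambda ARN, for illustration only.
PERSONALISE_FN_ARN = (
    "arn:aws:lambda:eu-west-1:123456789012:function:personalise-newsletter"
)


def newsletter_state_machine(max_concurrency: int = 10) -> dict:
    """Build an Amazon States Language definition with a single Map state."""
    return {
        "Comment": "Sketch: fan out over subscribers with a Map state",
        "StartAt": "PersonaliseForEachSubscriber",
        "States": {
            "PersonaliseForEachSubscriber": {
                "Type": "Map",
                # Array in the execution input to iterate over (assumed shape).
                "ItemsPath": "$.subscribers",
                # Upper bound on how many iterations run at the same time.
                "MaxConcurrency": max_concurrency,
                "Iterator": {
                    "StartAt": "Personalise",
                    "States": {
                        "Personalise": {
                            "Type": "Task",
                            "Resource": PERSONALISE_FN_ARN,
                            "End": True,
                        }
                    },
                },
                "ResultPath": "$.personalised",
                "End": True,
            }
        },
    }


print(json.dumps(newsletter_state_machine(), indent=2))
```

The number of iterations is decided at run time by the length of the input array, which is what makes the parallelism "dynamic" compared with a `Parallel` state whose branches are fixed in the definition.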

spk_1:   17:23
Exactly. Exactly. Yeah.

spk_0:   17:25
You know, you and I have a very similar history of going from on-premises to the cloud, running EC2 and containers, and now doing serverless. So what were some of the biggest changes for you as you went through this change of paradigm in how you build, run and ship software?

spk_1:   17:44
It's a really good question, because I think I never realised for myself when the change happened. I just tried serverless, then I got hooked on it, and I just kept on going. But I think one of the bigger challenges is the learning curve. Nowadays, if you start with serverless from zero, I think it's pretty steep, because there are all these services that you have to know, many of them are so similar, they all have their details, and you have to be able to choose between the different services that are available. On the other hand, you said something about a paradigm shift, and I can totally relate to that, because I think it's a totally new paradigm of developing software, and it requires that paradigm shift. The paradigm shift should be a conscious effort from the developers. They need to want to go there, in a way, to be able to get the full benefits of serverless. And as surprising as it might be, I think one of the biggest challenges is the resistance to this change that you see from, for example, more experienced developers who are used to doing certain things in a certain way for a long time. They are the most productive that way, and they don't want to change. And again, in my opinion, it's fine if they deliver the customer value more quickly in the ways they are used to working. I'm not going to force serverless on them. The thing is, I think part of the issue is that AWS releases its new services at a pretty early stage, so they're pretty raw when they come out. I think that can be both a strength and a weakness, because you get the services to the end users, you get feedback from them, and you can iterate and develop.
But then again, some users who try the services at a very early stage and get a negative experience with them might not be willing to give them another go later on, when the time comes. I've seen a lot of scenarios where somebody tried, for example, API Gateway four years ago and it didn't deliver latency-wise or cost-wise or in some other way, or didn't have all the features necessary at that moment. Or somebody used to have a lot of problems with DynamoDB scaling, which had to be done with manual throughput, I mean, with all the scaling done by hand, and they didn't even look into the new auto scaling possibilities, and so on and so forth. And of course, the classic one is Lambda with the cold starts. If you ask somebody who tried Lambda four years ago, who tried it behind an API and had a terrible experience with huge latencies, they said "never again", and they are not following what's happening in that world. So they are stuck. Their mindset is stuck a few years back in that negative experience. And then they get a confirmation bias effect every time they hear about some challenges with serverless solutions: the first thought is that you shouldn't have used serverless in the first place. So I think it's pretty challenging to turn those people around and to show that things are developing, and that the serverless world four years ago and the serverless world now are pretty much two different worlds altogether.
And I think that's the beauty of serverless, in a way, which I like a lot: if you build something today, for example that API I was mentioning, and you have certain latencies today, the chances are pretty high that in a year you're going to have lower latencies rather than higher latencies. We're getting improvements all the time, without even doing anything ourselves, and I enjoy that quite a lot. But yeah, I sometimes feel like I'm a broken record, because I always suggest serverless as the first option. But then again, when another person from my team chooses Lambda for their new prototype or POC, I feel like it was all worth it, you know? It wasn't in vain.

spk_0:   22:14
Yeah, I'm very much like you in that sense, in that I always advocate serverless first whenever I talk about new projects or work with others. But do you think this paradigm shift is a bit of a double-edged sword? On the one hand, it does open up a lot more optimisation opportunities going forward, especially as the platforms themselves mature and improve over time. As you said, the best optimisation now is just to do nothing: your code is going to be faster and cheaper in a year's time as the platforms themselves improve under the hood. At the same time, it also challenges a lot of developers' identity. I guess most of us have the same thing: if we spent our whole careers building with one set of technologies, we identify ourselves as a C# developer or a Node developer. Never mind the fact that the only reason why we are employed is not because we know C# or whatever; it's because we produce business impact at the end of the day.

spk_1:   23:15
Yeah, I completely agree with you. And I think in that sense, the conversation about serverless versus non-serverless is sometimes pretty similar to the programming language choice conversation. There will be people defending the particular language that, in their opinion, is the best, the one to rule them all. But I guess the reality is that there's no "one to rule them all". There are preferences, and it's about what you are most productive with and what you can create the most business value with.

spk_0:   23:51
Yeah, absolutely. And even though there is quite a learning curve involved, especially if you jump onto AWS today, when there are, what, nearly 200 different services, I think you can be very productive, especially with serverless, if you just learn a small handful of services like DynamoDB, SNS and SQS, maybe Kinesis, API Gateway. That's maybe six or seven different services. If you just know those reasonably well, you can use them to build most systems. Most workloads can be done with just a handful of services. And a lot of things that are difficult to learn, especially around networking, you almost don't have to touch if you stick to just the serverless components. Of course, they can't do everything. The moment you need to go into relational databases, you have to learn about VPCs again, and you have to learn about RDS, which is quite complex machinery with lots of different configurations and loads of different options that you have to set up yourself. So, in that case, what are some of the biggest challenges that you see with AWS right now, in terms of the tooling and in terms of platform limitations?

spk_1:   25:06
Well, I would think that the main challenge for us is not directly related to AWS. It's rather the Terraform that we are using. And don't get me wrong, I do love Terraform. That's my weapon of choice, in a way, so I do choose it over CloudFormation. We are using it across the entire organisation, so it's pretty much a given for us. We have a lot of tooling on top of Terraform, which makes life easier. We have a lot of prepared modules that our ops team has created, and it all works great. But I guess you know pretty well that Terraform is not the best choice for the serverless stack. If you want to create a simple API with Lambda and DynamoDB, it takes an enormous number of lines of code to write. And of course, you get this weird sense of satisfaction and achievement when you actually make it work, but that's all time spent on Terraforming and taken away from creating business value. For that reason, I personally found that developing and prototyping in the AWS console is, in the case of Lambda, the fastest way for me. So I stick to the console, I code in the console as long as possible, and then at the very last moment I do the Terraforming. One thing that has helped with that immensely is Lambda layers, which came out, I think, last year or two years ago. We use layers quite extensively during the prototyping stage for the external libraries, packages and dependencies we need in our Lambdas. So that's a good strategy for us. Then, I guess a lot of people speak about testing, and a different concept of testing in the world of serverless.
What has been working pretty nicely for us, at least, is of course a lot of unit testing for the Lambdas, with the normal error handling things, to see what happens when, for example, the SDK returns an error or something like that. But then we also try to do some load testing in the test or staging environment, to test the entire pipeline from start to end, and we do the integration testing in AWS. I'm not sure it's a limitation, really. I think it's, again, a new way of working. It probably needs some improvement and some more automation, but it's just a new way of doing things nowadays. As for limitations, I guess logging is one thing that I've had a bit of a headache with: Lambda writes three lines of logging for each invocation, and then all of a sudden you have this huge number of invocations, and you start to see that you're paying a lot of money just for logging. And those logs don't tell you much, except that the Lambda has started and the Lambda has ended, that kind of thing. CloudWatch Insights has been a lot of help for logs; I use it a lot. X-Ray has been good, but as you know, it has limited integrations. For example, Kinesis still doesn't support X-Ray, and we would really, really need that. At last year's re:Invent I was promised that it's coming soon, so I'm still waiting. So I think it's still a work in progress, in a way. But I don't think it can stop anybody per se. It just needs, again, a paradigm shift: slightly different ways of thinking about testing and logging and all those kinds of things. And if you think about particular services and some wishes from my wish list: we are, as I said, heavy users of Kinesis, of Kinesis Data Streams.
So the one feature I have been asking about over and over again is Kinesis auto scaling, and I'm still hoping it will come at some point. Then there are small things related to the API Gateway HTTP APIs, which recently became generally available. I actually tried them, I think, last week, and I really loved how simple it has become compared to the REST APIs. But of course, it's still lacking some features that I hope will be there soon as well. Again, I'm pretty optimistic about those, because I know things will come. You just have to wait for them, you know?

spk_0:   30:01
Yeah, on the Kinesis auto scaling: I've built quite a few auto scaling systems for Kinesis myself in the past. And the strange thing is that they have that one API you can call to change the number of shards, but it stops one step short: they don't give you an auto scaling option that you can just enable out of the box. I'm not sure why. It seems to be one of the things that everyone using Kinesis is asking for. And there's that one thing you mentioned earlier, that when you put this thing together, it takes a lot of work, and then you feel really proud at the end. Me and Joe Emison were talking about this quite a while back, about that sense of pride. I think now I see that as a red flag, almost. You know, you have to work hard to get a sense of pride, so why do you have to work so hard in the first place? You could have done it differently. It should have been a service.
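The core of a home-grown Kinesis auto scaler of the kind described here is a calculation of how many shards the observed traffic needs. The sketch below shows only that calculation and makes no AWS calls. It assumes the commonly documented per-shard write limits (about 1 MB and 1,000 records per second, with 1 MB treated as 10**6 bytes) and clamps the result to between half and double the current count, which, to the best of my understanding, matches the limits of a single `UpdateShardCount` call; check the current service documentation before relying on either. The function name and the headroom parameter are made up for illustration.

```python
import math

# Assumed per-shard write limits; verify against the current Kinesis documentation.
SHARD_MAX_BYTES_PER_SEC = 1_000_000
SHARD_MAX_RECORDS_PER_SEC = 1_000


def target_shard_count(bytes_per_sec: float, records_per_sec: float,
                       current_shards: int, headroom: float = 0.25) -> int:
    """Shards needed for the observed write rate, plus a safety margin."""
    needed = max(
        bytes_per_sec * (1 + headroom) / SHARD_MAX_BYTES_PER_SEC,
        records_per_sec * (1 + headroom) / SHARD_MAX_RECORDS_PER_SEC,
        1,  # a stream always has at least one shard
    )
    desired = math.ceil(needed)
    # Stay within what one resize step is assumed to allow.
    lower = math.ceil(current_shards / 2)
    upper = current_shards * 2
    return max(lower, min(upper, desired))


print(target_shard_count(bytes_per_sec=5.8e6, records_per_sec=5800, current_shards=4))
```

In a real scaler this function would be fed from stream metrics on a schedule, and a larger change than one step allows would be applied over several consecutive resizes.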

spk_1:   31:02
Exactly. I totally agree with you. But, you know, developers are strange people. We do have this sense of pride in doing difficult things. But yeah, I'm all in for simpler ways and simpler things, for sure.

spk_0:   31:17
So I think that's the thing: if you have to work hard to build something, maybe you should have used a service instead.

spk_1:   31:24
That's a very good point. Yeah,

spk_0:   31:26
Yeah, because imagine how hard it's going to be for you to take care of that going forward. Exactly. And I think that's probably all of the questions that I wanted to cover. So is there anything else that you'd like to tell us about? You know, what you're doing, or maybe that Solita is hiring?

spk_1:   31:46
I think there was actually one thing that I wanted to cover as well, if you don't mind: the things that can be improved on the AWS side. I think I skipped some of the things I wanted to say. Go for it. Yeah, sure. So I think what could be very helpful is better documentation and better guidance from the AWS side. I think it has become much better in recent times, but still, you get a lot of happy-path documentation everywhere, and I would really appreciate more of the not-so-happy-path options: what happens, or what you should do, when things go wrong. Frankly, I mostly refer to your blogs when I start something new and want to know how things are really working, because the official documentation is often more there to sell you the service. But as a developer, I would really want to know what it is good for as well as what it is not so good for: a patterns and anti-patterns way of thinking. I think that would help a lot for developers who are just starting with serverless, because you can really get lost in all that abundance of services and information. The things that I personally find myself working on the most recently are error handling, failure handling and retries, and things like the built-in retries, which I personally knew nothing about until I stumbled upon them in my unit tests. I tried to figure out what was going on, and I tried to find it in the documentation, but it wasn't too easy to find. You would think it's a basic thing that should be stated everywhere, that the SDK does retries automatically, but it's not, really. Maybe nowadays it is, but at least back then it wasn't.
So at least personally, when I talk about these things, I try to emphasise those kinds of error scenarios, to help other developers who may be banging their heads against the wall trying to figure out the same thing. So I would really want AWS to be braver about emphasising these not-so-happy-path scenarios. And I think another thing that could be improved, I mean, you can do it already, is cost estimation and cost calculation beforehand for serverless solutions. It's still pretty tricky, in my opinion, and in the end the best estimate you get is when you put something in production, watch over it for a month and then see what the bill is. And if you are a new developer, it's so easy to do things the wrong way with serverless, because you might get into this false thinking that serverless doesn't cost anything or that it's very cheap, maybe because Lambda is very cheap. But when you get to a bigger scale, you start to see the cost of not-so-optimal architectures: you don't have caching, you have excessive Lambda invocations, you are over-provisioning your DynamoDB tables, you're invoking too many Lambdas, or doing too many API Gateway invocations, and all that kind of stuff. Those things should be made easier for people who are just starting, because, as you said yourself, there is just a bunch of services which you can start with, and that is true. But I think there are a lot of details about the services that you need to know in order to make a cost-efficient and reliable architecture that will work in production in the end. So I would want AWS to support that better in a way.
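
The built-in retries described above are easy to miss in unit tests, because one logical call can hit the service several times. The Python sketch below is only an illustration of that effect, not the SDK's actual implementation: `call_with_retries`, `ThrottledError` and `FlakyService` are hypothetical names, and the attempt count and backoff delays are arbitrary assumptions. In boto3, the real retry behaviour is tuned through `botocore.config.Config(retries={...})`.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a retryable service error."""


def call_with_retries(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying ThrottledError with exponential backoff.

    Returns a (result, attempts) tuple; re-raises after the last attempt.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation(), attempt
        except ThrottledError:
            if attempt == max_attempts:
                raise
            # Exponential backoff with full jitter (assumed values).
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))


class FlakyService:
    """Hypothetical service that fails the first `failures` calls."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def put_record(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ThrottledError("throttled")
        return "ok"


delays = []  # record the backoff delays instead of actually sleeping
service = FlakyService(failures=2)
result, attempts = call_with_retries(service.put_record, sleep=delays.append)
print(result, attempts, service.calls)
```

The caller sees a single successful call, while the service was hit three times. A mock that counts invocations, or one that always fails, is what tends to surface this in a unit test.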

spk_0:   35:50
Yeah, absolutely. The thing with AWS is that, being such a big company with such a wide range of different services they offer, they tend to be unopinionated almost by design. So they don't really tell you when to choose different services, like when to use API Gateway vs ALBs, even though for someone like me that's a clear decision point based on the amount of traffic and, I guess, the cost projection you're going to be looking at. There are some cases where API Gateway is just going to be excruciatingly expensive if you're running a really high-throughput system on it compared to ALB. Whereas AWS just gives you this, I guess, buffet menu of different options you can choose from, without much guidance on which service best suits your needs. Even for things like, oh, I need some kind of a queue, you have to

spk_1:   36:46
think what, exactly? You have so many options.

spk_0:   36:49
Yeah, there's SNS, SQS, Kinesis, DynamoDB Streams, IoT Core, EventBridge. There are just so many options, and how do you choose between different services? This whole lack of guidance from AWS is basically keeping people like me in a job.

spk_1:   37:03
That's true, there's a good positive side to that as well.

spk_0:   37:09
But there are also more, I guess, valuable things I could be doing than just telling you to always use EventBridge in this case and use Kinesis in that case. The whole cost side of things, I think that's quite interesting. I do wish there was a better prediction model for AWS, especially when using some of these pay-per-use services, like you said, with DynamoDB and with Lambda, to know the projected costs, especially when simple mistakes can become really, really costly. It hasn't happened that often, but I have seen quite a few people post about how they ended up with a big bill for Lambda because they got into some kind of infinite recursion, like that. At least you hear of a few Lambda pricing

spk_1:   37:56
horror stories about. Yes, yes, yeah. No, it's actually pretty easy to get too enthusiastic and go overboard with serverless, really, because again, it's easy to start with, and then you are maybe not as cautious about following the costs as you would be in the case of EC2, for example, or some other services. But that

spk_0:   38:21
said, I still think there are a lot of these horror stories that we never hear about. If you look at the bills that people talk about, it's, you know, a couple of hundred dollars. Not great on your personal account, for sure. But in the grand scheme of things, I've seen people make far, far, far bigger cost mistakes with EC2 and containers that make it look like a drop in the bucket.

spk_1:   38:40
Yeah, it's a matter of perspective. That's a good point.

spk_0:   38:44
Yeah, I mean, I've heard of companies running hundreds of EC2 instances, 4xlarge, with 5% CPU utilisation or less. I mean, you're talking about tens of thousands of dollars over the course of a couple of months just being wasted, because they forgot to turn on auto scaling, or have really poor auto scaling policies in place, or made some simple mistakes around configuration. And there are also companies that are spending I don't know how much on NAT Gateway charges, you

spk_1:   39:16
know? Yeah, with the per-gigabyte transfer price being there and all that. Yes. Try making a mistake with NAT Gateway, and you will see the bill pretty quickly.

spk_0:   39:26
Yeah. I mean, with serverless pricing, some of the mistakes that you can make with Lambda that are going to incur costs would always be small costs by comparison to some of the other mistakes you can make in a containerised environment.

spk_1:   39:40
Yeah, but that being said, I would really appreciate more transparency in a way, or predictability really, when using, not even Lambda, Lambda is somehow easy, but DynamoDB, for example. I think a year ago I tried to estimate what an on-demand DynamoDB would cost with our throughput, and I think I got a different estimate every time I tried to make it. And of course, it might have been my own fault that I couldn't do it properly. But then it probably tells you something about how easy it is to find the information and do the estimates. So

spk_0:   40:13
Oh, yeah, those estimates are really hard because a lot of them have to take into account the size of the payload, which, of course, depending on what you're doing, is not always easily predictable. And they're always rounding things up. I've not seen any services that do a good job of predicting the cost of serverless, but there are a few of them that do a reasonable job of tracking the costs and tracking changes, so that at least you can see, month on month, what impact the decisions that you've made have had on your costs, so at least you can react to it relatively quickly. At the same time, I've also had a few projects with clients now that are using the pay-as-you-go model you get from serverless and building it into their own products, so that you can offer your customers a pay-as-you-go pricing model as well, and use that as a cost advantage. So we've had to build some, I guess, cost models ourselves so that we know, okay, based on the utilisation, based on the services that we're using for supporting these customers' transactions, we are going to see this amount of cost, and then we can put a premium on top of that and include it in our bill for the customer. So we've built a few things ourselves, some kind of cost projection models. But again, when you've got different tiering and all that, it becomes really hard to accurately reflect what you're going to end up spending on AWS. But hopefully what we figured is close enough that we're not going to end up losing too much money. But again, with Lambda, everything's cheap around it, unless you're running at a massively high scale, in which case you probably want to be moving things back into containers for just that one thing that's running at a high scale. So we're coming up to the hour mark. Is there anything else that you may want to tell the audience, like maybe that Solita is hiring?
How do people find you on the Internet?
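
The point above about payload size and rounding can be made concrete with a rough estimate for on-demand DynamoDB. The Python sketch below assumes the usual request-unit sizing (a write unit covers up to 1 KB, a read unit up to 4 KB, and eventually consistent reads cost half), and it leaves out storage, indexes, transactions and data transfer. The function name, the traffic figures and the two prices are hypothetical placeholders, not current AWS prices; the real numbers vary by region and change over time.

```python
import math


def on_demand_monthly_cost(writes_per_day, reads_per_day, item_size_bytes,
                           price_per_million_writes, price_per_million_reads,
                           eventually_consistent=True, days=30):
    """Rough monthly request cost for an on-demand table (sketch only)."""
    # Item size is rounded UP to the next unit boundary, minimum one unit.
    write_units = max(1, math.ceil(item_size_bytes / 1024))  # 1 KB per write unit
    read_units = max(1, math.ceil(item_size_bytes / 4096))   # 4 KB per read unit
    if eventually_consistent:
        read_units = read_units * 0.5  # eventually consistent reads cost half

    monthly_write_units = writes_per_day * write_units * days
    monthly_read_units = reads_per_day * read_units * days
    return (monthly_write_units / 1_000_000 * price_per_million_writes
            + monthly_read_units / 1_000_000 * price_per_million_reads)


# Placeholder prices per million request units (assumptions, not AWS quotes).
WRITE_PRICE = 1.25
READ_PRICE = 0.25

total = on_demand_monthly_cost(
    writes_per_day=1_000_000, reads_per_day=2_000_000, item_size_bytes=1_500,
    price_per_million_writes=WRITE_PRICE, price_per_million_reads=READ_PRICE)
print(f"{total:.2f}")

# One extra byte over the 1 KB boundary doubles the write cost.
at_boundary = on_demand_monthly_cost(1_000_000, 0, 1024, WRITE_PRICE, READ_PRICE)
over_boundary = on_demand_monthly_cost(1_000_000, 0, 1025, WRITE_PRICE, READ_PRICE)
print(at_boundary, over_boundary)
```

Because of the rounding, an estimate based on the average item size can be noticeably off when item sizes straddle a unit boundary, which may be one reason such estimates come out differently each time.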

spk_1:   42:12
Sure. So for Solita, you can visit our website. It's solita.fi. And yeah, we are hiring still, even in the midst of the pandemic. So there are a lot of options in those three countries that I mentioned in the beginning. So if you are interested at all, go visit our website. We also have some blogs that you can find from the website, like dev.solita.fi and cloud.solita.fi, blogs that our developers are writing about all kinds of things: cloud, no cloud, serverless, no serverless. So go check that out if you want to. My personal Twitter handle is @anahit_fi, so you can follow me there. There are some blog posts that I have been meaning to write for a long, long time, so maybe now under lockdown I'll finally have time to write them up. So, about the data streaming things and serverless APIs and the Step Functions that I was mentioning. So keep an eye out for those. I hope social pressure will help me to write those sometime soon. But yeah, that's about it. Okay, that's

spk_0:   43:22
great. And send me those links that you mentioned afterwards. I'll include them in the show notes so that when the show is out, people can go and find and read those new blog posts you're promising us now.

spk_1:   43:35
Sure. Yeah.

spk_0:   43:37
Okay, great. It's been a pleasure talking to you. Stay safe and take care of yourself.

spk_1:   43:42
Thank you. Same to you. It was a really great pleasure talking to you as well, and I hope to talk to you again someday.

spk_0:   43:48
Yeah, hopefully see you again soon either virtually or in person at one of these events.

spk_1:   43:53
Let's hope for the best. And yeah, stay safe and stay home. Thank you, bye.

spk_0:   44:12
That's it for another episode of Real World Serverless. To access the show notes and the transcript, please go to realworldserverless.com, and I'll see you guys next time.
