Real World Serverless with theburningmonk

#16: Meta-programming Lambda functions with Tom Wallace

June 17, 2020 Yan Cui Season 1 Episode 16

You can find Tom as @tomincode on Twitter, and you can find his blog at

For more stories about real-world use of serverless technologies, please follow us on Twitter as @RealWorldSls and subscribe to this podcast.

Opening theme song:
Cheery Monday by Kevin MacLeod

Yan Cui: 00:12  

Hi, welcome back to another episode of Real World Serverless, a podcast where I speak with real world practitioners and get their stories from the trenches. Today I'm joined by Tom Wallace, who is a lead developer at DevicePilot. So welcome to the show, Tom. 

Tom Wallace: 00:28  


Yan Cui: 00:29 

So I've known you for a little while from the fact that you answer pretty much all the questions on the serverless forum. You answer a lot of questions, you help a lot of people there. So tell us about DevicePilot and what you've been doing there.

Tom Wallace: 00:46  

So DevicePilot, we're a service management and assurance platform for connected devices, which means that I spend most of my time processing the scruffy and complex data that comes from the Internet of Things, and helping companies work out exactly what faults or opportunities they need to focus on next. My role at DevicePilot is the lead developer, although there are actually only three of us, we're a really small startup. So that means that I've had the guiding hand in pretty much every architectural decision we've made.

Yan Cui: 01:18  

Okay, so what have you been building with serverless at DevicePilot? Tell us a bit about some of the projects, some of the services you built?

Tom Wallace: 01:26  

Sure. So DevicePilot these days is serverless all the way down, which is good. I mean, you can tell what day it is just by looking at our bill. We use AppSync for our main API layer. Probably the most interesting thing that we do is that our whole analytics pipeline is essentially Lambda MapReduce. So every time someone asks us a question, we fire up, you know, 1000 or so Lambdas to read 1000 or so files from S3. We process them all, and then we aggregate that all back up to a single answer. And that's allowed us to do some quite interesting ad hoc analysis and adapt very quickly to new requirements from our users as they discover them.
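The fan-out and aggregate shape Tom describes can be sketched roughly as follows. This is a minimal illustration, not DevicePilot's engine: `map_reduce` and `fake_worker` are hypothetical names, and the worker is passed in as a plain callable so the sketch runs without AWS. In a real system that callable would wrap something like a synchronous Lambda invocation that reads one S3 object and returns partial counts.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def map_reduce(
    keys: Iterable[str],
    invoke_worker: Callable[[str], Dict[str, int]],
    max_parallel: int = 1000,
) -> Dict[str, int]:
    """Fan out one worker call per file key, then sum the partial counts."""
    key_list = list(keys)
    if not key_list:
        return {}

    # Map step: one call per file, all in flight at once (up to max_parallel).
    with ThreadPoolExecutor(max_workers=min(max_parallel, len(key_list))) as pool:
        partials = list(pool.map(invoke_worker, key_list))

    # Reduce step: merge every partial result into a single answer.
    totals: Dict[str, int] = {}
    for partial in partials:
        for name, count in partial.items():
            totals[name] = totals.get(name, 0) + count
    return totals


def fake_worker(key: str) -> Dict[str, int]:
    """Stand-in for a worker Lambda: counts one event per file, grouped by key prefix."""
    device_type = key.split("/")[0]
    return {device_type: 1}
```

Calling `map_reduce(["pump/1", "pump/2", "valve/1"], fake_worker)` merges three partial results into one dictionary of totals per prefix.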

Yan Cui: 02:07  

So that's quite interesting there. Because normally, this kind of thing you would do with Athena, which kind of does the whole MapReduce thing for you and gives you a nicer SQL syntax. How come you guys built your own engine for doing that?

Tom Wallace: 02:21  

I would love to use Athena. But Athena very much requires SQL-like data. And the truth is that most of the sort of telemetry you get off IoT systems is not very SQL-like. You don't get a full row every time; most of them only report on change, or you have to join that with slow-moving metadata. And lots of queries will be about the duration matching something, or detecting when a device hasn't reported for 30 or 40 seconds or, you know, five minutes, whatever your timeout is. So quite a lot of the problems that you get in IoT are actually solved by running through a stream and modelling how the devices are behaving during that stream, and then re-aggregating back up again. So, alas, we ended up having to build our own.
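To make the "device hasn't reported" case concrete, here is a small sketch of why it suits a pass over the stream better than row-by-row SQL: the answer lives in the gaps between rows. The function name `find_silences` and its parameters are assumptions for illustration only; timestamps are taken to be seconds, sorted ascending, for a single device.

```python
from typing import Iterable, List, Tuple


def find_silences(
    timestamps: Iterable[float],
    timeout: float,
    window_end: float,
) -> List[Tuple[float, float]]:
    """Return (start, end) intervals during which the device counted as silent.

    A device becomes silent `timeout` seconds after its last report and stays
    silent until its next report, or until the end of the query window.
    """
    silences: List[Tuple[float, float]] = []
    previous = None
    for ts in timestamps:
        if previous is not None and ts - previous > timeout:
            silences.append((previous + timeout, ts))
        previous = ts

    # The stream may simply stop: check the tail against the window end.
    if previous is not None and window_end - previous > timeout:
        silences.append((previous + timeout, window_end))
    return silences
```

With reports at 0, 10, 100 and 110 seconds, a 30 second timeout and a window ending at 200, the device is silent from 40 to 100 and again from 140 to 200.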

Yan Cui: 03:12  

So in this case, how do you go about orchestrating all of these MapReduce jobs? Do you build some custom thing or are you using Step Functions or something like that?

Tom Wallace: 03:22  

Actually, we just use Lambda all the way down. It can be a bit of an anti-pattern, but it works for us. We start with the master Lambda, which calls 100 or so Lambdas running the processes, which in turn finish processing the files and then report back to the master Lambda. We did have a look at Step Functions. But our real advantage, and what we see in Lambda, at least with the MapReduce, is the ability to go from zero to 3000 in no time at all, whereas actually we found with Step Functions that they don't like getting that parallel in a single step.

Yan Cui: 03:56  

Yeah, that's true. But I guess the trick there is that now you've got this thing that potentially has to run for a very long time, and then you can run into the Lambda timeout issues if you've got the master Lambda function having to wait for essentially the slowest worker Lambda function that you invoke. Is that something that you guys have ever run into, or something that you have some plans to work around?

Tom Wallace: 04:20  

It's something that we periodically look at. Actually, the biggest trouble we had was successfully calling Lambdas. Anyone who's been on the receiving end of a spike in Lambda realises that although Lambda claims to be able to scale, I think it's 3000 in the first minute where we are and then 1000 every minute onwards, you'll get a lot of 429s or slightly weird errors back from Lambda, which you'll have to automatically retry on while they're desperately scrambling around for the scale. So we had to build in retry logic around the AWS SDK for that. But in all honesty, once you've fired up 1000 Lambdas, having one Lambda run for five minutes really isn't much of an issue. And actually most of our queries return in about 20 seconds. I mean, when you've got 1000 Lambdas processing files, well, S3 pulls at about 50 MB a second, so you can get through most questions very quickly.
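The kind of retry wrapper Tom mentions might look something like the sketch below. It is not DevicePilot's code: `Throttled` is a hypothetical exception standing in for whatever 429-style error the SDK raises, and exponential backoff with full jitter is one common choice rather than the one they necessarily made.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class Throttled(Exception):
    """Hypothetical marker for a 429-style throttling response."""


def call_with_retry(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on Throttled with capped exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(max_attempts):
        try:
            return call()
        except Throttled:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttling error
            # Full jitter: sleep a random time up to the capped exponential delay,
            # so a thousand callers do not all retry at the same instant.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))

    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the wrapper can be exercised without real waiting; in normal use the default `time.sleep` applies.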

Yan Cui: 05:13  

Yeah, that 3000 limit, the 3000 initial burst limit, is also based on region as well. I think depending on the region, that might be as low as 500. And the step up after that is only 500 per minute as well. We ran into that quite a lot of times in the past when we were doing load testing at DAZN. I guess for you guys it's probably okay, because you don't have too many other things running at the same time that use the same pool of concurrency that you have.
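As a back-of-the-envelope illustration of the numbers in this exchange, the sketch below computes how much concurrency is available some minutes into a spike, given an initial burst and a per-minute step. The function name and defaults are assumptions that simply mirror the figures quoted here (a burst of 3000 or 500, then 500 per minute); the real limits depend on region and account and may have changed since this was recorded.

```python
from typing import Optional


def concurrency_ceiling(
    minutes_elapsed: float,
    initial_burst: int = 3000,
    step_per_minute: int = 500,
    account_limit: Optional[int] = None,
) -> int:
    """Concurrency available after a spike has been running for some minutes."""
    whole_minutes = max(0, int(minutes_elapsed))
    ceiling = initial_burst + step_per_minute * whole_minutes
    if account_limit is not None:
        # The account-wide concurrency limit caps everything else.
        ceiling = min(ceiling, account_limit)
    return ceiling


# Four minutes into a spike in a region with a 3000 burst:
print(concurrency_ceiling(4))                      # 3000 + 4 * 500
# The same spike in a region where the burst is only 500:
print(concurrency_ceiling(4, initial_burst=500))   # 500 + 4 * 500
```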

Tom Wallace: 05:42  

Well, yeah, when we first set it up, and we were testing it, we had everything in our production account. And of course, that meant that every time something really big happened, bits of our architecture would just randomly start failing, as we would lose capacity in our Lambda pool. So we now have a separate account for query, which solves that problem, because it doesn't matter if query exhausts itself, because it can just back off and retry. And then that doesn't affect all of the other bits of our infrastructure. And that's definitely a pattern I see around the place, where people will have to set up separate accounts to manage different bits of limits and how they want bits of their infrastructure to fail. I think we had high hopes when AWS announced concurrency limits, but the reserved concurrency is sort of exactly the wrong way around for us. We would like to have said, ensure this doesn't go over the capacity of 1000 or so. But we wouldn't want to just constantly reserve that 1000 for one of our Lambdas.

Yan Cui: 06:41  

Yeah, so many issues with that reserved concurrency thing. It's great that you can use it as a limit, but then it also eats up your regional concurrency limit. The fact that it acts as both is just weird. And also the name, reserved concurrency, is just confusing. People thought it means that you're always going to have five concurrent Lambda functions running all the time. And then they came up with provisioned concurrency, which actually does that. Yeah, I've had so many conversations where I have to correct people in terms of what reserved concurrency actually does, just because the name has so many connotations around it. So this whole MapReduce thing you've been doing is, I would say, kind of unusual. It's fun, and also I have seen quite a few of your talks; you have done quite a bit of experimentation around metaprogramming when it comes to Lambda. I've actually used some of your ideas in one of the client projects I've worked on in the past. Can you tell the listeners a bit about some of these ideas you've had around metaprogramming for Lambda?

Tom Wallace: 07:48  

I did see that you used a little bit of metaprogramming with your Kinesis stream processor was that?

Yan Cui: 07:55  

That's right. I did use it for the Kinesis stream, to control the batch size dynamically based on the response time and error rate we get from the downstream.
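One way such a controller could decide the next batch size is sketched below. This is an assumption-laden illustration, not the project's actual logic: the halve-on-trouble, grow-slowly rule and all thresholds are invented for the example, and `next_batch_size` is a hypothetical name. The comment at the end shows where the AWS SDK call to apply the new size would go.

```python
def next_batch_size(
    current: int,
    avg_latency_ms: float,
    error_rate: float,
    *,
    target_latency_ms: float = 500.0,
    max_error_rate: float = 0.01,
    min_size: int = 1,
    max_size: int = 10_000,
    step: int = 10,
) -> int:
    """Shrink fast when downstream struggles, grow slowly when it is healthy."""
    if error_rate > max_error_rate or avg_latency_ms > target_latency_ms:
        proposed = current // 2      # back off multiplicatively
    else:
        proposed = current + step    # probe upwards additively
    return max(min_size, min(max_size, proposed))


# Applying the result from inside the function would use the Lambda API, e.g. with boto3:
#   boto3.client("lambda").update_event_source_mapping(UUID=mapping_uuid, BatchSize=new_size)
# where mapping_uuid identifies the Kinesis event source mapping (assumed known).
```

A healthy downstream at batch size 100 moves to 110, while a slow or failing one drops to 50.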

Tom Wallace: 08:05  

I mean, that's definitely one of the use cases I would have for metaprogramming a Lambda. What I mean by metaprogramming Lambda is that, because of the AWS SDK, from within a Lambda you can alter the configuration of a Lambda. And you can even redeploy the code of a Lambda. And as most Lambdas are in scripting languages, it's quite trivial to provide a new set of code for that Lambda function or for a new Lambda function. And because most of a Lambda's behaviour is actually defined by its configuration, i.e. what event source it responds to and how it responds to that event source, you actually have a huge array of tools available to you. At DevicePilot, we don't use a huge amount of it, because obviously it adds complexity. But we use it on our very front edge Lambdas. We ingest a lot of data, we have to ingest it very quickly, and we have to validate it against the schema very quickly. And to do that we actually save the schema inside the validation Lambda, so there's no lookup time. But of course, that means that we're internally caching, so every time the schema changes, we effectively just automatically redeploy that Lambda to pick up the latest changes. And that's just sort of an example of where you can use metaprogramming to get that sort of edge or, you know, to be able to do user bespoke things.
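A rough sketch of the "schema baked into the code" idea follows. Everything here is illustrative: the template, the toy `required` fields check, and the `build_package` name are assumptions, not DevicePilot's validator. The sketch only generates the deployment zip in memory; the comment at the end shows the SDK call that would push it.

```python
import io
import json
import zipfile

# Source of the generated Lambda. The schema is embedded as a literal, so the
# validator never looks it up at runtime.
HANDLER_TEMPLATE = '''import json

SCHEMA = json.loads({schema_literal!r})


def handler(event, context):
    missing = [field for field in SCHEMA["required"] if field not in event]
    return {{"valid": not missing, "missing": missing}}
'''


def build_package(schema: dict) -> bytes:
    """Render the handler with the schema inlined and zip it in memory."""
    source = HANDLER_TEMPLATE.format(schema_literal=json.dumps(schema))

    info = zipfile.ZipInfo("handler.py")
    info.compress_type = zipfile.ZIP_DEFLATED
    info.external_attr = 0o644 << 16  # make the file world-readable once extracted

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(info, source)
    return buffer.getvalue()


# On a schema change, the redeploy step would be along the lines of (boto3):
#   boto3.client("lambda").update_function_code(
#       FunctionName="validate-telemetry",   # hypothetical function name
#       ZipFile=build_package(new_schema),
#   )
```

The trade-off is the one discussed in the conversation: no lookup on the hot path, at the price of a redeploy whenever the schema changes.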

Yan Cui: 09:27  

Yeah, I've never had to use that particular technique. But I do see how that can be really powerful. Essentially, you're creating this dynamic in-memory cache that's always going to be there. And it's not tied to the lifetime of a Lambda container either, because you're just rewriting the whole code to create new cached values. But it's quite interesting that, you know, you've had to resort to this kind of unusual optimisation for your use case. But I do think that's one area that I'd like to see more people experiment with as well. Because I do think there are probably other potential use cases out there. Even things like dynamically optimising the function's memory allocation, CPU and a bunch of other different things as well, which can be quite useful in some, I guess, latency-sensitive or performance-critical scenarios.

Tom Wallace: 10:24  

Yeah, I definitely agree, there's a huge scope out there. Something that I'm really interested in: because it doesn't really, I mean, limits aside, cost you to have 200 different Lambdas just sitting around the place not being executed, I think there's a real scope for people to start building applications where you can deliver per-user or per-customer behaviour. Because, you know, you can dynamically create 200 or so Lambdas that do that slightly specific thing that your 200 or so customers want. And because it doesn't cost you anything to have them lying around, only when they're executed, it's a sort of completely different paradigm than trying to manage 100 or 200 different servers with 200 different versions of your application. And I think that's something that I'm quite interested in sort of experimenting with and looking at. And we do little bits of that in DevicePilot, like nowhere near the scale that I've just said. But because we've got our bespoke querying engine, it's very easy for us to have two or three different versions or branches of our query processor that can do just that slight little bespoke work for our customers, without having too much of an overhead of managing the infrastructure associated with that.

Yan Cui: 11:35  

We've seen quite a few, I guess, IoT platforms that allow customers to run bespoke functions. It's not exactly the same metaprogramming we're talking about here, but it definitely has that element of just creating functions dynamically, and just having them lying around, so that they don't really cost you anything when you don't run them anyway. But you create these bespoke compute functions for your customers, you know, as and when they need something, so that you don't have to manage or maintain all the code bases associated with each of these functions. And from there, it's a small step up to say, okay, you know what, we can also just dynamically generate functions that have got different behaviours for each of the customers.

Tom Wallace: 12:19  

Yeah, I mean, that's exactly something that Lambda would be pretty good at as long as you are able to sandbox your environment.

Yan Cui: 12:26  

Are there any other examples of interesting use cases for Lambda that are perhaps not commonplace, that you think have got a lot more potential than we have seen so far?

Tom Wallace: 12:37  

I definitely think so. And I wouldn't say that I know them all. I've got a long list of failed call-for-papers submissions where I've tried to come up with new ones. I mean, one thing I found myself musing on with Lambda is really an AWS optimisation. You know, if we all moved away from servers, and we all went into this sort of compute chain, one of the things about renewable energy that people complain about is that most of it is quite tidal or seasonal, or depends on the time of the day or the wind that you get. And I mused that we could definitely have asynchronous and synchronous paths in Lambdas where, you know, if you just need something done, and you don't really mind when it gets done, you could be on a queue that's waiting for it to get windy enough in Ireland for there to be enough compute capacity for the next thing to be done. So that's one thing that I've been sort of musing about with this sort of rise of serverless, which I sort of categorise as on-demand work.

Yan Cui: 13:45  

So that's actually quite interesting, the whole green angle, because that's something that Paul Johnson has talked about for a while now. And I think a few other people in the scene around containers and serverless were also talking about the same thing as well. If you look at the last report I saw, I think that was from 2017 or 18, there was something like, you know, only 10% of the servers in all these different data centres being utilised. And so most of the energy and cooling that has gone into maintaining the servers is just going to waste, at least 90% of the time. So having something that's more bespoke, that's on demand, computation on demand, like Lambda, should have a very big impact on the carbon footprint of these data centres. And it's something that I do wish, well, I guess it's kind of hard for Amazon or the big cloud vendors to talk about this as a selling point because, I guess, as a customer, you probably care about the environment to some extent, but you care about your own sort of bottom line way more. So I guess it's quite hard for them to push serverless from the green angle. But I do think there is some potential there to optimise for better utilisation of the clean energy that we have.

Tom Wallace: 15:01  

I mean, I guess the option is spot Lambdas. I guess that's what I'm suggesting here: you can pay whatever it is for your Lambda to be executed immediately, or you can pay a tenth of that, and it will get executed when there's room, when there's capacity. And I suspect that would, you know, allow them to optimise better for that.

Yan Cui: 15:24  

Yep. So similar to ads, right. That's how the whole ad platforms work: you put your bid down, your maximum price, and then when the price drops down to your level, your ad is going to run and be shown to people. And I guess you can probably apply the same mechanism to executing those asynchronous tasks that you have in the backlog.

Tom Wallace: 15:46  

Yeah, yeah. I mean, it's, I don't see AWS bring it together anytime soon. But I see that as an opportunity. And I guess it's just part of a sort of wider thing, which I find myself quite interested in, which is, like a, I have my infotel theory that there is sort of three steps to serverless. There's one, just appreciating what and why serverless. And there's sort of part two, where you try and work out how you do what you already do, but in a serverless way. And I think there's sort of stage three, which is emerging now, which is when people try and find the sort of unique opportunities in serverless. And, you know, I touched on it with metaprogramming, but I think it's sort of more of a larger thing of seeing cloud as your programming language, you know, you use SQS instead of an internal queue function, you know, you store stuff on S3, you have that sort of EventBridge pattern really comes out of treating cloud, like your first class programming language. And yeah, you might have some other code around the place, but you're just bolting together these services to create your application.

Yan Cui: 16:50  

Yeah, I love that, the cloud is your programming language. There's definitely some truth to that. And certainly I feel nowadays most of what I do is just connecting things together, writing as little code as I can get away with, and then putting it into a Lambda function. So I don't have to manage the infrastructure and everything else that comes with having servers in my infrastructure. And that also means that I can rely on all the expertise that Amazon has to keep those things running while I'm sleeping safe and sound.

Tom Wallace: 17:20  

Yes, no, definitely. I mean, I've long been happy that I no longer have to have an opinion between Nginx and Apache or, you know, everything that goes with that. I just want to write the smallest amount of code to achieve my product, achieve my feature.

Yan Cui: 17:37  

So in terms of the third step, the unique opportunities that Lambda offers. Along a similar train of thought, Tim Wagner talks about how you can use Lambda essentially as your own personal supercomputer, because, similar to the use case you talked about, you can bring up, I don't know, 3000, maybe more, executions running concurrently, just like that, in parallel, so you can get massive parallelism very, very cheaply. That's another area where I think it would be quite interesting to see how academia, the universities and other large institutions can potentially run a lot of their current workloads on Lambda, especially if Amazon in the future relaxes some of these limitations around how quickly it will scale. I think the 500 per minute limit, even though it's a hard limit, if you've got the right use case, I'm sure it's one of those things that Amazon will be happy to negotiate on a per-customer basis as well, because they're normally quite good at these kinds of things, where, sure, most customers don't need to go up by, I don't know, 5000 concurrent executions per minute. But, you know, say you are one of the big institutions, and you've got some unique use cases where this could be a really good fit. And for them, I think that's just an arbitrary limit. I don't think there's any technical reason why they can't go beyond 500 to whatever number you need anyway.

Tom Wallace: 19:02  

Yeah. And who says it really needs to be Lambda? I mean, I think what AWS is good at doing is infrastructure building blocks, and a building block where you just hand over a machine learning workload or, you know, a huge computational workload, and they run it, and then they come back to you with the results. I mean, that's well within their capacity, and it would be nice to see them have a look at that. And I guess they are kind of looking at it with products like Glue and everything, but at the moment they seem too close to sort of script wrappers around EC2 to be of much practical use.

Yan Cui: 19:39  

Yeah, and that might just be the first step of moving towards that ultimate goal, you know, just give us some of your machine learning code, and we'll run it for you. So I want to switch gears slightly. One of the things that I've seen you do a lot with your own time is answering questions on the serverless forum or the Slack channel. What are some of the most common questions that people are asking?

Tom Wallace: 20:03  

I think, really, the two questions, or the sort of two families of questions I see, one is the sort of why serverless question, you know. I remember talking to someone, and he'd done all the math, and could demonstrably prove that running a server on EC2 was much cheaper than having, you know, serverless functions attached to API Gateway. And really, it's trying to get away from that mindset of looking at, not capex, but just the raw infrastructure cost, and to look at how expensive that solution actually is when you factor in all of the management and your lack of agility. Once you commit to an EC2 server, even if you only commit for five minutes, you commit to configuring it, you commit to making so many decisions, like which operating system, etc., etc., that serverless just takes away from you. And that's worth an incredible amount of money. So, there's that family of questions. And then the other questions I see quite a lot of are, you know, I know how to do this on my own local machine using Express or whatever, how do I do it in serverless? And I think that's just a factor of the way that, because serverless is built from these tiny building blocks, you have to understand a lot to get started, like an incredible amount. It's very hard to get much beyond the Lambda hello world without facing the horrific documentation that is API Gateway, or, you know, the monstrosity that is Kinesis. And the second you want someone to log in, that's it, you're in Cognito. So I tend to find most of the problems are really just sort of trying to translate what people think of as sort of simple and monolithic problems into the serverless services.

Yan Cui: 21:55  

Yes, it's funny, the whole cost thing, I've seen that a lot as well. I've been asked a lot of questions about that. And I've written a couple of blog posts around that. It's funny that you look at your AWS bill, where you're trying to save a couple of bucks a month, maybe 10, 100 bucks a month. And to do that you have to hire an engineer for, I don't know, 100 grand a year. So I mean, the math just doesn't add up when you look at the total cost of ownership. The cost of AWS is just a small fraction of that, and the cost of the brain power of engineers is far more expensive. And the more engineers you have, the more support structure you need: office space, paying their pension. One person who's, I don't know, on a grand a month, could end up costing twice, three times that once you account for everything else around them. It is crazy that all we look at is just that one number, because it's easy to measure, it's easy to see. Every month you get a bill from AWS, you see how much Lambda is costing you versus how much EC2 is costing you, and you're not looking at the bigger picture. So the other thing you mentioned, in terms of how you transition from that monolithic mindset of doing things, where everything has to be run locally. Have you seen some strategies that help ease the transition for people who are still clinging on to that mindset? With a paradigm shift it's difficult to just pull people along in one step; you have to, I guess, be kind of gentle and help them along. Have you found some strategies that kind of work?

Tom Wallace: 23:28  

Well, I think getting people to go slow. I mean, I've seen things where people sort of start a new service and decide to do it in Lambda, or using serverless technologies. And that's quite good, because you're completely greenfield. And, you know, you've got that time to experiment and set down your requirements and look through the AWS services. But also look at our own transition to serverless. You know, we were first serverless when I took our monolithic application, which was MEAN stack, and put it in a Lambda. And then we were serverless. I mean, we weren't; it was bloody awful, it was really slow. But that was the start, you know, we were suddenly running in a Lambda. And we were only paying per execution. And then when we started getting hammered with ingestion, I could take that same Lambda and put it on the receiving end of a Kinesis stream, therefore learning about Kinesis and the streaming and the buffering you get with that. And suddenly we had horizontally scalable ingestion. And of course now, you know, we've transitioned further and further. We started breaking that up into, well, initially API Gateway, but these days AppSync. And I think people just need to take it incrementally. There's so much to learn. And there's so much that, no matter how many times people tell you, you've just got to experience, you know, the way that Lambdas are deployed and the way that you monitor and instrument your Lambdas and your serverless services, because it's such a huge shift. And I think you've just got to take it at your own pace.

Yan Cui: 25:02  

Yeah, well, I guess the good news is that experimenting with Lambda on AWS is just so cheap. There's just not a lot of overheads in terms of having all these different functions and trying something out. And then when you're done, either delete the stack, or you can leave them around, and they're not going to cost you anything.

Tom Wallace: 25:17  

Well indeed.

Yan Cui: 25:19  

In terms of, I guess, that transition, you talked about your personal journey at DevicePilot. What are some of the most challenging aspects of that transition for you guys?

Tom Wallace: 25:30  

I mean, I guess we're lucky that we're quite a small team. So we could sort of communicate and work together and learn together. I think it is definitely that shift in mindset, and the number of technologies you have to use. And I owe my serverless adoption almost entirely to the Serverless Framework. You know, if I couldn't have just done a very quick Serverless Framework tutorial to get, you know, API Gateway and Lambda all set up without really having to learn them from the inside out, I probably wouldn't have adopted Lambda as quickly as we did.

Yan Cui: 26:07  

Are there still any sort of platform limitations or constraints that are making your life difficult? Maybe you have some AWS wish list items that you want to share. And hopefully someone from AWS is listening.

Tom Wallace: 26:21  

Yeah, I mean, I'm a cynical believer that there are only bad platforms and platforms you haven't used yet. I mean, I have an AWS wish list as long as you want. I think there are two sorts of core things. One, I would love more transparency from AWS, especially being a small startup on the bottom rung. You know, my point of contact in AWS rotates every three weeks, so building a relationship is pretty much impossible. And knowing what's going on, I mean, there's a running joke that you implement something, and next week AWS announces it. And just having that foresight of when things are going to be delivered, or what's next on the roadmap, would make a huge difference to the architectural choices that you make, even in the, you know, quite agile world of serverless. We would quite like Timestream to come out, which was announced maybe two and a half years ago now. And we've yet to find anyone who's even seen it. So I'm beginning to believe it's a work of fiction. So, you know, I definitely think AWS could afford, being the big agile market leader that it is, to be much more transparent about the bugs on its platforms and the features that are coming. Something that really jumps to mind is when the Node 10 execution environment for Lambda was quite broken. And they were definitely fixing it. But I stumbled into upgrading all of our Lambdas and ran into no end of trouble, which I then later discovered everyone knew about, but, you know, there was no way of AWS informing me of that, or letting me know when it was fixed. So we just didn't upgrade until, you know, seven months afterwards, when I could find a blog post from someone saying that it worked.

Yan Cui: 28:09  

So on this point of transparency, I do echo everything you've just said, even though in my position I've been quite lucky. Being an AWS Serverless Hero means that I get a lot more access than most people do. But even then, you know, you get drip-fed information. And there are a lot of things that I know they're working on. But again, you kind of wish that you knew what's coming. And even as a big customer for AWS, sometimes you only find out about services in the pipeline if you just happen to be talking to the right person about some problems that you're having. Otherwise, there's no, I guess, regular roadmap update, unless you are maybe a bigger class of customer. Maybe then you get that roadmap. But they do move really quickly. I guess, you know, you have to give them that. And sometimes maybe you wish they would do a better job of checking things out before they release them, like the whole Node 10 thing that was notorious. I think Michael Hart was the one that did a lot of that analysis. And nowadays every time a new Node runtime comes out, he's kind of the one that everyone looks to to find all the bugs so we don't have to.

Tom Wallace: 29:20  

Yes. Although, you know, on the flip side, for all my complaining about their transparency, I have to say AWS, out of all of the cloud providers, are also the most accessible in a weird way. Like, it's actually very easy. At most of the serverless events, you know, you'll be able to bump into someone who can at least point you in the way of someone in AWS who will be able to answer your question. And I have found even very senior level people to be quite approachable, on Twitter, on various forums and, you know, even in person at events. So, you know, for all the lack of transparency, depending on who you know, it is quite easy to meet those people that you should know.

Yan Cui: 30:00  

Absolutely, they are really, really good at interacting and getting feedback from the community and feeding it back into the team. I mean, as a customer, I was just talking to someone else recently as well about the difference between AWS and, say, Google, where with AWS, you know, if you've got a problem, you complain to someone, and within a couple of days you can be talking to the team behind the service, to give your feedback directly to the team. And nowadays, if you go to the AWS console, and if you are on a console for, I don't know, SQS or Lambda, you can click the feedback button, and your feedback goes straight back to the team. And that's the sort of thing that you don't get with, say, you know, Google, whereby you will be talking to a solution architect, and they don't even have access to the development team behind the services. So they don't have that bridge of communication from the customers to the actual teams that are building services. So Amazon is great at doing all of this. And plus they've got this whole army of developer advocates and evangelists and solution architects that are constantly in touch with customers and getting feedback, and helping them guide the roadmap for the services and features and so on. So yeah, I definitely think that's one of the biggest strengths that Amazon has, and it probably explains why, you know, they're such a leader in this particular field. That, and the whole, I guess, customer-centric principles that they have.

Tom Wallace: 31:23  

Yeah, I guess another thing that's on my wish list really is for them to focus more on smaller infrastructure blocks. Like, I find that AWS is at its best when it's delivering infrastructure, and it's at its worst when it's delivering a product. So, you know, the infrastructure of Lambda is fantastic, you know, you get that scalability, you get that capability. But the product of actually deploying and making Lambdas is actually quite terrible. I mean, I hear good things about CDK. But SAM seems rocky at best. And then you get this sort of great product, which is the Serverless Framework, which can, you know, deliver serverless applications and a sort of quite easy developer experience. And I find that again with, you know, CloudWatch and Cognito. You know, CloudWatch is a fantastic piece of infrastructure for collecting together logs, but it's just a building block that you can then build your monitoring solution off of, and the product really is Datadog. And same with Cognito, theoretically, you can put together, you know, a world class authentication system using it, but you've got to do all the work, you know, it's just the infrastructure behind it. The real product is something like Auth0, which comes with opinions and comes with the user interface and the guidance that you need to really deliver working authentication. So I think my wish list really is for them to... They seem to be doing quite a lot of quite high level product things, which they then deliver quite badly. And I would really love them to focus a bit more on infrastructure things. You know, there's tonnes of patterns in serverless, like, you know, using SQS to back off and retry, or circuit breakers, that everyone ends up having to implement. And just having that infrastructure block, wrapped up in, you know, a next piece of infrastructure, like the same way they repurposed CloudTrail to become EventBridge, you know, that same level of building blocks rather than products, I would like them to focus much more on.
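
As a rough illustration of the two patterns Tom mentions here, the following Python sketch shows a minimal in-memory circuit breaker and an exponential back-off delay of the kind you might use when re-queueing a message. It is not taken from DevicePilot's code: the names `CircuitBreaker`, `CircuitOpenError` and `backoff_delay` are hypothetical, the thresholds are arbitrary, and the 900-second cap is chosen only because that is the maximum per-message delay SQS allows. In a real Lambda deployment the breaker state would need to live somewhere shared, since memory is only kept per execution environment.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Minimal in-memory circuit breaker (illustrative only, not thread-safe)."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold  # failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; skipping call")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 900.0) -> float:
    """Exponential back-off in seconds for a 0-based attempt number, capped."""
    return min(cap, base * (2 ** attempt))
```

A caller would wrap each downstream request in `breaker.call(...)` and, on failure, re-queue the work with a delay of `backoff_delay(attempt)` seconds.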

Yan Cui: 33:18  

Oh, yeah, that whole CloudWatch thing. And Cognito is one of those services that are so powerful, but so poorly documented, and even most people I know from within AWS have trouble understanding how to work with Cognito as well. And also, there's just so many small missing pieces still, like how to do group based authentication, all of that you've got to build yourself. I guess AppSync kind of does a good job, in the fact that you can quite easily limit access to certain mutations by user group in Cognito. But then if you want to use API Gateway, you kind of have to build that layer yourself. And I've seen a lot of people that are building multi-tenant applications, and they're having to do that themselves. And there are just so many, I guess, yeah, like you said, a lot of building blocks, it feels too low level, too primitive. And then there's a lot of the common use cases where you wish they'd kind of just build something on top of that, that you can just use. And that kind of is a goal, right? You know, something like Cognito, you can just use it. That's pretty much everything that I wanted to ask you about. Is there anything else that you'd like to tell us and listeners about personal projects or things that maybe DevicePilot is doing?
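
To make the "build that layer yourself" point concrete, here is a minimal Python sketch of a group check inside a Lambda handler behind API Gateway. It rests on assumptions that are not from the conversation: a REST API with a Cognito user pool authorizer, where the token claims arrive under `requestContext.authorizer.claims` and `cognito:groups` is presented as a comma-separated string (the code also accepts a list, in case your payload differs). The group name `admins` and the helper names are hypothetical, so check them against your own event payloads.

```python
from typing import Any, Dict, Iterable, List


def groups_from_event(event: Dict[str, Any]) -> List[str]:
    """Extract Cognito groups from the authorizer claims, if present."""
    authorizer = (event.get("requestContext") or {}).get("authorizer") or {}
    claims = authorizer.get("claims") or {}
    raw = claims.get("cognito:groups", "")
    if isinstance(raw, list):
        return [str(g) for g in raw]
    # Assumed shape: "groupA,groupB"
    return [g.strip() for g in str(raw).split(",") if g.strip()]


def in_any_group(event: Dict[str, Any], allowed: Iterable[str]) -> bool:
    """True if the caller belongs to at least one of the allowed groups."""
    return bool(set(groups_from_event(event)) & set(allowed))


def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    # "admins" is a hypothetical group name used for illustration.
    if not in_any_group(event, {"admins"}):
        return {"statusCode": 403, "body": "Forbidden"}
    return {"statusCode": 200, "body": "OK"}
```

The same check could be moved into shared middleware or a custom authorizer so that each handler does not repeat it.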

Tom Wallace: 34:36  

Yeah, so we're not hiring at the moment, but, you know, if you're a company that has connected devices, and you have service assurance targets, or you want to get more insight from your data, we'd love to come and help you at As for me personally, well, once all of the pandemic is over, you'll be able to find me at events again. And I am always happy to chat. I’m @tomincode on Twitter, and at gmail if you want to just send me an email. I'm always happy to talk serverless and be corrected on everything that I get wrong.

Yan Cui: 35:14  

And I guess if someone goes to the Serverless Framework’s Slack channel and asks a question, you're probably gonna pick it up and answer it as well, right?

Tom Wallace: 35:23  

I'll try. I'll try. These days, the questions are getting too hard. I like the days when people just didn't know how to write a Lambda and that was the answer.

Yan Cui: 35:33  

Alright, man. It's been a pleasure talking to you. Take care and stay safe.

Tom Wallace: 35:38  

Cool. You too. You too. 

Yan Cui: 35:39  

Bye, bye.

Tom Wallace: 35:40  


Yan Cui: 35:54  

That's it for another episode of Real-World Serverless. To access the show notes and the transcript, please go to And I'll see you guys next time.

What have you been building with serverless?
On the 3000 initial burst capacity limit
What are the use cases for meta-programming with Lambda?
Are there any untapped use cases for Lambda that you'd like to explore?
"Cloud as your programming language"
What are the most common questions people are asking about serverless?
What's on your #awswishlist?