You can find Paul on Twitter as @paulswail and on his website, serverlessfirst.com.
For more stories about real-world use of serverless technologies, please follow us on Twitter as @RealWorldSls and subscribe to this podcast.
To learn how to build production-ready serverless applications, check out my upcoming workshops.
Opening theme song:
Cheery Monday by Kevin MacLeod
Yan Cui: 00:13
Hi, welcome back to another episode of Real World Serverless, a podcast where I speak with real-world practitioners and get their stories from the trenches. Today I'm joined by Paul Swail, otherwise known as Mr Serverless Testing.
Paul Swail: 00:24
Thanks, Yan. Yeah, it's great to be here, thanks for having me.
Yan Cui: 00:30
So Paul, before we get into your upcoming book on serverless testing, which is your niche now, your focus, which I think is great, can you tell the audience who you are, what you've been working on, and how you got into serverless?
Paul Swail: 00:43
Yeah, sure. So I'm an independent Cloud consultant based in Belfast in Ireland. I've been working in software for about 20 years, as a software engineer and then an architect, been independent for about eight or nine years, and been using serverless for three years. I got into serverless mainly via the contract market. I'd been doing server-based Express.js apps with big clients, all on AWS, and started to see the benefits; Lambda was starting to get a bit more mature at that stage. And I have a SaaS product as well, a side project, and I started building parts of that with serverless. I felt this was really going to be the future, so I decided to make it the main focus of my contract and consulting work since then.
Yan Cui: 01:37
Yeah, and you've been writing a lot about serverless testing and making that kind of your niche. And I love that, because that's one of the things that a lot of people struggle with as they move into serverless. So you've got a new upcoming book, can you tell us about that?
Paul Swail: 01:51
Yeah, yeah, so it's very early days. I haven't even got a definitive title yet, but it's tentatively named Testing Serverless Backends, so it's mainly backend focused, AWS focused. The idea really came out of a survey I ran last year, where I asked about 150 to 200 experienced serverless developers, what's your biggest pain area working with serverless? And monitoring came out top. So okay, fair enough, and testing was second. But there are already a lot of products within the monitoring space, so testing is something I could really dig into. And just from my own experience working with client developers and client teams who are new to serverless, the question is always, how do you test this? There's a lot that's different about how you test it compared to a standard server-based architecture.
Yan Cui: 02:46
Okay, maybe we can dive into some of these differences that you mentioned. That's very much my experience as well, that, I guess, the failure modes that you have with serverful applications are quite different to those of serverless applications. What are some of the big differences that make people, I guess, struggle to change their mindset when it comes to how they structure their tests? What do they test? And how do they approach mocking and local simulation and so on?
Paul Swail: 03:17
Yeah, I think, just what you mentioned, the different failure modes is key. I think people are used to writing tests based on pre-existing architectures, and they focus on their application code, the runtime code, which in monolithic architectures in particular is probably where most of the risk is, because all the code is talking to each other. Whereas with a serverless application, a lot of that goes into, hopefully you're using infrastructure as code, but that code is more declarative, like YAML, where you're wiring different components together. So the risk profile really changes to being the integrations between these components, but developers still think, I should be writing tests for my code, and they don't really think about the integration parts as much. Yeah, and the whole local vs Cloud testing question, where do you do your development, where do you do your testing, is a hard one to get people to think about, because a lot of developers are used to having everything working on their own machine, especially for monolithic apps. It's great for isolation from other developers, you're not going to have other developers interfering with what you're working on. And it's also for the speed of the feedback loop, so you can just quickly make a change and you don't need to deploy, it's just there on your machine, which is fine, and there are some tools which can help with that for serverless, like DynamoDB Local or LocalStack. But a lot of those integration-related issues that I mentioned, you won't really hit them in local development and testing, so you'll run your tests locally and they'll work fine. But things like IAM permissions, which aren't in play on your local machine, will be in play once it's deployed.
So the tests that you do write, you can probably, for a limited set of services, run them locally, but the confidence that you get from them passing isn't the same. You're still more likely to hit issues as you go to other environments, any sort of Cloud-hosted pre-production environments.
Yan Cui: 05:32
Paul Swail: 07:32
Yeah, this is really interesting actually. I think it was Martin Fowler this week, I saw he tweeted an article about how even unit testing was never that well defined, and he sees two subcategories of it, which I think he called solitary and sociable. I'm not sure if solitary is the right word, but what I think of as unit testing was what he was describing as a solitary unit test, I hope I'm using the right name, where effectively your code is running in proce... I wouldn't strictly say in process, but it's not making any calls to external resources. So if you needed to call, say, an AWS resource via an SDK call, that would be mocked out. That's what I would see as unit testing: testing actual business logic within your application code only, without any integration points in play. Integration testing, again, is one I've struggled with. I do a workshop with people around the different forms of serverless testing, and I'm still not totally happy with the terminology I'm using, but integration testing specifically is testing your specific component, so it is scoped to a certain component, but integrating with another component. That could be your Lambda function talking to DynamoDB. Where there's a bit of a grey area is that I could still invoke my Lambda handler on my own machine and talk to a real DynamoDB table, which is a useful technique for quickly checking that a change to your Lambda function code works, or I could invoke it remotely, and then that's effectively a fully deployed test, so the Lambda is actually running with IAM in play. So there is a distinction between those two forms of integration, whether it's partly running on your machine but talking to real Cloud services, or fully running in the Cloud. Yeah, it's just getting that distinction clear.
I'm not really sure, is that an integration test? End-to-end test doesn't seem right either, because it's still only testing a small part of our use case. But I think the key distinction is whether the system under test is running locally on your machine, at least partly, or running in the Cloud. So what, you said you have a specific definition of that?
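To make the "solitary" idea Paul describes concrete, here is a minimal sketch, all names (applyDiscount, createOrder, the discount rule itself) are made up for illustration: the business logic is pure, and the call that would hit an AWS service is a stub injected by the test, so nothing leaves the process.

```javascript
// Pure business logic, extracted from the handler so it needs no AWS at all.
// The pricing rule here is a made-up example, not from the episode.
function applyDiscount(order) {
  // Flat 20 off orders of 100 or more.
  const total = order.total >= 100 ? order.total - 20 : order.total;
  return { ...order, total };
}

// In production `store` would wrap a real DynamoDB client; a solitary
// unit test injects a stub, so the test never makes an external call.
async function createOrder(order, store) {
  const priced = applyDiscount(order);
  await store.put(priced); // stubbed in the solitary test
  return priced;
}
```

A "sociable" variant of the same test would pass in a real client pointed at a deployed test table instead of the stub, which is exactly the grey area discussed above.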
Yan Cui: 10:04
Yeah, so for me, running your code locally but having it talk to the real AWS services as much as possible, I still call those integration tests. And once I deploy and test the whole thing, I call those end-to-end tests, even though, like you said, it's covering just the back end side of things, and not necessarily the front end and UI. So I guess for my end-to-end test, well, the end is not the whole end. It's not all the way, it's just as far as the API is concerned, this is the end, which, depending on what it is you're building, might be the only end that you need to worry about, because you may be building an API that is not used by any front end, but is some public REST endpoint that other services can use. So effectively you are testing end-to-end, because you're testing against the contract of your API, what you say you would do. So I've classified those as my end-to-end tests, and then of course there are full end-to-end tests that involve the front end. Those are typically written by somebody else, maybe the front end team, maybe the QA team, but me being more of a back end specialist, the sort of tests that I write tend to be focused just on the back end components.
Paul Swail: 11:28
Yep. Yeah, that makes sense, and that is generally the terminology that I use in my workshop, integration and end-to-end for those. But I do get a lot of questions about it, it's never clear, so I'm not totally happy with the terminology there, but yeah...
Yan Cui: 11:43
Yeah. I like that Martin Fowler article. Everyone kind of disagrees on what it means to be a unit test vs an end-to-end test vs an integration test. And I guess one of the key things is that, like you said, you're running your code locally but you're exercising those few integration points with the real services. One of the things I saw a lot in my career is that you've got a bit of code that does some business logic and then calls DynamoDB or something like that. And, like you said, you write a unit test, you want it to be fast, you want it to be self-contained and isolated, or, I guess in Martin Fowler's words, solitary, so you're mocking out everything. Your test is fast, but at the same time you don't get as much confidence, because you don't actually know if your request to DynamoDB is right. Maybe you've got a typo in your query, but your test is just hitting a mock, so you're not really exercising that connection to DynamoDB. You don't know if the table name you've got is correct, you don't know if any of the parameters you're passing to DynamoDB are correct. It gives your test really fast feedback at the expense of confidence that your code is doing the right thing. And if that's the only test you write, then I think that's a problem, because the moment you actually run it, you've never actually exercised that request to DynamoDB.
So the moment you actually run it in the real world, in the Lambda environment, that's when you find that, oh, wait a minute, my code doesn't actually work, because I wrote the code with one set of assumptions about how DynamoDB works, and I baked those assumptions into my mocks in terms of what response gets returned, but I never tested whether my assumptions were actually correct, which I think is the whole purpose of testing, right? You're testing what you assume to be working, because we don't deliberately write broken code, but your test is not really exercising it, it's not really telling you that your assumptions are correct. And all your test is doing, especially when it comes to Lambda, is, I guess we don't write as much business logic anymore, or not complicated logic, since more and more is offloaded to other services that do that work for us. So your test just hits the mock and tests the mock, which is a self-fulfilling prophecy, because you tell the mock what to return. In that case, how do you go about structuring your tests, especially when you've got lots of asynchronous processes with Lambda, and you have EventBridge and SNS involved, which makes life a lot more complicated in this kind of integration testing approach we're talking about? Both from how do you trigger your code, to how do you verify that a bit of code that writes something to a queue or to EventBridge is sending the correct requests?
When you write integration tests for something that writes to DynamoDB, it's easy to assert that the data you wrote is correct: you can just do a get against the table afterwards to make sure that the data you get back is right. But you can't do that with EventBridge, you can't do that with Kinesis or SNS. What's your approach for that then?
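The write-then-get assertion described here could look something like the sketch below. In a real integration test `db` would be a DynamoDB DocumentClient pointed at a deployed test table; the in-memory fake only shows the shape of the pattern, and all names (saveUser, users-test, etc.) are assumptions, not from the episode.

```javascript
// The code under test: a write and a read, with the client injected so the
// same functions can run against the real table in an integration test.
async function saveUser(db, table, user) {
  await db.put({ TableName: table, Item: user });
}

async function loadUser(db, table, id) {
  const res = await db.get({ TableName: table, Key: { id } });
  return res.Item;
}

// Tiny in-memory stand-in for the DocumentClient (not the real SDK),
// just so the round-trip shape is runnable here.
function fakeDocumentClient() {
  const items = new Map();
  return {
    put: async ({ Item }) => { items.set(Item.id, Item); },
    get: async ({ Key }) => ({ Item: items.get(Key.id) }),
  };
}
```

The point of the pattern is the assertion: after the action, read the record back and compare, rather than asserting against what a mock was told to return.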
Paul Swail: 15:05
Yeah, yeah, that's very interesting, and that's a common issue. My current approach, I think, is actually based on a blog article I read from you, Yan, probably about two years ago. So I currently use SQS, say, for EventBridge, for verifying. I guess the scenario we're testing here is checking that an event from a Lambda function gets published to EventBridge, so that is what we want to make sure is happening. Obviously we can't query it, there's no get-event API for EventBridge. So we can temporarily hook up an SQS queue to EventBridge and just let it consume every event, without any filter rule on it, and I put a small retention period on it. So the action of the test is to invoke, say, the Lambda function, or whatever the system under test is that publishes this event, and then separately poll the queue. There are libraries to help with this, but you poll that SQS queue. When I teach that in the workshop, people are a bit like, oh, this is extra infrastructure we need to set up just for testing, and it's something which you wouldn't do in production, you only do it in pre-production test environments. But I've seen that Theodo have released sls-test-tools, which, I haven't used it yet, but it seems to be a library which does that in your actual test code. Say it's a Jest test: it spins up the queue in the background in the setup part of your test, and I think it tears it down afterwards.
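The "separately poll the queue" step Paul describes boils down to a retry loop with a timeout. A minimal sketch of such a helper, `receive` would wrap an `sqs.receiveMessage` call against the test queue in a real test; the function name, timings, and message shape here are all assumptions:

```javascript
// Poll the test queue until a message matching the predicate shows up,
// or give up after a timeout. `receive` returns an array of messages.
async function pollForEvent(receive, matches, { timeoutMs = 10000, intervalMs = 500 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const messages = await receive(); // e.g. one receiveMessage call
    const hit = messages.find(matches);
    if (hit) return hit;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('expected event not seen on test queue before timeout');
}
```

In a test you would trigger the system under test, then `await pollForEvent(...)` and assert on the event's detail, which keeps the eventual-consistency handling out of the individual test cases.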
The way I've done it is separately, as a pre-test-run step in the test environment. I use the Serverless Framework, so the queue is defined conditionally there, just for test environments, but it's not ideal. I think EventBridge announced a feature about six months ago, an archive feature with an API for accessing archived events, and I was thinking, okay, maybe this could be something we could use to query events directly. But I think there's about a 10 minute latency before they become accessible, so it's just not really feasible. But yeah, it's a similar pattern; we're talking about EventBridge here, but SQS, SNS and other Pub/Sub type systems have that same issue, where you can't just query them to check if the data was sent there in the first place.
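The conditionally defined queue Paul mentions could be sketched in the `resources` section of a serverless.yml roughly like this; the resource names, bus name, and retention period are assumptions, not from the episode:

```yaml
# Hypothetical sketch: event-capture queue and catch-all rule that only
# exist in non-production stages.
resources:
  Conditions:
    IsTestStage:
      !Not [!Equals ['${sls:stage}', 'prod']]
  Resources:
    E2ETestQueue:
      Type: AWS::SQS::Queue
      Condition: IsTestStage
      Properties:
        MessageRetentionPeriod: 60   # short retention, as Paul suggests
    E2ETestQueuePolicy:
      Type: AWS::SQS::QueuePolicy
      Condition: IsTestStage
      Properties:
        Queues: [!Ref E2ETestQueue]
        PolicyDocument:
          Statement:
            - Effect: Allow
              Principal: { Service: events.amazonaws.com }
              Action: sqs:SendMessage
              Resource: !GetAtt E2ETestQueue.Arn
    E2ETestRule:
      Type: AWS::Events::Rule
      Condition: IsTestStage
      Properties:
        EventBusName: my-event-bus          # assumption
        EventPattern:
          account: ['${aws:accountId}']     # effectively match everything
        Targets:
          - Arn: !GetAtt E2ETestQueue.Arn
            Id: E2ETestQueue
```

Because the condition is evaluated at deploy time, production stacks never contain the capture queue, which is the trade-off against test-time provisioning discussed next.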
Yan Cui: 17:45
Yeah, that's quite interesting, the approach that Theodo took with sls-test-tools. I think my problem with that is there are a lot of people who just run the test, realise, oh right, I forgot something, Ctrl+C. And then you've left infrastructure that's just sitting there and never gets cleaned up, which over time can become problematic, especially if it's hooked up to EventBridge and saving everything into something; over time that's just going to accumulate more and more junk. That's my concern with that particular approach. But like you said, if you provision additional resources using some kind of conditional in the CloudFormation, then you're doing actual bootstrapping work for something that doesn't get deployed to production, which feels a bit, I guess, untidy is probably the right word. Another thing I've seen is, instead of doing this kind of polling thing, you can set up an API Gateway WebSockets endpoint, maybe in a separate project altogether, which just exists in your non-production environments. Then you have something, maybe something you can automate, so that any time you set up a new EventBridge event bus or SNS topic or something like that, it always gets a subscription that pushes the event to that API Gateway. So in your test environments you always have an API Gateway that you can listen to via WebSockets, which is going to tell you about any event that goes into any of these async event resources.
I had a client that basically does that, so that in their test environments they can use more of a push model, as opposed to a polling model where they have to poll SQS or something like that to find out what events have been pushed to EventBridge. From the point of view of writing unit tests or integration tests, like [inaudible] all these out, it's actually quite nice, because you don't have to worry about polling logic and all of that. You can just subscribe to a WebSockets endpoint and that's it, you will get all the events coming through. But that does still have that element of having to provision something just so that you have a way to listen in to those events, and I haven't found a way to really work around that. I guess you can do some work to hide it, like the sls-test-tools approach that Theodo took, but I think the drawback there, if someone just cancels the test prematurely, is still too great.
Paul Swail: 20:21
Yeah, I think relying on any code in your [inaudible] that could have... you should put code that cleans up data in there as well, but just relying on it having run isn't something you should do. It could get cancelled early, and the next run shouldn't rely on whatever state needs to be cleaned up; it should make sure that's done before the test is run, or you just write your tests in such a way that it wouldn't matter.
Yan Cui: 20:45
Yeah, that's actually another good point, because another mistake I think a lot of people make is that when they're writing tests, their tests rely on data that already exists in the database. So you see a lot of this, I guess, pre-test step where they have to seed the data, and then the whole test is run against that data. What's your take on that? Because I always feel that that means the tests are, I guess, easier to write, but the downside is that when they break, it's quite hard to figure out why, because the entire setup is not visible; there's some other invisible setup that needs to happen to pre-seed the database so that you've got the right data in the database for the test run.
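The alternative to invisible pre-seeding is for each test to create and own the exact data it needs, with unique IDs so parallel runs don't collide. A hedged sketch, the store here is an in-memory stand-in, where a real test would hit the deployed test table, and all names are assumptions:

```javascript
// In-memory stand-in for the test database (assumption, not the real thing).
function makeStore() {
  const rows = new Map();
  return {
    insert: async (row) => { rows.set(row.id, row); },
    byId: async (id) => rows.get(id),
  };
}

// A test that seeds its own record, visibly, in its own arrange step.
async function testGetUserReturnsSeededRecord(store) {
  // Arrange: unique ID so reruns and parallel tests never clash.
  const id = `user-${Date.now()}-${Math.random().toString(36).slice(2)}`;
  await store.insert({ id, name: 'Ada' });
  // Act and assert.
  const found = await store.byId(id);
  if (!found || found.name !== 'Ada') throw new Error('seeded record not found');
  return found;
}
```

Because the setup lives inside the test, a failure points straight at visible code rather than at a seeding script run somewhere else.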
Paul Swail: 21:29
Yan Cui: 22:31
Okay, another service that I think is quite challenging for people to test is Step Functions. I've got a lot of questions from people about how to approach testing step functions. So if you've got something up your sleeve, tell us about that.
Paul Swail: 22:46
Yeah, I don't know if I have something up my sleeve as such. It's an area a lot of people ask about; I didn't actually cover it in my testing workshop, but a lot of people asked about it because it's inherently more difficult to test. If you think of the step function as an orchestration, there's a lot of state in it, one step flowing into the next, and the general testing guidance is to test in small, isolated boundaries. Whereas if you're testing a whole step function which has several states, the current way of doing that is generally writing an end-to-end test, effectively, using the AWS Step Functions SDK: initialising the state, starting the execution, and then checking the state at different points. But that's quite a long test case, and you can't test each transition in isolation. Several folks have mentioned things like the logic you put into the step function itself, like the JSONPath, which is easy to get wrong when you're passing the input path through to the next state, and error handling cases. A big use case of orchestration is that if something fails, you have an action that you want to take to resolve it, but how do you test that within a step function? Recreating, simulating that failure is difficult. This is one of the few areas where I would like it to be more locally unit testable, as in testing the logic that lives purely within the step function itself, not in the Lambdas that it calls out to. So I would generally test those Lambdas first; I'd have tests that directly test the Lambdas outside of the scope of the step function itself. But the actual flow logic I find difficult to test. You can write an end-to-end test to do it, but it's not ideal, because there are a lot...
If it's a long process, you need to set up the state the whole way through, or wait for it to get to that stage; you can't just jump in between steps three and four and test that transition as such.
Yan Cui: 25:02
Yeah, unfortunately that's something I guess we have to wait for the service to support, because it's kind of out of our hands at this point. We need some way to, well, I guess, start execution from a certain point, and also to be able to say, hey, Step Functions, use a mock for this Lambda function that you're going to call and return a mock response instead, so that you can test your step function's flow as opposed to the individual Lambda functions, which you'll be testing separately to make sure that, given this input, they produce the right output and all of that. But yeah, I totally agree it is quite hard to test the actual state machine itself.
Paul Swail: 25:41
Yeah, but just to be fair to the Step Functions team, I was chatting to the product manager recently about exactly this, and they were very keen to make that experience better for developers writing tests for it, so hopefully we'll see something come from that soon.
Yan Cui: 25:58
Well, it's only, what is it now, five or six months to re:Invent, so I'm sure we've still got time for a nice surprise at re:Invent. And another thing I want to circle back to is mocking and local simulation; we talked about some of the challenges of those. Do you ever use mocks, and if you still do, when do you use them?
Paul Swail: 26:24
I still do use them, but I think developers tend to overuse them, and I find a lot of the uses are a smell. Mainly because of what you alluded to earlier: they actually make a test more difficult to write. If you're mocking a DynamoDB response, just finding out what that mock should look like normally takes more effort than writing the actual code and calling the real service. So that's one reason why I generally don't like them, especially for happy path tests. But one scenario where I do use mocking is for non-deterministic test cases. Occasionally, very occasionally, AWS services don't give you the response you expect, and you cannot easily simulate that or make it happen directly using the real Cloud service, say the service is temporarily down, and you have some key logic which you need to test reacts correctly to that service giving a certain unexpected response. That is a good case for using mocks: just mocking out the SDK call with whatever error response you want to test. So that's really the main use case. One other occasional one: I tend to work in single teams, maybe a front end and a back end, but not many back end teams. But occasionally we've worked with another team where we're calling another API, an internal one, and if their API is changing a lot, they're not providing us an environment which we can reliably write end-to-end tests against. So in that case, mocks can be useful for mocking out the responses we're expecting those services to send, rather than relying on the other team to provide us with a fixed API which isn't going to change.
Yan Cui: 28:28
Yeah, and also you don't want to be beholden to another team never making a mistake. I've worked in environments where one team made a mistake, deployed something broken to the dev environment, and before we had a chance to fix it, everybody else was stuck, because we all had to hit their API as part of our end-to-end tests, as part of our CI pipeline. So essentially the whole company ground to a halt because one team made a mistake. And that's not a good position to be in, especially in bigger companies where you've got a lot of different teams. So yeah, I totally agree, those are the places where I still use mocks as well. Maybe we should give the audience an example of what we mean by a non-deterministic response.
Paul Swail: 29:18
Yes, sure. So, if a service is having issues, say you're writing to any service which can be throttled. For example, DynamoDB gets throttled because there's a lot of concurrent load on it, and you have a Lambda function which is writing to DynamoDB, and the write doesn't happen; the SDK, via the API, gives you a throttling error code, and you have some logic which you need to run in response. To set that test up without mocking would be very difficult, you would need to simulate overload to do it, so it's not really realistic. So you would mock out the put item call, or whatever the put command for the DynamoDB SDK is, make sure it returns whatever the throttling error code is, and test your compensating logic within your test.
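A minimal sketch of that scenario, the client shape, the queue fallback, and the function names are assumptions, not the real AWS SDK; only the error name `ProvisionedThroughputExceededException` is the real DynamoDB throttling error. The point is that the test fakes the client to throw that error, so the compensating branch is what gets exercised:

```javascript
// Code under test: write the item, and if DynamoDB throttles us, take a
// compensating action (here, park the item on a retry queue).
async function writeWithFallback(db, queue, item) {
  try {
    await db.put({ Item: item });
    return 'written';
  } catch (err) {
    if (err.name === 'ProvisionedThroughputExceededException') {
      await queue.send(item); // compensating logic we want to test
      return 'queued-for-retry';
    }
    throw err; // anything else is a real bug, let it surface
  }
}
```

The test injects a `db` whose `put` throws an error with that name, something you could not reliably provoke against the real service without generating actual overload.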
Yan Cui: 30:20
Okay, yep, that's a good example. That's probably the one I often use as well, the DynamoDB provisioned throughput exceeded exception, I think that's the one. I always have to look up how to do the try catch for that, because of the way the SDK works and how you handle those error messages. I think that's all the questions I had, Paul. Is there anything else that you'd like to talk about before we go?
Paul Swail: 30:46
Umm, no, I guess just more generally on serverless testing: we've talked about the challenges, but I think the big opportunity that people sort of miss is that, because we have infrastructure as code and because services are so easy to provision, our pre-production testing can be a lot closer to production than with past environments. We can easily set up environments; we don't need to, like, skip SSL in the dev environment, or run a single node instead of a clustered database because of cost or the effort to provision it. Those issues are going away. So I think our pre-production testing can be a lot better and can catch a lot more before getting into production. It's not an excuse not to do monitoring and other in-production testing as well, but I think it will help us. We have the opportunity to catch a lot more errors pre-production, through our automated tests, than we did with previous server-based systems.
Yan Cui: 31:45
Yeah, I guess that's also one of the reasons why I don't really bother with LocalStack, because oftentimes I find it takes more effort to set up LocalStack and configure it than to just provision the real thing, and you still get that isolation and all that if you just use a temporary stack for yourself or for the feature you're working on. Or every time you do an end-to-end test, you just provision a new stack, and at the end of it delete the stack, because like you said, it's just so easy to do that with serverless components, where you don't have to wait for EC2 machines to warm up, and you don't have to provision a big cluster of RDS databases or worry about those. You can just create a new DynamoDB table; it's gonna get a different name every time if you don't name the table explicitly, and at the end of it you just delete the whole stack and you wipe the data. So there's no need for those, I don't know, weekly or monthly scripts that clean your database because there's so much junk in it. That all makes life a lot easier as well if you use a temporary stack for running your tests. Is that something that you see people doing more and more now as well?
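One practical detail of the temporary-stack-per-feature workflow is deriving a safe stage name from the branch. A hypothetical helper, the naming rules and length cap here are conservative assumptions, not from the episode:

```javascript
// Turn a git branch name into a stage name usable in stack and resource
// names: lowercase, hyphens only, trimmed, and capped in length so
// generated resource names stay under AWS length limits.
function ephemeralStageName(branch) {
  return branch
    .toLowerCase()
    .replace(/[^a-z0-9-]+/g, '-') // collapse anything unsafe into a hyphen
    .replace(/^-+|-+$/g, '')      // no leading or trailing hyphens
    .slice(0, 20);
}
```

Something like `sls deploy --stage "$(node -e 'console.log(...)')"` in CI would then give every branch its own throwaway stack, deleted with a single `sls remove` when the branch is merged.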
Paul Swail: 32:54
Not so much, I think. I've seen people doing it in pull requests, but it's still not as common as it should be, because it's not totally frictionless. There's some friction, because you have to spin up the whole environment if you've fully blown the environment away, rather than just deleting the data within it, and that slows it down a bit. But you can mitigate that by having, I guess, an environment just before or just after that one which doesn't rely on the data being blown away each time. So yeah, there are a few ways of doing it, but hopefully it will become more feasible as, I guess, better defaults come in with deployment frameworks, we're writing less config, and they're offloading even more of the infrastructure config to the service we're hosting on. Then it will definitely become more commonplace.
Yan Cui: 33:54
Yeah, I hope that's the case as well. And I guess one last question: how do people find you on the internet, and have you chosen a domain for your book yet, so that people can go and bookmark it?
Paul Swail: 34:05
No, I haven't. That's a big decision I need to make, so I don't have a domain name for the book yet. My website is serverlessfirst.com, so you can go there, and on the homepage you can sign up for my email list, where I'll be sending out updates about the book. Generally, I'm hoping to sort of work in public and release chapters to people for review, so if you're interested in that, jump in and I'll add you to my email list and send you a chapter if you want to read some draft copies. Hopefully I'll have it closed off later in the summer. But yeah, serverlessfirst.com, and I'm on Twitter @paulswail.
Yan Cui: 34:42
Okay, cool, I'll put those links in the show notes so you guys will have them. Thank you so much, Paul, for taking the time to talk to us today. I look forward to when you start releasing drafts of your book.
Paul Swail: 34:55
Cheers Yan. Thanks a lot. It was great to talk to you.
Yan Cui: 34:57
Take it easy. Bye, bye.
Yan Cui: 35:12
So that's it for another episode of Real World Serverless. To access the show notes, please go to realworldserverless.com. If you want to learn how to build production-ready serverless applications, please check out my upcoming courses at productionreadyserverless.com. And I'll see you guys next time.