For more stories about real-world use of serverless technologies, please follow us on Twitter as @RealWorldSls and subscribe to this podcast.
Opening theme song:
Cheery Monday by Kevin MacLeod
Yan Cui: 00:12
Hi, welcome back to another episode of Real World Serverless, a podcast where I speak with real world practitioners and get their stories from the trenches. Today, I'm joined by Vadym Kazulkin. Hi, welcome to the show.
Vadym Kazulkin: 00:26
Yeah, thanks Yan. Thanks for inviting me.
Yan Cui: 00:29
So we've known each other on Twitter for a little while now. You are very much in the JVM world, but still doing a lot of stuff with Lambda and serverless technologies. So can you maybe take a few minutes to tell us about yourself and what you've been doing with serverless?
Vadym Kazulkin: 00:47
Sure, Yan. My name is Vadym, and I've been in this industry for more than 20 years, so since the end of the last century. I was deeply involved in Java development and the whole Java ecosystem, Java Enterprise Edition and all of these frameworks like Spring. But for the last four years we have been migrating to the cloud with my company. We've taken a lot of steps, but it's a continuous process, and for the last three years we have been doing serverless in production. Concerning serverless, probably a few words about the company first, so you can imagine what we are doing: we write software for designing and also purchasing photo products like calendars, photo books, prints and posters, everywhere you can print your emotions in the form of pictures, so we deal with a huge amount of images. The turning point in how we looked at serverless was nearly three years ago, when we had a kind of production weeks, where people could decide what to work on. The only goal was to help the company solve some problem, and it took three weeks. We chose a project where we had to do one very heavy computation, but only once a month. On the first of the month it should be triggered automatically, and only a few computations would be required in between, even started manually. The first discussions were, as always: use Java, probably Spring, use Docker and so on. But then we looked at this thing called serverless. We were in the AWS cloud already and had a rough understanding of what it was, but we hadn't used it yet. And it was a perfect use case, because it was heavy computation only once per month, for only several hours, so it was simply the perfect use case to start with. We had four people, and we picked Java as the programming language.
Because the company had a lot of Java developers. We dove deep into Lambda, DynamoDB, SNS and so on and gained our first experience there. And the results were pretty nice: over three weeks a lot of stuff was programmed. That was the starting point, and then we involved the whole organisation. Currently we have six teams, around 30 developers, and two teams are completely serverless. We are doing things in production, so product creation and so on are completely serverless. And currently we are rethinking how we deal with uploads and saved projects, because designing a photo book normally takes several days, so you have to save the project and load it again. We are rethinking how we can do this in a native way with completely serverless technologies. And I think the developers are pretty happy. They feel that they are productive, they can work end-to-end, and they don't have these worries about the infrastructure.
Yan Cui: 04:20
Vadym Kazulkin: 04:47
Yan Cui: 06:09
Yeah, I've seen Java being used with Lambda in a few enterprise customers. And some of the problems they were running into before, like really long cold start times for VPCs, that's been addressed. And things like the JVM itself taking a long time to cold start, I think that's also getting better with newer versions of the JVM. You mentioned that there is more adoption of Lambda from the Java community now. Do you see particular reasons why people are moving to Lambda? Is it a case of “Oh, they see the productivity gains you can have as a Java developer”, or is there something else that you think is contributing to this greater adoption?
Vadym Kazulkin: 06:58
You've already mentioned the VPC cold starts, which are not a big concern anymore. The VPC cold start didn't depend on the programming language, but of course, if you had the VPC cold start plus the cold start of your own runtime, which does depend on the programming language, then in the worst case it was a huge cold start. Java suffered a lot from this. But with that problem solved, and with the possibility to use provisioned concurrency, there are now solutions which may help you adopt Java as a programming language in the serverless space at scale. But the general problem with Java remains the cold start, and probably the same is true for C#: they are really big. And the reason is that people want to be productive, so they use all these frameworks like Spring, but those were built with other trade-offs in mind. They use reflection heavily, runtime bytecode generation, runtime-generated proxies, dynamic class loading, all the stuff which contributes to more memory usage and higher cold starts. And these are all parameters which also contribute to your cost. Of course, the cold start is not always an issue, because if you can rely on some kind of asynchronous communication, then it's probably not an issue at all. But there are a lot of Lambdas which are public facing, so they are triggered by API Gateway, and there you have the timeout of 29 seconds as a maximum value. I saw code that experienced cold starts of between 20 and 30 seconds, which was really insane. But currently there are a lot of best practices documented which can reduce this cold start. They are not very intuitive for Java developers; some of them are even dirty hacks.
But now they are known and people can start applying them. We can probably talk more deeply about all of this, mainly the VPC improvements and provisioned concurrency, and probably also things like GraalVM native images. So there are solutions, which have other trade-offs, but generally, if you have the choice, that's something very good.
Yan Cui: 09:40
Yeah, I remember a few years ago I had a Scala function doing some background data processing from a Kinesis stream, and it was running inside a VPC. That was like a 17-second cold start, which is horrendous, but because it was happening in the background, no user ever saw that delay, so it didn't really matter. But in terms of the things you could do to optimise the cold start time for a Java function, what are some of the tips you would give to someone who is looking to use Java in Lambda, and using it in a user-facing API where you do want to keep your p99 latency down under, say, one second?
Vadym Kazulkin: 10:24
Yeah, with plain Java, something under one second may be an issue, and it's really challenging. It depends, of course, on the application; for something like hello world it may work, but generally applications are more complex. But there are some best practices to share. One is to switch to the AWS SDK 2.0 for Java, which is a lot more modular and has a lower memory footprint, so you can import only the AWS clients you are really using in your Lambda, like DynamoDB or S3, and not all the rest. It also allows you to configure an HTTP client of your choice. By default it was the plain Java HTTP client, but now AWS has introduced its own Common Runtime (CRT) async client, which really has the lowest cold start of the choices you have. You can also use the Netty HTTP client or even the Apache HTTP client, but they are good for long-living applications, because they use caching a lot, and for short-lived functions that just doesn't matter. You only want to establish one connection, for example if you are making a call outside to Stripe or Shopify or something like that, so you don't need everything around them, because it's costly to initialise, and then the container will be released and you have to initialise it all again. So this is one of these practices. Another one is: if you know your static configuration, like the credentials provider, region and endpoint, then provide it when you are building the client, like the DynamoDB client or the S3 client. Otherwise, of course, AWS can figure this out for you, but for the credentials provider it has to go through a whole chain to figure out what you are really using, and that costs time and, of course, money. And then there is the static initializer block: everybody should know that there you gain access to the full CPU for a maximum of 10 seconds.
You, Yan, wrote a lot about it, and I think Michael Hart also suggested using this, because if you only select 128 megabytes of memory, you will only receive a small fraction of the CPU in your handler function. So if you have things that you can initialise, or have to initialise only once, move them outside of the handler function into the static initializer block. With access to the full CPU you will probably have a somewhat bigger cold start, but then you don't need to do those things in the handler method, so you will be faster there. And one thing that is probably true for everything: the fewer dependencies, the better. So exclude everything you don't need at runtime; maybe you only need it for testing or at compile time. With all these small optimisations you will probably be within several seconds of cold start even for a complex application.
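The static-initializer pattern described above can be sketched in plain Java. This is a minimal, self-contained illustration only: `ExpensiveClient` is a hypothetical stand-in for something like an SDK 2.0 DynamoDB client built with an explicit region and credentials provider, and the real Lambda handler interface from `aws-lambda-java-core` is omitted to keep it runnable on its own.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a costly-to-build AWS SDK client.
class ExpensiveClient {
    static final AtomicInteger initCount = new AtomicInteger();

    ExpensiveClient() {
        // Pretend this is slow client construction (credentials, HTTP client, ...).
        initCount.incrementAndGet();
    }

    String query(String key) {
        return "value-for-" + key;
    }
}

class Handler {
    // Runs once per container, during Lambda's init phase, with access to the
    // full CPU -- not on every invocation.
    private static final ExpensiveClient CLIENT = new ExpensiveClient();

    // The handler itself only does per-request work against the shared client.
    static String handleRequest(String key) {
        return CLIENT.query(key);
    }
}
```

However many times the handler is invoked on a warm container, the expensive construction happens exactly once.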
Yan Cui: 13:42
Okay, yes, I guess if everything else fails, there's also provisioned concurrency, so you can mitigate cold starts altogether by paying differently, or potentially paying more. What about the newer versions of the JVM? I've heard that newer versions of the JVM are going to be more modular and lighter to begin with, so they're better for environments where you're running Docker or Lambda functions, where resources are much more limited and you worry a lot more about the start-up time of your application. Is there anything coming in the newer versions of the JVM that may help?
Vadym Kazulkin: 14:23
So currently we only have the choice between Java 8 and Java 11, and AWS has only committed itself to supporting the long-term support Java versions. With Java 8 they even do it with extended support through Amazon Corretto, because otherwise Java 8 is out of support, and they have supported Java 11 for about a year now. The next long-term support version will be Java 17. It will probably be released in about a year, so we'll probably have to wait a year and a half until AWS supports it. So we can compare Java 8 and Java 11, and in terms of cold start there is no significant difference between these two versions. I have compared this myself, but I also know you've talked to Mike Roberts, who is very well known in this space as the author of the book Programming AWS Lambda. He has the same opinion, that between Java 8 and 11 you don't see any huge differences. Of course, you have more features with Java 11. And currently it's very difficult to predict what will happen with Java 17. Probably Oracle also recognises that they have to provide features so that Java becomes container aware and much better adapted to the microservices and serverless world. So I expect some optimisations to be there, but currently I don't see a lot of them.
Yan Cui: 16:01
Okay, but what about other, unofficial VMs? I've heard about GraalVM quite a few times now in the context of optimising Java functions, where you literally compile your Java code to native code, so you don't get all the overhead of the JVM, the class loader and all of that, and you get super fast Lambda cold starts. Is that something people actually use in production? Are there any other similar VMs and similar techniques that you can use?
Vadym Kazulkin: 16:35
GraalVM is a really good example. Of course, GraalVM has a lot of characteristics; it can even be used as a just-in-time compiler in the Java world, one which is itself written in Java. But for us, people who are really interested in serverless, it's the ahead-of-time compiler with the native image functionality of GraalVM that becomes really interesting. I don't see a lot of adoption of GraalVM for serverless currently, but people are paying attention and trying things out, and the same is true for me. The ahead-of-time compiler works differently from a just-in-time compiler: it makes a kind of closed-world assumption, so it has to figure out all the fields, all the methods and all the classes that you will use in production, put this together, and then convert it into a native image. GraalVM supports Linux, Mac and even Windows native images. But of course, if you think about how to run this at your cloud provider, Amazon only supports Java 8 and Java 11 as managed execution environments, and GraalVM is a different execution environment. So you have the option of custom runtimes, where you can provide the native image built with GraalVM. But the specification of the custom runtime says it should be a Linux executable, so people who are using Mac or Windows have to think about using a Docker container to build all those things. Generally speaking, though, this is a very valid alternative, and of course it has different trade-offs. One of these trade-offs is the higher compilation time, because it currently takes between two and three minutes just to compile a relatively simple function into a native image.
So of course you lose some productivity, because sometimes you make a small change and want to test it locally, and that waiting time can be very annoying. But everything is a trade-off, and the payoff is that you get a very fast start-up time: something like two to three hundred milliseconds of cold start is what you can get with GraalVM, so the cold start won't be an issue anymore. And if you only package what you use, then the package size is small, which also affects the cold start, and the memory usage will be lower. You also have the possibility to pre-initialise things with the GraalVM native image, so you can run static initializer blocks in advance, which also saves on cold starts. And in the end your bill will be lower. But the trade-off is that you have to wait a lot, and you have to deal with different kinds of errors when writing your functions, because it's not that easy for the compiler to figure out everything you need in production if you use reflection or something like that. So you have to learn how to become productive with GraalVM, so that the native image builds without errors and works in production. It's a different set of trade-offs, but it's a tool, and I really believe that GraalVM will have a bright future. I think the Java community, especially people writing microservices and serverless applications in Java, will use it, but I think we are not there yet.
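The closed-world assumption and the reflection problem mentioned above can be illustrated with a small sketch. The class name here is made up; on a regular JVM both paths below work, but under GraalVM native image the reflective path would fail at run time unless the class is explicitly registered for reflection (e.g. via a reflect-config.json entry), because the ahead-of-time compiler cannot see a `Class.forName` target that is only a string.

```java
class Greeter {
    public Greeter() {}
    String greet() { return "hello"; }
}

class ReflectionDemo {
    // Visible to static analysis: safe for native image out of the box.
    static String direct() {
        return new Greeter().greet();
    }

    // The target is only a string at compile time: on native image this hits
    // the catch branch unless Greeter is registered for reflection.
    static String reflective() {
        try {
            Class<?> c = Class.forName("Greeter");
            return ((Greeter) c.getDeclaredConstructor().newInstance()).greet();
        } catch (ReflectiveOperationException e) {
            return "reflection failed: " + e;
        }
    }
}
```

Frameworks that lean on reflection, proxies and dynamic class loading multiply this configuration burden, which is exactly why they were historically a poor fit for native images.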
Yan Cui: 20:50
Okay, that's a shame. I guess it's pretty much a similar story on the .NET side of things: there is the possibility to compile your .NET Core application to native code, and then you can ship that as a native binary so you don't get the usual cold start overhead of .NET Core applications. But there are some very similar challenges to what you described, in terms of tooling, in terms of reflection not working anymore, and also the compilation time, so your tool chain becomes a bit more complicated. But yeah, I'd love to see some of these approaches become more widely adopted for those who are looking to keep the same language. Sometimes they have to, because there are decades of Java code in the company that you can't just throw away, and you still want to use Lambda and get all the benefits in terms of very little operational overhead. But I want to circle back a little bit to your company and what you've been doing specifically. Can you talk a little bit about what your architecture looks like from 30,000 feet?
Vadym Kazulkin: 22:00
There are a lot of different products, so the architectures look very different too. But generally speaking, we have adopted EventBridge in several services, and we are probably using everything that other people use: EventBridge, Lambda, and DynamoDB wherever we can. There are situations where we use Aurora Serverless, and also SNS and so on, so it's probably fairly standard stuff. But in terms of mindset, what we are striving for is to be as serverless as possible. We've adopted the idea that serverless is a spectrum, so the first question we ask ourselves is: can we implement this with completely serverless services like S3, Lambda and DynamoDB, with on-demand capacity of course, so not provisioning read and write capacity units. Different teams have different challenges, but generally this is the standard stuff we are using. We also look at how other companies have adopted things; LEGO's Sheen Brisals talks a lot about how they do things, so we exchange opinions. Generally speaking, we believe that's the way to go. In my experience, if you measure how much time you spend dealing with infrastructure, serverless is a step in the right direction, because infrastructure is a necessity but it consumes a lot of focus, and it's not the core of our company. Our core is to build the product, the editors where you can design your product. That's what our focus is, not the infrastructure.
Yan Cui: 24:04
Okay, so I spoke with Sheen at LEGO. They're using EventBridge really extensively, and I think MatHem in Sweden as well. So are you guys using EventBridge as a centralised event bus?
Vadym Kazulkin: 24:18
I think each team has its own account and uses EventBridge only in that account. But what I really like about EventBridge is the very flexible filtering and routing. It allows you to send events, and then the consumer can receive only what it wants. It's much more flexible than even SNS, where you can only filter on message attributes. So you don't need the fan-out pattern and its complexity; you just put events on EventBridge and then listen only to what you want. This gives us a lot of flexibility, and with the releases from the last weeks even the cross-account things have become easier. I think there is a lot of future in this. I'm also waiting for FIFO functionality, First-In-First-Out. It was recently released for SNS, and SQS has had it for several years. If this is also released for EventBridge, it will become even more powerful.
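To illustrate the content-based filtering being described, here is a deliberately simplified, pure-Java sketch of what an EventBridge event pattern does: a rule matches when, for every field in the pattern, the event carries one of the allowed values. Real EventBridge patterns are far richer (nested fields, prefix and numeric matching, and so on), and the field names below are made up for the example.

```java
import java.util.List;
import java.util.Map;

class EventPatternDemo {
    // Simplified EventBridge-style matching: every key in the pattern must be
    // present in the event with one of the listed values; events that fail any
    // key are not delivered to the rule's targets.
    static boolean matches(Map<String, List<String>> pattern, Map<String, String> event) {
        for (Map.Entry<String, List<String>> entry : pattern.entrySet()) {
            String value = event.get(entry.getKey());
            if (value == null || !entry.getValue().contains(value)) {
                return false;
            }
        }
        return true;
    }
}
```

Because the bus does this filtering for you, each consumer subscribes to exactly the events it cares about instead of receiving everything and filtering in code, which is the flexibility being praised over SNS attribute filtering.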
Yan Cui: 25:28
Yeah, for listeners who haven't seen those updates recently, EventBridge announced support for event replay, it added support for dead-letter queues, and they've also simplified how you configure cross-account event delivery, so a lot has been improved. Since we're talking about features you'd like to see in EventBridge, and the fact that re:Invent is just around the corner, what are some of your top AWS wishlist items, things you'd like to see at this re:Invent?
Vadym Kazulkin: 26:02
Oh, there are a lot of them, but let me probably start with general thoughts rather than features. I think exact billing for Lambda would be a good thing. I see people trying to optimise a Lambda which takes just over 100 milliseconds down to 99 milliseconds, just to halve the bill. For short-living functions, I think exact billing would be the right choice, and probably there is also more to gain with the savings plans at 17%. It would be nice for the community to stop thinking about these micro-optimisations. In terms of features, I see that we have gRPC support on the Application Load Balancer; it would be nice if API Gateway or HTTP APIs also got this feature, because gRPC is a really popular protocol and it would close the gap there. Otherwise, I just talked about First-In-First-Out for the event bus. One thing I've really missed is some kind of Elasticsearch, but more serverless. People love this managed stack, but managing EBS volumes and all these instances and their sizing just doesn't feel right. So if the people from AWS are hearing me, it's probably too late for this re:Invent, but this is one thing I would find very useful. Another thing: it's really good that we can attach an Elastic File System to our Lambdas, but comparing Elastic File System and S3, there are trade-offs, and there are also some gaps on the Elastic File System side. With S3 you can trigger on events like onCreate, onUpdate and so on; if that were possible with Elastic File System it would be great. I also love that S3 is very tightly integrated with compliance and governance services like AWS Config and AWS CloudTrail, and I think that possibility is not yet there for Elastic File System.
It would be nice if that could be added to the platform. And probably some other thoughts: I like the possibility of First-In-First-Out queues because they give you some kind of idempotency and also ordering. But generally, saying "all Lambdas should be idempotent" is very easy, while from my programming perspective it's very difficult to guarantee that each function can be written idempotently. There are situations where your Lambda can fail at any time, and if it happens in the middle of some execution, then you have some kind of partially stored state. How can I deal with that, when some other functions will see this in-between state? It adds a lot of complexity to the Lambda function if you would like to guarantee that the function is effectively executed at most one time: you have to do some kind of tracking on your side, and you have to use some kind of database to save what was called and what was not, which is a huge complexity. I think serverless is really good, and one part of the serverless mindset is to write less code. So what I would like to see is the people from AWS giving us some kind of atomicity within the Lambda function, some kind of tools to deal with all these things if you have a failure in between writes to S3 and so on, or even the platform dealing with those situations automatically. I would be very happy if it decreased the amount of code people have to write to guarantee this idempotency.
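The bookkeeping described above is commonly implemented as a conditional write keyed by a unique event ID. Here is a minimal pure-Java sketch of the idea, assuming an in-memory map as a hypothetical stand-in for a persistent store such as a DynamoDB table with a conditional put (`attribute_not_exists` on the event ID):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class IdempotentConsumer {
    // Stand-in for a persistent store with atomic conditional writes.
    private final ConcurrentHashMap<String, Boolean> processed = new ConcurrentHashMap<>();
    final AtomicInteger sideEffects = new AtomicInteger();

    // Returns true if the event was processed, false if it was a duplicate.
    boolean handle(String eventId) {
        // putIfAbsent is atomic: only the first delivery of an id wins.
        if (processed.putIfAbsent(eventId, Boolean.TRUE) != null) {
            return false; // duplicate delivery: skip the side effect
        }
        sideEffects.incrementAndGet(); // the actual (non-idempotent) work
        return true;
    }
}
```

Even this sketch shows the complexity being lamented: the guard itself is extra code and an extra store, and it still doesn't solve partial failures where the side effect succeeds but recording it fails, which is exactly the gap a platform-level solution could close.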
Yan Cui: 30:22
So on the idempotency bit, are you talking about the fact that for async invocations, Lambda is at-least-once rather than exactly-once?
Vadym Kazulkin: 30:30
Yes, this is some kind of [inaudible]. There are other message brokers, like Kafka, which give you a choice between exactly-once and at-most-once. Exactly-once is very, very difficult to guarantee, but sometimes you would like to avoid the function being called twice, and this may happen even with the First-In-First-Out queues, because the deduplication window is only five minutes, I think. So if a duplicated message arrives after five minutes, it will be processed. There are a lot of corner cases that may happen, and then the programmers have to deal with them by themselves. A bit more help from the platform itself in terms of idempotency would help the developers, I think. That's my point.
Yan Cui: 31:24
Yeah, I guess if you could get the guarantee of exactly-once, which, like you said, is very difficult to do for async invocations, that would be a big help. But it's going to be really difficult for them to even do that, to guarantee exactly-once, because if you understand distributed systems, exactly-once is actually really, really tough to do. And also, that's not going to help you if you are doing some kind of transaction in your own application code where you need to update state in several other systems. Idempotency in that case is also going to be really difficult to achieve, especially when you are, say, sending a request to Stripe to process a payment. If your function gets called again because another part of the transaction failed, you have to retry the whole thing, while that request to Stripe is not going to be idempotent. A lot of the services you're talking to are themselves not idempotent, so it's really difficult to handle partial failures and still make your functions idempotent. There's only so much AWS is able to do, and I think the most you can probably hope for is exactly-once on async invocations. For the SQS visibility timeout, I don't think they can actually change that, because it's the mechanism for retrying when the SQS processor has failed. I think right now, if you create a function whose timeout doesn't fit the recommended multiple, I think six times the function timeout for the visibility timeout on the queue, then you do get a warning from the console if you create the trigger there. But there's obviously no warning if you're creating the function with, say, SAM or the Serverless Framework.
So maybe there should be some guardrails around that, so that no matter how you're packaging and deploying your function, there will be some guardrail that tells you, “Hey, your function's timeout doesn't quite align with the visibility timeout on the queue, so maybe you should think about changing that timeout setting on your function.” I guess you could achieve that with AWS Config, maybe something else; I'm not sure what's the best way to implement it, possibly AWS Config. But yeah, certainly, I'd love to see them get to the point where they can guarantee exactly-once on async invocations, because I've run into a few problems with that, where successful invocations happened twice for the same SNS message.
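A guardrail like the one described could be as simple as a rule that checks each queue-triggered function against AWS's guideline of setting the queue's visibility timeout to at least six times the function's timeout. A minimal sketch of that check, assuming both timeouts are given in seconds:

```java
class SqsGuardrail {
    // AWS's documented guideline for Lambda event source mappings on SQS:
    // visibility timeout >= 6x the function timeout.
    static final int RECOMMENDED_MULTIPLE = 6;

    // Returns true when the configuration passes the guardrail.
    static boolean visibilityTimeoutOk(int functionTimeoutSeconds, int visibilityTimeoutSeconds) {
        return visibilityTimeoutSeconds >= RECOMMENDED_MULTIPLE * functionTimeoutSeconds;
    }

    // Smallest visibility timeout that satisfies the guideline.
    static int recommendedVisibilityTimeout(int functionTimeoutSeconds) {
        return RECOMMENDED_MULTIPLE * functionTimeoutSeconds;
    }
}
```

Wired into something like an AWS Config rule or a CI check over a SAM/Serverless Framework template, this would flag the misconfiguration regardless of how the function is deployed, which is the point being made.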
Vadym Kazulkin: 34:09
Yes, you're right, that's the whole complexity of distributed systems. But of course you have at-least-once guarantees, and at-most-once will also sometimes be helpful, so you have the choice. But generally you're right: if you are leaving the ecosystem and calling outside, then it's really impossible to guarantee this. But for the services within the ecosystem, they could provide you tools that make your life a bit easier. Of course, it's probably not a solvable problem in general.
Yan Cui: 34:43
Yeah, at least it's a very difficult problem to solve. But okay, that's a good list. I certainly like the serverless Elasticsearch; I think you're like the 15th person to ask for that on this podcast. It is a very commonly requested service, and I think right now Algolia is probably the closest thing to a serverless Elasticsearch, at least that I've seen. It's a nice service and I've used it a lot. And I think that's all the questions I've got, Vadym. Before we go, do you have anything else you want to tell the audience, anything about personal projects, hiring or things like that?
Vadym Kazulkin: 35:25
Probably one more thing to add on GraalVM. We've talked about GraalVM itself, but of course there are frameworks with which you can use GraalVM, or even plain Java, like Micronaut and Quarkus, and even Spring is currently looking into the Spring Native GraalVM project. People love these frameworks because they make you more productive, with the whole programming model with Java annotations and so on. And these frameworks, especially Micronaut and Quarkus, were designed several years ago with microservices and serverless development in Java in mind. So you have choices: you can use these frameworks with plain Java, where of course you will get the higher cold start times similar to plain Java, but you may be more productive, and they have a similar programming model to the Spring framework. And you can use them in conjunction with GraalVM native image, because they support this too, and then you are really productive and still have the benefits of the GraalVM native image. What's really nice is that with these frameworks many things around annotations happen at compile time and not at runtime; they were built with other trade-offs. I personally am currently doing a lot with the Micronaut framework to see how it fits, and I think it's a really nice addition. My hope is that with GraalVM and all these frameworks created in the last few years becoming more mature, we will see greater adoption of Java as a programming language, and I think it will be a huge door opener for the cloud providers for serverless adoption in general.
Yan Cui: 37:19
Okay, great. Thank you. And how can people find you on the internet?
Vadym Kazulkin: 37:24
So I'm very active on Twitter; with my first name and last name you can find me there very easily, and also on LinkedIn. I don't have a personal homepage, so these are the two channels I use very actively. Probably, Yan, you can put them into the show notes where people can find them, because my first and last name are very difficult to pronounce.
Yan Cui: 37:50
Yeah, sure. Absolutely. I will include those in the show notes, as well as links to GraalVM and the Quarkus and Micronaut frameworks that you mentioned earlier.
Vadym Kazulkin: 38:01
Yeah, thank you very much.
Yan Cui: 38:03
All right. So I guess that's it, then. Thank you so much, again, for joining me on this podcast and sharing your experience working with Java and serverless.
Vadym Kazulkin: 38:13
Yeah, thanks for having me, Yan.
Yan Cui: 38:15
Great. Take care and stay safe.
Vadym Kazulkin: 38:17
Yep. Thank you Yan.
Yan Cui: 38:19
Ok. Bye, bye.
Vadym Kazulkin: 38:20
Yan Cui: 38:34
So that's it for another episode of Real World Serverless. To access the show notes, please go to realworldserverless.com. If you want to learn how to build production ready serverless applications, please check out my upcoming courses at productionreadyserverless.com. And I'll see you guys next time.