Cloud Coffee Talk

Cloud Coffee Talk is a podcast by cloud professionals for cloud professionals. These are relatable, deep dive, unscripted discussions, where technical talk is mixed with the real world challenges of people, process and technology. Each episode features a different domain discussion with 1 or more guests who are passionate about cloud and technology.

Infrastructure as Code, AWS Edition

May 29, 2021 • Darren Weiner and Erik DeRoin • Season 1 • Episode 1

In the pilot episode, Darren and Erik talk about the real world experiences and challenges of working with Infrastructure as Code.

Darren:
Welcome to Cloud Coffee Talk, AWS edition, sponsored by CloudButton. These are real world problems, solutions and thoughtful discussions about working in the cloud. This podcast is meant for cloud professionals at all levels of the organization, from the executive team to those with their hands on the keyboard, putting out fires and making the world a better place.

And it's really meant to be unscripted. It's meant to be for those that are passionate about cloud technologies. It's truly just coffee talk. It's not meant to be a cloud 101 discussion. There's a lot of great content out there for that already. This is meant to be something, that hopefully you'll find a little bit different.

I am Darren Weiner. I am the owner of CloudButton, an independent consulting company focused on AWS Cloud. In our very first episode, we are taking off with a discussion about infrastructure as code, and my co-presenter today is Erik DeRoin. Erik, tell our throngs of listeners about yourself.

Erik:
Hi, everybody. I am Erik DeRoin. I am a - my official title, I guess, is a "Site Reliability Engineer".

Darren:
Oooh, SRE

Erik:
SRE for a company called Training Peaks here in Boulder, Colorado, that does endurance training software for endurance athletes.

Darren:
Excellent. And Erik, you are... You do a lot with infrastructure as code. So tell us a little bit about what you do, how you do it.

Erik:
We are fully in AWS. We manage the majority of our code, or infrastructure, as infrastructure as code at this point. It has been a migration journey for sure. Most of it uses AWS native whenever possible. So lots of CloudFormation, lots of boto3, those kinds of things, and we sort of default to, wherever possible, just using AWS native services. We could get into it if we want... We sort of avoided intentionally any sort of multi-cloud stuff. We just made our bed in AWS, and that's where we decided to lie.

Darren:
So what does that look like... a multi-account infrastructure?

Erik:
Yeah, we are. So it's a weird kind of hybrid model at this point where we are, it's a monolithic-like architecture that was lifted and shifted into AWS in 2014. And since then we have been breaking apart sort of that monolith. And so we have some of that old stuff still in a single account, and we've been breaking off leaf nodes into a multi-account model - dev, UAT, production accounts with, you know, some management and master and a few other sort of helpful accounts that facilitate some of that. All as code.

Darren:
Excellent...and using CloudFormation. You know, the interesting thing - you described what's a fairly mature organization in the cloud, in AWS since 2014, 2015, you said.

Erik:
14, yeah. Before I was there, they lifted and shifted.

Darren:
That makes it so much more challenging, right? Because...

Erik:
For sure.

Darren:
...what happened with the tool sets since then, whether it's with cloud-native CloudFormation or Terraform or all these other things, a lot of those tools didn't even exist, or they existed in a very immature way. There weren't things like multi-account solutions then. So how do you deal with the old stuff versus the new stuff?

Erik:
Gingerly. Next question.
So, yeah, that's sort of an interesting question. You know, there's a lot of stuff in there that's been hand jammed for sure. Just like people going through the console. People going through the CLI just to spin stuff up, and it has been a process of replacing as much of that as possible. We really started that process before you could import things into CloudFormation. So it's been figuring out ways to run some level of 'parallelization'. Sorry, it's a word I can't say, great for a podcast.
And, you know, cutting over and making sure that, rerouting traffic as it needs to happen. And there's just been some...It's kind of a case by case situation. Like tackle things in small chunks. We try to really adhere to lean and agile principles. So we're never trying to rearchitect from the ground up. We're always gonna take one small thing and work on it bit by bit.

Darren:
Is your plan to import all your existing resources like, what's the idea?

Erik:
Not anymore. No, I mean, at this point, we have the majority of it as infrastructure as code. The bits that aren't are some networking stuff that we could import, but we can also, and what we started to do is, it's easy enough to just spin up a new - some of our subnets are still hand jammed - Spin up a new subnet in CloudFormation and start to replace those in our infrastructure piece by piece.
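As a rough sketch of the "spin up a new subnet in CloudFormation" approach Erik describes, the snippet below builds a minimal template as a Python dictionary and prints it as JSON. The logical ID `ReplacementSubnet`, the VPC ID, the CIDR block and the availability zone are all made-up placeholders, not values from the episode.

```python
import json


def subnet_template(vpc_id: str, cidr_block: str, availability_zone: str) -> dict:
    """Build a minimal CloudFormation template that declares one subnet."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Replacement subnet managed as code (illustrative only)",
        "Resources": {
            # "ReplacementSubnet" is a hypothetical logical ID.
            "ReplacementSubnet": {
                "Type": "AWS::EC2::Subnet",
                "Properties": {
                    "VpcId": vpc_id,
                    "CidrBlock": cidr_block,
                    "AvailabilityZone": availability_zone,
                },
            }
        },
    }


# Placeholder values; a real template would take these as parameters.
template = subnet_template("vpc-0123456789abcdef0", "10.0.42.0/24", "us-west-2a")
print(json.dumps(template, indent=2))
```

Workloads can then be moved onto the new, code-managed subnet one at a time, and the hand-built subnet retired once nothing depends on it.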

Darren:
You bring things into the new instead of...

Erik:
Exactly, yep. And migrate them over that way.

Darren:
Real challenges with a lot of the mature environments, for sure it becomes really interesting.

Erik:
Yeah, there is..reminded me of a fun story - in one of our repos. There was just an old CloudFormation template that was, I can't remember the name of it you know, 'something VPC and so many subnets' .yaml. And I just, I don't know what this - I haven't seen this - this isn't something we use..and I deleted it. And then, like a week later I end up looking, and I was like: Oh, this was actually the original, yaml file that somebody uploaded to create, our original VPC. It's still hand jammed so the subnets looked nothing like the subnets in that file. But here is this CloudFormation stack number one that still exists. And I deleted the code for it. And I was like, Oh, well I guess I should restore that? Even though it's drifted so far now.

Darren:
One of the really nice things I like about CloudFormation is the fact that you have things available through the console, and it's so easy. Everything becomes self documenting and very easy to retrieve that information and rebuild it.

Erik:
At least now, you can also then look at something and go look at the tags and you can find out that, oh, this ties back to this CloudFormation Stack through there. So I know...

Darren:
They've done a really good job of the auto tagging on all the resources. That is pretty sweet.

Erik:
...so I don't have to go see, was this hand jammed, was this not, and update accordingly, if I need to.
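The tag lookup Erik and Darren mention can be sketched in a few lines. CloudFormation adds tags such as `aws:cloudformation:stack-name` to many of the resources it creates (not every resource type supports tagging). The function below assumes the tags arrive as the list of `Key`/`Value` dictionaries that most AWS describe calls return; the sample tag values are invented.

```python
from typing import Optional

# Tag key that CloudFormation applies to taggable resources it manages.
STACK_NAME_TAG = "aws:cloudformation:stack-name"


def owning_stack(tags: list) -> Optional[str]:
    """Return the CloudFormation stack name from a resource's tags, if any."""
    for tag in tags:
        if tag.get("Key") == STACK_NAME_TAG:
            return tag.get("Value")
    # No stack tag: the resource was probably created by hand.
    return None


# Invented sample data in the shape returned by typical describe calls.
managed = [
    {"Key": "Name", "Value": "app-subnet-a"},
    {"Key": "aws:cloudformation:stack-name", "Value": "network-core"},
]
hand_jammed = [{"Key": "Name", "Value": "legacy-subnet"}]

print(owning_stack(managed))
print(owning_stack(hand_jammed))
```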

Darren:
Right. So what about CDK - Cloud Development Kit? Tell me a little bit about your experience with it or your thoughts on working with it.

Erik:
My experience with the CDK is I was at re:Invent. They announced it there, and I got really super excited about it. I went to a few conversations or a few classes/sessions about it. Tried to learn as much as I could. I wanted to run at it full bore, got into actually exploring it when I got back from re:Invent and unfortunately found that there were some hang-ups. So we needed to not pursue that at the time. We had that sort of typical AWS experience, and I mean no disrespect to AWS, that they release something and it may or may not quite be ready for full showtime. Yeah, because that worked really well for the narrow specifics that they built for.

And that was sort of our initial evaluation. And unfortunately we ploughed full steam ahead with CloudFormation, so that going back and looking at CDK has not really been a viable option for us. That being said, I kind of followed it along because it's something I'm super interested in and really want to take advantage of. At some point, I think it's really the direction infrastructure as code will probably end up going moreso.

Darren:
CDK definitely has legs. It's a real challenge, though, for a lot of organizations, I would argue, and this is almost a religious argument because CDK has a big following at this point, but I would argue that it adds a lot of complexity in the aim of making things simple. There's nothing inherently wrong with that. That's the whole idea of complexification. I mean Andy Jassy used it at the keynote for AWS (re:Invent). But you want to start simple and sort of build because you're always gonna build complexity, right? It's...entropy. I think there's a couple of challenges I see with CDK for a lot of organizations. One is defaults.

One of the nice things about CDK is you have these simple little constructs or simple little modules - here's an S3 bucket, or here's whatever it is and instead of writing a bunch of code or copying and pasting a bunch of code in CloudFormation, it will do that for you. It'll synthesize the template for you from one line. You could, you know, it's easily ten to one savings in terms of the amount of code you have to write, which I think is fantastic, but it's also relying on defaults. And I think this gets into a really interesting place because if you look at some of the - I was actually looking over some of the documentation on CDK and it talks about the idea that oh, you don't need to be an expert in these resources to use CDK. And I'm thinking, I think there's a problem with that because when you think about things like security, reliability, scalability - all the well-architected components, you can put yourself in a really difficult spot if you're just trusting those defaults.

And even if they say they're the best practices or they're meant to be secure, there's this trust element, and it gets to...I've been thinking about this a lot: the idea of the shared responsibility model. The idea that AWS is responsible for security of the cloud and the customer is responsible for security in the cloud. But when you start using these frameworks which are abstracting out all the details, all the properties of these resources - to me it gets really challenging to...you're starting to create a lot of grey areas in that space, and so that's one of the things that I just think about. And I'm trying to figure out how to reconcile that a little bit.

Erik:
Yeah, there's a couple of things there that I think the first thing that comes to mind is the idea of complexity vs complicated. And I think the cloud inherently is complex. You try and build frameworks, build modules, build tooling on top of that, and that removes making it complicated. There's a few writings around that sort of split of those ideas. I think CDK is attempting to remove bits of making it complicated. I think it does it in a way that is somewhat opinionated, that may or may not fit everybody's use case.

So I think about when I was going to some of those re:Invent sessions and in one of the smaller workshops, and they had a Q and A, and I have to go back and find that - I talked to you briefly about it - and I really need to go track back down that verbiage exactly. So don't quote me. This isn't any official AWS documentation as far as I can recall, but they basically said, they really only expect a percent adoption of CDK. They expect most people to do things through the CLI, through CloudFormation, through the console, through some other tooling. They're not trying to make it be all things to all people. That being said, I think that style of infrastructure as code will become more prevalent than CloudFormation yaml, which is not super sexy or exciting to write or, you know, Puppet or Chef or whatever yaml file you're writing in.

Darren:
I love yaml. Yaml is so sexy.

Erik:
It's Yet Another Markup Language. It's in the name - people are already tired of it. If you want to get software engineers excited about it, you make it code. That's where I kind of got more excited about things like CDK and I think Terraform can offer some value as well, that you can actually write "code code". Now the problem with it, I think like a lot of frameworks, is the black box-ness of it. I talk about this with my team. Oh it just works. You just plug it in and it spits out whatever you want. Well, that works until I need to change this one thing, and then it becomes magnitudes more difficult.

There are a few frameworks out there in coding that I can think of that fall into that. Like I've done a lot in Ruby on Rails and have had a lot of problems with some of the Rails constructs because as soon as you try to do something not in a Rails way it becomes magnitudes more difficult.

Darren:
So depending on your team and their abilities, this becomes that much more to manage. When you're dealing directly with something like Terraform or CloudFormation templates, they're just easier to sink your teeth into. And again, I know a lot of people would maybe completely disagree with this, but that's in my experience. It's much easier for people, especially if they're not subject matter experts, to really sink their teeth into those instead of having to build out this framework in Python or Node or whatever else.

Erik:
Yeah, I think it goes back to the cognitive load of it. What capacity does your team have for extraneous cognitive load? Do you have a team that can dedicate itself to building a bunch of CDK, CloudFormation whatever templates, modules, Terraform or whatever structure you use - Chalice - Or do you have, you know, a small group of two or three guys that are trying to, you know, cover a lot of bases? What's going to be an efficient and effective way for you to make those changes? And to do it in a way that is maintainable, readable and all those things that infrastructure as code buys you.

Darren:
I agree. Good point.

What do you do when you're trying to figure out something new? What are the resources - you know you're trying to build out some CloudFormation templates. How do you go about that process?

Erik:
Yeah, so we were talking about this earlier, so this is a little bit of a leading one, which is super fair. I generally have bookmarked - it's one of like, five things I have bookmarked in my browser - the AWS resource and property types reference in the AWS CloudFormation documentation. The majority of our stuff is in CloudFormation wherever possible. So, I'll start there, and I'll actually read through the documentation. I actually find that documentation for me does a better job of explaining the capabilities of features than maybe some of their marketing material. So I actually start there. And then it's usually a matter of starting to write out a CloudFormation template. And then I'll just kinda like bang my head against the wall, trying to get this thing to work. Trial and error to get it to go.

Darren:
That documentation is critical, especially since it's changing all the time. We all know that when AWS releases a new service or feature - that actually isn't necessarily going to be available in CloudFormation. That's ultimately handed off to the CloudFormation team who are sitting there underwater, buried in all the things that they have to build out for all these different services. And it could take a while. And so that documentation is so important, because every time, even if I have, you know, I have a whole library and sets of modules that I pull from, to construct a solution every time I need to do something for a client. But even though I start with templates I have already built, I'm always going back to the documentation to make sure that there's nothing new and really critical in terms of having to incorporate those in my template. Things that I think about that were in the last year or so - I don't even know - is things like in S3 buckets - new ACL features available to shut it down to public access that now I can incorporate into my templates and actually use that. So lots of examples like that.
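One plausible reading of the S3 feature Darren mentions is the bucket-level public access block, which the `AWS::S3::Bucket` resource exposes as `PublicAccessBlockConfiguration`. That mapping is an assumption, since the episode does not name the property. A minimal sketch of a reusable bucket module with those settings turned on, using a hypothetical logical ID `PrivateBucket`:

```python
import json


def locked_down_bucket() -> dict:
    """Return an S3 bucket resource with all four public access blocks enabled."""
    return {
        "Type": "AWS::S3::Bucket",
        "Properties": {
            "PublicAccessBlockConfiguration": {
                "BlockPublicAcls": True,
                "BlockPublicPolicy": True,
                "IgnorePublicAcls": True,
                "RestrictPublicBuckets": True,
            }
        },
    }


# "PrivateBucket" is a made-up logical ID for illustration.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {"PrivateBucket": locked_down_bucket()},
}
print(json.dumps(template, indent=2))
```

Keeping the settings in one small function mirrors the "library of modules" idea: when the documentation adds a new property, there is a single place to update.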

Erik:
Yeah, we had one this week where Fargate changed, from 1.3 to 1.4, how they bind volume mounts. It was just like, this works in our older stack, why doesn't it work in the new...

Darren:
One of the big ones for me in the container world were capacity providers, which..they are still not quite there yet, but that's another story. But at least you can actually build those into the CloudFormation templates fairly reasonably, which is nice.

Erik:
And at least for us, we attend and monitor re:Invent wherever possible to see the new big announcements. We just kind of have baked in at this point that we're going to need to wait eight or nine months before we even start looking at this stuff because we know that it needs to get over to the CloudFormation team and needs to go through a few cycles. And this is just sort of AWS' pattern that they have. They announce something. It's usually 80% there, and so we get to watch it mature and hang out and give feedback at times too, as well. They are pretty good about that.

Darren:
So you mentioned your way of going back to the documentation and then sometimes beating your head against the wall, which I think is pretty much a rite of passage with infrastructure as code, beating your head against the wall...

Erik:
And re- re- re- re- re- uploading my CloudFormation template for the 900th time.

Darren:
Yeah, the one thing that I will do at a certain point when my head is really sore, is I will reverse engineer it. There are things that can be so complex to do in AWS, just figuring them out on your own, that it's easier to actually do it through the console and then do a describe against the resource using the CLI, using boto, using the SDK, you know, really describe that. And then you have all the information you need to then build it back in the template, which you now have for posterity for your different environments, just parameterized or however you want to do it. So it's so quick to spin up these resources sometimes that you forget that that's actually going to be quicker than running it through CloudFormation, and then I have to delete that stack. And you know, when you're dealing with anything networking related and you delete a stack, it might take a minute. It might take 15 minutes and, you know, you're sitting there waiting for this stuff to happen. Just spinning cycles.
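The reverse-engineering trick Darren describes could look roughly like this for a security group. The sketch assumes the resource was built by hand and then described with something like boto3's EC2 `describe_security_groups`; to stay self-contained it works on an invented sample in that response shape instead of calling AWS. It only translates CIDR-based ingress rules, so it is a starting point for a template, not a complete converter.

```python
def security_group_to_resource(group: dict) -> dict:
    """Turn one described security group into a CloudFormation resource dict."""
    ingress = []
    for permission in group.get("IpPermissions", []):
        for ip_range in permission.get("IpRanges", []):
            rule = {
                "IpProtocol": permission["IpProtocol"],
                "CidrIp": ip_range["CidrIp"],
            }
            # Port fields may be missing, for example on all-traffic rules.
            if "FromPort" in permission and "ToPort" in permission:
                rule["FromPort"] = permission["FromPort"]
                rule["ToPort"] = permission["ToPort"]
            ingress.append(rule)
    return {
        "Type": "AWS::EC2::SecurityGroup",
        "Properties": {
            "GroupDescription": group["Description"],
            "VpcId": group["VpcId"],
            "SecurityGroupIngress": ingress,
        },
    }


# Invented sample shaped like one entry of a describe_security_groups response.
described = {
    "GroupName": "web-tier",
    "Description": "Web tier access",
    "VpcId": "vpc-0123456789abcdef0",
    "IpPermissions": [
        {
            "IpProtocol": "tcp",
            "FromPort": 443,
            "ToPort": 443,
            "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
        }
    ],
}

resource = security_group_to_resource(described)
print(resource)
```

From here the hard-coded VPC ID and CIDR ranges would normally be swapped for template parameters, and the hand-built group deleted.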

Erik:
So, yeah, you brought that up recently and I just definitely had one of those, like, I'm an idiot. Like, why did I never think about doing it that way? You know, I'm sitting there watching these progress bars over and over again. Just been like: There's gotta be a better way and you mentioned that and I was like, Oh, that's the better way.

Darren:
Because it feels like cheating sometimes, right? I mean, we're all about solving problems - that's what we enjoy doing and it almost feels like it's cheating. But, you know, at the end of the day, you're trying to deliver business value to the company you're working for.

Erik:
It's just one of those - I think it also speaks to the mindset shift of cloud, that we talked about right - pets, not cattle, or we have some vegetarians that we work with. So I like to joke about that.

Darren:
You meant cattle not pets, right?

Erik:
Cattle not pets, yes...What did I say? The other thing I say is they are crops not house plants for our vegetarian listeners.

Darren:
Very nice, thank you.

Erik:
Don't need to be about slaughtering animals. I can spin something up in AWS to test and figure out how it works and just pull out all the information and just delete it like it doesn't matter. For some reason there's still - that can still be a hesitancy to be 'Well, I don't want to spin it up by the console because I want to do it in code' and I forget that I can then delete the thing that I spun up by hand.

Darren:
Right. Ultimately, at the end of the day, you have to come out with a good pattern that you can deploy again and again and again. That's what you're after. So whatever works.

Erik:
Yeah. As we were coming in this podcast and thinking about it today that was a thing that we talked about. Why do we do infrastructure as code? Why are we doing this? Why is it so important? There's lots of reasons I'm sure we're going to talk about that here in a minute. But the idea that, there's an up front cost to it - You have to pay that upfront cost to do it in code to figure out how it works. It's easy to do it through the console once. It's a lot harder to build something that's reusable, readable and flexible enough, right?

Darren:
And I've got to tell you, as a consultant, I work across multiple clients, and I have these libraries of whether it's CloudFormation or Terraform that I literally - you're right - I put the work up front and again, even though you always want to revisit and modernize them a little bit - but I can copy and paste this stuff, change some parameters, and here you go, and it's available. Because there's a lot of common patterns, especially if you do a modular design. I'm all about doing a much more modular design. One thing that I was going to say - one other way that you could certainly build CloudFormation is to steal from all the libraries and GitHub repos that are out there. AWS certainly has lots of those, and I do use those sometimes, however, they tend to be very large and monolithic in their approach because they're trying to be all things to all people, so they have every condition you can imagine. All these custom resources - and they get fairly complex fairly quickly, and so that could be a real challenge when you're trying to build a modular design to really, you know, the Legos of the cloud, to really mix and match for a particular project or particular solution.

So I really like that modular design and what I have now is this library to choose from. And it becomes so powerful like, Yeah, I put the work up front, but now it's literally seconds, you know, to do these kinds of things.

Erik:
Yeah, part of me is very jealous that you have all those and you should open source them to me so that I can steal them.

Darren:
One of the things about open sourcing, though, is then it has to be like, perfect. So then you have to really spend that time.

Erik:
I've looked at some open source projects...I don't know if that's true, Darren.

Darren:
I feel like it needs to be pretty dialed. And you think about those edge cases and those sorts of things because the last thing I want to do is hand over a bunch of crap to someone that's not going to work in a particular sort of environment. So the fact that I have it easily at my disposal and I could deploy it across these accounts and tweak it is great, but you're right, I do need to open source more..I do.

Erik:
I'm just saying I'm gonna steal and rip it off because I think you and I think about a lot of these things the same way. There's AWS labs and AWS example repos out there that I think are tremendous that they put out which is super helpful. But they do things in their very AWS-y kind of way that - what you said is that it is just trying to cover all bases for all people, and you do things in a certain way in a certain pattern and practice that they follow that's pretty close, but not quite what we do so I feel I have to take these down, tweak them, delete them, clean them up, move stuff around shuffle.

Darren:
Sometimes it's easier to start with the basic documentation and build up instead of working things out with all these dependencies and everything else.

So let's talk about a fun part of infrastructure as code. It's the four 'Ps' for me. The ports permissions, parameters and policies. The things that I find the most challenging to incorporate into infrastructure as code, and I think you do as well. So talk about how you kind of deal with some of that and some of the challenges there.

Erik:
'star colon star (*:*)', and call it good.

Darren:
Yeah, best practice.

Erik:
We don't have to be compliant. We don't have auditors, we're a small private company. No, I'm totally kidding. Yeah, that is hard. And it is one of those things that the pendulum can swing in so many different directions. And it's so hard to get that dialed in to be right and to be easy to use and to be, to really follow the AWS model of least permissions or least privilege. And to really do that well.
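To make the joke concrete: a policy that allows every action on every resource is the opposite of least privilege, and it is easy to check for mechanically. The sketch below scans an IAM policy document for `Allow` statements whose `Action` and `Resource` both include a bare `*`. The sample policy and bucket ARN are invented, and a real review would also need to consider partial wildcards such as `s3:*`.

```python
def as_list(value) -> list:
    """IAM allows a single string or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def wildcard_statements(policy: dict) -> list:
    """Return Allow statements that grant all actions on all resources."""
    flagged = []
    for statement in as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = as_list(statement.get("Action", []))
        resources = as_list(statement.get("Resource", []))
        if "*" in actions and "*" in resources:
            flagged.append(statement)
    return flagged


# Invented policy: one wide-open statement and one scoped statement.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
    ],
}

for statement in wildcard_statements(policy):
    print("Too broad:", statement)
```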

Darren:
The challenge I have with this...So let's take one of those, for example - ports. You could think about things like security group configurations. A lot of times I'll find myself in a situation where the business really needs something. There's multiple people trying to troubleshoot a particular application. It might be an initial build out in the development environment, but it doesn't matter. There's still three or four people that are involved in some level of troubleshooting. And even though we've deployed, everything as code up to that point, they're like: Hey, it might be this security group, it might be this port - You know, we don't know exactly what's going on. Can you open this port? So you get in this situation where you have people literally on the phone, on Zoom, in-person that you're really trying to accommodate. You need to move fast.

And, of course, if each time you're running things through a template or through a module and/or through a pipeline and waiting for those cycles to happen, it could be really challenging. So what do you do? You open up an inbound port, you open up an outbound port manually, maybe through the console. And then, of course, you have to sort of backfill that once it's dialed in, which certainly can be done. But you can very easily lose track of this kind of thing, and the same thing's going to go with IAM policies as well.

Erik:
100%

Darren:
And then it becomes - and this is where it's interesting - because I found with CloudFormation and things like security groups, especially with really complex security groups - lots of ingress and egress. If you move fast, made lots of changes, getting things, getting the drift, managed - Managing that drift - can actually be really challenging whereas something like Terraform, it's a little bit easier because of the way Terraform manages state. It's one of these areas where I don't think there's really good solutions for yet, especially when you think about what's happening with the business on a day to day basis. Like in theory. Hey, everything is code - you have to do everything as code. But in practice, when you have those business pressures and in the stand ups, it's being called out that this is a blocker because we can't get our work done and we need to figure this out. It just creates this pressure around it.

Erik:
Yeah, I guess, to me, as we're sitting here talking about it, it reminds me of old school C style and java style garbage collection. In modern frameworks it's really not a big deal anymore. A lot of them handle it really well. But previously, you had to pay tons and tons of attention to make sure that that stuff is well taken care of and that, like it was set to do those sorts of things. And I think drift detection and this sort of combo ability to do things by hand and do things through infrastructure as code is a similar kind - It takes a level of due diligence. It takes a level of expertise and it takes a level of commitment to make sure that you follow through on those things. It's important enough that you prioritize that as an engineer. Certainly that makes it hard. It requires that human element, it requires that human intervention. I have definitely pulled up a security group and been like, oh, whoops. I remember doing that to make it - because I was testing something to see if I could get this thing to work, and that port has remained open and that port should definitely not be open.
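The forgotten test port Erik describes is a small drift problem: the live security group has a rule the template never declared. CloudFormation has its own drift detection for this; the sketch below only illustrates the underlying comparison, assuming both the declared and the live rules have already been reduced to simple dictionaries with `IpProtocol`, `FromPort`, `ToPort` and `CidrIp`. All rule values are invented.

```python
def rule_key(rule: dict) -> tuple:
    """Identify a rule by protocol, port range and CIDR."""
    return (
        rule["IpProtocol"],
        rule.get("FromPort"),
        rule.get("ToPort"),
        rule["CidrIp"],
    )


def undeclared_ingress(declared: list, live: list) -> list:
    """Return live ingress rules that do not appear in the template."""
    declared_keys = {rule_key(rule) for rule in declared}
    return [rule for rule in live if rule_key(rule) not in declared_keys]


# Rules as written in the template (invented values).
declared_rules = [
    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443, "CidrIp": "0.0.0.0/0"},
]
# Rules actually on the group, including a port opened by hand while testing.
live_rules = [
    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443, "CidrIp": "0.0.0.0/0"},
    {"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432, "CidrIp": "0.0.0.0/0"},
]

for rule in undeclared_ingress(declared_rules, live_rules):
    print("Opened outside the template:", rule)
```

Each rule reported is either backfilled into the template or closed, which is the follow-through step the conversation keeps returning to.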

Darren:
There's probably a whole other discussion there around configuration. Things like AWS Config and looking at ways to sort of manage those sorts of rules, and that's maybe another conversation. But I do think that area is interesting because it's an area where the tension between DevOps and the rest of the organization can really be a challenge. Even larger organizations that might have a cloud center of excellence where it's like, Hey, we're doing everything as code or following all these best practices. This ends up being an area that slows things down. And so when it comes to the business, these are easy things to say on paper. But if ultimately every day it's a matter of: 'This is slowing us down. This is slowing us down.' It creates this tension and it doesn't matter if it's a dev, stage, prod environment. It happens in all environments in my experience, in organizations large and small, and I just think it's one of those areas of DevOps that just still is messy because the whole idea of wanting to innovate and move quickly when you're talking about things like ports, parameters, policies, permissions and again all those expectations around trying to move projects along, this area, for me it's not cut and dry.

Erik:
Yeah, and to me there's kind of two sides to that coin. I always use "the business", right? As engineers, we're part of the business, we have investment in that. So, I struggle with the business part of that. It's not a sexy thing, and it does slow stuff down. And so it's hard to say "Hey, we're late on shipping this new feature that we have promised customers because we couldn't get our security, permissions, ports locked down". That's not great. And nobody wants to hear that. And the other side of that is that I have struggled with engineers and developers who have no interest in - we've talked a little bit about some of that cognitive load, like it might be in code, it might be in something that is accessible to them, but they still operate on that old mentality that operations is gonna, you know, take care of the stuff. I just write the code, I throw it over the fence.

Darren:
What a great segue.

Erik:
I know. I thought you'd like that one.

Darren:
Thank you. Because we did outline this a little bit around the different patterns for how to structure teams and how company culture might play a role in some of those decisions in that direction. And I think to me, this is the messiest part of DevOps. Because the whole idea of the DevOps piece being extraneous cognitive load for developers - Developers have sort of three things to focus on: There's kind of the core code, there's the business logic and then there's all this other crap around the infrastructure and resources and everything else. And now you have these pipelines that when you think about things like containerized pipelines, which are really easy, but at the same time, you have all these templates, this code that's being built into the pipelines. Who owns it? Who's responsible for it? What does that do by way of helping the teams along or potentially creating even more of a mess in this environment? What does it look like in the organization? In terms of where is the line between what DevOps does and what the developers do, and what does that look like?

Erik:
Yeah, I don't know that we have a good distinction there. So a little background for people is that I actually started at Training Peaks as a software developer. And I have always had an interest in the AWS side of things and the DevOps piece of it. And all my companies have had some level of engagement with that and I was given the opportunity at Training Peaks to step into that. So for me, I come at it from a developer standpoint. I actually think I'm ill-suited to answer this question because I don't understand the developer who doesn't want to view and understand or take into account how their code runs. And to me, that's part of the environment with AWS. It's kind of how I got involved in it. So to me, it's everybody's responsibility. I might be more of a subject matter expert in some of the AWS stuff, but you should feel just as comfortable running into my world as I do, jumping into our main code base and making changes. And I've been up and down that stack. So, I actually struggle with that whereas, I think you have more of a historically...

Darren:
IT background.

Erik:
...IT background. I was going to say system administrator, but you've done way more than that. Yeah, I think...

Darren:
Because the challenge is everyone can't know everything, especially given that anyone you talk to who works with cloud technologies - even the people that work for AWS - says, "Gosh, it's changing so fast. I'm just trying to keep up." So you have these developers that are trying to focus on bringing value to the business, but they have to deal with all this pipeline crap and these templates and everything else. And so ultimately, I think, how it needs to evolve is really challenging for organizations, because what you really need are blended teams.

Erik:
Yeah

Darren:
You need to support the developers if they're getting stuck on a template, and this is a lot of what I do, by the way. I engage deeply with developers - here's your basic template - because most organizations only have a handful of patterns for how they deploy applications. They're doing a containerized approach, so you can build out templates for that; they're building out some basic - might be an EC2 application, whatever it is. There's some handful of patterns, with the occasional, I don't know, S3 bucket thrown in and some other things - CloudFront and whatever services they're working with. There's basic patterns that you could establish and basic templates that you can build for that. But there's always going to be exceptions, there's always going to be things that go wrong.

So if you have that subject matter expertise in-house, and the right culture where people feel comfortable going to the subject matter experts to engage as they are building these things out - to me, that's the solution. But culturally, a lot of times that doesn't work in organizations and it's really challenging - every organization has some amount of silos involved in it. And then if you have organizations that have higher security, higher compliance needs, it gets even more challenging because of things that developers simply shouldn't be doing in particular environments. To me, it's a scenario that's still evolving. Again, I keep using the word messy because I do think that there's a lot of areas in DevOps that are very messy now. But I think that this is where leadership of organizations really plays a role, because to have those blended teams, if you think traditionally about how all teams were structured at all companies, there are these natural silos that develop. But DevOps is a place that can either be this wall, like "Hey, here you go DevOps, do this," or it could be this really facilitating part of the organization to really help the developers move things along, help develop new patterns, new architectures. So I think it's a really exciting place to be, but really challenging.

Erik:
Yeah, and I'm gonna plug a bunch of books now at this point because, like, that's really kind of what opened my eyes to a lot of things. I started with The Phoenix Project, reading that, like, oh my god...

Darren:
Ahh yes, Brent.

Erik:
Yeah, you know, wanting to be Brent, wanting to truly be a Brent and then realizing all the issues with that. And then, you know, getting into The DevOps Handbook, getting into the Google SRE book, getting into Accelerate. And then the last kind of bigger one that I read - I think Octopus Deploy put together a good chart on, like, the DevOps - I don't remember who it was - it was Team Topologies, which kind of talked a lot about what you're talking about. They use Conway's law: basically, your organization's structure will determine the structure of your architecture. How do you build your organizations? How do you structure your organizations so that you can build your software architecture to match that? It does a really good job talking about that. It really kind of comes down to a bunch of different team types, and this is totally another topic to talk about another time.

But to your point - there needs to be a mix. There is that kind of talk about how every team sort of follows this fractal pattern: as you zoom in on teams, you're going to have the appropriate team size and the right people. You're going to have people in there who are a little more DevOps focused, and people maybe a little bit more focused on the code side of things or whatever. I do think people have a tendency to self-select some of that kind of stuff, and, you know, a good leadership, a good organization is going to take advantage of that. I've worked with really great developers who don't really know a lot about AWS. They understand how code runs on a server and stuff like that, but they don't know AWS. I've worked with other ones that get really engaged with that and take advantage of that. And they're both really good developers. It's just kind of where their interest is piqued and where they kind of go and drive. And I think if your company can maximize that and listen to what people have to say and take that feedback...

Darren:
Yeah, I totally agree.

Erik:
...you're going to get a really good team.

Darren:
Yeah, I totally agree. I think it really does come down to leadership and management of those teams, because otherwise what you end up with is - you're relying on individual contributors who might have that greater passion on this one team for the cloud technologies. And so they're going to build out these patterns for that team that really help. But another team - maybe it's front end, back end, whatever it is - might not have that particular resource. If the leadership isn't really overseeing this and really looking at how to bring these teams together and take the best advantage of those DevOps resources, you end up with sort of these siloed organizations, even though you have the skill sets there. So to me it really is about leadership.

Erik:
Yeah, and I think also you need to push back to people that there is some expectation that they're going to take on some of these DevOps sort of principles. So if you move to a CI/CD pipeline model, you let the developers know that you guys are responsible for watching your release. At Training Peaks, we've redefined what "done" means. What done means is that it's in production and it's running as expected. It doesn't mean that my PR got merged. It's like you're expected to watch things go all the way through, as a developer. And it's my job as the SRE to make sure that you have the tooling and are familiar enough with the tooling to be able to monitor that sort of effectively. How successful I am at that is questionable. But you know, we're a big organization trying to do a lot.

Darren:
Well it's ever-evolving.

Erik:
Exactly. It's a moving target for sure.

Darren:
That brings up the whole realm of observability, which is a whole area that's blowing up right now as well.

Erik:
And maybe one of my favorites, yeah.

Darren:
Yeah, and doing that as code gets really interesting. So maybe another conversation.

Erik:
Oh, my God, yeah. That's such a mess.

Darren:
You know, I think we're getting towards the end. There's so much more to talk about. But from a timing perspective, I think we're good.

Well, I think this concludes our first ever groundbreaking Cloud Coffee Talk AWS edition. Erik, thanks for joining me. Hopefully, people will find this interesting. And there's more to come.

Erik:
Yeah, thanks for having me. As always, you know, part of this was born out of just you and I always talking. And so one of my favorite things is kind of talking shop, and I have a very small and narrow experience. So not only am I interested to see how this is received, I'm also interested to hear what people have to say about what we have to say.

Darren:
You're modest as always, Erik.

And thanks to our listeners for tuning in. You can find us on Twitter at @cloudcoffeetalk. We welcome all your feedback. Just, you know, be nice. It's nice to be nice. Until next time, have fun in the cloud.