ShipTalk - SRE, DevOps, Platform Engineering, Software Delivery

ShipTalk - The value of IDP at AWS, the $1M cellular outage, and the most useless beer cooler ever - Kyle Shelton - Consultant to Toyota Racing

April 04, 2023 Jim Hirschauer Season 2 Episode 4

In this episode of ShipTalk (The SRE Edition), Kyle Shelton tells us about his experience using and building developer platforms to enhance the developer experience. He also shares his favorite hobby and a cautionary tale about a preventable outage that cost ~$1M ... ouch! Be sure to check out Kyle's blog site at https://chaoskyle.com/

Introductions
Just for fun #1 - Kyle's favorite hobby
Main topic - The value of IDPs and some advice on getting started
Just for fun #2 - Kyle's worst IT mess-up

Jim Hirschauer:

Alright. Welcome to ShipTalk. I'm Jim Hirschauer, your host for today. ShipTalk is a DevOps podcast brought to you by Harness, the software delivery platform. And we usually talk about reliability topics, but today we're gonna be talking about developer experience. So my guest today is Kyle Shelton, who's a consultant for Toyota Racing. Kyle, welcome to the show.

Kyle Shelton:

Hey, thanks for having me.

Jim Hirschauer:

Yeah, absolutely. Kyle, could you please take a minute to share your background and what you're up to these days?

Kyle Shelton:

Yeah. So I got started in the technical field early on. My father owned a telephone company, so I've been pulling Cat 3 and Cat 5 cable since I was about 10 or 11 years old. And my first real job was with Verizon Wireless. I was a telecom technical specialist, and I also helped build the nation's first Voice over LTE network with multi engineering. My career started in the cloud as a network engineer. I got a degree in network engineering from Westwood. And, you know, I had to shift to the cloud when I was working with a startup and they decided to do what a lot of companies are doing now, and that's get rid of all of their on-prem and co-location data centers and go 100% cloud native. So basically my VP of engineering said either get an AWS cert and have a job, or go find another job if you wanna be a network engineer. So that's how I got my start in cloud and DevOps, and that was my first DevOps engineer job. And I went from being strictly on the operations, systems admin, network engineering side of the house to now getting more involved with the software developers and, you know, improving the developer experience. And so currently I am supporting the developers for Toyota Racing. I spent some time with AWS and Splunk as well, as an SRE and DevOps engineer.

Jim Hirschauer:

Fantastic. Thanks for sharing that. Kyle, I think you're familiar with the format of the show, and we don't like to jump straight into the beefy tech topics here. We like to start off with something I like to call Just for Fun. So for this segment, Kyle, I'd love for you to share: what's your favorite hobby outside of tech?

Kyle Shelton:

Yeah, so great question. I actually love ice fishing. And you say, well, don't you live in Texas? Where can you go ice fishing? Right. Well, actually, when I was working at Verizon, I got shipped out to Colorado Springs, and that's where I experienced ice fishing up there in the high alpine lakes. And so I'll be honest, my favorite thing to do is fishing. Being up in the mountains and, yeah, sitting over an ice hole at negative 10 degrees, you know, freezing your butt off. And there's the work it takes with the higher elevation too. There's not a lot of oxygen, and on some of the lakes you can't drive a four-wheeler or anything, so you gotta trek all your gear out into the middle of the ice and hope that they'll come and bite your bait. But, you know, the actual fishing itself is kind of like what I think of when I fished as a kid on the lake dock, right? With a hot dog and a little bream, and you just throw the hot dog down there and the fish comes up and gets it. And that's kind of like what it is in this format because you're on a 30- or 40-foot lake and it's crystal clear water. These are the lakes that sit at the bottom of the Continental Divide, right? And the scenery alone is what kind of mesmerizes me. But yeah, you're in a hut. You drill a hole and you watch down and you'll see, you know, big old trout come, and you got a little bitty pole with a little bitty line, and you have to have real finesse. You can't, you know, jerk 'em too hard or pull 'em out of the hole too fast, right? 'Cause they'll just break off. So it's a lot of fun. It's kind of crazy. Your body kind of gets shocked after you're done and you're real tired because of all the, like, I guess, exposure to the frigid temperatures. I do have all the gear, we have heaters and stuff, but still, I mean, it's pretty intense, and it's one of the things me and my wife like to do, go up there. We didn't get to go up this year because we got a little one coming here in a couple weeks, but we'll definitely be out there next year.

Jim Hirschauer:

Awesome. Well, I've never gone ice fishing before. I've done other types of fishing. For somebody who's never done it before, what are your top three pieces of advice if they want to get started with ice fishing?

Kyle Shelton:

Yeah, so I would research where you are and always look at safety measures. It's very dangerous. Obviously, you can die from hypothermia if you were to fall through. So make sure you're familiar with the safety and what is safe and what is not safe for the ice. Number two, I would go with someone that knows what they're doing first, because like I said, there's a lot to it. Whether it's, you know, what type of auger to buy or what type of gear to use, definitely don't just buy a bunch of stuff and go out there before going out with a guide or something just to kind of learn the basics. And then the third thing, which was my biggest mistake, was: you don't need a cooler. So, funny story. The first time I went, I went out for a weekend with some of the guys I work with up there. These are all native Coloradans. And, you know, they pull up in the RV and I roll out and I'm thinking, all right, it's fishing. We gotta have a cooler full of beer, right? And so I literally rolled the cooler on top of the ice and met 'em out there. And they looked at me and they kind of looked at each other and they were like, what are you doing, dude? And I was like, well, we're fishing, right? We gotta have a cooler full of beer. And they're like, you know you're, like, standing on a cooler, right? You don't need a cooler to keep the beer cold because you're on an ice block, buddy. And so the biggest thing is that if you do go ice fishing, do not bring a cooler. You do not need it.

Jim Hirschauer:

That's great advice. You know, the safety advice as well.

Kyle Shelton:

Yeah. Yeah. The safety advice is big, right? You can get little ice pick things, and understand that if you do fall through, you know, don't panic. It is crazy because although you might think it's thick throughout, you can have spots that are not safe. And so I would definitely be aware of all the safety hazards and conditions before you go out there.

Jim Hirschauer:

For sure, for sure. So every fisherman has their fisherman stories about the one that got away. Do you have any great fisherman tales about your one that got away?

Kyle Shelton:

Yeah. Yeah, it's actually my wife. So the last trip we went on, she had what I would call a master angler. So I've caught one, I've got it hanging on the wall over there, and it was, you know, the day before I was moving back to Texas from Colorado, I went ice fishing one last time. And I caught that fish basically 10 minutes before sundown. I was out there all day, caught nothing. Didn't get a bite, and was like, well, this is how I'm gonna leave Colorado, you know, with the worst fishing day of my life. And sure enough, you know, it only took one bite and I caught a pretty decent-sized one that night. And then I caught the one that I have mounted on my wall, which was about eight and a half pounds, 25 inches. It's a certified master angler rainbow trout. And so last year my wife had one about that same size all the way up to the hole and, you know, we got really excited. I got more excited than she was, right? And I flipped the ice hut up and I was like, oh. And then she got right to the top and it broke off. But after that she looked at me and she's like, okay, I get it. I see why you're so obsessed now. She's like, it's not the catching, it is literally the one that gets away, and that, like, you're gonna catch that fish one day no matter what. Even if you spend your whole life trying to get it. But, you know, I can relate on the same note with that too. Like, when I was a kid, the largest bass I've ever seen, I pulled it right up to the boat and I did what you're not supposed to do. I hoed it and tried to get it up in the boat with a little, you know, with the wrong tackle. And I broke it off, and I will never do that again. So, yeah. Lessons learned. A few of them.

Jim Hirschauer:

Yeah. Yeah. All right. Well, thanks for sharing that story. Interesting stuff. So let's talk tech a little bit. You mentioned in your intro that you were into developer experience, and, you know, that's one of those topics today that's just so important. We're asking developers to do so much. You know, we're shifting more and more left, shifting it towards developers. And developers need a great developer experience. They need the right tooling to help them to not toil, to not waste time, to not burn out. Right. And I'd love to hear about whatever you're working on in that area of developer experience.

Kyle Shelton:

Yeah. Developers are, I would say, every business's golden goose, right? Like, they are producing the products and solutions that you sell to your customers, right? And so the more golden eggs that they can produce, the better your company and your product's evolution are gonna be. So it's natural in my mind to make sure that your golden geese are always producing golden eggs as fast and as reliably and as securely as possible, right? You gotta keep 'em away from the foxes. And I'm from Texas, so I like to make farm references, but you gotta think of it like, you know, if you got chickens that are producing eggs, you want to keep 'em in a coop. You want to keep 'em away from the foxes and you want to keep 'em fed and watered and happy, right? And they keep on producing eggs. And so the same concepts kind of go with the developer experience, and one of the things that I'm really passionate about is building internal platforms. Throughout my career, in some form or another, I've helped work on or build internal or external platforms. You help build a platform to either build a product on or build a tool or whatever, but it's kind of like the platforms that help the builders build are some of your most important systems, and oftentimes they get neglected the most, right? And they're the most, you know, Rube Goldberg-like, strung-together machines. Whereas if you spend a little bit more time optimizing those platforms and building those platforms so that the developers aren't spending a lot of time messing with the platform itself versus building whatever they're supposed to be building, the results are astounding. I mean, you look at these cloud native companies that were able to grow. Like, if you look at a company like Spotify, right? And how that product just took off. And boom, you know, overnight they revolutionized music, right?
And they're having to hire thousands of developers because they have to keep up with the demand of the product and be able to sustain the business. And one of the things they did was they created backstage.io, and that was their developer portal. And what they did is they streamlined how developers could build. And my role in that, as a DevOps engineer, is I build that platform for them to build their features for the business. And I help them. And I think the way to do this at scale is to have a portal or a single source for the developers to go to, to learn about the landscape. I think a lot of companies get onboarding wrong. They might have a high churn of developers because of the market and just the nature of the tech industry, and they spend a lot more time than they should on getting developers to where they're actually doing the work. Because, you know, whenever you start a job, you've got that kind of grace period where you don't know anything about the environment, you don't know anything about the software patterns that the company's using. You don't know anything about the monitoring and observability, and, you know, there's a lot that you have to learn in order to be successful as a developer. And so if you can build a portal or a platform that takes care of that, that takes care of your software patterns that you can implement, because I know that's another problem that you might have, is you might have 17 different patterns and you have 17 different skillsets and nobody's on the same page. That's how those things can come. So, you know, making the builder experience a lot quicker to be able to develop products, or standardizing your development patterns. Or I've seen some stories with companies that are leveraging Backstage to make cost optimization kind of gamified by creating, like, hey, here's your spend score.
You're optimized, you know, up to 98%. And I know Harness has a really cool cost management tool that can help you with looking at right-sizing information. As a TAM, when I worked at AWS, that was the first thing I did, because to me that's low-hanging fruit, and, you know, you can go into Trusted Advisor in AWS and they'll give you all your right-sizing recommendations. I think each cloud is different, and that's why I like Harness, because we have multiple environments and we might be using multiple providers, and I really like how you can aggregate that with Harness. But yeah, I think, you know, gamifying that and showing your developers, hey, if you build this right, you get a score. Maybe you can do pizza parties or something for the highest score each quarter. You know, I found that gamifying and putting things in competition produces results, right? Because it kind of gives people motivation to do better. And so yeah, on that, I'm obsessed with the building.

Jim Hirschauer:

Yeah. Oh, sorry, go ahead. Along those lines, you know, along the lines of results, right? So do you have any examples or stories for us of results that you've seen? Positive results for the business that you've seen, or for the developers themselves? It could be mental health, it could be in velocity, whatever the tangible benefits might be. What are the real benefits that come from really investing heavily in developer experience?

Kyle Shelton:

There's a white paper that AWS did with Toyota Motor North America and their Chofer platform, and just the velocity of onboarding new developers. And they were kind of forced into this, and a lot of people are forced into this, because traditionally, pre-COVID, everybody's in the office, you had everybody in one place, and you could kind of enforce things better in an office. Versus the world shuts down and everybody's remote. Okay, well, we have to create a whole new remote working solution. And so if you look at what they did: the cost savings, the ability to move fast and pivot fast from a POC to, like, an actual working solution. And also security-wise too. They're able to quickly implement their security orchestration and patterns because all the development goes through one source of truth. So you can get in front of that. One of the cool things that we did at Amazon, I always try to replicate what is done there because it's a proven system that works. And I think their onboarding system, like, there was no question of where I was, what I'm doing, and what I need to do to get to where I could do my job. Right. And you can implement a system like that with a developer portal. You know, because you can just send 'em to the developer portal, you know, stage one. They'll go through this, and even Amazon's doing the gamified training, right? So you can get your certifications through the game, and it's making it fun, right? Make training fun, make education fun. Your developers are also some of the smartest people in your organization, right? These are highly intelligent software engineers, developers, programmers. Give them the ability to continue to grow their curiosity. Knowledge-based solutions are great. And you can do that in a developer portal.

Jim Hirschauer:

Yeah. IDPs are a really hot topic right now. Everybody's looking into how they utilize them. How do they implement them? What are the benefits? So, as someone who has experience here, what are your top pieces of advice for anyone that's heading down that path? How should they get started, and what's the best approach for them to take when they look at this holistically?

Kyle Shelton:

Yeah, I think there's two things. I think buy-in from the top down. So having buy-in from your leadership that, hey, in order to make this work, you have to kind of draw the line in the sand and say, this is the new way we're gonna do things, whether it's team silo isolation or just a bunch of patterns or lack of observability or whatever. This is not something that happens overnight, but you have to make that commitment to say, okay, from here on out, each business unit is, and it's more about the DevOps culture in general, right? As in, you build it, you own it. But having that top-down buy-in, I think, is the most crucial thing, because it's hard to implement such a dramatic shift in the way you do things from the bottom up. In my opinion and in my experience, that's the number one thing. And then the number two thing is: don't take a set solution that somebody else has built. Look at what you have and leverage what you have. Help those that might be a little bit stubborn in this transition. Help them grow, help them educate, help them learn. The technology that's out now is outstanding. And if you can take your programmatic mind, and then take these concepts and then just adjust them a little bit to, you know, grow internal. You gotta foster what you have. And there's no one playbook, there's no one system that's gonna work. So hopefully that makes sense.

Jim Hirschauer:

Yeah, yeah, absolutely. And I agree with you. Every company is unique. Every company has their own unique structure, their unique challenges. They're usually working on unique products. And it makes sense that yes, you have to treat the overall solution as something that's gonna be unique to your company. And it can have shared components. I think there are off-the-shelf solutions that people are certainly gonna purchase. Observability, for example, is one of those, right? There's many different observability solutions. So people are gonna take that and they're gonna customize it and make it fit their own use cases. So I certainly agree that there's no one single off-the-shelf solution that makes sense for everybody.

Kyle Shelton:

Yeah, and another thing too is things change fast. If you look five years ago, there wasn't a CDK, there wasn't, you know, Terraform blueprints and, you know, all these new things. And as the technology evolves, especially with open source software, it's fascinating how quickly an idea becomes a product, becomes a community. Right? It's insane. And so, you know, be ready. Just when you think you understand things, you're probably gonna have to pivot and reinvent yourself as a business. But if you look at the ones that are still around, that have been around for a long time, they've. Yeah. And, you know, it's part of it.

Jim Hirschauer:

I think that's part of the beauty of what we get to do. We work in technology, right? And technology changes very rapidly, and we have to keep adapting with it. We have to keep learning and we have to apply that to our business, whatever company we're working for, and that's where the real magic happens. Being able to apply these fast moving technologies to help companies achieve their business use case and their business goals.

Kyle Shelton:

Yeah, it's amazing. And it's really fun when you can see direct results. Yeah, it's cool.

Jim Hirschauer:

Yeah. Awesome. All right, well, we've covered off the tech material, and I want to jump into our second Just for Fun segment. I've worked in IT for a long time. As our listeners know, I've had my share of mess-ups. I've shared a couple in the past, and I love asking this question. Kyle, what's your worst IT mess-up?

Kyle Shelton:

Okay, so this is a painful one. I really thought I was getting fired after this. So this was about 12 years ago. This is when I had just first moved to Colorado Springs. I worked for about eight months as a contractor with Verizon Wireless and had the choice for a full-time job in Midland or Colorado Springs. I chose Colorado Springs. I was managing all the cabling contractors for the data center, so I was in charge of all the transport equipment. I worked with all the Cisco and Juniper routers, switches, firewalls, load balancers. Anything layer two, layer three, transport in the core data center, I was a part of. And so one day I was executing a night change. I used to work 10:00 PM to 6:00 AM and execute all my maintenance windows through that, when, you know, call volume is at its lowest. This was back in the day of, like, if we don't provide the cell service, there could be credits back on an account. They're not making money. You know, it's a big thing at a carrier level when you break things. So I was executing a MOP, which is a method of procedure. And on all of these you had your basic, I don't know if you remember, a Cisco, like, copy TFTP flash, and you had to copy your Cisco configuration to the, like, local little flash drive and to the hard disk, right? And so on, like, my third maintenance window, I did not do that. I just skipped through that and I was saying, it's no big deal. Everything will be fine. And so I went through and, this was on a Cisco FEX 8500, I believe, a big giant, like, card, you know, layer three switch. And I was doing an OS upgrade on it. Well, when the switch came back, it came back with an empty configuration file, and you're talking hundreds of ports in a data center that were just unconfigured. So basically I took out about six cabinets of AAA, which is the authentication system at the time for your cell phone.
So pretty much, long story short, it took out almost the whole west region. They could not authenticate their phones, so anybody that turned on their phone couldn't authenticate with the network, had no service, couldn't make an emergency call. It was a big deal. Like, that was bad. And so I figured that out really quick. Well, remember I said I didn't copy the router config over at the beginning, and so I had to go in there with my laptop and an old console serial cable and, line by line, copy every single configuration for that layer three switch. It was, I think, three to four thousand lines, one by one. It took me a long time. And it really didn't come up until I got that and committed it, and the switch came back up and the service was restored. But, like, those two and a half, three hours of me being in the data center copying line by line, 'cause you couldn't do it more because of the way, if you've worked on the Cisco consoles, it'll paste, and if you try to paste more than one line, it gets all jumbled up and it just doesn't do anything. And then you have an even worse config, which is a pain in the butt. And so, yeah, I will never not copy a router config or take a backup of a configuration ever again. My boss just stressed it. He's like, I have to write you up for this, obviously, and you can't do this again or you will get fired. But he's like, you have to follow the procedure exactly. Even if you know it's gonna break, then back out and say this broke. But if you don't follow the procedure provided by engineering, it is 100% your fault. If they give you a bad procedure and it's not tested thoroughly, then the onus is on them because they did not set you up to succeed. And there's all those checks and balances, and so yeah, it was a scary time. But yeah, that was it.

Jim Hirschauer:

Yeah. You know, I heard a saying a really long time ago when I first started in my IT career. The saying was, you're only as good as your last backup. And I really took that to heart. I'm like, oh, yeah, that makes sense. If everything blows up, all I have left is my last backup. Right. And who knows how long ago that was. I'm hoping it's, you know, less than 12 hours old. In those days, it could have been up to 24 hours old. And so, worst case scenario, you're losing up to 24 hours of data potentially. But at least you have something to fall back on. I can't imagine sitting and having to copy three to four thousand lines of configuration one at a time. Any idea how long that took?

Kyle Shelton:

Yeah, I went home at about 1:30, 2:00 PM. I remember the outage recovered, so the upgrades started at like 1:30 in the morning. It bled through the maintenance window, so there's acceptable time, like, with outages. And so I had up until like 5:00 AM, and then I ended up, I think, recovering at around 8:00 or 9:00 AM. So I wanna say probably three to four hours, you know? But still, my manager told me that those two or three hours was over a million dollars' worth of money lost for the company. And, like, you can only imagine, this is only six out of, like, you know, 300 cabinets too. A very small portion. But you take out all the network, all the layer two, things aren't gonna work.

Jim Hirschauer:

Yeah. Well, that's a tough lesson to learn for sure. But we're all human. It happens to every single one of us. It's happened to every single person who works in IT if they've worked for long enough.

Kyle Shelton:

Yeah. And you learn the backup lesson, either the easy way or the hard way. You know, most of us learn it the hard way, right? Yeah. Yeah. But yeah, crazy.

Jim Hirschauer:

Alright. Well, listen, Kyle, it's been a pleasure speaking with you today. Thanks so much for joining us and sharing your knowledge and your wisdom, your ice fishing story, as well as being vulnerable and sharing your multimillion-dollar IT mess-up. I really appreciate that. To all of our listeners, if you're an SRE or if you're in a related DevOps-type role and you wanna be a guest speaker on ShipTalk, please send an email to podcast@shiptalk.io and we'll get back to you. Thanks again, Kyle. Appreciate it. That's all for now. Until next time.
