
Customer Support Leaders
268: Mastering Incident Management - Part 2 of 6; with Kat Gaines
Embark on an insightful expedition into the nerve center of incident management with Kat Gaines and myself, as we unravel the essentials of tailoring an effective process for your organization. We're not just bystanders in the realm of crisis resolution; we're the architects designing the blueprint. This episode promises to hand you the reins of your own incident management strategy, urging you to define incidents and their severities in a way that resonates with your business's unique pulse. As Kat shares her wisdom on building from the ground up, I draw from both our experiences and the well-oiled machine that is PagerDuty, ensuring you leave with a toolkit brimming with proactive solutions.
Commanding the spotlight, the role of the Incident Commander is dissected to reveal the mastery behind maintaining clarity amidst chaos. Supported by an ensemble cast of deputies, scribes, and liaisons, we discuss the art of orchestrating an efficient resolution process. Discover how integrating tools like Slack's canvas feature can transform your documentation and communication strategies, enabling your team to make swift decisions and remain in control. It's about more than just the technicalities—it's about empowering your support leaders to assert their expertise without the looming shadow of repercussions.
Finally, we emphasize the symphony of soft skills that harmonize with technical know-how to form the crescendo of incident resolution. Personal anecdotes underscore the delicate dance between listening and projecting confidence, especially when the stakes are high. We challenge the preconceived notions about non-technical roles, advocating for their technical savvy and pivotal role in customer communications. Tune in for a paradigm shift that will equip you to conduct a well-orchestrated incident process where clear roles and effective communication are the keystones to success.
As part of this episode, Kat shares some awesome PagerDuty resources showing the structure of a Major Incident Response Team and the roles we discuss over this and the coming episodes:
Different Roles - PagerDuty Incident Response Documentation
and
Complex Incidents - PagerDuty Incident Response Documentation
Hello and welcome to Episode 268 of the Customer Support Leaders Podcast. I'm Charlotte Ward. Today, welcome Kat Gaines in Part 2 of a six-part series on incident management. I'd like to welcome back to the podcast today Kat Gaines. Kat, thank you so much for joining me again. I feel like I'm going to start every single one of this six-part series saying how excited I am about the six-part series, but I'm going to try and hold back. I think everyone should just take it as read from now on. But yeah, part two, here we are talking about incident management again. Welcome back. What's on the plan for today's discussion?
Kat Gaines:Yeah, thank you for having me back. I think, you know, it's fun. If you want to say you're excited every time, I'll say it too. I'm excited for this conversation and the last one and the future ones; it's all thrilling. So yeah, we gave everyone an overview last time.
Kat Gaines:We just started talking about incident response: why customer support, customer success, and other customer-facing teams may care about incident response, and the fact that you don't want to be just a bystander, right? You want to be involved in the process. You want to feel some ownership. We laid out some of the basics for folks. I'm trying to remember exactly what we talked about, but that's a little bit of what we talked about last time, so we want to dive a little deeper today. We want to talk about designing the process and what good looks like. So we had talked a little bit about, you know, Charlotte, for example, you mentioned that your team owns the process at your company right now.
Kat Gaines:There are other customer support teams who do the same thing that I know of, who own the process end to end, and there are others who just are involved in a subset of it, and no one way is right. Both of those are totally valid routes to go. But even if you're not owning a process, you should have a hand in designing it. You should have a hand in being able to say here's what good looks like.
Kat Gaines:Here's what I know from the industry standard that people like PagerDuty lay out, right, for what good looks like. And if I'm someone who has this knowledge, even if I'm in a team that isn't directly responsible, let me help coach on that knowledge that I have. So we'll talk about that. We'll talk a little bit about where teams fit in. We'll try not to bleed too much into the content of the next episode, which is going to directly face some of those responsibilities that, from the customer support perspective, we want to own. Yeah, good stuff ahead today.
Charlotte Ward:Wow, yeah, I love the process design bit, you know, where you've got a blank sheet of paper. And I think, actually, I'd really like us just to flesh out that bit of this first. I like to think we have a good-ish incident management process. It's going to be nothing like as mature as PagerDuty's, I know that. We're iterating, but we have gone through nothing like the iterations I'm sure you guys have.
Charlotte Ward:But everyone starts with that blank sheet of paper, don't they? And so I guess where I want to open the getting to a good process, or good-ish, or good-enough-for-now kind of process, is right at the beginning, almost. It is going to be mostly support leaders and support folks listening to us, and I don't think there's a support person on the planet who hasn't thought, we should probably be managing this kind of situation differently. Where do you go from there?
Charlotte Ward:You recognize that there's a need to do something about this hotbed of mess that you're in today. Where do we go with that blank sheet of paper?
Kat Gaines:Or you're stewing because the latest incident really ruined your day, or your team's, or your customers'. Or maybe it wasn't even that dramatic; maybe you just said, you know, there's room for improvement here. Or maybe the whole company was scrambling because you don't have something laid out yet. All of those states are kind of like, oh, how do I even get started? But yeah, laying out that blank piece of paper, one of the first things you want to understand and define, and again, this is not individually, this is as a business, is what an incident is for you. So we talked about that a little bit last episode. Our formal definition is a disruption or degradation of service that usually is going to be actively affecting your customers, your end users.
Kat Gaines:For us, it's specifically a disruption or degradation of services affecting customers' ability to use PagerDuty. That's how we define it internally, right? So everyone's definition is going to look a little bit different. But first of all, just knowing what's an incident. What do we think is a problem? Usually it's going to be something that you can't resolve too quickly, something that requires some kind of coordinated response.
Kat Gaines:That's when we slide into calling it a major incident rather than just an incident, and so you need to design a response process. That's the first thing you need to know. Once you have that figured out, what an incident means for you, the next thing you need to think about is severity levels. Again, this is going to change a little bit for everyone, but you have to know what constitutes an incident and how to classify it by severity. And so, depending on your business, you might call that a SEV definition, you may call it a priority, and basically, the lower the number, the higher the risk or the higher the response. So it's going to be anything from an operational issue, to something that is a little bit higher severity, to a major incident. So, for example, what we call a major incident is something above a SEV-3 for us, and that gets a more intensive response.
Charlotte Ward:I want to touch a little bit on the words here, just to be really picky: severity and priority and urgency. I've been talking about the difference between those three things for, I'm sorry to say, the best part of 30 years in support. There is a difference, isn't there? I mean, you touched a little bit on, like, well, this is an operational issue, but actually the difference between severity and priority and urgency, I think, is important to understand, because ultimately, yes, we've got a label on a field somewhere in Zendesk or PagerDuty or any other tool that drives that response that you talk about.
Charlotte Ward:But the way I would classify those is: severity is the impact, the actual impact on your customers, on your customers' business. It's the thing that you can quantify quite directly. Urgency is how quickly it has to be done, which is different from severity. And priority is really an output of both of those generally, plus some other weightings that are usually applied by, you know, escalations and things like that. Priority is the order in which you tackle things. And so I want to be really clear about those three things, because most of the time when we're talking about incidents, we're talking about severity being the primary driver. But that's not always the case, because sometimes what can seem to be a fairly low-severity incident can nonetheless require a more elevated response for all sorts of reasons. So I just want to lean on that, because I've always been quite opinionated that it's important to know the difference between those three things, because they can drive different responses.
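Editor's illustration: the distinction Charlotte draws can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not a scheme from the episode or from any tool: the 1-to-5 scales, the `escalated` flag, the size of the escalation adjustment, and the names `Issue`, `priority_score`, and `work_order` are all hypothetical. It shows priority as an output of severity and urgency plus an extra weighting, and priority as the order in which things are tackled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    name: str
    severity: int            # assumed scale: 1 = worst customer impact, 5 = cosmetic
    urgency: int             # assumed scale: 1 = needed immediately, 5 = no deadline
    escalated: bool = False  # hypothetical extra weighting, e.g. a customer escalation


def priority_score(issue: Issue) -> int:
    """Combine severity and urgency into one score; a lower score is tackled sooner."""
    score = issue.severity + issue.urgency
    if issue.escalated:
        score -= 2  # assumed adjustment; any real weighting is a business decision
    return score


def work_order(issues: list[Issue]) -> list[Issue]:
    """Priority is the order in which issues are tackled."""
    return sorted(issues, key=priority_score)


if __name__ == "__main__":
    backlog = [
        Issue("report export slow", severity=2, urgency=4),
        Issue("typo on invoice before audit", severity=4, urgency=1, escalated=True),
        Issue("login outage", severity=1, urgency=1),
    ]
    for issue in work_order(backlog):
        print(priority_score(issue), issue.name)
```

In this made-up backlog the low-severity invoice typo is worked before the higher-severity export problem, because its urgency and escalation pull its priority up, which is the case Charlotte describes.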
Kat Gaines:Yeah, and I personally tend to sway the same direction as you're describing it. Severity is your major incident descriptor; there's nowhere else that that term makes sense. Urgency feeds into that. Priority can both be informed by severity and it can also feed severity.
Charlotte Ward:It's kind of a bi-directional relationship.
Kat Gaines:I personally prefer a structure where, if you're saying something's a P1 or P2 in terms of priorities, that's more used to describe your everyday issues, maybe bugs in the product, things that aren't kicking off a major incident, right?
Kat Gaines:But once you're saying severity, that's really, no, this is an issue with some severity assigned to it. This is something where we're saying it has one of these priorities, but we know we need to tackle it. We know that there is an external-facing issue here and, yeah, we're using that urgency as well from our definitions to inform that. And I say that this is my preference. I'm calling that out specifically because we have seen people use those interchangeably, where they're using priority or severity to describe major incidents. At the end of the day, fine, use whatever words you want. You can make up a completely new system of words for it if you prefer to. That's fine.
Kat Gaines:The biggest thing is to make sure that your organization all understands exactly what each of those levels means. Honestly, if you want to call them the names of different Star Trek shows, and that's what represents things, it's fine. It's probably going to get a little confusing, but if everyone understands what it means, that's okay. That's perfectly fine. On standardization, I think severity is usually the term when you're talking about a major incident, so that's a good point to call out.
Charlotte Ward:Yeah, yeah, sorry to completely hold us up on that. I just think sometimes, like, you were pretty clear early on saying this is how we define it, and we've got similar definitions. Where I am right now, it's all severity based. It's all about, you know, access or ability to use the product, achieve business goals with the product, and we try and have those as consistent as we can across our incident process as well as our bug management. We don't want completely divergent definitions that make no sense; three different ecosystems for tickets and bugs and incidents would be weird. But yeah, adjusting that language so it has a consistency means it's just much easier to build that common understanding that you're talking about, which is really critical. And sorry, there ended my interruption about severity.
Kat Gaines:Not even an interruption; perfectly aligned with what we're talking about. I think the next thing I wanted folks to remember about severity as well is that often, when you have these different definitions figured out, you want to figure out what type of response they warrant. So, for example, a SEV-1, the most critical issue: it's a major incident response. People are pretty much dropping everything to work on it right now. You know for a fact that an incident response coordination has to be kicked off in some way. You have a different set of guidelines that you'll follow for that. That's great, that's your highest level, and you can go straight down to, for example, a SEV-5, where it's something cosmetic or a bug. Maybe you assigned a severity to it initially because you thought it may warrant an external response, but you realize at the end of the day that maybe it doesn't warrant that, or maybe the response is relatively minimal: hey, we know that's there, we will get to it when we get to it. Maybe it's something you decide not to fix because it's so cosmetic. So it can really run the gamut. The most important thing, though, is that you want to spend a lot of time figuring those definitions out, getting them right.
Kat Gaines:But if you're in an incident and you don't know which severity something is, you're like, oh, this could be a SEV-2, it could be a SEV-1, just assume the worst. Just treat it as the higher one so that you're not underreacting, you're not underdiagnosing the issue and potentially not getting the resources you need.
Kat Gaines:If you accidentally have too many resources, if you accidentally treat it as too urgent, the cost for that is pretty low. The cost for that is, okay, cool, we get to dismiss people from the call early. We get to say, everybody go home, this is not as big of an issue as we thought it was. But if you underdiagnose it and you realize later that it's a much larger issue, then you have to spend more time, effort, and energy roping people in. So having those definitions figured out is one piece, and then the second piece is really understanding when to use them and not punishing people for using them. I think we get a little scared sometimes of either calling an incident or saying decisively, this is what this incident is. But if you have that authority to say that, know for a fact, again, there's going to be very little risk from doing a little bit too much. There's going to be quite a lot of risk from doing too little.
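Editor's illustration: the two rules Kat describes (a lower SEV number is more severe, and when unsure between levels you treat the incident as the more severe one) can be written down as a minimal Python sketch. It assumes a five-level scale and reads "above a SEV-3" as SEV-1 or SEV-2; the names `Severity`, `is_major_incident`, and `resolve_uncertain_severity` are hypothetical and not part of any PagerDuty API.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Assumed five-level scale: the lower the number, the more severe."""
    SEV1 = 1
    SEV2 = 2
    SEV3 = 3
    SEV4 = 4
    SEV5 = 5


# Assumption: "above a SEV-3" means strictly more severe than SEV-3.
MAJOR_INCIDENT_THRESHOLD = Severity.SEV3


def is_major_incident(severity: Severity) -> bool:
    """A major incident is anything more severe than the threshold."""
    return severity < MAJOR_INCIDENT_THRESHOLD


def resolve_uncertain_severity(candidates: list[Severity]) -> Severity:
    """When unsure between levels, assume the worst: pick the most severe candidate."""
    if not candidates:
        raise ValueError("at least one candidate severity is required")
    return min(candidates)  # lowest number = highest severity


if __name__ == "__main__":
    chosen = resolve_uncertain_severity([Severity.SEV2, Severity.SEV1])
    print(chosen.name, "major incident:", is_major_incident(chosen))
```

Over-classifying costs a few people some time on a call; under-classifying costs the scramble to rope them in later, which is why the sketch always resolves uncertainty toward the more severe level.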
Charlotte Ward:At the end of the day, yeah, I think that's where this culture of letting an incident be declared comes in, and ensuring that anyone who is going to wear that incident manager hat that we talked about in the last session doesn't feel scared, doesn't feel any kind of nerves, doesn't feel that there are any negative outcomes at all of making, you know, in little bunny air quotes, the wrong call. This should not be a conversation anyone needs to have. There is no circumstance under which an incident manager, incident commander, whatever you call that person, should feel like there is potential negative fallout for them, that they're going to be called out for getting it wrong.
Kat Gaines:Right, right. And ideally they're going to be the authority in the end. So actually this is a good place for us to talk about roles a little bit, I think, and just walk folks through what those different roles are and what they mean. When we're thinking again about defining this process, we want to understand our roles and how they work. So you mentioned an incident commander, incident manager; those terms can be used kind of interchangeably. In your call, that person is really the single source of truth for what's happening and what's going to happen. This person could be from, well, depending on your org structure and how you're doing this, any team. They don't necessarily have to have specialized domain knowledge about different parts of the product. In fact, it can be helpful sometimes if they're a little bit more of a generalist and they know more about broader areas of the product rather than focusing in on one piece. But again, they can be from pretty much anywhere.
Kat Gaines:Their responsibilities are a few things. They want to help prepare for major incidents, make sure that all of the infrastructure is set up, make sure that they're funneling people to the right places when something is happening. When there are major incidents, their main role is to drive those incidents to resolution. So they are wrangling people. They're getting people on the same communication channel.
Kat Gaines:They're collecting information from people who are working on actually resolving the issue. They're collecting proposals of actions and then recommending what action is being taken. They are not jumping in and resolving it themselves. That is too much for them. They should be just staying above and essentially coordinating what's going on during the incident. So this could be, as I was saying, weighing proposed actions and helping decide what's happening next. This could be on the communication side: there's someone specifically handling communication, but they're helping direct what that communication is. They're reviewing drafts of communication. They are getting opinions on what's happening. They're asking if there are any strong objections to their decisions. Notice that I'm not saying consensus. They are asking about objections.
Kat Gaines:So if you're sitting around waiting for everyone to say, yes, I think that's great, you're going to be waiting for a while. You can't make fast decisions. So instead, the question we ask when we're making a decision, whether it's are we taking this step, are we posting this message, or anything else a decision needs to be made around, is: are there any strong objections? That gives people a really clear in to either speak up or stay silent, in which case the incident commander can assume, cool, no strong objections. You do need to give it enough time. You can't just ask the question and then immediately move on.
Charlotte Ward:I was going to say that has to be time bound. I mean, usually you've got to give minutes at least, haven't you Right, yeah?
Kat Gaines:Yeah, yeah, give it a little bit. You know, we all have the seven-to-ten-second rule when you ask a question during a meeting; give it a little bit longer than that. But it gives them a really clear in, and if no one says anything, they can move on and say, great, that's the decision. Or if someone does say, I have an objection, they can say, okay, that was a clear and specific objection, let's talk about it, and then continue repeating the process.
Kat Gaines:So they are coordinating the call, they're the people responsible for deciding what's happening, and then they have, and I think when you post this episode, maybe we can give people the visual from the documentation, a whole chart of people stemming down from the incident commander.
Charlotte Ward:So the incident commander really is the hub, isn't it? We talked a little bit about this last time. I think you made a really good point that's just worth calling out and emphasizing, and that is that the incident commander doesn't have to be an engineer; it doesn't have to be somebody with specific product knowledge or specific customer knowledge. And in fact, in some ways it's kind of useful if, even when they've got that expertise, they kind of pretend they don't in the incident, right? Just "explain it to me like I'm a five-year-old" kind of works, you know. But also being able to explain out.
Charlotte Ward:So I think there's a lot.
Charlotte Ward:From my point of view, the way I see the incident commander is as that hub, that person that's facilitating and helping make those decisions and drive to resolution.
Charlotte Ward:But they're also, I think, the person who should have that awareness at a high level of everything that's going on in the incident, and also be able to output it to people. So they are the person anyone else should be able to go to, whether that's asynchronous or whether they're updating some kind of status document, whatever the mechanism is. They are the person responsible for the status, and I think you said that before. Anyone should be able to go to the incident commander and say, explain to me what the hell is happening here, right? And I think being able to explain it across the business is important. You touched on in that role description how this is a broad role, and often it's advantageous if they have a broad view across the business, so not only are they aware of who to pull in and when, and able to help those people make decisions, but they're also able to explain why we need those people at any point in time, right?
Kat Gaines:Yeah, exactly, and then they can also help deputize people for some of those actions, like the explaining, the communication.
Kat Gaines:We talk about communication a lot externally as support people, but communication internally is incredibly important as well, so that you don't have folks jumping into the call, for example, and saying, what's going on here, and distracting the effort, right? So they'll have a few people working with them. They'll have their deputy. This is going to be a less long-winded description, but essentially it's someone who can help make sure tasks are staying on hand. If there are timers set for things, they can monitor those. If there's anything they see that the incident commander may not have addressed, they can help call that out. They can also step into the incident commander duties if they need to. They can manage the call completely if, let's say, the IC needs to step away for some reason. And then there's someone else in the same family; when we see the chart, we'll see these three people at the top. That's the scribe, who is doing exactly what it sounds like.
Kat Gaines:They are transcribing what's happening during the incident. They're documenting what's happening. Again, your systems are going to differ. Something we see that's really common is that people will, for example, have a Slack channel dedicated to their major incident coordination. The scribe will do their scribing within the Slack channel itself and write down literally what is being said, what actions are happening, specifically a key action as it's being taken, things like status reports, basically anything that is a really key callout that people need to know. That's what the scribe is making sure gets documented. Not any fun stuff on the side.
Charlotte Ward:I think this is a really useful little aside into using the tools you have, because, particularly in the early days, this doesn't have to be anything fancy. And again, leaning on what we're doing, we use Slack a lot. Every major incident runs in its own Slack channel, and that new feature that Slack put in place, the canvas inside the channel, is an amazing space for scribing.
Kat Gaines:It's so great.
Charlotte Ward:Yeah, it's incredibly helpful, and that's exactly where we manage status, comms, drafts, and timeline. It all happens in that canvas. It's brilliant. Simple solutions that sometimes just get you off the ground, right?
Kat Gaines:Right, right. You don't need anything complicated. You can really spin these things up, and I think that's what we're talking about: making it accessible.
Charlotte Ward:Absolutely, absolutely. So we will publish a diagram to explain these relationships visually. The PagerDuty one would be great. So we've got, at the top or in the center, however we want to view this, our IC, and we've got a deputy and scribe. What are some of the other roles, and how do they relate?
Kat Gaines:Yeah. So next you'll drop down into the second level, which is the liaisons. We're going to do a whole separate episode on this, because we want to do a deep dive; it's where a lot of customer support folks end up sitting in the incident response process. But the liaison role is really what it sounds like: being a liaison between the incident team and someone outside of the incident team. I mentioned a couple of minutes ago the internal liaison as one potential option. You don't, for example, want every single person on your sales team jumping into the incident call and saying, my customer is asking me what's going on, what do I tell them?
Kat Gaines:You're going to be repeating that 80 times and then you'll never resolve the issue, right? So an internal liaison can do that for different customer-facing teams, and for other internal-facing teams who are just interested or just need to know the status of what's going on. They can help keep your executives informed. We like to use some fun terms; the executive swoop and poop is one we've heard and used occasionally, where you just have an exec jumping into the call and they're like, I'm the CEO, what's going on?
Kat Gaines:I need the status right now. The board is after me. All of these things, right? You don't want to put them in that position. They're anxious because it's the business, for which they feel largely responsible, that is at stake, and that's fine. It's normal for them to have that anxiety. But you want to quell their anxiety by giving them a dedicated place to look, so they don't have to do that. They know, okay, this is the Slack channel I go to, or they can, for example, get an update from somewhere else.
Kat Gaines:So something that we developed at PagerDuty to help people with this problem was status pages, which can be both internal and external, where you can go and get a quick status of what's happening. You can also send notifications to specific people to say, here's what's going on. So PagerDuty, or a tool like it that allows you to do something like that, is another way. As long as they know where to go, they know where to expect the information. Again, like you're saying, the tools don't even have to be complicated. They can just know to look in this Slack channel, and that's the communication Slack channel, where it's free of all of the other incident crud.
Kat Gaines:But there is a clear timeline of what's going on and what kind of messaging is approved to be used. So the internal liaison handles that. They work with the incident commander and anyone else who needs to have input to craft those messages and post them wherever they're going to go. And then the external liaison does the same thing, but externally, for your customers, for the people using the product who are going to be looking at it and going, well, what the heck is happening here? They're the ones to post that message for customers. If you have an external status page that you're using, once you get to that threshold, they're going to be the person to post there, and also to post for the rest of your customer support team so that they know, for example, what responses to use when they have cases coming in and they need to respond to them.
Charlotte Ward:That's super important. Consistent messaging is really, really important.
Kat Gaines:Yeah, exactly. They can give them templates, all of that.
Charlotte Ward:Exactly, exactly. Those are two quite communications-focused roles, the internal and external liaison. And I know that we'll deep dive particularly into the customer liaison in probably our next session, right? But there are other liaisons, aren't there? So you may have an engineering liaison, for instance. What does an engineering liaison do?
Kat Gaines:So engineering liaison is one word for it. We tend to call it a subject matter expert, because they may be from multiple engineering teams or, you know, potentially may even sit on a different team, depending on the nature of the incident.
Kat Gaines:Let's say you're having a major billing issue. You may have someone on the billing team who has to be that subject matter expert because they're familiar with those systems, right? And so we call those folks subject matter experts. They're the people who are going to be effectively carrying out the resolution of the incident itself.
Kat Gaines:So you'll usually have multiple of those folks. Sometimes you may just have one or two; that's a little bit more rare. You may also call them a resolver, but basically they have to be either a domain expert or an owner of the specific piece of your product that is impacted. Though, like we said, the incident commander doesn't need to know everything, this person does need to know as much as they can about the specific piece that is impacted. They have to be the people who can diagnose issues and fix them. They should have some pretty well-developed communication skills because, again, they're going to have to communicate back into the incident itself. They're probably also going to have to keep their team apprised of what's going on, so they'll need to be able to communicate back and forth really well. As an aside, this is why I hate it when people call communication, empathy, and human skills soft skills, because to me it implies that, oh, they're soft, they're easy, you don't need any special training.
Kat Gaines:Some people find them very hard. Some people find those things very hard, and so I just I shift to calling them human-centered skills, because that's really what they are, at the end of the day, working with other human beings and communication skills fall in that.
Kat Gaines:So they can't just be someone who is brilliant at creating and maintaining the product but they can't communicate for anything right. They have to have both. And if they don't have that communication side as a leader, you have to have both. And if they don't have that communication side as a leader, you have to take responsibility to train them to do that. Or if they're not under your purview, you have to encourage your peers to say, hey, this person clearly needs training before they can continue to be involved in the incident response process. That human side of the process itself is, honestly, a lot of the time more important than just knowing the technical components of what's going on.
Charlotte Ward:It will help drive resolution faster. I agree, and I've got two little asides for you here.
Charlotte Ward:One is that I constantly tell my 10-year-old, who is into coding and other things right now, that the best engineers are the good communicators, because software is a people business. I know there'll be people listening to this who aren't at software companies, but a lot of them are going to be, let's face it. Software is a people business, and you have to be able to articulate your ideas, explain yourself, reason, persuade, etc. So if you're at all in engineering and worried about your comms skills, or thinking about getting into engineering at any level, support or development or anything, please learn to communicate. Invest time in it.
Kat Gaines:It's so critical. Invest that time, and use it as an asset too. I think that's the thing people don't realize: let's say you are in support and you're moving into an engineering role. Don't downplay your human skills when you're advertising your skills to prospective employers as you seek out that other role.
Kat Gaines:Don't be ashamed of that background, because it shows that you're an excellent communicator. It shows that you're someone who can deal with people and that it's going to continue to be a need in the work that you're doing going forward.
Charlotte Ward:Absolutely is, absolutely is.
Charlotte Ward:And I think the other thing, just leaning on the notional difference between a subject matter expert and what we would call an engineering liaison, is that in particularly complex environments you can have multiple subject matter experts who are needed to get to a point of resolution, but you don't want all of those moving parts in your incident process as a primary voice. So I guess what I'm getting at is, for instance, your product may be complex enough that the resolution needs SMEs from two or three different engineering teams, plus your site reliability folks who are working on deployment, plus someone from an operator or technical operations kind of role who's doing the deployment or rollback, etc. You might actually have quite a few of what I would consider to be SME-level voices, but you don't want all of those voices giving everything into that main incident channel. I think sometimes what you need is somebody with an overarching engineering plan, let's say a technical overview. In my mind, sometimes there's an argument for lifting it up that little half level: somebody who's going to coordinate the engineering response across multiple engineering teams, where there might be three or four SMEs, but bring back a cohesive story to the central incident response, might be appropriate.
Kat Gaines:That can be a valid way, yeah. What we coach on is that both ways are valid. You could have multiple SMEs, and you just have to have your incident commander carefully managing the communication side and not having everyone talk at once: calling on them, assigning them tasks, keeping track of that.
Kat Gaines:I think in the little chart that we're going to attach to the episode, there are, like, five SMEs listed, and that's pretty much an upper limit on the number of people that you may have acting as subject matter experts. But if it gets more complex than that, or if those five people are working on things where the complexity becomes so much that the incident commander alone can't keep track of them, then yeah, that's a valid way to go: appoint someone as the engineering liaison to say here's what we're doing, here's the plan, and act as that extra point. You do want to be cautious about adding too many degrees of separation between the SMEs and the incident commander. They need to be able to work together well. But again, it's all about the level of complexity, and what's going to move things forward versus what's going to hinder it.
Charlotte Ward:Yeah, I think that's fair, because with the degrees of separation, as you put it, it becomes like the telephone game, doesn't it?
Kat Gaines:Yeah.
Charlotte Ward:I need to know the status of something, and it gets mistranslated three layers deep and comes back to you with a completely different set of "here's what we can actually achieve."
Charlotte Ward:I'm sorry, we've squeezed in so much already just talking about roles, but I know we wanted to get to what good looks like in an incident process. I think the process of describing these roles actually gives you an idea. We've talked quite a lot about how they fit together, so we don't necessarily need to deep dive into the mechanics of the process, unless you really want to, because I think the roles often drive that, right? But what is the state that we are hoping to achieve when we get these roles defined and in place? What does good look like for a process?
Kat Gaines:Yeah, I find myself saying this a lot lately: an incident should feel as calm as regular working operations. I don't know, is that a hot take? It sounds aspirational, because we all do tend to have that panic response when there's an incident happening. Hottest of takes, though, because it is how things should be. It should be a state where you know what's going on and you know where to look. If you've forgotten parts of the process, you know where to go find that. You know who's in charge, you know who to talk to. And so, yeah, it's a little bit of an elevated stress level from normal, but it is still just business as usual.
Kat Gaines:It's something that you can handle. I think, too, the piece that really plays into that is call etiquette and everyone knowing what's expected on a call, so I've got a few tips to drop in there. If you are participating, you should participate in every way that your business defines participation. So if participation is both being in the call and in a Slack channel, don't just cherry-pick one of those. Definitely don't hang out in the Slack channel and just start saying stuff when people are on the call. We all know that person.
Kat Gaines:We all know that person. Know them, love them, bless them. Please get in the call if you have something to say. Making sure that you're minimizing distraction for others is a good practice too, so keep your microphone muted until you have something to say. I'm having super intense allergies right now, and so there have been a couple of times while we've been talking that I've needed to mute myself to cough or sneeze a little bit. The same practice goes, especially in an incident call: if your dog is barking in the background or your kids are running in to say hello, all of those things can be really distracting. So stay muted until you have something to say. Be clear, too, about who you are and why you're there. Don't assume that you're a celebrity.
Kat Gaines:It's really tempting to think, well, I was told to join the call as a subject matter expert, so everyone knows who I am. And that may be true. They may have said, hey, I'm inviting Charlotte to the call because she's a subject matter expert in XYZ, so I'm going to bring her in real quick, and then you join a couple of minutes later. They know, but the best practice is still to state who you are and why you're there. So let's say I'm a support leader joining for that reason. I say, I'm Kat, I'm the leader of support. Cool, they know who I am, they know why I'm there, and a little bit about what my agenda is. I can be even more specific and say I'm here to fulfill customer liaison duties, if that's why I'm here. Be as specific as you can, but at least state who you are and the system that you're involved with as far as this incident goes.
Charlotte Ward:Yeah, definitely. I would like to add something there, and I'm trying to say this in a politic way. You're absolutely right: don't go into the call thinking, I'm a celebrity, I'm here to fulfill this role, this is entirely my domain, and I'm going to make all the decisions about the seat that I'm sat in. So there is an element, when you go into those calls or those Slack conversations, however you're doing this, of being willing to listen, but also being confident enough in the role that you are there to fulfill.
Charlotte Ward:The first few times you do this, it's going to feel weird to be so confident about your role. As IC, to say yes or no, or take a vote on this, to be that direct and that procedural as an incident commander feels alien, and the same could be true, I think, of people sitting in that SME role or the customer liaison role.
Charlotte Ward:It feels a little alien the first few times you do this, and I think it's okay to say, yeah, I'm here doing this, and I am going to listen. The strength of mind comes in, in a small way, in saying: I'm here to listen to your opinions about, for example, what I should be telling our customers, but please don't make decisions and go off and tell customers stuff without me knowing about it. This is the role. There's a level of respect as well as confidence in those calls that you have to build over time. It takes practice.
Kat Gaines:Yeah, it's a really fine balance of respecting the process and respecting the fact that the incident commander is in charge, but then also having your domain expertise and commanding the respect that comes along with that, which is absolutely acceptable to do, and I think you're making a really good point. I think the imposter syndrome just gets to customer facing roles so easily, because it's easy to jump in the role and say, oh, I'm not technical, whatever that means.
Kat Gaines:It annoys me so much when people say that, because what does that actually mean? You do have technical skill. You may not have the same technical skill as an engineer building the product, but you do have a level of technical skill. Don't discount yourself.
Kat Gaines:I am famous now, I think, for shouting people down when they say customer support people aren't "technical," quote unquote. Folks can't see, but I'm doing a lot of air quotes here with my fingers.
Kat Gaines:So it's about having that confidence and knowing that you do understand the product, that you do talk to customers and help them through the intricacies of the product every single day as a support person. It's being able to say, you know what, I do know this part of our product. I do know the common issues that come up with it. I do know how customers work around it, so I do have authority in how we should communicate around it, or I have input based on what we've seen from customers.
Kat Gaines:You have to have that balance of being able to find your voice, and I think for customer-facing roles the balance tips a little more toward "no, please step into your authority here," because we tend to do it less and defer more. I think that is just a byproduct of decades of trauma in this industry, from a customer support lens, of being treated like the last priority a lot of the time. But you are a priority in these moments. You are a priority at your business in general. You are a priority in incident calls because you are there to care for your customers and ensure that they get the very critical support they need during these calls.
Kat Gaines:So that's what I want everyone to remember when jumping into an incident response call. All the things I said still apply: respect the decorum, make sure that you're not being distracting, and ensure that when the incident commander is giving instructions, you're following them and staying in step with them as well. But also let everyone know that you are the expert on what your customers need and you are there to help guide them through that.
Charlotte Ward:Yeah, I couldn't agree more. And even if you aren't the product expert in support, or if you really are not that technical (I feel a bit told off now, Kat, because I say that quite a lot), for me, I think I use it as a shorthand for "explain it to me." I don't want to be so explicit as to say "explain it to me like I'm a five-year-old," but that helps me explain it to leadership or customers or whoever.
Charlotte Ward:It also helps me figure out what the actual impact is and how we could explain it in the right language for customers. But also, just ask questions. If you feel like this war room is going off in a direction you don't understand, just pause it. Just say, hang on, I need to understand. We all know those groups of engineers: they love to deep dive into a solution straight away and forget that there are people in the room who may not know the potential implications of a proposed solution. So it's okay to say, can you just explain to me, if you do that, what will that mean under these circumstances, or for that customer? Just ask questions, and that takes confidence too, I think.
Charlotte Ward:So don't be afraid of a war room, I think, is what we're saying. Own the skills that you have; you're there for a reason. That's the main message here. That's awesome. Thank you so much, Kat. This has been another one, and I'm prepared to say it now.
Charlotte Ward:I was excited, and I'm still excited for the next one. So next time, a little teaser: we're going to be talking about those roles, and particularly the customer liaison role, in a bit more depth, aren't we?
Kat Gaines:Yes, we'll be spending a lot of time on that, on where, for a lot of people, support really slides into this process easily, and on owning that part of the process and what it looks like.
Charlotte Ward:Awesome. Thank you so much. I look forward to the next one. I am excited. It's going to be great. I will talk to you again soon. That's it for today. Go to customersupportleaders.com/268 for the show notes, and I'll see you next time.