Detection Dispatch (Alex's Version)

A DE's Guide to Staying in the Loop feat. Your Favorite Detection Engineering Instructor Hayden Covington

Alex Hurtado Season 1 Episode 2

Detection Dispatch (Alex's Version) episode two brings on the person who treats detection engineering like an actual craft....not a vendor feature list, not a MITRE bingo card, not a vibe coded rule you ship and forget. Hayden teaches detection engineering at Antisyphon Training and runs the SOC at Black Hills Information Security, which means he's not theorizing. He's got the reps, the scars, and even a home SIEM with documentation. This is the episode for practitioners who are watching Claude write their detections and quietly wondering if they're slowly getting worse at their job.

In this episode we cover:

  • The detection lifecycle nobody actually closes: research, write, validate, and the canary step that tells you whether your thousand rules are quietly dead in the water six months from now.
  • The CTI firehose problem. When every vendor blog is just an ad wearing a threat report costume, how do you find the gold? (Hint: DFIR Report and Google TI don't need your clicks)
  • AI writing detections: yes, with caveats. No for junior engineers who've never written a query. And absolutely not without a review agent, an experimental pipeline, and final approval from a human who still knows how to dribble the ball.
  • Why you cannot send AI out like a Pokémon and what happens to your detection program when you try.

Find Hayden at @kilobytethedust and at antisyphontraining.com.

Detection Dispatch (Alex's Version) is an independent detection engineering & threat hunting podcast. Rebuilt. Community-first. Featuring a lineup of the real and active projects pushing the limits of detection engineering, threat hunting, and everything in between.

SPEAKER_00

Today's episode is brought to you by Detections AI, a fast-growing repo where the DE community shares peer-rated detections and adds new content every single day. Welcome back to Detection Dispatch, my version. For episode two of this relaunch, I wanted to kick it off with everyone's favorite detection engineering instructor, the person who, more than anyone else I follow, treats detection engineering like a true craft, like something you actually have to teach and not just a checklist. And he's here to talk with us today. So, Hayden, it is my absolute pleasure to host you on the podcast. Thank you so much for coming on. We were just chatting, and I feel like when I first encountered you, you had a completely different setup. I saw you for the first time in person at Blue Team Con. You had that really cool game that you were giving away. And then at RSA, we met up virtually.

SPEAKER_03

Yeah.

SPEAKER_00

Which by the way, you did not miss out on anything around RSA. So good for you that you stayed back.

SPEAKER_01

Yeah, yeah. RSA is not really my jam. Black Hat is kind of pushing it. I'm definitely more the small convention kind of person. I don't do the big corporate ones as much anymore.

SPEAKER_00

Yeah, like Blue Team Con. That was definitely your jam there. No, RSA was a bunch of AI, AI, agentic AI fatigue. I felt so oversaturated, yet underwhelmed at the same time, once you actually got into seeing a lot of the sameness.

SPEAKER_01

Right. Everybody just has the same product. It's just built on either Anthropic or OpenAI, and it just does the same things.

SPEAKER_00

Seriously. Because of it, I feel like anyone who's not using AI in their booth or headline is almost the differentiator at that point.

SPEAKER_01

Right. Exactly. Like you have to use it, otherwise your share goes down. Like that's that's how it is at this point.

SPEAKER_00

Literally. I was about to throw myself off the pier, like Pier 39 with those sea lions. I would have had a much better time with them. But you know what I did like? The weekend before, leading into RSA, was BSides SF. They're probably the biggest BSides in the world. And they had a parallel track going at the children's museum, something like that, alongside RSA. Those were phenomenal tracks, I will say.

SPEAKER_01

Nice. Yeah, the BSides are also a lot more my jam. I prefer the smaller stuff. You can actually get more connection with people, you can actually talk about things. There aren't as many salespeople there.

SPEAKER_00

That kind of thing. Don't we love that? I'm gonna have Johnny Christmas coming on in a couple of episodes, and he likes to talk about the right way to sponsor as a vendor. And I think Antisyphon and Black Hills actually do this really well too: be a passive sponsor and invest more in the content, so that you just recognize and associate the brand as someone super credible. And that's something that I really commend you guys for doing. You guys have probably the best workshops. The last one I joined was the Unix one, the macOS investigation.

SPEAKER_01

Yeah, with Patterson.

SPEAKER_00

Exactly. You really walk away better from it, truly.

SPEAKER_01

Yeah, all of our content is very much made without an agenda. It's made because people wanted to make it, it's made with passion, and we treat our marketing, and I use that word loosely, the same way. I went to a con in Missouri recently, and we brought a bunch of our game. It's a tabletop incident response card game called Backdoors and Breaches, which is phenomenal, by the way. So cool. It's a lot of fun. I won't go down that rabbit hole, but we brought a bunch of those. We brought a bunch of what we call survival guides, which are tips and tricks for people new to the industry. And we brought some marketing guides, which talk about our services, and we forgot to give those out pretty much the whole first day because we were busy playing the game with people at our booth. We're like, oh man, we are such bad salespeople. A couple of people came back and asked, can I have one of those? And we're like, oh yeah, sure.

SPEAKER_00

But that almost filters it out for you. Instead of the RSA approach of scanning everyone's badge, where they really just want to pick up your swag, it filters for the people who are probably a better lead for you down the line, you know.

SPEAKER_01

Yeah, there's a lot less agenda in that conversation, right? And that's a lot of what it is, even in the cyber space these days: you talk to somebody, they want something from you. That's always what it is. There's always a catch. They always want to sell you something.

SPEAKER_00

Totally.

SPEAKER_01

Always, always somebody has an agenda. It's so sad.

SPEAKER_00

It is, and that's something I want to get into as well: figuring out what detection engineering content out there you can really trust, which there's not a lot of. So we selfishly would love for you to keep posting and blogging and doing a lot more of that, because as vendors put out content, especially research content, it's so hard to sift through what is even reasonable, or whether it's a hidden agenda of them just trying to sell you their SIEM at the end of the day.

SPEAKER_01

Right. And that frustrates me more than anything else. Something I saw recently: there have been a lot of supply chain attacks on different open source repositories, and I've seen a lot of people post about those. But I saw a post talking about one of these breaches, and it was a sponsored Twitter ad. Your article is just an ad for your product. I don't care about anything you have to say. I don't even want to waste the tokens to summarize your article.

SPEAKER_00

Hell no. Yeah, figuring out which article is worth your tokens. That's the headline.

SPEAKER_01

Yeah, yeah, exactly. Yeah.

SPEAKER_00

Well, I definitely want to get into that. And for those that don't know him by now, Hayden is a detection engineering instructor at Antisyphon, am I saying that right? Antisyphon. Yes, Antisyphon Training, and at Black Hills as well. And I love that you're constantly talking about this loop, this life cycle, and this is the first time I've heard it from you: the canarying. So the research, the query, the backtest, but then the canarying of it all too, the validation of your detections. I want to get into that, because I think that's pretty genius. But before we get into that loop, I also want to get into the loop of how to stay in the loop as a detection engineer, so that you're not getting clawed out, right, like a lot of these other companies are. So I want to get into that. But overall, it sounds like you've been good. Have you been working on any personal projects, or are the workshops taking up all your free time?

SPEAKER_01

Man, we were talking about this recently in an internal discussion about AI stuff. Once they dropped OpenClaw, which is like a framework for agentic automation and things like that, I think I spent pretty much every night and weekend for like two months straight building one of my own. I've since downsized it some and made some changes to it. But it is to the point now where I have back-end applications that I host, that I've created, that it can interface with to do all sorts of ridiculous things. It can assess what groceries I'm likely to currently have, and then build an order on my grocery website, and then I can go in and fact check and complete it. Those have been my personal projects: a bunch of weird AI nerd things.

SPEAKER_00

That's a really good use case, because you know what I'm really bad at? Letting lettuce go bad.

SPEAKER_01

Oh yeah. Oh my gosh. It goes bad like 40 minutes after you bring it home. Like it goes quick.

SPEAKER_00

And the more organic you go... my mom, you know, being an almond mom, has literally ingrained in my head that we need to buy all organic. And the more organic you buy, the less it survives. And it's like, okay, I need to make sure I make a salad, a sandwich, a lettuce wrap, before it goes bad.

unknown

Right.

SPEAKER_00

So that was really good for that.

SPEAKER_01

Yeah, I've done a lot of stuff like that. I read a book a while back, an okay book, I can't even remember the name of it, but there was a quote in it that said: the price of really getting to know and understand artificial intelligence is at least three sleepless nights. Meaning you will have to put in a lot of time, sometimes your own time, to really understand this stuff.

SPEAKER_03

Uh-huh.

SPEAKER_01

And I've done that partially out of like interest, but I've also found like a lot of my personal projects overlap very, very well with things that I, you know, start to do on the job too.

SPEAKER_00

I agree with that, but I think I agree with the quote even more: to really understand something, you must teach it simply. And that's something that you really do well.

SPEAKER_01

I appreciate that. Because I've had a lot of questions. I did a workshop recently. It's called detection engineering crash course or something. I change the name in my head all the time.

SPEAKER_00

Um, is that the one with like hundreds of hours or something like that?

SPEAKER_01

No, this was the one that's like four hours, and it's just: go from zero to writing detections in a couple hours. But for that one, I had a bunch of people ask me, when are your advanced courses coming out? I'm like, I don't know if I want to. I really enjoy the foundational level stuff. I am teaching, with some friends, a course at Black Hat this year, which is very, very advanced on detection engineering and threat hunting.

SPEAKER_00

Yes.

SPEAKER_01

And so that one will be more on the advanced side. But for my own training endeavors, I really just enjoy the fundamentals, because if you can get the fundamentals right, all the rest of it is just figuring things out and learning little bits and pieces and tips and tricks. It's like a sport. You can understand all the rules of how to play soccer, you can understand all the nuance and all these different tactics and formations, but if you don't know how to dribble the ball, you're not going anywhere.

SPEAKER_00

Like seriously, and it's the same approach with your Claude. If you jump into setting up your Claude to do all this automated detection engineering, but you're not bringing the fundamentals, that's not setting up for success there.

SPEAKER_01

Like no, it could be setting up for failure, because your Claude could be writing broken detections and you'd never know it.

SPEAKER_00

And you'd never know it, ever. And that's the craft of it, right? You frame this life cycle as like a scientific method, rightfully so. But so many people treat or see detection engineering as a one-shot. You just write a rule, you ship it, you move on, and that's not what it is.

unknown

No.

SPEAKER_00

So where in the life cycle do you see teams break the most? I'm sure you're getting questions all the time in your classes about detection engineering. What's that common question, I guess?

SPEAKER_01

Yeah, I think the two places where I see people mess up the most are right at the start and then right at the end. The right-at-the-end piece is, you mentioned canaries, and when you say canary, you might think Thinkst or cyber deception, but this is basically alert validation or detection validation. It's effectively running code that should, if your detection works, trigger an alert. And I've seen it done a bunch of different ways. The best way that I ever saw was at a pretty big company, so they had the resources to put behind their program, but they had basically separate infrastructure where you could go in and say: this is the detection ID, this is the code I want you to run. And then it would run on a schedule on whatever platform and versions you specified, and it would run this code. If an alert didn't fire, it would then trigger an alert to the detection engineering team. And if an alert did fire, it would squelch it, close it out. Your team would never have to see it, but you'd know that your rules are working. And that's one of the biggest stumbling blocks, because you could have a thousand detections and maybe you're not logging the right event ID. It doesn't mean your detections are broken; there are a number of things that can go horribly wrong. That's why you have to do that full end-to-end validation of your detection stack. And it has to be consistent, too. Because if you do it one time, that's great that your detection works right now. But what about six months from now, when something changes in your logging policies? You could be dead in the water and not even know it.
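The canary loop Hayden describes (run benign trigger code on a schedule, squelch the alert if it fires, page the DE team if it doesn't) can be sketched roughly like this. `SiemClient`, `validate_detection`, and their methods are hypothetical stand-ins, not any real vendor's SDK:

```python
import time


class SiemClient:
    """Hypothetical SIEM API wrapper (a stand-in, not a real product SDK)."""

    def __init__(self):
        self._alerts = []

    def recent_alerts(self, detection_id):
        return [a for a in self._alerts if a["detection_id"] == detection_id]

    def close_alert(self, alert_id):
        self._alerts = [a for a in self._alerts if a["id"] != alert_id]

    def _inject(self, alert):
        # Test hook: simulates the detection pipeline firing an alert.
        self._alerts.append(alert)


def validate_detection(siem, detection_id, run_payload, wait_seconds=0):
    """Run the benign trigger code, then check whether the alert fired.

    Returns True if the detection fired (the alert is squelched so analysts
    never triage it); False means the DE team should be notified instead.
    """
    run_payload()             # execute the code that should trip the rule
    time.sleep(wait_seconds)  # give ingestion/alerting time to catch up
    alerts = siem.recent_alerts(detection_id)
    if not alerts:
        return False          # dead in the water: escalate to the DE team
    for alert in alerts:
        siem.close_alert(alert["id"])  # squelch so the SOC never sees it
    return True
```

In a real program this would run on a schedule per detection ID, which is what turns a one-time test into the continuous validation described above.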

SPEAKER_00

Ever. That one fires me up, truly. And also, I've seen levels of maturity with the validation of your detections, because that could just be a Python script, or you can waste some tokens with Claude. And I will say Claude has been more helpful now at simulating the data you would need to run a query against. But how would you ensure that the rule is working: just because it runs a query without the syntax breaking, or because it finds a result?

SPEAKER_01

Yeah. So our detection stack in the Black Hills SOC is fairly advanced at this point, and we have so many validations that happen before a rule can even get to the pull request stage. Before it even becomes a PR, we have validations that aren't even agentic. We're running validations with our SIEM platform: we have their SDK and we're running scripts on it to make sure that everything conforms, and we're test-deploying to test orgs to make sure that everything finalizes right. And so we have all of these steps where, before we even spend a token, all we're spending is GitHub Actions minutes or whatever it is. And we do all of those checks before we even bring in an agent to review things.
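A deterministic pre-PR gate of the kind described here might look like the following sketch. The required fields and severity values are invented for illustration; a real pipeline would lint against its own SIEM's rule schema:

```python
# Cheap, repeatable checks (the "GitHub Actions minutes" stage) that run
# before any agent or human ever reviews a drafted rule.
REQUIRED_FIELDS = {"name", "query", "severity", "mitre_techniques"}
VALID_SEVERITIES = {"low", "medium", "high", "critical"}


def lint_rule(rule: dict) -> list[str]:
    """Return a list of problems; an empty list means the rule may proceed."""
    problems = []
    missing = REQUIRED_FIELDS - rule.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if rule.get("severity") not in VALID_SEVERITIES:
        problems.append(f"invalid severity: {rule.get('severity')!r}")
    if not rule.get("query", "").strip():
        problems.append("empty query")
    return problems
```

Because the same input always produces the same verdict, this stage cannot hallucinate, which is exactly the property Hayden points to below.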

SPEAKER_00

Okay.

SPEAKER_01

And the agents are also very specialized toward our use case. They have a lot of specialized instructions and mock events they can pull on for sampling. And sometimes they're still wrong. But one thing, too, is that you don't always have to use AI for these validations. If your first step is some sort of code that will work the exact same way every single time, you're not running the risk of a hallucination causing something to get through that shouldn't.

SPEAKER_00

Exactly. And that's the canary.

SPEAKER_01

Yeah, that's the canary. The other piece that is so hard for people is: what do you write detections about? That one people ask all the time. And I did a talk on that a month or two ago at a Black Hills SOC summit.

SPEAKER_00

Wait, it's not it's not headline chasing?

SPEAKER_01

No, damn it. It sort of is, but the whole talk was on how you turn CTI into detections. And a couple of the points I made were very much what we talked about at the top of the show: a lot of these articles are just sales pieces, and they're worthless to you. So the talk was about how you find the actual gold nuggets in this CTI firehose of nonsense. Yeah. And that's a hard thing to do. So people get stuck on: where do I get my detection stories? That's what we call them at Black Hills: where do you get the idea to write a detection? It can come from CTI, it can come from your boss, it can come from customers or stakeholders. But it's such a hard thing to have a pipeline of ideas, because at a certain point, okay, what do we detect? The things we haven't detected yet. Well, how do we know what those are? You have this you-don't-know-what-you-don't-know kind of mindset. And so you need a way to focus your efforts.

SPEAKER_00

Agreed. Sifting through a lot of this... and I've seen the marketing side now, where some friends say there's this whole new tool called Growth X, and they're generating AI content because it helps their company be recognized by the LLMs. So if a buyer is researching a tool, the more content you generate, the more you'll show up across these tools, because now we're all buying tools through GPT, AI, whatever. And in order to keep up with that, AI has to generate content for you. With all that AI slop out there, it has never been harder to sift out something meaningful, something you can actually operationalize back into your detection engineering backlog.

SPEAKER_01

Yeah, yeah. And slop in the DE world is not unheard of either. We have a lot of agentic detection drafting, and even then, we're still trying to tune out the pieces of it that create detections that are good but way too loud. Like, we had one recently for some pretty prolific malware that we wanted to put out a detection for, and the rules were good. It put out, I think, four rules for it, but one of them was just extremely loud and totally useless. That doesn't mean that AI doesn't have a place in detection engineering, and it doesn't mean that we did something wrong. Kind of. We have an experimental pipeline in between drafting and production, and that caught the bad rule. But the question is, how do we move forward from that? And the approach that we're taking is, well, we need to stage an amount of data for this agent to query across to assess potential volume. It's easy to say, but it's a lot harder to do in a way that makes sense or works well.
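The volume gate that an experimental pipeline like this applies can be sketched as a simple match-rate check over staged sample events. The event shape and the threshold here are invented for illustration:

```python
def match_rate(rule_matches, events):
    """Fraction of staged events the candidate rule fires on."""
    if not events:
        return 0.0
    hits = sum(1 for e in events if rule_matches(e))
    return hits / len(events)


def promote(rule_matches, staged_events, max_rate=0.01):
    """Promote to production only if the rule fires on at most
    `max_rate` of the staged corpus; otherwise hold it as too loud."""
    return match_rate(rule_matches, staged_events) <= max_rate
```

The hard part Hayden names is not this arithmetic but staging a corpus that is representative enough for the rate to mean something.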

SPEAKER_00

No, it's really true. Is there a hack that you've picked up that lets you quickly filter through this vendor-generated CTI that you've come across? Because I'm still figuring that out.

SPEAKER_01

Yeah. So I use two ways, well, I guess three now, to get my intel. For a while, I actually had an application that I spent like a month and a half vibe coding. It would basically score articles, it would look for specific keywords, and it would do a bunch of fancy deduplication, all kinds of stuff. And so it would actually surface things that were relevant. That was eventually superseded by a Claude project. So it's just a project inside of Claude that runs and generates basically the top articles, and it does a lot of the same scoring logic. I basically ripped out all of the prompts from this application. Okay. Put them in this project. But maybe I'm old school. I get a lot of my most interesting stuff from RSS. Sometimes, actually, sometimes, yeah, my Twitter timeline, which is just really terrible memes and also InfoSec stuff. But mostly RSS feeds. I use an app called Feedwise.

SPEAKER_03

Okay.

SPEAKER_01

And so I have a bunch of RSS feeds in there that are categorized. I have some that are just news, and then I have one that's favorites, or no, detections, I think, is what that feed is actually called. And I have very specific publications in that section, where I know that if something pops up in there, if it's from The DFIR Report or the Google Threat Intelligence team, it is not going to be a marketing piece. They don't need marketing right now, right? And in both of those examples, they will almost definitely have detection logic in their article somewhere. Oh, for sure. And so for me, that's a free win. You talk about this malware, and then you give me basically your draft of how to detect it. So I can hand that to one of my guys, or even just an agent, and say, hey, go draft this and see if it amounts to anything.
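A stripped-down version of the keyword-scoring idea behind that article triage (both the earlier app and the Claude project) might look like this. The keywords and weights are invented for illustration; a real scorer would also deduplicate and weight by source:

```python
# Detection-rich signals score up; sales-flavored signals score down.
KEYWORD_WEIGHTS = {
    "detection": 3, "sigma": 3, "query": 2,
    "ioc": 2, "ttp": 2, "webinar": -5, "demo": -3,
}


def score_article(title: str, body: str) -> int:
    """Sum the weights of every keyword found in the article text."""
    text = f"{title} {body}".lower()
    return sum(w for kw, w in KEYWORD_WEIGHTS.items() if kw in text)


def top_articles(articles, n=5):
    """Rank (title, body) pairs and return the top n titles:
    marketing pieces sink, detection-logic pieces rise."""
    ranked = sorted(articles, key=lambda a: score_article(*a), reverse=True)
    return [title for title, _ in ranked[:n]]
```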

SPEAKER_00

Yeah. No, that's a good way to look at it. I really like Feedly, but it's expensive.

SPEAKER_01

It is. I used Feedly for a while, and I wanted to use their information security platform or whatever, but it was ridiculously expensive.

SPEAKER_00

Insane. Because they know they're good.

SPEAKER_01

They know they're good and they know that like it's an enterprise product at this point. It's not a it's not a you know a small team product.

SPEAKER_00

Oh yeah, a hundred percent. On what you were describing about the false positives, and validating against that recent detection, I recently came across, and I'll share it with you, this article about measuring detection effectiveness. And it's not on the basis of chasing 100% coverage and visibility and basically playing MITRE bingo, right? It's more around a bridge stress test model. George Chen, he just posted this last month. And it basically says: test conditions, test when they crack. So, for example, you take a control area, all the detections you have for authentication or identity, or maybe the supplemental endpoint telemetry that you have running in the SIEM, and you run 20 specific techniques. And let's say you successfully detect 15 but miss five: you have 75% effectiveness. So you're not constantly testing everything, you're testing around certain specific conditions. Yeah. And it's all Atomic Red Team validated, which is so crucial. It's non-negotiable at this point. And I don't see any vendor SIEM detection libraries really holding themselves to a standard of how effective they are. No one is saying 90% of our customers have our detection library deployed. They're just populating and populating and populating without really saying, this one has a really high false positive rate, like what you guys just discovered. They're just filling it so it looks like they have a lot, and they're not really looking back to see how good these are.
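The effectiveness number in that example (15 of 20 techniques detected in a control area gives 75%) is just a ratio over the techniques exercised; a minimal tracking sketch:

```python
def effectiveness(results: dict[str, bool]) -> float:
    """Per-control-area effectiveness as a percentage.

    `results` maps a technique ID (e.g. an Atomic Red Team test run)
    to whether any detection fired for it.
    """
    if not results:
        return 0.0
    return 100.0 * sum(results.values()) / len(results)
```

Tracking this per control area (authentication, identity, endpoint telemetry) is what turns "we have N rules" into a statement about where coverage cracks.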

SPEAKER_01

Yeah, and that's the problem with commercial rule sets, or rule sets from, like, a CrowdStrike, if you have Falcon, right? And I'm not dissing on CrowdStrike, but they have a problem with their detection stack, which is that it has to apply to all of their customers. It can't just apply to you and your org and your use case. Yeah, same with these commercial rule sets. Exactly. It has to apply to the majority. And I do like the percentage-based rule effectiveness, because generally how we measure our rule effectiveness is, we're a pen testing company, so we run a lot of internal validations on our detections.

SPEAKER_03

Yep.

SPEAKER_01

But whenever a SOC client has a pen test, we always offer to purple team it with them. So we did one recently. I think I was on a call with them twice a day, and we would go over what the pen tester did, and I would make sure that we saw detections for those things. And so we could build a timeline and show: here's how early into the attack we detected him, here are any of the potential gaps. We could uncover some potential logging issues, or, hey, this data is here, but we thought it was over there. So in that case, we measure it from a purple team perspective. Truly purple. Yeah, truly purple. But I do like almost having a standard to it, because we do track that as well. We track how many rules we have that are covered by atomics. We also use Atomic Red Team, and we use a lot of custom atomics for our custom rule sets. And so we have a measurement to say how much of these are covered by an atomic.

SPEAKER_00

Yeah. I mean, that just goes to show that this is a science. It truly is a science. There are formulas around it, there are equations around false positive ratios, and stress tests, and all of that. And I think that's also been the biggest reason why we've seen the detection engineering role grow so much in the last five years. Because do you remember, maybe 10 years ago, you never saw detection engineer as a LinkedIn title, like ever, right? Maybe it was just kind of masquerading as a Splunk admin. I don't know. Right.

SPEAKER_01

Yeah, yeah. There's definitely a lot of truth to that. It would be the Splunk people writing rules, and now you have entire dedicated detection engineering teams. I have a buddy that leads a SOC at a bank, and they have a detection engineering team that's external to his team. This is fully all those guys do. And I think it's almost to the point where you have to. There are pros and cons to having dedicated teams do it versus having your operators do it, right? But yes, it is a science. It has to be; otherwise, you will not get consistent quality output. If you just kind of go about it whatever way, it's hard to determine the quality of your outcome. But if you have a repeatable process that makes sense, that can be measured, that you can build out and experiment on and document over time, you will almost every time have a better quality output.

SPEAKER_00

Oh, yeah. And it's almost like DEs need to be type A people.

SPEAKER_01

Yeah, yeah. Because that's the truth.

SPEAKER_00

Like, every single part of that life cycle process, they know every angle of it, and they're the best person to know how it affects all the inputs and the outputs of each of those phases. And I think I heard you say you were a type A in one of your earlier workshops. I'm like, yeah, this guy's gonna be my friend.

SPEAKER_01

Oh yeah, I am really type A. I have documentation on my home lab. That's how bad I am.

SPEAKER_00

Oh man. Well, on that note, we talked about your detection engineering life cycle loop, but the other loop is about staying in the loop on the detection engineering craft itself. How do you stay updated? How do you stay on top of things? Not just CTI research; we're beyond that at this point. How do you stay up to date but still not lose the fundamentals of critical thinking? Because Claude is our interface for everything now. And it's almost like, did you really have to ask Claude that? Did you really have to ask Claude for the weather when you could just, I don't know, turn on the news in the morning? Yeah. Look outside, are people wearing a coat?

SPEAKER_01

Right. Yeah. I mean, that's a really good question. Because I think at this point, a lot of our detections are written by Claude. And that's something I recognized very early on: if we don't make sure that we are keeping a hand in this, it can get to a point where we are not as good at this as we would like to be. So whenever I write a detection, I will always be fairly critical of Claude's work. I find that helps me pay attention and not only catch its problems, but also pick up on any potential issues and keep myself sharp. But also, I will write detections just manually, just to make sure that I'm keeping up to speed, that I'm staying in practice, staying sharp. And I may write those and then send them to Claude and say, go push this through the pipeline. And it'll do the naming conventions and tag it with MITRE techniques or whatever. It'll do the stuff that I don't want to do, but I can draft up a rule and say, this is exactly what I want to detect, I have the telemetry, just go button things up, write the doc for me, here's how I want you to write it. And I think that is probably the most important part, because you can remember how to write docs again, but it would be very hard to remember how to write good detections again. Oh yes. Yeah, and so you have to stay sharp on that part. I have my own SIEM at my house now, too.

SPEAKER_00

Is it like a Wazuh SIEM, or did you build the SIEM yourself?

SPEAKER_01

This is in the cloud. I use LimaCharlie.

SPEAKER_00

Okay.

SPEAKER_01

And so I have a few agents deployed there, and I will just write detection rules for things that seem fun.

SPEAKER_03

Uh-huh.

SPEAKER_01

Just as an experiment, right? I have a Mac mini that runs a bunch of weird stuff, and I thought, I wonder what would happen if I just deployed a whole bunch of rules to this and saw what happened. And then I would pick apart the detections and try to tinker with them. A lot of the tech space right now is tinkering. You have to spend time experimenting, seeing what works. And DE is kind of the same way, because you can write a detection one way, and unless you try different approaches, different routes, experimentation, you're never going to know if there's a better solution for that rule.
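That tinkering loop can be sketched as a tiny replay harness: run a candidate rule over recorded sample events and inspect the hits. This is a hedged illustration only; the event shape and the "op over a field path" rule format are made up, loosely inspired by LimaCharlie-style D&R rules rather than any vendor's real schema.

```python
# Minimal harness for tinkering with detection logic against sample events.
# Event shape and rule format are hypothetical, for experimentation only.

def get_path(event, path):
    """Walk a '/'-separated path into a nested dict, returning None if absent."""
    cur = event
    for part in path.split("/"):
        if not isinstance(cur, dict) or part not in cur:
            return None
        cur = cur[part]
    return cur

def matches(rule, event):
    """Evaluate one simple rule: {'path': ..., 'op': 'ends with'|'contains', 'value': ...}."""
    field = get_path(event, rule["path"])
    if not isinstance(field, str):
        return False
    if rule["op"] == "ends with":
        return field.lower().endswith(rule["value"].lower())
    if rule["op"] == "contains":
        return rule["value"].lower() in field.lower()
    raise ValueError(f"unknown op: {rule['op']}")

def replay(rule, events):
    """Run a rule over recorded events and return the hits for inspection."""
    return [e for e in events if matches(rule, e)]

sample_events = [
    {"event": {"FILE_PATH": "/usr/bin/osascript"}},
    {"event": {"FILE_PATH": "/Applications/Safari.app/Contents/MacOS/Safari"}},
]
rule = {"path": "event/FILE_PATH", "op": "ends with", "value": "osascript"}
hits = replay(rule, sample_events)  # only the osascript event matches
```

The point is not the harness itself but the habit: rewrite the rule, replay the same events, and see whether a different approach catches more (or alerts less).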

SPEAKER_00

So it sounds like you're relatively okay with using AI to build detection logic. This is a little bit controversial. Some people are very against it. Some people say, well, Claude is really good at reading schemas, if the schemas are well documented. In the case of AWS, maybe. Some of the newer SIEMs are not. CQL, CrowdStrike's query language, no. But there are some that are. And if you point Claude at it, maybe it will give you a good first pass. Maybe. But is that going to be effective for a junior DE who has never actually written a query in his life?

SPEAKER_01

Definitely not. No. And I'm absolutely okay with Claude writing detections, with one really important caveat: I get final approval before anything hits production. Because I've seen it make some really crappy detections before, and I've also seen it make detections that slipped by me, where I thought, that seems pretty good, and then it turned out to be extremely noisy. And part of that caveat is that we are still in the loop for approvals, but we also have a lot of guardrails around it, so that if something does get by, it's stopped somewhere in that process. Our experimental pipeline catches things that are overflowing with too much volume. All of our validators catch things that Claude broke. A Claude review agent with a different set of instructions reviews everything to check for problems. So you can't just boot up Claude Code and say, write this rule for me. You could, but once you get into more advanced detections and more specific use cases, and then once you start to scale it, you're just asking for potential issues down the line.
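That experimental-pipeline idea, trial a draft rule, gate it on alert volume and precision, then hand it to a human for final sign-off, can be sketched roughly like this. The `RuleTrial` shape, threshold values, and function names are made-up illustrations, not any real product's API:

```python
# Sketch of an "experimental pipeline" gate: before a rule is promoted to
# production, look at its behavior over a trial window and reject anything
# too noisy or too imprecise. All thresholds here are illustrative defaults.

from dataclasses import dataclass

@dataclass
class RuleTrial:
    name: str
    hits: int            # alerts fired during the trial window
    days: int            # length of the trial window
    true_positives: int  # analyst-confirmed real hits during review

def promotion_decision(trial, max_hits_per_day=5.0, min_precision=0.5):
    """Return (approved, reason). A human still gets final sign-off either way."""
    rate = trial.hits / trial.days
    if rate > max_hits_per_day:
        return False, f"too noisy: {rate:.1f} hits/day exceeds {max_hits_per_day}"
    if trial.hits and trial.true_positives / trial.hits < min_precision:
        return False, "precision below threshold: mostly false positives"
    return True, "within volume and precision thresholds"

# A quiet, mostly-accurate rule passes the gate; a flood of alerts does not.
ok, why = promotion_decision(RuleTrial("example_rule", hits=3, days=7, true_positives=2))
```

The gate only blocks promotion; it never auto-approves, which matches the "human gets final approval" caveat above.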

SPEAKER_00

Oh yeah.

SPEAKER_01

So you have to be very, very careful, because it can write these detections, but only if it knows and is instructed exactly how to do so, and is protected from any potential mistakes.

SPEAKER_00

Well, now I have to know. I'm so curious now about your Claude Code setup for detection engineering work. Do you have a bunch of custom skills, agents, subagents, MCP servers you're standing up yourself? What's the lay of the land there?

SPEAKER_01

Yeah, I have it open. Let me just run the agents command right now.

SPEAKER_00

Okay.

SPEAKER_01

So this repo has three agents: one for writing detection rules, one for validating detection rules, and one for tuning detection rules. And then just at the general user scope, I think I have like 12 other plugin-related agents. We also have a bunch of skills, a lot of different skills related to these detection rules and FP reviews and things. And that's how you have to be in order to do this well. Because AI is just a tool. That's all it is. It's not the magic silver bullet that everybody's been searching for. It can solve a lot of problems very quickly, but it can also create just as many just as quickly if you don't use the tool correctly. So we have a lot of custom agents, a lot of skills, specific plugins that we use for certain things. You have to be very careful with it, because if you hand somebody a very powerful weapon, they have to understand how to use that weapon correctly.

SPEAKER_00

Yeah, wait, what is that quote? With great power...

SPEAKER_01

Comes great responsibility.

SPEAKER_00

Comes great responsibility. Since you're in the know, then, you've probably seen it fuck up a lot of times. Is there something you don't trust it to do at all? A part of detection engineering it cannot do, that you will never have it do?

SPEAKER_01

I mean, that's tough. It's usually new log sources. I won't really let it do those, or what I'll do is go find sample events of what I want to detect and give it those as context. Because otherwise it's just going to guess on field names. Our agents have reference sample events, but for something new, they're not going to have that, so it's going to do a best guess. That's a big one. I also just in general don't trust Claude, or AI, on anything critical without some caveat. Like, I'm a big proponent of the dangerously-skip-permissions flag. I use that all the time, but only on things that are not production. On things that are production, I've seen it make some crazy decisions. Really good example: I actually have a detection rule that can handle Claude sessions. We have a tool we can offer to our SOC customers where we can deploy an agent to their Claude Code sessions, and then we can log and respond as an EDR agent to that Claude Code session. And I wrote a rule based off of something I saw happen to somebody sitting next to me. We were at Wild West Hackin' Fest, and I was sitting next to this guy, and his Claude tried to post a public gist of some logs he was messing with. He was like, look at this. And we were dumbfounded that Claude would try to do that. So I wrote a rule, and what it does is look for anything similar to that: a Pastebin, a gist, or whatever. And if it detects that, it actually kills the Claude session, marks it as a policy violation, and triggers an alert. So I told him, hey, go try and do it again. And he did, and that session got smacked down by the EDR agent. And I think that's where we have to be now, because if you are operating entirely out of Claude on your command line, your desktop EDR is only going to help you so much. It'll stop it from installing things it shouldn't, maybe, but running commands and stuff just through the agent harness? It might be able to detect that, it might not.
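A lightweight version of that "kill the session on a public gist" control can also live client-side as a Claude Code PreToolUse hook. This is a hedged sketch: the payload field names (`tool_name`, `tool_input`) and the exit-code-2-blocks convention follow my reading of the Claude Code hooks documentation, so verify them against your version before relying on this, and note it is a far weaker control than a server-side EDR rule.

```python
# Sketch of a PreToolUse hook check that blocks shell commands which look like
# log exfiltration to a public paste service (the gist incident described above).
import re

# Patterns suggesting an upload to a public paste/gist service (illustrative list).
BLOCKLIST = re.compile(
    r"(gist\.github\.com|api\.github\.com/gists|pastebin\.com|gh\s+gist\s+create)",
    re.IGNORECASE,
)

def verdict(payload: dict) -> tuple:
    """Return (exit_code, message); exit code 2 asks Claude Code to block the call.

    In a real hook, this payload would arrive as JSON on stdin, and the script
    would sys.exit() with the returned code after printing the message to stderr.
    """
    if payload.get("tool_name") != "Bash":
        return 0, ""  # this sketch only inspects shell commands
    command = payload.get("tool_input", {}).get("command", "")
    if BLOCKLIST.search(command):
        return 2, "policy violation: public paste/gist upload is not allowed"
    return 0, ""

# A gist upload gets blocked; an ordinary command passes through.
blocked = verdict({"tool_name": "Bash", "tool_input": {"command": "gh gist create chat.log"}})
allowed = verdict({"tool_name": "Bash", "tool_input": {"command": "grep -r TODO ."}})
```

A determined agent (or attacker) can route around a regex blocklist, which is exactly why Hayden's version enforces the policy from an EDR agent watching the session rather than trusting the client.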

SPEAKER_00

That got me thinking. There are some people in the community looking for ways to protect against that dangerously-skip-permissions flag. That is pretty wild, because even for those who try to monitor Claude, understanding the OTel is so hard. And it's hard to store. I guess it's not hard to store, it's just expensive, right? I've seen a couple of projects out there, like MCP wrappers, that can track a lot of the input, output, prompts, and responses. But we're lacking detection content for Claude visibility, and for distinguishing a malicious agent from a real human-prompted agent.

SPEAKER_01

Yeah. That's a brand new attack surface. Humans are using it constantly to do dumb things, but attackers can take advantage of it too. So we are very much behind in terms of protecting against these things. It's one thing for somebody to install the Claude desktop app and mess around asking stupid things and wasting money, but it's another thing for somebody who doesn't really understand what they're doing to spin up Claude Code and start going to town on a production repo. That's just a bad idea overall. So we're behind in that space. That's why I spent a fair bit of time that weekend, when I saw Claude try to do that stupid thing, writing several rules for our Claude Code harness, our EDR agent for Claude, basically, that all of our team is expected to have, just to try and prevent it from doing things you wouldn't expect it to do. It's trying to be helpful, but that helpfulness is potentially harmful in some cases.

SPEAKER_00

Yeah. Well, you're going to have to give me... did you write anything about this at all, so we can link it in the show notes? I know Trail of Bits has a pretty good write-up on the dangerously-skip-permissions stuff as well.

SPEAKER_01

Yeah, I haven't written anything about this. I have one that I'm going to publish before long. It's been sitting in my inbox forever, about how to not let AI think for you.

SPEAKER_00

Uh-huh.

SPEAKER_01

That's a big pet peeve of mine when I see people use AI: they have Claude write their plan and then execute their plan. It might be good, but it is the worst possible version of good. The right way is for you, as the human being, to have Claude execute your plan. That's the way to get an actually good output, and then you are utilizing the tool in the correct way. So that's the one I have sitting in my inbox. But the dangerously-skip-permissions thing, I don't know how many people know about it. Claude now has an auto mode, which I do actually like. In a lot of ways, it's better than dangerously-skip-permissions, because it will pass through things and basically let Claude decide what needs your permission. And so far it's done a surprisingly good job: it will stop before it makes a push to a repo or something and flag it for review, but it will normally skip over other things it knows I don't care about. I don't need to approve you running grep on this file. I could not possibly care less. So that's something I think a lot of people should probably be using instead of dangerously-skip, because that's a good way to accidentally break something.

SPEAKER_00

Yeah, break a lot of things, and maybe do more than that.

unknown

Yeah.

SPEAKER_00

The last thing I'll say before I wrap up the episode: have you heard of Claude Gotchas?

SPEAKER_01

Claude Gotchas.

SPEAKER_00

I guess it's the new thing, like the new skills. It's like a new CLAUDE.md. It shouldn't replace your CLAUDE.md, but it's a supplemental markdown file where you list all the times it's been wrong before, so that it will absolutely know not to do those things again.

SPEAKER_01

Yeah, I did something similar to that with one of my Claude agents, because I have what I called a learnings file, where it would basically track issues or incidents or problems that happened before. And part of its instructions were, when it approached things it didn't yet know how to do, it would check those files to make sure it hadn't screwed up in a way adjacent to that before.

SPEAKER_00

I like that. I love that a lot. In closing the loop: I feel like there's got to be a new way now to train junior analysts. From senior analyst to junior analyst, maybe exporting Claude knowledge sessions from person to person. I feel like that's going to be the new way forward, if you're not already sharing your projects, I guess.

SPEAKER_01

Yeah. So I actually just talked with our team internally about that today, because I've been trying to work on a shared marketplace internal to our team, where I can basically stage different repos or modules as plugins to Claude, so that anybody could pull up their agent from a higher-level vantage point and bring in different pieces of work they need to do their job. Because people are throwing around great ideas left and right. Someone's like, hey, I put out this skill to get hunt or detection ideas out of CTI. Everybody's throwing all these great ideas back and forth, but if they're just everywhere, you're not really going to know what's what, or what you can take advantage of. So you almost need somebody who owns your AI knowledge base, I guess. I think there was an article recently about Amazon, where they had several hundred different AI initiatives. And in that Substack article, the guy was talking about how they had four tools named Insight, two named Insight AI, and one named Insight 2.0, and they all did the same thing. I think that's going to be the next tool sprawl: 40 different AI agents that all do the same thing, because someone thinks of a really good idea, but it's so distributed that no one knows the idea already exists.

SPEAKER_00

That is how it goes, because now anybody can take their idea to life. Literally anyone. There are all these vibe-coded projects out there that have vulnerabilities in them.

unknown

Oh yeah.

SPEAKER_01

Yeah, the barrier to entry now for anything is zero. There's no excuse not to build things at this point, and that's both a really good thing and a really bad thing.

SPEAKER_00

It's a really bad thing. The proliferation of that is making me really, really scared, because now it's: how do you control people in marketing and go-to-market and sales spinning up their own dashboards, public-facing, so they're just available out there for anyone?

SPEAKER_01

Yeah, yeah. And for me, it just means I constantly don't have an excuse not to do these crazy projects I think of. I just need like two hours and I can make this happen. What's my excuse not to do this? And then I find out it's like 8 p.m. or something.

SPEAKER_00

And so now we know what you spend your weekends doing.

SPEAKER_01

Yeah, yeah. Usually. I love the Claude remote control, because I can type it and then leave, and do it from my phone while I'm somewhere else.

SPEAKER_00

Oh, I love that. I love that. I swear, I'm convinced there's a shortage of Mac minis, by the way. You really can't go on Apple and order any, because everyone is using them for Clawdbot and running their own agents at home. I'm truly convinced that Clawdbot was the reason for this Mac mini shortage. Or influencers, yeah.

SPEAKER_01

Absolutely. It was also the reason for a lot of Claude outages, I'm pretty sure, too.

SPEAKER_00

Yeah.

SPEAKER_01

So yeah, and people will talk a lot of crap about the Clawdbot, OpenClaw thing. Quick soapbox: I'm almost certain that all the people who have problems just don't know how to secure infrastructure. Everybody's talking about how it's insecure. Mine was never insecure, because I know how to set up a server and I know how not to leave ports exposed. So it's just a skill issue if your Clawdbot was compromised.

SPEAKER_00

Well, that's the thing. I think that really is the heart of this. You said something a while ago that I really liked, that really stuck with me: AI won't fix weak processes. AI can elevate someone who's really good, who knows how to close ports, but it will never rescue someone who's weak, or an organization with a weak process. And I think that's really the core here. At the end of the day, a lot of your trainings can really help people continue to stay in the loop of their practice, of their craft, and not shy away from Claude. They can use it to accelerate, but definitely not lose the craft of it.

SPEAKER_01

Because you have to understand the fundamentals in order to get the agent to do what you actually need it to do. It's like having a junior detection engineer under you. You can't just send them into the fire and say, go write this rule for me. You have to guide them, tell them what you want, give them a scope, coach them through it. And over time, as you train them and build better instructions and guardrails, they will do a better job. But you have to understand the process in order to train that individual, right? You have to treat AI the same way. You can't just send it out like a Pokémon and expect it to do an amazing job. You've got to train it, you've got to level it up.

SPEAKER_00

That's the fastest way to lose the craft: onboarding Claude before you've onboarded the fundamentals. So there you have it, folks. Before we log off, Hayden, is there any resource, project, or person in the community that you're watching, whose GitHub we should spotlight? Or someone you reference to get content from, like the CTI project you mentioned earlier?

SPEAKER_01

There is, actually. Let me pull it up. There's a class by two friends of mine.

SPEAKER_00

Uh-huh.

SPEAKER_01

It's on Antisyphon as well, so it's usually not going to be crazy expensive. I love the pay-what-you-can. Yes, pay what you can.

SPEAKER_00

I love that.

SPEAKER_01

This one is Threat Hunting with Velociraptor, THVR, with Eric Capuano and Whitney Champion. It's: how do you take an open-source tool and threat hunt across all of this different data? And I think that's almost a counterpart to all the stuff that I do. I focus on detections, and they do a lot on hunting. So I love their courses that put out really good, accessible knowledge on the other side of detecting, because DE and threat hunting really complement each other. They are critical counterparts, and you can't really gloss over either as a detection engineer.

SPEAKER_00

No, I always say, that is the ultimate question, right? When does a threat hunt graduate into detection engineering, a scheduled detection?

SPEAKER_01

Yeah, it should be almost every time.

SPEAKER_00

Almost.

unknown

Yeah.

SPEAKER_00

I think right now the best answer I have is: if it doesn't generate too many false positives.
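That graduation criterion can be stated as one small predicate: once a repeatable hunt query stays under a false-positive budget, schedule it as a detection. The threshold value and function name here are illustrative, not a standard:

```python
# One concrete version of the "graduate a hunt into a detection" test:
# a hunt with an acceptable false-positive rate becomes a scheduled rule.

def should_graduate(total_hits: int, false_positives: int, max_fp_rate: float = 0.3) -> bool:
    """Graduate a hunt to a scheduled detection if its FP rate is within budget."""
    if total_hits == 0:
        return False  # no signal yet; keep hunting before scheduling it
    return false_positives / total_hits <= max_fp_rate

# A hunt with 2 FPs out of 10 hits graduates; one with 8 FPs out of 10 does not.
```

The budget itself (here, 30%) is whatever your SOC's alert-fatigue tolerance says it should be.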

SPEAKER_01

That's a good measurement. Yeah.

SPEAKER_00

Well, thank you so much, Hayden, for your time. Where can folks follow you to keep up with the latest, to keep up with your classes? I'd love to send them your way: Antisyphon Training. If there's anything we can do in the future to collaborate and continue the conversation on detection engineering, you're always welcome back on the podcast.

SPEAKER_01

Thank you. Yeah, thanks for having me. It was a lot of fun to talk AI and detection engineering in just one show. Two of my favorite topics.

SPEAKER_00

That's right.

SPEAKER_01

Yeah, I have Twitter. I think my account is @kilobytethedust. I don't think I ever post anything, though. Oh no. I do have a LinkedIn, which has just made my screen very bright.

SPEAKER_00

You rarely go on there. Every time I message you, it takes you like four business days to respond.

SPEAKER_01

Yeah. I turn off notifications for all social media apps. So whenever I open LinkedIn, it's like, oh no, this person asked an important question a month ago. I'm so sorry.

SPEAKER_00

Oh man. Well, that's okay. Until next time, this has been Alex's version of Detection Dispatch. If this episode lit you up, the next place to go is Hayden's Antisyphon training. Stay in the loop and stay in the craft.