TLP - The Digital Forensics Podcast

Episode 2 - NIST SP 800-61 Computer Security Incident Handling Guide (Preparation)

Clint Marsden Season 1 Episode 2


In this episode Clint Marsden talks about the first phase of computer security incident handling according to NIST. Listen to real-world examples of how to get prepared before a cyber security incident arrives.

Show notes:

Link to NIST SP 800-61 PDF

https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-61r2.pdf


Bro has been renamed to Zeek: https://zeek.org/

RITA (Real Intelligence Threat Analytics), created by Active Countermeasures, is available from https://github.com/activecm/rita



Hi, and welcome to TLP, Traffic Light Protocol, the Digital Forensics Podcast. In the first episode, I touched briefly on the NIST Incident Response Lifecycle. If you're not familiar with the NIST Incident Response Lifecycle, let me just give you a quick background.

So NIST is the National Institute of Standards and Technology, based in the United States. They do some incredible work and provide some public papers, most notably what we're gonna be talking about today. One of their document types is what they call a special publication, which is a document that will contain some fairly technical information.

And in this case, we're looking at NIST Special Publication 800-61, specifically revision 2, 800-61r2. And this is all about the Computer Security Incident Handling Guide.

And it's fairly chunky. We're looking at about 80 pages for the PDF. What I wanna do is a four-part series on TLP that breaks down what the Incident Response Lifecycle is and creates a bit of a discussion forum. It potentially allows you to send through some questions to get a little bit more clarity if you're interested and you want me to cover it here, or it might give you something to pivot from and take a look at the actual NIST website, which is nist.gov. The full link to the PDF will be in the show notes.

So I'm not gonna go through all four phases in their entirety today, but for completeness, the Incident Response Lifecycle looks like preparation, detection and analysis, containment, eradication and recovery, and post-incident activity. Now, the diagram that's in the PDF, and this is not the easiest thing to describe on a podcast, but imagine you've got four squares in a horizontal straight line. They move from preparation, which leads on to detection and analysis on the right-hand side of that. Then on the right-hand side of detection and analysis, you've got containment, eradication and recovery, and then post-incident activity.

That is the flow of an incident, and getting ready to respond to an incident. If you've already got tools in place, you might be able to do some detection and analysis, but really, you want to get into preparation first. And the reason why you want to do that is so that as a business, as a team, even if you're an IT team of help desk personnel, system administrators, or if you're a dedicated SecOps team, you want the business to be ready to respond to those incidents by doing training, by having playbooks, all that good stuff that makes it easier to deal with on the day.

And allows for consistency in responding to incidents. But it also gets you into a place where you can start to prevent some incidents by securing and locking down endpoints, servers, network devices, switches, but also edge firewalls. And if you're really lucky and you're a little bit further down the line, you might look at securing applications as well.

Maybe when applications are initially deployed, part of that preparation process is having a framework in place that says, we're going to enable as many of the security controls as we can at the time of implementation, because turning them on once it's gone live is very difficult. Most people are concerned about causing an outage, which means disruption to business, which means lack of revenue, which is a problem for all businesses. Anything that affects revenue generation is a problem.

Of course. So the preparation phase is fundamental to anything to do with incident response. We must prepare.

As part of that, NIST talks about incident handler communications and facilities, and I'm just going to run through those. So they start off with contact information.

So they start with contact information, which means, as an absolute minimum, get everyone's mobile number listed in your internal knowledge base or on your phone, however you're doing it. But all the key people: you want to have everyone in your team, as well as people outside your team who control assets that may need to be modified or engaged if there was an incident. So expanding from your internal SecOps or sysadmins or help desk staff, start to look more broadly at your network team, your management team.

Then you can start looking at building a bit of a call tree; bring it up when there's an incident and off you go. Having it all listed is nice. When you have a large team or a large organization, trying to remember who does what is hard. You don't want to be at midnight or 2 a.m. going through your phone, trying to remember who does what.

Get it all listed. It will make life easier. If you've got on call, that is going to be worthwhile having as well.

You might have a SecOps team who are on call. You might have network teams who are on call as well. That will be worth knowing.

Generally, if you're being woken up in the middle of the night to deal with an incident, it's not something where you send someone an email: hey, can you pick this up in the morning when you feel like it? Unfortunately, you're probably going to have to get them involved at some point, and they're going to feel the pain. You want them on call 24/7.

People who have their mobiles on silent because they don't want to be disturbed while they're sleeping are not helpful when there's an incident and you're trying to get through. I've been there before and it's not ideal.

Talking about an issue tracking system. So how are we tracking incidents? How are we tracking the IOCs? How do we know what we're dealing with? This applies in the example of it's midnight, you're dealing with an incident or you're dealing with an incident in the middle of the day. You've got to track what is happening.

That also covers off your contemporaneous notes. So the notes of what you're doing as it's occurring, the time, the date, who was doing it, what is being done. Doesn't need to be super detailed, but what you'd like to have is things like checked firewall logs for this IP address or found patient zero.

Email was sent at this particular time on this date. Taking notes as you go through the incident will make your report writing so much easier. It will capture information once and I'm pretty passionate about the report writing and the note taking angle because it will really change the way that you respond to incidents because you know what is required at the backend.
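Those on-the-fly notes can be captured with almost no tooling. Here's a minimal Python sketch that appends timestamped entries to a CSV; the file name and field set are illustrative assumptions, not anything NIST prescribes:

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Illustrative field set: timestamp, analyst, action taken, observation.
NOTE_FIELDS = ["timestamp_utc", "analyst", "action", "observation"]

def log_note(path, analyst, action, observation=""):
    """Append one contemporaneous note with a UTC timestamp."""
    path = Path(path)
    new_file = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=NOTE_FIELDS)
        if new_file:
            writer.writeheader()  # write the header only once, on file creation
        writer.writerow({
            "timestamp_utc": datetime.now(timezone.utc).isoformat(),
            "analyst": analyst,
            "action": action,
            "observation": observation,
        })

# Entries in the spirit of the examples above, e.g.:
# log_note("incident-notes.csv", "handler1",
#          "checked firewall logs for suspect IP",
#          "found patient zero; phishing email identified")
```

Because each row carries who, what and when, the CSV doubles as the raw material for the incident timeline later.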

So the report is obviously a post-incident activity. You're generally not gonna be writing the report on the same day as the incident. I would say if you're able to do that, it may not be a big enough incident to warrant a report, because incidents are generally not that open and shut.

It's not gonna work that quickly. But all the same, once you've been doing reports for a little while, as you respond to an incident you're gonna be tracking the way that it's unfolding, because you know the level of detail that will be required when you're writing the report at the end. And, jumping ahead a bit, the worst thing when you're writing a report is going through the incident timeline: if you don't build the timeline first, your report will not make sense.

It will not flow. So you gotta start with the end in mind, essentially. Not the greatest metaphor, but get that timeline written and then write the report from there.

I'll leave it there for now so we can focus on that in part four. But tracking everything in a spreadsheet is great. It also allows for people to share information and it gives them a quick representation of what is occurring.

So do that and it will make your life easier. There are a few other points in SP 800-61, like having a war room. So now it's May 2024.

Global pandemic has essentially ceased. Some people are working in the office. Some people are not.

This doesn't have to create a challenge. All it means is, if you have an incident and you're in the office, can you go somewhere where you won't be disturbed? You and your team can have everything out. You might need whiteboards, paper, easel pads, things like that, where you can be tracking everything, be undisturbed, and not be listened in on, because sometimes these can be highly confidential incidents.

If you're remote, how will you have a virtual war room? Will you use Teams? Teams is great; you can use the functionality there. The great thing with Teams is you've got chat.

So you can be pasting in artifacts. Some of that may form part of your contemporaneous notes, because you can literally type out: just checked the firewall, found how they got in at 2 p.m. on this day. Great, that's gonna form part of that reporting.

It's really easy to get carried away in the heat of an incident. And it can go by quite quickly as you're racing through different systems, trying to correlate different things: looking at network logs, looking at email, pulling a triage, looking at endpoint artifacts.

And can you remember all of that all at once? Most people can't. Having those notes, as you find something, write it down, it just makes it easier to write the report. All right.

So there are some other things that, depending on the size of the network, you might be able to implement in the preparation phase. Talking about packet sniffers and protocol analyzers: do you have a switch that has a tap port? Do you have a spare device with enough disk storage to plonk onto the network? Using something like Zeek, which is what Bro has been renamed to, you can essentially grab everything that you need.

Essentially scoop up all the traffic on the network and then analyze it later. Maybe you'll analyze it during the incident. That is something worthy of consideration.
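Once a sensor like Zeek is writing logs, the "analyze it later" part can start with something very small. As a sketch, here's a Python function that totals bytes sent per originating host from Zeek's tab-separated conn.log; it assumes the standard conn.log layout with a #fields header line, so verify the field names against your own deployment:

```python
from collections import defaultdict

def bytes_sent_per_host(lines):
    """Sum orig_bytes per originating host from Zeek conn.log lines.

    Expects the standard TSV log with a '#fields' header line, as a
    default Zeek install produces. Returns {orig_host: total_bytes}.
    """
    fields = []
    totals = defaultdict(int)
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]  # column names follow the tag
            continue
        if not line or line.startswith("#"):
            continue  # skip other comment/metadata lines
        row = dict(zip(fields, line.split("\t")))
        sent = row.get("orig_bytes", "-")
        if sent != "-":  # Zeek writes '-' for unset values
            totals[row["id.orig_h"]] += int(sent)
    return dict(totals)
```

Sorting that result gives you top talkers, which is a quick first cut at "who moved the most data out" during the window you captured.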

Also things that you're gonna need, you're gonna need removable storage devices. You're gonna need maybe USB thumb drives. You might need external USB hard drives to grab data from workstations, to do dead box forensics.

Have you got physical notebooks? They talk about digital cameras; we all have a phone with a camera in it these days. Another big one: chain of custody forms, imaging devices, and imaging forms.

These are really super useful to have ready to go. The chain of custody form is very important for cases that are going to court, but also if you are capturing evidence that is then going to be given to a third-party vendor.

So maybe you have a partner that you use to do forensics, to assist you with incident response. You wanna be tracking that transfer of this hard drive is going to this person. I've signed it over to you.

It's got the device serial number, who took it, when it was taken, where it was taken. It's nice to have end-to-end tracking of that. Then once that's all signed, make a copy and give it to the other person.

When it comes back, you can sign it back in and you can track it from end to end. Going back to notebooks and physical pens and paper: it might seem inconceivable, why would we need that? But what if there are other external factors? Now, these are rarer events. What if there's a power outage? What if you're on the go, walking from one building to another, you meet people along the way and they're talking about the incident? You can get that notepad out, push it up against the wall and start taking notes.

You don't need to go and sit down with your laptop and type it up. Things that you don't think will happen, can sometimes happen. What if the entire environment was ransomed? What if there was a disruption to online services? What if you're used to normally using OneNote or something that requires internet connectivity, like Confluence? What if that infrastructure is taken down? Pen and paper, I know it sounds a bit old fashioned, but it's all about preparation, having that stuff ready to go.

Day to day, most of the time, you're not gonna need it, but it's better to have it and not need it than need it and not have it. So, other things for preparation from an incident analysis perspective: having access to documentation from your network team, and understanding how everything is plumbed together, might help you to decouple a particular network segment in the instance of a ransomware attack. That would fall into working with the playbook: whether it needs to be enacted, who would be able to make that decision, and when that decision would occur.

Another big one is current baselines: what your existing network looks like, and what the processes are on your crown jewel servers. When you're doing forensic analysis on a system, and even on networks, sometimes everything looks suspicious. There are processes with names made of random letters.

They're operating out of what appear to be strange locations. These things at first glance can appear dodgy, but if you have a baseline, and you know what is expected from, say, the gold image from when that system was first deployed, you can tell the difference. Building that baseline can be as simple as a directory listing, a registry snapshot tool, a process list, or a RAM capture taken before that system goes on the internet. Because that can be a lot of data, you might need to restrict it to certain systems: crown jewel systems, just your critical infrastructure.
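A directory-listing baseline really can be that simple. Here's a hedged Python sketch that snapshots relative paths, sizes and SHA-256 hashes to a JSON file; the format is an illustrative assumption, and on a real gold image you'd also capture a process list and, ideally, a RAM image with a dedicated tool:

```python
import hashlib
import json
from pathlib import Path

def snapshot_directory(root):
    """Record relative path, size and SHA-256 for every file under root."""
    baseline = {}
    root = Path(root)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            baseline[str(path.relative_to(root))] = {
                "size": path.stat().st_size,
                "sha256": digest,
            }
    return baseline

def save_baseline(root, out_file):
    """Write the snapshot to JSON so it can be compared after an incident."""
    Path(out_file).write_text(json.dumps(snapshot_directory(root), indent=2))
```

Run it against, say, System32 or an application install directory on the freshly deployed gold image, and store the JSON somewhere the attacker can't reach.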

When an incident occurs, you bring that baseline out, and then suddenly, when you're comparing, it's not, oh, that's a randomly generated process name, that's malware. Well, maybe not: it could be the payroll database system, to use a classic example. You never wanna muck with payroll.
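Comparing the live system against the stored baseline is essentially a set difference: what's new, what's gone, and what changed. A minimal sketch, assuming both sides are a name-to-hash mapping like the one captured at deployment time:

```python
def diff_against_baseline(baseline, current):
    """Compare two {name: sha256} mappings (e.g. files or loaded modules).

    Returns names only in current (new), only in baseline (missing),
    and present in both but with different hashes (changed).
    """
    new = sorted(set(current) - set(baseline))
    missing = sorted(set(baseline) - set(current))
    changed = sorted(
        name for name in set(baseline) & set(current)
        if baseline[name] != current[name]
    )
    return {"new": new, "missing": missing, "changed": changed}
```

Anything in "new" or "changed" is where the forensic attention goes first; everything that matches the baseline can usually be deprioritized.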

It's very helpful to do that comparison if you have it available. Doesn't need to be for everything. Start with the crown jewels, it's easy.

Go from there; you can always build upon that. The last section of preparation moves into how we can prevent incidents. So, some incidents will be minor, some will be major. The minor incidents are unlikely to really affect the business's ability to generate revenue.

But everything has a cost: the cost of downtime for the people who are affected, the cost of the security team's time, or of engaging other stakeholders to respond to the incident. So trying to get ahead of that and prevent the incident from occurring at all is a great idea. This could be done by using a risk assessment.

So, a risk assessment that many people would be familiar with is a pen test. A pen test is generally done on one system or one application, and the report that is generated at the end generally provides a significant number of recommendations.

If it's in scope, they might do some 0-day discovery and find out what is present that was previously undocumented: things you were unaware of, where it wasn't just a matter of patching, it was a matter of implementing some other controls. But by doing a risk assessment, you may be able to look at what the crown jewels are, if you don't know exactly.

Looking at systems like payroll, like the identity platform, where new users are onboarded, where authentication is done, looking at building access systems, fire suppression, they would be some crown jewel systems. It's gonna be different everywhere. We've got to look at what the risk assessment presents for the business.

Generally, that's not a function that's done by SecOps or the IT team; it's more of a governance function. But if you can get access to that risk assessment, then you can start to align with the business and figure out who needs to do what. We need to protect the crown jewels because it was identified in the risk assessment by the governance team.

If the payroll system went offline, or if the file server that contains our intellectual property went offline, it could maybe bankrupt the business, or have some other impact. Obviously, payroll is self-explanatory. Other things you can do to prevent incidents include looking at host security.

So, following basic building blocks of security: is it patched? Have you got endpoint detection and response software on there to allow for remote forensics? Have you got belt-and-braces AV? AV is becoming less and less important, and there are a lot of bypasses, but it's all about layers. From the network side, the network perimeter in a perfect world needs to deny anything that's not expressly permitted. By locking that down, we can look at preventing things like, and this is a very hard one to block, people using Tor, or preventing people from accessing systems unless it goes via a prescribed route.

So sometimes people will seek to bypass the proxy. There are ways of enforcing that all internet traffic must go through the proxy. Are they trying to access services using VPNs? Are they using tunneling? This isn't just an insider threat thing, it's also an attacker's perspective.

When the network doesn't have much egress control, when it's not locked down, the likelihood of an incident occurring, and of the attacker being able to get out the door quickly and easily, is magnified. Monitoring that, again coming back to baselines, having a baseline of what looks normal might allow you to detect a data exfiltration that happens at maybe eight or nine o'clock at night. That might set off some alarm bells and stop an intrusion in its infancy. I touched on it before, talking about belt and braces: AV in Windows is pretty much covered with Defender these days, but there are some other places you can put malware prevention appliances, places like your email gateway, and web proxies as well.
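That eight-or-nine-o'clock alarm bell can be expressed as a simple rule over connection records: flag outbound transfers above a byte threshold outside business hours. A sketch follows; the business-hours window, the threshold, and the record shape are all illustrative assumptions you'd tune to your own baseline:

```python
from datetime import datetime

BUSINESS_HOURS = range(7, 19)       # 07:00 to 18:59 local, an assumed window
EXFIL_THRESHOLD_BYTES = 50_000_000  # 50 MB per connection, an assumed limit

def flag_off_hours_egress(connections):
    """Return connections that moved a lot of data outside business hours.

    Each connection is a dict with 'ts' (ISO 8601 local time), 'dest'
    and 'bytes_out' keys - an assumed record shape, not a standard one.
    """
    flagged = []
    for conn in connections:
        hour = datetime.fromisoformat(conn["ts"]).hour
        if hour not in BUSINESS_HOURS and conn["bytes_out"] > EXFIL_THRESHOLD_BYTES:
            flagged.append(conn)
    return flagged
```

A rule this crude will have false positives (backups, patching), which is exactly why the baseline of what "normal" after-hours traffic looks like matters before you alert on it.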

There's a lot of attacks occurring now that seek to lure employees who are doing a Google search into clicking on malicious links, going to PDFs, with enterprise agreements being one particular lure. So a bit of search engine poisoning. Or they're trying to download Audacity to get some audio editing software, or trying to get some other kind of freeware. Not necessarily pirated software, which is still a huge vector for malware, but websites that are set up to trick unsuspecting people into downloading and executing software that's laden with malware. Having a good proxy, a good web filter in place, can sometimes slow that down or prevent it, alongside user awareness and training, which is the last item that NIST talks about in preventing incidents.

So users being aware of the organization's policies in terms of software, expected conduct and acceptable behavior when using the internet is one piece. And user awareness training that includes some real-life examples of what's been seen at the organization in the past could be helpful as well. Again, it depends who's creating the training, but it is potentially worth pulling the curtain back to share with staff the real reason why we're trying to implement these controls.

This is why these things may seem a little bit inconvenient at the time, but it's to prevent damage to the company's reputation so that they're not ending up on the front page of the news, potentially also avoiding having to report an incident to the privacy commissioner. So there are those aspects as well. And at the end of the day, it is a human being who is behind the keyboard on both sides.

