Decipher Security Podcast

Andrew Morris

March 16, 2021 Decipher Episode 71

Andrew Morris, founder of GreyNoise, joins Dennis Fisher to talk about the unique origins of the company and the security case for removing all of the background noise from the Internet to find what really matters.

Speaker 1:

All right.

Speaker 3:

Welcome back to the Decipher podcast. I'm super excited today. My guest is Andrew Morris from GreyNoise. Andrew, how's it going, man?

Speaker 2:

Hey Dennis, how are you?

Speaker 3:

I'm hanging in there. I'm surviving, man. That's, uh, you know, as we were talking before we started recording, that's the best you can do.

Speaker 2:

Yeah, exactly. Thanks so much for having me on. I'm super excited to be here.

Speaker 3:

Yeah, it's my pleasure. Um, I should have asked you earlier, I don't know why I didn't, but I want to talk about how you came up with the idea for GreyNoise and, you know, for the listeners who maybe haven't used it, possibly explain how it works, what it does, that kind of thing. But where did the initial idea for this come from?

Speaker 2:

Yeah, so the initial idea for GreyNoise came up for me, it's a long story. Um, and some of the details I literally cannot get into, but I'll do my best. So originally, basically, as a pet project in maybe 2013, I set up a bunch of honeypots on the internet. There was this hosting provider that in retrospect was probably a Ponzi scheme, but there was this hosting provider that allowed you to spend like $10 or $50 and buy a VPS for life, and you can very quickly back into realizing that this just can't work. Right. Um, but so, you know, I bought like 10 VPSs, you know, whatever, $10 VPSs for life or whatever. And I set up a bunch of honeypots, and I set up these honeypots in, uh, 10 different data centers around the internet, and I'd never set up a honeypot before. Um, and so I monitored the honeypots. I set up a few different types of tech, but it was Kippo, which was an SSH honeypot, and there were a number of other ones that I used. And I looked at who logged into them and who tried to brute force those honeypots. And it was cool.
I mean, I started getting attacks immediately and I was like, wow, the bad guys are coming for me. And, uh, I was looking at all this data. And then at some point, you know, I added like the 10th honeypot and I was like, man, this is a lot of data. So instead of logging in and checking the data on all these things, I'm going to put it into a central place, and I'm going to look at the data from one place. And so I stream all the data into one place. I think I did Splunk. I streamed it into Splunk and I'm looking at the data and I was like, man, look at all these bad guys, this is crazy. And then I noticed, kind of out of nowhere, that a lot of the IP addresses that were attacking these hosts on the internet, uh, there was a lot of overlap. Like, there were a lot of IPs that were attacking all of them. Right. And mind you, these VPSs, which stands for virtual private server, for those who aren't familiar, it's just, you know, a server that you rent on the internet. Um, they're all in completely different data centers in different countries around the world, in very different locations geographically. And I was like, that's super interesting. I'm seeing the same one, two, three, four, five, 1,000 IP addresses attacking all these things. And I was like, man, that's crazy. And this was a little bit before, but it was around the time that that company Norse was becoming more and more popular. Norse, the cyber threat intel company with the map. Now they're obviously the butt of a lot of jokes, but, um, it was right around the same time that they were kind of doing stuff, and it was interesting. And so I remember thinking about it like, this is threat intel, right? You know where the bad guys are. Right. I found the bad guys.
And so that's threat intel. Right, right. Uh, and just after I had a lot of conversations and I talked to a lot of people, um, some switch flipped in my head where I was like, wait, this isn't the stuff that people should be worried about. This is the stuff that's hitting everybody. This is just internet background noise. This is just the opportunistic stuff. This is like anti-threat intelligence. This is the stuff where, if you freak out about any of these things attacking you, then you're in for a really bad time, because, I mean, it's just a barrage, constantly. So I started kind of iterating on that same sort of pattern of, like, well, let's do that with different kinds of data. Let's put this over here. Let's collect this kind of data. And then where it got extremely interesting, I noticed, was when you overlaid what you were seeing from, you know, these honeypots on the internet on top of an actual network that has actual business users. And what I found is that it creates this noise-canceling effect, where you basically have what's hitting everybody on the internet, totally opportunistically. Right. And you subtract that out from what is hitting somebody's network. And what you're left with is only the things that are hitting that network specifically, both legitimate, regular business users, but also, you know, uh, targeted attacks or more targeted attacks. So I built this thing and I presented it at ShmooCon, in ShmooCon 2014 or so. And, uh, it was basically like, hey, look at all this stuff I did. It was called No Budget Threat Intel. And it was a stupid name, but it was a fun talk about the system of honeypots.
And there was, uh, a person in the audience there who basically saw my talk and said, okay, well, you just built something that is functionally a better version of something, or you understand this problem better than we do, um, for a project that we're working on. Come on and build that out for a secret sneaky customer that I can't really talk about. And let's kinda solve this problem for a sneaky customer that we can't really talk about, who has a problem of trying to figure out if things are hitting them specifically, or if they're hitting everybody on the entire internet. And that was when someone asked the question, um, what is the expected amount of scan traffic that any host on the internet should see? And that question is so hard to answer and is so vast and so massive. And the implications of properly answering that question took me down this massive rabbit hole that I'm still going down with GreyNoise right now. And finally, you know, iterated, iterated, iterated, I built a company out of it. Just this one problem. I started GreyNoise to solve this one problem for the world. And everything has kind of come from that. And we've got different use cases now for the data other than the initial one. And we've got, you know, lots of different types of users and things that I would have never thought of before, and scale that we could have never achieved before. But that's functionally kind of where the idea came from.
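
The "noise canceling" idea described in this answer can be sketched as a set subtraction. This is only an illustration of the concept, not GreyNoise's actual implementation: the function names, the `min_sensors` threshold, and the sample addresses (taken from documentation-only ranges) are all assumptions made for the example.

```python
from collections import Counter


def background_noise(honeypot_hits, min_sensors):
    """Return source IPs seen by at least `min_sensors` distinct honeypots.

    `honeypot_hits` maps a (hypothetical) sensor name to the source IPs it logged.
    """
    counts = Counter()
    for source_ips in honeypot_hits.values():
        counts.update(set(source_ips))  # count each IP at most once per sensor
    return {ip for ip, seen_by in counts.items() if seen_by >= min_sensors}


def cancel_noise(network_sources, noise):
    """Subtract internet-wide noise from the sources one network observed."""
    return set(network_sources) - noise


# Illustrative data only, not real observations.
honeypot_hits = {
    "sensor-a": ["198.51.100.7", "203.0.113.9", "192.0.2.44"],
    "sensor-b": ["198.51.100.7", "203.0.113.9"],
    "sensor-c": ["198.51.100.7", "203.0.113.9", "192.0.2.80"],
}
noise = background_noise(honeypot_hits, min_sensors=3)
remaining = cancel_noise(["198.51.100.7", "192.0.2.200", "203.0.113.9"], noise)
print(sorted(remaining))  # whatever is left was not seen across the sensors
```

Requiring an IP to appear on several geographically separate sensors mirrors the observation that the same addresses were attacking every honeypot. What remains after the subtraction is a mix of legitimate users and anything aimed at that network specifically.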

Speaker 3:

So when you set up those honeypots, you said it was a side project, you were just kind of, you know, messing around. So until you gave that talk at ShmooCon and somebody came up to you and was like, hey, you know, this is a real thing, had you thought about, like, oh, maybe there's a business in this, or was that kind of the spark of it?

Speaker 2:

No, no, no, no. Not even a little bit. I mean, I think it would have been really cool. Without getting too deep and personal, that just isn't what really motivated me at the time. It was like, I just wanted to do cool stuff. I wanted to solve problems and do cool stuff. So even if somebody came and said, hey, I'll give you a bunch of money, but you have to solve a different problem, I would have been like, well, but I don't want to solve a different problem. I just found this one really cool. And so I wasn't even thinking about the business aspect of it. I only ever started thinking about the business aspect of anything, and making money, once it became apparent to me that the only way to properly have control of building this thing in the exact way that I think it needs to happen was to build a company around it. And to build a company around something, you have to think about money.

Speaker 3:

Yeah. Unfortunately, that's true.

Speaker 2:

We live in a capitalist society and that's just how it goes. Right. Yeah. I don't make the rules.

Speaker 3:

No, it's true. Um, so what were you doing at the time? Like, what was your job when you first set up the honeypots? What were you doing at that point?

Speaker 2:

I got kind of fired from a job, like not fired fired, but, so I was, um, let's see, I was working in between, uh, I was on a staff augmentation role for a customer, and they wanted to hire me. And so I was like, oh, sick. Right. I'm going to join this company full time. And then, right at the last second, the offer was rescinded after I'd already quit my other job. So I didn't actually get fired. I got, like, unhired, which felt like getting fired, but, you know, I was unhired and I was like, man, this sucks. And so I had kind of a week or two of not working, but of kinda my mind going a little crazy, in between me starting up my last job again. And so I was like, well, you know, I'm looking for something to do. And this just seemed like a fun thing to do. And I've never really told anybody that before, but that's honest to God where I was. I had some downtime between jobs and I was kind of looking for a cool thing to occupy my time. And that was what was going on.

Speaker 3:

Honestly, that's where some of the coolest ideas come from, not just in security, or even technology, but in life in general, if you just have some time to think about things that, you know, maybe you're vaguely aware of, but you have too much other stuff going on.

Speaker 2:

That's exactly right. Without getting into this too terribly much, it's like when you meditate and you clear out your mind, you give yourself all this new space to be creative and to think about new things. And so sometimes taking a step away, taking a step back from your slog or the grind or the knife fight that is kind of the day-to-day, it really does give you some flexibility. And unfortunately it's the kind of thing that not everybody has the opportunity or the option to do, because sometimes you just got to slog. Right? Um, and in the case that I was describing, I didn't choose to take two weeks to just think about cool stuff and what I wanted to do. Right. I had this thing that I was going to do with my job. And then all of a sudden it was kind of like, wow, okay, I've got two weeks to kill. You know, I gotta figure out something to do in the meantime.

Speaker 3:

Yeah. I mean, you could have just gone and played, like, Call of Duty for two weeks, which, you know,

Speaker 2:

To be fair, they're not mutually exclusive. I probably played a lot of Call of Duty in that time as well.

Speaker 3:

Fair. So once you, um, kind of wrapped your head around, like, okay, I'm going to build this thing, it's going to be a thing, how close to what GreyNoise is now was that original vision? Like, what's the delta between what you originally produced and where it is now?

Speaker 2:

Oh, it's a great question. So it's gone through so many different iterations. And I would say that GreyNoise right now is the exact embodiment of what the vision was several years ago, before it got bigger and cooler. Um, a lot of it was, I was imagining Shodan, but the opposite. Okay. So I was like, the same kind of layout, similar workflow, similar kind of usability, similar freemium, you know, similar feel, except obviously the data that we collect is the exact opposite of the data that Shodan collects, right? We don't scan the internet. We listen to the internet. Right. But laying out the data similarly, I've always been a huge fan of Shodan. So it was like, look, borrow from those who do good stuff, borrow from those who have figured it out. And so I borrowed a lot from Shodan. I know John, I've told him that many times, like, I've borrowed a lot from you over the years and I really appreciate it. Um, so now I would say our web interface is really beautiful. I feel really strongly about elegance, beauty, aesthetic. Products should feel good to use. You should feel cool using them. It should have kind of a good feeling. So we're right there now. The thing is that now that we're in the position that we're in, we have, you know, points of presence in dozens of countries and hundreds of data centers around the world. We have thousands of users. We have all these customers. And so now there are so many more things for us to do, and the vision has gotten a lot bigger. And so now where we're going is not at all what I envisioned a few years ago. Um, we've kind of already surpassed that, which is exciting.

Speaker 3:

Yeah. It must be. I mean, no matter how big something gets, you can kind of look back and think, like, oh man, this all came from this one little thing, this little idea I had, this goofy talk I gave at ShmooCon, you know, eight years ago or whatever it was. Um, so how exactly do people use it? Like, what are the most common use cases for your customers right now?

Speaker 2:

Yeah, absolutely. So our elevator pitch that we give to people when we're talking to customers or potential customers is: every security operations center is too busy. One of the reasons they're too busy is they have way too many alerts. Some of those alerts don't matter very much because they're generated by completely pointless, opportunistic, internet-wide scanning and attack traffic that's not even a little bit targeted towards them. We'll tell you which alerts are generated by that, you know, maybe 20, 30, 40%, so that you can focus on the alerts that really matter to you. And, uh, it's like noise-canceling headphones for your operations center, for your security products, right? That's our two-second sales pitch for what we tell our enterprise customers. But the nerd fundamental version of what the most common use cases are is, you know, what is GreyNoise? Well, we run a gigantic network of collectors, sensors, kind of like honeypots, in all these different countries. We collect data in a bunch of places. We analyze that data. We make that data available in a web interface, APIs, and security integrations. Um, what can you use GreyNoise to do? Really three main use cases. The first one is to answer the question: is this thing hitting everybody, or is it just hitting me? Right. I just saw this thing hit my network and it looks weird and it raised an alert, or it did a thing. I'm gonna look it up in GreyNoise. And if it comes back in GreyNoise, that means that it's hitting everybody on the entire internet. Not just you. It's not a targeted attack. It also means that we've probably analyzed the actual behavior of what that thing was doing. Maybe what it was looking for, what it was scanning for, what that IP address was targeting, you know, some metadata about it, any tags. And so we can also tell you, like, that weird thing is an Apache Struts, you know, vulnerability check.
That weird thing is actually a Mirai spreading mechanism, right? That weird thing is this vulnerability check, this, that, or the other. So use case number one: is this thing hitting me specifically, or is it hitting everybody on the entire internet? Use case number two: show me where compromised devices are. Right. As a byproduct of all the data that we collect, we know where a massive amount of compromised devices are on the internet, hundreds of thousands every day. So we can tell you, like, hey, these 300,000 IP addresses were compromised in the last day, right? And that's useful to people because we can tell you if something that you have is compromised. We'll use our alerts feature to do that. If you pop in the CIDR blocks that belong to your network, and we see anything pop up there, we'll immediately email you, right? If you have, like, uh, some kind of anti-abuse, anti-fraud, you know, enrichment pipeline, we'll tell you if your customers or your users are compromised. And then we'll also maybe, you know, give you information that you can use to block things that are definitely bad, like beyond a shadow of a doubt bad, right? If that's your jam, if that's how your network works. Right. And then the third most common use case, and this is the one that you and I have interacted on before, is this identification of emerging threats, and which vulnerabilities are being opportunistically exploited, and from where. Right. So everybody has this question when a new vulnerability drops. New vulnerability is out, new CVE's out, and everybody's terrified. And it's a vulnerability in a piece of software that typically, maybe, sits on the internet, like a WAF, or like a web server, or like an FTP server, or like an Exchange or an OWA server, right.
Or something like that. And everybody has the exact same question, like, is this thing actually being exploited in the wild? If so, from where? Right. And has it been weaponized and operationalized by any of these botnets yet, which are just blasting the internet, trying to exploit people with this thing? Right. And so with this third use case, what we do is, uh, our engineering and research teams will put something together on our side so that we're able to get some visibility. We'll make something that looks like the thing that's vulnerable. Right. And we'll instrument the crap out of it. And then we will look to see what happens. What do the scanners and the crawlers do when they find it? Do we find anybody who's checking for the existence of the vulnerability? Do we find anybody that's opportunistically exploiting the vulnerability? Do we find anybody that's doing that at scale? Anybody that's doing that in multiple places? Now, the issue with that third use case is that it is very hard to productize, because it's complex. You can't always do it. You can't do it consistently. And it requires a lot of work to get it right. So we mostly just do it for the marketing, right. We'll just do all this stuff for free, and we'll tell everybody about it. We do license that data to people, et cetera. But with that third use case, you can't predict it. And it's complicated and it's hard to productize. But those are the three use cases, that last one being just, you know, is anybody exploiting this vulnerability? If so, where are they doing it from? Where are people exploiting BlueKeep from? Where are people exploiting Shellshock from? Who's scanning the internet for this OWA vulnerability?
Who's running vulnerability checks at scale for this new OWA vulnerability? We can answer a lot of those questions.
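
The second use case, being told when something inside your own CIDR blocks shows up among compromised devices, can be sketched with Python's standard `ipaddress` module. This is a rough illustration under stated assumptions, not how GreyNoise's alerts feature works: the function name and the sample data are hypothetical, and the list of compromised IPs would in practice come from whatever feed you have access to.

```python
import ipaddress


def hits_in_my_blocks(my_cidrs, compromised_ips):
    """Return the compromised IPs that fall inside any of the given CIDR blocks."""
    networks = [ipaddress.ip_network(cidr) for cidr in my_cidrs]
    matches = []
    for ip_text in compromised_ips:
        ip = ipaddress.ip_address(ip_text)
        if any(ip in network for network in networks):
            matches.append(ip_text)
    return matches


# Illustrative data only (documentation address ranges).
my_cidrs = ["203.0.113.0/24"]
compromised_today = ["203.0.113.5", "198.51.100.1", "203.0.113.250"]
for ip in hits_in_my_blocks(my_cidrs, compromised_today):
    print(f"device at {ip} appears compromised; notify the network owner")
```

A linear scan over every block is fine for a handful of CIDR ranges. A service watching many networks would likely want an indexed structure instead, which is outside the scope of this sketch.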

Speaker 3:

So for example, the Exchange stuff that came out last week, how quickly, or what kind of work goes into saying, okay, we need to do our magic on the back end because we know everybody's going to call us and be like, hey, you know, what's going on with the Exchange bugs?

Speaker 2:

Yeah. So the question is, like, what's the work that goes into that? Or what's the delay, what's the drag? Yeah. So I can already feel my investors and, uh, board members hearing this, like, Andrew, please stop saying so much about how you do what you do. Um, so I don't want to get into it a terrible amount, but the short answer is, you know, there's a little bit of protocol implementation that goes into it. Sometimes you have to write code. Sometimes we just have to move around existing code that we already have.

Speaker 3:

Some things just require more work than others. Some protocols are more complex than others.

Speaker 2:

Um, some vulnerabilities are more complex when you try to emulate the device that is affected, right? It varies wildly, and it goes back to tooling. It goes back to, you know, a lot of stuff like that. So there's a lot of work that goes into this. Sometimes it's easy. Sometimes it's complicated. Um, we're still a relatively young company. We've been around for about three and a half years now. We only started hiring people, you know, two years ago. So we're still getting our ducks in order on turning this into a machine. But, um, a good amount of work goes into it. And honestly, the other crazy thing about it is, sometimes, for various different reasons, GreyNoise just can't be valuable with certain kinds of vulnerabilities, architecturally, or because they affect something that's not necessarily sitting on the perimeter, or maybe because there are some prerequisites required such that we aren't going to be able to do anything about a given vulnerability. All of these things are the reasons why we've been very hesitant, or I would say deliberate, about when we want to productize this last thing as an offering, because it's hard.

Speaker 3:

It sounds really hard to me. I'm not even sure how you go about it, honestly. And it seems like you would need quite a bit of manpower to get that done.

Speaker 2:

We do. Um, we've got patents on this stuff. If you want to dig into it more, you can rip into all of our patents and figure out what we do. It's all public. But, um, the long and the short is, you're not fooling a human. You're fooling a scanner, you're fooling a crawler. So you just need to present just enough information to them to be able to fool the scanner, fool the crawler, into doing kind of the next thing. You need to do that quickly, and you need to do that at scale.

Speaker 3:

Yeah. You mentioned earlier that, you know, a large part of the stuff that hits, say, a given enterprise in a given day is stuff that they don't really need to be all that concerned about. It's mass scan stuff. It's, you know, opportunistic, it's not targeted. I know this is going to vary widely by organization, but in general, how much of the attack activity that organizations see is something they can just filter out with GreyNoise?

Speaker 2:

That's such a great question. So I'm going to try to break that down as best I possibly can. What amount of attack traffic that organizations see can they just filter out? So I don't ever advise that our users filter out anything, because you don't want to drop data. You can never get it back. And if the tool is wrong, then you're in a bad position. So you may want to put things in a lower priority on the queue. You may want to deprioritize things. You may want to put certain events in cheaper storage. You may want to do a number of different things, but I'd say don't filter anything. The answer is, it depends. It depends on a lot of different factors, but on average, depending on where they're enriching things from GreyNoise, just about any organization that we work with is going to find a hit rate of at least 20% on the alerts that are making it to a security analyst through all of their other automation. So that's after the firewall, after whatever other automation the team has already built, workflows, et cetera. At least 20% of the things that are going to make it to the SOC are going to be contextualized by GreyNoise. Now, again, does that mean that they can just forget about all of them? No. It means that we want to turn that one-hour investigation into, like, a five-second investigation, right? Like, this thing was hitting everybody on the entire internet. And then the explainability part comes in, like, okay, well, why did it trip the alert? Oh, it's looking for this kind of technology. And we don't actually run that technology. So, cool, done. I don't need to look at this thing anymore. Right. So it all kind of goes into it.
There are a lot of factors that come into play there. Now here's the really interesting part on the other side of this. Let's just say, how many of the opportunistic attacks that an organization sees at their perimeter are things that GreyNoise can contextualize? Not necessarily what makes it into the SOC and makes it up to an analyst, but just the amount of things that hit an organization. That's like 90, over 90%. I'm not even making this up. The amount of internet-wide background noise, between just tiny stuff like scans and more complex things like attacks, um, I mean, it's absolutely overwhelming. It's so much noisier than anybody realizes. And the data that GreyNoise looks at overlaps with such a massive amount of what customers are seeing on their enterprise networks, their corporate network perimeters. It's absolutely mind-boggling.
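
The "deprioritize, don't filter" advice in this answer can be sketched as a stable sort that tags alerts from known background-noise sources and moves them to the back of the queue without dropping any of them. The alert fields (`id`, `src_ip`), the `known_noise` tag, and the sample data are assumptions for illustration, not a real product schema.

```python
def triage(alerts, noise_ips):
    """Tag each alert and push known-noise alerts to the back of the queue.

    Nothing is dropped: every input alert appears in the output.
    """
    tagged = [
        dict(alert, known_noise=alert["src_ip"] in noise_ips)
        for alert in alerts
    ]
    # False sorts before True, and sorted() is stable, so the original order
    # is preserved within each group.
    return sorted(tagged, key=lambda alert: alert["known_noise"])


# Illustrative data only.
noise_ips = {"198.51.100.7"}
alerts = [
    {"id": 1, "src_ip": "198.51.100.7"},
    {"id": 2, "src_ip": "192.0.2.200"},
    {"id": 3, "src_ip": "198.51.100.7"},
]
for alert in triage(alerts, noise_ips):
    print(alert["id"], "known noise" if alert["known_noise"] else "investigate first")
```

Keeping the tagged alerts in the queue means that if the noise list turns out to be wrong, the data is still there to revisit, which is the concern raised above about never getting dropped data back.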

Speaker 3:

Yeah. I've had conversations with people that work in SOCs, you know, many times over the years. And they all have kind of that thousand-yard stare after a while. If they've worked in a SOC for long enough, you know, they're just kinda like, that's exactly right. They just have, like, PTSD from, like, I have nine monitors in front of me showing, you know, all these pew-pew maps and all these alerts. And the contextualization part of it, and reducing the noise, must be such a huge thing for them. Like, okay, I might need to worry about this, but not yet.

Speaker 2:

Yeah. Tomorrow. Look, there are a lot of ways to think about GreyNoise and to think about this problem. But at the end of the day, security people don't want more things to worry about, right? I promise you, they don't. The people in the SOC are overworked. They are exhausted. It is a very hard job. They are getting alerts from a zillion security products. They are desensitized to a massive amount of those alerts being absolutely useless wastes of time. And they are frustrated. And the reason that people love GreyNoise is because we don't come in and say, like, you know what, buy our product, security solved, right? We don't do that. We're just saying, look, this is a problem, and we're going to make this problem incrementally better for you. And as we get better at it, we're going to get more and more confidence and more and more data to be able to give you fewer and fewer things that really matter, to really whittle away more and more, or to more quickly, um, reduce that time to verdict of "this thing doesn't matter." Not the time to verdict of "this is a cataclysmic bad thing." That's not really what we focus on. We focus on getting you to the conclusion of "this isn't really a big deal" as quickly as humanly possible, because of the exact reason that you were just describing before. The people who work in SOCs are exhausted. And the security industry is not making it any better. We're just making it worse. And so we're trying to do the exact opposite of what every security company, or every product company in the SOC, has ever done. And that's just give more context, more explainability, and try to guide people towards, like, these are the reasons why this thing wouldn't matter. Move on to the other thing. Oh, that alert, we don't know anything about it. You should investigate that.

Speaker 3:

Right? Yeah. See, that's important. Like, the part of telling people what you don't know is just as important as saying, like, here's the things that we do know, right?

Speaker 2:

Negative, and having, uh, obviously you can't prove a negative, but, you know, like, having a negative ground truth, that's a data point in and of itself. When you look something up in GreyNoise and it's hammering your network and it's hammering your whole perimeter, and you look something up in GreyNoise and there's nothing there, that's important. That means it's not hitting the entire internet. It's only hitting you. Right? Like, you should be concerned about that thing.

Speaker 3:

Right? Yeah. It's like the security equivalent of, like, down for me or down for everyone, you know,

Speaker 2:

Is everybody else seeing this thing, or am I the only one? Right. And that's something that nobody was able to answer. This is so funny. When people talk about, like, what is GreyNoise's competition, I'm always like, GreyNoise's competition is a mailing list and a group chat that you're in with your 40 friends who work in SOCs, where you say, hey, are you guys seeing this too? Are you guys seeing this too? I'm seeing this. Right. And that is a bad user experience. That is not a good process, but it's the only one we've got right now. And that's one of the things that we're really trying to change with GreyNoise. And the cool part is you don't even have to become a customer to use this thing, right. Use our web interface. It's free. Right. Eventually when you use it, it's going to prompt you to log in and create an account, so we can at least know who you are, but it's free. You don't have to give us money for this. Right. Use the web interface, look stuff up. Anytime you're seeing an IP address doing something weird on your network, or you're investigating, look it up in the GreyNoise web interface. And we're releasing a community API that's free and unauthenticated, right. It's just got a little bit less data. You don't even have to create an account for that.

Speaker 3:

You can just look stuff up. Right? Nice

Speaker 2:

Look stuff up with that. And then if you find that after a really long period of time, look, you're going to get value out of using this thing without giving us a dime. And if you find that after using it for a long time, you get sick of copying and pasting stuff into that web interface. And you say, I really wish that I had this thing in my security products. Then, then you can talk to us and we'll have a conversation about selling you some stuff. But before that, just use this thing, it's free. It's great. I promise you, it's going to make your day a little bit better if you work in this .

Speaker 3:

Yeah, I think that's true. So how does the, um, how does the enterprise product part of it work? If it's not just the web interface, how is it integrated into, like, the security workflow?

Speaker 2:

Yeah. That's a great question. So the workflows vary from customer to customer, um, and there's some details that get lost, but at a high level, this is kind of how it looks and feels for everybody. We've got this web interface, it's this free thing. Anybody can copy and paste an IP address straight into it, or even dump a log file into it, and we'll enrich all that data. We just won't let you export it back out. And so you can do that with the web interface right now. The business model is, at some point, for you to really get value out of GreyNoise, like get actual, you know, big dollar figure value out of it, you have to build automation around it. And so in order to build automation, you need to use our APIs and you have to put us in your existing products, right? You need to put a GreyNoise integration in your SIEM. You need to put a GreyNoise integration into your threat intelligence platform. Maybe you need to put GreyNoise into your SOAR platform. You need to put GreyNoise into the products that your analysts are already using, or your data pipeline, right? Your enrichment pipeline. That's what you have to do. And so in that case, there's a few different kinds of workflows. One is just enrich everything that hits our organization, or that goes through our SIEM or whatever. Enrich everything, and just tell us what GreyNoise was saying about it at that time, and let us do analytics on that after the fact. We have a lot of customers that do that, but it's expensive for you. We'll charge you a lot of money if you want to do that. Not as much as a lot of the other security product companies, but still quite a bit. Right. And then there's other sort of, you know, less intrusive, less advanced workflows, kind of as part of any given security analyst's workflow: a ticket gets raised, an alert gets raised,
an investigation gets opened. Either in an automated way, look up the source IP address that generated that alert against GreyNoise, or even just, you know, use the command line, or use your runbook of scripts that you have, or use your SOAR, use whatever you have set up to do a lookup against that thing and report back the results. There's a good chance that it's, like, a known benign internet scanner. That's, like, not even a bad one. Like, Hey, look, Shodan just added this vulnerability check and it tripped your perimeter, you know, your IDS, into thinking that somebody was exploiting you, but it's not. It's just a security company that's just checking everybody on the internet. Right. Those are the high level workflows that we see most often.
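
[Editor's sketch] The alert-triage workflow described above can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not GreyNoise's API: the `NOISE_DB` dictionary, the `NoiseRecord` fields, and the classification labels are hypothetical stand-ins for whatever a real lookup would return, and the IP addresses come from documentation ranges.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NoiseRecord:
    classification: str  # assumed labels: "benign", "malicious", "unknown"
    name: str            # assumed field: a human-readable actor or scanner name


# Hypothetical stand-in for the results of an internet-noise lookup.
NOISE_DB = {
    "198.51.100.7": NoiseRecord("benign", "example internet scanner"),
    "203.0.113.9": NoiseRecord("malicious", "example opportunistic worm"),
}


def lookup(ip: str) -> Optional[NoiseRecord]:
    """Return the noise record for an IP, or None if it was never observed."""
    return NOISE_DB.get(ip)


def triage(alert: dict) -> str:
    """Suggest a next step for an alert based on its source IP."""
    record = lookup(alert["src_ip"])
    if record is None:
        # Not seen hitting the wider internet, so it may be aimed at you.
        return "investigate: not seen internet-wide, possibly targeted"
    if record.classification == "benign":
        return f"deprioritize: known benign scanner ({record.name})"
    if record.classification == "malicious":
        return f"handle as opportunistic: internet-wide activity ({record.name})"
    return "review: seen internet-wide, intent unknown"


if __name__ == "__main__":
    for ip in ("198.51.100.7", "203.0.113.9", "192.0.2.1"):
        print(ip, "->", triage({"src_ip": ip}))
```

Note that the "nothing there" case is treated as the most interesting one, which matches the point made earlier in the conversation that a negative result is itself a data point.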

Speaker 3:

Okay. And you mentioned earlier that you guys, you know, you can tell, like, Hey, there's a bunch of machines compromised with, say, the Exchange bugs or whatever. How are you making those checks? Like, how can you tell that, you know, say, this OWA server's owned?

Speaker 2:

Oh, that's a great question. So, um, basically the shortest answer that I've got for you is that that is the entire job of our research team: to look at raw data and, first, to write rules and make sense of that data, and then, second, to apply accurate metadata, like accurate expert analyst observations or conclusions from the data that we're seeing. And so what that means is our analysts, what they do is they'll look at our data and they'll say, this raw traffic that we saw in GreyNoise, like on our collector sensors, this raw data means this, right? This means that this is somebody that is specifically probing for this vulnerability or attempting to exploit this vulnerability, maybe the OWA vulnerability. Right. So they first say, like, let's make sense of the data. This is what this thing means. This is what this thing is. Right. And then the next step is all of the kind of, well, what does that mean? Right. Does that mean it's bad? Would only a bad actor or a worm do that? Or would a researcher do that, or would a legitimate security company do that, or whatever? What are all of the different categorizations and all the other kind of inherited metadata about that, that are going to ultimately end up flowing up to a conclusion of what we determine the maliciousness of that IP address to be? We don't use any, like, machine learning on this right now. It's just analysts' work. If you do this, that means this, and that means you're bad. That's it. It's actually that simple. But we operate at such a large scale that even super simple things like that, they mean that our false positive rate is extremely low, because we're not doing any kind of, like, wildly advanced, um, you know, data categorization techniques. It's just one foot in front of the other. 
And the scale that we see, the scale of data that we see is so massive that it yields, you know, thousands and thousands and thousands of, of compromises

Speaker 3:

Every day. Okay. So you can tell, like, this is what a compromised OWA server looks like, this is how it acts. From our perspective, right. From your perspective, right? Yes.

Speaker 2:

So we're not going to opine on, like, Hey, GreyNoise, can you tell me if that server over there is compromised? We can say, well, we've never seen anything from it, first of all. And second of all, if we had seen something from it, let's see, uh, it did this, this, and this, and that means that, you know, these are the conclusions that we're willing to come to based on the data that we've seen in our collector networks, right? This is what we see. This is what we know. And we try to stick to the facts and we try to make things as explainable as possible, because we don't want to have this, like, black box voodoo kind of, like, what does it mean when GreyNoise says this thing, right? We want to be right every time.

Speaker 3:

Yeah. It , especially when you're dealing with something like that, if you're telling people, we know that your email servers are owned, or we know that this network is, you know, is compromised. That's not an accusation. You know, it's not something you want to say lightly. No,

Speaker 2:

Exactly. And trust is a reservoir that's built over time. And so we really, really want to have people consistently look things up in GreyNoise, for GreyNoise to consistently deliver value to them and say, Hey, this is what we know about it, and for GreyNoise to be consistently correct, so that people can trust us more and they can increasingly kind of offload certain parts of their workflow more and more onto automation that uses GreyNoise data. And our research team is, like, they're fanatical about getting it right. If it means that we can tag another 10% of things that we're seeing, but we open up a margin of error where sometimes we're going to be wrong, we won't do it. And we do that every single time. We've always done that from the very, very beginning. And so if we say, like, well, if we open the aperture on this tag a little bit, we'll maybe capture some more stuff, because this tag's a little bit brittle and it's a little whatever, but it means that we might false positive if somebody does this thing: hard no, we will not do it.

Speaker 3:

What a weird attitude

Speaker 2:

When your only job is to provide ground truth of some kind, you need to be right every time. Right. And we're not always right. That's the thing, like, we do make mistakes. We'll mistag things sometimes. We'll think something is some actor and it's actually another. We'll get details wrong. And we always correct them. And if necessary, we'll even let our customers know we've just corrected this thing. Right. We'll correct the record. We'll go out and do it. It happens very rarely. Um, but because our tagging is done by an automated system, it's not done by a human. Um, the logic is written by humans, but the actual application of all of the analytics is done by machines. Um, it's really important that we have incredibly high standards on that.
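
[Editor's sketch] The "logic is written by humans, applied by machines" approach described above can be sketched as a small rule engine. Everything here is hypothetical and for illustration only: the tag names, the request fields, and the match conditions are invented, and they are not GreyNoise's actual tags or signatures. The point is the shape: each tag is a narrow, explicit predicate, and anything that matches no tag stays "unknown" rather than being guessed at.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Tag:
    name: str
    intent: str                      # assumed labels: "malicious" or "benign"
    matches: Callable[[dict], bool]  # analyst-written predicate over one request


# Hypothetical analyst-written rules. Each one is deliberately narrow so that
# it would rather miss traffic than mislabel it.
TAGS = [
    Tag(
        name="example-owa-exploit-attempt",
        intent="malicious",
        matches=lambda r: r.get("method") == "POST"
        and r.get("path", "").startswith("/owa/example-vulnerable-endpoint"),
    ),
    Tag(
        name="example-research-scanner",
        intent="benign",
        matches=lambda r: r.get("user_agent") == "ExampleResearchScanner/1.0",
    ),
]


def apply_tags(requests: Iterable[dict]) -> List[str]:
    """Return the sorted names of every tag matched by any observed request."""
    return sorted({t.name for r in requests for t in TAGS if t.matches(r)})


def classify(requests: Iterable[dict]) -> str:
    """Roll matched tags up into one conclusion about the source IP."""
    intents = {t.intent for r in requests for t in TAGS if t.matches(r)}
    if "malicious" in intents:
        return "malicious"
    if "benign" in intents:
        return "benign"
    return "unknown"  # no rule fired, so no conclusion is drawn


if __name__ == "__main__":
    seen = [{"method": "POST", "path": "/owa/example-vulnerable-endpoint"}]
    print(apply_tags(seen), classify(seen))
```

Widening a predicate (for example, matching any path under `/owa/`) would tag more traffic but could also mislabel legitimate requests, which is the trade-off the speaker says the research team refuses to make.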

Speaker 3:

Yeah. I'm with you on that. I mean, it does seem like there's, I can see what you mean. Like you , you could be tempted to be like, well, look, let's just open this a little bit and we'll get more stuff and we'll, you know, it'll be great.

Speaker 2:

Well, as a journalist, you know that, like, it's always tempting to sensationalize. You always are like, man, if I change this one word, or if I said this one thing, I know that a lot of people would get way excited about it, but you're like, I just can't, you know, my journalistic integrity is on the line if I do that. And for me, it's the same thing. I'm like, GreyNoise's reputation is on the line every time we think about writing an analytic or a conclusion and publishing that out to people. And that's why we're very, very pedantic about the way that we deliver insights. Also, fun fact, it's one of the reasons why the GreyNoise Twitter account doesn't tweet with regularity, because if we start making people feel like, well, we've got to put something out this week, we've got to put something out this month, then that's going to make us start to compromise on what we think is interesting, what we think is useful for the user to know, and it's going to start making us feel like, Oh God, we've got to get something together, where's something, where's something, right? But that's not how we do it. That's never going to be how we do it. Right. We're only going to stick to what we know to be true, and we're going to tell our users about that. And if nothing happens for three months that we don't think is of the level that we should tell our users and our customers about it, we won't tweet a thing. Right. And that's just how it's always going to be. I'm sure we're going to have wrap-ups, you know, weekly kind of, this is what we saw, blah, blah, blah. But when we're really saying, like, this is a thing that we're seeing and you guys need to care about it, right, we really take that seriously.

Speaker 3:

So if I see you hire like a social media team, I should start to get worried .

Speaker 2:

I would not. I absolutely wouldn't. I think that we've got so much great stuff to work off of that we're going to be able to kind of, like, market a lot of this stuff that we've done already. But yeah, I would be concerned if all of a sudden, you know, GreyNoise is having hair-on-fire occurrences every day. Like, come on, there aren't. Well, lately there have been hair-on-fire occurrences almost every day, right, or something close to it. But yeah, no, uh, if things are always on fire, then things are never on fire.

Speaker 3:

That's absolutely true. Um, are there any, like non-security applications for the way that, you know, for the data that you guys see and collect and all that

Speaker 2:

I'm asked that question a lot. The answer is, um, mostly no, but if there are, I just haven't seen them yet. There are one or two and I'll talk about them, but, um, I'm also just a security guy, so I have a hard time envisioning use cases for our data that I haven't thought of. It's just so esoteric, and the data is so niche and so specific that it mostly makes sense that most of the applicability is going to be from a security perspective. The two things that immediately jump out to me, though: one, there is a massive operational spend that goes into storing logs that are generated by internet-wide opportunistic scan and attack traffic, sometimes just for compliance reasons. So, I mean, it's going to sound insane, but I'm going to tell you right now, there's a lot of organizations out there that capture firewall logs of everything that's happening on their firewalls and have to store it for compliance. And that is a massive amount of data with very little intelligence or analytical value to the analyst. They just have to store it. And a lot of that stuff is, I mean, I cannot stress this enough, meaningless. Right. And we know the difference between what is and isn't meaningless. So we can tell you, like, Hey, route this. We're working with a partner on this right now who handles all the routing. We just do the data, they do the routing. Um, we will basically say, like, Hey, put anything that matches this criteria into, like, Glacier right now. Everything else should go into Splunk, but put this into Glacier, right? Because you're going to spend a lot of money storing a lot of useless data. So that's kind of more of, like, an ops thing, a network ops thing, a sysadmin thing. The only other thing is, like, when the internet goes off somewhere, all of the internet background noise stops there. 
And so there is some applicability of using GreyNoise in the capacity of trying to identify problem areas of the internet, or even just, like, pipeline issues in the internet. I remember when Iran shut off their internet, uh, not long ago. And, uh, either they shut it off or it was shut off, I actually don't recall, but it was about a year and some change ago. And all of a sudden, very suddenly, we went from a regular cadence of seeing, you know, five, ten thousand distinct IPs a day coming out of IP space that's geographically located in Iran, uh, all of a sudden to zero. And we're like, Oh, the internet's off. And so there are some other use cases, but, you know, I'm reaching pretty far when I try to think of them.
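
[Editor's sketch] The log-routing idea from this answer (send known background noise to cheap cold storage, keep everything else in the SIEM) reduces to a set-membership check per event. The sketch below is an assumption-laden illustration: the `noise_ips` set, the event shape, and the destination names `"cold_storage"` and `"siem"` are all hypothetical, and the actual routing is said in the conversation to be handled by a partner product that is not described.

```python
from typing import Dict, Iterable, List, Set


def route(event: dict, noise_ips: Set[str]) -> str:
    """Pick a destination for one firewall event based on its source IP."""
    # Known internet-wide noise is kept for compliance, but cheaply.
    return "cold_storage" if event["src_ip"] in noise_ips else "siem"


def split_events(events: Iterable[dict], noise_ips: Set[str]) -> Dict[str, List[dict]]:
    """Group events by destination, preserving their original order."""
    routed: Dict[str, List[dict]] = {"cold_storage": [], "siem": []}
    for event in events:
        routed[route(event, noise_ips)].append(event)
    return routed


if __name__ == "__main__":
    noise = {"198.51.100.7", "203.0.113.9"}  # hypothetical noise list
    events = [
        {"src_ip": "198.51.100.7", "dst_port": 22},
        {"src_ip": "192.0.2.1", "dst_port": 443},
    ]
    routed = split_events(events, noise)
    print(len(routed["cold_storage"]), "to cold storage,", len(routed["siem"]), "to SIEM")
```

Nothing is dropped in this sketch; every event lands in exactly one destination, which keeps the compliance requirement of retaining the data intact.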

Speaker 3:

Okay. Yeah. I just wondered, cause it's such a unique kind of approach and way of looking at the problem. Um, you know, it's just removing all of the crap that you don't need to think about, you know? Um ,

Speaker 2:

Exactly right. It's not sexy. It's not like we're usually finding APTs and, you know, it's not like a cool, sexy thing. It's like insurance, it's like eating your veggies, right? You just have to do it. Like, if you don't do it, it's gonna suck. And so you have to do it. It's just work. And we're just trying to make it easier and better for you. Um, but yeah, uh, it is a very different approach. Most security companies try to save you money by scaring the crap out of you about the big costly breach, and then claiming to prevent the big costly breach. And that's not what we do. We save you money by functionally helping you be more efficient in your investigations and come to the "this doesn't matter" conclusion as quickly as possible. And that's where you'll see it help you make your people more efficient, make it easier for them to get through the queue of tickets, which will then free them up to find the big, bad breach, which is like, cool, now you can go and do that. But we help with the basics.

Speaker 3:

So what's next? What's on the drawing board for, uh, stuff you'd like to tackle with GreyNoise?

Speaker 2:

So many things, um. The big things that are coming up are: we're releasing an unauthenticated community API that anybody can use. It's our enterprise API, but just with a little bit less data. You know, you can see, like, five different fields, and you can look things up as much as you want. Integrate it into all your products, use it. It's fantastic. So the community API, we're going to be releasing, I just saw the pull request for it this morning. So, um, we're going to release that, I dunno, probably sometime, I guess, early next week. Um, after that we've got GreyNoise RIOT, which I'm super excited about. It's in beta right now, but functionally, what we're doing is we're taking the same concept of telling the analysts what not to worry about, um, by enumerating as many safe areas of the internet as we possibly can. So forget about honeypots, forget about, like, collecting sensor data, et cetera. Like, we are literally finding where all the IP spaces are of every benign service ever: every single Windows Update server, every CDN, every, you know, SaaS product API, every social media application, every legitimate mail server, like, all of the things that we can find that in and of themselves are not exploiting you. Right? Like, yeah, maybe they can be used in a complex attack, and we know exactly where it breaks down, but just identifying those things and giving the user the ability to say, what would my network look like if I just, like, didn't look at that? What would my SIEM look like if I didn't see all of these optimized CDN logs, and I didn't see all this Cloudflare, and I didn't see all of my Office 365 logs, and all this stuff like that? Like, what would it look like? Right. 
So just giving the analysts the ability to do that. Um, I hope nobody gets mad at me internally for saying this, but we're open sourcing, uh, our collector, like our honeypot. We're open sourcing it so that people can use it and program it themselves. Um, we're doing that, uh, this year. I don't know when, but, uh, we're doing that so people can literally use our collector to build their own GreyNoises if they want to, but it would probably just make more sense to use ours. Um, but they're going to be able to use our, um, our honeypot and program it, and it's super interesting. Um, so I'm excited about that. Um, we're about to go live with a new website. We're going live with new public pricing. Again, transparency is, like, insanely important to me. So making sure that people understand that we're not nickel-and-diming anybody, this is the price, this is what everybody gets. Um, and so we're going to be doing public pricing again. Um, let's see, we're going to be doing more interesting, like, geographical trending later in the year, uh, so that we're going to tell you when interesting things are only happening in certain places. Um, we're going to be revamping our, uh, analysis page. We're going to be revamping our alerts feature. Um, let's see. And then we're reaching for, it's really, really ambitious, but we've got a number of other things that we want to do. There's no way we're going to get all this stuff done this year, but that's what we're shooting for right now. Um, I would say those are the big, it's a lot of stuff. 
Um, Oh, the only other thing to add, sorry, one last thing is, like, we've, uh, implemented this really cool bitmap technology that allows you to do enrichments against GreyNoise on, like, a tiny data file, as opposed to having to hit all of our APIs, like murdering our APIs. So all of GreyNoise's enterprise customers are going to get, like, an absolutely insane performance improvement here very soon. Um, those are the major things. Um, I want to solve honeypotting for people. Like, I want to solve internet background noise for people. I want to solve honeypotting for people. And I want to solve the, like, does this thing really matter, or is this thing not a big deal, problem for people. Those are the problems that we really want to solve for people. And, um, I think that there's a lot of work that goes into doing that the right way, and figuring out the right order of operations, while balancing investor expectations versus revenue expectations versus, like, the right clip of progress, quality of product, and where we focus our time and energy, is challenging. Um, but it is really exciting and I'm really excited for this year. And I'm really excited just to get some of these new features out to everybody and start getting everybody's thoughts and feedback on it.
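
[Editor's sketch] The conversation does not say how the bitmap file mentioned above is structured, so the following is only a generic illustration of why a bitmap makes local IP lookups cheap. It stores one bit per IPv4 address, allocating an 8 KiB bit array only for each /16 block that actually contains a member. The class name `IPv4Bitmap` and the chunking scheme are assumptions for the example, not a description of GreyNoise's format.

```python
import ipaddress
from typing import Dict


class IPv4Bitmap:
    """Sparse set of IPv4 addresses: one 65,536-bit array per populated /16."""

    CHUNK_BYTES = 65536 // 8  # 8192 bytes cover every address in one /16

    def __init__(self) -> None:
        self._chunks: Dict[int, bytearray] = {}

    @staticmethod
    def _split(ip: str) -> tuple:
        """Split an address into its /16 prefix and its offset within it."""
        n = int(ipaddress.IPv4Address(ip))
        return n >> 16, n & 0xFFFF

    def add(self, ip: str) -> None:
        high, low = self._split(ip)
        chunk = self._chunks.setdefault(high, bytearray(self.CHUNK_BYTES))
        chunk[low >> 3] |= 1 << (low & 7)  # set the bit for this address

    def __contains__(self, ip: str) -> bool:
        high, low = self._split(ip)
        chunk = self._chunks.get(high)
        if chunk is None:
            return False  # nothing in this /16 was ever added
        return bool(chunk[low >> 3] & (1 << (low & 7)))


if __name__ == "__main__":
    noise = IPv4Bitmap()
    noise.add("203.0.113.9")
    print("203.0.113.9" in noise, "203.0.113.10" in noise)
```

A membership test here is a dictionary lookup plus one bit test, with no network round trip, which is the kind of saving the speaker contrasts with hitting the APIs for every enrichment.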

Speaker 3:

Yeah. I mean, that's a pretty ambitious list of stuff to get done, but it is, I mean, considering this came from , uh , you know, a weekend hobby project to where it is now, it's , it's pretty rad.

Speaker 2:

It's changed my life completely. Um, for better or worse, I'm kind of the GreyNoise guy now. Um, yeah. And, uh, I mean, I couldn't be happier about that. This is, like, just a crazy hard problem and I'm learning so much and I'm really, really enjoying doing it. And the coolest thing about my job now is, like, so we're solving the problem, which is awesome. Um, we're making a name for ourselves, which is awesome. We're punching way above our weight against all of these, like, massive security companies. And we're providing something that none of them can or know how to provide. We have customers that, like, it's a mind-blowing list of users and customers that use and trust us and our products and our data. Um, and the coolest thing for me is, like, I've hired so many people that are so much smarter than me, which is just insane. Um,

Speaker 3:

That's the way to do it though. That's the way

Speaker 2:

It's crazy. I'm so lucky. I'm so fortunate for this. Like, every day I've got all these people who are just, like, so smart, and I'm just like, God, I'm lucky to be here. So, um, yeah, I'm super excited about it and, uh, it is really ambitious, but, um, it's also just awesome getting invited on to stuff like this to talk about it. It feels really good. Yeah.

Speaker 3:

Well, I'm happy for you, man. It's, uh, such a cool idea, and it's a cool story of how it came to be, uh, you know, from one little idea all the way up to this. I love those kinds of, like, you know, um, David versus Goliath stories, and just, you know, creative people doing things, like, well, why isn't somebody else doing this? Well, I guess I'll do it, you know? Yeah,

Speaker 2:

Yeah. Nobody's going to do it unless you do it right.

Speaker 3:

Yeah. It's a big problem. It needs to, needs to be solved. Let me see if I can solve it. Um , that's the kind of stuff that I love, so, yeah. It's cool.

Speaker 2:

Hardest problem I've ever tried to solve in my entire life, so much so that I'm still, like, God, however many years into trying to solve it. But yeah, it's super interesting.

Speaker 3:

Yeah. Well, thanks so much for coming on Andrew. This was a lot of fun and hopefully we'll get to do it again.

Speaker 2:

Yeah . This has been fantastic. I've had such a great time talking to you, please. Anytime you are willing to do this again, I will be back in a second. So thanks for that .

Speaker 3:

Absolutely man, take care. All right. You too.

Speaker 1:

[inaudible] .