HSDF THE PODCAST

AI and Data Analytics for Mission Support on the Border - Part 2

February 08, 2024 Homeland Security & Defense Forum Season 3 Episode 7

In this final episode of a two-part series, learn how CBP is deploying artificial intelligence to support the border security mission and how industry is collaborating to create new solutions and respond to new technology requirements.

Embark on a journey into the high-tech frontier of border security as we discuss the transformative power of AI.  Venture further into the digital realm where cybersecurity stakes are at an all-time high, as we explore the necessity of zero trust architecture with our distinguished panel. 

Dive into the discussion about safeguarding our networks with robust encryption, even in the face of adversarial machine learning. This episode brings to light how government agencies and industry leaders are teaming up to reinforce our digital defenses and ensure that our security keeps pace with evolving threats.

Featuring:

Ed Mays, at CBP's Office of Information and Technology,

David Aguilar, former CBP Commissioner 

Joshua Gough, from CBP’s Planning, Analysis and Requirements office, and 

Jay Meil, Chief Data Scientist, SAIC Artificial Intelligence Innovation Factory


This discussion took place at the Annual HSDF Border Security Symposium in Washington DC on December 12, 2023.

Follow HSDF THE PODCAST and never miss the latest insider talk on government technology, innovation, and security. Visit the HSDF YouTube channel to view hours of insightful policy discussion. For more information about the Homeland Security & Defense Forum (HSDF), visit hsdf.org.

Announcer:

Welcome to HSDF THE PODCAST, a collection of policy discussions on government technology and homeland security brought to you by the Homeland Security & Defense Forum. In this final episode of a two-part series, learn how CBP is deploying artificial intelligence to support the border security mission and how industry is collaborating to create new solutions and respond to new technology requirements. Featuring Ed Mays at CBP's Office of Information and Technology; David Aguilar, former CBP Commissioner; Joshua Gough from CBP's Planning, Analysis and Requirements office; and Jay Meil, Chief Data Scientist at SAIC's Artificial Intelligence Innovation Factory. This discussion took place at the Annual HSDF Border Security Symposium in Washington DC on December 12, 2023.

David Aguilar:

So, as I'm listening to this, it all makes sense: human-machine teaming, speed to decision, tip and cue. And you mentioned the OODA loop: observe, orient, decide and act. That's what we all do, absolutely. But whether it's an OFO officer, a Border Patrol agent, an Air and Marine officer or any law enforcement officer, nothing is ever the same. Things are going to change. So, beginning with the OODA loop, the concern is to bring the greatest, most effective and speediest situational awareness to that human actor so that they can respond appropriately and, once responding, allow RPA, if needed, to engage also and then to hand off. So that kind of decision and hand-off process, can that actually be accomplished?

Ed Mays:

So I think the answer is yes and, by the way, I don't think the OODA loop is just for this, right, or these processes. I think every human being does it every day of their lives. We observe a situation, we pull out the context that's important to us, we do a risk assessment, we decide on what we're going to do and how we're going to address it, and then we take some action and have a feedback loop with that, right? I think that's the most human of things. So, as we talk about this, I think we're really talking about how we make things better for those officers and agents, i.e., the humans who are out accomplishing a mission.

Ed Mays:

So, absolutely. I mentioned the Douglassy Analytics Project. It was just a small thing that we did with the CTO shop, but it grew and ended up pretty important to USBP and the PMOD office, right, because they were actually doing a small implementation. What came out of that is we found that we could collect data, have it trigger events, have the important data go up and be looked at by humans, and then have that information sent out, if you will, to those officers and agents who actually need it: not flooding them with a whole lot of data, but giving them the right data in the right amounts, so that they can actually make decisions. That became pretty important. Did we complete everything that we wanted to do in that POC? No, but we had a lot of lessons learned, and those lessons learned imply to me that this is definitely doable, and doable in the near term.

Jay Meil:

I'll just add, and I'm sort of foot-stomping something that you said: being able to make decisions doesn't mean having a lot of data.

Jay Meil:

It means having the right data or the relevant data.

Jay Meil:

So as we continue to add more sensors and more meshes and networks, and we've got platforms in the air and all sorts of things, all we're going to do is throw a deluge of data at the operators, and of course that can be overwhelming.

Jay Meil:

So to me, if we understand that we have to get the right data to them at the right time, it becomes a concept that, and someone will probably correct me, I call a data kill chain, right? We have to ingest the data off the sensor, we have to index it, we have to transform it, we have to condition it, we have to analyze and make sense of it, and then we have to disseminate that out to an operator. We have to do that very, very quickly, and we might have to do it hundreds or thousands of times in very short periods of time. That's where the machine learning piece can really play, that's where AI plays the role, and that ability to sift through all the noise and deliver only the right data is so important.
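
The "data kill chain" described above can be pictured as a chain of small stages. What follows is only a minimal Python sketch under stated assumptions: the row format, the `Reading` fields, the 0.8 confidence threshold and all sample values are hypothetical and are not drawn from any CBP system, and the indexing step is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor_id: str
    kind: str          # hypothetical detection label
    confidence: float  # expected range 0.0 to 1.0
    lat: float
    lon: float


def ingest(raw_rows):
    """Ingest and transform: parse raw comma-separated rows, skipping malformed ones."""
    for row in raw_rows:
        parts = [p.strip() for p in row.split(",")]
        if len(parts) != 5:
            continue
        try:
            yield Reading(parts[0], parts[1].lower(), float(parts[2]),
                          float(parts[3]), float(parts[4]))
        except ValueError:
            continue


def condition(readings):
    """Condition: drop readings whose confidence is outside the valid range."""
    return (r for r in readings if 0.0 <= r.confidence <= 1.0)


def analyze(readings, threshold=0.8):
    """Analyze: keep only detections confident enough to deserve attention."""
    return [r for r in readings if r.confidence >= threshold]


def disseminate(alerts):
    """Disseminate: reduce each alert to one short line an operator can act on."""
    return [f"{a.kind} at ({a.lat:.4f}, {a.lon:.4f}) from {a.sensor_id}" for a in alerts]


def run_pipeline(raw_rows, threshold=0.8):
    return disseminate(analyze(condition(ingest(raw_rows)), threshold))


# Made-up sample rows: sensor id, label, confidence, latitude, longitude.
RAW = [
    "tower-1, vehicle, 0.93, 10.0, 20.0",
    "tower-1, animal, 0.41, 10.1, 20.1",
    "garbled row",
    "drone-7, person, 0.88, 10.5, 20.5",
]
ALERTS = run_pipeline(RAW)
```

Of the four raw rows, only the two high-confidence detections reach the operator, which is the "right data, not a lot of data" point in miniature.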

Joshua Gough:

Yeah, when we think about right data, Chief, I think what we really need, when this boils down to brass tacks, is answers at the speed of the mission. That's what we need in the field. Even a dot on a map telling me the location of something answers the question: where is he? Where is it? I'll just tell you a quick story. I went on one of the major public search engines and typed in "How tall is Mount Everest?" and it replied 29,032 feet. It didn't give me a 75,000-line Excel spreadsheet and tell me, "Hey, good luck figuring this out." I went on to another, less popular website and asked the same question, and it said 29,031.7 feet, which is close enough, right?

Joshua Gough:

I went into a CBP system one day and typed in "How many people crossed the border yesterday?", which you would assume our databases have the answer to. I got back the correspondence manual that taught me how to write memorandums. That might be a search problem. This is an exaggeration, really more of a vignette, because that's probably a different problem.

Joshua Gough:

But I want to come back to what Chief was talking about: this idea of the machines being able to process that data kill chain that he talked about, so that, in real time, an operator has an answer to his question. Where is it, and what is the fastest route for me to get to it? What is it identified as? What's the relative threat level of that thing, whatever that thing happens to be? That's the bridge. I think Elon Musk has said we're all cyborgs, it's just a bandwidth problem, because we carry our phones and have the information of the universe in our hands, but we can't process it fast enough. Being able to put that into the human-machine team is really where we're going to make the money.

David Aguilar:

Thanks. Amazing stuff. So what I heard was: it's about the data, it's about the data flow, it's about the assessment and analytics, still requiring, at this point in time, a human interaction to check that assessment and analysis, selecting the information and data to communicate to the actors, the operators, whoever that's going to be, and then carrying forth. Go, no-go, basically. I know I probably missed 10 other steps in between there, but recognizing the way we're talking about doing this now (you held up your endpoint here a second ago), at the endpoints there has to be a front-haul capability, there has to be a backhaul capability, there has to be a ZTA-and-beyond capability. And if you're doing this in the areas where CBP operates, degraded, contested or denied environments, that means it's going to backhaul through satellite, fiber, Wi-Fi, some kind of backhaul. How many security features do we need to overcome to get that data to that end player who's going to go or no-go?

Ed Mays:

So I think there's some good news here. I think that the work that we do with our systems already puts a lot of security in place. And it's interesting, and I keep harkening back to this Douglas Analytics project: we started a project that literally was about how do I backhaul some data using low Earth orbit satellites. Then we realized, oh, this is a much bigger problem than the one we were trying to solve, and we couldn't touch all of it. So we touched a little around the periphery, around the edges of it, but what we learned was that a lot of the things that we were already putting in place for zero trust answer a lot of those cybersecurity challenges that are out there. So I don't think there's a whole lot of cyber challenge from that perspective. But I will tell you that, not today but in the future, there will be additional challenges, right? One of the projects that Sunil Madhugiri and his team and I have been working on is something called post-quantum cryptography. As Q-Day comes, where quantum computers really come into their own, people in the audience are probably thinking, oh, that's like Skynet, that's far in the future. I think that future is a lot closer than we might think. That's a bigger challenge, and we're already starting to address it. But in the near term, I think of the things that we have in place on our network and in our systems, the systems getting ATO'd: we take a lot of those things into consideration before you get on the network.

Ed Mays:

The other thing that we're doing right now to address that is looking at what's called operational technology. Not operational technology as normally described in industry documents, where you're looking at control systems for, let's say, power plants or water plants, but actually looking at our devices, via IoT or otherwise, whether they're wearables or drones or, you name it, any device submitting data: how do we make sure that is secure? So we're putting a lot of thought into this. We just hired an incredibly talented young lady, her name is Shilu Patel, and she's going to be leading that organization for us. We've been interacting with USBP, PMOD.

Ed Mays:

We had a meeting with AMO yesterday and the day before, and we're starting to see how we do these things. For instance, how do we ensure that, whatever that device on the periphery is, that operational technology, its cyber hygiene is up to date? Who's doing that now? In some cases I don't know, right? In some cases I do. But if we get those challenges under our belt and get those fixed and controlled, I think we're well along the way down the road to handling any cyber challenge that may come up.

David Aguilar:

So, Nick, one quick question on that, because I've heard that zero trust architecture is basically a baseline right now. It's a threshold that has to be met. But in addition to that, we know that threats are constantly evolving. One of the things that a friend of mine has talked about is the use of AI and machine learning to detect anomalous activity that might point to threats that we don't know about. Does that come into play?

Ed Mays:

It does, absolutely, and thank you. We've actually been doing those kinds of things for a while now. When you look at insider threat, what are those tools doing? They're looking for anomalous behavior. Did Ed Mays log on to his system at 0200 in the morning when he doesn't usually do that? Or, from the academic perspective, did I find something hiding in a file that shouldn't be there, in a picture or whatever it happens to be? We see that a lot.

Ed Mays:

So I think these things, from a zero trust perspective, we have gone a long way in addressing. There will be new and future challenges, right, but I think AI will play a big role in observing and looking at some of those. Look, I'm a person who, when I look at what we've done from a security perspective, thinks we've done quite a lot. But I've read some papers where people were looking at our networks from a game theory perspective, and I was like, oh my gosh, I never thought of that, and that's a weakness and I need to address it. I tell people that I believe in our country, I believe in our industry, and I think we've got the right people focusing, and they will keep us focused on keeping those networks and those systems safe.
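
The insider-threat example above, a user logging on at 0200 when he normally does not, can be illustrated with a very simple statistical check. This is a minimal sketch, not a description of any tool CBP uses: the function name, the z-score threshold of 3.0 and the login history are all hypothetical, and a real system would also need to handle hours wrapping around midnight.

```python
from statistics import mean, pstdev


def is_anomalous_login(history_hours, new_hour, z_threshold=3.0):
    """Flag a login hour that sits far outside a user's usual pattern.

    history_hours: past login hours (0-23) for one user (hypothetical data).
    Returns True when the new hour is more than z_threshold standard
    deviations away from the historical mean.
    """
    mu = mean(history_hours)
    sigma = pstdev(history_hours)
    if sigma == 0:
        # Every past login was at the same hour; anything else is unusual.
        return new_hour != mu
    return abs(new_hour - mu) / sigma > z_threshold


# Made-up history: this user normally logs on between 8 and 10 in the morning.
HISTORY = [8, 9, 8, 9, 10, 8, 9, 9]
FLAG_0200 = is_anomalous_login(HISTORY, 2)
FLAG_0900 = is_anomalous_login(HISTORY, 9)
```

Here the 0200 login is flagged and the 0900 login is not. Production tools look at many more signals than the hour of day, but the underlying idea of scoring behavior against a learned baseline is the same.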

Jay Meil:

I just want to add to the zero trust discussion. As we proliferate sensors out into the environment and do more tactical data backhauling and things like that, we really have to understand that zero trust landscape. So first, what does zero trust mean? I'm not a cyber guy, so I had to really dumb it down for myself: it's moving from "trust but verify" to "never trust, always verify." What that means is every sensor, every network, every packet of data, every person that's touching the system, every single time, we need to make sure that's really who they say they are, or really what the data is supposed to be. And as Chief said, AI/ML can definitely do a lot of that. But I think it's also about the encryption standards that we're using.

Jay Meil:

We have to really think about sort of a trifecta of encryption, right? We have to think about encrypting data at rest, which is table stakes; encrypting data in motion, which is table stakes; but also encrypting data in use. What does that mean? When we send data back and forth or run some type of logic or calculation, data normally has to be decrypted before that calculation can run. So we need to look at not just post-quantum encryption but other types of encryption to actually protect data in use as well, so that if there is an adversary on the network, that data remains encrypted even while logic or calculations are taking place.
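
The speaker does not name a specific technique for protecting data in use. One family that fits the description is homomorphic encryption, and the sketch below uses a toy version of the Paillier scheme as a swapped-in illustration: two encrypted numbers are added without ever being decrypted. The primes are deliberately tiny and the helper names are hypothetical, so this shows the idea only and is not secure.

```python
import math
import secrets


def keygen(p, q):
    """Build a toy Paillier key pair from two small primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid because the generator is n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c using the private key."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Add two plaintexts by multiplying their ciphertexts; no decryption needed."""
    return (c1 * c2) % (n * n)


# Toy parameters for illustration only.
N, PRIVATE_KEY = keygen(61, 53)
C_SUM = add_encrypted(N, encrypt(N, 20), encrypt(N, 22))
TOTAL = decrypt(N, PRIVATE_KEY, C_SUM)  # 42
```

The party doing the addition only ever sees ciphertexts, so an adversary observing that computation learns nothing about the 20 or the 22.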

Jay Meil:

The other thing I think we need to look at is the sensor level. We have to start thinking: could someone have tampered with a sensor that is already attached to our network and that we expect data to be coming back from? Adversarial ML is becoming a real thing. So we need to have fidelity and confidence that the data coming back off of the sensor is the data we expect, that the sensor hasn't been tampered with, and that no one is trying to push model drift or anything like that to tamper with the results we're getting. So there's so much beyond just zero trust. When you look at denied and degraded environments and at how we operate with sensors and networks, we really have to think about how we encrypt and how we validate the data that's moving through these networks.
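
One basic building block for "never trust, always verify" at the sensor level is to authenticate every message rather than trusting the connection it arrived on. The sketch below assumes each sensor shares a secret key with the receiver and attaches an HMAC-SHA256 tag to each reading; the key store, sensor name and reading fields are hypothetical. It detects readings altered in transit or sent by an unknown device, but it would not catch a sensor whose key has itself been stolen.

```python
import hashlib
import hmac
import json

# Hypothetical per-sensor key store; real keys would live in a managed vault.
SENSOR_KEYS = {"sensor-a": b"demo-key-not-for-real-use"}


def sign_reading(key, reading):
    """Compute an HMAC-SHA256 tag over a canonical JSON form of the reading."""
    payload = json.dumps(reading, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def accept(sensor_id, reading, tag):
    """Verify every message, every time: unknown sensors and bad tags are rejected."""
    key = SENSOR_KEYS.get(sensor_id)
    if key is None:
        return False
    expected = sign_reading(key, reading)
    return hmac.compare_digest(expected, tag)


# A genuine reading, then the same reading altered after it was signed.
READING = {"sensor_id": "sensor-a", "value": 3}
TAG = sign_reading(SENSOR_KEYS["sensor-a"], READING)
TAMPERED = {"sensor_id": "sensor-a", "value": 300}

GENUINE_OK = accept("sensor-a", READING, TAG)
TAMPERED_OK = accept("sensor-a", TAMPERED, TAG)
UNKNOWN_OK = accept("sensor-z", READING, TAG)
```

Only the untouched reading from a known sensor is accepted; the altered reading and the unknown sensor are both refused.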

Ed Mays:

Sir, I think the really good thing is that we are doing a lot of work in our environment, from the innovation with our Invent team to working with our Chief Technology Officer and his office. I mean, for me these challenges are large, but I think we're up for them. The whole thing on data at rest, or data wherever it happens to be, is all important. But these are challenges that didn't just come today. They've been there for years, and making sure that we put the right practices in place, getting those best practices, is really critical. And I will tell you, the challenge is large. I mean, there's 100 million attempts to date against the CBP network. That's significant.

Ed Mays:

So, you know, the CTO and I, our DAC for software development, and our CISO work hand in hand and collaborate to make sure that we are doing the right things. If you look at the paper that we're going to put out, probably in a couple of weeks, on post-quantum computing, you'll see that our CISO's hands are in there, our CTO's hands are in there, and it's because we realize that there's an adversary out there that can harvest now and decrypt later, on Q-Day, right? It's a big challenge for us. It's also a big challenge for DoD, but we're trying to put the right things in place to be ready to take on that challenge.

David Aguilar:

So, you had mentioned an avatar program, I believe. Yes, the avatar program. Can you give us a little bit of a sense of that?

Ed Mays:

So my organization, along with the CTO, we're definitely working on an AI avatar that we're going to get a demo on in the very near future. I think it's game-changing technology that brings a whole lot of capability immediately to the table. A part of that could be, you know, at a port of entry, engaging with people, whatever their language might be, and helping those people solve their problems. It could also be sitting inside a large compendium of data from CBP, whether it's trade or something else.

Ed Mays:

And I think you mentioned earlier, Josh, that you looked something up and got a really oddball answer, right? But this avatar, this model, would look across our data, versus the external world's data, and pull in answers that are actually valuable and reliable for us. Say you want to ship, I don't know, five tons of tuna from point A to point B: how do you interact and get that information, get the forms? There's a possibility here for us to become much more efficient and much more effective, and also, you know, whether it's targeting or tracking bad guys, that sort of thing. These sorts of tools are very, very near to fruition. I think they're coming, and I'm really excited about being here at CBP and being a part of this change and, mostly, being a part of the defense of our nation.

David Aguilar:

So we're coming up on the end here. Any questions from the audience? While we're looking for that, I'm going to pose a question for you all to think about. Structured data, unstructured data, open source: all critically important. How hard is it going to be to make sure that we meld all of those to get the outcome that we're looking for, from assessment and analysis to the go/no-go decision maker?

Jay Meil:

So I guess I'll take that. Structured and unstructured data are both going to be really important to be able to access, and most open source, so PAI or open source intelligence or commercially available data, is very often going to be unstructured. Something like 85% of all of our data out there is unstructured. The difference is that structured data is what you would think of as tabular data, something in an Excel spreadsheet. Unstructured data is your PDFs, your PowerPoints, your SharePoints, your video, your audio, your pictures, all of that.

Jay Meil:

So when we talk about open source, a lot of open source is social media, and so we need to be able to build analytical databases and systems that can pull in all of the data, irrespective of whether it's structured, unstructured or semi-structured. It should not matter what format the data is in, and it should also be schema-agnostic. What we mean by that is, in the past you've had to sort of build the house before you bring everything inside, but now we don't know what we don't know. If we assume that every bad guy is only going to have one phone number and we build our databases to hold one phone number, then the other 30 are falling on the floor, right? So it's really about making sure we can process all of the data of any type, again schema-agnostic, and then leveraging that open source intelligence or PAI or commercial data to help verify and validate or make decisions.
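
The one-phone-number example above can be shown in a few lines. This is a minimal sketch with hypothetical field names and placeholder values: a fixed schema with a single `phone` slot overwrites earlier values, while a schema-agnostic record accepts any field and keeps every value it sees.

```python
def store_fixed(record, field, value):
    """Fixed schema: only predeclared fields exist, and each holds one value."""
    if field in record:
        record[field] = value  # earlier values are overwritten and lost
    return record


def store_agnostic(record, field, value):
    """Schema-agnostic: any field is accepted and every distinct value is kept."""
    values = record.setdefault(field, [])
    if value not in values:
        values.append(value)
    return record


# Placeholder observations about one subject, arriving from different sources.
OBSERVATIONS = [
    ("phone", "555-0100"),
    ("phone", "555-0101"),
    ("alias", "example-alias"),
    ("phone", "555-0102"),
]

FIXED = {"phone": None}  # the "house" was built for exactly one phone number
AGNOSTIC = {}
for obs_field, obs_value in OBSERVATIONS:
    store_fixed(FIXED, obs_field, obs_value)
    store_agnostic(AGNOSTIC, obs_field, obs_value)
```

After loading, the fixed record holds only the last phone number and has silently dropped the alias, while the schema-agnostic record holds all three numbers and the alias.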

Joshua Gough:

And, at some point, attempting to make it machine-readable too. It had to be around 2020: the report on CBP was that we were sitting on 17 petabytes of data which, just for an example, if that was a book, would take you almost 30 million years to read. You could have an army of analysts trying to analyze what we have in just the CBP trove, and you wouldn't get through it. That army could be the size of California and you wouldn't get through it.
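
The "almost 30 million years" figure can be sanity-checked with back-of-the-envelope arithmetic. The assumptions here are not from the discussion: decimal petabytes (10^15 bytes), plain text at one byte per character, about six bytes per word including the space, and a single reader going nonstop at 200 words per minute.

```python
BYTES_PER_PETABYTE = 10**15  # decimal petabyte (assumption)
MINUTES_PER_YEAR = 60 * 24 * 365


def years_to_read(petabytes, bytes_per_word=6, words_per_minute=200):
    """Estimate nonstop reading time, in years, for text of the given size."""
    words = petabytes * BYTES_PER_PETABYTE / bytes_per_word
    minutes = words / words_per_minute
    return minutes / MINUTES_PER_YEAR


YEARS = years_to_read(17)
```

Under those assumptions the result is on the order of 27 million years, consistent with the figure quoted. Changing the reading speed or bytes-per-word assumption moves the number, but not the conclusion that no human workforce reads its way through that volume.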

Joshua Gough:

It's not going to be until we can make the words make sense to the machines that the machines can help us trawl through all of that, observe, orient themselves around it and then let us know when something important comes up. Currently, I don't see another way around it. That's the problem we face: the data sizes just keep increasing, either geometrically or exponentially. They're getting larger and larger, and the humans cannot keep up at this point. So how does industry help the government? Come in and take a look at the issues, the problem sets, the challenges. Let's spend 55 minutes on that. Solution: take five. Easy money.

Ed Mays:

So I think it's a big challenge, but again, I think we're heading in the right direction. I know that we're looking at how we have distributed data across many, I'll call them endpoints, and then being able to pull that data in when it's required and when it's needed. I think the CTO's office, again, is working on some mesh networks, that sort of thing. So I think these are old problems, but we now have new solutions and new tools, especially AI and ML, that will help us get to that right answer faster. So I think we're in a really good place. I'm hoping that everybody here in the audience is incredibly optimistic about where we're going because, in my mind's eye, I grew up in a time when I was building neural networks the old-fashioned way, and it was painful. Now they're out of the box. This is cool stuff, and it's great and it's useful. The utility and the change that's going to drive the capability for our officers and agents is there, and it's a great time to be a part of this.

David Aguilar:

Great. Great time to be a part of it, great panel. Thank you very much to all of you for participating. Thank you, appreciate it.

Announcer:

Thank you for tuning in. You can follow HSDF THE PODCAST on any major podcast platform. Visit hsdf.org to learn more about the Homeland Security & Defense Forum.
