HSDF THE PODCAST

Priorities for the CBP Office of Information & Technology Part 2

Homeland Security & Defense Forum

Welcome to “HSDF THE PODCAST,” a collection of policy discussions on government technology and homeland security brought to you by the Homeland Security and Defense Forum.

 We sit down with CBP leaders to unpack a practical transformation: a living data inventory and catalog, an enterprise search that returns cited answers instead of document piles, and “agent tech” that automates the last mile of work for officers, analysts, and engineers. The result is faster, more accurate decisions where it matters most.

 Featuring:

This discussion took place on December 12, 2025, at the 8th Annual Homeland Security & Defense Forum Border Security Symposium.

Follow HSDF THE PODCAST and never miss the latest insider talk on government technology, innovation, and security. Visit the HSDF YouTube channel to view hours of insightful policy discussion. For more information about the Homeland Security & Defense Forum (HSDF), visit hsdf.org.

Data As Lifeline At CBP

SPEAKER_02

So Sunil, AI is not new to CBP. You know, you've been doing traditional AI for 10, 15-plus years. Talk a little bit about how you're incorporating custom and gen AI into the mission space.

SPEAKER_01

Sure. I think I mentioned earlier, from the CBP perspective, let me just bring a little bit of story here, because I forgot to give a prop to somebody on our team. So our chief data officer, Michelle Zabraski, she's not here. She runs this amazing team. So data is the lifeline for anything we do, right? Like AC said, that's a fact across the board, not just for CBP, but for any agency or any company, data is the lifeline for their operation. So the CDO, she came in about a couple of years ago. We hired her. She's part of the CTO office, and she does an absolutely amazing job. Some of the teammates from her team are here, Deputy CDO Frank Tran, Kelly, they're both here. They're doing an amazing job. I'll tell you something. Everybody says this, but this is really a fact. You need to know what data you have first, especially at a size like CBP. Billions and billions of transactions, AC said, petabytes of data, right? Some duplicates too, right? We're trying our best not to do that, but we have some duplication pieces, yes.

Building The Data Inventory And Catalog

SPEAKER_04

Right here on the dashboard. Right. It's all the information. Or 50 petabytes, but you can't have it.

SPEAKER_01

Right. So the reason I'm saying this is, for the first time, what she has done is a data inventory of everything we have. That's a checkbox. I need a clap for that, I'll tell you that. That's huge.

SPEAKER_02

Yes.

Two Decades Of Custom Models

The Shift To App-Layer Automation

Agent Tech And Enterprise Search Vision

Beyond Documents To Cited Answers

SPEAKER_01

First time. So we know exactly what we have. She created a catalog out of that, so now we can tell people this is what we have. Because remember, we are just the custodians of the data from the CBP perspective, our OIT shop under the AC. Our mission owners, they own the actual data, right? It's all coming from them, from OFO, Border Patrol, AMO, trade, all those folks actually own the data, right? We are the custodians of this. So because of that, she's done an amazing job to work that issue for us. Now, what happens next? What happens next is that we have been doing, as Dr. West said, custom models ever since 2000, especially after 9/11, right? Ever since then, we have been doing a lot of custom models for the targeting component. Mr. Nial Sama, who's the XD for targeting, has done an amazing job, and they've been building custom models for a long time. That's been there. That means we are building stuff with our data. But the tooling was not there 20 years ago, right? It's gotten better, but the tooling was not there. And we also had to figure out how to take some of those things and work out what exactly we needed to do. It was not really AI. AI is a very broad term, right? So I would say analytics, and analytics used to be a big deal, especially the data analytics component of that, right? That's what we are doing, predictive analytics to figure out where this person is going, where these goods are going, what is coming to us, and how do we track it and keep track of that? So this is awesome. And we became better and better at that custom stuff. We have built what's called an AI pipeline, just like a DevSecOps pipeline, you guys must have heard of that. You put stuff in here, it comes out the other side, right? So that's what we are doing on the data side. It's very critical. 
Now, what's gonna happen? This is my prediction for this industry, and you guys better be working on this. It's gonna merge. We'll see the merger of this data component, the DevSecOps security component, and software development. You're gonna merge all those things together very, very soon. It's coming, it's happening. If you're not doing it, you need to start working on that, right? That's gonna happen. So the custom component, I think we've got it covered pretty well. Now comes in, you know, the OpenAI component of two and a half years ago, a thing called the generative AI component, right? So the generative AI component, I've been working on this for about two and a half years, working with the AC and the others, DAC J, DAC Mays. I think DAC Mays just showed up. I saw him here somewhere. He was here, right? There you go. So DAC Mays is here. You should talk to him. So DAC Mays, DAC J, the CISO, XD Devkota, we all work together to figure out the next generation. See, the thing is, keep in mind, the four cognitive functions, which we always had to write stuff for, write code for, they're there now. These models are amazing. I'm telling you that. Most of the discussion should not be about the models anymore. Dr. West, that's where the issue is. We need to discuss this. See, if you're talking about hardware, if you're talking about TPU, GPU versus CPU, wrong discussion. That's a given. Or the LLM size, how many billion parameters you're building, wrong discussion again. The discussion that needs to happen is the app layer. That's where the automation is going on right now. If you're not automating in that area, I think it's a problem. All of us should be working on it. You must have heard of a thing called agent tech, right? Agent tech is the last mile, I call it, right? Because what have we been doing so far? Have you heard of a concept called RPA? 
I hate to say this, that concept is kind of whittling away, because, you know, when you're looking at software development, the next generation of AI, where you're automating that component, we don't have to worry about some of those things. So that means I type something, right? I have Chat CBP, for example, that AC mentioned. I type something in there, and it goes and does its magic at the back and figures these things out using automation completely across the board, and gets information back. So one thing we are doing, Dr. West, and AC, you know, I'm just waiting for some funding from him. It's called enterprise search. What does that mean? What this means is, sitting in one place, an officer, agent, Sunil, AC, XD Devkota, DAC Mays, whosoever it is, types something from one place. I can call it Google for Google Strip for one place, right? You're typing something in and you're getting information back. That information will depend on your access control, what you're allowed to see. If an OFO DFO from IAD, for example, or whoever looks after the Baltimore area, can ask, how many officers are working today in Baltimore, or assigned today? You don't have to go to a spreadsheet somewhere and find the one for today's date and look for that. No. You type it in and you get the information back, accurate information back. It's critical. So that's where we are going, Dr. West. You know what I'm saying? Yeah, we want your help for that component. I'm telling you, it's critical. I see a lot of great companies doing a lot of work. I shouldn't have to go and look for something in SharePoint and get back a document. I looked for retirement once, you know, recently, and it showed me 6,000 documents. What am I gonna do with that? Right? You guys agree, I want content back. That means content which cites exactly where this is coming from, that kind of information. 
That's only one area, right? Then you can look at, for example, computer vision. NII is a good example. Same thing there. If you look at X-ray vision for anomaly detection, I should type something in, and it shows me things back. Because these LLMs, the reason I'm telling you this is, it's no longer basic computer vision. It does do X-ray vision, which is what we look at in NII, and, I mean, XD Devkota is working on this. These LLMs are doing it now. So that's where we need the innovation component. Talk to us. Hey, this is what we are doing. The companies who are talking to him, he's telling them exactly that. There's no need to have this massive one-year software development lifecycle to develop something anymore. This is weeks' worth of work now.
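
To make the enterprise search idea in this answer concrete, here is a minimal Python sketch of the two properties the speaker asks for: results limited by what the user is allowed to see, and answers returned as content with a citation rather than as a pile of documents. It is an illustration only, not CBP's design. The `Document` class, the role names, the sample records, and the keyword-overlap scoring are all assumptions made up for this example; a real system would use a proper index and ranking model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str               # hypothetical identifier
    source: str               # hypothetical location, used in the citation
    allowed_roles: frozenset  # roles permitted to read this document
    text: str


def search(query, user_roles, corpus):
    """Return (sentence, citation) pairs the user may see, best match first."""
    terms = set(query.lower().split())
    hits = []
    for doc in corpus:
        # Access control comes first: skip anything the user cannot read.
        if not doc.allowed_roles & user_roles:
            continue
        for sentence in doc.text.split(". "):
            words = set(sentence.lower().replace(".", "").split())
            score = len(terms & words)  # naive keyword overlap
            if score:
                citation = f"{doc.doc_id} ({doc.source})"
                hits.append((score, sentence.strip().rstrip("."), citation))
    hits.sort(key=lambda hit: -hit[0])
    return [(sentence, citation) for _, sentence, citation in hits]


# Made-up sample data, for illustration only.
CORPUS = [
    Document("roster-001", "staffing system", frozenset({"field_ops"}),
             "Forty officers are assigned in Baltimore today. Shift change is at noon."),
    Document("hr-042", "benefits portal", frozenset({"hr"}),
             "Retirement forms are due in March."),
]

answers = search("officers assigned Baltimore today", {"field_ops"}, CORPUS)
```

Here the field operations user gets back one sentence with its citation, and never sees the HR record, because the role check runs before any text is matched.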

SPEAKER_04

Yep.

Multimodal AI And X-Ray Anomalies

SPEAKER_01

And that's where the action is. And we are hoping that's going to change. The last one is software development, what AC talked about. I'm proud to say all our software engineers have access to this. And if one of you guys, if your teams are working with us, they better be working on this. Otherwise, there's a problem. We will expect every software engineer, right, AC, within CBP to be using a code assistant to make your life easy. Not only to make your life easy, but also so that we continue to deliver faster in the field for our agents and officers, for the mission.

Code Assist And 5x–10x Delivery

Biometrics And Frictionless Travel

SPEAKER_04

Yeah, so I think the key points, I want to reiterate those points, very important, in reverse order. The software code generation, and the number of things that you can deploy using code that is fully tested with your peers and using that code assist, is gonna be very key, because the administration has asked us to go 5x, 10x. All right, it can be done with what we've got. We can even go higher than that, by the way. So we've proven that. We are one of the first agencies that have shown the savings and the number of lines of code, and they went into the millions, and we can do much more. So now that it's proven, it's gonna take off. So if I have 2,000 govies and 5,000 contractors, everyone is expected to do that. The second thing is we've been doing predictive analytics and algorithms for a long time. You know, we process 840 million people with a 99.5% match rate within 0.3 seconds, with 2,200 impostors. All that is excellent. And now with AI, and FIFA 26 and all the other stuff coming in, LA28, how do I use AI to assist that and go even further? And you will even have travel where you are going straight out and you don't have to stop at a kiosk. You even have mobile Global Entry, where you just do a, you know, a selfie and go forward. You'll be able to just walk through an airport, because we've already got stuff all linked together on that. So I think that's going to be a very big key. I think the other one was data, like this data inventory that I was showing you, everything on a dashboard. I've got like 60, 70 dashboards all there. Nothing cached locally, by the way. So even if you take me, the thing self-destructs in five seconds. But, you know, I can look at the number of data sets by reporting, I can see the number of data sets by, you know, organization, I can look at inventory, I can look at any kind of thing. 
So now that I have this, what we found is, the good news is we've got all the inventory. The bad news is we've got a couple of copies of everything. And so, how do I get to the real authoritative source of data with the provenance, with the right to purport security and classification? So those are things that we're gonna look at. And I think Sunil also nailed it with the other point, which is that now that we have this thing going and people are using it, they're registering their use cases with us. And so the next thing we're gonna launch in FY 2026, announcing it here because I will give them the funding, by the way, is an enterprise search capability that goes across all the sucky search engines that we have for email and all that kind of stuff, obviates some of that, finds stuff better, goes into some of the relational databases and other things and looks at their search, and links all that together with a unified experience. So to me, that enterprise search with that data is gonna be very, very key. I know we got two or three minutes, but I wanted to save it.
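
The "couple of copies of everything" problem described above can be illustrated with a small Python sketch that fingerprints each data set in an inventory and groups the ones with identical content. This is a sketch under stated assumptions, not the method CBP uses: the data set names and rows are invented, and exact-match hashing only catches byte-for-byte duplicate content, not near-duplicates or copies that have drifted. Deciding which copy is the authoritative source is a governance question the code does not answer.

```python
import hashlib
from collections import defaultdict


def fingerprint(rows):
    """Hash a data set's rows in sorted order, so row order does not matter."""
    digest = hashlib.sha256()
    for row in sorted(rows):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def find_duplicates(inventory):
    """Group data set names whose contents are identical."""
    groups = defaultdict(list)
    for name, rows in inventory.items():
        groups[fingerprint(rows)].append(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


# Hypothetical inventory: names and rows are made up for illustration.
INVENTORY = {
    "crossings_main": [(1, "A"), (2, "B")],
    "crossings_copy": [(2, "B"), (1, "A")],  # same rows, different order
    "cargo_manifests": [(7, "C")],
}

duplicates = find_duplicates(INVENTORY)
```

Each group in `duplicates` is a candidate set of copies that a data steward would then review to name one authoritative source.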

SPEAKER_02

I wanted to save a little time for questions, but one thing I wanted to ask you, Sonny, is, you know, these next six, twelve months are gonna be key, with the big, beautiful bill and some of that funding already being used. What can industry, the folks out in our audience, be doing to really, you know, help CBP better or help DHS as a whole? Are there things they can be doing when they come see you or when they go through your vendor management office? What are the things that you really want to hear about and see?

Dashboards, Provenance, And Duplicates

Funding, GPU Shift, And OT-IT Integration

SPEAKER_04

Well, first of all, there's a large investment coming. You all know the, whatever, 45 billion over three years, or whatever that number is. A lot of the stuff is going to border wall and all kinds of other NII and other kinds of technologies. But I'll just say that we're also working on biometrics, integrating the biometrics within DHS, between CBP and DHS. So that's gonna be a big deal that's gonna be happening, more biometrics in that area. How we can do that better and roll some of these capabilities out, I think, will be one area. We're gonna release a technology refresh, hopefully an acquisition this year, where we can outsource to industry and then figure out how we can get, because everything's CPU based, we want to go to GPU also at some point. There's TPUs, obviously, we're looking at quantum and AI/ML processors and all that kind of stuff. Basically, how do we automate all of that and then update that process going forward? I think OT/IT integration, a lot of common operational picture, common operational environment. We're helping TSA, one of my staff is being sent over to the administrator next week to start helping out with some of the things that they're doing. So we're all working with the DHS CIO on all these sorts of efforts, so there'll be a lot of area in OT/IT integration. We have successfully proven that works. We've also proven that we can go out to Texas and other places and direct SATCOM down from a combination of three constellations. One of them is Starlink, you can solve the other two, you know, and get that so that our agents can make that happen. What we want is our agents and officers and analysts to not be encumbered by anything going in, right? Merge that data together that when they encounter anything, operating environment. 
Yeah, and we're trying to go with a person-centric model, cargo-centric, so that everything is based on a person, and then there's a whole bunch of information that's integrated around that experience. That is massive. That is a world-changing sort of environment that we're doing. So there's a lot of exciting stuff coming in. We're currently at 98 countries. If you look at all the different combinations, making some deals happen in the Middle East and other places, what we're doing is to protect America, move that perimeter out there, so that in exchange for some of the advanced threat mobile and other apps, we can supplant another country that, you know, is out in some of these places and put America up there, so that we can get that information loaded to our system, so that before anyone even gets on a plane with a cargo, like pre-clearance and all that stuff, I've already got all that information way in advance and I can link things together. So all of those areas are where industry can help. We want your analysis, we want your expertise, we want you to bring what really helps the United States here to catch up in some areas. I heard on 6G we're a little bit behind. Well, we can't have that. How do you bring that in? Cyber, if there's something that you're having a problem with, report it. You know, we're having a good relationship with the FBI and others as well, trying to get in front of what some of the threats are. There was a CVE just released yesterday with a score of 10. You know, so I mean we've gotta get past that CVSS score immediately and start getting that patching there, right? Because that's the weakness. So I think with a lot of this kind of stuff that's going on, you'll see a lot more coming out with acquisitions in trade. You'll see acquisitions in national security, you'll see stuff coming out in border security, you'll see technology refresh, some cybersecurity stuff as well. 
And I think getting some of that funding on the 45 billion applied towards IT, making that better. But the key again, remember I told you, the administration's goals are, again, find efficiency, find savings, find better ways to do that, do more with what you've got. And then with the savings, and also with the new investment that they're making, which is multiple billions of dollars into CBP, you know, help us expand and make sure that we protect the American people faster, better, more affordably, more securely.
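
The point about a CVE scored 10 and patching immediately can be sketched as a simple triage step in Python. The severity bands below follow the CVSS v3.x qualitative rating scale (0.0 None, 0.1 to 3.9 Low, 4.0 to 6.9 Medium, 7.0 to 8.9 High, 9.0 to 10.0 Critical). Everything else is an assumption for illustration: the finding identifiers are placeholders rather than real CVEs, and ordering by base score alone ignores exposure and exploitability, which a real patching process would also weigh.

```python
def severity(score):
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS base scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def patch_order(findings):
    """Sort (identifier, score) pairs so the highest scores are patched first."""
    return sorted(findings, key=lambda finding: (-finding[1], finding[0]))


# Placeholder identifiers, not real CVE entries.
FINDINGS = [("EXAMPLE-A", 5.3), ("EXAMPLE-B", 10.0), ("EXAMPLE-C", 7.5)]

queue = patch_order(FINDINGS)
labels = [severity(score) for _, score in queue]
```

With this sample input the score-10 finding lands at the front of the queue, which matches the urgency the speaker describes.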

SPEAKER_02

And I think that's going to be the order of the day. Excellent. We have time for one or two questions. And where's our mic? I want to get the mic over right here in the middle. If you'd state your name and the organization that you're with, please.

Global Partnerships And Pre-Clearance Data

SPEAKER_03

Phil Hassel with LMI. I just wanted to ask a question around the DHS Chat and Chat CBP rollout. I think what I'm really curious about is, as you roll that out, and I mean it's been in production for a while now, what sort of things have you seen that have really surprised you, in terms of like, I did not expect this group to be my power user, or I hadn't anticipated this application of it, or we just went way over on compute cost? What has been the biggest story to come out of all those implementations?

Security Urgency And Acquisitions

SPEAKER_04

I think that for me, I'll just say, it's getting a lot of user stories of people saying, this has really changed my life. This has really helped me do my job better, right? And they can do it on a secure platform. Keep in mind, in the one that we're doing, Chat CBP, you can put in PII and you can put in, you know, secure information. And you put that in there, it gets you that information very quickly, and we've upped the size on it and the capability. So with the tokens that we have, you can pretty much do anything, and it comes out very quickly and does that for you. How it helps the CTO is that when he responds back to emails to me, he's got the best English and the best responses in like two nanoseconds, and I know that it's not him. I know he used Chat CBP. He tries to pass off that it was him, but you know, it's like, wow, this is the best answer. I've seen five paragraphs in like two minutes. And then I sent him to the confession booth and he said Chat CBP. Anyway, all right. So my point is, all of that kind of stuff is helpful, but our agents and officers in the field are coming back with use case after use case, that they have actually solved the case faster, better, more affordably, more securely, faster and better because this is there. So I think the big thing is training. We need to do a little bit more training on that, I think that would be better. We're working with DHS Chat and Chat CBP coexisting, DHS Chat more going across, Chat CBP within. And we're coming out with an avatar as well, next, that allows us to do that. Sunil's come out with 250 languages that are coming out right on the translate, so you can speak that, including techno-gibberish if you want. That was the only joke I had, by the way. So I think all of that is what's coming out next, but voice prompt is going to be next. 
I want the voice prompt so that I can just say, just like Siri, whatever, but in a government setting at the highest level of classification, hey, I want to do this. It launches the prompt without having to type it in, and then it launches it and brings it out for you, and also goes out to the applications, to our software as a service, office automation and productivity suites, also to the general services that we provide, all that with an overarching service and a user experience. So I think to me, that's going to be the key, to answer your question. One more question.

SPEAKER_02

Right here.

SPEAKER_00

Yes, Kirk Greening with Nutanix. Thank you for coming. So I heard somebody say, I think it was you, Sunil, that 70% of data is created at the edge. So, with that being the case, what are the opportunities and challenges with bringing AI models to the data versus bringing data to the AI models?

Chat CBP Wins And Voice Prompts

Edge Vs Cloud: Pragmatic Compute

SPEAKER_01

Yeah, I think this is a great question. I know when XD Burbo is here and not Devkota, we discuss this a lot. So just keep in mind, right? We do have computation requirements at the edge, right? We do it right now, right? With some of the things that DAC Mays does and that DAC J does on their side. But I think the concept, we've been doing edge to cloud for a while now, right? You look at AST, which is autonomous surveillance towers, we've been doing that ever since I joined the government six years ago, right? And that one is a great concept, right? Because you are looking locally at the pictures, and 70, 80 percent of the stuff is junk, and you throw that out and bring back what you need for analysis, right? And before you do that, there is what we call the near edge and the far edge concept, right? So some of these areas, we are working with various vendors to see what makes sense. In certain areas, it probably doesn't make sense to do local compute. You don't have to, right? But physics is always there, right? You cannot get away from physics, so see what you can process. As our boss said, AC said, the officer is always in charge. AI just assists. NII is a great example. You'll see a lot of the work being done. I think XD Devkota is speaking after this, and you'll be hearing from him also on this area. But what we are seeing is that the extension of the cloud concept, right, is going further and further away. Right? And because of that, you have processing areas, which may not be the edge itself, but you have some areas where you are bringing the data in locally, not that far away from the edge, doing the processing there, and then getting the result back, or far-edge storage. Ultimately, you do have a centralized area where your data goes in. But one thing we'd like to do is, especially from the generative AI component, right? 
It kind of solves so many issues for us. We are exploring things. How do we use that for the edge use cases? It really is, right? End of the day. There are now database companies, I mean, large database companies, that are using generative AI at the back end. You know, what generative AI uses is what is called a vector store. It's a vectorized database, right? And a vectorized database is a little bit different from traditional databases, right? So because of that, every company you can think about, especially large tech companies who have relational, or structured and unstructured, databases, they do have a vector store now, which is awesome. So you can store different kinds of data depending on what it is. So you'll be hearing about a concept called MCP. MCP is, you know, it's a protocol, Model Context Protocol, and it lets you connect your services through this concept called MCP. When people ask me what MCP is, I say MCP is nothing but an API with an LLM behind it. Makes sense, right? That's what this is. And I was talking to some of our partners, and this lady told me, yes, Sunil is exactly correct. It's an API with an LLM. So those concepts you'll see more and more of, right? Some of these things are utopia also, in the sense that you're trying to get to that. Remember my earlier comment about one place, type something, get everything back. That can be an edge thing too, right? You're getting information out of the edge, coming back to you. And ultimately our plan is that data is not moving. We need to keep everything in its place, but have what is called a distributed architecture where you're doing federated queries, you're running from one place, and that query goes across and brings it back to you. That was utopia 20 years ago, by the way.
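
Since the answer above contrasts a vector store with a traditional database, a toy Python sketch may help show the difference: instead of matching exact values, a vector store returns the stored items whose vectors are closest to the query vector. This is a minimal illustration under stated assumptions. The `VectorStore` class, the two-dimensional vectors, and the item keys are invented for the example; real systems use embeddings with hundreds or thousands of dimensions produced by a model, plus approximate nearest-neighbor indexes rather than a full scan.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Toy in-memory store: keeps (key, vector) pairs and scans them all."""

    def __init__(self):
        self._items = []

    def add(self, key, vector):
        self._items.append((key, list(vector)))

    def nearest(self, query, k=1):
        """Return the keys of the k stored vectors most similar to the query."""
        ranked = sorted(self._items,
                        key=lambda item: cosine(query, item[1]),
                        reverse=True)
        return [key for key, _ in ranked[:k]]


# Hand-made vectors standing in for model-generated embeddings.
store = VectorStore()
store.add("cargo-note", [1.0, 0.0])
store.add("travel-note", [0.0, 1.0])

closest = store.nearest([0.9, 0.1])
```

The query vector is not equal to anything stored, yet the store still returns the closest item, which is the behavior that makes this kind of database useful behind a generative AI front end.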

Vector Stores, MCP, And Federation

SPEAKER_04

So if I can, just one thing I want to make sure of, from a technical architecture standpoint. The one challenge we have is, we have an edge funding program that we're trying to establish. We don't have full funding for that. Okay, so I just want to say the current architecture is 1,744 locations, a big network, which includes the 329 ports of entry, 328 ports of entry, the 179 Border Patrol stations, the 79 AMO stations, and all that kind of stuff, and our international locations. But all of that is backhauled currently to a tier four data center, and we've got 276 apps in the cloud and all that kind of stuff. So there are no edge computing devices that are proliferating across the whole environment. We want to do that. But with the beacon middle tier with Border Patrol and the edge architecture that Sunil and everyone have come up with, we want to start getting into those devices as a service, if that's available. So that's 75% of that thing, because keep in mind, we have a hundred thousand assets in the IT domain that I oversee and my entire team in OIT oversees at these locations. But there's another double of that, which is the OT assets, from planes, drones, and all that kind of stuff. All the wall and all the other six levels of sensors the wall has, that's going to generate all this information. We'd like to process 70% of that at the edge because it's there. But to do that, I need some devices and other things. I don't have the funding for that part of the program. We're working on it and have done some pilots, and our Border Patrol, AMO, and others, OFO, said, yeah, this is good, to what Sunil is saying. Now we've just got to bring that together to make sure that it comes together as a program, if you're understanding what I'm saying. So the concept is there. We have the winning architecture. We know that between the intelligence community and the DoD and the civilian agencies, we are quite advanced in looking at how we're doing it. 
We think we're doing the right thing. We need all of you in industry to keep pushing that concept and telling us how we can make it more affordable, to go from the first three successful concepts we've already done to the next five or ten, and then really make this a program approach. So leverage exactly what Sunil is saying, but we need those edge devices. We have now seen actual devices, you know, actual devices that'll be the edge computer, that'll be doing the processing there. And so bring that out and deploy those, and then change the network topology a little bit between east-west presence, and also change the topology with the SATCOM coming in. Bring all of that together so I can do that more effectively. So to me, that's the opportunity, if I can just say it from a CIO programmatic standpoint, and hopefully you're understanding what I'm trying to say there. So yeah, I want to add to this the brilliance of Sunil, you know, because if I give him five more minutes, he can solve world hunger very quickly, because he's got amazing, amazing technology that we're putting out there. But this edge thing, we have a thing we're gonna publish out to industry in a very closed setting, because it's currently for official use only. But we believe this is the winning architecture, along with the systems management between NOC, SOC, EOC, and all those sorts of concepts. So stay tuned, and that's a great question.
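
The edge pattern discussed in this exchange, where most sensor data is judged uninteresting on site and only the remainder is sent back for analysis, can be sketched in a few lines of Python. This is a hedged illustration, not the architecture described by the speakers: the `Frame` class, the `score` field (standing in for whatever an on-device model would output), the threshold, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    frame_id: int
    score: float  # hypothetical "interest" score from an on-device model


def triage_at_edge(frames, threshold=0.5):
    """Keep frames worth backhauling; report how many were dropped locally."""
    keep = [frame for frame in frames if frame.score >= threshold]
    dropped = len(frames) - len(keep)
    return keep, dropped


# Made-up readings: most frames score low and never leave the edge device.
FRAMES = [Frame(1, 0.05), Frame(2, 0.92), Frame(3, 0.10),
          Frame(4, 0.71), Frame(5, 0.02)]

to_backhaul, discarded = triage_at_edge(FRAMES)
```

In this sample, three of five frames are discarded locally and two are forwarded, which is the bandwidth trade the speakers point to when they talk about processing most data where it is generated.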

SPEAKER_02

Sonny, Sunil, thank you both for giving us an excellent update on your priorities, your mission priorities, the agency priorities. Let's give them both a big round of applause today.