PrivacyLabs Compliance Technology Podcast

Exploring Audit of AI with Jermand Hagan

April 30, 2021 | Paul Starrett
Transcript
Paul Starrett:

Hello, and welcome to this podcast on how artificial intelligence and machine learning are audited. We're going to have a discussion today with Jermand Hagan of 2fifth Consulting; I'll turn it over to him to introduce himself in just a minute. But first I'd like to give you a sense of where we're going in this discussion. It's primarily a conversation with Jermand; I'll step in and introduce topics and other approaches as we go. The idea is to give you, the listener, an idea of how this process works. It's a fairly new topic, and one that touches many different aspects of the enterprise. I should also mention that Jermand and I work together in this area, and he has an incredible background that is perfect for it. So with that said, Jermand, if you would introduce yourself and your firm, and we'll go from there.

Jermand Hagan:

Perfect. Thanks for having me; I think you're too kind. So first off, my name is Jermand Hagan, founder and principal of 2fifth Consulting. I started my career, as I guess a lot of IT auditors did, at what was then a Big Six audit firm, Coopers & Lybrand, and then continued on through various IT audit, IT compliance, risk, and regulatory examination and compliance jobs. I've worked at many companies: some on Wall Street, including a few of the mega banks; TIAA-CREF, which is one of the larger annuity firms; and several large manufacturers, Philip Morris and Johnson & Johnson, as a worldwide auditor. I also ran the ethical-hack governance process for Citigroup Cash and Trade, started the first technology second line of defense over at TIAA-CREF, and then ended my corporate career at Freddie Mac, where I was the interim head of regulatory affairs, with a primary focus on IT and finance. So I've done quite a number of things with respect to IT risk, audit, and compliance in the field, and I'm also well versed in the three lines of defense, which is the general practice at financial institutions in the States.

Paul Starrett:

Got it.

Jermand Hagan:

And then, you know, now I'm running 2fifth Consulting, where we give audit, compliance, and security advice, and run a small managed service provider.

Paul Starrett:

Got it, perfect. Well, yes, I think that's a great background. If I understood correctly, you were 25 years on the business side; you were around when some of the first auditing frameworks were birthing themselves, NIST 800 and so forth, which is great, because you've seen it all.

Jermand Hagan:

COBIT 1.0.

Paul Starrett:

Wow. We were mentioning earlier that 2fifth and 25 years might be a new way for you to brand yourself. In any event, I'm in a similar place in my career. So, great; thank you. Well, here's what I think we'll do. How does someone in your role approach the process of an audit, let's say for a midsize company on a fairly challenging engagement? Give us a generic view, to the extent you can, of how that works, the way you think of it and the way you typically execute it.

Jermand Hagan:

So generally, when I start any audit, I want to understand the business and the organizational model, as well as the actual model itself, and the compliance requirements. Generally, that gives me a good idea of what's actually working well and what's not, and the background to understand the controls that are in place. That's key to determining the scope and, generally, figuring out how much time I'm going to spend in specific areas.

Paul Starrett:

Right. And I think you touched on something there. I was an auditor back in the day, in a different context, but you're going to focus on where you see the problems. There's no one-size-fits-all: you go in and assess things at a basic level, and then you focus on the places where the problems are.

Jermand Hagan:

Yeah, absolutely. So again, we look for gaps in documentation, and then for overly complex processes with multiple handoffs among multiple people, things that generally aren't automated, which are very likely to fail.

Paul Starrett:

Got it, great, thank you. And with regard to what you see as the largest need: what do you think is the most important thing you would like to see when you come into an audit?

Jermand Hagan:

Oh, you know, maybe it's three things. I always love to see documentation; documentation in good form passes a lot of audits and a lot of regulatory exams. I also like to see that security was top of mind during development, knowing that the developer wasn't just trying to get the process or the model to work, but that it was secure, that they took the proper measures to ensure the data was secure, and that they weren't violating any particular privacy or other regulatory concerns.

Paul Starrett:

Yeah, I think in a past conversation, you mentioned process was another one that stood out.

Jermand Hagan:

Yeah, well, you know, the procedures are going to point me to the process. The system development lifecycle: if it's documented, the process is there, and it's repeatable, that always helps. I'd love to see that.

Paul Starrett:

Yep. Got it. Okay, and I guess the ability to see that they have a predefined process for whatever they're doing, so that you know they've been following a certain series of steps; I guess that comes up in the documentation.

Jermand Hagan:

Well, yeah. So again, the development lifecycle, if it's defined, is going to spell out all the requirements for introducing something as a new product. What are the requirements for it to pass after it's been developed? What's the acceptable number of defects? All those things you'd love to see in the actual process, along with sign-off and review.

Paul Starrett:

Got it, got it. Okay, great, thank you. So what I'm going to do now, for my piece here, is discuss the AI audit side of this in a focused way; then we'll go back into the larger audit frameworks, and then into the three lines of defense with Jermand. On the technology side, and I know Jermand is well versed in this, but from the standpoint of the actual implementation, the data science piece, which generally comes on the PrivacyLabs side, we're really looking for three basic things, and here we're focusing strictly on the model itself.

The first is to make sure there's a governance process when you're building the model. Are you documenting what you're doing? As you go from one version of the model to the next, are you recording what's being done, what went wrong, how it was fixed, and where the data is coming from? That's the first piece.

The next is the ability to understand what the model is actually doing. There are algorithms and resources that can help determine that. There's an ironic, inverse relationship here: the simple models, which aren't typically as accurate, are the most explainable, while the complex models tend to be more accurate but the least explainable. A quick example, one you'll find discussed in the world of explainability: a researcher had thousands and thousands of pictures of huskies and thousands and thousands of pictures of wolves, and the goal was, given a new picture, could the model tell whether it was a husky or a wolf? A deep-learning neural network achieved something like 90% accuracy, but it was actually picking up on the snow in the background of the wolf pictures, because that's all it had; the model doesn't care what it finds in the data. So the data scientists and the domain experts have to know how the model works and what features it's using to make a decision. That's the next layer of understanding and auditing a model, because if you don't first know what it's doing, you don't know whether it's going to misbehave, cause some sort of error, or generate bias or unfairness, which is a very big compliance issue. Those are the basic buckets.

There is also a process being used to audit the data science workflow. When I was in my master's program in data science at Northwestern, all the assignments had to follow what's called CRISP-DM, the Cross-Industry Standard Process for Data Mining. If you Google that, you'll find it; it has basically changed names since it was bought by IBM, I believe. But this is the standard that the audit frameworks build on. There are two audit frameworks I'd give a shout-out to. One is from ISACA, the Information Systems Audit and Control Association; I like their approach because of how it leverages the CRISP-DM framework. If you do a Google search on machine learning audit, you will find plenty; I would give you a link, but it changes often enough that it's better to just search fresh.
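To give a flavor of what "understanding what the model is doing" can look like in practice, here is a minimal sketch using permutation importance, one common inspection technique (the husky/wolf study itself used a different, image-based explanation method). It assumes scikit-learn and pandas; the file and column names, including the telltale snow feature, are hypothetical illustrations.

```python
# A minimal sketch of probing which features a classifier actually relies on,
# assuming scikit-learn and pandas; the dataset and column names (including
# "snow_in_background") are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

df = pd.read_csv("husky_wolf_features.csv")        # hypothetical feature table
X, y = df.drop(columns=["label"]), df["label"]     # label: "husky" or "wolf"
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance shuffles one feature at a time and measures the drop
# in held-out accuracy. A spurious cue like "snow_in_background" ranking first
# is exactly the failure mode described above.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranked = sorted(zip(X.columns, result.importances_mean), key=lambda t: -t[1])
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```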
The other framework I'll mention is from the Institute of Internal Auditors; they have what's called the Artificial Intelligence Auditing Framework. Again, these are approaches by the various auditing associations to this topic. We also see some new efforts from the European Union, which has a proposed regulation of artificial intelligence; we already have a podcast on that with Steven Wu. And there's new guidance coming out from the OCC in banking and financial services, which I know Jermand is close to. So that's the basis: we have the way you audit a model, and the auditing frameworks around that.

Now I'd like to rope you back in, Jermand, on that aspect of it. There are people who specialize in auditing the AI model, and there are people who specialize in auditing IT, compliance, and so forth, but bringing the two together is really where 2fifth Consulting and PrivacyLabs come in, in a way that you must, because of how interrelated they are; there's a horizontal aspect to this that's required. So I think maybe the best way to look at this, Jermand, is to look at the three lines of defense as general coverage. What do those three lines of defense do? And then we can go into how that applies to artificial intelligence.

Jermand Hagan:

Okay, so a generally accepted control model in the financial institution space, the FI space as they call it, is the three lines of defense. The first line of defense is the actual business. They have policies, procedures, and controls in place, and those controls are generally accompanied by a governance area within the business that takes a sample, or a look, at the outputs created by various processes. That's the first line: the business actually executing, with some level of review. The second line of defense includes what they generally refer to as the risk area. This area puts together the guidelines, basically the boundaries, for how far the business can go: what's the acceptable amount of risk a particular process can accumulate, either over a certain time or per occurrence of the process, basically establishing an error ratio the business accepts. And then the third line of defense is audit, where the auditors actually come in, review the processes and the scope, and understand how well the risk function is working to support that process. So: is the business taking undue risk? Are risks being identified? Are they being calculated and quantified? Do the business executives actually know what their exposure is, whether reputational, financial, operational, et cetera?

Paul Starrett:

Interesting. So it sounds like a maturity process, and correct me if I'm wrong: the first line is really the commercial goal of the enterprise and the controls around that. The next is to take that and look at the risk involved, to make sure it's not crossing any lines it shouldn't. And the last one is an objective review of what's happened to that point. Is that a fair statement?

Jermand Hagan:

Oh yeah, absolutely. It's basically: is the business executing, and do they have tolerances in place to determine what's acceptable? Then risk comes in and does another review to ensure those risk tolerances are actually working, and if a number of risks are accumulating, asks how that impacts the business's financial or reputational posture, its risk, so to speak. And then lastly, audit: are those two functions, the checkers and the actual makers, executing as designed and, in some reviews, with efficiency?

Paul Starrett:

Yeah, I guess it's a way of going in and going over everything, a sort of double check on things.

Jermand Hagan:

Yeah, exactly: a checker checking the checker.

Paul Starrett:

I see. And that should be an objective party, I suppose. So what I'll do now is briefly go over the CRISP-DM framework I mentioned, because I think there's some overlap, some way we can map it onto those three lines. CRISP-DM has six stages, and it's the way data scientists quality-check what they're doing, to make sure the science is being executed properly, just like any other science.

The first is business understanding, which makes sense: it starts with the business understanding, what's the goal. Then there's data understanding: looking at the data, trying to understand how much there is; if there isn't enough, or it's too lopsided or too sparse, that may kill the whole idea of using machine learning at all. Third of the six steps is data preparation: you look at odd values, missing values, skewed information, and you clean it up and normalize it, so dates are in the same format, names are always in the same first-name, last-name format, and so forth. The fourth step is the model development itself, where the explainability we talked about comes in, and where we make sure the model is actually going to execute on the commercial purpose it was created for; there's a goal here, and the model has to do something that results in a net gain. The fifth step is evaluation, where they test the model on other data to make sure it's working properly. The model is often trained on what's called a training set, but that becomes kind of siloed, so they bring in new data to see how the model performs on different types of data and different scenarios, how well it works in a broad scheme. In the sixth step, the model is deployed into the enterprise infrastructure, where it actually does its job, and there's an aspect of that where it's monitored, to make sure it doesn't go off the rails while it's out there doing its job.

So those are the six steps of CRISP-DM. And if we map that back over to the three lines of defense, I think it follows the same process: as you go through those three lines, you're reviewing the model, with regard to the specific aspects of model development, across those three lines of defense.
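To make steps three through five concrete, here is a minimal sketch of data preparation, modeling, and evaluation on held-out data, assuming pandas and scikit-learn; the file name, columns, and the churn-prediction task are hypothetical illustrations, not from any engagement discussed here.

```python
# A minimal sketch of CRISP-DM steps 3-5: data preparation, modeling, and
# evaluation on held-out data. File and column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Step 3, data preparation: normalize formats and handle missing values.
df = pd.read_csv("customers.csv")
df["signup_date"] = pd.to_datetime(df["signup_date"], errors="coerce")  # one date format
df = df.dropna(subset=["age", "income", "churned"])   # drop rows missing key values

# Step 4, modeling: fit on a training set only.
X, y = df[["age", "income"]], df["churned"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

# Step 5, evaluation: score on data the model never saw, so performance is
# not measured on the "siloed" training set alone.
print(classification_report(y_test, model.predict(X_test)))
```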

Jermand Hagan:

From what it sounds like, CRISP-DM is more of a development lifecycle for building AI products, but it encompasses everything we historically know as the IT audit approach, right? And of course, as we talked about earlier, you have to determine how far you go, what the scope of the review is, because it could get pretty complex pretty quickly, I would think.

Paul Starrett:

Yes, yes. And this touches again on the horizontal aspect of the audit, bringing it full circle. In the second line of defense, we're looking at the risks, and a lot of those are things like privacy regulation, data protection, and security; in particular, for data science, the security of the development lifecycle you brought up earlier, Jermand. And in the final case, we have the third-party auditor, or the internal objective auditor, who comes in and brings a fresh, objective, neutral view to that entire process. One thing we'd offer here is that the entity that does this, as in the case of 2fifth and PrivacyLabs, has to look horizontally, because machine learning draws from data coming from many different places, and often that data goes into a public cloud. You have to look at the privacy regulations that attach to that; you have to look at the data protection, cybersecurity, and information security that touch it; and you have to look at the software development lifecycle, what I would call the data lineage: where is the data coming from? Then you wrap that around the business purpose. Every time you look at an audit of AI, it's part and parcel of the privacy, the data protection, and any other required governance or compliance requirements that go along with it. That's where our combined ability to cover that horizontal aspect fits what the two of us do together. So one thing we can do now is get your ideas on how we can help people prepare for an audit, and then how we can be a good, objective resource.
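On the data-lineage point, here is a minimal sketch of recording where training data came from alongside a model, so an auditor can trace it later; the fields, file names, and helper function are hypothetical, not a standard schema.

```python
# A minimal sketch of recording data lineage for an ML training run, so an
# auditor can trace where the data came from. Fields and paths are hypothetical.
import hashlib
import json
from datetime import datetime, timezone

def lineage_record(source_path: str, description: str) -> dict:
    """Capture what data went into a model and a fingerprint of its contents."""
    with open(source_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return {
        "source": source_path,
        "description": description,
        "sha256": digest,  # proves which exact file was used for training
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

record = lineage_record("customers.csv", "CRM export, EU customers; GDPR applies")
with open("model_lineage.json", "w") as f:
    json.dump(record, f, indent=2)
```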

Jermand Hagan:

Yeah, so a lot of what I've been doing now is helping small and medium firms prepare for audits, which is essentially putting them through a mock audit or examination before they have to learn the results in the real world. So: going through an actual audit, determining what the weaknesses are, and then also assisting with some of the deficiencies and the remediation. That way we can give them a real-world view of what the auditor or regulatory examiner would like to see, and then help them close the observations we think are necessary, be it regular documentation, help with information security or privacy, or with their system development lifecycle.

Paul Starrett:

Oh, got it. Yes, and I would imagine, by extension, that if a firm or an enterprise is looking for an auditor to come in, they would want someone who can accommodate the full horizontal spectrum at the same time; that saves money, and it reduces risk. So I think that's where we're well placed.

Jermand Hagan:

Absolutely. In the eyes of an auditor, or a regulator, so to speak, because we have the experience in both areas.

Paul Starrett:

Right, right. With your experience in all those very impressive roles, and the way you came up in this world, you've seen it all. And on our side, we have the same kind of story: I've got some twenty-odd years in legal, and the last seven in data science and technology. So, in any event,

Jermand Hagan:

it's kind of weird to be able to say, hey, I remember when they first came out with GLB, you know.

Paul Starrett:

Yes, or HIPAA. For those who don't know, and I'm sure they do: GLB is the Gramm-Leach-Bliley Act, which applies to banking, where Jermand has spent the bulk of his career, primarily banking and finance; and HIPAA applies to healthcare. Okay, we're going to get into acronyms I forget the expansions of, but HIPAA is for healthcare. And I agree with you; you can even go back to Sarbanes-Oxley and the Dodd-Frank Act and so forth. I've watched those come into life, so I think that background helps with understanding. I think that pretty much does it; we could certainly go on for quite some time here. But, Jermand, thank you so much. I did want to give you the chance for a last thought on the topic, because I'd hate to leave without giving your mature and extensive experience that opportunity. Any last thoughts you think we should impart to our listeners?

Jermand Hagan:

Well, I would say the good news is that US regulation isn't pushing so far that there's an overburden of new requirements with respect to AI; I think the requirements are general with respect to development. But there are some things companies need to think about: not only is the model or AI tool working properly, but how are you actually applying the use of that tool from a data-ethics perspective? That's something companies don't generally think about when applying new tools and emerging technologies: is the application of the tool ethical? And, of course, can you explain it? Those are the two main things I see firms struggle with, and where we can help them determine how to go about documenting both to support their cause.

Paul Starrett:

I could not agree more, and I wouldn't have expected anything less from you than to bring that up; we probably should have brought it up earlier. You're absolutely right: the ethics, the bias, the fairness, it permeates everything, and it's downstream of explainability. So I think that's a great way to end things, and we will end here. Jermand, they can find you at 2fifth.com; that's '2' and then 'fifth' spelled out, no hyphen, just 2fifth.com. And me at privacylabs.ai. With that said, feel free to reach out to either of us; we do think we have an unrivaled competitive advantage together. But we'll leave it there. Thank you again, Jermand, and thank you all for listening. Watch for new podcasts from Jermand and myself on other topics.

Jermand Hagan:

Thanks a lot, Paul, for having me.