Upon Further Inspection

Episode 14 - Stood the Test of Time (featuring Clay White)

Upon Further Inspection Season 1 Episode 14


Welcome to the world of mechanical integrity! In part 1 of our interview, we welcome Clay White to the podcast. 

Take a listen as Clay takes us through his career journey, from earning a welding and metallurgy degree to becoming a recognized expert in mechanical integrity, corrosion, and materials engineering – including his contributions to API 581. Our discussion explores the evolution of Risk-Based Inspection (RBI) models, flaws in the Generic Failure Frequencies (GFF), the critical role of Damage Mechanism Reviews (DMR), and how statistical analysis can sharpen inspection strategies. We wrap up with Clay’s perspective on Artificial Intelligence (AI), emphasizing the need for transparency, data accuracy, and responsible implementation.

If you work in mechanical integrity or engineering, this episode is a must-listen. Part 2 of our conversation with Clay White will be published on October 16 – subscribe today!

00:26 Background and Early Career

02:57 Development of RBI Programs

07:12 Challenges with GFF and Data Accuracy

10:56 AI and the Future of Mechanical Integrity

15:29 Continuous Improvement and Data Management

18:33 CML Optimization and Inspection Strategies

22:30 Importance of Data Tracking and Optimization

25:26 Collaboration Between Engineers and Data Scientists

28:39 Case Studies: Heater Failure and Corrosion Rates 

32:26 The Role of AI in Data Analysis: Transparency & Reliability

+++++++++++

Episode Acronyms & Abbreviations

API 581 – Risk-Based Inspection Technology (Recommended practice developed & published by API)

API RBI – Desktop RBI software available through the Equity Engineering Group

ASME – American Society of Mechanical Engineers

CML – Condition Monitoring Location

DMR – Damage Mechanism Review

GFF – Generic Failure Frequencies

IDMS – Inspection Data Management System

ISO – International Organization for Standardization

MI – Mechanical Integrity

QA – Quality Assurance

QRA – Quantitative Risk Assessment

RBI – Risk-Based Inspection

RCM – Reliability Centered Maintenance

SME – Subject Matter Expert

TMLs – Thickness Measurement Locations

Send a text & tell us what you think!

Thank you for listening to Upon Further Inspection! If you enjoyed this episode, be sure to follow or subscribe so you don’t miss the next one. 

 We’d love to hear from you—connect with us on LinkedIn and share your thoughts on the episode. Have ideas for future topics or guests? Email us at inspectionpodcast@gmail.com.

Join us next time, wherever you get your podcasts. Until then, stay safe and stay informed.

Note:  The views and opinions expressed by the guest are their own and do not necessarily reflect those of the hosts or the Upon Further Inspection podcast. This podcast is for informational purposes only and does not constitute legal or professional advice. Listeners should seek their own qualified advisors for guidance.

Nick

Hey, I'm Branden. Hey, and I'm Greg.

Branden

This is Upon Further Inspection. We've got Clay White with us today.

Greg

So, let me ask you this: can you share a little bit with our listeners about your current activities, and also, how did you ever get started in the world of mechanical integrity?

Clay

Okay, yeah, I'll give you a bit of background. My degree is in welding and metallurgy from A&M. I really hadn't worked much in the oil industry before going to college. I did end up taking a job that got me into the materials and corrosion engineering side of things. Early on, about my junior year in college, I had a job at the Tennessee Gas Transmission Company (Tenneco) in Houston. They had a lab where they did failure analysis, and that was my segue into getting interested in corrosion and materials. I found it interesting and fascinating. I'd always worked on cars and built stuff, and I'd encountered a couple of failures early on and was trying to figure out why in the world they failed. So I guess that was my catalyst for getting into the corrosion and materials side of things. After coming out of A&M, I took a job in a materials lab for a company called FMC, which owned 30 or 40 different companies, including

Greg

Yeah,

Clay

wellhead divisions, mining divisions, and an ordnance division out in California. I actually worked in a materials lab doing failure analysis and that kind of stuff for the first five or six years of my career. Then I took a job with Exxon in chemicals, and after a few other stints, I ultimately ended up doing inspection supervision, starting at Exxon, and corrosion materials work. Actually, they needed more of a mechanical engineer, so I really did more mechanical engineering early on as a fixed equipment engineer. About that time, the late eighties into the early nineties, I started in with API, and our friend John Reynolds recruited me to help him on 581. I don't remember which aspect we were working on, but I helped head up one of the section write-ups around corrosion and ran the subgroup on that. So it goes back to about the same timeframe when I met Reynolds. That was about

Greg

92, 93, about that timeframe.

Clay

Yeah, about when we started, or maybe shortly after we started, the 581 RBI work and approaches. At one point, before the API 581 program was released through DNV, I actually tried to follow a similar pattern and wrote an RBI program that a company called APTECH ended up using at a couple of sites. I think it's actually still in use today, but I haven't kept up with that one.

Greg

Yeah, I remember seeing that in the field a lot. When we bid on that original API RBI contract at DNV, APTECH was one of the bidders, as was ASME. And, what is it, Lawrence Livermore Labs or somebody like that? Don't hold me to that, but it was somebody out there on the West Coast.

Branden

Yeah,

Greg

yeah,

Branden

Why did you end up writing a separate one?

Clay

Well, it wasn't separate at the time. There wasn't any competing software at the time, and we were looking to start to implement something. It goes back to when APTECH recruited me to work for them and do this. We knew the API RBI tool was coming, but it hadn't been released yet, and it wouldn't be for another two years or so, I think, before the first version that we actually tested in trial. So it was just in advance of anything being available. I wrote the mechanical side of it, and a process engineer, a fantastic guy named John Young, a very senior process engineer whose career and experience was mostly with Dow or DuPont, I don't remember which now, wrote the consequence side model. I wrote the likelihood side model, and we paired them together. It was just a spreadsheet-based version. Some of the data was available from the early work at 581, and we incorporated some of those concepts, and then it kind of developed a life of its own. But it probably predated the API release by about two years, give or take.

Greg

Would you consider it qualitative, quantitative, semi-quantitative?

Clay

Semi-quantitative. We were actually trying to calculate the probability of failure, to a degree. The likelihood of failure in that case wasn't a true probability, but we were using real corrosion rates and data and the damage tolerance of the equipment to come up with a likelihood value for the equipment. On the consequence side, we used many of the same kinds of parameters, but it wasn't anything as sophisticated as the 581 model, calculating blast pressure and areas and flammable effects and all that for five different kinds of fire damage. We didn't get into that level of detail or sophistication, but it led down the same path, or a similar path, to coming up with a risk based on a likelihood-consequence model, and then the inspection side of it.

Greg

Yep, I certainly remember seeing it around, and it seems like it got a little traction on the ASME side too. I'm not so sure some of the work there didn't morph out of it. I don't know. It could have been, because Ron Leonard was heavily involved. Ron was our first site. Yeah.

Clay

Yeah, Ron promoted it. One of the first RBI assessments we did was for Ron at his massive chemical plant up there.

Branden

Yep. Since you were in charge of the mechanical side, how did you go about the likelihood piece? What was your starting point?

Clay

Well, it was the damage rate, either expected or measured damage rates, and then comparing that to a model for the location and what the damage tolerance of the equipment was, based on its design. We were doing some math in terms of how thin an area could get before it would potentially fail, and as we got closer to failure, we were essentially assigning a damage factor similar to what 581 did. That's how we did it.
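
The likelihood calculation Clay describes, comparing a measured or expected damage rate against the equipment's damage tolerance and ramping a damage factor up as the component nears failure, can be sketched roughly like this. The function name, numbers, and ramp shape are all illustrative assumptions, not the actual spreadsheet model:

```python
def damage_factor(t_actual, t_min, rate_mpy, years_to_plan):
    """Toy damage factor: grows as projected thickness approaches t_min.

    t_actual and t_min in inches; rate_mpy in mils/year (1 mil = 0.001 in).
    Returns a multiplier >= 1 that rises sharply near failure.
    """
    t_projected = t_actual - rate_mpy * 0.001 * years_to_plan
    tolerance = t_actual - t_min          # wall available to lose
    consumed = t_actual - t_projected     # wall projected to be lost
    if tolerance <= 0 or consumed >= tolerance:
        return 5000.0                     # treat as at/near failure
    fraction_used = consumed / tolerance  # 0 -> healthy, 1 -> at t_min
    return 1.0 + 4999.0 * fraction_used ** 3  # convex ramp toward failure
```

A component with plenty of damage tolerance gets a factor near 1; one projected to reach t_min within the plan period gets the ceiling value, mirroring the "closer to failure, higher likelihood" behavior of the 581 damage factors.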

Branden

Okay. I was just curious, because I wasn't sure if you'd gone down the GFF route and had that kind of a starting point, because you've been on record being pretty vocal about GFFs as well.

Clay

I'm not a huge fan of the life fraction model for a number of reasons. Later, of course, I gave several presentations at 581 trying to promote some changes, and several of them are based on the problems with the GFF. It's not a particularly accurate way to determine a probability of failure. You start with a number, and if you know the background, we have not updated those numbers in API in 20 years. When we started 20 years ago, we didn't really have particularly good data. We were using a couple of different sources, but it was more akin to offshore equipment, number one; and number two, it wasn't particularly well calculated to produce an actual GFF. For those listening that may not know, GFF is generic failure frequency. You're basically trying to take an actuarial approach like you would on, say, life insurance, where you look at the average lifespan of a person and then add risk factors that would shorten or extend that lifespan. That's similar to the approach we took with the GFF and the damage factors in 581. But it doesn't take a lot of time to realize that the GFF numbers can't be right. You've got the exact same GFF number for multiple different types of equipment, and if you look at the magnitude of the GFF, it's not really in line with what we expect for failure frequency. When you compare actual frequencies: I tracked 12 sites toward the end of my career, which we didn't really finish talking about, but I ended up at Phillips 66, and we had at the time 14 or 15 different sites.
One of my jobs for them at corporate, as the director of their mechanical integrity programs, was to publish essentially a high-level roll-up of what kind of risk we had and how we were performing on MI programs. We had a very detailed metrics package where we looked at all different types of failures and causes: how many clamps you had, how many leaks you had, those kinds of things, and we compared that data. From 12 different refineries, I knew the equipment counts, how many different types of pipe and vessels we had, and how many failures we had. Comparing that to either the GFF or the GFF plus a damage factor (we were using 581 at the end), you didn't get a particularly good comparison. It was probably almost an order of magnitude off on the predicted failure rate. So I knew we needed to think about a different way of doing this.
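
Clay's comparison can be reproduced in miniature. API 581 computes probability of failure as the generic failure frequency times a damage factor times a management systems factor; multiplying by component count and years gives an expected failure count to hold up against the leaks actually observed. The equipment and leak counts below are invented for illustration; 3.06E-05 per year is the total GFF commonly cited from API 581 for many fixed equipment types:

```python
def predicted_failures(gff_per_year, damage_factor, mgmt_factor, n_components, years):
    """Expected failure count: POF = GFF x DF x FMS, per component, per year."""
    pof_per_year = gff_per_year * damage_factor * mgmt_factor
    return pof_per_year * n_components * years

# Invented refinery-scale numbers for one scenario:
predicted = predicted_failures(3.06e-5, 10, 1.0, 50_000, 5)  # expected leaks over 5 yrs
observed = 8                                                 # leaks actually recorded
ratio = predicted / observed                                 # roughly an order of magnitude
```

With these made-up but plausible counts, the model predicts about 76 failures against 8 observed, the kind of order-of-magnitude conservatism Clay describes seeing across his sites.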

Branden

Conservatively an order of magnitude, or non-conservatively? And did you see a large difference?

Clay

It was just too conservative. Yeah, it's too conservative.

Branden

And were you seeing specifically

Clay

for piping?

Branden

Specifically for piping, yeah. So that was going to be my question: were you seeing large differences? You said all the GFFs are the same, except for tanks. And between the different equipment types and component types, you were seeing vastly different failure rates.

Clay

From what we were experiencing, yeah. So the question then is: do we try to go back and fine-tune the GFF, or do we change the approach or basis? Some people are advocating going to more of a true probabilistic model, which I like; it's just more data, more time, more energy to produce that level of an answer. And then of course, more recently, what's gotten me keyed up about AI is that one of the prime contractors for our industry has said, well, we need to change the entire basis; we ought to make it based on AI. And when you base it on AI, you won't be able to regulate it. That's what prompted the last presentation I gave in November 2024.

Greg

Clay, just for reference, I was at DNV at the time, and I'm not defending the generic failure frequency approach in any way. But we smashed together data from different sources, and two of the primary ones were the OREDA database and the Marsh & McLennan database, which was actually from the fossil fuel power industry.

Clay

Right, right. Yeah. Ultimately, they still produce a Marsh & McLennan report, right?

Greg

Yeah, the Marsh & McLennan. But the other thing too, and you probably remember this: when we were putting those tables together, there was a lot more extrapolation and interpolation, in my opinion, on piping than there was on pressure vessels.

Clay

Yeah. And the other thing we never did on piping: piping was excluded from the absolute validation results and the relative validation results. They didn't do anything on piping, so there was no smoothing or adjustment to any of the data for piping. It kind of wasn't considered when we did those two reports, which were for vessels, tanks, towers.

Greg

And in fairness, AI wasn't anywhere near as prevalent then as it is today, and neither was computing horsepower. Now, with probabilistic methods and the things we can do with computers, we can do that stuff a lot faster, instead of telling the computer to compute and coming back the next morning for the answers.

Clay

That's been my experience, Greg. I remember many, many times the last thing I did before I went home at night was punch calculate at the unit level, because I knew it was going to be four hours, and you hoped to God you didn't have critical errors in it when you came back the next morning, or you'd have to do it again the next night.

Greg

And now we can run a few thousand Monte Carlo simulations in a split second, you know?
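
A minimal version of the Monte Carlo idea Greg mentions: sample an uncertain corrosion rate many times and count how often the projected wall falls below t-min. The normal-distribution assumption and every number here are illustrative only:

```python
import random

def monte_carlo_pof(t_actual, t_min, rate_mean, rate_sd, years, trials=10_000, seed=42):
    """Estimate probability of failure by sampling corrosion rates (mils/year).

    A trial "fails" if the projected wall thickness falls below t_min.
    Thicknesses in inches; normal rate distribution is an illustrative choice.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        rate = max(0.0, rng.gauss(rate_mean, rate_sd))  # no negative corrosion
        if t_actual - rate * 0.001 * years < t_min:
            failures += 1
    return failures / trials
```

For a 0.400-inch wall, 0.200-inch t-min, and a rate of 10 +/- 5 mils per year over 15 years, this returns a probability of failure around a quarter; with a benign 1 +/- 0.5 mils per year it returns essentially zero. This is the kind of computation that once ran overnight and now runs in a blink.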

Clay

The API 581 approach has stood the test of time. It's done well. It's just not particularly accurate. More of my passion in the last couple of years has been trying to influence some change and improvement around the piping piece specifically, because early on, a lot of folks tried to do piping RBI and it didn't work particularly well. I think most people abandoned it and moved on, like I did at multiple sites: we tried it, it didn't seem to work particularly well, so we limited it to the primary pressure vessels, excluded pipe, and stayed on a half-life kind of approach for piping. But in the last five years, I know of at least four majors that are using RBI for piping. In looking at their approach, though, I was still concerned about the answers they might be getting, because of the specific approach they'd taken: they were still averaging multiple components as a circuit and then plugging that data in. And I've shown in presentations at API that that's just not conservative.

Greg

Representative circuits, some people would say. Yeah.

Clay

Absolutely. If you stop and think about it, in any pipe circuit, if you've analyzed much data on piping, you know you may have 20, 30, 40 TMLs, but they're across multiple different components. Every component's got a different t-min, even though the pressure rating for all of it was Class 150 or whatever. You can't compare a corrosion rate in the root of a 3000-pound fitting to the 12-inch main pipe. It just doesn't work well when you're trying to analyze risk that way without recognizing the limiting factors: either the higher corrosion rate locations or the shorter remaining life locations. And we've shown that data mathematically at API meetings. We've shown that you can come up with a non-conservative future inspection, relative to an actual predicted failure on an individual component within a circuit, if you're just averaging across the board without any other considerations.
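
Clay's averaging argument is easy to demonstrate numerically. With invented data for three components in one circuit, the circuit-average view predicts a comfortable remaining life while the limiting component is already close to its t-min:

```python
def remaining_life(t_actual, t_min, rate_mpy):
    """Years until the component reaches t_min at the given rate (mils/year)."""
    return (t_actual - t_min) / (rate_mpy * 0.001)

# One circuit, three components with different t-mins and rates (invented data):
components = [
    {"name": "12-inch main run",  "t": 0.38, "t_min": 0.20, "rate": 4.0},
    {"name": "branch fitting",    "t": 0.20, "t_min": 0.14, "rate": 3.0},
    {"name": "small-bore nipple", "t": 0.15, "t_min": 0.10, "rate": 12.0},
]

# Circuit-average view: average everything, get one comfortable answer.
n = len(components)
avg_t = sum(c["t"] for c in components) / n
avg_t_min = sum(c["t_min"] for c in components) / n
avg_rate = sum(c["rate"] for c in components) / n
circuit_life = remaining_life(avg_t, avg_t_min, avg_rate)   # ~15 years

# Per-component view: the limiting item governs the circuit.
limiting = min(remaining_life(c["t"], c["t_min"], c["rate"]) for c in components)  # ~4 years
```

The averaged circuit looks fine for well over a decade, while the small-bore nipple reaches t-min in about four years: an inspection plan built on the average would be non-conservative for exactly the component most likely to fail.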

Greg

Right, right. You know, this is really not off base. But Clay, you and I used to have some conversations I really loved when we were both together back in the day at Equity. This is something that's always bugged me, working in the area of risk and risk-based inspection, and for me primarily fixed equipment reliability: how come we don't see more organizations, when they come up with what they think is a good technical approach, go back and test it on past failures to see how well it would have done? I don't see a lot of that. Do you? Or maybe I'm off.

Clay

No, no, I think you're absolutely right. Part of the reason, and I don't know if it's a human nature thing or whatever, is that we don't always learn well from our mistakes. We don't always capture the data necessary to come back and revisit it in the future. We don't keep good data. In some cases, sites have intentionally purged it from their history. I remember trying to research a couple of failures, and the site had actually intentionally purged all of the data related to that failure. I don't know if it was a legal concern, or they just didn't like that past history and wanted to make sure nobody could dredge it back up. I don't know, Greg. That's always kind of baffled me.

Greg

Same thing in RCM. People come along with all these big ideas that they want to charge you a gazillion dollars to implement, plus the software. It's like, well, you knew when this failure occurred, and you know everything around it. Why didn't anybody go back and test? Because there are times when I think we really do know everything around it. Anyway.

Clay

Certainly, if you're running an RBI program and you have an unpredicted failure at your site, there's a reason you failed to predict it. I used to specifically take metrics on this kind of stuff for years to try to improve our programs. I certainly would advocate going back and looking at why it was missed, whether from an RBI approach or just a standard inspection approach. What did you fail to predict? Was it that you didn't have the right operational data, or the right CML allocation, or a location-specific kind of problem? Why did you fail to predict something that would have allowed you to stay in front of it before it actually physically failed?

Greg

Yeah. I actually have a theory that that's one of the best tools we have for continuous improvement, because I don't care how good the initiative is, there will eventually be a failure, that's for sure, and we will learn from that failure to improve the program. But we've got to capture the data and go after it intentionally to make it happen.

Clay

One of my favorite quotes is: if you fail to learn from history, you're destined to repeat it. Yes, sir. The other big issue to me these days is around the current trend to try to right-size CMLs. I liked your version, Greg; you actually coined that phrase back in the day when we were at Equity. Remember that first work I did applying some statistics? That goes back to about 2000 or 2001, maybe. A lot of people were talking about CML reduction, but you had coined the phrase, I guess, either right-siding or right-sizing or...

Greg

CML optimization. Optimization was the word.

Clay

Yeah, I think that's the word.

Greg

We may add more, we may take away.

Clay

And we did; in a couple of cases we obviously recommended you add a lot more. But that seems to be more in vogue these days than when you and I were starting to do that kind of stuff back in the early 2000s. Every site I've gone to lately seems to have had a visit from a company trying to, in those cases, more reduce than right-size, I suppose. And I have seen some really scary stuff on that one. When I've gone back, pulled the old data, and compared it, it really, really concerns me. Because whatever formula you've got, either you're just looking at it and deciding, I can remove some of these, or you're overly basing it on some statistical routine: looking at the data and asking, how much can I reduce this and still come up with essentially the same equivalent average likelihood for that equipment? I've seen some stuff that really worries me, to be honest.

Branden

Same here, Clay.

Greg

Any pitfalls, Clay, that you could point out?

Clay

Well, one for sure. If you're looking at it purely mathematically, deriving statistics based on existing data, what that presupposes is: do you have the right data? What's the accuracy of that data? And did you take the data in the right locations relative to the damage mechanism? Those are three reasons a data-only approach is not good. You really want to make sure you've got good quality data. But even before that, that you've done a DMR, you understand what damage mechanisms are likely to occur, and you had a good inspection strategy relative to where the damage is occurring. If you don't have good data, you may have missed the worst-case location

Branden

entirely.

Clay

You know, the DMR is

Branden

a really important piece there.

Clay

Absolutely. It's the start for so much of the work that really needs to be done on all this stuff. So start with a good DMR. Then, the thing I've seen from auditing, and I've done lots and lots of auditing in my career, is that a lot of sites will go through the DMR, but then they never actually adjust their inspection program relative to what the DMR says. They put the DMR on the shelf and don't adjust their inspection plan. That's not good, but it seems to be a very common thing across the industry: they'll check the box, DMR done, but they never go back and change their strategy relative to the DMR. Obviously, that's where the rubber meets the road. If you don't adjust where you're inspecting and why, or how you're inspecting for damage, you're not going to take the benefit of the DMR.

Branden

For me, when you start talking CML optimization, and I talk about it a lot with folks, the first thing, I call it a susceptibility review, bringing back the RBI piece of susceptible locations: you've got to figure out whether the CMLs are in the susceptible areas, and you can only do that from the DMR.

Greg

Yeah. If your CMLs

Branden

aren't in the right area, there's no point in trying to add or remove; it's moving them. And then you've got to track them for five years and see what data you can get; you've got to have a couple of data points in there to go through the optimization effort. If people haven't gone down the path far enough, half of it is just doing a DMR and reviewing at that level.

Clay

Right, exactly. Of course, changing and adding CMLs, redoing drawings, and re-circuiting, all of that is tedious work, right?

Branden

Yep.

Clay

I know you can probably do it a lot quicker on the back end; I only really know how to do it from the front end of a couple of programs, and it's tedious and time-consuming. It's not easy to accomplish, so it costs you to do it, but it's the right step. If you're trying to analyze the data you have on an existing circuit and you haven't done that step first, you're likely to get bad data. At the very least you're going to get one-tailed data; you're only going to see half the picture if you haven't been inspecting in the right location. That's one of the biggest fallacies I see. The other side of it is, if you're just doing the statistical analysis, you ignore the fact that maybe you don't have all the right data in the right locations. A lot of times, I believe, people don't think through how to do a good statistical analysis on the data. One of the first things you want to determine is: based on what statistics you're trying to run, do I have a population of data that supports the statistical approach I'm using? If I don't have enough data to actually derive statistical significance, then I'm averaging very few numbers. In the extreme, it's like asking: what's the slope of a line between two points? You've got no clue; it may not be anything close to a straight line. So you really have to understand that piece of it. It has to be based on the right damage mechanisms and taken in the right location. Then, on your statistics, if you're using them to do a sampling statistic, which I'm a pretty big fan of, you've got to make sure you have a statistically significant population of data to derive the statistics from. Otherwise, you're just going to get garbage, and you'll never know it's garbage.
It will give you an average corrosion rate. You can always get an average corrosion rate with two data points. And if you're just blindly plugging this stuff into a program, getting an average, and then determining a sampling statistic and a confidence interval, you can calculate all of those without having the right amount of data in the right locations. It'll still give you an answer, and that's the scary part if you don't understand that.
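
The failure mode Clay describes, a formula that happily returns an average and a confidence interval from two readings, can be guarded against explicitly. A sketch, where the minimum-sample threshold and the normal approximation are illustrative choices, not a standard:

```python
import math
import statistics

def corrosion_rate_ci(rates_mpy, z=1.96, min_n=10):
    """Mean corrosion rate with a rough normal-approximation confidence interval.

    Refuses to report an interval when the sample is too small to mean anything:
    two readings always yield an average, never statistical significance.
    """
    n = len(rates_mpy)
    mean = statistics.fmean(rates_mpy)
    if n < min_n:
        return {"n": n, "mean": mean, "ci": None,
                "note": "sample too small for a meaningful interval"}
    half_width = z * statistics.stdev(rates_mpy) / math.sqrt(n)
    return {"n": n, "mean": mean,
            "ci": (mean - half_width, mean + half_width), "note": ""}
```

Two wildly different readings still produce a mean, but the function flags the sample as too small instead of dressing it up with an interval; a ten-point sample gets both a mean and a usable interval.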

Branden

Well, what do they say? Data can say anything you want it to.

Clay

Sure. Or the way we used to say it: data doesn't lie, but liars can figure.

Greg

Actually, somebody, I think you, Branden, asked about an example. Like Clay, I won't get into any specifics, but I've seen what can happen when you turn a data scientist loose without any context, without at least some understanding of why we do a DMR and what it means. I've seen that go seriously awry. And on the other hand, when you take the time to collaborate between a knowledgeable corrosion materials engineering specialist and a data scientist, man, some beautiful, powerful things can happen.

Clay

Sure, sure. I'm a full believer in running statistics on the data and looking at whether I can reduce the CMLs. It's a very valid approach, but you've got to think about all the things we just talked about. Do you have the right locations? Am I getting the right data? You may not even have the right corrosion rate assigned, because you're not looking in the right location. So you have to do those things to do this well. A tedious part of the job, obviously, is going back and re-looking at what strategy was employed at the time for the data you're collecting or trying to analyze. So you pull up an ISO, understand the damage mechanisms, and then meticulously think through where that damage was more likely to occur. Is it the six o'clock or twelve o'clock position on a horizontal run of pipe? How much effect could I have at small dead legs, or in high-flow areas? Look at the strategy that was employed on that circuit, look at the data they collected, and then start making adjustments to those TMLs. That is a very tedious process, and it's not always accomplished well with an inspector. Now, if you've got enough experience and you know enough about damage mechanisms, maybe with a bit of consultation from somebody that's more of a corrosion materials guy, certainly you can do it. But we don't have enough corrosion materials engineers in our industry to start with, let alone for working at that level of detail. It's not easy to do; I can tell you that from doing it. I did a review of all the injection and mix points at one site recently and helped them configure an RBI approach for it, which meant writing a complete set of rules for how to do that. Looking at each one of their ISOs and where they were collecting data, I made multiple recommendations on changing CML locations.
And that actually produced results. This is a major refiner, and they'd been collecting this data for a long time, but they had missed some key areas relative to what kind of damage could occur. From that, they were able to go back, and the last time they talked to me about it, they said they had found seven near misses: close enough to failure that they classified them as near misses. I don't know what number of CMLs they ultimately trimmed it to; I gave them some recommendations, but they were going to go back and re-look at it. But that's the key step: go back and really understand where you have your CMLs, what you were looking for, and whether you have the right techniques and strategy.

Greg

You know, Clay, I remember something you and I were both connected to, and I won't get into any geographic or company names, but it was a heater that failed. That unit had had RBI done on it about four or five years previously, and we were talking with the site about it: it had been three to five years since they last did the RBI, and my recommendation was to redo the DMR, to see if operating conditions were still the same. Well, fast forward a few years later, and the heater burns down. We went back and looked at that first RBI study that was done years earlier. We don't know who did the DMR. When we looked at the corrosion rate that was being used for the main header going into the heater, and this was in high-temperature sulfur service, they had used three mils per year to model that header. When you looked at the McConomy curves, it was 35 mils per year. But when you looked at the IDMS data, the corrosion rate was one and a half mils per year. My theory, and this is just a theory, I don't know, is that whoever assigned that three mils per year looked at it and said, well, if the IDMS is saying one and a half mils per year and we put in three, we're probably safe. Maybe that happened, maybe it didn't. But I feel, and again, only God knows what would've happened, that if we'd redone the DMR at that time, which means we would've had to revalidate the operating conditions, a corrosion-materials engineer, a real specialist, would've looked at that data and caught it. I think you remember that one.

Clay

I don't remember exactly that one, but certainly that's happened a lot of times. Yeah, it's interesting to me, Greg, if you remember back in the early days when we were doing RBI, we had a bit of an unwritten rule, right? And I still do this: I will use the worst case of the calculated rate, the estimated rate, or the actual measured rate. I start somebody off using that worst case, which essentially forces them to go inspect based on that worst-case likelihood, until they prove they don't have that kind of a rate. Realistically, you're challenging them. That starts you out a little bit conservative. Clearly, if you're using an estimated or calculated rate to start your RBI on, it can tell you to do a lot more inspections than maybe you have been doing in the past. But unless you can really prove that your corrosion rate data is good, meaning you've anticipated the damage mechanisms, you've collected the data in the right areas, you have a good-quality IDMS program, and you're doing all the right things for measuring corrosion rate, to me you always start conservative and force the site to come to grips with it and prove the rate isn't actually that high. We don't seem to do that much anymore. We seem to either base it purely on whatever corrosion rate data you have, whether or not that was a good program or the CMLs were in the right locations, or base it on somebody's opinion, and they may or may not have ever worked in that type of unit. In industry, it's a little scary at times, for sure.
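The "start conservative" rule Clay describes, seeding the analysis with the worst of the calculated, estimated, and measured rates until inspection data justifies a lower one, can be sketched like this. The function name and the numbers are the editor's illustrative assumptions, not anything from the episode:

```python
# Sketch of the unwritten rule: take the highest available corrosion
# rate among the calculated, estimated, and measured values, and note
# which source it came from. Illustrative only.

def governing_rate(calculated=None, estimated=None, measured=None):
    """Return (rate in mpy, source name) for the worst available rate."""
    candidates = {"calculated": calculated,
                  "estimated": estimated,
                  "measured": measured}
    available = {k: v for k, v in candidates.items() if v is not None}
    if not available:
        raise ValueError("no corrosion rate available for this circuit")
    source = max(available, key=available.get)
    return available[source], source

# Echoing the heater story's spread between modeled and measured rates:
rate, source = governing_rate(calculated=35.0, estimated=3.0, measured=1.5)
print(f"Start the RBI analysis at {rate} mpy (from the {source} rate)")
```

The point of the rule is that the burden of proof sits with the site: the model stays at the worst-case rate until good inspection data, from the right locations, demonstrates otherwise.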

Branden

You mentioned QA and data assurance and getting data in, and there's a lot of data being moved around, a lot of data we're taking in the industry now. That kind of leads us to a conversation around AI. I'm curious, where do you see the future for AI in our industry? You've worn a lot of hats; you've seen a lot of good and a lot of bad. There has to be a place for artificial intelligence to support us, especially on the global scale of refining and chemicals. Where do you see it fitting in?

Clay

It's probably almost unlimited in how it could fit in. In terms of where it should fit in, or what we should be concerned about: if you're using an AI routine to calculate anything like a risk, a likelihood of failure, a consequence of failure, or when I should be reinspecting this equipment, number one, it's incumbent upon you to understand the biases and what those calculations are actually being based on. Can you get all the way down into the details of how it's specifically calculated? I use AI now to do a lot of data analysis, but I'm very specific about it. I either have it use a model that I created, or, when I let it run on its own, I ask it to show me the calculations and considerations it came up with for doing that analysis. And then I look through for key things. I'll give you a quick example, and then we'll finish talking about this. I plugged in a circuit that I was having some problems with, and I wanted to see what kind of results I'd get from AI. I was using ChatGPT, I think the fourth or fifth version. I fed it all the data off of spreadsheets, and I said, what would you assess the average corrosion rate to be, and the confidence interval for that corrosion rate, to represent the circuit? And are there any concerns? It came back, gave me all the calcs, and said, your average rate is this, your confidence interval is this.

And because I knew the data I'd given it, I knew it had only a couple of sets of readings. It wasn't something that had 10 or 15 surveys; it was only two or three surveys of data. So I said, did you consider that some of these data points were based on only two readings? Did you consider the statistical significance of the population of data for these calculations? And ChatGPT came back and said, oh, great idea, we should do that. If you're depending on it to think of every problem that could occur within a complicated set of statistical analyses, you've made a mistake to start with, right? I happened to catch that one because I'd looked at the data. So you've got to be really careful about asking open-ended questions without constraining it to the data resources you want it to pull from. The biggest mistakes in AI these days are often made because it's pulling bad references off the internet. That's the key to me: I think we'll use AI more and more, obviously, but like any program, you have to understand the bias and the limitations in those programs. And if you're going to hang your company's livelihood on it, producing chemicals or whatever it is you're producing, it's incumbent upon you, especially when people's lives are literally at stake, to understand it and make sure you're getting good answers from it. That was the genesis of a concern from a presentation I gave at the fall MI executive leadership meeting. It was very clear that AI was going to come and be used for things like RBI-type analysis, or RCM, or whatever type of analysis. So we knew it was coming, but the key is you have to understand it. And I think to a degree we can regulate it.
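The pitfall Clay caught, a confidence interval computed as if the sample were large, is easy to demonstrate. With only two or three readings, an honest Student's t interval is dramatically wider than the normal (Z) interval a naive calculation produces. The rate values below are the editor's made-up numbers, not data from the episode:

```python
# With a tiny sample, a Z-based confidence interval badly overstates
# confidence; the t-distribution's critical value for n-1 = 2 degrees
# of freedom is more than twice as large. Values are illustrative.
import statistics as st
from math import sqrt

rates_mpy = [2.1, 3.4, 2.8]          # only three measured corrosion rates
n = len(rates_mpy)
mean = st.mean(rates_mpy)
s = st.stdev(rates_mpy)              # sample standard deviation

z_95 = 1.96                          # large-sample 95% critical value
t_95_df2 = 4.303                     # t critical value, 95%, 2 dof

z_half_width = z_95 * s / sqrt(n)
t_half_width = t_95_df2 * s / sqrt(n)

print(f"mean rate  : {mean:.2f} mpy")
print(f"Z interval : +/- {z_half_width:.2f} mpy (overstates confidence)")
print(f"t interval : +/- {t_half_width:.2f} mpy (honest for n = {n})")
```

This is exactly the kind of check Clay says he forces by asking the tool to show its calculations: the math itself is trivial, but the tool has to be told that sample size matters.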
You can't necessarily check every reference, but there are a couple of ways I think we can do this, and a lot of different industries, not just ours, are struggling now with how to regulate AI. Like any program, you have to understand the basis for it, and you have to make sure it's correct, or you've got to devise a way to test and check it. Ultimately, even if you're using RBI today, very few people could quote all the nuances of how the RBI is calculated in their plant, which is another problem. But failing to understand that is no worse than failing to understand how AI is generating the answers you're hanging your future inspection dates and potential failures on, right? So you have to understand that. A couple of ways we could do this: one, I suggested that the company providing the software maps out the math on all the equations and provides it to the company they just sold it to. Number two, whoever's writing the software takes on some financial risk for anything that happens; that'll help fix a lot of problems, because they'll be a hell of a lot more careful when they're creating this stuff. And the third way I suggested is creating a compendium of example problems that have been worked out, either known failures where we've worked out all the details of the math in an RBI approach, for comparison, that you could run through any new AI approach or any new qualitative or quantitative approach. Worked

Greg

Examples.

Clay

Yeah, essentially a worked example, or a marquee data set, as some people call it. And you could certainly do it not only from known failures; you could do it from QRA. If you had a very detailed quantitative risk assessment performed, and there are bunches and bunches of those, we could actually use that as the basis for a full set of marquee data examples to test the new software solutions that are being generated. Those were my initial thoughts, three ways we might approach doing this, to make sure we were getting good-quality data.

Branden

That's a neat idea, the marquee data set, if you want to come up with some new model, some new method of determining failure, time till failure, because that's essentially what we're trying to do, or time till inspection, or risk

Clay

consequence too. Yeah.

Branden

Yeah, any of it. But the final result is an inspection, an action, time till action. And having a data set of those known failures and then testing a model to see whether it misses any of them: feed it the full data set of all the information that's required, that we have and that we know. That's a neat idea.

Clay

Hmm, yeah. Ultimately, if it's a quantitative result, you could actually score it relative to the actual results. And then you could look at it: okay, if I'm not getting very close to this, there's got to be a reason why; what's the difference in the calcs, and why? But you've almost got to have a marriage there with the companies producing the AI solutions, and you want to make sure they're not just creating a fix for the one incident where they weren't very accurate. You've got to be really careful about that too. But ultimately we've got to come up with a way to make sure we're getting good, reasonable answers that give you a good margin of confidence in what the tool is telling you about the potential future failure date or when you need to reinspect.
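The scoring Clay sketches, running a candidate model against a marquee data set of known failures, might look something like the following. The case names, lifetimes, and the one-year tolerance are all the editor's hypothetical assumptions; the key design choice is that only non-conservative misses (predicted life longer than actual life) count as failures of the model:

```python
# Illustrative scoring of a model against a "marquee data set" of
# known failures: flag cases where the model predicted a longer
# remaining life than the equipment actually had. Hypothetical data.

marquee_cases = [
    # (case id, actual years to failure, model-predicted years)
    ("heater-header", 4.0, 9.0),    # non-conservative: model missed it
    ("dead-leg",      6.0, 5.0),    # conservative miss: acceptable
    ("mix-point",     3.0, 3.2),    # within tolerance
]

def score(cases, tolerance_years=1.0):
    """Return case ids where the prediction was non-conservative."""
    return [cid for cid, actual, predicted in cases
            if predicted > actual + tolerance_years]

non_conservative = score(marquee_cases)
print(f"non-conservative misses: {non_conservative}")
```

As Clay warns, a vendor could game any fixed benchmark by patching the specific cases it fails, so a real compendium would need held-out cases and periodic refresh.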

Greg

So you're saying transparency

Clay

is important? I think so. I don't like the idea of the black box, right? That's what I titled the presentation when I gave it, and I gave a couple of examples. I gave an AI example on Tesla, the first fatality. Essentially, the guy was trusting the car to drive for him. A tractor-trailer pulled across his path; the radar system the autopilot employed was aimed down lower, so the car went under the box trailer that was completely across the freeway and never saw it. And the camera system: the box trailer was painted white, and against the sky the camera couldn't distinguish it well enough to identify it as a problem. That was one. There have been something like 47 fatalities on Teslas from their version of the autopilot, and they claim it shouldn't be driving the car for you; you need to pay attention as the operator of the car. And now Tesla, of course, has cameras; it watches and knows everything, every adjustment, everything you're doing, including what you're physically doing at the time of an accident, with their black box now. The other one was a military example, where they were testing AI to identify threats and then develop a missile firing solution, and it required an operator intervention to agree to it and authorize the launch of the missile, or not. Pretty interesting stuff for sure,

Branden

man. Yeah. The hallucinations, I think is what they've called it, where it just starts making things up, and

Clay

Oh, yeah.

Branden

But the reasoning, I mean, you can't argue with some of that reasoning there. Flawed reasoning, but at least it's reasoning, which is interesting.

Clay

Yeah, it's definitely a bit scary, you know? I've realized now that I get more predictable and better results when I provide the frames of reference that I want the AI to use. It's a bit more of a program than I thought. I first started thinking about AI as something where I just log in, ask it a question, and let it search the world to come up with good answers. But it can't understand the difference between fake data and real data; it can't judge, I don't think, a better approach versus a worse one. So what I've done more recently is code in a whole series of things, including reference documents and considerations that I want it to use for doing statistical analysis or whatever I'm doing. It's got a whole series of instructions, almost like program parameters, if you will, for what it needs to consider. And that keeps it more grounded in something I understand, or at least I know where it's coming from. Now, I don't know if you'd ever be able to get that out of a third party producing, say, an RBI tool; I don't know if they're willing to do that. Maybe with their clients they're willing to share it; they certainly should be able to do that with a client that's bought their software. But I think the best way is some independent checking on the quality of the data and answers you're getting, relative to, in my case, what I consider to be reality: actual known failures or worked-out QRA examples. I think that's the best way to do it.

Branden

Yeah. What's your favorite prompt hack that you've been using lately? Maybe not hack, but your favorite prompt that you're using to help. Mine right now is I'll ask it something, and then I'll follow it with, "and ask me any questions that you need to make sure that we get a quality response." Then it gives me 5, 10, 15 questions that I go back and answer, which helps hone in on what I'm asking.

Clay

Yeah. You probably have a lot more experience at this than I do. I try to frame up the problem as fully as I can and then provide data and/or documents that I think are reasonable examples of what I'm trying to do. And then I'll always tell it to list out and show the math it's basing the answer on, and I'll go through that to look for any problems or issues. That's how I discovered, in the case I gave earlier, that it wasn't calculating any statistical significance on the data population it was using to derive the averages, the critical Z value, and the confidence intervals. So you're right; asking it more questions is a good holistic practice for sure. You know, I don't know how to get around the problem that it didn't recognize on its own that it should have checked for statistical significance. I knew it should have, but that's because I've done it a bunch. There's a lot of stuff I don't know to ask it, so how do you get around that? I don't know, Branden. That's a good question.

Branden

John Reynolds, we had him on a little bit ago, and he says he uses it; he's dabbled with, I think, something like ChatGPT. His take on the whole thing is that it's a really good grad student: you can give it a task and it can go do it, but you still have to go back and check it, just to make sure.

Clay

Yeah, yeah, absolutely.

Branden

Well, good. I appreciate your time, appreciate you coming on and talking. You've got a lot of experience, you've worn a lot of hats, and you've got a lot of opinions. It's really nice being able to sit down and talk with you. So, yeah, enjoyed it. Thanks for having me. This was

Greg

great, Clay. This was great, man.

Branden

Thank you for listening to Upon Further Inspection, a mechanical integrity podcast. This episode was co-created by Inspectioneering and CorrSolutions. Our producers are Nick Schmoyer, Jocelyn Christie, and Jeremiah Wooten. This podcast is for informational purposes only and does not constitute legal or professional advice. Listeners should seek their own qualified advisors for guidance. If you enjoyed this episode, please join us next time, wherever you listen to your podcasts. Until then, stay safe and stay informed.