Transformation in Trials
A podcast about the transformations in clinical trials. As life science companies are pressured to deliver novel drugs faster, data, processes, applications, roles, and change itself are changing. We speak to people in the industry who experience these transformations up close and make sense of how the pressure can become a catalyst for transformation.
Audit Trails, RBQM, And Agentic AI Explained
Ever wondered why a clean CSR still leaves you unsure how a trial actually ran? We dive into the hidden layer that explains the “how”: audit trails across EDC, IRT, eConsent, and ePRO. With guests Ellis Hiroki of Study OS (now rebranded siteroAI) and Nechama Katan of Wicked Problem Wizard, we unpack how E6(R3) shifts sponsors from “we can export logs” to “we continuously analyse them,” and why process measures—not just outcomes—are essential to real RBQM.
We break down the obstacles that keep teams stuck in CSV purgatory: fragmented vendor data, missing standards, timestamp chaos, and brittle one-off scripts. Nechama shares pragmatic use cases that matter—like ePRO entries after discontinuation or suspicious mass updates—and how to prioritise by likelihood, detectability, and severity. Ellis explains why general AI isn’t enough, and how a purpose-built, agentic approach uses models to plan steps and generate validated SQL or code, rather than hallucinated answers. The result is auditable reasoning, repeatable checks, and faster paths from a clear question to a trusted signal.
From there, we connect signals to action. Analytics without workflow creates noise; analytics with RBQM workflows produce root causes and durable fixes. We explore how audit logs become the first true process dataset in clinical operations, and how broader operational inputs—logistics, communications, and training—can also be measured when systems are API-first and integrated. If you’ve ever watched a leadership question trigger a scramble in stats programming, this conversation shows a cleaner route: experts ask in plain English, the system produces valid code and dashboards, and teams focus on insight rather than plumbing.
If this resonates, follow the show, share it with a colleague, and leave a review. Your feedback helps us bring more practical, high-signal conversations to the clinical trials community.
Transformation in Trials is a podcast investigating how we can change life sciences to get treatment to patients faster.
Getting treatment to patients faster requires well-functioning organizations. How do we do that? Ivanna Rosendal has written a book called Maneuvering Monday, about how a group of people try to make their organization better. You are certain to have a good laugh at their expense, and potentially get inspired to help make your own company better.
I have been independently producing this podcast since 2021. You can now support the show by Buying Us a Coffee. Each episode costs 99 USD / 85 EUR to produce.
Join the show as a guest - apply via this Form.
________
Reach out to Ivanna Rosendal
Join the conversation on our LinkedIn page
Welcome to another episode of Transformation in Trials. Today I have two awesome guests in the studio with me. We're gonna talk about audit trails, we're gonna talk about agentic AI, we're gonna talk about RBQM, and what's happening in regulation right now that kind of makes all those things related to each other. But before we dive into all of these topics, I want you to know who my guests are. Ellis, would you introduce yourself first?
SPEAKER_00:Sure. My name is Ellis Hiroki. I'm the CEO and founder of a company called Study OS, bringing Agentic AI to Pharma. We specifically focus on the clinical data management and RBQM elements. Before that, I started my career off at Microsoft in the big data, machine learning, and analytics space, where I worked on the original Azure Synapse Analytics project, which has since become part of a larger offering in the Azure ecosystem. Following that, I had various successes and failures across different healthcare startups. And right now I'm spending all of my limited brain power and energy working in the pharma space with folks like Nechama to better serve this industry.
SPEAKER_02:Awesome. Nechama, would you mind introducing yourself? Yeah, so my name is Nechama Katan. I'm now officially the chief wizard at Wicked Problem Wizards. I have degrees in math and statistics, which I call liberal arts engineering degrees: I'm qualified to learn anything and to do nothing. I spent the last eight years or so in pharmaceuticals, having joined a large pharma company where I implemented a global RBQM solution. And now I'm working with tech startups, tech integrators, and sponsor companies to understand how they can really go to scale and be effective in their RBQM and audit trail solutions. I also run an audit trail working group for the eClinical Forum, which is getting ready to publish a paper, internally for now, though it will be open to the public after a short while.
SPEAKER_01:Awesome. Well, before we dive into further topics, I want to make sure that we bring along those of our listeners who are new to the space. So if I were to ask both of you: how would you explain what an audit trail is, and what is it for?
SPEAKER_02:I'll do the business side and let Ellis do a little more of the technical side. So in a regulatory environment, you've got a couple of requirements. One is you have to say what you're gonna do, you have to do what you said you were gonna do, and then you have to show you did what you said you were going to do. And if I was doing this on a piece of paper, I would write out my instructions in my SOP. So I would say what I was gonna do, then I would do it, and I would check off each time that I did each one, and then I would have a document, a piece of paper that says, hey, that's my proof that I did what I said I was gonna do, right? There's this phrase in pharma that says if it wasn't documented, it didn't happen. Right? So the neat thing about computers is that you don't have to do that anymore. If you build the right systems, and you build these systems to be, I call it designed for validation, built for these types of questions, the system tracks who did what, who entered data, who logged in, when did they log in. It's all of what they call metadata that's been tracked around it. Who did the data entry, when did it happen, date timestamps, all of that cool stuff. What we discover, and what's magic about audit trails in clinical trials, is that the data that comes out of the trial, you know, the patient's blood pressure and when did they get dosed and when did they sign the eConsent, all of that information is interesting for the person analyzing whether or not the trial was successful, right? Was the pain med effective or not? What it doesn't do, except very indirectly if you squint and look cross-eyed, is tell you how the trial ran.
If, on the other hand, you've been to a hospital in a Western country recently: the person pulls out their badge, they scan in, hey, I'm giving you a dose of a medication, they scan your wrist bracelet, right? Everything is scanned and tracked in an audit trail, so you can see the whole process. So the audit trails for the EDC, while it's only data entry, and the audit trails for the IRT or the eConsent or the eCOAs and the ePROs all start to show us what actually happened at that site, and give us this really powerful ability to understand what the behavior at the site was and whether the data was collected in a way that makes it fit for purpose. So that starts to answer the questions in E6(R3): do you have data that's fit for purpose? Is the process that the site follows consistent? Does the audit trail mirror consistent behaviors? Or are they putting in one data point and changing it six months later for no reason?
SPEAKER_01:That's a really good explanation. Ellis, maybe you can tell us more about how does one log an audit trail?
SPEAKER_00:Sure. I'm gonna repeat some things. One tacit assumption in that was that the systems are actually built to be CSV and Part 11 compliant, which may not actually be the case. But let's just say you're working with a vendor in your clinical study, and they are actually CSV and Part 11 compliant. As a result, every action taken in the system should have an entry in the audit log that describes what happened, right? The who, what, when, where, why. And so I think one of the big challenges in the space to date is that there isn't a lot of predefined analytics, but more than that, there isn't really a setup or a flow that enables people to do these kinds of analytics. A couple of features of audit trails, or audit logs, make them particularly difficult for the lay user to analyze. One, there's just a lot of them. For every single entry or line item in a clinical study, you're gonna have at least one, if not many more, log items. So it's gonna be some factor of the number of data points in your study already. If you have three million lines in your study, then you might have 15 million log entry lines, right? And as much as we like to tell ourselves that we're using advanced analytics and good systems for this, the reality is a lot of people are still using Excel, and this goes beyond the capabilities of anything you can load up into a spreadsheet and look at yourself. So that's one huge issue. The second one is that audit log entries are by definition isolated from the rest of the study data. That means you often have to work with the vendors, and there are often roadblocks to acquiring the data itself. Then there's lack of standardization, because there isn't a standard that predefines what audit exhaust should look like, so everyone's gonna do it a little bit differently.
It's not a huge problem, but it's just another roadblock and another issue that you're gonna have to deal with. And then finally, in order to actually generate these insights, it's not just about, hey, can I look at the audit log independent of the actual data; it's about marrying the clinical data with the audit log information to actually see, essentially, the journey of the study. And all of that requires things like the ability to host the data in a specific place, the ability to run the analytics, the ability to actually understand what you're looking for, and so on. A lot of those are both technology infrastructure and skill requirements that frankly the industry doesn't really have set up. And I think that's a big reason why the eClinical Forum working group has focused so much on what it would take to actually do this. But the last thing I'll say on this is that the regulators make it very clear with R3 that there is an expectation for sponsors to do it, and I think it's something we should all be focusing on.
SPEAKER_01:Well, my first experience with audit trails was back when I started in life sciences 15 years ago. The company I was working for was facing an audit, and we got our audit trail export in a CSV file from the vendor, and my first task was to make that into a spreadsheet. It sounds like we're actually not too far away from that reality 15 years on, but we can no longer afford to be in that reality because of the regulatory demands. Is that correctly understood?
SPEAKER_02:E6(R3) has made it very, very clear that audit trail analytics is a requirement. It is not enough to say your system has an audit trail that you could pull up; you need to have regular planned reviews. Regular planned reviews, in my opinion, and in the working group's opinion, require automated data flow, because otherwise it's practically impossible. You can do it without automated data flow, but it's very hard. And then you need analytical tools to manage that data so you can figure out what to do. And even before that: what are you looking for? What are you reviewing it for? 15 million lines of data, I'm looking for what? Right? That's not a filter-and-sort or a VLOOKUP at that point. You have to have specific questions.
SPEAKER_00:That's another element. The other thing to keep in mind is that even though it's been very slow, the adoption of eClinical solutions has increased, which means there are more data points with audit log exhaust to be looking at, as opposed to maybe 15 years ago, when things were still largely on paper. I'm not gonna say that everything's digitized now, but that is the trend, and we should act accordingly.
SPEAKER_01:Yeah. But it's also a question of having those many systems that actually touch the clinical trial. If we really want to make sense of what happened in the trial, we also need to make sure that we have access to those data logs. Some of them may be with our CROs, potentially some with our CROs' subcontractors, as a sponsor company. So it's creating that whole data infrastructure first before we can even look at the data, and then running into standardization issues. However, just looking at a single system, what are some of the challenges we're seeing in our industry for just making sense of an audit trail?
SPEAKER_02:So I think the first step is clinical clarity, or clinical relevance. There are a lot of things you can go look at, right? And not everything is equally important. And this is an industry that likes to say, oh, we're gonna go check every single thing. No, you're not gonna go check every single thing. You can't. And you don't really want to. The eClinical Forum and SCDM published a paper in 2021, which was kind of the audit trail paper that everyone references. And in it there were, I don't know, 30 or 40 use cases listed. They were clustered, but they weren't prioritized. So the first thing to do is to go through and prioritize those use cases, which the new paper will have done. If you're a member, reach out to me and we can show you that paper. You prioritize high, medium, low, based on how likely it is to happen, how hard it is to detect, and what the severity is if it happens. Let me give you a couple of different examples. If you have a patient on an ePRO device, a patient-entered form, and they're entering an assessment, and they leave the trial because, God forbid, they've died or something, and those ePRO entries continue to happen, that tends to lead regulatory people to question who was entering that data to start with. Now, when these conversations first came up and I first started having them with industry, someone said to me, oh, well, have your ePRO vendor do the analysis. But the ePRO vendor doesn't know the study data, doesn't know if that patient has discontinued from the trial. They know if that patient was unenrolled from the ePRO, but they don't know if they've been discontinued from the trial. So you need to be joining data for the interesting questions. This is where the detectability gets hard. You need to take your study's subject status page and your ePRO data and make sure that every patient who is no longer in the trial is no longer entering ePRO data.
Or that you don't have other weird ePRO patterns, like devices known to be sitting at a site, or devices being used by multiple patients at home. It's those types of data checks. And so you need to pick some questions, then you have to explore the data to see what the useful analyses are. And this is where the industry has really gotten stuck. There's been a lot of progress in the last couple of years in getting the data moved over. But once you get the data moved over, that data gets thrown to someone like me, in my former role, or a PhD statistician, a very expensive, US-based resource, to try to make sense of it. The way someone like me does analysis is I open up JMP or SAS or R and I do a one-off analysis, and then say, oh, isn't this cool? And everyone says, yeah, that's really cool. Now do it every week. And you're like, no, that's not my job, right? And then I try, and I can't. I've done this, right? This has happened. This repeatability is a problem. So what's cool about an agentic system like Study OS is that that analysis now happens inside a tool that itself can track everything. You open up a chatbot and ask a question in English, which, by the way, means you no longer need me. Anyone, my boss, could have asked that question. Someone sitting in a quality meeting can ask that question. And it takes the data file and understands the data file in context. Because you could ask ChatGPT or Gemini, but if it doesn't understand the context, it's not going to help. And we can give some examples, but it understands the context. It writes SQL, which is repeatable, validatable, built for validation. That SQL gives you an answer that can then be run next week. If you like what you see, dump it to a dashboard and run it over and over and over again.
So that gives you a way, for the first time, to really explore the data, figure out what's useful, figure out which questions make sense to ask, which ones you run and you're like, oh, there's nothing there, or, I would love to ask that question, but I'm not gonna tie up someone, either local or in a 10-hour time zone difference, to run that analysis and get back to me in three days. You can just spin this up and query it and explore it. It's like Play-Doh: you've got to get your fingers dirty and play with it and build things. Does it do this? What works and what doesn't, and what's useful and what isn't? It's not something that's so well understood that you can just stack it up and build it. It's not ready for full-time production yet. You need that exploratory root cause analysis capability. And so you need tools to do that in.
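To make the ePRO-after-discontinuation check concrete, here is a minimal sketch of the kind of SQL an agentic tool might generate and rerun. The table and column names (`subject_status`, `epro_entries`, `discontinued_date`) are hypothetical; real EDC and ePRO exports differ by vendor, so this is only an illustration of the join Nechama describes, using Python's built-in `sqlite3`.

```python
import sqlite3

# Hypothetical, simplified schemas: real vendor exports will differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subject_status (subject_id TEXT, discontinued_date TEXT);
CREATE TABLE epro_entries   (subject_id TEXT, entry_date TEXT);
INSERT INTO subject_status VALUES ('S001', '2024-03-15'), ('S002', NULL);
INSERT INTO epro_entries VALUES
  ('S001', '2024-03-10'),  -- fine: before discontinuation
  ('S001', '2024-04-02'),  -- suspicious: after discontinuation
  ('S002', '2024-04-02');  -- fine: subject still enrolled
""")

# Flag ePRO entries dated after the subject left the trial.
suspicious = conn.execute("""
    SELECT e.subject_id, e.entry_date
    FROM epro_entries e
    JOIN subject_status s ON s.subject_id = e.subject_id
    WHERE s.discontinued_date IS NOT NULL
      AND e.entry_date > s.discontinued_date
""").fetchall()

print(suspicious)  # [('S001', '2024-04-02')]
```

Because the check is plain SQL, it can be reviewed once, then scheduled weekly against fresh exports, which is exactly the repeatability a one-off JMP or R analysis lacks.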
SPEAKER_01:And how far are we with the agentic AI portion on top of the audit trail? How does it work?
SPEAKER_00:Well, before I say anything, I'm gonna caveat that these systems are all changing so quickly that by the time some listener hears this a year from now, all of this could be completely outdated. So I'll just talk about what exists today, as of this recording. A lot of the large language models that people play around with are not connected to your study data in general. And even if they were connected to your study data, which some companies have perhaps set up contracts for, they're not really purpose-built for data analysis. If you've ever heard the term hallucination: these systems are very eager to give you what they think is the right answer. I'm anthropomorphizing a bit, but that's essentially what's happening: it's trying to predict the next token, the next set of words, that's most likely to satisfy the prompt. And so it can hallucinate, come up with an answer, because it thinks that's what you want to have. And obviously that can't work in any kind of regulated system. So first: you cannot rely on these systems alone to provide you answers. They're really good at search-engine-type stuff, so for that kind of Q&A work, it's fine, but for anything that requires regulated workflows, they're insufficient. You have to avoid the hallucination problem, and you need to have it connected to your data. So what we do that's very different is, instead of using these AI models to provide you a direct answer, we use them to create the code that then runs against your data, and that provides you the answer. That distinction is really important, because we don't allow the model to provide the answer directly, so there's never a hallucination.
What it does is translate the question into SQL, or in the future into Python or R or SAS, and that essentially gets run on top of your data. And what does that grant you as an industry? One, it's auditable: you can see exactly what code was run at what time. It's validatable: you can review the code itself, so anyone with experience in that language can check the outputs. And it's repeatable, which Nechama mentioned earlier: by virtue of running it in the system, you can have it run every day or every week or every month and be certain that it's the exact same thing running every single time. So I think those are the key components. Now, as to how it pertains to audit trails or anything else, the system itself is actually generic, in that it can run against any number of clinical data sources, audit logs included. It is a difficult thing to make this system work, in particular for clinical studies. I mentioned this in a talk I did for CDISC around their DDF USDM standard, but the kinds of queries, the kinds of prompts, that we have in clinical trials are significantly different from the vast universe of sales and marketing data that most of these models are trained on. And that's the other problem with the generic models: they're all based on sales and marketing information. If you ask a question that is temporal, for example, it'll assume that you're talking about the fiscal quarter or the last 90 days or something like that, which would be a completely ridiculous assumption here. No one cares about how the patient's vitals looked in the last fiscal quarter, right? It's all based on the protocol, so there's a lot of context that you need to add as well. So there's an element of: one, it needs to be connected; two, it needs to use code so that it doesn't hallucinate.
And three, it needs to be purpose-built for the industry, so that it can answer questions in a way that makes sense for anyone working in the space.
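The generate-code-not-answers pattern Ellis describes can be sketched in a few lines. Everything here is an assumption for illustration: `generate_sql` stands in for a real model call, the `ae` table is invented, and a production system would need far stronger validation than a read-only check. The point is only the shape: the model produces SQL, the SQL is logged and guarded, and only the executed query produces the answer.

```python
import sqlite3

audit_log = []  # every (question, generated SQL) pair is retained for review

def generate_sql(question: str) -> str:
    """Stand-in for an LLM call: a real agentic system would translate
    the plain-English question into SQL here."""
    return ("SELECT subject_id, COUNT(*) FROM ae "
            "GROUP BY subject_id ORDER BY subject_id")

def run_validated(conn: sqlite3.Connection, question: str):
    sql = generate_sql(question)
    # Guardrail sketch: the model may only read data, never modify it.
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("Generated statement is not a read-only query")
    # Logging the SQL keeps the reasoning auditable and repeatable:
    # the answer comes from executing code, not from the model directly.
    audit_log.append((question, sql))
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ae (subject_id TEXT, term TEXT);
INSERT INTO ae VALUES ('S001','Headache'), ('S001','Nausea'), ('S002','Rash');
""")
answer = run_validated(conn, "How many adverse events per subject?")
print(answer)  # [('S001', 2), ('S002', 1)]
```

Because the stored SQL, not the model, is what gets rerun each week, the check stays identical from run to run, which is the repeatability and validatability argument made above.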
SPEAKER_01:That's an interesting description, and it makes a lot more sense. I'm also identifying and acknowledging that this tool that we can use to figure out what's up with our audit trails has its own audit trail that is auditable.
SPEAKER_00:There's a level of introspection there, where you can ask questions about the system itself, which I think is kind of funny. It's a little meta.
SPEAKER_01:Yeah. That's neat. And I can suddenly see how this could become a part of the landscape for clinical trials, because it is so robust.
SPEAKER_00:Yeah, the big thing that comes to my mind is that when I look at the space, there are a lot of subject matter experts. They already know what they should be looking at. They already have an idea of what issues could happen in the study, right? And so there's this perspective that sometimes people have where the tools are trying to replace the expertise. And I don't think that's the case. What's really there is that people have intent, right? They know what they want to look for. They want to reconcile this data set, or they want to examine that specific data set, or something else like that. But the gap between them and actually getting to that analysis is quite large, right? Because most of the time there's some kind of technical gap that exists. Maybe they're somewhat proficient at SAS, but not enough, or they lack the infrastructure to rerun it every single time. So there are a lot of these little things that exist in between, and that's where a lot of the costs are borne. So when I think about what we're trying to do here, it's: how do we use these kinds of tools to really close the gap between a user's intent and the outcome of actually putting a new control or a test or a check into the study, or running an exploratory analysis? That's the way we think about it.
SPEAKER_02:Let me give an example, Ivanna, of some of this temporal stuff, to put it in plain English. Think about the last time you asked your data manager or your central monitor or your RBQM expert to figure out: between which visits did adverse events occur? Or between which visits do most patients start taking concomitant medication? Ask your average data person or clinical trial person this. An operational person would be like, I don't touch the data sets, right? A data manager would say, okay, you want me to do what? So they've got one Excel file with one tab for each CRF page. You've got your study, you've got your AE page, you've got a visit-date page. You're gonna figure out pretty quickly, if you're smart, that not every patient has an adverse event. So now you need to take your demography page and match it to your adverse event page, so that you've got a whole list of all the patients with the AEs they had. Now you've added null values. By the way, nulls in data are just, really, they're zeros. You're dividing by zeros, okay? But wait, there's more than one AE per patient, but there's only one demography record per patient, right? Your brain starts to explode. So now I have to join my demography intelligently to my adverse events. And then I have to get date times to work. Okay: as a society, we can't agree on European versus US date formats. Computers are even worse about that, right? And, oh, now you've opened the file in Excel, so Excel has fixed a date time format. You close it and open it in R or SAS, because you've given it to someone who knows something more about programming, and they can't parse the date time. I've done all this, right? Just getting the data set. And what this tool does is it says, oh, okay, an AE is a log page, you don't have visits attached to the AEs. Often your first person would say, oh, I opened up the AE page, there are no visit dates, I can't do the analysis, right? Yeah.
And so it understands what visits are, and that you have to open up the AEs and attach each AE to the visit that happened before it and the visit that happened after it, and glue it in between the two. And then it can run, you know, build out the data file and build the analysis. Without a tool helping you do this, it becomes a mess, even for someone like me who knows what they're doing, to do it reliably. And to write the SQL: I don't know about you, but I don't write SQL natively. And I can't get the date formats to work in SQL either; that's my other problem. I used to work for Kaiser Permanente years and years ago, and I had a sticky note on the side of my computer with four different systems of SAS date formats, because in Teradata SAS did something different than it did elsewhere. It was horrible, right? Just your basic stuff. And that's not adding any value. At the end of the day, what the person sitting in data management needs to be doing is getting insights, not proving that they can write SQL. Or you build a Cartesian join and your database administrator calls you up and yells at you. I've done that; I've taken down their systems. Your scratch drive has exceeded tolerance, we're going to shut your job down, right? All of this stuff should not be my job. I should be able to ask a question in English, which you can do in Study OS, you know: between which visits do most AEs occur? And it gives you an answer within three minutes now, on a moderate-size study. Right? It just changes things, so that I don't have to worry about getting into my car with a screwdriver set and a tool kit, making sure it works, checking all of my tires. That's what it used to take to run a Model T. Back in the days of Model Ts, you had to be a mechanic to start your car. Now I just get into my car, and if I'm someplace like San Francisco, it drives itself.
But we're getting more and more towards the technology becoming invisible, as it should be, so that I can focus on: wait, I need to get from point A to point B. I can stop having to worry about every single step that I'm doing, which is what data managers and central monitors are doing now. It's like, wait, do I clutch first and then brake? And all they want to do is get from point A to point B, and they've lost track of even where they're driving. Getting out of the garage is a problem.
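The AE-to-visit attachment Nechama walks through, finding the visit before and after each adverse event, can be expressed as one correlated query. The schemas here (`visits`, `ae`, `onset_date`) are invented for illustration, and real studies would need to handle missing or partial dates; this is only a sketch of the join logic, again using `sqlite3`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (subject_id TEXT, visit_num INTEGER, visit_date TEXT);
CREATE TABLE ae     (subject_id TEXT, onset_date TEXT);
INSERT INTO visits VALUES ('S001', 1, '2024-01-01'),
                          ('S001', 2, '2024-02-01'),
                          ('S001', 3, '2024-03-01');
INSERT INTO ae VALUES ('S001', '2024-02-15');
""")

# For each AE (a "log page" with no visit attached), find the last visit
# on or before the onset date and the first visit after it.
rows = conn.execute("""
    SELECT a.subject_id, a.onset_date,
           (SELECT MAX(v.visit_num) FROM visits v
             WHERE v.subject_id = a.subject_id
               AND v.visit_date <= a.onset_date) AS visit_before,
           (SELECT MIN(v.visit_num) FROM visits v
             WHERE v.subject_id = a.subject_id
               AND v.visit_date > a.onset_date)  AS visit_after
    FROM ae a
""").fetchall()

print(rows)  # [('S001', '2024-02-15', 2, 3)]
```

Counting AEs grouped by the `(visit_before, visit_after)` pair then answers "between which visits do most AEs occur?" without anyone hand-merging Excel tabs or fighting date formats.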
SPEAKER_01:This also happens in a pharmaceutical company whenever someone from senior leadership or the board or the investors asks a question about the study, an innocent question, and suddenly there's a flurry of activity in the statistical programming department. Everyone's just trying to figure out which data sets to correlate to answer this question. It sounds like with a tool like this, frankly, if it's the leadership of the company, they could ask the question themselves and perhaps learn the answer, or at least make the lives of the people who are supposed to give them the answer much easier.
SPEAKER_02:That's what we're trying to do. So we're trying to get it so people can stop worrying about the car and actually worry about where they're going.
SPEAKER_01:What is the overlap between what we can now do and should be doing with audit trails and say RBQM?
SPEAKER_02:So RBQM was set up to be process control. If you grew up outside of pharma and you saw the early RBQM regs and the early QTL language, it was all classic process control language. So when I first joined the industry, I said: process control, no problem, I know what process control is. Process control is only done right when you're measuring data about the process, not the outcome. All the clinical trial data that we collect is outcome data. The data that goes to the CSR, the clinical study report, the findings that we send to regulators, are all output measures. They're the output of the trial; they say nothing about the process. And if you try to do RBQM only on outcome measures, in my personal experience, it is extraordinarily hard to find process issues. I'll give an example from a different industry. If all I care about is whether or not my kids get into a good university, that's an outcome measure, right? And maybe I'll take some interim outcome measures: their grades, end-of-year marks, grade point averages, over 12 years of school, and I've got 12 data points. That's not going to get me the information I need to help a child who's having trouble in school. I need the process measures. What are the process measures? Are they going to sleep at night? Are they getting exercise? Are they eating meals? Are they hanging out with friends an appropriate amount of time? Are they hiding in their rooms and never talking to anyone? That might be a good sign, if they're studying. But you need those process measures. And it's the audit trail that gives you that process measurement. Now, I had a senior leader I worked with once who said, well, if you really want process, you need to go put barcode stickers on everyone's wrist in a clinical trial and make them scan themselves every time they did anything. Well, yeah, we do that in hospitals, we do that in pharmacies.
They print off their barcode, they put it on their ID tag, and they scan themselves in, and you know exactly what happened when. We don't have that in clinical trials, but the audit trail is really the first place where we can start to say what happens on our trial. And that data needs to be the basis of your RBQM implementation. So that's the first overlap. The second overlap is that you are not going to query audit trail data. You find a statistical finding in the audit trail data; what do you do with it? Well, you need to have a workflow. Analytics without a workflow isn't useful, because now you've found a piece of information and you can spend the next three weeks trying to figure out who to talk to about it. It needs to go into an RBQM workflow where you say: okay, we found something statistically, let's have someone else look at it. What do we think this means? Let's send a CRA to go talk to the site. Let's have these conversations. Can we figure out root cause? Because statistical correlation does not imply causation. Causation can only be established through a conversation. Those conversations happen, ideally, within an RBQM workflow, because it's not, hey, how come you always enter all of your data at midnight? You're not going to query an audit trail data point. You don't even have a way to query it, and you shouldn't be querying it even if you could. Yeah. And so it's both the workflow and, I think, the canonical key data set. And we haven't used it because it's just been so hard to get.
SPEAKER_01:And then analyze it. So now finally we're at a stage where we might actually be able to analyze the audit trail and get some clues.
SPEAKER_02:We can now get the data, we can push it into a space where we can use it, and there are now finally tools in the environment. Not very many, but there are beginning to be tools that will give us access to it.
SPEAKER_01:I'm curious to hear more, Ellis, from your standpoint. How is the industry reacting to tools like yours suddenly being available?
SPEAKER_00:I think it's a very early stage. I think time will tell. The first thing that I've noticed is that for most people, their reaction is that this is something they've been looking for. It's not as if what we're doing right now is something no one's ever conceived of, right? A lot of times when you talk to folks who've been in the RBQM or data management space for a very long time, they'll say something to the effect of, oh, ten years ago I asked for this very thing, or five years ago I was imagining something like this would exist and I couldn't get it. Or, even more recently, we were on a call with someone who said, oh, I would have bought this product, except you guys weren't around 18 months ago, and so we're trying to do it now ourselves, which, you know, we'll see how that goes. It's not easy, so not to diminish that, but I think it'll be very difficult. But one thing Nechama said that was pretty interesting is the point about the clinical study report being kind of the final outcome. A lot of what we're looking at are these output variables; we're looking downstream of a lot of things. The audit trail is a set of inputs that we can look at. But if you think about it, there's actually a huge universe of different operational inputs we could be examining as it pertains to controlling our studies better that we're not. And I do wonder, in a universe where tooling makes analysis easier, whether we'd have a better time controlling things like logistics and shipping, communication, and so on, to really hammer down on this kind of variability in how we work with the sites, what information they're getting and when they're getting it, in order to better control these outcome data points.
SPEAKER_01:Very cool. And my reaction is also like, yes, this totally exists. I'm so happy that someone built this.
SPEAKER_00:And we don't have to build it ourselves. It feels somewhat obvious in retrospect, right? The idea of something that translates human language input into a programmable output which can run against data seems very obvious to anyone who talks about it. And yet, despite that fact, I have not come across another company in the space trying to do it.
SPEAKER_01:Well, if it were easy, people would have built it 18 months ago already, or five years ago. Well, as we start rounding off this episode, I always ask my guests the same question towards the end. And that is: if I gave you the Transformation in Trials magic wand that can change one thing in the life science industry, what would you wish to change?
SPEAKER_02:I think the first is access to data. We need to get to where all of our vendors and all of our systems and tools give me that audit trail data, all of that data that I would want. I really should be able to start with the data that is being collected that I can't get, and then, as everything else goes e, right, eSource, e-this, e-that, e-the-other, start to get the rest of that information and be able to feed it into one place. Because at the end of the day, the best analytics without any data is a pipe dream; it's not anything else. So: get me my data, without charging me a ton of money because you're an ePRO vendor and you can. Sorry.
SPEAKER_00:I don't understand why vendors can charge for access to the data. I know that'll just open up a whole can of worms, but yeah, it's wild to me.
SPEAKER_02:We need to bring some ePRO vendors in too. Yes.
SPEAKER_00:This has nothing to do with what we do in particular, but if I could wave one magic wand, it would be for the industry to understand that there's no escaping computers and programming and all sorts of other things. And as a consequence of that, I wish folks would take a step back and think about what an integration-first, API-driven approach to running a study would look like. The most obvious example folks think about is things like digitization of the protocol, but that's just one of many, many places where you could take a step back and think, okay, let's say we've never run a clinical trial before and we're starting again in 2025, right? How would the systems look, how would this communication happen, and how would we have interop between the systems? There are plenty of companies with more than enough money and more than enough time to actually take this step back and think about it. And I think they should have the courage to do that, because I think it would have a profound implication on how we run studies today.
SPEAKER_01:I completely agree. It would be interesting to imagine it today, instead of considering the stuff that we bring with us historically. Great. Well, if my listeners have questions for you or want to learn more, both Ellis and Nechama, where can they find you? How can they reach out? LinkedIn?
SPEAKER_00:Yeah, LinkedIn is great for me. You can also reach me directly by email at h-i-r-o-k-i at studyos.co. Not dot com, couldn't get dot com. I have no idea what the dot com is. Please don't go there.
SPEAKER_01:Yeah, those people who grab the domain names, what are they doing with them? Thank you so much for joining me today.
SPEAKER_00:Thank you so much for having us, Ivana.
SPEAKER_01:Yes. We might have been done, but we're not. We're coming back for another definition that is very important. So Ellis, will you explain to us what agentic AI is?
SPEAKER_00:Sure, yeah. And to be fair, I don't know if the industry itself has exactly a singular definition, but I feel like in a lot of marketing lingo, people like to append the word agentic to essentially make consumers think it's even smarter AI, which I don't think is a very good definition. Basically, it comes down to: are you delegating some amount of the planning and reasoning to the model itself before execution, or not? So it's the difference between asking a single question and getting a response, versus saying, come up with a plan and then execute on those specific tasks. I guess the difference between full supervision and partial supervision. At the end of the day, by the way, as a consumer, I don't think it should really matter whether or not the system is agentic. You should look at the outputs it produces and see whether they're fit for purpose for your business, because that's really what matters. The one caveat with quote-unquote agentic versus non-agentic is that because the agentic planning portion is, by definition, delegating some amount of the planning to the model itself, your questions of reproducibility and oversight are going to be a little bit more challenging. So I guess it really just depends on where it is in the cycle. But for most of the people listening, it's probably just coming at you as a marketing term, and I wouldn't read too much into it. I don't think it really matters.
SPEAKER_01:It does not mean AI plus.
SPEAKER_00:It doesn't really mean AI plus. At the end of the day, it's basically the same models being run, and I wouldn't worry about it too much.
SPEAKER_01:That's a great clarification to include. Thank you.
SPEAKER_00:Yeah, no worries. I don't know if you wanted to stop the recording, but I can actually explain how it works specifically, what it means within the context of our system. So within the context of our system, for example, the reason it's called agentic is because, let's say you ask a question like Nechama asked earlier: between which two visits do adverse events occur the most often, right? In this case, you could either ask that question, provide all the context, and hope that the model comes up with the right answer, and there's some accuracy that's going to be lost there. Or you can say, come up with a plan for solving it. So first it's going to say, okay, first I need to analyze the distribution of visits, then I'm going to have to look at the adverse events, and so it breaks the task down into more discrete operations. And then, because it's filled its context with all of that extra information, it can go create the final output. So it's created a plan for how to answer the question, as opposed to trying to do it in one single go. And that's the portion of it that makes it agentic. Now, that's not nearly as complex as some of the more fully agentic systems, where you say, go research this thing on the internet, and it comes up with a plan and has a bunch of tools and goes on for 30 minutes or something like that. But the accuracy tends to diminish quite a bit the more ambiguous and long-range the plan is. So I think, from the perspective of a consumer of agentic AI, you really have to ask what the boundaries are of the problem you're trying to solve. The more scoped and clear the boundaries are, the better the results will be.
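Stripped to its core, the plan-then-execute pattern described here looks something like the sketch below. The `plan` and `execute` functions are stand-ins for model calls and generated-code execution; in a real agentic system the LLM would produce the plan and each step's code, so everything here is illustrative, not any actual API.

```python
def plan(question):
    """Stand-in for a model call that decomposes a question into steps.
    A real system would have the LLM generate this plan; here it is
    hard-coded for illustration."""
    return [
        "load the visit schedule and AE datasets",
        "map each AE onset date to the interval between consecutive visits",
        "count AEs per interval and report the maximum",
    ]

def execute(step, context):
    """Stand-in for executing one step (e.g. generated, validated SQL)
    and appending its result to the running context."""
    result = f"result of: {step}"
    context.append(result)
    return context

def answer(question):
    # The agentic loop: plan first, then execute each step with the
    # accumulated context, instead of answering in one unconstrained
    # generation. The recorded plan is also what makes the reasoning
    # auditable after the fact.
    context = []
    for step in plan(question):
        context = execute(step, context)
    return context[-1]  # the final step carries the answer

print(answer("Between which two visits do AEs occur most often?"))
```

The point of the pattern is exactly what is said above: each intermediate result lands in the context before the final output is produced, which trades one big guess for several small, checkable steps.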
SPEAKER_02:Yeah, and what's cool about it in these tools is that you can see what that reasoning is. And one of the questions is, wait, how do you get humans to learn how to do this type of reasoning? It provides that first step: I've asked a question, and it literally explains to me, okay, I'm going to take the demography file, I'm going to take the AE file, I'm going to take the visit file, and I'm going to do some math on the AE dates and the visit dates. It starts to explain to you how you would actually go do it if you did it yourself. And that's the first entry point to upskilling and learning, because often when you're learning to do something as a kid, you learn by watching someone else do it first. And then if it does something really strange, like it decides that AEs all happen on Tuesdays, you're like, whoa, wait, stop, you're not making any sense, right? You made something up. Or, I didn't know you wanted between all visits, so it gives me every AE between visit one and visit eight on an eight-visit study. Well, yeah, I know that. That's when you have to tell it: consecutive visits. I learned that because it was giving me how many AEs between visits one and two, one and three, one and four. Not helpful, right? So then you can go back and challenge that learning. It gives you a way to really do human augmentation with a computer, rather than a human being replaced by a computer. So it's kind of a bionic brain rather than a robot that's replacing you.
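The consecutive-visits analysis walked through here, joining AE onset dates against visit dates and counting AEs per adjacent-visit interval rather than per all pairwise combinations, can be written directly. The data below is invented for illustration; real trial data would be per-subject and come from EDC exports.

```python
from datetime import date

def aes_between_consecutive_visits(visit_dates, ae_dates):
    """Count AEs whose onset falls between each pair of *consecutive*
    visits -- not all pairwise combinations, which is exactly the
    ambiguity the reasoning trace exposed. Inputs are plain date lists;
    the shape is hypothetical."""
    visits = sorted(visit_dates)
    counts = {}
    for v1, v2 in zip(visits, visits[1:]):
        key = (v1.isoformat(), v2.isoformat())
        counts[key] = sum(1 for d in ae_dates if v1 <= d < v2)
    return counts

# Invented example: three visits, three AEs.
visits = [date(2025, 1, 1), date(2025, 2, 1), date(2025, 3, 1)]
aes = [date(2025, 1, 10), date(2025, 1, 20), date(2025, 2, 15)]

counts = aes_between_consecutive_visits(visits, aes)
worst = max(counts, key=counts.get)
print(worst, counts[worst])  # the interval with the most AEs: visits 1-2, with 2
```

Seeing a plan like this spelled out step by step, take the dates, sort, pair consecutive visits, count within each window, is the "watching someone else do it first" that makes the reasoning challengeable.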
SPEAKER_00:That speaks to the last point, by the way, which is that these systems aren't perfect, right? And they're not mind readers. We do our best when building the system to provide enough information relevant to the industry, as well as the study, to answer the questions correctly. But most of the time, the questions themselves are slightly nonspecific. So in this example, it should make sense that you're not talking about all pairwise visits up until visit eight, but you weren't specific about that. And so when you see in the reasoning, oh, I'm planning on looking at everything between one and eight, you just stop it immediately and go, whoa, whoa, whoa, hold up. Actually, I meant between three and four, specifically. And it's like, oh, okay, let me restart my reasoning plan and I'll get the analysis to you in a second.
SPEAKER_02:That is so helpful, being able to see what it understands. I think agentic AI is like interns, right? You have to sit them down: okay, every morning, what are you doing? Wait, you're going to solve this? Here's the problem you're solving. What are you going to do? Really, you're going to do it that way? No, you're not. Let's go back and think about it, right? They're basically really nice interns that don't get depressed. You don't have to worry about upsetting the intern too much.
SPEAKER_00:I like to think about it as a good programmer who doesn't understand your industry, which is what we're working so hard on, to not make that the case. But at the end of the day, right, you want something that's really good at programming, and you know your industry. That's kind of the whole sweet spot: you can't do the technical part, it can, but maybe there are specifics that you want to imbue it with, and that will always make it better.
SPEAKER_01:Also.