Data Xposure: The Podcast for Data Risk Leaders

What’s hiding in your data could cost you everything.

Data Xposure is Exterro’s biweekly podcast for senior legal, privacy, compliance, and digital forensics professionals navigating the ever-evolving landscape of data risk.

Each 20-minute episode dives into the headlines—major breaches, court cases, enforcement actions—and unpacks the real breakdowns behind the news: misaligned policies, siloed systems, and operational blind spots. Through narrative storytelling and expert interviews, we connect the dots between real-world failures and the proactive strategies used by successful organizations to avoid them.

Designed for leaders who want to elevate their influence, sharpen their risk posture, and lead with confidence, Data Xposure cuts through the noise to deliver urgent, actionable insight—with a tone that’s curious, clear, and grounded in reality.

If you're responsible for managing data risk across legal, IT, security, or compliance—this is your front-row seat to the moments where strategy meets consequence.

🎧 New episodes every two weeks. Subscribe wherever you get your podcasts.

From Depp-Heard to Epstein: How eDiscovery Became Everyone’s Problem | Data Xposure - Ep 16

May 05, 2026 • Exterro • Season 1 • Episode 16

Episode length: 39:31

The headlines are hard to ignore.

High-profile cases, from the scrutiny of celebrity text messages and photos in the Depp-Heard litigation to the release of Epstein-related documents, have turned private communications into public evidence. What was once buried in legal proceedings is now playing out in real time, shaping reputations, careers, and corporate risk.

But these aren’t edge cases. They’re signals.

In this episode of Data Xposure, we explore how eDiscovery has moved into the mainstream—and why it now impacts far more than just legal teams. Because the same types of messages, files, and digital conversations making headlines are being created inside your organization every day.

Doug Austin, a leading voice in eDiscovery with over 30 years of experience and the Editor of eDiscovery Today, joins us to unpack what’s changed—and why so many organizations are still unprepared. From the explosion of collaboration tools to the growing expectations of regulators and courts, he explains how everyday data has become a business-wide liability if it’s not properly understood and managed.

For legal, compliance, and security leaders, this is the shift: eDiscovery is no longer a moment you prepare for. It’s a continuous reflection of how your organization operates.

Because as recent headlines make clear—
the risk isn’t just in what’s exposed.
It’s in what’s been there all along.

Thanks for tuning in to the latest episode of Data Xposure. Don’t forget to subscribe so you never miss an update. For show notes, resources, and to connect with us, visit exterro.com/data-exposure-podcast/

SPEAKER_00 0:07

In the Depp-Heard trial, private texts, photos, and voice notes didn't just support the case, they became the case. Jurors, media, and millions watching weren't reacting to legal arguments; they were reacting to raw, unfiltered data. In the Epstein proceedings, years-old emails, contact records, and flight logs resurfaced: data that had been sitting largely untouched suddenly became central to public scrutiny and legal action. This is eDiscovery, not as a back-end legal process, but as the mechanism turning everyday data into public evidence. What used to stay buried in legal workflows is now unfolding in real time, shaping reputations and exposing risk far beyond the courtroom. And here's the question every organization should be asking right now: if you had to turn over your employee messages tomorrow, would you know what's in them? Because every message your team sends, every file they share, every system they use, it's all part of a record that can be reconstructed, reviewed, and scrutinized. Welcome to Data Xposure, brought to you by Exterro, the leader in data risk management software. This is the podcast for data risk leaders, where we uncover how data becomes both your greatest asset and your greatest liability. I'm Mike Hamilton. Today, we're joined by Doug Austin, one of the most respected voices in eDiscovery. Doug brings over 30 years of experience helping organizations navigate legal data challenges, and he's the editor of eDiscovery Today, where he's been publishing daily insights since 2010. And a quick note: Exterro is a proud sponsor of eDiscovery Today, bringing practical, real-world insight to the professionals managing data risk every day. Today we're talking about how eDiscovery went mainstream and why, whether you manage it or not, you're already a part of it. Doug, thanks for joining Data Xposure. Before we dive into everything happening today, take us back a second. How did you get into eDiscovery in the first place?

SPEAKER_01 2:08

Well, before there was eDiscovery, I got into what was known as litigation support, where we dealt with paper documents. And I got into it so long ago that the company I joined was Price Waterhouse, which was a Big Eight consulting firm then. Now they're Big Four and they're PwC. I worked on a large litigation as one of my first couple of projects, where we built a couple of different databases, including an attorney work product database. We also built this cool app, because a lot of the evidentiary documents were on microfilm cartridges. We actually created this program where you could do a search in the database, identify documents you wanted to print out, and send it to the app. It would tell you what cartridges to put in, you'd get it spooled up, hit go, and it would spool to the pages and print out the pages you needed. The ability to use technology to do things like that and drastically shorten the time to get to the evidentiary documents you needed was fascinating to me. From that point forward, being in the legal industry, in litigation support and then eventually eDiscovery in the early 2000s when electronic evidence became so predominant, has always fascinated me, and I've really enjoyed every minute of it.

SPEAKER_00 3:31

That's great to hear. I don't want to say founding fathers, but you are one of those very well-known people in the space. You carried the torch for the community to really give it the visibility that it probably hadn't been getting 10 years ago. What kept you in it for those 30 years?

SPEAKER_01 3:48

Like a lot of eDiscovery professionals, you fall into it, and that was the case with me with that project at Price Waterhouse. When I went to college, I wasn't sure what my major was going to be until I took my first business computer class. That's when I realized I wanted to use computers and technology to help solve business problems. I got a business degree with a concentration in management information systems, so that was what I knew I wanted to do. And then once I had a first legal project, I was like, these are as serious and as significant business problems as there are. When you're dealing with litigation, investigations, what have you, this is the type of stuff that I went to school for. And that really became what I wanted to do from that point.

SPEAKER_00 4:38

That makes sense. Well, let's dive into the podcast topic. The headline cases that we are trying to draw people in on are the Depp-Heard case and the Epstein files. Let me give a quick overview of these two cases. The Depp-Heard defamation trial was a highly publicized legal battle between two famous actors, centered on allegations of defamation tied to claims of abuse. The case drew massive public attention, with proceedings broadcast widely and debated very aggressively on social media and other platforms. The Jeffrey Epstein case involves a sprawling investigation into a financier accused of operating a large-scale sex trafficking network, with ongoing legal actions and document releases continuing to reveal the scope of individuals and institutions connected to the case. Both became global news stories, not just for their legal outcomes, but for the broader questions they raised about accountability, transparency, and how information surfaces in high-stakes situations. Doug, I don't know if you followed these cases as they were coming out. I think it was hard not to. eDiscovery probably doesn't pop into people's minds when you mention those two cases to them, but I'm guessing it did for you. Why was that the case?

SPEAKER_01 5:57

Well, really, at this point, with just about any case like this, I think about electronic evidence first. I can't watch true crime shows like 48 Hours or Dateline anymore without being interested to find out what sort of electronic evidence they're going to have that ultimately leads to identifying the murderer, because we have such an extensive digital trail that these cases almost always involve some sort of electronic evidence that leads to identifying who the killer was. These cases are completely different from that, but they still involve a lot of electronic evidence. Depp-Heard involved a lot of videos, pictures, and text messages. The Epstein government investigation involved a huge amount: I think something like nearly three and a half million documents, 180,000 images, and 2,000 videos. That's massive scale. Whether it's true crime, or basically a defamation case, which was a divorce case but high profile with defamation issues coming into play, or a huge government investigation, it still involves electronic evidence. For me, that's one of the things I think about first, along with the issues that you deal with. In the Epstein case, privacy is a huge consideration. One of the things we don't talk about enough is the importance of redacting evidence. Because this is such a high-profile case, a lot of this evidence has been made available to the public, but they had to make sure to redact the identities of the victims in images, videos, and documents. That's got to be a huge undertaking. In the Depp-Heard case, it was not just about the evidence as it appears, but also the underlying metadata associated with that evidence, which came into discussion in a couple of instances in that case.
All those types of cases have those kinds of considerations, and that's what makes eDiscovery interesting. It's never the same thing time after time. The evidence is different, the issues and the requirements of the case are different, and how you handle the evidence is different.

SPEAKER_00 8:10

To me, both those cases brought out a couple different themes that I'm curious to get your feedback on. With the Epstein files and the production, were you surprised that things weren't redacted?

SPEAKER_01 8:23

Unfortunately, they're more common than we'd like to think. We've seen redaction-flub cases beyond this one, where, mind you, they did have a monumental task of trying to make sure everything was redacted. It's probably understandable that some instances were missed, which is horrible, because that puts people's names out there in the public. But I just recently covered a case where, once again, somebody put a document into Adobe and did an overlay-type redaction, and people know that you can just highlight the text and find the text underneath that overlay. That's a Redaction 101 type of error, and people still make it all the time. It doesn't surprise me. Unfortunately, it just continues to scream the need for a really good understanding of eDiscovery fundamentals: understanding when data is properly redacted, how to do it, and how to confirm that it is redacted.
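
The confirmation step Doug mentions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text layer of the produced document is assumed to have already been extracted by whatever tool the team uses, and the function name, sample sentences, and the name "Jane Roe" are hypothetical.

```python
def find_unredacted_terms(extracted_text, sensitive_terms):
    """Return the sensitive terms that still appear in the extracted text layer."""
    haystack = extracted_text.casefold()
    return [term for term in sensitive_terms if term.casefold() in haystack]


# Hypothetical example: an overlay-style "redaction" draws a black box over the
# name, but the text layer underneath is unchanged, so extraction still finds it.
overlay_redacted = "Witness statement of Jane Roe, dated 3 March."
# A true redaction removes the underlying text before the document is produced.
properly_redacted = "Witness statement of [REDACTED], dated 3 March."

print(find_unredacted_terms(overlay_redacted, ["Jane Roe"]))   # ['Jane Roe']
print(find_unredacted_terms(properly_redacted, ["Jane Roe"]))  # []
```

A check like this only covers searchable text. It says nothing about names visible in images or video, which would still need separate review.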

SPEAKER_00 9:32

Yeah, and it seems like there's a much bigger microscope on these eDiscovery activities now, but I don't think the legal market is making that connection. When you're out there talking to other people, are they really understanding these connections, and that taking eDiscovery a little more seriously is warranted?

SPEAKER_01 9:52

There are two groups of people. There's the eDiscovery bubble, the people who really have taken eDiscovery seriously and tried to learn about it and learn the best practices. But then there's an extended legal community, an extensive group of folks who don't know that much about eDiscovery. There are 1.3 million attorneys in the US. When you think about it, unfortunately not all of them are going to understand the best practices. And it's just as important for the small cases as it is for the really big cases. There's still electronic evidence involved, and there are still best practices that come into play. We're making headway, but there's always more work to do, which is why all the things that Exterro does with education, and what I try to do and what others try to do, every little bit helps. It's going to be one of those continual battles. We're going to continue to try to educate people and hopefully shrink that group that doesn't know the best practices and fundamentals they need to know.

SPEAKER_00 10:55

Doug, if you were an organization, what data would suddenly be in scope that most teams aren't thinking about? This really showed up a lot in the Depp-Heard case. Even though this was between two actors, there's some data that they probably didn't think was in scope or that could be found.

SPEAKER_01 11:13

There were the communications between them and the things they used to document what happened in their marriage. Obviously, text messages. Couples text each other all the time, and some of those text messages were at issue. That basically puts mobile device data at issue. Also, because every mobile device has a camera, there were a lot of pictures and a lot of videos. That was a very mobile-device-intensive case, because that's where a lot of our communications and a lot of our activity happen these days. Many of us get a weekly report of how much time we're on our device each day, and it's usually multiple hours for just about any of us. We live on these devices, and in that case they were key. What was also key is the underlying data associated with some of this, the metadata. I wrote about the case a couple of years or so ago when it was going on, because one of the notable things about that famous Amber Heard picture, where she showed bruising, was that the metadata showed it had been saved in a photo editing program called Photos 3.0. That of course doesn't guarantee it was edited, but it doesn't look good when it's saved in a photo editing program. That underlying metadata that goes with the evidence is the type of consideration that case really highlighted, more than we usually see in cases. And then of course, that case was probably a little early for this, but we're seeing more and more cases involving AI-generated evidence, such as chat logs with ChatGPT or Claude. It seems you almost can't get on a Zoom meeting these days without somebody bringing in their recorder or note-taker. I even do Zoom webinars and they pop into those. And then there are all the artifacts you get from a Copilot. There are always new types of data and new types of evidence we have to account for, which I guess gives us job security, but certainly keeps us up at night too.
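
As a rough illustration of the metadata point, the sketch below flags a file whose "Software" metadata field names a photo application. It assumes the metadata has already been extracted into a dictionary by a forensic or review tool; the field name, the hint list, and the function name are illustrative assumptions, not any particular product's behavior.

```python
# Hypothetical list of software names a reviewer might want to look at more closely.
EDITING_SOFTWARE_HINTS = ("photos", "photoshop", "lightroom", "gimp")


def flag_editing_software(metadata):
    """Return the 'Software' metadata value if it matches a known hint, else None.

    A match only means the file was saved by that program, not that the
    image content was altered.
    """
    software = str(metadata.get("Software", ""))
    if any(hint in software.casefold() for hint in EDITING_SOFTWARE_HINTS):
        return software
    return None


# Metadata assumed to have been extracted already by a forensic or review tool.
print(flag_editing_software({"Software": "Photos 3.0"}))  # Photos 3.0
print(flag_editing_software({"Make": "Apple"}))           # None
```

As Doug notes, a hit here is a prompt for closer examination rather than proof of editing.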

SPEAKER_00 13:12

100%. And I guess it depends on what forum you're in, but if you're in the civil context, some of that may be unduly burdensome. But I guess if you're in the criminal context, it's fair game, right?

SPEAKER_01 13:23

Sure. Yeah, absolutely. Each case has a different consideration as to what evidence might be important and might be proportional. In a case like Depp-Heard, I'd say mobile device data is probably a majority of the relevant data involved, so that's where you have to get into it. Whereas in a corporate case involving communications, maybe those mobile devices are less important. Plus, of course, you might have BYOD considerations where the organization might not even be responsible for those devices. On eDiscovery Today, we've covered a few cases where organizations weren't found to have possession, custody, or control of BYOD devices because they had a policy in place that established their limitations. In a couple of those cases, the requesting party also served third-party subpoenas on the employees for the data on those devices. In one case they didn't, and paid the price when the court said that data wasn't within the possession, custody, or control of the defendant. You get those types of considerations in the different types of cases. One type of case that tends to be big from a forensics standpoint is IP cases. An employee leaves an organization and either goes to a competitor or starts their own company. Did they take data? Did they not take data? You get into forensic examination of the devices they had at the organization and the new devices they have. That could be mobile devices, that could be computers. The different types of cases involve different types of not only the data you go after, but how you go after it. That's one of the things that continues to make eDiscovery such a diverse discipline, that's the word I'm looking for, because from case to case, the data and how you go about getting it is always different.

SPEAKER_00 15:13

One question I have as a follow-up to that, Doug: we've talked a lot about mobile and some of these new data sources. We saw this shift when we went from paper to email, with practitioners getting up to speed: how do I actually collect email? I'm used to paper. What are you seeing from a practitioner perspective on their ability to effectively collect these new types of data? And how are the courts really treating that?

SPEAKER_01 15:40

Well, it's interesting, because we have so many different collaboration and communication apps out there now. We used to communicate primarily through email. Now we communicate through text on our mobile devices, and we communicate within our organizations in Slack and in Teams. A lot of organizations have both Slack and Teams. There are so many other collaboration apps with data that you're sharing and working on, and all these cloud-based systems where the data has basically become something we point to instead of putting it into a communication, which of course means that underlying data can change between when that communication was sent and when the person saw it. These are the types of issues that we're struggling with now. It's just like when email first came along and we struggled with how to work with it in the best way possible; platforms evolved over time and we standardized workflows. We're starting to do that with short message format types of systems. With mobile device data, with Slack, with Teams, we're standardizing workflows. Platforms have APIs to go get that data in a standardized way, upload it, and have it in the best format possible to work with and make determinations on in discovery. The thing is, you always feel like you're chasing something, because just as soon as you've established a workflow for these platforms, along comes a new platform, and you've got to do the same for it.
It's a never-ending battle, and you definitely have to be working with a platform that can be flexible, that consistently stays up on what the common data sources are out there, and that provides a standardized workflow and automation capability to get that data in a useful format when you're dealing with it in discovery.

SPEAKER_00 17:44

A big point that listeners should really take seriously is that this is all within the discovery parameters. There isn't a "well, it's too hard for me to get to that" or "I don't know how to do that." Would you agree with that, Doug? Even though there are all these data sources and it's becoming increasingly difficult to keep up, there is still that duty to keep up.

SPEAKER_01 18:09

Absolutely. I'm not sure if I addressed the court part of your previous question, but yes, there's plenty of case law now where parties have been required to go after that data: the Slack data, the Teams data, what have you. So this is something courts are expecting. Again, Rule 26(b)(1) of the Federal Rules of Civil Procedure deals with proportionality, and you have to establish the proportionality of it. But in more and more cases, that data is becoming the critical data in the case. Then it's really pretty easy to make a proportionality argument that yes, this is proportional to the needs of the case, and you've got to have a way of going through it and producing responsive data from these systems.

SPEAKER_00 18:57

And in the Depp-Heard trial, context really drove impact: the threads, tone, timing. How does eDiscovery reconstruct that kind of narrative? Can you give me some insight into what you're seeing with legal teams and how they're constructing this narrative? Is it still very manual? Is it automated? As we saw, the context means everything. These little bits and pieces, like what you just touched on about the photo editing software, case in point, it all matters.

SPEAKER_01 19:29

Absolutely. There's technology that helps with that. With AI, we're beginning to see the ability to do things such as sentiment analysis, to determine when people are angry or frustrated, and that can help with context. And again, sometimes it's the metadata that can impact that. It's probably a little bit of technology helping us, then understanding how to use the technology, and then supplementing that with best practices. Speaking of context, there was a case a few years ago that involved fabrication of a text message, the Rossbach case. This lady was suing her former employer and supervisor for sexual harassment, and had what looked like an incriminating text message. It turned out that she produced an image of it, and the image she produced had a heart-eyes emoji in it. According to her, it came from her iPhone 5, and it was determined that that particular version of the heart-eyes emoji wasn't possible on the iPhone 5 because it didn't run iOS 13. That's the case of the smoking emoji. Emojis are a great example of context. You can say a statement one way and it can be taken as a threat, and then you put a smiley face on it and it's taken completely differently. That's another example of how things like emojis and sentiment analysis can be used to really get a sense of the context of the evidence, not just taking it at face value, but using these different kinds of cues to determine exactly what that evidence really means.

SPEAKER_00 21:15

Let's move over to the Epstein-related documents real quick. A lot of older data resurfaced years later. Now, I don't want to get into the legality of whether that data should have been kept or not. But let's apply this to an enterprise specifically. For enterprises that are trying to reduce liability risk, what does it say about data retention and long-term exposure? And how are companies underestimating how long data really sticks around?

SPEAKER_01 21:44

Yeah. In a case like this, you're not talking about a huge organization. You're talking about a financier who I'm sure had a number of contacts, and three and a half million pages kind of indicates there was likely not much of a data governance program that they operated under. That's one of the things you have to keep in mind. When organizations don't treat data governance seriously and establish retention periods for their data, not only does that data potentially become subject to discovery in litigation, but it's also data that could be exposed in a cyber attack and things of that nature. There are so many reasons to have your data house in order. Organizations have a much better understanding of that now than they used to, and they're using technology to help with it. For a lot of the systems they use, they're implementing things like automated deletion when a retention period expires, and, just as important, suspending that once you have something like litigation and automating the legal hold process, or hold in place. That's all part of the equation that has to be considered. I definitely think that organizations understand a lot more, but there are challenges. One of the big challenges we have in organizations today is shadow IT and shadow AI: people using systems that aren't approved and creating data that could be discoverable, and the organization isn't on top of it. It's always a challenge to make sure you understand what people are doing. And that's one of the things that you have to do up front.
You have to not only establish a data governance program and establish retention periods for your data, but also take steps to understand as much as you can about your data landscape and what people are doing, and make sure that you're aware of the risks associated with it.
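
The interaction Doug describes between automated deletion and legal holds can be sketched as a simple rule: a record is deleted only when its retention period has expired and no hold applies. The Python below is a minimal illustration; the `Record` fields, the record IDs, and the dates are all hypothetical and do not reflect any specific platform.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Record:
    record_id: str
    created: date
    retention_days: int
    on_legal_hold: bool = False


def eligible_for_deletion(record, today):
    """A record may be deleted only if its retention period has expired
    and it is not subject to a legal hold."""
    expires = record.created + timedelta(days=record.retention_days)
    return today >= expires and not record.on_legal_hold


today = date(2026, 5, 5)
records = [
    Record("msg-001", date(2020, 1, 1), retention_days=365),
    Record("msg-002", date(2020, 1, 1), retention_days=365, on_legal_hold=True),
    Record("msg-003", date(2026, 4, 1), retention_days=365),
]
to_delete = [r.record_id for r in records if eligible_for_deletion(r, today)]
print(to_delete)  # ['msg-001']
```

The second record is just as old as the first, but the hold flag keeps it out of the deletion list, which is the behavior a hold is meant to guarantee.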

SPEAKER_00 23:46

Doug, if we remove the celebrity factor, how similar are the scenarios we talked about in Depp-Heard and Epstein to what enterprises deal with in litigation, internal investigations, and subpoena requests?

SPEAKER_01 24:04

Let's take the Epstein one first. If you took the public aspect away from that one, and this were a case that involved the same kind of topics, you would still have redaction requirements, but you'd probably have some protective orders in place, and certain things designated attorneys' eyes only. Yes, you'd still want to redact and do the things you can do to protect privacy, but when it's between parties, it's not as big an issue as it is in something as public-facing as this case. That's one consideration I would say would be different: the public aspect of what's out there and how important it is to maintain privacy and manage that data. In Depp-Heard, without them being famous actors, it's pretty much almost like a divorce case. We see those types of cases; there are probably hundreds of thousands of them every year. There are certain types of data, and you've got to make determinations. Parties might be trying to make the other one look bad, or it may just be the data involved in dividing assets in a divorce. Those issues remain the same. As for the considerations of evaluating that and making determinations, I don't think that one differs as much, because a lot of those considerations would be the same. If there was a defamation case involving two people who weren't widely known, I'm sure you'd have some of the same things; it just wouldn't be on Court TV being shown every day, because cases like that happen all the time.

SPEAKER_00 25:42

Yeah, and it's just like an employment-related matter, right? It's kind of a he-said, she-said in the same fashion. You have a company saying something, you have an employee saying something, and we're going to try to get the context of what really happened. How did this get out there? Or why did it take them so long to review this many documents? We saw a lot of that in the back and forth with the government about reviewing certain files and how long that discovery would take. I remember them saying it would take months, and you and I both know that if you have the right technology, you can really boil that down and streamline it.

SPEAKER_01 26:20

Absolutely. The Epstein case had a lot of videos and a lot of images. Until fairly recently, we really didn't have the technology to get a better understanding of what's in those quickly. Now we have AI-based transcription capabilities, so we can quickly figure out what's contained in videos that might be useful to us. From an image standpoint, one of the most under-discussed capabilities of AI that really is a significant benefit to eDiscovery professionals is the ability for AI to look at an image and tell you what's in it. That gives you the ability to very quickly do a triage: which images can I cast aside immediately and say, okay, these aren't going to be important to look at, I won't even have to open them up, and which ones might be. It was probably a couple of years or so ago when ChatGPT added its image capability. I took a picture of my cluttered kitchen counter and it basically identified all the items it saw, right down to the Nutella jar. It's amazing how good the technology is these days and how you can apply it. They may have applied it here. The stakes were very high, so they still had to make sure they got the job right, but there's really no excuse for not exploring technologies like this: applying AI to transcribing videos and understanding what's in them, identifying the contents of images, and using that to help streamline the process. It just seems like a no-brainer to me.

SPEAKER_00 27:50

Let's dive into AI a little more. You brought it up a couple times already, but for you, what's actually standing out in the legal space when it comes to artificial intelligence? What do you feel is delivering real value for legal teams today? And what do you feel like is a little bit overhyped?

SPEAKER_01 28:09

We can't do this without talking about AI, right? What's not being talked about enough is all the different use cases you can conceivably apply AI to. For the last couple of years, I've done the State of the Industry report and asked the question: what use cases are you applying generative AI to? Things like document classification for relevancy, document summarization, ECA, and PII and privilege identification. And people are using it for those things. That's what is so unique about generative AI: the possibilities are really endless. I continue to hear all the time about how it's being applied in ways where I'm like, hey, I hadn't thought of that, that's a really cool application of the technology. I've heard of the ability to do a prompt that says, go find financial documents, and not only give me the data back, but put it in a little table that I could throw into Excel. Really amazing stuff. There are certain no-brainer use cases, like using it for ECA, where there's not a defensibility burden. It's just trying to get a better understanding of your collection. From a standpoint of document classification, I think people, just as was the case with TAR, are all worked up about how we can make sure it's defensible. The same best practices that applied to TAR apply to Gen AI in terms of validating the results to ensure that you've got a defensible result. One thing that has to be said about the technology is that it's not an easy button, and a lot of people want to take it that way. The technology's capabilities are great, but you still have to apply best practices along with that technology to use it the right way. And that's what is probably not being talked about enough.
People are hyping the capabilities, but they're not hyping the importance of understanding how to use those capabilities in the right way.

SPEAKER_00 30:08

Definitely. I want to piggyback on that point and talk a little bit more about AI governance. We've talked about hallucinations, and then there's the question of where that data is actually going when you prompt something within an AI tool. You probably can cite the case for me, Doug, but we've seen cases where someone from one of the big tech firms uploaded some code into one of the chatbots. And now is this in the public domain? Can anyone use this, right? So talk to me a little bit more about AI governance and what attorneys and legal professionals, because they're a little skeptical, need to be watching out for when they're looking at what AI they should be using and how they should be restricting their use and ultimately the enterprise's use of AI.

SPEAKER_01 30:55

Yeah, we'll take hallucinations as an example. We've seen, I think we're up to well over 1,300 cases, hallucination cases, in the Damien Charlotin database. And some of these have been from major firms. From an AI governance standpoint, there's education: people need to understand what these tools can and can't do, and understand the importance of verifying the results, because that's always an important thing to do. But they also need to standardize it within their workflows, because even if you know it, you know what happens. Sometimes people get in a hurry and they cut corners, they skip steps, and they're like, oh, got a deadline, gotta get this out. They just skip that step and then you wind up submitting a filing that has hallucinated cases in it or hallucinated facts or what have you. So workflows are a part of that: making sure that you're standardizing and ensuring that you follow those standardized workflows. I also think you have to have policies about how to use these tools, and when, and what's acceptable and what's not acceptable. Because I'm sure there are plenty of organizations out there, especially with shadow AI on the rise, where some employee has probably loaded some sensitive documents up there to try to get ChatGPT or Claude or what have you to analyze those documents and provide some feedback on them. And that's definitely not something you want to have happen. We just had a case recently where the court basically said the parties can't load any documents involved in the litigation to public LLMs like ChatGPT or Claude. They can only use closed-in types of platforms, like eDiscovery platforms that have AI capabilities, for that. That makes a lot of sense because you have to be careful about where you're putting that data and what you can do with it. Organizations have to be mindful of that. 
A very standard aspect of litigation today is protective orders or other agreements between parties that say you won't take the data I produce to you and upload it to a public LLM. Parties have to protect themselves against that, because that's something that wasn't a factor as recently as a couple of years ago. But now it's something that probably happens pretty commonly. And a lot of organizations and a lot of legal teams have said we've got to make sure that that's an understanding we establish up front in a protective order, in the ESI protocol, or even just in communications with the parties.

SPEAKER_00 33:28

Yeah. And looking at AI that is built with those parameters in place, not leveraging something like ChatGPT, but upgrading to the business license. Anything you upload is going to be within or behind your firewall. I do want to give a shout out to the e-discovery community because I feel we've been at the forefront of AI. Go back to Da Silva Moore, that was in what, 2012? And Judge Peck's been talking about this for almost 15 years. And even though it might not be gen AI, it was still considered artificial intelligence back in the day. So for anyone that's speaking on it from an e-discovery perspective or has been in the industry for a long time, kudos to all of you for being at the forefront of all of this.

SPEAKER_01 34:12

Yeah, and I mean TAR, yeah, TAR is another form of AI. We genericize AI, but there are so many different forms and applications of AI: what we had back then, what we have today. But yeah, absolutely. We've been using AI as long as anybody has in the legal community. And eDiscovery has been at the forefront of that.

SPEAKER_00 34:34

I just have a couple more questions for you, Doug. Thank you for sticking with me. It's been a really great conversation. Let's look ahead. We've seen in the coding and software development world that AI has been coding. I listened to a podcast episode where Anthropic is using AI agents to code, but then they're having AI agents supervise the coders, and they have supervisors to the supervisors that are agents as well. It seems like things are moving in that direction. Let's talk about it in the context of legal e-discovery, risk mitigation, all these different types of legal workflows. What does autonomous AI in these contexts look like? And what are you most excited about? And what are you worried about?

SPEAKER_01 35:18

From an e-discovery and legal standpoint, autonomous AI looks like a managed process where you have certain processes or things that are repeatable that you can apply autonomous AI to, but with humans in the loop: a human-managed process. Case in point, you could say that the automated classification of documents to identify whether they're responsive or non-responsive, or at least to make a recommendation as to that, is an agentic process essentially, but it's human-managed. A lot of people's best practice is prompt iteration. You create a prompt, you run it through a sample of documents, you evaluate it, you make some adjustments, and sometimes you have multiple iterations until you get the prompt where you want it to then apply to the document collection. Then you still validate on the back end. So a lot of these are human-based processes around an automated process. That's always got to be the case when we're talking about legal, because the stakes are so high, and technology is never going to be perfect. You still have to have some processes in place that let you defend your approach in terms of how you've arrived at the data that's being produced, or the data that you've identified for redaction, or whatever the case might be. And so to me, it's always a mix. There are certain processes you can identify and pinpoint and say, we can apply an agentic process to this. And really, it's not so much different from before AI. We still had agents, or automated things that we applied to various processes, and took certain steps that were automated. Now we're just doing it with AI. The key is understanding what it will do, making sure you're prepared for the results, and not taking any output at face value. 
Do some validation checking on it, and make sure it's a defensible result that you can stand by.
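
For readers who want to see the shape of the prompt iteration loop described above, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than any vendor's workflow: the `Doc` type, the `classify` stand-in (a simple keyword match in place of a real model call), the six sample documents, and the 0.8 recall target are all hypothetical. It shows only the pattern: test a candidate prompt on a human-labeled sample, measure precision and recall, adjust, and stop when the target is met.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    text: str
    responsive: bool  # label assigned by a human reviewer


def classify(keywords, doc):
    """Hypothetical stand-in for a model call: flag the doc if any keyword appears."""
    text = doc.text.lower()
    return any(k in text for k in keywords)


def evaluate(keywords, sample):
    """Return (precision, recall) of one candidate prompt on the labeled sample."""
    tp = fp = fn = 0
    for doc in sample:
        predicted = classify(keywords, doc)
        if predicted and doc.responsive:
            tp += 1
        elif predicted and not doc.responsive:
            fp += 1
        elif not predicted and doc.responsive:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def iterate_prompts(candidates, sample, target_recall):
    """Try each prompt revision in order; return the first that meets the recall target."""
    for iteration, keywords in enumerate(candidates, start=1):
        precision, recall = evaluate(keywords, sample)
        if recall >= target_recall:
            return iteration, keywords, precision, recall
    return None  # no revision met the target; a human decides what to do next


# Hypothetical human-labeled sample drawn from the collection.
SAMPLE = [
    Doc("Q3 invoice for consulting services", True),
    Doc("Wire transfer confirmation attached", True),
    Doc("Lunch on Friday?", False),
    Doc("Updated budget forecast spreadsheet", True),
    Doc("Invoice template for the holiday party raffle", False),
    Doc("Parking garage closed Monday", False),
]

# Successive revisions of the "prompt", each adjusted after reviewing misses.
CANDIDATES = [
    ["invoice"],
    ["invoice", "wire transfer"],
    ["invoice", "wire transfer", "budget"],
]

result = iterate_prompts(CANDIDATES, SAMPLE, target_recall=0.8)
print(result)
```

In this toy run the third revision is the first to reach the recall target. The accepted prompt would still need back-end validation on a fresh sample before anyone relied on it, which is the human-managed part of the process.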

SPEAKER_00 37:22

Last question for you, Doug. If someone listening today was looking to reduce their legal risk, what should they go check on tomorrow?

SPEAKER_01 37:30

That's an interesting one, because I want to say a couple of things. Certainly, when it comes to reducing legal risk, it's understanding your data. We talk so much about the technology, but technology can't work without the data. The more you understand about your data, the better. That means having solid data governance policies in place, having technology to support that, and continuing to try to stay up on what the people in your organization may be using. With shadow IT and shadow AI challenges, there may be data sources out there in your organization you're not aware of. I would say that's probably step number one. And then step number two is to have processes and plans in place for how you handle litigation, investigations, what have you, before the case is filed, before the investigation's launched. Because if you're trying to figure it out after the case has been filed, you're already behind. You have to be prepared: how would I handle a case like this? And what processes and procedures do I need to have in place so that when that case comes along, I know step number one is look at these data sources, look at these potential custodians, set legal holds in place, all that sort of stuff. You need to have those defined up front, not once the case starts. So I would give you two, not just one.

SPEAKER_00 38:50

Well, Doug, it's always great talking with you. One of the great minds in eDiscovery. Thank you for taking time to chat with me and be on the Data Xposure podcast.

SPEAKER_01 38:59

Thanks, Mike, and great talking with you as always as well. And thanks for having me.

SPEAKER_00 39:06

Doug, really appreciate you joining us. And to our listeners: if this made you think twice about the data at your organization, share this with someone on your team, because whether you manage eDiscovery or not, it runs on your data. This is Data Xposure, the podcast for leaders managing data risk before it becomes a reality. Subscribe wherever you listen, and we'll see you next time.