 
  Connecting Society: How everyday data can shape our lives
Ever wondered what happens to all the data collected about you by government and public services? Whenever you sit a school exam, visit your GP, go to court, or pay tax, a wealth of information is created to help these services run. But how is this data used, and could it actually improve your life? 
Connecting Society explores the fascinating world of administrative data, showing how this valuable information is de-identified and used securely for research to inform better policies and support communities. 
Through conversations with experts from government, academia, community organisations, and the public, we reveal how linking data and making it available for research can uncover solutions to real-world, interconnected challenges - from improving health outcomes to tackling inequality and more. 
Join our hosts, Mark Green, Professor of Health Geography at the University of Liverpool and ADR UK Ambassador, and Shayda Kashef, Senior Public Engagement Manager at ADR UK, to discover how the data shaping your life could also help shape the future. 
Connecting Society is brought to you by ADR UK (Administrative Data Research UK). Find out more about ADR UK at https://www.adruk.org/, or follow us for updates: https://x.com/adr_uk | https://www.linkedin.com/company/adr-uk/. This podcast builds on a pilot series known as DataPod, produced by ADR Scotland.
2. Justice in the age of data
In this episode, we examine how data collected by the Ministry of Justice (MoJ) is being used to drive positive change in the justice system. Our discussion explores the types of data the MoJ collects, why it is collected, and how de-identifying, linking, and sharing this information securely for research can reveal new insights into the justice system.
Amy Summerfield, Head of Evidence and Partnerships at the MoJ, shares how data linkage programmes like the Data First initiative aim to address issues such as reoffending and improve the efficiency of justice services. We also hear from David Maguire, Project Director of the Building Futures programme at the Prison Reform Trust, who sheds light on the realities faced by people in the justice system. From the probation system to outcomes for defendants, prisoners, and the wider public, David highlights gaps in understanding and what changes are most urgently needed.
Through real-world examples, this episode demonstrates how using administrative data can contribute to better outcomes for those in the justice system, support for justice personnel, and a more efficient and effective system overall.
Wondering what administrative data is? Visit https://www.adruk.org/our-mission/administrative-data/.
If we used any terms you're not familiar with, check out ADR UK's glossary at https://www.adruk.org/learning-hub/glossary/.
For information on Data First go to https://www.adruk.org/our-work/browse-all-projects/data-first-harnessing-the-potential-of-linked-administrative-data-for-the-justice-system-169/, or for information on MoJ datasets made available by ADR UK: https://www.adruk.org/data-access/flagship-datasets/?tx_llcatalog_pi%5Bfilters%5D%5Bwork%5D=800&cHash=c420033b8cba2bed85ac90343d2aeab9.
Shayda: Hello, and welcome to Connecting Society, a podcast about how everyday data can shape our lives. I'm Shayda Kashef, Senior Public Engagement Manager for Administrative Data Research UK, or ADR UK to me and you.
  
Mark: And I'm Mark Green, Professor of Health Geography at the University of Liverpool. We are your co-hosts and guides around the wonders of administrative data.
Shayda: Mark, can you tell our listeners what's in store for today's episode?
Mark: Today, we'll be diving into the world of crime and justice and how data might be used to transform people's lives. Think somewhere between CSI Miami and The Bill. ADR UK funds a lot of research in this area, right?
Shayda: That's right, and for ADR UK, a lot of it is done via the Data First programme. I remember drafting the press release to announce the Data First programme in 2019, and the data space has changed a lot since then. The way it used to work is that the individual courts within the justice system collected administrative data, but they didn't really talk to each other, so it was hard to map out patterns of court use as people may enter and re-enter the justice system for a variety of reasons. We've got the family court, magistrates' court, Crown Court, the probation system, the prison system. What Data First allows us to do is link up these datasets in a responsible way so we can answer some of the most pressing questions, such as those related to racial injustice, substance abuse, rehabilitation and so on.
Mark: And with us today to help explain what some of the most pressing justice system issues are and how we might use data to resolve them, are David Maguire, who's Project Director for Building Futures for the Prison Reform Trust charity, and Amy Summerfield, Head of Evidence and Partnerships at the Ministry of Justice and project lead for the ADR UK funded Data First programme. 
Hello and welcome to the podcast, both of you. 
Amy: Thank you very much for having us.
Mark: So, before we get stuck in we'd like to get to know you a little bit better. So we set you a little task. So can you tell us what's your favourite statistic? Amy, would you like to go first?
Amy: Oh, my goodness. I mean, some of the justice system statistics I can't really describe as a favourite, because, you know, these are people in vulnerable situations, in some of the most challenging times of their lives. What I'll say is someone said to me when I first joined the Ministry of Justice, something like 61% of statistics are made up on the spot. So that's what I'm going to say is my favourite statistic. I'll keep it light, and I made up the 61% to add a layer to the joke.
Mark: That’s brilliant. David?
David: So Mark, I'm hoping by the end of this podcast that I'm half as excited about data and statistics as you are. A favourite statistic is, as Amy indicated, hard to think about. There's just so much data out there, and I think maybe some of that will come up in this conversation. But maybe a starter for 10 would be this: it's not a favourite, but I think it's an important one, and it's from the Justice Committee's "Public opinion and understanding of sentencing". It tells us that the average prison sentence has increased from 14 and a half months in 2012 to 21.9 months in 2021. However, when asked, almost a third of respondents to the Justice Committee survey believed that average prison sentence lengths had become shorter, of which 9% thought they had become a lot shorter.
Mark: Excellent. Well, that's, I think that's a good introduction. 
So we've got the hard question out of the way, so let's go into something a little bit easier. Amy, can you tell us a bit more about the data the Ministry of Justice collects and links together? What do you routinely collect, and what was the ambition behind making all of this data available to researchers?
Amy: When people interact with public services, any public service, that public service collects data on you or your case, for example. So when we go to the doctors, they've got our name, our date of birth, our medical history. So all public services collect a wealth of information like that, and up until now, and probably still now, it's fair to say it's vastly underused, that amount of data. Shayda indicated at the beginning, you know, it's collected on siloed systems. They don't often talk to each other. It can be fragmented.
And in the justice space, it's no different. So magistrates' courts didn't necessarily talk to Crown Courts. And when I'm saying “talk to”, I mean information sharing. What we are doing through the Data First programme is linking the datasets from across the justice system so we can get a more comprehensive picture of who's interacting with the justice system. So that covers information that is collected for operational and administrative purposes, not specifically for research, but collected at the courts. So criminal courts, family courts, civil courts, through to prisons and the probation service. That's the data we collect on a business-as-usual basis.
But through the Data First programme, which is an investment by ADR UK, we've been able to create unique identifiers through those datasets and link cases and people across the system. So as people come into the magistrates' court, they may go on to the Crown Court, they may receive a custodial sentence, then they go into prison, and then when people leave prison they'll often go under probation supervision. So we can get a big picture of people's interactions across those different parts of the justice system. So that's the kind of data that we collect already.
Instead of sitting on administrative systems, we're trying to make best use of it, both within government, but also to share it with academic researchers so they can access it and start to explore these kind of really rich data assets for some of our big evidence gaps. 
Mark: That's really fascinating and such a wealth and depth of data you have available, why wouldn't people use this? You know, you said it's vastly underused. I mean, one of those reasons, you said, is fragmentation. But why aren't we using this data? You just think it’s so amazingly detailed.
Amy: These systems weren't set up for research purposes. They're set up to support the operation of the courts or the management of prisons, for example. So up until now they've been used for what they were set up for, but we realise that there's this wealth of information sitting there. You have to be very careful that we're sharing data ethically and responsibly.
There's a lot of cleaning to do of the data, and by cleaning, I just mean there'll be lots of data that's just not completed very well, because it's not a busy prison officer's top priority to make sure that every single field on their data management system is filled out. They've got much more important things to do on the front line, if you like. So there's a lot of work that's done to unpick how good the quality of the data is before we can then see what identifiers we can link to other datasets with. So it's not that we didn't think it was a good idea before. It's just been a long time coming, really.
Mark: Absolutely, like a shout out to those people who clean data. We couldn't do our jobs without them.
Shayda: It almost gives you hope that things might change now that data is being used more efficiently and that we have better evidence to base important decisions on. David, from your perspective, working at the Prison Reform Trust, how would you describe the state of the justice system now? And can you tell us a little bit about the Prison Reform Trust as well?
David: I’ll maybe say a little bit about what the Prison Reform Trust is. I started at the Prison Reform Trust in January 2020. Before that, I was at UCL as a British Academy postdoctoral researcher. And before that, I did research on men and masculinity and education and pathways into prison. In one form or another I've been around prisons for over 20 years. I was attracted to go and work for the Prison Reform Trust, A, because it's got a fantastic reputation across our sector, and B, it was for a very particular programme that focuses on those serving the longest sentences.
But maybe just a bit about the Prison Reform Trust. So for those that don't know, the Prison Reform Trust is a charity, and it works to create a just, humane and effective prison system, and we do this by working with people like Amy and yourselves and anybody else to generate data and research. We give advice and information, facts and statistics, and we just try and highlight some of the pressing issues across our prison systems, mostly in England and Wales. My particular role is for the Building Futures programme, and this work focuses on those that serve the longest sentences. So a lot of advocacy work and charities like to focus on and advocate for people in prison, but often we dodge those serving the longest sentences, because they are the ones that the public feel, and many feel, should be in our prisons.
What we try and do in this day job is harness the expertise that we see within the prison population, so those actually serving the sentences are who we try and consult and work with on how we can create a more just and humane space for those serving those eye-watering sentences. And I think a lot of what we try and do is develop and complement other forms of knowledge and research by harnessing that lived experience.
And when we think of long term prisoners, we often don't think of women serving long sentences. So we've made them much more visible through our invisible women's workstream, looking at some of the pains and gender pains of imprisonment for women. We're looking at people ageing in prison, because, of course, ageing is a huge problem across prison populations. We're looking at how people progress through their sentence or not. So those are just some of the key things, and we harness that inside experience to add to already existing knowledges.
Shayda: Thanks, David. I know we've spoken about the power of lived experience for research, particularly research like this. You mentioned a few really interesting areas of work. What would you identify as some of the most pressing issues at the moment?
David: Well, I think any of us who work around prisons or in prisons can't get away from the capacity crisis that we are currently seeing. I think another big issue that we can't escape from is the remand population, which is a big problem. That's risen over 80% since 2019 and now accounts for almost 20% of the prison population. So the remand population are those that might be refused bail and are put on remand awaiting sentence. That's a large part of that population. And I'm laying this out because I think Data First can help us with some of the solutions and answers to some of these current crises.
Shayda: So Amy, David listed some important issues there currently being faced by the justice system. Is Data First in a position, now or in the future, to look at some of these issues?
Amy: Most definitely. For example, take the pressures on prison capacity. Obviously the Ministry of Justice does a lot of modelling and forecasting of prison capacity with slightly more live data, but with the Data First datasets, when you are linking magistrates' and Crown Court data through to prisons, you can look at which type of sentencing options may have an impact on whether somebody goes into custody, but also, longer term, on whether they're more or less likely to reoffend.
So for example, we know from other research that short custodial sentences are not effective at reducing reoffending, and actually, evidence suggests that community sentences for less than 12 months can be more effective in supporting people not to reoffend. But Data First gives us data at a scale that's not been possible before to look at those patterns. What sentencing options are likely to improve the chances of prisoners not reoffending, for example? So those are really important insights: what sentencing options lead to better outcomes for people who have committed an offence? So that would be one way in which Data First can shed light on those types of questions.
We do know at a high level that there are factors that protect against the risk of reoffending. So supporting people to get a job, supporting them to address substance misuse issues, employment, education, those types of things. But where we're lacking a little bit more is: what's the relative impact of those interventions? How do you sequence those interventions?
So, what works for who? Why? When? Those types of questions. And the Data First datasets should start to help answer those questions. So at what point in a person's interactions with different public services could we have intervened? Could we have better targeted services to support them, to divert them from the justice system, for example?
Through our flagship data share between the Ministry of Justice and the Department for Education, we've got a lot of insights around the educational and social care backgrounds of children who end up in contact with the justice system. If we are able to layer more information into the linked datasets around what kind of services or interventions these children have had experience of, we're better able to understand what works to improve outcomes and when we should be intervening. How can we improve their outcomes?
David: I can't help but agree with lots of what Amy's just laid out there, certainly focusing on my work with long term prisoners. One of the things that I think was touched upon there is who we send to prison, and for how long. For those serving the shortest sentences, the consequences, the cost to everybody involved, who it is that goes to prison and how that links to local services are things that I hope can be pulled out of some of that Data First data.
And I think the other thing, Amy, is recall: what are we sending people back to prison for? Some of these datasets might be able to give us some answers around that, and around how we stop sending people back to prison, because, again, that's a huge population.
Amy: So we've linked data to the probation datasets so we can now have much richer information around who we're recalling to prison and what for. We published some research based on the magistrates' and Crown Court linked data that provided new insights on who was returning to court and what they were returning for. I think it's around 18% or something of defendants who returned to court more than six times in the time period, you know, between 2011 and 2019, and we now know quite a lot more about what these people are coming back into prison for, the types of offences, their sentence length.
And it would be a very interesting research question to look at the recall population as well. Because, as you say, the recall population is increasing. That's then increasing the pressures on prison capacity. We need this data. We need robust, large scale data to put to ministers, to put this to decision makers, policy makers in MoJ and across government. We need that data to say this is what's happening, and this is our advice on what we think you can do to change this.
Mark: I think that's really, really exciting, because what you're saying is: how can we improve upon what data we have access to? And you say doing more at scale, at bigger scale, doing more granular work, particularly focusing it on interventions and what works and what doesn't work, and then feeding that back to ministers to make better decisions.
It's clear that you have loads and loads of data, and, you know, that's being put to really good uses. What sort of evidence do you think is missing at the minute, or what type of datasets do you think are missing that could really make a big difference, I guess, in making better evidence-based sentencing decisions, or even, you know, making a real difference to people's lives?
Amy: The thing that immediately pops into my mind is we have very limited data on victims and witnesses, and that's something that academic researchers are always saying that would benefit them in terms of the impact of crimes and things. So data on victims and witnesses would be helpful. 
The other thing is that the datasets at the moment can tell you the what, but they can't tell you the why and the how. Within our court transcripts and judicial sentencing remarks, they'll have information around mitigating and aggravating factors, so the factors that were taken into consideration when people were being sentenced. And I think if we could find a way of utilising that information from court transcripts and so on, which is not a straightforward thing to do, I think that would give us a richer picture of the how, how decisions are made.
What we are doing in Data First is we're in the process of sharing a dataset called OASys, which is offender assessment data that, basically, looks at the risks and needs of people in our prison system. So it looks at, you know, mental health needs or educational needs, or specific incidents that they've experienced while they've been in prison, so incidents of self harm or violence or adjudications, that kind of thing. That gives us a much richer picture of what their experience within prison has been like, but then also of the impact those experiences might have on, for example, their sentence length or their probation experience, and whether or not they come back into the system. We're looking forward to academic research and cross government research making use of that OASys data.
Shayda: One thing that I'm sure applies to all administrative data research, but particularly might come to mind here, is, how do we protect individuals from being profiled with this kind of research? Can you give us some information around that? 
Amy: I think I mentioned earlier, it's our responsibility to make sure that we want to maximise the use of this data, so we can get experts in academia and across government to maximise the use of it for new policy and practice insights. But we also have to be mindful that this is people's personal data, and for it to be a public asset and for people to use it, we need to be really careful with it. We have to make sure that it's only shared on secure platforms. It's only shared with accredited researchers.
All of our data we share with the ONS Secure Research Service, and that operates on the Five Safes framework, but essentially it means every project for which researchers want to use the data must be approved and accredited by data owners. All researchers that access the data must be accredited researchers, so they go through certain training, they sign up to certain conditions of access. Nothing can come out of that server that could identify an individual person. Data owners like the Ministry of Justice and the ONS SRS will not release any information from that secure server that's going to identify an individual, so whether that's a person in prison, whether it's an individual judge or a student, you can't take that information from the secure platform.
We have engaged with people with lived experiences or advocates for those with lived experiences, and generally the response is very positive about making use of that data for a public good. But we recognise that there have also been concerns from privacy lobbies that we've had to respond to, to reassure them that no one will be identified in the use of this data that we share.
Shayda: Thank you. That's really reassuring that MoJ has done so much engagement with representative groups, charities, lived experience, as you just mentioned, Amy, and one of them is the Prison Reform Trust. 
David, you spoke about the power of lived experience. We've got loads of researchers applying to access these justice datasets. If there's some advice or a message that you can pass on to them, or something that you'd really like them to be mindful of when analysing these datasets, what would that be? 
David: There's a few things that I think might be helpful to just put out there as somebody who doesn't work within these big datasets. If you speak to anybody in prison and across prison about how things are inputted about them, the data is only as good as it's inputted, right? And the way sometimes that, and Amy's alluded to this, we look at prison staff, we look at the pressures that they're under. 
We look at people who work in social work and in probation. All these services have been under huge, huge pressure for some time, and the data inputting, and the quality of that data, has been raised as being questionable by those subject to that data. So you'll speak to people in prison, and they'll often say, Dave, I got this printout, and it's just a cut and paste from 10 years ago. It's literally a cut and paste from 10 years ago, and I've gone into the parole board with this, or I've gone in to have my security reviewed, and the information is, in some cases, decades out of date.
It's about real people. It's about people we've lived next door to. It's about people we've met on the landings. It's about people we've met in the care system. It's about people we've supported. They are people to us, and this is often some of people's most painful, harrowing life experiences, whether it be a long sentence, whether it be losing your children, whether it be exclusion from school. So with these datasets, which people have to come to cold and objectively, I would always try, although people have to be detached, to reinforce that these are some of people's most painful, profound life moments.
The lived experience thing is important, but I think we often have to keep it in perspective as well. I worry sometimes that we use this lived experience as a legitimacy vehicle, and a conversation that I've been trying to instigate is how we engage much more critically with how we use and draw from lived experience, and how that is part of a knowledge and contributes to other forms of knowledge. And I think as researchers, as practitioners, as academics, we need to be much more mindful in how we're signalling our use of lived experience.
Mark: I think that's really powerful. And as a data scientist, I think sometimes we're a little bit removed from that lived experience. I'm a really big advocate for engaging with people with lived experience to make sure that what I do, and I think what we should all be doing, is relevant and impactful research and not some sort of tokenistic engagement. It should be a meaningful voice that is embedded throughout everything that we do.
David: And just perhaps one other final point is that we are in no way representative when we draw on lived experience. If you’ve been around long enough, you'll see that it's often the very same voices that we're drawing on, which is a very, very small part of the groups that we're trying to represent. So there are some very difficult issues to tackle there. And I think that in conversations going forward, we all need to be a little bit more reflective about what lived experience is and how it contributes to knowledge.
Mark: Amy, I know you've been doing some excellent work in your hub, particularly engaging with external researchers. Could you tell us a bit more about that?
Amy: In the hub, what we're trying to do is take a more strategic approach to developing the evidence base. So there are a number of research questions, a number of evidence gaps or evidence needs that we need to understand about our users and what works to improve their outcomes. We want to focus on answering those longer term questions.
We're also working on improving the use and the accessibility of evidence, as I've said before. So are we making best use of the evidence that's already out there? And in doing all of this, we're improving the collaboration that we have with academic researchers. 
Shayda: I think that collective approach in acquiring evidence is so important to tackling complex issues. I think we could be talking about this all day, really, but we're running out of time now. 
So thank you, Amy, and thank you, David, for such a fascinating conversation and for inviting us a little bit into your world. We like to close each episode with the bottom line, which is, quite simply, why should the average person care about this?
David: We send people to prison for crazy, crazy amounts of time. We send people to prison for longer sentences than nearly all our European counterparts. And if we look at what we get out of those sentences for the money we put in, and what we provide in decades of keeping people behind bars, the public would be completely at a loss. So I think that we need to work much harder to win public trust in what it is that we do and what it is that the prison service does, and how it can use data to improve outcomes.
I think, before you bring the public along, you've got to have something to show them that works. And currently, with our prisons and our probation and everything else, the public has very, very little knowledge around it, but mostly they don't have faith in it. And when we go to release somebody, it's very easy to whip them up. And rather than them saying, I have faith that the systems have worked, they have no faith that the systems have worked. And therefore, public pressure works to keep people in prison for longer and not let them progress to get out.
Amy: It's a big question. You can come at it from a number of different perspectives as well. So why wouldn't you want to understand what works to improve people's lives? Why wouldn't you want that, because it's a moral thing to do, because we have these insights, we should be using them. You could look at it from a financial perspective. If you invested in one area, you might save money in another. 
I think why data is important is because it's another layer of evidence, another layer of weight to add to the discussion and the debate. If you can say, for example, this evidence strongly suggests these short sentences don't work. It's more than a hunch. You've got it kind of a little bit more in black and white, and I've mentioned before, ministers will want to see this evidence before they start making decisions. 
Shayda: Well, that's a lot to sit and think about, so I think we should leave it there. Thank you both again for taking the time to be with us today. More information about Data First and the Prison Reform Trust can be found in our show notes.
Mark, what was your biggest takeaway from this episode? 
Mark: I think for me, it's just how much has happened in such a short period of time, and it makes me feel more excited about, you know, the amount of data that's being used to inform real world decisions.
And whilst there's still much more to come, you know, it really gives me hope that we are using data to make better decisions in how we think around justice and crime. 
Shayda: Absolutely. Can you tell our listeners what's in store for the next episode?
Mark: So on the next episode, we're going to dive into the impact of research using data from the Nursing and Midwifery Council from Scotland. 
Until next time, stay curious about how your everyday data might shape society.
 
       
      