Today, public agencies are under pressure to collect and interpret data that reveals exactly what happened when a policy was implemented. Andres Arcila joins to explain how this is changing public policy, and help us understand the truth behind public data.
Andres is a senior research data scientist with AB InBev, where he develops demand estimations and economic forecasts in the brewing industry. He also holds a PhD from Waterloo, specializing in policy evaluation and applied economics. This fall, Andres will return to Waterloo to teach a data analytics certificate program for public servants and business professionals, offered through WatSPEED.
Learn more about the Data Analytics for Behavioural Insights certificate: https://bit.ly/3cU8DeT
EXCLUSIVE OFFER FOR UW ALUMNI
Waterloo is pleased to offer 4 months of free access to LinkedIn Learning (February through May 2023). With LinkedIn Learning, you can access more than 16,000 digital courses taught by industry experts to help fuel your appreciation for lifelong learning. We offer playlists of what other alumni are learning to get you started and spark your interest!
Register now: http://bit.ly/3wP2Td0
Megan Vander Woude 0:23
As taxpayers, we have a right to know what policies our dollars support and how effective they are. Politicians and public agencies like to tell us how well their plans worked out. They want to tell us how many people were helped by a new program, how many new jobs they created, and how much money was saved because of their choices. But where do those numbers come from? And how do we know they're credible? Today, public agencies are under pressure to collect and interpret data that reveals exactly what happened when a policy was implemented. To explain how this is changing public policy, and help us understand the truth behind public data, I am joined by Andres Arcila. Andres is a senior research data scientist with BeerTech, a division of AB InBev, where he develops demand estimations and economic forecasts in the brewing industry. He also holds a PhD from Waterloo, specializing in policy evaluation and applied economics. This fall, Andres will return to Waterloo to teach a data analytics certificate program for public servants, offered through WatSPEED. Thank you so much for joining me.
Andres Arcila 1:29
Thank you for receiving me.
Megan Vander Woude 1:30
Yeah. Well, it's great to have you here. And I want to start with some of what I was saying in the intro. I'm sure that we've all seen or heard announcements about past policy interventions before where, you know, a politician might come on the news and provide how many people were helped or they'll give a certain amount of money that was saved because of a new policy. Where have those numbers come from in the past and do they actually indicate whether the policies were successful or not?
Andres Arcila 2:02
Yeah, so they usually come from analyzing very descriptive statistics that look at trends between different variables, and some of them come from simple regression techniques that probably leave out important analysis. So it's difficult to say that the outcome that we saw was directly affected by the actual policy. But don't get me wrong, right? The people and researchers did the best they could with the data that they had. The most challenging part of analyzing any policy is the availability of good data. But that's changing right now. Public agencies are much better now at collecting data and analyzing it, and also at collecting and analyzing information from their own policy interventions. And not only that: nowadays, everyone, or most of the population, owns a smartphone or some sort of wearable device. And that allows us to generate data and record data every second. And the fact that we can use that data in policy intervention, in policy analysis, is very exciting. For example, there is a recent study from UWaterloo researchers that used Google mobility data to identify the effectiveness of the restriction policies during the COVID pandemic on the number of cases in Ontario. And this is very exciting because this Google mobility data has been used in different settings, such as identifying the best place to open a new restaurant or the best place to open a store. But the fact that we are now using that information to uncover and answer these difficult and challenging questions is very exciting.
Megan Vander Woude 4:00
Yeah, that is exciting and such a great point about smartphones and how much data they can collect. So it sounds like in the past we've kind of grappled with this assumption that a correlation between two data sets means we can say that one thing caused another to happen. It sounds like this is improving with more data coming in, but maybe you can just take us back to the fundamentals here and explain what's the difference between the correlation that we might see in a data set and causation?
Andres Arcila 4:39
That's a very important question. And I'm glad that you brought it up, because this is something that I always tell my students: to make that differentiation. So correlation in general refers to the degree to which a pair of variables are related, for example, if both variables move together in any direction. Causation, on the other hand, refers to the influence by which one variable contributes to the production of the other, for example, when movement of one variable implies movement in the other, or one variable precedes the other. So there is this old mantra that says that correlation is not causation, and I can give you a lot of examples where a correlation is not causal. One of them is a very funny one, actually. A couple of years ago, there was this soccer player called Aaron Ramsey. Aaron Ramsey became famous online because every time he scored a goal, someone famous would die within the next two to three days. And it's very, very strange, because those two variables are completely unrelated. But if you go and calculate the correlation, or you go and look at the trends of those two variables, you will say, oh, something's happening here, right? But obviously, that is not true. Now, don't get me wrong, it's not that correlation is a bad thing, right? A correlation in general carries a lot of information about a causal relationship between two variables. One really good example comes from economic geography. Researchers have been using satellite information to identify the degree of development of a region, and the way they do that is by looking at the intensity of the lighting in a satellite map. It happens that the light intensity in that map is related to how much power is consumed in that region. And that power consumption is related somehow to how developed that area is. So there are all these cases where correlation is not causation, but correlation many times says a lot about causation.
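To make the distinction concrete, here is a minimal Python sketch on synthetic data. It is an illustration only, not something discussed in the conversation: the variable names (`common_driver`, `series_a`, `series_b`), the noise levels, and the sample size are all assumptions chosen for the example. Two series are generated so that neither influences the other, yet both depend on a shared hidden factor. Their raw correlation is strong, and it largely disappears once the shared factor is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, zs):
    """What is left of ys after removing a least-squares line fit on zs."""
    n = len(ys)
    mean_y = sum(ys) / n
    mean_z = sum(zs) / n
    slope = (
        sum((z - mean_z) * (y - mean_y) for y, z in zip(ys, zs))
        / sum((z - mean_z) ** 2 for z in zs)
    )
    return [(y - mean_y) - slope * (z - mean_z) for y, z in zip(ys, zs)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical hidden factor that drives both observed series.
common_driver = [rng.gauss(0, 1) for _ in range(n)]

# Neither series affects the other; each is the driver plus its own noise.
series_a = [d + rng.gauss(0, 0.5) for d in common_driver]
series_b = [d + rng.gauss(0, 0.5) for d in common_driver]

# Raw correlation between the two observed series.
raw = pearson(series_a, series_b)

# Correlation after removing the part of each series explained by the driver.
adjusted = pearson(
    residuals(series_a, common_driver),
    residuals(series_b, common_driver),
)

print(f"raw correlation:      {raw:.2f}")
print(f"adjusted correlation: {adjusted:.2f}")
```

In this setup the raw correlation comes out high even though changing one series would do nothing to the other, which is the sense in which a correlation can carry information about an underlying mechanism without itself being a causal effect.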
Megan Vander Woude 7:10
Yeah. So I love those examples, especially the first one. It's great to get a very ridiculous example to explain that. When we first spoke, Andres, I mentioned that I was somewhat familiar with this concept because of my background in marketing. So when I've written things before, you know, when essentially I'm trying to sell something, or, you know, when a politician wants to win an election, you want to tell a story to your audience about your work or the product that you're trying to sell. But if you don't have this statistical background, it's really easy to tell an inaccurate story with the numbers that you have. And this is exactly what you're trying to tackle with a new certificate program through WatSPEED. It's called Data Analytics for Behavioural Insights. Can you tell me more about the program and what public servants or other professionals could expect to learn from it?
Andres Arcila 8:04
Yeah, so this program is a three-course certificate, where students will learn all the basic statistical methodologies necessary for policy evaluation and how to implement those analyses. The first course covers all the basics of the statistical theory. The second course is a coding class on how to implement those statistical methodologies in the most important and popular statistical software, Python. And the third course is a more advanced class where we teach advanced statistical methods to uncover causal effects, and what different types of methods the student should use for different types of data. But this is obviously not only targeted at policymakers. It can be taken by any business professional who wants to work with data. So any business professional who wants to learn how to use science to make better and more informed decisions should take this course, because we teach them how to leverage the information that their companies already own. Nowadays, many companies are either developing a data science team or already have a team, or they have data and they want to use this data to make these better-informed decisions. So with this course, the student will gather the necessary skills to do that. You mentioned, for example, that you have a marketing background, right? With the methodologies that we teach here, you can go and answer a question such as, let's say: Did the last marketing campaign achieve the goal that I wanted? Did this new marketing campaign increase our volume sales? And so on. So all those questions, which are obviously causality questions, can be answered with the methodologies that the students will learn here.
Megan Vander Woude 10:08
Oh, yeah, that's great. And I think you're a really great instructor for the certificate program because you obviously have this background in policy, and you've done a lot of research in that area. But now you work for a private company, and you work in the brewing industry and use data to help beer companies make decisions. So that's pretty cool. So, I mean, getting back to public agencies: of course, this is really great for people working in public policy, but I think, you know, just as citizens, we're bombarded with data and information every day. In media, we hear all these data-based stories. Do you have any tips for people when they come across an exciting headline, or maybe they hear a public figure making a bold claim about a new policy intervention? Are there certain things that maybe we can look out for, questions that we can ask, to be more aware of what's actually going on?
Andres Arcila 11:15
This is a very hard question to answer. The best thing that I can tell you is to always look at the story behind the statistics, the story behind the data. So what is the mechanism by which the individual who's making this claim can assert that it's causality? And as you mentioned, it is very easy to lie or disguise things with statistics, and we saw this a lot during the peak of the pandemic, right? Every day we were faced with statistics: number of cases, number of deaths, number of hospitalizations, etc. And many people were using that data to come up with their own stories, or their own conclusions, about how we should handle the pandemic, what policies we should implement, and what policies we should not implement, right? So it's very difficult to answer the question, but what I will tell you is to always look at the relationship, the story behind the data. Obviously, the best way is this: if the data comes from a study, go and read the study. But it's hard, right? Because in order to understand these very complicated studies, you have to have a set of tools that will allow you to understand the conclusion of the study. But for the general public, there is nothing better than a good story behind the data in order to understand. So if the story makes sense, if the data is consistent with the story, then you could be certain that you can identify some causal relationship between them.
Megan Vander Woude 12:57
Yeah, that's really great. And thank you for answering a very complicated question. I understand that that one can be a little bit tricky. Andres, it has been fantastic to learn from you and talk to you today. Thank you so much for sharing some of your knowledge and some of the information about this new certificate program.
Andres Arcila 13:17