Environmental Professionals Radio (EPR)

NEPAccess, Big Data, and Academia with Dr. Laura Hoffman and Dr. Aaron Lien

September 17, 2021 Dr. Laura Lopez-Hoffman and Dr. Aaron Lien Episode 35

Welcome back to Environmental Professionals Radio, Connecting the Environmental Professionals Community Through Conversation, with your hosts Laura Thorne and Nic Frederick! 

On today’s episode, we talk with Dr. Laura Lopez-Hoffman, professor at the University of Arizona, and Dr. Aaron Lien, Assistant Professor of Rangeland Ecology and Adaptive Management, about NEPAccess, Big Data and Academia. Read their full bios below.

Help us continue to create great content! If you’d like to sponsor a future episode, hit the support podcast button or visit www.environmentalprofessionalsradio.com/sponsor-form

 

Showtimes: 

0:00  Intro 

1:29  Shout outs

2:03  Nic and Laura discuss Big Data

11:41  Interview with Dr. Laura Hoffman and Dr. Aaron Lien starts

15:38  Dr. Laura Hoffman and Dr. Aaron Lien talk about NEPAccess

19:43  Drs. Hoffman and Lien also talk about Big Data's role in NEPAccess

30:38  Drs. Hoffman and Lien discuss Academia's part in NEPAccess

42:41  Outro

 

Please be sure to ✔️subscribe, ⭐rate and ✍review. 

This podcast is produced by the National Association of Environmental Professionals (NAEP). Check out all that NAEP has to offer at NAEP.org.

 
Guest Bios:

Dr. Laura Lopez-Hoffman is an environmental scientist working at the intersection of natural science and environmental policy. She is a champion of using data-driven approaches to evaluate the outcomes and performance of environmental laws and policies. Laura oversees all aspects of the NEPAccess endeavor.

Dr. Aaron Lien's bio will be coming soon.
Connect with Dr. Lien at linkedin.com/in/aaron-lien-9a1b2114

 
Music Credits

Intro: Givin Me Eyes by Grace Mesa

Outro: Never Ending Soul Groove by Mattijs Muller

Support the Show.

Thanks for listening! A new episode drops every Friday. Like, share, subscribe, and/or sponsor to help support the continuation of the show. You can find us on Twitter, Facebook, YouTube, and all your favorite podcast players.

Transcript is auto-transcribed

[Intro]

Nic  
Hello and welcome to EPR with your favorite environmental enthusiasts, Nic and Laura. On today's episode, we give our shout outs, Laura and I discuss the future of big data in science, and we sit down with Dr. Laura Lopez-Hoffman and Dr. Aaron Lien to talk about the NEPAccess program they are running out of the University of Arizona. If you're a NEPA practitioner, or just interested in NEPA in general, it is a must listen. And finally, we all know that the Sahara is the largest desert in the world, but let's put it in perspective: it's the same size as the United States, including Alaska and Hawaii. Or, to put it a different way, it's about 44,000 football fields long or 1.5 million camels long. Who knew? Yeah, I just can't believe what the scale is. It's 1,000 miles wide too. So it's not just long. It's just crazy.

Laura  
I believe it, having flown over it and looked out the window and seen it, and then sat down and looked out the window an hour or two later again and still seen it. It's pretty crazy.

Nic 
Yeah, so wild. And I'm super jealous you've done that. So please be sure, as always, to subscribe, rate and review. Hit that music.

[Shout outs]

Laura
Our shout out for today goes to TBAEP and the committee that leads it. This is our sixth annual Women in STEM workshop, and every year it's sold out. It always has great leaders at the tables, so be sure to get your tickets for that. It's happening in September. And we know a lot of our listeners are new to the profession, so don't forget to check out the NAEP job board at NEAP-jobs.careerwebsite.com for new and exciting opportunities in the environmental fields. Nic and I love doing this show. If you love it too and would like to keep us doing it, we need your help. We can't do it without our awesome sponsors, so please head on over to environmentalprofessionalsradio.com and check out the sponsor form for details. Now let's get to our segment.

Nic
Sweet.

[Nic and Laura's segment]

Nic  
But yeah, so we've got to, we've got to talk big data. Let's talk about it, let's talk big data.

Laura
Let's talk about it, big data.

Nic
Big data. It is just a fun, fun word to say.

Laura
So big data, is that like Internet of Things, or is that... what's the other word? There's another word for big data.

Nic 
There's actually, there's more. Now you've caught us, you caught us, Laura, you've got us. We only know so much. No, big data is really interesting. It's one of those things that I don't think anybody ever really thinks about, because it's in everyday life, right? Even how you search your emails is big data. You know, if you type in a search engine and it tries to find the words, that's kind of big data. But what we talked about with Laura and Aaron was like a step beyond that, right? Like, I think you were mentioning that Google has started telling you the tone of your emails, which is just fantastic.

Laura 
Well, Dr. Laura pointed out that, you know, it's not just a database, it's a knowledge and engagement program. Yeah, calling it just a database, you say a database like, oh, it's a repository, that's just where data lives. But data in itself is coming to life in other ways.

Nic  
Yeah, yeah. And you know, even trying to understand the tone of an email is one of the hardest things in the world to do. It seems easy, but we screw it up all the time, just as individuals. You know, you're like, why is this really rude, mean, snarky? And then you talk to the person and they're like, oh no, I just answered it quickly, I'm sorry, I didn't mean to be. Right? Like, that happens all the time. Yeah.

Laura
I actually love that feature, 'cause it makes me rethink, right? Oh, you know, I didn't say... I was just very direct, to the point, I need to add some thank you and I appreciate you, whatever. And then it automatically goes 'ding,' thumbs up, good job, and I'm like, all right, cool.

Nic  
You're not a total jerk. Yeah. All right, that's fun but,

Laura
So, that comes from big data, huh.

Nic
That's all big data, yeah, for sure. And, you know, just thinking about that process, right, how do you get a computer to say, hey, Laura is being a jerk, tell her to stop being a jerk? That seems easy, but it's like, okay, is it short words, is it short content, is it the actual words themselves? And you basically have to have an algorithm that determines that, right? Like we talked about on the show, sarcasm is impossible. It's one of the hardest things in the world for big data to understand. And, just a total side note, I feel like we're talking about big data like they're watching us all the time. Big data, that big data is going to get us, you know. But maybe that's true, actually, in some ways. I do love that it's,

Laura  
I do. I think soon there's gonna be an app that tells me and you if we're being high enough energy or smiling enough, right? For the podcast. Yeah.

Nic
You're not laughing enough.

Laura
The algorithm will check with the big data to tell, compared to other podcasts, if we're on point or not.

Nic 
Yeah, yeah. But that's the thing, big data is everywhere. It is actually in, like, speech learning, right? Talk to text is a data-driven application, and it gets better, and it has to listen to your voice to really get the baseline for it, and it's still not perfect. My wife does it every now and then, and some of the texts are hilarious. She'll say, pick up some kiwis, and it comes out like, pick up some wiwis. I'm like, okay, all right, I can't do that. Let's not do that. You know, it's not perfect, but that's where we're going with it, and I think that's pretty neat.

Laura 
Well, you said that big data doesn't understand sarcasm, or let's just say that programming doesn't understand sarcasm yet, but I really feel that autocorrect knows all about sarcasm.

Nic
Yeah, yeah, that's true. Absolutely true. It's like, oh wow, autocorrect is funnier than I am. That's hurtful.

Laura
Yeah, that's, that's on purpose.

Nic
Yeah, yeah, that's gotta sting. But I honestly think that they're gonna get there, though. That's the thing that's really crazy. It gets better all the time. You think about how the beginning of voice to text was hilarious, and now we're at a much better spot. And even having that idea that it has to listen to your voice first is a new thing, well, not new anymore, but that was like, oh yeah, this will help make this process better. So, yeah, it's in everything. I'm trying to think, even pulling up searches, like, what will this person like? A lot of what the data is doing now is going straight to what you specifically like, you know, your own personal algorithm. We all have one, and you could just go to Google News and scroll through it and see what it pops up with. Some of it will be big news, but I'm sure, Laura, yours and mine are a little different. I don't think you have any NFL preseason stuff coming up on yours.

Laura
Gosh, I hope not. I know people will complain about, like, oh, the bots are listening, and then you talk about something and they show it to you. I'm like, man, that's frickin' convenient.

Nic  
I know. It does get scary sometimes. I remember I told my wife, man, my ears are ringing, I just get this ringing in my ears, really annoying. And she's scrolling through her phone, and an ad for a tinnitus therapy shows up, and you're like, okay, they are listening. It is crazy. But that's what they're trying to do. It's like, oh yeah, ringing in your ears? Try this, you know.

Laura
Timely.

Nic
Yeah quite timely, but it is a little freaky too. There's a little bit about it that makes me nervous but I mean we're here already, it's already here.

Laura
Oh, we're here already, and it's beyond what we're even talking about.

Nic
Oh yeah, exactly, yeah.

Laura
Have you ever, have you read Hit Refresh, by Satya Nadella, the CEO of Microsoft?

Nic 
No, but tell me about it.

Laura 
Yeah, it's pretty amazing. I love the way he talks about leadership and all that, but I also really loved that he talked about some of the things they're working on. And, you know, one of the future challenges, and it crosses over with environmental, is the storage of all this data, right? The cloud is kind of like fiber optics: it's a cloud, it's virtual, but there's a real footprint associated with it. And so the next big hurdle is, I don't know if you've ever been in a server room or know about networks, but they have to be air conditioned because they generate heat, right? And so the challenge, when you have that much storage of data, is how do you keep it cool. And keeping it cool can actually help the batteries be smaller. So there are lots of companies working on this. You know how our phones now have way more storage than the first IBM computer? They're working on nanotechnologies, and once they unlock that, then we're not even talking about big data, we're talking about... you know what, that's creepy. They'll be able to map genomes and DNA sequences in a matter of seconds. So there's a next level of big data that's just on the verge of somebody figuring out data storage, you know.

Nic  
Yeah, and, you know, quantum computing and everything. This stuff comes up in my feed, talking about, is it Microsoft or Google, their time crystals that they say they actually created. I can't remember which one of them did it, but that kind of stuff is just mind-blowing to try to read about. Even with a science background, sometimes you're like, what? It's quite complicated, but really, really fascinating.

Laura 
Yeah, it's actually exciting and scary and Terminator all at the same time.

Nic  
Yeah, for sure. And, you know, I had a project where we went to a secure facility and they actually had that server room, and the way they kept it cool was with cold water, which was great, which was weird. It sounds like, why would you put water and electronics next to each other? Very bad idea. And generally, yes, but they weren't next to each other, they were just in the same room. It's just the way that they could keep things cool: they could basically route the water wherever it needed to go to dissipate heat, as long as you put it behind, I think at this point it was glass or something like that, and it was separate from the actual servers themselves. But that's the challenge with Bitcoin, right, and all the cryptocurrencies: the storage of all of this data. Like you said, it has to go somewhere, and its carbon footprint is not great.

Laura  
Right. I actually love that the last crash was sort of associated with that, in the fact that people were like, wait a minute, this is not sustainable. And now the ones that are coming out of that are the sustainable companies, the ones who said they're net zero or net negative, which is really cool.

Nic  
Yeah. And, you know, technology is driven by consumer demand a lot of times, when people say things like, I want this to be efficient, I need it to be better. Competition does that as well. It's actually probably more important, where it's like, oh, if they are doing this, then we have to do it so that we don't get behind, and if we can do it better than them, then more people will come to us. That's why you see electric cars come up: Tesla was like, we're doing this, and everyone else was like, this is dumb, and then they realized people want it. And now, bam. So hopefully the same thing happens with big data, too, so they get better at storing that data.

Laura 
Yeah, absolutely. But let's hear from our guests, who are going to talk to us about a real-life, current big data application in the environmental space.

Nic
Cool.

[Interview with Dr. Laura Hoffman and Dr. Aaron Lien starts]

Nic  
Welcome back to EPR. Today we have a very special treat for you: we have two guests, Dr. Laura Lopez-Hoffman and Dr. Aaron Lien, who are both professors from the University of Arizona. They are working on the development of an online NEPA database called NEPAccess, and we're really excited to have them here. Thank you guys for being on the show.

Dr. Laura Hoffman
Thanks.

Dr. Aaron Lien
Thank you. Happy to be here.

Dr. Laura Hoffman
Yeah, this is great.

Nic
Yeah, so we always start with the basics on our interviews, right? So I want to know what prompted your interest in environmental science and how did you get to where you are today. Laura, I'll start with you.

Dr. Laura Hoffman  
Well, actually it goes back a long ways: grandfathers, great-grandfathers, ecologists, founders of the Ecological Society of America, so it's sort of in my family, on my stepfather's side. So it was just clear what I was going to do.

Nic
That's cool. Aaron?

Dr. Aaron Lien
Yeah, I'm a first-generation college student in my family, but I grew up in a small town outside of Seattle, mountain climbing, backpacking. I just spent my whole youth outdoors, and that kind of got me interested in conservation and natural resources management, and I've been doing it my entire career.

Laura 
So we have a lot of listeners that are new to the environmental world, so students and even some career changers, and a lot of what you do seems to sort of bridge the gap between commercial work, industry, and academia, which is awesome. Have you noticed that this is kind of a shift that's happening, that there's a little bit more collaboration between the industries and the universities?

Dr. Aaron Lien 
Yeah, well, I would say that, certainly, from our perspective at the university, we see a lot more encouragement from the university administration, but also in our own work, to work with stakeholders, industry partners, local communities, and communities beyond Tucson, and to engage them in our research, because we want the work that we do to make a difference. We're not just sitting here in our offices doing research for our own edification or our own benefit. We want to see what we're doing hopefully make a difference in the world, and that's really a big part of why we're doing this work with NEPA. So yeah, I think it's becoming more and more common for folks in the academic world to be engaging outside of academia.

Laura 
I think that's really great to see. It's awesome that you have a really good dedication to mentorship of women and minorities and first-generation students. Do you have advice for our listeners about how to incorporate diversity into their programs, or things that you've found to work really well?

Dr. Laura Hoffman 
That's an interesting question. It's a very good question: advice to somebody who wants to have a more diverse team. Well, obviously, a diverse team is really important because you have people coming from all different backgrounds with all different perspectives. And I know that people say that, but it's actually true. Over the years I have worked with so many people from such different backgrounds, and that's what this project is about, really, working with all sorts of different backgrounds. One of the things that I most love is a student who came from a background different from mine, because, you know, I have had family members in this field for years, and getting them excited: somebody who has a real interest but never had the way to do it. And so the biggest thing that I say to people starting off is, it's okay, you don't have to be perfect. Your first step doesn't have to be perfect, and don't make the perfect the enemy of the good. That's my main advice for people from different backgrounds. In terms of putting a team together, just put a team together and, you know, support people and give them a chance to go, and then they go.

Laura 
Yeah, that's great, I love that. I love bringing in young people from all different spaces and just, I like to push them to the edge and say okay, now you're up.

Nic
Yep, now you've got to fly.

Laura
Awesome.

Nic 
Yeah, 100%. And Laura, you run a lab that endeavors to use science to inform environmental governance and policy, which is just really cool. There's gonna be a lot of cool stuff that you guys are gonna hear on the show today. So how have you seen data science impact policy?

Dr. Laura Hoffman 
So, thank you for that question. I think that it gets at the heart of what we're trying to do. We're very interested in NEPA, and what we're doing is using cutting-edge data science to try to do two things. One is to understand how well NEPA has worked over the last 50 years. And the second is to figure out ways of making NEPA work more smoothly and efficiently moving forward. And so data science is really important. You mentioned that we were working on the NEPAccess database. Actually, maybe it sounds like a nerdy term, but we call it a knowledge discovery and engagement platform. Knowledge is sort of the database part, right? You can go there and you can get knowledge from it. But the discovery and engagement go way beyond what somebody would normally think of as a database. We really envision it as a one-stop shop for all things NEPA, where you can go and get data, so you can get knowledge, but you can also discover new things. We're building these cool algorithms that help you get data from NEPA, or look at NEPA processes in new and different ways, and so discover things about how NEPA has worked that nobody would have ever thought possible. For example, you might be able to look across the entire country for one kind of action and connect it, even though you're crossing agencies. So that's discovering new things about it. And then engagement is a super easy, one-stop shop for people of all sorts across the country, from the local person who's worried about a project in their backyard to an environmental professional. So we're really excited, and stuff like this wouldn't be possible without data science.

Nic
It is, it is really, really cool stuff.

Laura 
And Aaron, you're an environmental social scientist with a focus on solving environmental governance challenges. There's been a strong element of understanding human behavior and decision making in your work, so can you tell us a little bit about that?

Dr. Aaron Lien  
Yeah, so I'm really interested in how policy interfaces with people's attitudes, beliefs, and preferences for natural resources management, or how we use natural resources, how we're managing forests and rangelands and other spaces. And so I approach things by doing interview work and survey work, working with people to understand what they think about things, and then looking at the policy structures to see how those things interface with one another, because policy influences what people think and what people think influences policy, right? So this is a two-way street. And if we look at things from just one direction, we probably miss some of the connections there and some of the things that are driving outcomes. The other thing that I do in my work, and that Laura and I do working together, is think about things from both the social side and the ecological or environmental science side. So we don't only need to look at how policy structures work and how people think about things, but also at how the environment responds to people's preferences and to our management actions, and vice versa. We need to think of these things as connected systems, and that's a big part of what NEPA really does. NEPA is way ahead of its time, because it's written in the preamble to the law that this is about people's influence on the environment and our place in the world, right? Understanding humans' impacts on the natural environment and looking at it as a holistic thing, that's what NEPA is all about. So by improving access to NEPA documents and access to the NEPA process, we can really help people. We hope, we think, Laura and I, to improve the outcomes that we see on the ground by making this right-now maybe somewhat confusing or opaque NEPA process more accessible to people.

Nic  
Right, yeah. And it's a great segue to diving into NEPAccess. It's a really cool program because it's literally where Laura and I meet on the environmental spectrum: you know, we're solving NEPA issues with big data, which is just an amazing thing. So you've developed this tool that helps you search through NEPA documents, and I believe it has all the EISs published since 2012, which, as a NEPA nerd, is just great. It really is so cool to rummage around in it, and I couldn't help but be impressed by the scope and scale of this project. So, like you say, you can search by industry, location, client, even your own name, which, I'm not saying I did it, I'm just saying it's possible to do it. Can you walk us through what the product is, what each of your roles is, and what your goals are for it? Dr. Laura.

Dr. Laura Hoffman 
Yeah, yeah. So actually now, I think since you last looked at it, we have almost 80% of all EISs since 2002, I believe. So we're moving fast. Walking through it by taking the knowledge, discovery and engagement framework: the knowledge is that we're trying to collect all EISs from '87 forward, and we have pretty good coverage now from 2002 forward. Making those available in one place, I think, is huge; it makes it simple for people to find stuff. Right now the EPA has a lot of stuff, but then there's stuff on different agency websites, so we're trying to collect it all in one spot, and then we're processing each of the documents so that it's full-text searchable. So it's a bit of an advantage over the past, because things weren't necessarily full-text searchable, where you could go into the document and search any word. The other thing that we're doing now, that we hope to have by the end of the year, is a little bit more comprehensive metadata than is in the EPA site. For example, one of the things that a lot of people want to know is not just the state, but who the lead and cooperating agencies are, and that's not in the EPA metadata, so we're extracting that right now. So that's basically what we hope to have by the end of the year: pretty good coverage from '87 on. I don't know if we can make it by the end of this calendar year, but we hope soon. And then a little bit more metadata. Moving forward, we want to have other kinds of NEPA documents, and then more sophisticated stuff that allows you to get into that next part, which is the discovery and engagement. For the discovery, there are some data science challenges, and tell me if I'm going too long. No, no. So, one of the challenges is turning text into data. That's a huge field of data science called natural language processing, where you understand how people write and talk in natural language and you turn that into data.
And it's actually pretty hard to do. A lot of work has been done on, say, movie reviews, or what somebody says about a restaurant on Yelp, and those are often short documents. And so one of the really cool, cutting-edge things about this project, even for the field of natural language processing, is that they're not used to working with such long documents. So we're pushing the envelope of data science by doing this project, which I think is really cool.
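
For readers curious what "full-text searchable" can mean in practice, here is a minimal sketch of one common approach, an inverted index that maps each word to the documents containing it. This is an illustration only and not a description of how NEPAccess is built; the document ids, the sample text, and the function names `tokenize`, `build_index`, and `search` are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical stand-ins for EIS text; real documents are far longer.
documents = {
    "eis-001": "Draft EIS for a proposed copper mine. Cooperating agency: Forest Service.",
    "eis-002": "Final EIS for a highway widening project near a river.",
    "eis-003": "Draft EIS for a proposed wind farm on rangeland.",
}
index = build_index(documents)
print(sorted(search(index, "draft proposed")))
```

A real system would likely add ranking, phrase matching, and text extraction from PDFs on top of this basic word-to-document lookup.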

Nic  
That's incredible. Yeah, I mean, I'm almost overwhelmed. So is that why this process is hard? Is it because the data is overwhelming, is that what,

Dr. Laura Hoffman 
Yeah. So the first problem that we had was that there wasn't a single repository of data, and so we've been collecting that. We've had students for the last year scraping every nook and cranny on the web to try to find where anything is. The second challenge is turning text into data. The third challenge is that the documents are so long. The fourth challenge is that EISs aren't necessarily done in the same order. They have to have standard information, but it's not in the same order, so you have to train the machine how to find something. Our data science, or rather natural language processing, expert, Steven Bethard, had this great analogy: you usually expect the bathroom in a house to be near the bedrooms, or sometimes sort of off the living room, but sometimes somebody puts it in the garage, and you have to have a machine that can go through 10,000 houses really fast and figure out where the bathroom is. Yeah, and so that's the next challenge, because we would like people to be able to do this discovery part, which is to zoom in and be like, I just want to find this sort of section, I want to find the preferred alternative section, and I want to be able to compare that across agencies for one certain kind of action and look at how that's changed over time. We can do that for you, but we first have to find that preferred alternative section. So that's the next challenge. And then the other challenge, in terms of engagement, is that we want to be able to go into documents and extract how much engagement there was and what kind of people engaged, and even the comments, what they had to say about it. And so, again going back to the movie reviews: you either like a movie or you hate it, and you're pretty clear about it, right? Right. So in a comment on an EIS, somebody might say something like, that mountain is just a pile of rocks, mining it is our hope and future.
That sounds like a very positive comment. And somebody else might say, that mountain is more than a pile of rocks, it's our hope and future. And so your data science algorithms for movie reviews would pick those both up as positive comments, and yet they're saying the opposite. Yeah, so that's another part of super challenging data science. It's called sentiment analysis, and it's really exciting. We have some students who've been working on that, nothing yet to share, but to distinguish between those two is super sophisticated.
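
The pile-of-rocks example can be sketched in a few lines. The snippet below is a deliberately naive word-counting sentiment scorer, assumed here purely for illustration; it is not the NEPAccess team's method, and the word lists and the function name `naive_sentiment` are hypothetical. It shows why counting positive words cannot separate the two comments: both score the same, even though they take opposite positions on the mine.

```python
import re

# Hypothetical, tiny word lists chosen only for this illustration.
POSITIVE = {"hope", "future", "good", "great"}
NEGATIVE = {"bad", "terrible", "awful"}


def naive_sentiment(comment):
    """Score a comment as (count of positive words) minus (count of negative words)."""
    words = re.findall(r"[a-z']+", comment.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


pro_mining = "That mountain is just a pile of rocks, mining it is our hope and future."
anti_mining = "That mountain is more than a pile of rocks, it's our hope and future."

# Both comments contain "hope" and "future", so the scorer rates them identically.
print(naive_sentiment(pro_mining), naive_sentiment(anti_mining))
```

Telling these apart requires knowing what each comment is positive about (the mine or the mountain), which is the harder problem described in the conversation.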

Nic
Yeah, and it's something that seems really obvious and easy, but it's like trying to teach sarcasm, which is one of the hardest things to do, right? Because the words seem good, but they're not. And that's a really hard challenge.

Laura 
Yeah, my Google email will now tell me what the tone of my email is before I send it, so that seems similar to that.

Dr. Laura Hoffman 

So that's exactly what Google is using. It's called sentiment analysis, and we're going to try to do it on EIS comments.

Laura
That is cool.

Nic  
That's super cool. Wow. So Aaron, in your role in this process, are you doing more of the data side of things, or are you doing something different?

Dr. Aaron Lien  
I'm not a data scientist, so I'm not doing the natural language processing or the database development sort of work, and all of that. As a social scientist and policy expert, I'm kind of bringing that expertise to the project to understand how the NEPA process works and think through how stakeholders of different types might engage with the tools. So what do they actually need to do the things that, you know, would be helpful for engaging in the NEPA process? And so you can think about things like, you know, we're working on some tools to be able to map EISs and put them on a landscape. And so, you know, an EIS might say, well, this project takes place in Pima County, Arizona. Well, Pima County, Arizona is about the size of Connecticut and Rhode Island. Right, so that's not useful. So to make that useful you need to be able to drill down further. And so it's figuring out those things that make these tools really useful for people that are engaging in the NEPA process, either consultants or the general public, or NGOs, or federal agencies. So that's kind of my role in this project: as we start developing these tools, thinking through what we really need to do to make them useful tools and not just tools that are there.

Nic 
Right, right. Yeah, it's always a pretty important part of the process. And, you know, I can't help but wonder what actually prompted your interest in developing NEPAccess. Like, what was it about environmental policy that drew you to it?

Dr. Laura Hoffman 

So I'll take that one. So I'm an environmental scientist, and Aaron's an environmental social scientist, so we're slightly different. I'm used to looking at change on landscapes and thinking about what's happened in the past and what might happen in the future. And, you know, I was always aware of NEPA as a super big trove of information. When I started at the UofA as a very junior person, I started working with the Dean of the Law School, and I was really shocked, as somebody who likes to work with data sets and big data sets, that there wasn't even a single repository of NEPA documents. So, as an environmental scientist who likes to work with data (I'm not a computer scientist, I'm not a data scientist, but I'm an environmental scientist), I was sort of shocked that this major statute, which really has a lot of influence on our environment, had no data behind it. I mean, like, you couldn't get the data out of it, and some of the data in it, like some of the water quality reports that the consulting companies do, you can't find that data anywhere else. So that was a shock to me, and then I was also shocked that you couldn't really just track across projects and sort of look at NEPA as a process. So the data aspect of it just sort of piqued my nerdy side a little bit. I mean, I already knew it was a super important statute; the nerdy side of me just came out.

Dr. Aaron Lien 
To add a little bit to that from the policy side, from a policy scientist or social scientist perspective: you know, NEPA is the cornerstone law of US environmental policy, and it's been mimicked by most states. Most states have their own kind of version of NEPA that they use, and some are more rigorous than others, but there is something there in terms of environmental review in most states, and then many other countries have mimicked NEPA as well. But as it stands now, we have very, very little information on what NEPA actually does in terms of efficacy. You know, does it improve environmental outcomes? If it does, in what ways? So is it effective? How do we make it more effective? How do we analyze the outcomes of NEPA? This is the information we don't have, and part of the reason we don't have it is because it's just so hard to do. You have all of these NEPA documents scattered around in different agencies, not compiled. There's no systematic way of analyzing them. They're all very long. So if you do want to do an analysis, you know, right now an analysis in the published literature of 20 EISs is a really big data set. But that's a really tiny data set if you're looking for systematic knowledge, right? So that's another core challenge, and part of why I'm so interested in this work.

Nic 
Yeah, and this whole thing is really, really, I mean, I can't overstate how impressive it is, but I don't want it to go to your heads, you know. But I do want to hear a little bit more about what you guys think we'll be able to do with NEPAccess that we really couldn't do before.

Dr. Laura Hoffman 
Yeah, so no, it's not going to go to our heads, thank you for being careful of that. Just because it's a pretty hard problem. So we actually get a lot of suggestions from folks, could you do this or could you do that, and we've often thought about that already, but it's not as easy as people may think. Just trying to get some more sophisticated metadata is really tricky. So a really obvious thing is, you know, you'd like to know who the lead and cooperating agencies are, and we have to sort of extract the cooperating agencies. We'd also like to know what kind of action it is. There's nobody who's checking and saying, here are sort of 20 different kinds of, you know, typical actions, and which does this EIS represent? And so that's really useful knowledge, and we're trying to do it. And I should just say that when you build an algorithm, you actually have to give the machine something that's true. And then it starts to use machine learning to come up with an algorithm that, you know, you can check against what's true on this known data set, and then you can apply it to unknown things. But how do you give it that true thing? You have to do it yourself, by hand, in person. So we have a small army of undergrads, and Laura was mentioning that earlier, the mentoring, and they are, like, super excited. We have a bunch of undergrad NEPA nerds at the University of Arizona like you wouldn't believe, who will spend hours and hours trawling through these things to try to figure this out so that the machine learning can go into these more sophisticated things. So in the future, we would like it to be able to, well, we're using something called a network analysis, so it's just all sorts of different kinds of connections.
So we want people to discover connections that they may not have even thought of, or to be able to compare, you know, one kind of action across the country over the years and see how the analysis changed and how the alternatives that people were considering, you know, maybe changed over the years. We want people to be able to look up their own names and see who worked together on different EISs, or to look and see what kinds of groups are commenting, and if that changes from agency to agency or over the years. I mean, all sorts of kinds of connections about how NEPA's working and how people are engaging in it that you would never have imagined. And then we'd like to help fulfill the vision of a very science-based process, but also kind of a dialogue between the government and the public about the environment. So we hope that some of our tools will make it easier for the public to engage in NEPA: you know, if the public knows that there's one place where they can find information and perhaps one place where they can engage in NEPA and make comments, it makes it easier for the public to do research about similar projects of interest and reach out to other interest groups. An easy place for people to comment. And then, you know, if we can get our algorithms, like Laura was talking about earlier, going well, then agencies can get really quick feedback about what people are thinking and what the major concerns are. So we really do hope that it's a one-stop shop, and that it's more than just knowledge, but it's really like, let's discover new things, let's figure out how to do things better, faster, quicker, more efficiently, with a more engaged approach.
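
[Editor's illustration, not part of the recorded conversation] The "who worked together on different EISs" idea can be pictured as a collaboration network. The sketch below is a minimal example under stated assumptions: the EIS identifiers and preparer names are made up, and `collaboration_edges` is a hypothetical helper, not a NEPAccess function.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: EIS identifiers mapped to the people named as preparers.
eis_preparers = {
    "EIS-A": ["Rivera", "Chen", "Okafor"],
    "EIS-B": ["Chen", "Okafor"],
    "EIS-C": ["Rivera", "Patel"],
}

def collaboration_edges(records):
    """Count how many EISs each pair of people appears on together."""
    edges = Counter()
    for people in records.values():
        # Sort so each pair is stored once, regardless of listing order.
        for pair in combinations(sorted(set(people)), 2):
            edges[pair] += 1
    return edges

edges = collaboration_edges(eis_preparers)
for (a, b), n in edges.most_common():
    print(f"{a} -- {b}: {n} shared EIS(s)")
```

Each pair of names is an edge and the count is its weight. The same structure could link agencies or commenting groups instead of people.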

Laura
That's awesome.

Nic
I'm speechless.

Laura
How big is your team?

Dr. Laura Hoffman
So at any given time over the last three years we've had about 20 people. I'd say we've always had about five to seven undergraduates working on this, and we're just super proud of them, and they're the backbone of our work. We have two lead data scientists and then a number of other data scientists working on this. And I just want to give a shout out to our data science team, because we have top-notch people, people who have worked, you know, in defense applications. The mapping is useful in defense applications. And we've got Sudha Ram, one of our lead scientists, who has worked a lot with business, with really high-powered companies doing business applications, and then Steven Bethard, our natural language processing person, has worked with defense-related applications. And obviously we can't compete with the kind of funding that they could get from, you know, from business or from defense, but they're doing this because they've gotten excited about NEPA, and they care about the environment, and they care about people being engaged. And so we're super lucky to have, like, people who could be doing something else in industry, but who are willing to stay in academia and to work on this problem.

Laura 
That's fantastic. I'm also a career coach, and I just co-authored, with one of my students who works with me, an article about how to fill a policy and regulation gap. And so now I have another resource to send them to. And you're doing it. This is how you fill your gaps: get involved in projects like these. So that's really cool. I imagine most of the feedback you've been getting is really supportive. Have you gotten anything that has kind of surprised you?

Dr. Laura Hoffman  
Well, we are working really hard, as Aaron was talking about, to make it user friendly. And so you can go to the platform right now and register to log in, so it's NEPAccess, NEPAccess.org, and you can ask to register and then we'll send you a login, and we just ask that you give us feedback. And so we have been getting feedback that, you know, some things just don't make sense, or it's not easy to use, or people aren't immediately finding what they want. One of the things that we've been having trouble with is people have been looking for EISs from the 90s, and even though it says on the website we have really good coverage from 2002, they somehow miss it. And so that's a common kind of concern that we've been hearing from people. And, you know, there are just certain things about a website where, as a developer, you don't know what somebody else will stumble on. So we have a really good user experience person. Aaron's working kind of at the high level to make sure it's useful for people, but we have a team that is working specifically on, like, does this button here make sense on the side, and if not, how could I change the label, because, you know, the word "record" may mean one thing to somebody. And so that's the basic feedback: people like it, but we just want to make it super easy and intuitive to use.

Nic  
Yeah, yeah, it really is, and I wanted to ask you that. So it's a really ambitious project, so that means it has to be hugely collaborative, but I'm wondering, do you have any challenges going forward that you'd like to talk through?

Dr. Laura Hoffman 
Yeah, it is hugely collaborative. As we mentioned, we have everybody on the team from lawyers to natural scientists to social scientists and data scientists. Our biggest challenge moving forward is mapping. We want to be able to take those EISs from the past, and then also figure out how to move forward and map where the impacts are, and sort of make ecological footprints. You know, one of the nice things about NEPA is the cumulative impacts assessment, and yet that's almost impossible to do to date with the sort of existing technology. And so that mapping is really, really important, and it's not just mapping, it's like, how sophisticated do you want it to be? Do you want just sort of a polygon on a map, or do you want to be able to come back and say, here's the polygon for this kind of impact to this degree, and it's a separate, different polygon, you know, that's the water impact, here's the bird population impact? And so, to make all of those polygons, I think that's our biggest technical challenge. Aaron, I know that you agree with me.

Dr. Aaron Lien 
Sorry, I'm just waiting for the fighter jets to stop flying over my house. But yeah, I totally agree, the mapping is, I think, the big technical challenge that we're currently working on. And then, you know, in addition to some of the things Laura was saying about what we actually hope to do with the mapping and cumulative effects and things, just to talk a little bit about the challenge and why it is so hard: you know, EISs can mention lots of different places, depending on what they're about, right? And so we need to be able to differentiate between the places where a given project is taking place versus, you know, if it's referring to another project in a different place, then you're bringing in data from another place, and so that's not where this project is. And we need to be able to find a way to do that easily with the computer, because we can't just sit and read every EIS for the map, right? And so, you know, those sorts of challenges are what we're trying to figure out with machine learning and natural language processing, to be able to map these so we really can get at some of these big cumulative effects questions. And then also going forward with EISs, in terms of, you know, the discovery element of this and engaging stakeholders, it's helping communities engage in the NEPA process, so when NEPA processes are ongoing, they can see where the expected impacts are and think about how they might engage in public comments based on those geographic impacts.

Laura 
Great. So this is a huge, huge project, and I think it's going to be incredibly valuable. How are you funded for the project? Are you continuing to get funds? How's that working?

Dr. Laura Hoffman 
Thank you for asking. So we had an original startup research grant from the National Science Foundation to do this work. And, you know, the National Science Foundation is really interested in improving cyberinfrastructure and data science infrastructure. And so we've created a basic infrastructure that can help answer a lot of questions about the environment in the US. There's still a lot of work to do. So one of the exciting things is, you know, the example I mentioned earlier about hydrology: making that data about water quality available would be super helpful. And that's something that, you know, would really help a lot of fields, from hydrology to conservation biology to public health. And so we do hope to, you know, maybe convince the NSF, the National Science Foundation, and NIH to give us more funds. So we'll be applying for more funds from them, but we're also looking for private foundations and donors, and even partnering and collaborating with private industry as well. So we're definitely in a new phase of looking for funding. We're close to completing our first round of funding, and we're happy with what we've produced, but we're really excited. We've got a lot of ideas for moving forward.

Laura 
Well, good luck with your next phase. So what is next for you?

Dr. Laura Hoffman 

So the next phase is just completing the platform that we have now, trying to get as many documents going back to '87 as we can, and then just having a little bit more metadata, like lead and cooperating agency and type of action, and then starting to look for new opportunities and new collaborations so we can keep this going.

Nic 
Oh man, I'm, you know, very rarely speechless, but you can hear me struggling here. It's really cool, it's a really, really neat thing and a really neat program, and I'm really, really excited to see it implemented and then where it goes. Like, both of those things are exciting for me. So I want to thank both of you for being here. We're really close to time, but I don't want to let you go without asking if there's anything else you'd like to mention about the project or anything else you're doing before you go.

Dr. Laura Hoffman 
Just to thank you guys. I mean, you know, NAEP is one of our main user groups, and so that you think that what we're doing is useful, Nic, just keeps us going for another couple of months. Maybe, you know, in two months, you could tell us again. But it's really helpful. It's really helpful to know that folks out there, the professionals in the field, see the value in what we're doing. So thank you very much for having us on today and for being as excited about this as we are.

Nic
Yeah, absolutely, for sure, and Laura thank you as well. Yeah.

Laura 
Yeah, thank you both for being here. I think this is an excellent project. You have a lot of tasks on your hands, but it sounds like you've got a qualified team, and you're more than qualified to lead this project. So I think it's a noble endeavor, and you're gonna do a lot. You know, just having that historical data that you can search through elevates everything everyone does going forward, so I think it's really great. So thank you so much for being here and sharing with us today.

Dr. Laura Hoffman
Thank you, thank you guys both.

[Outro]

Laura
That's our show. Thank you, Laura and Aaron, for joining us today and sharing more about your project. It sounds really cool, and we can't wait to see how it comes out and changes things for the better. So please be sure to check us out each and every Friday, and also don't forget to subscribe, rate and review. Catch you next time.

Nic
See you, everybody.

Transcribed by https://otter.ai

