November 01, 2020
DataCafé
Season 1
Episode 9

US Election Special

What exciting data science problems emerge when you try to forecast an election? Many, it turns out!

We're very excited to turn our DataCafé lens on the current Presidential race in the US as an exemplar of statistical modelling right now. Typically, state election polls ask around 1,000 people in a state of maybe 12 million how they will vote (or even if they have voted already) and return a predictive result with an estimated polling error of about 4%.
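To make the arithmetic behind those figures concrete, here is a minimal sketch of the textbook margin-of-error formula for a simple random sample. The function name, the 95% confidence level and the worst-case 50/50 split are illustrative assumptions, not anything stated in the episode.

```python
import math


def margin_of_error(sample_size: int, population: int,
                    proportion: float = 0.5, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval for a proportion.

    Assumes a simple random sample. z = 1.96 corresponds to 95% confidence,
    and proportion = 0.5 is the worst case (largest variance).
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    # Finite population correction: very close to 1 when the sample is tiny
    # relative to the population, which is why the state's size barely matters.
    fpc = math.sqrt((population - sample_size) / (population - 1))
    return z * standard_error * fpc


moe = margin_of_error(sample_size=1_000, population=12_000_000)
print(f"Sampling margin of error: {moe:.1%}")
```

Under these assumptions the pure sampling error comes out at roughly 3%. Real polls are rarely simple random samples, so a somewhat larger quoted error such as 4% may reflect weighting and non-response effects; that reading is a guess rather than something the show notes confirm.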

In this episode, we look at polling as a data science activity and discuss how issues of sampling bias can have dramatic impacts on the outcome of a given poll. Elections are a fantastic use case for Bayesian modelling, where pollsters have to tackle questions like "What's the probability that a voter in Florida will vote for President Trump, given that they are white, over 60 and college educated?"
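A question of this shape can be sketched with Bayes' rule under the "naive" assumption that the features are conditionally independent given the voter's choice. Everything in the snippet below is hypothetical: the candidate labels, the priors and the likelihoods are invented for illustration and are not polling data.

```python
from math import prod


def naive_bayes_posterior(priors, likelihoods, observed):
    """Return P(candidate | observed features) under the naive Bayes assumption.

    priors:      {candidate: P(candidate)}
    likelihoods: {candidate: {feature: P(feature | candidate)}}
    observed:    list of feature names known to hold for the voter
    """
    unnormalised = {
        candidate: priors[candidate]
        * prod(likelihoods[candidate][feature] for feature in observed)
        for candidate in priors
    }
    total = sum(unnormalised.values())
    return {candidate: value / total for candidate, value in unnormalised.items()}


# Invented numbers, for illustration only.
priors = {"candidate_a": 0.5, "candidate_b": 0.5}
likelihoods = {
    "candidate_a": {"white": 0.8, "over_60": 0.4, "college": 0.3},
    "candidate_b": {"white": 0.6, "over_60": 0.3, "college": 0.4},
}

posterior = naive_bayes_posterior(priors, likelihoods, ["white", "over_60", "college"])
for candidate, probability in posterior.items():
    print(f"{candidate}: {probability:.3f}")
```

The independence assumption is what keeps this tractable, and it is also the weak point: age, race and education are correlated in real electorates, so multiplying the per-feature likelihoods can overstate or understate the evidence.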

There are many such questions, as each electorate feature (gender, age, race, education, and so on) potentially adds another multiplicative factor to the size of the demographic sample needed to get a meaningful result out of an election poll.
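The multiplicative effect is easy to see by counting demographic cells. The snippet below is a rough sketch in which the number of categories per feature and the target of 30 respondents per cell are assumptions chosen purely for illustration.

```python
from math import prod

# Hypothetical number of categories for each electorate feature.
categories = {"gender": 2, "age_band": 4, "race": 5, "education": 4}

# Every combination of categories is a separate demographic cell.
cells = prod(categories.values())

sample_size = 1_000
respondents_per_cell = sample_size / cells

# Assumed rule of thumb: at least 30 respondents in every cell,
# if respondents were spread perfectly evenly (they never are).
target_per_cell = 30
required_sample = cells * target_per_cell

print(f"Demographic cells: {cells}")
print(f"Average respondents per cell at n={sample_size}: {respondents_per_cell:.2f}")
print(f"Sample needed for {target_per_cell} per cell: {required_sample}")
```

With these assumed categories, a 1,000-person poll averages only a handful of respondents per cell, and adding one more feature multiplies the required sample again.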

Finally, we even hazard a quick piece of psephological analysis ourselves and show how some naive Bayes techniques can at least get a foot in the door of these complex forecasting problems. (Caveat: correlation is still very important and can be a source of error if not treated appropriately!)

**Further reading:**

- **Article:** Ensemble Learning to Improve Machine Learning Results (https://bit.ly/34MW3HO via statsbot.co)
- **Paper:** Combining Forecasts: An Application to Elections (https://bit.ly/3efx5nm via researchgate.net)
- **Interactive map:** Explore The Ways Trump Or Biden Could Win The Election (https://53eig.ht/2TIlAvh via fivethirtyeight.com)
- **Podcast:** 538 Politics Podcast (https://53eig.ht/2HSkwCA via fivethirtyeight.com)
- **Updated US polling map:** Consensus Forecast Electoral Map (https://bit.ly/2HY1FWk via 270towin.com)

*Some links above may require payment or login. We are not endorsing them or receiving any payment for mentioning them. They are provided as is. Often free versions of papers are available and we would encourage you to investigate.*

*Recording date: 30 October 2020*

Thanks for joining us in the DataCafé. You can follow us on Twitter @DataCafePodcast, and feel free to contact us about anything you've heard here or that you think would be an interesting topic in the future.
