Intellectually Curious

Research Reimagined: Papers You Can Talk To

Mike Breault

Justin Ross, a professor of public finance and economics, co-authored a new empirical working paper (alongside Whitney Afonso and Denvil Duncan) and built a local Model Context Protocol (MCP) server to accompany it. The server provides a structured interface that lets readers interact with the paper's underlying data in natural language through a Large Language Model (LLM). Integrating MCP servers into research papers could act as a "positive referee productivity shock" that significantly speeds up the peer review process. We dive deep!


Note:  This podcast was AI-generated, and sometimes AI can make mistakes.  Please double-check any critical information.

Sponsored by Embersilk LLC

SPEAKER_01

You know that feeling when you try to open an old zipped software file from maybe five or ten years ago? You finally get it unzipped, you click run, and you are just immediately hit with endless red flashing missing-dependency errors. I mean, have you ever felt that sheer technological despair?

SPEAKER_00

Oh, absolutely. It is a very specific type of agony. You know? Because you have the data right in front of you, but without the right environment to run it, it's just completely unreadable.

SPEAKER_01

Exactly. Well, today we are taking an incredibly optimistic deep dive into our sources. So we're looking at a Substack post by Professor Justin Ross, an article on the Model Context Protocol, and some Wikipedia history. And we're looking at a brand new tech standard that is, well, curing that exact despair in the academic world.

SPEAKER_00

Yeah, we are focusing on a new empirical working paper co-authored by Justin Ross along with Whitney Afonso and Denvil Duncan. And they've done something practically unheard of here. They attached a custom Model Context Protocol, or MCP, server directly to their research.

SPEAKER_01

Right. And I read that in the sources and immediately just hit a wall. What exactly is an MCP, mechanically speaking? Because I mean we know it's an open standard created by Anthropic, but that doesn't really tell me how it actually works.

SPEAKER_00

Well, think about the underlying problem it solves first. Traditional replication packages in science, so the raw data and code behind a paper, have a terrible interface problem.

SPEAKER_01

Oh, because of the formatting.

SPEAKER_00

Right. For you to verify a researcher's work, you have to decode their messy code, install their specific software, and basically figure out why some random variable is so crucial to the whole study.

SPEAKER_01

So if I want to see if their math checks out, I basically have to learn their personal coding quirks.

SPEAKER_00

Exactly. And the MCP fixes that. The official documentation actually describes it as a USB-C port for AI. It's a standardized translator.

SPEAKER_01

Oh, wow. Okay, a universal plug.

SPEAKER_00

Yeah, instead of an AI assistant trying to guess the syntax of a specific database, the MCP gives it that universal plug. It translates natural language into specific, coded commands that the database instantly understands.

SPEAKER_01

That is so cool. Before we get into how Ross actually used this universal plug, though, a quick word. This podcast is sponsored by Embersilk. Need help with AI training, automation, integration, or software development, or with uncovering where agents could make the most impact for your business or personal life? Check out Embersilk.com for your AI needs.

So looking at Ross's paper, he built a local server that exposes over 20 analytical tools. But the sources throw around terms like balance tests and fixed effects regressions.

SPEAKER_00

Yeah, it gets a bit dense.

SPEAKER_01

Right. To a general reader, that just sounds like academic alphabet soup.

SPEAKER_00

That is a totally fair point. Essentially, those are just complex statistical checks to make sure the data isn't biased or skewed by outside factors. And Ross set it up so you don't actually need a statistics degree to run them.

SPEAKER_01

Wait, really? How does that work?

SPEAKER_00

Well, by hooking a large language model up to the paper's GitHub repository, you can literally just type "replicate Table 1 for the specific survey sample."

SPEAKER_01

I'm stuck on how this prevents the AI from just hallucinating the math, though. Because I mean ChatGPT makes up numbers all the time. How does the server stop that?

SPEAKER_00

Because the AI isn't actually doing the math. That is the genius of the MCP. The LLM is literally just acting as the conversational interface.

SPEAKER_01

Oh, I see. So it just passes the instructions along.

SPEAKER_00

Exactly. It takes your plain English question and hands it off to those 20 constrained verified tools that Ross built into the server. The server runs the actual hard-coded math and hands the correct answer back to the AI to show you.
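
To make that hand-off concrete, here is a minimal sketch of the pattern in plain Python. It does not use the real MCP SDK and is not Ross's server: the tool name `balance_test`, the `handle_tool_call` function, the request shape, and the four-row `SURVEY` dataset are all hypothetical, invented for illustration. The point it shows is that the language model only picks a tool and its arguments, while the numbers come from fixed code on the server.

```python
import json
from statistics import mean

# Hypothetical toy data standing in for a paper's survey sample.
SURVEY = [
    {"group": "treated", "age": 40.0},
    {"group": "treated", "age": 44.0},
    {"group": "control", "age": 38.0},
    {"group": "control", "age": 42.0},
]


def balance_test(variable: str) -> dict:
    """Toy balance check: difference in group means for one covariate."""
    treated = [row[variable] for row in SURVEY if row["group"] == "treated"]
    control = [row[variable] for row in SURVEY if row["group"] == "control"]
    return {
        "variable": variable,
        "treated_mean": mean(treated),
        "control_mean": mean(control),
        "difference": mean(treated) - mean(control),
    }


# The server only exposes the tools registered here (hypothetical registry).
TOOLS = {"balance_test": balance_test}


def handle_tool_call(request_json: str) -> str:
    """Run a named tool with the given arguments and return JSON.

    The model's job ends at producing the request; it never computes
    the statistics itself.
    """
    request = json.loads(request_json)
    name = request["tool"]
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    result = TOOLS[name](**request.get("arguments", {}))
    return json.dumps({"result": result})


if __name__ == "__main__":
    # What a model might emit after reading "check balance on age".
    call = json.dumps({"tool": "balance_test", "arguments": {"variable": "age"}})
    print(handle_tool_call(call))
```

Because anything outside the registry is rejected, a request for a tool the author never wrote comes back as an error rather than as an invented number.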

SPEAKER_01

Wait, so if the data is instantly interactive and mathematically sound, doesn't that completely eliminate the months of back and forth you usually see in peer review?

SPEAKER_00

Oh, absolutely. It strips the waiting time entirely out of the scientific method.

SPEAKER_01

Which is usually a nightmare, right?

SPEAKER_00

Yes. Normally a reviewer asks an author to drop a subgroup or add a control variable. The author reruns it, writes a response, the editor waits, the reviewer rereads. I mean, it takes months, sometimes years.

SPEAKER_01

But with this, the reviewer can just ask the server to run that alternative test themselves instantly.

SPEAKER_00

Precisely. And Ross is already pushing this forward to make it even easier. He is currently developing a simpler URL-based version of the MCP, so you won't even need to download files to your computer.

SPEAKER_01

That makes it so much more accessible.

SPEAKER_00

It really does. And he's also submitted a grant proposal at Indiana University for a dedicated campus server to pilot this system for other researchers.

SPEAKER_01

This is just a massive, hopeful leap forward for human knowledge sharing. I mean, the PDF solved how to make a scientific argument easy to store, but the MCP solves how to let you truly investigate it.

SPEAKER_00

Yeah, we are moving toward a world where science isn't just a static document you consume, but a dynamic environment you explore.

SPEAKER_01

Which raises a really wild thought. If every data set becomes an interactive, conversational AI server, it makes you wonder if future generations will even know how to read a static chart or if they'll only know how to interview one. It is an incredibly exciting frontier. Well, if you enjoyed this podcast, please subscribe to the show. And leave us a five-star review if you can. It really does help get the word out. Thanks for tuning in.