AI & Marketing Research with Dr. Eva Wolf
Not another AI news podcast. This is a research radar — a twice-weekly briefing that surfaces peer-reviewed studies on AI and marketing, tells you what the evidence actually says, and helps you decide what's worth a deeper read.
AI Ads, Trust & When Simple Beats AI: 2 Research Signals
Here's the uncomfortable question this week. Is the AI tool you're paying for actually better than something free from 2004?
SPEAKER_01That's the thread running through today's papers. AI is reshaping how commercial influence works, and whether newer always means better is suddenly an open question.
SPEAKER_00We screened 140 papers. Two made the radar.
SPEAKER_01Quick caveat: this is a first-pass research briefing, not a final academic review. We'll tell you what the papers suggest, what they don't prove, and which ones deserve a deeper read.
SPEAKER_00Both papers are preprints. We'll flag that as we go. Okay, let's get into the first one.
SPEAKER_01Paper one. Every major AI platform, Google, Microsoft, OpenAI, is already a commercial channel, and most of your audience has no idea how deep that goes.
SPEAKER_00That sounds like hype until you look at what's actually underneath it.
SPEAKER_01Right. Two researchers, Chu and May, map the entire landscape of commercial influence inside AI systems, not just the labeled ads, everything.
SPEAKER_00The invisible stuff.
SPEAKER_01Exactly. They reviewed public documentation from six platforms as of May 2026: Google AI Overviews, Copilot, ChatGPT, Meta AI, Perplexity, and Claude. And they built a taxonomy.
SPEAKER_00What does it look like?
SPEAKER_01Four levels. Level one, a product gets mentioned. Level two, information is framed to favor a product. Level three, the AI nudges you toward an action: click, buy, make a booking. Level four, over time, the AI shapes what you actually want.
SPEAKER_00Level four is the one that keeps me up at night.
SPEAKER_01Because it never looks like an ad.
SPEAKER_00It looks like a preference. I just like this brand. No, the AI built that.
SPEAKER_01And here's the nuance. Right now, all six platforms keep paid ads visually separate from AI answers, labeled boxes, above or below the response, not mixed in.
SPEAKER_00So they're not doing the scary thing yet.
SPEAKER_01Not officially, but research already shows that when ads are woven into AI responses, even in experiments, people can't tell a paid recommendation from a genuine one.
SPEAKER_00The separation is one policy change away from collapsing.
SPEAKER_01And there's a finding buried in this paper that marketers should sit with.
SPEAKER_00What is it?
SPEAKER_01Organic ChatGPT referrals to e-commerce sites are already converting at higher rates and generating more revenue per visit than paid social ads.
SPEAKER_00Wait, that matters.
SPEAKER_01No ad buy, no campaign. The AI just mentions you, and the buyer converts better than from a paid click.
SPEAKER_00Because the AI gave them a recommendation, not an ad.
SPEAKER_01That's the inference, yes.
SPEAKER_00Plain English payoff. AI is already a major commercial channel, and the brands winning it aren't running ads. They're getting cited. Money move. If you're not actively working on generative engine optimization, meaning getting your brand into AI answers organically, you're already behind. I'd redirect a slice of paid social budget there this quarter.
SPEAKER_01That's a real shift. Worth thinking through before you move money.
SPEAKER_00Agreed, which is why the Friday action is small. Pull up ChatGPT and Perplexity. Search for the problem your product solves. If you don't appear in the answer, that's your GEO starting point.
SPEAKER_01Evidence check: preprint, and entirely theoretical. The four-tier framework is proposed, not empirically tested, and that e-commerce conversion number comes from a cited third-party data set, not the authors' own research.
SPEAKER_00So the framework is smart, the direction is clear, but we're not looking at controlled experiments.
SPEAKER_01Not yet.
SPEAKER_00Radar verdict. Read now. Even as theory, this is the most complete map of AI commercial influence that exists, and it's directly useful for anyone planning AI channel strategy.
SPEAKER_01Paper 2. This one's going to feel counterintuitive.
SPEAKER_00Good. I'm ready.
SPEAKER_01Researchers tested six methods for identifying the right expert reviewers for scientific proposals. Six methods, including GPT-4o mini and transformer-based embeddings.
SPEAKER_00And the AI won?
SPEAKER_01No, a statistical method from the 90s won.
SPEAKER_00Say that again.
SPEAKER_01TF-IDF, term frequency-inverse document frequency. It counts how often specific words appear in a document and gives more weight to words that are rare across the other documents. That method correctly identified a true expert in its top 25 recommendations nearly 80% of the time.
SPEAKER_00And GPT-4o mini?
SPEAKER_01Just over 51%.
SPEAKER_00That's nearly 30 percentage points.
SPEAKER_01On a benchmark of 435 proposals and 379 reviewers, not a tiny test.
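One way to read "a true expert in its top 25 recommendations" is as a hit rate at k: the share of proposals where at least one known expert appears among the first k names a method returns. The sketch below assumes that reading. The function name, the reviewer and proposal IDs, and the two toy rankings are all hypothetical and are not data from the paper; k is set to 2 only because the toy lists are short.

```python
def hit_rate_at_k(rankings, true_experts, k=25):
    """Share of proposals with at least one true expert in the top k.

    rankings:     {proposal_id: [reviewer_id, ...]} ordered best match first
    true_experts: {proposal_id: set of reviewer_ids judged to be real experts}
    """
    hits = 0
    for proposal_id, ranked in rankings.items():
        experts = true_experts.get(proposal_id, set())
        # A proposal counts as a hit if any expert shows up in the top k.
        if experts.intersection(ranked[:k]):
            hits += 1
    return hits / len(rankings) if rankings else 0.0


# Hypothetical toy data: four proposals, five reviewers.
true_experts = {"p1": {"r1"}, "p2": {"r4"}, "p3": {"r2", "r5"}, "p4": {"r3"}}

keyword_rankings = {
    "p1": ["r1", "r2", "r3"],
    "p2": ["r4", "r1", "r2"],
    "p3": ["r3", "r5", "r1"],
    "p4": ["r1", "r2", "r3"],
}
llm_rankings = {
    "p1": ["r2", "r1", "r3"],
    "p2": ["r1", "r2", "r4"],
    "p3": ["r2", "r1", "r3"],
    "p4": ["r1", "r2", "r3"],
}

keyword_rate = hit_rate_at_k(keyword_rankings, true_experts, k=2)
llm_rate = hit_rate_at_k(llm_rankings, true_experts, k=2)

# The gap between two rates is expressed in percentage points.
gap_points = (keyword_rate - llm_rate) * 100
print(f"keyword: {keyword_rate:.0%}, llm: {llm_rate:.0%}, gap: {gap_points:.0f} points")
```

The same function works for any two matching methods, as long as each returns a ranked list per item and there is a trusted set of correct answers to score against.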
SPEAKER_00Okay, why? Why does the simple thing beat the smart thing here?
SPEAKER_01Because in specialized fields, tiny differences in vocabulary matter enormously. Subtopic names, niche terminology, LLMs smooth those differences away. They're built to generalize meaning.
SPEAKER_00Which is exactly what you don't want when the whole job is finding someone who knows this exact thing.
SPEAKER_01Right. The generalization that makes LLMs great at broad tasks makes them worse at narrow matching.
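To make the vocabulary point concrete, here is a minimal TF-IDF matching sketch in plain Python. It is not the pipeline from the paper: the tokenizer, the smoothed IDF formula, the cosine scoring, and the reviewer profiles are all assumptions chosen for illustration. It shows how exact, rare terms drive the ranking, which is the property the hosts describe.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately naive: lowercase and split on whitespace.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document in docs."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: the number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # Smoothed IDF (an assumed variant): rarer terms get larger weights.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in counts.items()}
        )
    return vectors


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_reviewers(proposal, profiles, k=25):
    """Return up to k (reviewer, score) pairs, best match first."""
    names = list(profiles)
    vectors = tfidf_vectors([profiles[name] for name in names] + [proposal])
    proposal_vec = vectors[-1]
    scored = [(name, cosine(proposal_vec, vec)) for name, vec in zip(names, vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical reviewer profiles and proposal text.
profiles = {
    "reviewer_a": "exoplanet transit photometry light curves atmospheres",
    "reviewer_b": "galaxy cluster weak lensing dark matter halos",
    "reviewer_c": "stellar spectroscopy abundances metal poor stars",
}
proposal = "transit photometry of exoplanet atmospheres"

ranked = rank_reviewers(proposal, profiles, k=3)
print(ranked[0][0])
```

Because every score traces back to specific shared words, you can inspect the weights and see why a reviewer was picked.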
SPEAKER_00Let me translate that for marketing. If you're targeting a specialized audience, B2B technical buyers, medical professionals, legal teams, and you're using an AI recommendation engine.
SPEAKER_01You might be worse off than keyword matching.
SPEAKER_00Which nobody wants to hear after buying the AI platform. Plain English payoff. In specialized domains, the AI that sounds smarter may actually be matching worse, and a keyword counter might outperform it. Try this by Friday. If you're running an AI-powered recommendation or audience matching tool, ask your vendor one question. What's your accuracy versus a simple keyword baseline? If they look confused, that's your answer.
SPEAKER_01Evidence check. This study was conducted entirely in astronomy. One field, one organization's peer review cycle. The leap to marketing audience targeting requires real inference. And GPT-4o mini isn't the most powerful LLM available. Larger models might perform differently.
SPEAKER_00So don't throw out your AI tools, but don't assume more expensive means more accurate.
SPEAKER_01That's fair.
SPEAKER_00Radar verdict: watch list. The finding is surprising and the methodology is rigorous, but we need to see this replicated in marketing-adjacent contexts before it changes your stack. Okay, so here's what I think is actually happening this week. Both papers are telling us the same thing from different angles. AI is powerful, it's reshaping everything, and the way it works is not what most people assume.
SPEAKER_01Paper 1 says AI commercial influence is deeper and more invisible than the ad labels suggest. Paper 2 says AI's intelligence can be a liability when precision matters more than generalization.
SPEAKER_00Here's the tension. And simultaneously, the research is saying, slow down. Ask what the AI is actually doing. Ask whether it's actually better.
SPEAKER_01The through line is auditability. Paper one argues trustworthy AI advertising requires influence to be attributable and measurable. In paper two, the winning method was transparent. You can see exactly why TF-IDF picked a reviewer.
SPEAKER_00The black box is not a feature, it's a liability. Here's the playbook from this week. One, search for your brand in ChatGPT, Copilot, and Perplexity this week. If you're not in the answers, GEO is your next priority. Two, if you're labeling AI-assisted content, do it voluntarily. Don't wait for regulators. Transparency is your edge while trust is still fragile. Three, if you're running any AI-powered matching tool for content, for audiences, for anything specialized, ask your vendor for a baseline comparison. Make them show you what they beat.
SPEAKER_01Evidence check on all of that. Both papers today are preprints. The taxonomy in paper one is theoretical. The benchmark in paper two is from one domain. Use them to decide what to test, not what to blindly believe.
SPEAKER_00Small experiments, not big bets, not yet.
SPEAKER_01Links to both papers are in the show notes. Read the originals before making major decisions.
SPEAKER_00See you Thursday. And if something from this episode changed how you think about an AI tool you're using, I genuinely want to hear it.