
AI & Marketing Research with Dr. Eva Wolf

Not another AI news podcast. This is a research radar — a twice-weekly briefing that surfaces peer-reviewed studies on AI and marketing, tells you what the evidence actually says, and helps you decide what's worth a deeper read.

AI Chatbot Trust, Cold-Start Ads & AI Disclosure: 3 Research Signals

June 01, 2026

0:00 | 17:14

Is everything we assume about chatbot design (the personalization, the warm tone, the friendly AI) actually doing what we think it's doing? This week, three studies landed on the radar that challenge assumptions baked into nearly every conversational AI and ad tech strategy right now. The findings are counterintuitive enough to warrant a pause and an audit.

In this Research Radar Brief, Dr. Eva Wolf reviews 3 recent AI marketing research papers covering conversational AI trust and reliance, cold-start ad personalization using large language models, and the effects of AI disclosure on brand authenticity. This is a first-pass research briefing, not a final academic review. Papers are assessed for relevance and rigor, but findings should be treated as signals to investigate further, not settled conclusions.

What you'll learn:

- Why personalizing your AI chatbot's explanations may actually reduce its persuasiveness when used alone, and what happens when warmth is added
- Why higher AI literacy did not make users more skeptical of AI advice, and what that means for tech-savvy, B2B audiences
- How Walmart used an LLM to generate ad ranking weights from creative content before a single click, and the real-world results from their deployment
- Why AI-generated visuals without disclosure can damage brand trust, and why disclosing AI use acts as brand insurance rather than a trust differentiator

Papers covered:

1. Personalized to Persuade: The Effects of Contextualization and Warmth on Trust and Reliance in Conversational AI
   Source type: Preprint (not yet peer-reviewed)
   Access: Full text reviewed
   Source: https://arxiv.org/abs/2605.31275v1

2. LLM-HYPER: Generative CTR Modeling for Cold-Start Ad Personalization via LLM-Based Hypernetworks
   Source type: Preprint (likely peer-reviewed venue; formal status uncertain)
   Access: Full text reviewed
   Source: https://arxiv.org/abs/2605.31275v1 (see show notes for correct link)

3. Opening AI: A Study of Transparency's Impact on Brand Authenticity and Trust in Visual Advertising
   Source type: Master's thesis (not peer-reviewed)
   Access: Full text reviewed
   Source: Link in show notes

Full show notes, transcript, and citations: https://bigplans.media/episodes/ai-chatbot-trust-cold-start-ads-disclosure-research-2026-06-01

DISCLAIMER: This episode is a first-pass research briefing produced by an AI-generated avatar trained on Dr. Eva Wolf's research framework. It is not a substitute for reading the original papers. Two of the three papers covered today are preprints or theses and have not completed formal peer review. Findings should be treated as early signals, not settled evidence.

This is a first-pass research briefing, not a final academic review. Read the original papers before making major marketing or business decisions. AI & Marketing Research Radar is produced by BigPlans Media. Subscribe wherever you listen to podcasts.

SPEAKER_00 0:12

You're listening to Evita, an AI-generated research briefing avatar trained on the research framework and methodology of Dr. Eva Wolf, marketing professor, AI researcher, and founder of BigPlans Media. Every day, Evita scans emerging research in AI, marketing, consumer behavior, psychographics, and business strategy to identify the most relevant developments, opportunities, and risks worth watching. These daily radar reports are designed to help busy professionals stay informed without having to read hundreds of research papers themselves. And every Friday, join Dr. Eva Wolf live for her personally recorded weekly AI marketing radar roundup, where she breaks down the biggest stories, explains what actually matters, and shares practical insights and strategic implications for marketers, educators, entrepreneurs, and business leaders. Now here's today's radar report.

Here's the uncomfortable question this week. Is everything we assume about chatbot design, the personalization, the warm tone, the friendly AI, actually doing what we think it's doing? Because three papers landed on my radar, and together, they're telling a story that should make any marketing team stop and audit their assumptions. We screened 356 papers; three cleared the full-text bar and made the radar. Quick caveat: this is a first-pass research briefing, not a final academic review. Every paper today has full-text access. I'll tell you what the papers suggest, what they don't prove, and which ones deserve a deeper read. Okay, let's get into the first one.

Paper one. You've been told a personalized, warm AI chatbot is more persuasive, more trustworthy, more likely to get customers to act. This paper says, not so fast. Researchers ran a clean two-by-two experiment. Four groups. Some got a personalized AI, some got a warm AI, some got both, some got neither. And in every case, the AI was arguing against an expert opinion the participant had already received. Here's what happened.
Personalization alone made the AI less persuasive, not more. When the AI tailored its explanation to your background without warmth, people pushed back harder. Add warmth to that personalization? Persuasion came back. But only to roughly where it started, with no features at all. You're spending real effort to get back to neutral.

Okay, here's the piece I actually can't stop thinking about. People who knew the most about AI, the highest AI literacy in the study, trusted the chatbot less. That makes sense. But then they followed its advice more and were more persuaded by it. I'm telling you, knowing AI can be flawed did not make these people more skeptical in practice. It made them more likely to go along with it. That is the finding that keeps me up at night.

And here's the other thing. Across all four conditions, people relied on the AI over the human expert. Regardless of tone, regardless of personalization, people just deferred.

Plain English payoff. Personalization and warmth in your chatbot are not the persuasion dials you think they are. And your customers are probably following AI advice over expert advice no matter what you do.

Money move. Audit your chatbot's personalization features. Because if reliance is happening anyway, you might be overinvesting in expensive personalization that isn't moving the needle.

Try this by Friday. Take your existing chatbot flow and ask your team honestly: do we have evidence our personalization settings are changing outcomes? Or are we just assuming? Schedule the audit. That's step one.

Evidence check. Preprint. Not yet peer reviewed. The scenario was a fictional budgetary decision, not a real purchase. Single controlled interaction. Directional finding, not settled science.

Radar verdict. Read now. The AI literacy paradox, more knowledge, more compliance, is counterintuitive enough that any team designing conversational AI should know about it before their next build.

Paper 2.
This one's for anyone who's ever launched a new ad and watched it underperform for two weeks because the algorithm had nothing to learn from yet. That cold-start problem? Walmart just ran a 30-day live test on a fix for it.

Here's the setup. When a brand new ad goes live, ranking systems don't know who should see it because nobody's clicked it yet. So they guess conservatively. New ads get buried. Performance tanks. Then slowly, over days or weeks, the system learns.

The researchers built a system called LLM-HYPER. The idea? Instead of waiting for clicks, feed the new ad's text and images into a large language model. Find past ads that look and sound similar. Let the AI generate predicted ranking weights before a single person has seen the ad. Think of it like this: a new employee walks in on day one. Instead of letting them fumble through their first week, you show them five examples of past successful campaigns and say, here's how those performed. Now judge this one. That's what the LLM is doing.

The results. In offline testing, this approach ranked new ads 55.9% better than the previous best cold-start method. And in the live 30-day A/B test on Walmart's homepage, it matched the performance of the main production ranking system, the one that's been learning from real clicks for months. That second part is the one that got me. New ads, no history, performing at the same level as a warm, trained production model. That's not an incremental improvement. That's collapsing weeks of ramp-up time into launch day. And the way they kept it fast at serving time? All the expensive AI computation happens offline, before the ad goes live. By the time a shopper loads the page, the weights are already there. No latency hit.

Plain English payoff. AI can now read a brand new ad's creative and predict who should see it before any real person has clicked it. So new campaigns hit the ground running instead of crawling through a learning phase.

Money move.
If you run seasonal campaigns, Black Friday, holiday, back to school, the money is in generating your AI-predicted ranking weights before launch day, not after. That first week is where budget gets wasted. And it doesn't have to be.

Try this by Friday. Map your next new campaign launch and identify the cold-start window. How many days before the algorithm catches up? That's your cost. Now you've got a number to justify evaluating tools that address it.

Evidence check. Walmart's proprietary data. One platform, one ad format. And the system needs past similar campaigns to pull examples from. If your catalog is genuinely new, the retrieval step has nothing to work with. Don't overgeneralize from one very large e-commerce context.

Radar verdict. If you're anywhere near programmatic advertising or e-commerce ad ops, this belongs on your reading list this week.

Paper 3. Should you tell your customers when an ad was made with AI? That question is sitting in most marketing teams' backlogs right now, labeled "We'll figure it out later." A master's thesis out of a Norwegian business school says, don't wait.

Researchers ran an experiment with Norwegian consumers. Three conditions: AI used and disclosed, AI used but not disclosed, no AI at all. Then they measured brand authenticity, how real and genuine the brand felt, and how that affected trust.

Here's the finding. When AI was used but not disclosed, perceived brand authenticity dropped. And that drop dragged trust down with it. The damage ran through consumers feeling like the brand wasn't being genuine. It wasn't a direct reaction, it was mediated, which means it's quieter and harder to catch until it's already happened.

Now here's the part people get wrong when they hear this. Disclosure did not boost trust above the no-AI baseline. Telling people your ad was AI-generated didn't make them trust you more. It just prevented the trust damage that comes from hiding it. So the mental model is: AI disclosure is brand insurance, not a brand differentiator. You're not getting extra credit, you're avoiding a penalty.
And the consumers who found out about undisclosed AI use? They used the word "misleading." That's not a complaint, that's a reputational risk.

Plain English payoff. Hiding AI in your ads damages how real your brand feels, and that hits trust. Disclosing it doesn't make you look better, but it protects everything you've already built.

Money move. Add "AI tool used, disclosed" as a mandatory checkbox in your creative approval workflow. Not because the law requires it yet, but because the trust cost of not doing it is already showing up in research.

Try this by Friday. Pull one recent AI-assisted ad your team ran. Was there a disclosure? If not, that's your gap. Write the one-line label you'd add: "This image was created with AI." That's it, that's the work.

Evidence check. Master's thesis, not peer-reviewed, Norwegian consumers only, convenience sample. We don't even know the exact sample size from what's available. The directional finding is credible, but don't bet your entire disclosure policy on a single unpublished study from one country.

Radar verdicts. Okay, so here's what I think is actually happening this week. All three papers are pointing at the same uncomfortable truth. Our assumptions about how AI design choices affect consumer behavior are mostly wrong, or at least wildly oversimplified. We assumed personalization makes AI more persuasive. Paper one says nope. Alone, it backfires. We assumed new ads need time to learn. Paper two says nope. AI can predict from creative before a single click. We assumed staying quiet about AI use is the safe play. Paper three says nope. Quiet is the risky play.

But here's what I keep coming back to. The finding from paper one about AI-literate users. These are the people who know the most about how AI works, who trust it the least, and they are the most likely to follow its advice anyway. Your most sophisticated customers are not your most resistant customers. They're your most compliant ones.
And I don't think the industry has fully reckoned with what that means. For ethics, for compliance, for how we design these systems. The tension across these papers is this: AI is doing more than we think in shaping decisions, but not always in the ways we designed it to.

Evidence check on all of that. All three papers today are preprints or unreviewed dissertations. None are peer-reviewed journal articles. The findings are credible enough to act on at the experiment level, not the policy level. Use them to decide what to test, not what to blindly believe. Links to all three papers are in the show notes. Read the originals before making major decisions.

Want the human expert take? Join Dr. Eva Wolf every Friday for the AI Marketing Radar Roundup, where she extracts no-nonsense, money-making tips, practical strategy, and real business opportunities from the week's research. Subscribe on Apple Podcasts, Spotify, YouTube, and wherever you listen to podcasts. This is Evita for BigPlans Media, and I'll be back in the next radar brief.
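Companion sketch for Paper 2. The cold-start idea in the briefing (find similar past ads, produce ranking weights offline, then serve with no extra latency) can be illustrated with a toy example. This is not the paper's method: the LLM step is swapped out for a plain similarity-weighted average of past ads' weights, and every name here (`warm_start_weights`, `score`, the `embedding` and `weights` fields, the two-number vectors) is a hypothetical stand-in chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def warm_start_weights(new_ad_embedding, past_ads, k=2):
    """Offline step (assumed stand-in for the LLM): retrieve the k most
    similar past ads and average their ranking weights by similarity."""
    scored = sorted(
        ((cosine(new_ad_embedding, ad["embedding"]), ad) for ad in past_ads),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    # Ignore negative similarities so dissimilar ads contribute nothing.
    total = sum(max(sim, 0.0) for sim, _ in scored)
    if total == 0:
        # Mirrors the caveat in the briefing: a genuinely new catalog
        # gives the retrieval step nothing to work with.
        raise ValueError("no similar past ads to retrieve from")
    dim = len(scored[0][1]["weights"])
    return [
        sum(max(sim, 0.0) * ad["weights"][i] for sim, ad in scored) / total
        for i in range(dim)
    ]


def score(weights, user_features):
    """Serving step: a dot product against weights computed before launch."""
    return sum(w * x for w, x in zip(weights, user_features))


# Hypothetical past ads with made-up embeddings and ranking weights.
past_ads = [
    {"embedding": [1.0, 0.0], "weights": [0.8, 0.2]},
    {"embedding": [0.0, 1.0], "weights": [0.1, 0.9]},
]

# Computed once, offline, before the new ad goes live.
new_ad_weights = warm_start_weights([1.0, 0.0], past_ads, k=1)

# Computed per page load; no model call is needed at this point.
ranking_score = score(new_ad_weights, [1.0, 1.0])
```

The design point the sketch tries to capture is the split between the two steps: the expensive work happens once per new ad, and the per-shopper work is a cheap lookup and multiply.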