Heliox: Where Evidence Meets Empathy 🇨🇦

"Just Predicting The Next Word": How Our Own Brains Resemble AI

by SC Zoomers Season 6 Episode 23


📖 Read the companion article

📡 Now available for broadcast on PRX

Are you building sentences like an architect—nested grammatical trees and clean constituents? Or are you laying down the next available brick, relying on linear probability-driven chunks that “shouldn’t” exist as units at all?

In this Deep Dive, we unpack new evidence that the brain represents certain non-hierarchical language structures (like VERB + PREPOSITION + DETERMINER: “sat on the”) as real cognitive objects. The findings converge across priming experiments, eye-tracked reading, and natural conversation data—suggesting that everyday speech is often optimized for speed under the “now-or-never bottleneck.”

We end with the provocative mirror this holds up to AI: if humans often speak by surfing probability, what does that mean for how we judge next-word prediction models?

References

Nielsen & Christiansen (2026), "Evidence for the representation of non-hierarchical structures in language," Nature Human Behaviour.

Series:

The Predictive Mind: How Your Brain Cheats Reality (And Why That's Brilliant)

(S6 E23) "Just Predicting The Next Word": How Our Own Brains Resemble AI (Jan 28, 2026)

(S6 E28) The Brilliant Laziness of Being Human: Why Your Brain Refuses to Plan Ahead (And That's Actually Perfect) (Feb 7, 2026)

(S6 E30) Your Brain Is a Time Traveller (And It's Been Lying to You About the Past) (Feb 11, 2026)

This is Heliox: Where Evidence Meets Empathy

Independent, moderated, timely, deep, gentle, clinical, global, and community conversations about things that matter.  Breathe Easy, we go deep and lightly surface the big ideas.

Support the show

Disclosure: This podcast uses AI-generated synthetic voices for a material portion of the audio content, in line with Apple Podcasts guidelines. 

We make rigorous science accessible, accurate, and unforgettable.

Produced by Michelle Bruecker and Scott Bleackley, it features reviews of emerging research and ideas from leading thinkers, curated under our creative direction with AI assistance for voice, imagery, and composition. Synthetic voices and illustrative images of people are representative tools, not depictions of specific individuals.

We dive deep into peer-reviewed research, pre-prints, and major scientific works—then bring them to life through the stories of the researchers themselves. Complex ideas become clear. Obscure discoveries become conversation starters. And you walk away understanding not just what scientists discovered, but why it matters and how they got there.


Spoken word, short and sweet, with rhythm and a catchy beat.
http://tinyurl.com/stonefolksongs



Speaker 1:

This is Heliox, where evidence meets empathy. Independent, moderated, timely, deep, gentle, clinical, global, and community conversations about things that matter. Breathe easy. We go deep and lightly surface the big ideas.

Speaker 2:

Okay, let's start today with a question that sounds simple on the surface, but if you actually stop and try to answer it, it might just break your brain a little bit.

Speaker 1:

It's one of those ones.

Speaker 2:

It is. When you speak, and I'm talking about just normal, everyday, casual talking over coffee, what are you actually doing inside your head?

Speaker 1:

And just to clarify, we aren't talking about the mechanics of moving your mouth or, you know, vibrating your vocal cords. We are talking about the invisible engineering, the cognitive machinery.

Speaker 2:

Right. The blueprint phase. Yeah. How are you building the sentences that come out of your mouth? Are you an architect carefully constructing this complex, multi-layered scaffolding of grammar rules?

Speaker 1:

Drawing up hierarchical trees, checking the structural integrity of every clause before you speak.

Speaker 2:

Exactly. Are you doing all that or are you just grabbing the next available brick and slapping it down in a line because it just kind of fits?

Speaker 1:

It's the classic Lego versus beads on a string debate. And it's, I mean, it's one of the oldest fights in linguistics. For a long, long time, the experts really thought they had this solved. They were team architect all the way.

Speaker 2:

All in on the complex scaffolding.

Speaker 1:

100%. But today, we are looking at something brand new. I mean, this was literally released yesterday, January 21st, 2026.

Speaker 2:

That is fresh. We are practically breaking news here on the show.

Speaker 1:

It's a paper published in Nature Human Behaviour, which is a huge deal, by the way, titled "Evidence for the Representation of Non-Hierarchical Structures in Language." It's by Nielsen and Morten Christiansen. And honestly, it's a bit of a bombshell.

Speaker 2:

A grammatical bombshell. I love it. So welcome to the Deep Dive. Today, we are taking this brand new research, stacking it up against the absolute titans of linguistics. And we're talking about the people who wrote the textbooks, you know, the Chomskys and the Goldbergs.

Speaker 1:

The heavyweights.

Speaker 2:

The heavyweights. And we're going to try to figure out if we are actually sophisticated grammarians or if we're just really, really good at predicting the next word.

Speaker 1:

It sounds like such a subtle distinction, doesn't it? Tree versus line. Architect versus bricklayer. But the stakes here are actually huge. This isn't just about grammar. It's about how the human mind represents language.

Speaker 2:

How it works, fundamentally.

Speaker 1:

Exactly. It challenges decades of established theory. The generative folks, the constructionist folks, they all agree at their core that language has deep, complex, hierarchical structure. And this paper just comes along and says, well, actually, sometimes we just cheat. I love a good cheat code. It makes life so much easier.

Speaker 2:

But before we get to the cheating and before we explain why our brains might be taking these shortcuts, we have to understand the rules we're supposedly breaking. Right. You mentioned the architect model. Walk us through that traditional view. When a linguist says language is hierarchical, what does that actually mean?

Speaker 1:

Okay, so for the last, what, 60 or 70 years, the dominant view in linguistics has been that language is hierarchical. That's the architect model. The idea is that sentences aren't just strings of words like beads on a necklace. Just one after another. Right. They are groups of words that fit together into bigger groups. These groups are called constituents.

Speaker 2:

Okay, constituents. It's a very academic word. Help me visualize what a constituent looks like in the wild.

Speaker 1:

Sure. Let's take a classic simple sentence: "The cat sat on the mat." In your brain, according to the traditional theory, you don't just process "the cat sat" in a straight line, one word after another, like a ticker tape.

Speaker 2:

You don't.

Speaker 1:

No. You instinctively group "the cat" into a box. Let's call it a noun phrase; that's a constituent. Then you group "sat on the mat" into a bigger box called a verb phrase, and inside that box you have another, smaller box, "on the mat," which is a prepositional phrase.

Speaker 2:

So it's like nesting dolls.

Speaker 1:

Exactly. That is the perfect analogy. "The mat" is inside "on the mat," which is itself inside "sat on the mat." It's a tree structure. You have the trunk, which is the whole sentence.

Speaker 2:

The big branches, the smaller branches. Yeah. And finally, the leaves, which are the words themselves.

Speaker 1:

And the fundamental rule of this generative grammar, the rule that has governed the field for decades, is that you deal with the units. You respect the boxes. You respect the dolls.

Speaker 2:

Meaning you can't just slice through the middle of the boxes. You can't just, what, ignore the boundaries?

Speaker 1:

Precisely. You can't just take a slice out of the middle that ignores that hierarchical structure. A generative linguist, someone like Chomsky or Pinker, would say that a sequence like "sat on the" isn't a thing.

Speaker 2:

It's not a real unit.

Speaker 1:

It's not. It's just a coincidence that those words are next to each other. "Sat" belongs to the verb part, the action part of the sentence. "On the" starts the prepositional phrase, the location part. They are neighbors. They live next door to each other in the sentence, but they aren't family. They don't belong to the same immediate group, the same nesting doll.

Speaker 2:

Neighbors, but not family.

Speaker 1:

I like that. So in the traditional view, if I were to look inside your brain with some kind of magical microscope, I wouldn't find a file folder labeled "sat on the."

Speaker 2:

I'd find a folder for "sat" and a folder for "on the," but never the two combined.

Speaker 1:

Correct. Because storing the combination "sat on the" would be inefficient and, well, illogical, according to the hierarchy. It crosses a boundary. It violates the architecture. It's like taking half of one Lego piece and half of another and saying, this is a new piece. It's not.

Speaker 2:

It's just broken. But now, enter Nielsen and Christensen. They're coming in and saying, actually, the brain might not care about your family tree. The brain might not care about your carefully drawn boxes.

Speaker 1:

Precisely. This is where we get to the linear view. The beads on a string model. The authors argue that while we can do the complex tree stuff, we obviously can. Otherwise, we couldn't understand poetry or legal contracts or complex nested logic. But a lot of the time, our brains are just looking for a shortcut.

Speaker 2:

The cheat code.

Speaker 1:

The cheat code. They call it non-hierarchical structure. Basically, it just means linear chunks.

Speaker 2:

Linear chunks. Okay, so this is the beads on a string model we mentioned at the top.

Speaker 1:

Yes. Think about a phone number. If I tell you a phone number, 555-0199, you aren't building a grammatical tree of that number in your head.

Speaker 2:

No, of course not.

Speaker 1:

You aren't analyzing if 0199 modifies 555. You aren't looking for the noun phrase or the verb phrase of the phone number. You're just remembering the sequence: 5, 5, 5, 0, 1, 9, 9. It's flat. One thing after another. No hierarchy at all.

Speaker 2:

And this new paper suggests we do the exact same thing with language. That we treat actual words, with meaning and grammar, like digits in a phone number.

Speaker 1:

Specifically with sequences that traditional grammar says shouldn't exist as units. They focused on a very specific and very common pattern: verb + preposition + determiner.

Speaker 2:

Verb, preposition, determiner. That sounds a bit abstract. Give me a concrete example so we can anchor this in reality.

Speaker 1:

Easy. Something like "sat on the," or "ran to the," or "looked at the," or "went in the."

Speaker 2:

Okay, let's pause there. Because I think for a normal person, not a linguist, but someone just listening to this on their commute, this is the real crux of the argument. Why is "sat on the" such a rebellious, controversial example? To me, it sounds like a totally normal part of a sentence.

Speaker 1:

It does.

Speaker 2:

I probably see "sat on the" or "looked at the" ten times a day without even thinking about it.

Speaker 1:

It sounds normal because it is frequent. But grammatically, structurally, it's an impossible object. If you draw that tree structure we talked about, "sat" is the action. It's the main verb.

Speaker 3:

Right.

Speaker 1:

"On the" is the beginning of the location, the prepositional phrase. They are on different branches of the tree. There is no single constituent, no single box or nesting doll, that combines a verb, a preposition, and a determiner while leaving out the noun.

Speaker 2:

Because the noun is the most important part of that second phrase. You can't just have "on the" floating there without the mat or the chair or the floor.

Speaker 1:

Exactly.

Speaker 2:

It's just a setup for the noun. It's incomplete.

Speaker 1:

It's totally incomplete. "Sat on the" is a fragment that crosses the border. It leaves the all-important noun just hanging there. According to traditional theory, whether you're a Chomskyan generativist or a constructionist like Adele Goldberg, your brain just shouldn't store "sat on the" as a single unit. It serves no grammatical purpose.

Speaker 2:

It's a bridge that stops halfway across the river.

Speaker 1:

Perfect. It's like making a sandwich but grabbing the bottom slice of bread, the meat, and the lettuce, and leaving the top slice of bread on the counter. It's not a sandwich.

Speaker 2:

It's a mess.

Speaker 1:

It's a mess. And traditional linguistics says your brain only recognizes sandwiches: complete constituents. This paper says, no, actually, if I see that messy pile of meat and lettuce enough times, I'm going to create a mental file for it.

Speaker 2:

So familiarity beats grammar.

Speaker 1:

That's a great way to put it. The brain sees the pattern "sat on the" or "went to the" so often that it just groups them together as a chunk, completely ignoring the fact that it violates the grammatical tree. It's a purely statistical shortcut.
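
The statistical intuition is easy to see in code. Below is a minimal sketch, not the authors' method: a toy counting pass over an invented corpus that surfaces boundary-crossing trigrams like "sat on the" purely by frequency.

```python
# Illustrative sketch only: count every three-word window in a toy corpus.
# Frequent windows emerge as "chunks" whether or not they are constituents.
from collections import Counter

corpus = (
    "the cat sat on the mat . the dog sat on the rug . "
    "she ran to the store . he ran to the door . "
    "they looked at the sky . we looked at the map ."
).split()

# Slide a three-word window across the corpus and tally each trigram.
trigrams = Counter(zip(corpus, corpus[1:], corpus[2:]))

# In this toy model, any trigram seen more than once counts as a chunk,
# even one that crosses a constituent boundary, like verb + prep + det.
for trigram, count in trigrams.most_common():
    if count > 1:
        print(" ".join(trigram), count)
# -> "sat on the 2", "ran to the 2", "looked at the 2"
```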

Speaker 2:

So we are cheating the system. We're ignoring the rules of sentence architecture and just memorizing a sequence because it's convenient.

Speaker 1:

Yes. And that is a direct challenge to the idea that language is fundamentally hierarchical. If we have mental representations for these impossible structures, then the hierarchy isn't the only game in town. The brain is maintaining a second set of books, so to speak.

Speaker 2:

A shadow ledger of grammar. I love that. Okay, but how do we know this isn't just a theory? I mean, it's easy to say maybe the brain does this, but you can't exactly open up someone's skull and look for a file labeled "sat on the." You cannot. How did Nielsen and Christiansen actually prove that we have these illegal chunks in our heads? Because this is a Nature journal. They need hard data. You can't just vibe your way into Nature.

Speaker 1:

They do need hard data, and this is the cool part, the really clever part. They use a technique called structural priming.

Speaker 2:

Priming. I've heard of this in psychology. Like, if I show you a picture of a bunny, you're faster to recognize the word carrot later because the concept is already warmed up in your brain.

Speaker 1:

It's a very similar concept, but applied to the architecture of grammar. The idea is that if you use a certain sentence structure, your brain keeps that structure active for a little while. It's like a song stuck in your head.

Speaker 2:

Okay.

Speaker 1:

If you hear a waltz, da-da-da, da-da-da-da, you're likely to tap your foot in that same rhythm. In language, if I make you say a sentence with a specific structure, you are statistically more likely to use that same structure in the next sentence you say.

Speaker 2:

It gets stuck in your head, so you repeat it.

Speaker 1:

Or you process a similar structure faster, and that's the key they used. So if they can get people to process these linear chunks more quickly by exposing them to them first, it proves the chunk exists in the brain as a real thing.

Speaker 2:

Because you can't prime something that isn't there.

Speaker 1:

Exactly. If the chunk didn't exist, if "sat on the" was just three random words to your brain, you couldn't prime it. It would be like trying to prime someone to tap their foot to static noise. There's no pattern to latch onto.

Speaker 2:

Okay, so walk me through the experiment. They had a lot of people for this, right? It wasn't just, like, ten undergrads in a basement.

Speaker 1:

No, no, they did it properly. N equaled 497 for the priming experiments. That's a very solid sample size for this kind of cognitive work. They set up what's called a phrasal decision task.

Speaker 2:

Phrasal decision task. What does that look like for the participant? Paint the picture for me.

Speaker 1:

Okay. You're sitting at a computer. A sequence of three or four words pops up on the screen for a fraction of a second. Your only job is to decide as fast as you can. Is this a possible part of a sentence? Yes or no?

Speaker 2:

Simple enough.

Speaker 1:

But here's the trick. They would show you a prime, a sequence that fits that illegal pattern, verb + preposition + determiner, maybe "ran to the." You see it, you hit yes. Then, immediately after, they show you a target sequence that follows the same exact pattern, like "walked in the," and they measure how fast you respond to that second one. Speed and accuracy. If your brain has primed that linear structure from seeing "ran to the," you should be faster and more accurate at processing "walked in the." But here's the critical catch, and this is where the science gets really rigorous: they had to make sure the priming wasn't happening because of the constituents. The legal, textbook-approved structures.
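
To make that design concrete, here is a minimal sketch of the core comparison in a priming analysis. The reaction times and condition labels are invented for illustration; the study's actual materials and statistics are more involved.

```python
# Minimal sketch of the priming comparison, with synthetic reaction times.
# "primed": target preceded by a prime sharing the linear pattern
#           (e.g. "ran to the" -> "walked in the");
# "unprimed": target preceded by a structurally unrelated prime.
from statistics import mean

trials = [
    ("primed", 512), ("primed", 498), ("primed", 530), ("primed", 505),
    ("unprimed", 561), ("unprimed", 549), ("unprimed", 580), ("unprimed", 555),
]

by_condition: dict[str, list[int]] = {}
for condition, rt_ms in trials:
    by_condition.setdefault(condition, []).append(rt_ms)

primed = mean(by_condition["primed"])
unprimed = mean(by_condition["unprimed"])

# A positive difference is the priming effect: faster responses when the
# preceding sequence shared the verb-preposition-determiner pattern.
print(f"primed: {primed:.0f} ms, unprimed: {unprimed:.0f} ms")
print(f"priming effect: {unprimed - primed:.0f} ms")
```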

Speaker 2:

Right. Because "ran to the" and "walked in the" both have verbs, and they both have prepositional phrases starting. Maybe the brain is just priming verb phrase. Maybe it's just priming the box, not the specific illegal string.

Speaker 1:

Exactly. That's the alternative explanation the architects would use. So the researchers controlled for that very carefully. They set up the experiment so that the only connecting factor between the prime and the target was this specific linear non-hierarchical sequence.

Speaker 2:

How did they do that?

Speaker 1:

By mixing and matching everything else. They stripped away the ability for the hierarchy to explain the speed boost. They made sure the constituents didn't match up in a way that would explain the result. The only thing that stayed consistent was that illegal verb prep determiner pattern.

Speaker 2:

And the result. Did it work?

Speaker 1:

It worked. They successfully primed these linear structures. The study says explicitly: "We showed that it is possible to prime such linear structures even in the absence of constituents."

Speaker 2:

That's the smoking gun.

Speaker 1:

That is the smoking gun. It poses a massive challenge for accounts of linguistic representation that rely solely on those tree structures.

Speaker 2:

So despite what the textbooks have been saying for half a century, the participants' brains were treating verb + preposition + determiner as a real, repeatable unit. They were surfing the pattern. They saw "ran to the," and their brain said, oh, I know this shape. And then when "walked in the" appeared, they said, yep, same shape. Got it.

Speaker 1:

That's it. They proved that the brain has a file for these illegal chunks. It treats them as valid objects of computation.

Speaker 2:

But wait, I can hear the skeptics now. I can hear the diehard Chomskyans screaming at their speakers. Okay, so you trick some people in a lab with a rapid-fire computer task. You force them to look at weird fragments.

Speaker 1:

This isn't real language?

Speaker 2:

Right. That doesn't mean this is how I actually read a book or talk to my friends. Lab settings are artificial.

Speaker 1:

And that is a totally fair critique. It's the first one you should make, actually. Lab tasks can be weird. You're under pressure. You're pushing buttons. You're looking at words in isolation, which is why the researchers didn't stop there.

Speaker 2:

Good.

Speaker 1:

They went looking for external validity. They wanted to see if this happens in the wild.

Speaker 2:

The wild being... eavesdropping on people at Starbucks?

Speaker 1:

Essentially, yes. Well, legitimized versions of that. They did two massive corpus analyses. First, they looked at eye-tracking data. They used a large data set with about 68 participants just reading sentences naturally while a special camera tracked their eye movements.

Speaker 2:

I love eye tracking studies because we think we read smoothly, you know, like a camera panning across the landscape, but we don't, do we? Our eyes are all over the place.

Speaker 1:

No, we are so jerky. Our eyes jump. Those movements are called saccades. And then they stop or fixate on a word. And when you read something that is hard to process, your eye stays there longer. It's a direct measure of cognitive effort. If your brain is struggling to build the tree, your eye freezes for a few extra milliseconds.

Speaker 2:

Right. If I'm reading a complex word like antidisestablishmentarianism, I'm going to stare at it for a solid second. If I see the word "the," I just breeze past.

Speaker 1:

Precisely. So they looked at how long people's eyes fixated on these linear chunks. The hypothesis was: if the brain treats "sat on the" as a single prefabricated unit, a chunk, you should read it faster than if it were three separate, disconnected words that you have to build into a tree.

Speaker 2:

If you have to be an architect, it takes time. If you just grab a pre-made brick, it's fast.

Speaker 1:

That's exactly what they found. The frequency of that illegal chunk predicted how fast people read it. The more common the linear sequence was in the language, the faster the eye moved right over it. It's like the eye was sliding over a greased track.
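
As a rough illustration of that frequency effect, here is a toy regression of fixation time on log chunk frequency. The numbers are invented, not the study's data, and the negative slope is the signature described above.

```python
# Sketch of the eye-tracking logic: does chunk frequency predict how briefly
# the eye fixates on the chunk? All numbers below are invented.
from math import log
from statistics import linear_regression  # Python 3.10+

# (corpus frequency of the trigram, mean fixation time in ms), hypothetical
observations = [
    (12000, 195), (8500, 205), (5100, 214),
    (2300, 228), (900, 241), (310, 256),
]

log_freq = [log(freq) for freq, _ in observations]
fixation_ms = [ms for _, ms in observations]

slope, intercept = linear_regression(log_freq, fixation_ms)

# A negative slope is the chunking signature: the more frequent the linear
# sequence, the less time the eye spends on it.
print(f"slope: {slope:.1f} ms per unit of log frequency")
```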

Speaker 2:

Wow. So even when we're just, you know, sitting on the couch reading a novel, our eyes are skipping along these prefabricated bricks. We aren't building the cathedral word by word. We're just walking past the wall.

Speaker 1:

A wall made of these prefab sections. And then, just to be absolutely sure, they looked at conversation, at actual speech.

Speaker 2:

Yeah.

Speaker 1:

They analyzed the Switchboard corpus.

Speaker 2:

Switchboard. That sounds so retro, like something from a sci-fi movie in the 80s.

Speaker 1:

It is glorious. It's a massive database of recorded telephone conversations from the 1990s. The researchers just paired up random people from all over the country and gave them a topic to chat about.

Speaker 2:

So it's real unscripted talk.

Speaker 1:

The definition of unscripted. It's messy. It's real. It's full of ums and ahs and interruptions. It's human. They analyzed speech from 358 speakers from this database, and they looked at the duration of the words. How long, in milliseconds, does it take to physically say "sat on the"?

Speaker 2:

And let me guess, if it's a chunk, a single unit, we say it faster. We just kind of rush through it.

Speaker 1:

Significantly faster. When you say "sat on the," you blend it all together. It becomes almost one word phonetically. It's a process called reduction. Think about how we say "out of the."

Speaker 2:

Outta the.

Speaker 1:

Or I don't know.

Speaker 2:

I dunno.

Speaker 1:

Exactly. I dunno. That compression is physical proof that the brain has grouped it. If "out of" and "the" were completely separate items on different branches of a grammatical tree, we wouldn't smush them together quite so predictably. The fact that we compress them suggests they are being retrieved from memory as a single lexical bundle.
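
Here is a sketch of that duration comparison, again with invented measurements rather than the Switchboard data: shorter average durations inside frequent chunks would be the reduction effect described above.

```python
# Sketch of the corpus comparison: how long a word takes to say when it sits
# inside a frequent chunk versus elsewhere. Durations are invented.
from statistics import mean

# (word, duration in ms, True if produced inside a frequent trigram
#  such as "sat on the", else False), hypothetical annotated tokens
tokens = [
    ("on", 95, True), ("the", 70, True), ("to", 85, True),
    ("on", 140, False), ("the", 120, False), ("to", 130, False),
]

inside = mean(ms for _, ms, in_chunk in tokens if in_chunk)
outside = mean(ms for _, ms, in_chunk in tokens if not in_chunk)

# Shorter durations inside chunks are the phonetic reduction described
# above: the sequence is retrieved and produced as one bundled unit.
print(f"inside chunks: {inside:.0f} ms, outside: {outside:.0f} ms")
```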

Speaker 2:

So we have the lab proof from the priming experiment, we have the reading proof from the eye tracking, and we have the speaking proof from the old 90s phone calls. The brain is definitely doing this. Which brings us to the really big question. Why? The so what? Yeah, the so what. Why would the brain ignore the beautiful, logical rules of grammar to memorize a meaningless chunk like "sat on the"?

Speaker 1:

This brings us to the core reason. The source material points to a really fascinating concept called the now-or-never bottleneck. This is something Christiansen, one of the authors on this paper, wrote about with Chater back in 2016.

Speaker 2:

The now or never bottleneck. That sounds dramatic, like an action movie title. In a world where neurons must fire.

Speaker 1:

It is dramatic for your neurons. The idea is that speech happens incredibly fast. Syllables are flying by you at a rate of about 4 to 5 per second. The acoustic information, the actual sound waves hitting your ear, fades from your short-term memory in milliseconds.

Speaker 2:

So if you don't process it now?

Speaker 1:

It's gone forever. You can't rewind a live conversation. Your brain is in a constant race against time to process information as it comes in. Or lose it.

Speaker 2:

Okay, so the brain is under a serious time crunch. It's not sitting in a leather armchair contemplating the structure of the sentence with a pipe in its hand. It's on the battlefield.

Speaker 1:

It's in the trenches. And in that high-pressure situation, if the brain had to build a perfect rule-abiding grammatical tree for every single sentence you hear, drawing all the lines and checking all the boxes, it would get overwhelmed. It would be too slow.

Speaker 2:

It would fall behind.

Speaker 1:

You'd lose the thread of the conversation. You'd still be processing the start of my sentence while I'm already finishing the next paragraph.

Speaker 2:

I see. It's like that chef analogy again. Imagine a chef in a crazy dinner rush. Orders are flying in. You don't have time to put the flour on a scale and measure it to the milligram for every single pancake. No way. You grab a handful. You know what a handful feels like. A handful becomes a unit of measurement.

Speaker 1:

That's it. These linear chunks, "sat on the," are the handfuls of flour. They allow the brain to process three words for the price of one. It relieves the cognitive load. It clears the bottleneck. Instead of processing verb plus preposition plus determiner, the brain just processes chunk A.

Speaker 2:

So we aren't being lazy, we're being efficient.

Speaker 1:

Well, in evolution, laziness is efficiency. Why build a complex structure if a flat line works just as well for the task at hand? Why walk up the stairs if you can slide down the banister? But this creates a tension. You mentioned the titans earlier. This finding creates a lot of friction with other major theories, like construction grammar.

Speaker 2:

Now, construction grammar, that's the idea that we learn constructions that have a specific meaning, right? Like idioms: "kick the bucket" means to die. You learn that as a whole chunk because it means something special. It's a specific construction.

Speaker 1:

Right. Theorists like Adele Goldberg and Michael Tomasello argue that we learn chunks that are meaningful. "Happy birthday to you" is a chunk. "Once upon a time" is a chunk. They have a function. But here's the rub, and this is the really subtle, important part of this new study. Okay. "Sat on the" doesn't mean anything special on its own. It's not an idiom. You can't look up "sat on the" in the dictionary. It's just structural glue. It's meaningless. It is meaningless. And yet this study proves we learn it as a chunk anyway. That suggests that the brain is a pure statistical machine in this regard. It doesn't just care about meaning. It cares about raw probability. It learns that "on the" usually follows "sat." So it just staples them together purely because they show up together so often.

Speaker 2:

All about the stats.

Speaker 1:

It's statistical learning. It's the classic words and rules idea, getting a serious challenge from patterns and probabilities.

Speaker 2:

You know, this reminds me so much of the predictive text on my phone. My phone doesn't know what a sandwich is. It has no concept of hunger or bread or peanut butter.

Speaker 1:

None at all.

Speaker 2:

It just knows that if I type "peanut butter and," the next word is probably going to be "jelly." It's just playing the odds.
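
That phone-keyboard behavior takes only a few lines to reproduce. This toy bigram model, trained on a made-up snippet and nothing like a real language model's scale, knows no grammar or meaning, only conditional frequencies.

```python
# Toy next-word predictor: no grammar, no meaning, just bigram frequencies.
# The training text is made up; real models do this at vast scale.
from collections import Counter, defaultdict

text = (
    "peanut butter and jelly . peanut butter and jelly . "
    "peanut butter and honey . bread and butter ."
).split()

# Count which word follows each word (a bigram table).
following: defaultdict[str, Counter] = defaultdict(Counter)
for w1, w2 in zip(text, text[1:]):
    following[w1][w2] += 1

def predict(word: str) -> str:
    """Return the most frequent continuation of `word` in the training text."""
    return following[word].most_common(1)[0][0]

print(predict("and"))     # -> "jelly": it is just playing the odds
print(predict("butter"))  # -> "and"
```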

Speaker 1:

That is precisely the comparison the researchers make. And that leads us to the really fascinating and maybe slightly unsettling connection to artificial intelligence.

Speaker 2:

I was hoping we'd get here. Because when you talk about predicting the next word based on probability, that is literally the definition of a large language model like ChatGPT or Gemini or the others.

Speaker 1:

It is. And look at the source material: reference 52 in the paper discusses how large language models demonstrate the potential of statistical learning. For a long time, linguists and philosophers have criticized AI. They say, oh, it doesn't understand language, it doesn't have the deep grammatical tree in its head.

Speaker 2:

It's just a stochastic parrot.

Speaker 1:

A stochastic parrot just guessing the next word. Right. It's just doing math. It's not doing thinking. I've heard that a million times.

Speaker 2:

I've said that myself. It's just autocomplete on steroids.

Speaker 1:

But this paper completely flips that criticism on its head. It suggests that humans are doing something very, very similar to the AI.

Speaker 2:

Whoa, wait a minute. That's a bit of an ego check.

Speaker 1:

Think about it. If we are processing a huge amount of everyday language by using these flat, linear chunks based purely on probability and exposure, just predicting the next likely sequence, then maybe the AI isn't doing it wrong. Maybe the AI stumbled onto the actual way humans communicate, or at least a big part of it.

Speaker 2:

That is a twist. We thought AI was dumb because it ignored the rules and just looked at the stats. And it turns out our own brains are ignoring the rules and looking at the stats. We are biological LLMs.

Speaker 1:

To an extent, yes. Now, to be clear, humans can do the hierarchical stuff. We can handle complexity that an AI might still struggle with. If I say "the dog that chased the cat that ate the rat that stole the cheese is brown," you can figure out which animal is brown. The dog. And that requires holding "the dog" in your memory while you process all the nested stuff about the rat and the cheese, and then connecting "dog" all the way back to "brown" at the end. That requires a tree structure. You have to build a scaffold for that.

Speaker 2:

Right. The dog is brown. Everything else is a nested loop inside of that main structure.

Speaker 1:

Exactly. So we have that capacity. We are amazing architects when we need to be. But this paper argues that for a lot of our day-to-day processing, for the simple, fast stuff, we're relying on these flat, statistical sequences. We are surfing the surface structure,

Speaker 2:

not diving deep into the tree every single time. We save the deep diving for the complex stuff and use the autocomplete for the routine stuff. And it changes how you see yourself, doesn't it?

Speaker 1:

I like to think I'm this rational, logical creature carefully structuring my thoughts, but maybe a lot of the time I'm just executing a script. Run program: greeting. Run program: coffee order. Run program: small talk about weather. It's all prepackaged chunks. It definitely highlights the efficient brain over the logical brain. There is a reference in the paper, Tremblay et al., about the processing advantages of lexical bundles. "Lexical bundle" is just a fancy term for these chunks. The advantage is speed. If you treat "I don't know" as one word, "I dunno," you save precious cognitive energy for the important stuff, like figuring out why you don't

Speaker 2:

know or what you should do about it. So let's look at the bigger picture here. You mentioned this challenges the generative and constructionist views. Are these theories dead now? Is Chomsky canceled? Do we, you know, burn the textbooks? No, no, I wouldn't go that far. Science rarely

Speaker 1:

works like that with a clean knockout punch. It's not that the old theories are 100% wrong. It's that they are incomplete. Ah, okay. The generative view that we have an innate capacity for hierarchical grammar is still incredibly useful for explaining how we can generate an infinite number of novel sentences. But it can't explain the specific phenomenon. It has a blind spot for why we would

Speaker 2:

chunk "sat on the." So the new theory has to be a hybrid of some sort. Ideally, yes. The field will

Speaker 1:

probably move towards a model that accounts for both. The dual path idea is very popular in psychology. Maybe we have a slow system that builds the beautiful trees for complex novel sentences and a fast system that just uses these prefabricated linear bricks for common stuff.

Speaker 2:

Like thinking fast and slow.

Speaker 1:

Exactly like that. The problem is, historically, the major linguistic theories have fought against the idea of the fast system being structurally important. They've treated it as noise or a performance error. This paper forces them to take it seriously as part of the core machinery.

Speaker 2:

It forces them to acknowledge the bricks. You can't just talk about the cathedral's design. You have to acknowledge that a lot of the walls are built from these prefabricated blocks.

Speaker 1:

Exactly. And it validates the usage-based approach, to some degree, the idea that we learn language by using it, not just by accessing a pre-installed rulebook. But it pushes that idea even further by saying we learn structural habits that are totally meaningless, not just meaningful ones. We learn the rhythm and statistics of language, not just the logic.

Speaker 2:

It's really fascinating to think about how this applies to learning a new language. I remember when I tried to learn Spanish, I was so obsessed with the rules. Conjugate the verb, find the direct object. I was trying to be an architect.

Speaker 1:

Right. Building every sentence from scratch.

Speaker 2:

But the people who learned it fastest were the ones who just memorized whole phrases. "Cómo se dice..." They didn't analyze it. They just used the chunk.

Speaker 1:

That is a perfect application of this. There is a lot of research cited in the broader context of this field that suggests second language learners often rely more on these linear chunks at first because they don't have the deep tree structure mastered yet. It's a crutch. But what this new paper suggests is that even native speakers, the experts, never stop using them. We never graduate to only using trees. We keep the training wheels on forever because they are fast.

Speaker 2:

The training wheels are actually a Ducati motorcycle. They get us there faster.

Speaker 1:

Oh, that's a good way to put it. We assume complex is better, but sometimes simple and fast is just smarter, evolutionarily speaking.

Speaker 2:

So we've got the Lego versus beads debate. We've got the now-or-never bottleneck forcing our brains to be speedy. We've got this mind-bending connection to AI and statistical learning. It really feels like we are seeing a shift in how we understand the mind, moving away from computer-program logic toward a pattern-recognition machine.

Speaker 1:

That is definitely the trend in cognitive science right now. And this paper is a massive piece of evidence pushing in that direction. It emphasizes that context and probability drive our brain just as much as, if not more than, abstract rules do.

Speaker 2:

Let's circle back to the methodology for a second, just because I think it's important for listeners to understand the rigor here. You mentioned the experiments were pre-registered. Why does that matter? Why is that such a buzzword in science right now?

Speaker 1:

Ah, good catch. Pre-registration adds a crucial layer of trust. In science, especially in psychology, there's a historical problem, a temptation to p-hack. You run an experiment, you collect a ton of data, you look at it, and you see a weird blip over here that's statistically significant.

Speaker 2:

And then you pretend that's what you were looking for all along.

Speaker 1:

Exactly. You rewrite the history of the experiment to fit the result. It's like shooting an arrow at a barn wall and then drawing the bullseye around wherever it landed.

Speaker 2:

Okay, I see the problem.

Speaker 1:

Pre-registration stops that. You have to write down exactly what you are going to do, how you will analyze it, and what you expect to find before you collect a single data point. You upload it to a public server. It's locked in. It's like a scientific contract. They called their shot.

Speaker 2:

They called the corner pocket. I'm going to sink the eight ball in the corner pocket using a linear chunk.

Speaker 1:

And they hit it. And the fact that they did this for all four of their priming experiments, laid it all out in advance, and still found the effect across nearly 500 people, and then backed it all up with the corpus data, it makes it very, very hard to dismiss this as just a fluke or noise. It's a robust signal.

Speaker 2:

And that switchboard corpus. I just can't get over the idea of analyzing 90s phone calls. I picture people with those giant cordless phones with the pull-out antennas walking around their kitchens talking about Seinfeld.

Speaker 1:

It's a goldmine for linguists precisely because it captures disfluencies. The stutters, the restarts, the ums and ahs we make when we talk.

Speaker 2:

Why are stutters and ums useful? I thought those were just errors.

Speaker 1:

Traditional grammar theories hate them. They treat them as performance errors to be ignored. But in a linear chunking model, those errors are data. They tell you where the chunks are breaking down or being joined together.

Speaker 2:

How so?

Speaker 1:

Think about where you pause. You usually pause between the chunks, not in the middle of one. You wouldn't say "the cat sat on... the mat."

Speaker 2:

Right, you'd say "the cat... sat on the... mat."

Speaker 1:

Exactly. You pause after "sat on the" while your brain is loading the next chunk, "mat." Once you start a chunk, you tend to zip right through it. The pauses reveal the boundaries of the bricks. The um is like the cement between the bricks.

Speaker 2:

I love that. The um is the cement. That makes me feel so much better about how many times I say um. I'm just laying cement.

Speaker 1:

You're just waiting for the next lexical bundle to load from your memory. It's a sign of real-time processing.

Speaker 2:

So as we start to wrap this whole thing up, let's try to synthesize this for the listener. We started with the big question. Are we architects or are we bricklayers?

Speaker 1:

And the answer seems to be we are architects who aren't afraid to buy prefabricated walls from Ikea.

Speaker 2:

[Laughs] I love that. That's perfect.

Speaker 1:

We can build from scratch. We have the hierarchy. We have the blueprints for building a mansion. But most days, for most conversations, we prefer the prefab chunks because they save energy and, crucially, time. And this paper by Nielsen and Christiansen provides the first really solid proof that those prefab chunks, those linear sequences like verb + preposition + determiner, are a fundamental part of our mental furniture.

Speaker 2:

They're real cognitive objects, not just a coincidence.

Speaker 1:

Not a coincidence at all. And that challenges the titans of linguistics to rewrite the rulebook or at least add a very, very big chapter on cheating.

Speaker 2:

A chapter on efficiency.

Speaker 1:

A chapter on efficiency. It forces us to accept that our language faculty is flexible, it's statistical, and it's constantly driven by the urgent need to beat that now-or-never bottleneck. It's adaptation at its finest.

Speaker 2:

It's a bit humbling, isn't it? We like to think we're so special with our complex recursive grammar. But a lot of the time, we're really just predicting the next brick.

Speaker 1:

It is humbling, but it's also kind of amazing. The fact that our brains can seamlessly integrate these two completely different systems, the deep logical architect and the fast statistical bricklayer, and that it works so well. It's a miracle of biology. We are surfing the waves of probability and somehow managing to write poetry and sign contracts and fall in love.

Speaker 2:

And host deep dives.

Speaker 1:

And host deep dives.

Speaker 2:

So here is the final takeaway for you listening at home. Next time you find yourself stumbling over a sentence, or maybe flowing perfectly through one without even thinking, remember this conversation. Your brain isn't just following a rigid rulebook. It is surfing. It's grabbing clusters of words, these lexical bundles, and tossing them out in a linear stream, hoping the listener catches them.

Speaker 1:

You're not just a grammarian. You're a statistician, making bets every second.

Speaker 2:

I like that better. I'm a statistician. All right, before we go, I want to leave you with one final thought, one thing to chew on. We talked about how we are just predicting the next chunk, a lot like AI. If our brains rely this heavily on these flat, linear, prepackaged sequences rather than deep, original structure, how much of what we say is original thought and how much is just habit?

Speaker 1:

That is the scary question, isn't it? If I'm just triggering the "sat on the" file and the "in the middle of" file, am I actually constructing a new idea from scratch, or am I just assembling a collage of things I've heard a million times before?

Speaker 2:

And if we're just predicting the next chunk, who is actually driving the conversation? Is it you? Or is it just the statistics of the English language flowing through you?

Speaker 1:

That is something that will keep me up at night.

Speaker 2:

Something to mull over on your commute. Thanks for listening to The Deep Dive. We'll see you next time.

Speaker 1:

See you then.

Speaker 2:

Thanks for listening today. Four recurring narratives underlie every episode: boundary dissolution, adaptive complexity, embodied knowledge, and quantum-like uncertainty. These aren't just philosophical musings, but frameworks for understanding our modern world. We hope you continue exploring our other podcasts, responding to the content,

Speaker 3:

and checking out our related articles at helioxpodcast.substack.com.
