WEBVTT

00:00:20.239 --> 00:00:25.519
You are sitting in the pale glow of your monitor late at night.

00:00:25.519 --> 00:00:28.559
You are trying to do something simple.

00:00:28.559 --> 00:00:32.960
Maybe buy plane tickets or access your bank account.

00:00:32.960 --> 00:00:39.679
You submit your password, but before you can proceed, you have to pass a test.

00:00:39.679 --> 00:00:43.679
A little box has appeared.

00:00:43.679 --> 00:00:52.560
Inside it, letters and numbers are twisted, warped, and smeared together like a note left out in the rain.

00:00:52.560 --> 00:01:00.719
Below it you read, "I'm not a robot." So you lean in.

00:01:00.719 --> 00:01:02.399
You squint.

00:01:02.399 --> 00:01:05.519
Guess that the strange loop is a Q.

00:01:05.519 --> 00:01:09.439
And the jagged line might be a five.

00:01:09.439 --> 00:01:13.920
You type it out and you check the box.

00:01:13.920 --> 00:01:17.439
Because you are indeed not a robot.

00:01:17.439 --> 00:01:21.439
But you failed.

00:01:21.439 --> 00:01:24.640
Another set of blurry letters appears.

00:01:24.640 --> 00:01:33.280
Back in 1950, the legendary mathematician Alan Turing proposed the imitation game.

00:01:33.280 --> 00:01:39.359
A test where a computer tries to fool a human interrogator into thinking that it is a real person.

00:01:39.359 --> 00:01:44.959
For decades, this was the benchmark of artificial intelligence.

00:01:44.959 --> 00:01:46.959
The human was the judge.

00:01:46.959 --> 00:01:49.280
The machine was the subject.

00:01:49.280 --> 00:01:54.959
But at the dawn of the new millennium, the machine started to judge us.

00:01:54.959 --> 00:02:01.680
I'm Daina Bouquin, and this is Lore in the Machine.

00:02:01.680 --> 00:02:06.640
The year is 2000.

00:02:06.640 --> 00:02:12.479
The internet is rapidly expanding, but it is being overrun by automated programs.

00:02:12.479 --> 00:02:18.639
Bots, masquerading as humans to spam message boards and hijack accounts.

00:02:18.639 --> 00:02:29.680
Luis von Ahn, a 22-year-old graduate student at Carnegie Mellon University, and his professor Manuel Blum decide to build a wall to keep the bots out.

00:02:29.680 --> 00:02:33.360
Von Ahn had grown up in Guatemala City.

00:02:33.360 --> 00:02:37.599
When he was eight, he asked his mother for a Nintendo.

00:02:37.599 --> 00:02:43.439
She bought him an 8-bit computer called the Commodore 64 instead.

00:02:43.439 --> 00:02:47.039
He learned to program because there was nothing else to do with it. Von Ahn and Blum created a program called CAPTCHA.

00:02:47.039 --> 00:03:01.199
It stands for Completely Automated Public Turing Test to Tell Computers and Humans Apart.

00:03:01.199 --> 00:03:05.840
It is the imitation game in reverse.

00:03:05.840 --> 00:03:14.879
To convince the computer that you are human, you need to decipher distorted text that the bot's artificial eyes can't comprehend.

00:03:14.879 --> 00:03:23.360
The wall works, but every time you squint at those warped letters, it takes about 10 seconds of your life.

00:03:23.360 --> 00:03:36.159
And 10 seconds may feel like nothing, but with over 200 million CAPTCHAs solved across the globe every day, humanity was spending half a million hours a day typing gibberish.

00:03:36.159 --> 00:03:40.400
And von Ahn finds this unsettling.

00:03:40.400 --> 00:03:44.400
He sees a vast, invisible factory.

00:03:44.400 --> 00:03:48.240
Millions of hours of our most precious resource.

00:03:48.240 --> 00:03:50.240
Human brain cycles.

00:03:50.240 --> 00:03:52.319
Being frittered away.

00:03:52.319 --> 00:03:56.879
He wants to harness this time for something meaningful.

00:03:56.879 --> 00:04:03.039
And it turns out the perfect opportunity was waiting inside one of the world's most famous newspapers.

00:04:03.039 --> 00:04:11.840
At the time, a massive effort was underway to digitize the historical archives of the New York Times.

00:04:11.840 --> 00:04:15.439
This was a staggering undertaking.

00:04:15.439 --> 00:04:22.959
The archive dated all the way back to 1851 and contained over 13 million articles.

00:04:22.959 --> 00:04:31.920
To do this, computers were using optical character recognition, or OCR, to scan and read the printed materials.

00:04:31.920 --> 00:04:34.560
But there was a problem.

00:04:34.560 --> 00:04:39.759
The old newspaper pages were worn, and the paper had degraded.

00:04:39.759 --> 00:04:42.160
The old ink was faded.

00:04:42.160 --> 00:04:46.240
The OCR software was failing.

00:04:46.240 --> 00:04:50.480
It simply could not read about 20% of the scanned words.

00:04:50.480 --> 00:05:03.120
So von Ahn adjusted the CAPTCHA system to take those unreadable words from the Times archive and ask you to squint at them.

00:05:03.120 --> 00:05:10.000
Instead of one string of gibberish, reCAPTCHA started giving you two words.

00:05:10.000 --> 00:05:14.560
One was a control word that the system already knew.

00:05:14.560 --> 00:05:21.279
The other was a mystery, a scanned snippet from an old article that had stumped the computer.

00:05:21.279 --> 00:05:25.199
Then the machine made a silent deal with you.

00:05:25.199 --> 00:05:30.560
If you typed the control word correctly, it assumed you were human.

00:05:30.560 --> 00:05:35.920
And if you were human, it assumed your guess on the mystery word was probably correct too.

00:05:35.920 --> 00:05:46.000
It took your answer, compared it to the answers of other humans passing through the same gate, and once enough people agreed, the mystery was solved.

00:05:46.000 --> 00:05:52.639
You didn't know it, but you were working, word by fuzzy word.

00:05:52.639 --> 00:06:00.720
You were helping decode the New York Times archive, rescuing stories from the 1850s that the machines couldn't read.

00:06:00.720 --> 00:06:06.079
Like an article about a pigeon-shooting challenge in December 1865.

00:06:06.079 --> 00:06:11.360
Some local pigeon-shooting champion from Jersey City wanted to take on England.

00:06:11.360 --> 00:06:16.720
These are details nobody thought to save but got saved anyway.

00:06:16.720 --> 00:06:19.519
One blurry word at a time.

00:06:19.519 --> 00:06:40.240
In 2009, Google saw the incredible power of this crowdsourced labor and bought reCAPTCHA. They plugged your ten-second shifts into Google Books, an ambitious project to digitize tens of millions of books and create a massive library for the world.

00:06:40.240 --> 00:06:47.519
By 2018, over a billion people had unwittingly helped create one of humanity's greatest digital archives.

00:06:47.519 --> 00:06:51.519
We were tricked into building something wonderful.

00:06:51.519 --> 00:06:56.879
But we were also teaching the machines to read.

00:06:56.879 --> 00:07:00.160
So the test to see if you were human evolved.

00:07:00.160 --> 00:07:06.240
It wasn't just showing you twisted letters to make digitized books readable anymore.

00:07:06.240 --> 00:07:14.240
It started feeding you pictures of crosswalks, of traffic lights, and storefronts from Google Street View.

00:07:14.240 --> 00:07:19.519
By clicking those squares, you were training their image recognition systems.

00:07:19.519 --> 00:07:25.680
The machines needed to learn what a crosswalk looked like so a car could decide whether to stop.

00:07:25.680 --> 00:07:31.519
They needed to know where the landmarks were so the map could tell you where you are.

00:07:31.519 --> 00:07:36.160
The machines didn't just need to read, they needed to see.

00:07:36.160 --> 00:07:41.360
But then the images started to disappear too.

00:07:41.360 --> 00:07:46.079
Google released reCAPTCHA version 3 in 2018.

00:07:46.079 --> 00:07:49.439
And it doesn't ask you anything at all.

00:07:49.439 --> 00:07:53.040
There is no puzzle.

00:07:53.040 --> 00:07:55.120
No box to check.

00:07:55.120 --> 00:07:57.680
Instead, it watches.

00:07:57.680 --> 00:08:05.040
If you are on a site that has implemented version 3, it's invisible.

00:08:05.040 --> 00:08:07.680
But it sees you.

00:08:07.680 --> 00:08:12.000
It tracks how your mouse moves across the page.

00:08:12.000 --> 00:08:21.839
Whether you scroll in smooth arcs or jagged stops, the rhythm of your keystrokes, the way a human hand hesitates.

00:08:21.839 --> 00:08:26.000
It has seen enough of us.

00:08:26.000 --> 00:08:35.279
Billions of people performing their humanity over and over has taught it exactly what being human looks like.

00:08:35.279 --> 00:08:38.080
It no longer needs to ask.

00:08:38.080 --> 00:08:41.679
We proved we were human.

00:08:41.679 --> 00:08:44.399
And now the machines can too.

00:08:44.399 --> 00:08:50.879
I'm Daina Bouquin, and this is Lore in the Machine.

00:08:50.879 --> 00:09:00.000
If you enjoyed this episode, please rate and review and subscribe wherever you listen to podcasts.

00:09:00.000 --> 00:09:01.840
Thanks for listening.