1
00:00:20,239 --> 00:00:24,559
Speaker: You are sitting in the pale glow of your monitor late

2
00:00:24,559 --> 00:00:25,519
at night.

3
00:00:25,519 --> 00:00:28,559
You are trying to do something simple.

4
00:00:28,559 --> 00:00:32,960
Maybe buy plane tickets or access your bank account.

5
00:00:32,960 --> 00:00:38,399
You submit your password, but before you can proceed, you have

6
00:00:38,399 --> 00:00:39,679
to pass a test.

7
00:00:39,679 --> 00:00:43,679
A little box has appeared.

8
00:00:43,679 --> 00:00:49,679
Inside it, letters and numbers are twisted, warped, and smeared

9
00:00:49,679 --> 00:00:52,560
together like a note left out in the rain.

10
00:00:52,560 --> 00:01:00,719
Below it you read, "I'm not a robot." So you lean in.

11
00:01:00,719 --> 00:01:02,399
You squint.

12
00:01:02,399 --> 00:01:05,519
Guess that the strange loop is a Q.

13
00:01:05,519 --> 00:01:09,439
And the jagged line might be a five.

14
00:01:09,439 --> 00:01:13,920
You type it out and you check the box.

15
00:01:13,920 --> 00:01:17,439
Because you are indeed not a robot.

16
00:01:17,439 --> 00:01:21,439
But you failed.

17
00:01:21,439 --> 00:01:24,640
Another set of blurry letters appears.

18
00:01:24,640 --> 00:01:30,879
Back in 1950, the legendary mathematician Alan Turing

19
00:01:30,879 --> 00:01:33,280
proposed the imitation game.

20
00:01:33,280 --> 00:01:37,200
A test where a computer tries to fool a human interrogator

21
00:01:37,200 --> 00:01:39,359
into thinking that it is a real person.

22
00:01:39,359 --> 00:01:44,079
For decades, this was the benchmark of artificial

23
00:01:44,079 --> 00:01:44,959
intelligence.

24
00:01:44,959 --> 00:01:46,959
The human was the judge.

25
00:01:46,959 --> 00:01:49,280
The machine was the subject.

26
00:01:49,280 --> 00:01:53,760
But at the dawn of the new millennium, the machine started

27
00:01:53,760 --> 00:01:54,959
to judge us.

28
00:01:54,959 --> 00:02:01,680
I'm Daina Bouquin, and this is Lore in the Machine.

29
00:02:01,680 --> 00:02:06,640
The year is 2000.

30
00:02:06,640 --> 00:02:10,159
The internet is rapidly expanding, but it is being

31
00:02:10,159 --> 00:02:12,479
overrun by automated programs.

32
00:02:12,479 --> 00:02:17,680
Bots masquerading as humans to spam message boards and hijack

33
00:02:17,680 --> 00:02:18,639
accounts.

34
00:02:18,639 --> 00:02:23,199
Luis von Ahn, a 22-year-old graduate student at Carnegie

35
00:02:23,199 --> 00:02:27,120
Mellon University, and his professor Manuel Blum decide to

36
00:02:27,120 --> 00:02:29,680
build a wall to keep the bots out.

37
00:02:29,680 --> 00:02:33,360
Von Ahn had grown up in Guatemala City.

38
00:02:33,360 --> 00:02:37,599
When he was eight, he asked his mother for a Nintendo.

39
00:02:37,599 --> 00:02:42,479
She bought him an 8-bit computer called the Commodore 64

40
00:02:42,479 --> 00:02:43,439
instead.

41
00:02:43,439 --> 00:02:46,400
He learned to program because there was nothing else to do

42
00:02:46,400 --> 00:02:47,039
with it. Von Ahn and Blum created a program called CAPTCHA.

43
00:02:47,039 --> 00:02:58,800
It stands for Completely Automated Public Turing Test to

44
00:02:58,800 --> 00:03:01,199
Tell Computers and Humans Apart.

45
00:03:01,199 --> 00:03:05,840
It is the imitation game in reverse.

46
00:03:05,840 --> 00:03:09,840
To convince the computer that you are human, you need to

47
00:03:09,840 --> 00:03:13,919
decipher distorted text that the bot's artificial eyes can't

48
00:03:13,919 --> 00:03:14,879
comprehend.

49
00:03:14,879 --> 00:03:19,919
The wall works, but every time you squint at those warped

50
00:03:19,919 --> 00:03:23,360
letters, it takes about 10 seconds of your life.

51
00:03:23,360 --> 00:03:28,000
And 10 seconds may feel like nothing, but with over 200

52
00:03:28,000 --> 00:03:32,000
million CAPTCHAs solved across the globe every day, humanity

53
00:03:32,000 --> 00:03:36,159
was spending half a million hours a day typing gibberish.

54
00:03:36,159 --> 00:03:40,400
And von Ahn finds this unsettling.

55
00:03:40,400 --> 00:03:44,400
He sees a vast, invisible factory.

56
00:03:44,400 --> 00:03:48,240
Millions of hours of our most precious resource.

57
00:03:48,240 --> 00:03:50,240
Human brain cycles.

58
00:03:50,240 --> 00:03:52,319
Being frittered away.

59
00:03:52,319 --> 00:03:56,879
He wants to harness this time for something meaningful.

60
00:03:56,879 --> 00:04:00,400
And it turns out the perfect opportunity was waiting inside

61
00:04:00,400 --> 00:04:03,039
one of the world's most famous newspapers.

62
00:04:03,039 --> 00:04:09,280
At the time, a massive effort was underway to digitize the

63
00:04:09,280 --> 00:04:11,840
historical archives of the New York Times.

64
00:04:11,840 --> 00:04:15,439
This was a staggering undertaking.

65
00:04:15,439 --> 00:04:21,040
The archive dated all the way back to 1851 and contained over

66
00:04:21,040 --> 00:04:22,959
13 million articles.

67
00:04:22,959 --> 00:04:26,879
To do this, computers were using optical character

68
00:04:26,879 --> 00:04:31,920
recognition, or OCR, to scan and read the printed materials.

69
00:04:31,920 --> 00:04:34,560
But there was a problem.

70
00:04:34,560 --> 00:04:38,800
The old newspaper pages were worn, and the paper had

71
00:04:38,800 --> 00:04:39,759
degraded.

72
00:04:39,759 --> 00:04:42,160
The old ink was faded.

73
00:04:42,160 --> 00:04:46,240
The OCR software was failing.

74
00:04:46,240 --> 00:04:50,480
It simply could not read about 20% of the scanned words.

75
00:04:50,480 --> 00:04:58,399
So von Ahn adjusted the CAPTCHA system to take those unreadable

76
00:04:58,399 --> 00:05:03,120
words from the Times archive and ask you to squint at them.

77
00:05:03,120 --> 00:05:08,399
Instead of one string of gibberish, reCAPTCHA started

78
00:05:08,399 --> 00:05:10,000
giving you two words.

79
00:05:10,000 --> 00:05:14,560
One was a control word that the system already knew.

80
00:05:14,560 --> 00:05:19,439
The other was a mystery, a scanned snippet from an old article

81
00:05:19,439 --> 00:05:21,279
that had stumped the computer.

82
00:05:21,279 --> 00:05:25,199
Then the machine made a silent deal with you.

83
00:05:25,199 --> 00:05:30,000
If you typed the control word correctly, it assumed you were

84
00:05:30,000 --> 00:05:30,560
human.

85
00:05:30,560 --> 00:05:33,439
And if you were human, it assumed your guess on the

86
00:05:33,439 --> 00:05:35,920
mystery word was probably correct too.

87
00:05:35,920 --> 00:05:39,759
It took your answer, compared it to the answers of other

88
00:05:39,759 --> 00:05:43,439
humans passing through the same gate, and once enough people

89
00:05:43,439 --> 00:05:46,000
agreed, the mystery was solved.

90
00:05:46,000 --> 00:05:52,319
You didn't know it, but you were working, word by fuzzy

91
00:05:52,319 --> 00:05:52,639
word.

92
00:05:52,639 --> 00:05:56,800
You were helping decode the New York Times archive, rescuing

93
00:05:56,800 --> 00:06:00,720
stories from the 1850s that the machines couldn't read.

94
00:06:00,720 --> 00:06:04,959
Like an article about a pigeon shooting challenge in December

95
00:06:04,959 --> 00:06:06,079
1865.

96
00:06:06,079 --> 00:06:10,000
Some local pigeon shooting champion from Jersey City wanted

97
00:06:10,000 --> 00:06:11,360
to take on England.

98
00:06:11,360 --> 00:06:16,079
These are details nobody thought to save but got saved

99
00:06:16,079 --> 00:06:16,720
anyway.

100
00:06:16,720 --> 00:06:19,519
One blurry word at a time.

101
00:06:33,600 --> 00:06:35,759
In 2009, Google saw the incredible power of this crowdsourced labor and bought reCAPTCHA. They plugged your ten-second shifts into Google Books. An ambitious project to digitize

102
00:06:35,759 --> 00:06:39,600
tens of millions of books and create a massive library for the

103
00:06:39,600 --> 00:06:40,240
world.

104
00:06:40,240 --> 00:06:44,399
By 2018, over a billion people had unwittingly helped create

105
00:06:44,399 --> 00:06:47,519
one of humanity's greatest digital archives.

106
00:06:47,519 --> 00:06:51,519
We were tricked into building something wonderful.

107
00:06:51,519 --> 00:06:56,879
But we were also teaching the machines to read.

108
00:06:56,879 --> 00:07:00,160
So the test to see if you were human evolved.

109
00:07:00,160 --> 00:07:03,680
It wasn't just showing you twisted letters to make

110
00:07:03,680 --> 00:07:06,240
digitized books readable anymore.

111
00:07:06,240 --> 00:07:10,720
It started feeding you pictures of crosswalks, of traffic

112
00:07:10,720 --> 00:07:14,240
lights, and storefronts from Google Street View.

113
00:07:14,240 --> 00:07:18,000
By clicking those squares, you were training their image

114
00:07:18,000 --> 00:07:19,519
recognition systems.

115
00:07:19,519 --> 00:07:23,600
The machines needed to learn what a crosswalk looked like so

116
00:07:23,600 --> 00:07:25,680
a car could decide whether to stop.

117
00:07:25,680 --> 00:07:30,000
They needed to know where the landmarks were so the map could

118
00:07:30,000 --> 00:07:31,519
tell you where you are.

119
00:07:31,519 --> 00:07:36,160
The machines didn't just need to read, they needed to see.

120
00:07:36,160 --> 00:07:41,360
But then the images started to disappear too.

121
00:07:41,360 --> 00:07:46,079
Google released reCAPTCHA version 3 in 2018.

122
00:07:46,079 --> 00:07:49,439
And it doesn't ask you anything at all.

123
00:07:49,439 --> 00:07:53,040
There is no puzzle.

124
00:07:53,040 --> 00:07:55,120
No box to check.

125
00:07:55,120 --> 00:07:57,680
Instead, it watches.

126
00:07:57,680 --> 00:08:03,839
If you are on a site that has implemented version 3, it's

127
00:08:03,839 --> 00:08:05,040
invisible.

128
00:08:05,040 --> 00:08:07,680
But it sees you.

129
00:08:07,680 --> 00:08:12,000
It tracks how your mouse moves across the page.

130
00:08:12,000 --> 00:08:17,120
Whether you scroll in smooth arcs or jagged stops, the rhythm

131
00:08:17,120 --> 00:08:21,839
of your keystrokes, the way a human hand hesitates.

132
00:08:21,839 --> 00:08:26,000
It has seen enough of us.

133
00:08:26,000 --> 00:08:32,080
Billions of people performing their humanity over and over has

134
00:08:32,080 --> 00:08:35,279
taught it exactly what being human looks like.

135
00:08:35,279 --> 00:08:38,080
It no longer needs to ask.

136
00:08:38,080 --> 00:08:41,679
We proved we were human.

137
00:08:41,679 --> 00:08:44,399
And now the machines can too.

138
00:08:44,399 --> 00:08:50,879
I'm Daina Bouquin, and this is Lore in the Machine.

139
00:08:50,879 --> 00:08:57,120
If you enjoyed this episode, please rate and review and

140
00:08:57,120 --> 00:09:00,000
subscribe wherever you listen to podcasts.

141
00:09:00,000 --> 00:09:01,840
Thanks for listening.