The Test Set by Posit
A Posit podcast for data science junkies, anomaly hunters, and those who play outside the confidence interval. Hosted by Michael Chow, with co-hosts Wes McKinney & Hadley Wickham.
The Wonder-Driven Builder — with Paige Bailey
Paige Bailey is a developer relations engineering lead at Google DeepMind. She's a geophysicist-turned-AI-engineer who was once told by her professors that building open-source libraries was a waste of time. We talk about her path from planetary science to TensorFlow, why statisticians have a hidden edge in the age of AI, and what it means to be a curious generalist when the cost of building software is approaching zero. Bonus: installing solar-powered silent-film birdhouses as street art in San Francisco.
What's inside
- From planetary science to TensorFlow, before it was GPU-capable
- Geophysicists as early GPU adopters
- The professors who said open-source wasn’t “real science”
- Building silent-film birdhouses as San Francisco street art
- Hiding Gemini API tests inside whimsical side projects
- The right-tool-for-the-job case for mixing AI models
- Why “taste” is the skill that matters when code costs nothing
Welcome to the Test Set. Here we talk with some of the brightest thinkers and tinkerers in statistical analysis, scientific computing, and machine learning. Digging into what makes them tick, plus the insights, experiments, and OMG moments that shape the field. On this episode, we sit down with Paige Bailey, DevRel Engineering Lead at Google DeepMind. She started out coding text based games from Byte Magazine onto her Apple II computer, which is a real throwback. Her website mentioned she makes weird things digitally and IRL, like a bot that emails her, roasting her taste in music, or solar powered silent film birdhouses as street art in San Francisco. She runs us through her background in geophysics, where she was introduced to Python and open source early. She discusses how she weaves a variety of Google APIs and services together to make things fun, weird, and educational. And I left feeling like my use of AI could be so much weirder, so I'm so excited for people to listen to this interview. Paige Bailey, welcome to The Test Set. Awesome. Thank you for having me. Yeah. So we we're here in San Francisco, and you you mentioned you took a Waymo in, which seems very I did. Is very on brand. Yes. I love that. Just by way of introduction, so you're a developer relations engineering lead at Google DeepMind? Correct. And you've had a incredible career of doing a lot of things, which I really wanna get to. Because I always forget, I wanna make sure to say I'm Michael Chow, the host. And I'm joined by Hadley Wickham, Chief Scientist at Posit, and Isabel Zimmerman, incredible Posit software engineer. And we're so happy to have you on Excellent. Here in San Francisco. Here in beautiful San Francisco. I know we we talked a little bit before you came on, and I thought maybe you could just catch us up on your career starting with your love of text adventures. Go there and. Absolutely. So so I got my first computer. 
We rescued it from being thrown away around the time that I was eight or nine years old. It was an Apple II. And there weren't a ton of games available for the Apple II at the time, certainly not ones that you could, like, go to GameStop and purchase. So I was beholden to all of the games that came with this computer that just missed getting thrown away. And also a whole bunch of books that talked about how you might build your own software programs. Oh. And old magazines, like copies of Byte, which had additional games that you could type in. So my first programs were all text adventures, either transcribing choose-your-own-adventure books or building my own. And that just sort of made computers feel like a fun thing. And I loved open source as well. Like, I pretty much owe my entire career to open source. Yeah. Yeah. Yeah. Do you remember that era of computing? I do. Okay. I do remember it. And we had, like, a printer where the paper, you know, is one big... Yeah. Yeah. Yeah. With the perforated edges. Yeah. Yeah. And you have to rip it off. Yeah. Yeah. It's rough, you know, as a child, I feel like. And it had such a cool tone too. Like that... Yeah. The printers. When it was working. Yeah. It was like humming a tune. Yeah. Yeah. Exactly. Isabel, did you ever get a magazine where you, like, typed games in? I did not. I think my first intro to computer games was, like, Bugdom. Anything from Pangea Software, I spent time in. As well as Myst. Myst. Myst was... Love Myst. There was a... Yeah. Like, number one game. Like, my brother and I would sit around for hours and go through all the puzzles. It was amazing. Myst, Riven, Exile, like, definitely all known and loved. Yeah. All your games came on CDs, then? Or... Yeah. Yeah. All CDs. Yep. All CDs. And I remember Riven and Exile had, like, four CDs that you had to put in in order... Yeah. As you explored the islands. Yeah. For real. 
I'm ashamed to admit I never got past what seems like the very beginning of Myst. I just traveled around pulling levers and... Yeah. I never knew where the levers went to or what was happening. So I feel like you two really tapped into some other dimension of Myst that I totally... Pages and, like, learning the lore of Sirrus and Achenar. I feel like it was my first intro to plot twists. And I was like, this is mind blowing. Like, I didn't realize you could experience a game like this and have such an emotional reaction to all of it. Exactly. And that you could kind of choose your own adventure when it came to the ending too. So you could pick one of the brothers or you could pick Atrus. I think that was his name or something. But it was awesome. Like, strong recommend. It holds up. I have to go back. Yeah. I think I'm ready. I think I've, like, matured enough that I'm ready to tackle Myst in the way it was meant to be played. Cool. Cool. Well, now if you get stuck, you can go on the Internet. And yeah? Or just get Claude Code to play the game for you. Yeah. Yeah. I'm really curious. How did you get involved with computer programming? Yeah. Well, so creating text adventures, that was kind of the first thing. And then also, I went to Rice University in Houston, and there was a class called CAAM 210, which a friend was taking, taught by kind of this majestic bearded human named Steve ***. I'm not sure if you encountered him, Hadley, but he was, like, a wonderful human. He usually taught CAAM 210 using MATLAB. But for whatever reason, this semester he was teaching it using Python. Like, Python plus a lot of the numeric and scientific computing packages that were kind of in their earlier days at the time. 
And I remember hanging out with my friend, who was going through a lot of the tutorials and, like, banging their head against the wall to start using Python, and I got really enchanted by this idea that people would build things and then, like, release them out into the world for zero... Oh, cool. Yeah. So I would say I started programming with just text adventures, making mods to, like, Grand Theft Auto. Not really thinking anything of it when I was younger. And then I got to school, and Python was kind of the first thing that helped me understand open source. Yeah. Cool. And would you say you were doing a lot of sort of data science work out of school, or what kind? So it's really fascinating. Like, I feel like there was this period of time where a lot of people were doing data science-y style things, but the term did not exist. So for open source, like, my background, I was doing geophysics, specifically planetary sciences research. So I interned at the Laboratory for Atmospheric and Space Physics and at Southwest Research Institute in San Antonio, and was doing a lot of work to understand lunar ultraviolet. So there were these spacecraft that had a whole bunch of data that would get pulled down to help understand lunar albedo or some of the other metrics that the spacecraft were capturing. And the expectation was that you would analyze them using something called IDL, which was more painful than MATLAB. But, like, Python existed. And so that was a little bit easier. And so if I needed to visualize data or kind of understand it, instead of trying to bang my head against the wall and understand IDL (apologies to anybody who might have been working on IDL), I just used Python and started working on it like that. And we... yep. 
And we were also really fortunate in Texas in the sense that, like, just down the way in Austin, there were a whole bunch of people that were also, like, nerds into open source libraries for scientific computing. So like the Enthought Canopy dudes and, you know, the Anaconda team now, though they had previously had another name, I think Continuum Analytics or something similar. So so feel like I've really looked into it, honestly. It's kinda hard to imagine now, but at the time, it was such a crazy idea to like that you might use, like, free software to do serious work instead of paying, like, money. Exactly. Yeah. Like, now it just seems like, it is kind of, like, weird. Right? We're just using all this free stuff that people make from out of the goodness of their hearts and give to the world. One hundred percent. Especially in the oil and gas like, the geophysics oil and gas industry world because so much of that software is proprietary, like even today. Like there there's something called Petrel, the the one of the software applications for understanding subsurface data. And I think it's like a hundred thousand dollars a seat per year or something. So like like crazy crazy amounts. But most of the things that you do on Petrel, if not all, you could easily do with just like Python and open source libraries. Oh, interesting. Yeah. So so it does feel like a really it feels so strange to even think about it right now of like how different everything was back then. Like, it was also it was also the situation where I remember going to my professors and being like, hey, really like, you know, like programming. I think this is really cool. Like, we should totally have libraries for doing earth sciences related things. And my professors being like, no, that's not real science. Like, why are you wasting your time? Like, nobody needs like, if you're if you're writing libraries, like, that's not actually contributing to your career. Actually a waste of time. Yeah. Yeah. 
But, like, tooling versus doing... Yeah. The actual research. Yep. Yep. Now it's, if you're not, you know, writing code or using code to analyze data, you're probably not doing this. What are you doing? Yeah. I'm really curious. You were somebody who has been in the AI space kinda, like, before it was cool, and I'm putting that in air quotes. Like, how did you make that jump from, you know, I'm doing, like, geo stuff to now I'm doing AI, I'm doing deep learning, working on things like TensorFlow? Yeah. So geophysics is, like... I always feel like we are sort of the more stay-at-home type of earth scientists. Right? Like, we're the ones that care about computers and that are trying to find more efficient ways to analyze data. And the geophysics world actually adopted GPUs before most everybody else for things like velocity modeling, fluid dynamics simulations, all sorts of stuff. Even before the gaming industry kind of caught on to GPUs. And so I was doing that a lot for some of the geophysics work, just, like, trying to bang my head against CUDA. And when TensorFlow was first released, I don't think most people remember, but it was CPU only. So you're doing analysis across large numbers of CPUs. Not really thinking about GPUs until later on. And so that was something that I could do that was helpful, of just kind of like, hey, you know, you're thinking about accelerators now. I know CUDA. I can help with that. And it's a useful way to kind of collaborate with the team, I think. Like, if somebody releases an open source library and you contribute to it, or you build something with it and talk about it, then they get a flavor of what it would be like to work with you as a person. There's such a strong pipeline of people from, like, physical sciences like that to, it feels like, AI. 
There's the whole conference about ex astronomers, now data scientists. It's so surprising to see how all this all this comes together. Yep. Linear algebra and like also, you know, has the same JAX has the same spec as NumPy. So and a lot of the libraries have the same, like, general flavor of scientific computing libraries. So like SciPy for scientific computing processes. So so I think there's there's definitely a lot of love between the the kind of physics and mathematics world and the JAX world. Also, like all of the JAX machine learning frameworks authors came from mostly mathematics backgrounds, which is quite cool. Yeah. It's interesting. It seems like you've done a lot of both development and product management and DevRel engineering. Yep. When did you know this was kind of the work that you really wanted to be doing? How did this work come together? Yeah. I I think I I again, just like, I feel so stupidly lucky. Like the in the sense that I've I've just always done whatever I thought would be useful and interesting. And now we're in this really happy place where a long time ago, people would kind of artificially put people into buckets of like, oh, you're an engineer, you're a product human, you're, you know, a design human, you're like these these other things and you could only wear this one hat, which is really frustrating if you like doing a whole bunch of stuff. And now we're in this place where it's just like, oh, you want to you want to do data visualization and you want to do like, you want want to understand large scale data sets and you also want to be able to to build tools in order to do all of the above. Like go for it. Like that's you can you can be this auteur of anything you want to you want to enable. So so I I didn't have like any sort of design other than like I'm gonna work on things that I feel will be important and useful and interesting. And when I was a product person, I was exclusively a product person on developer tools or models. 
So like highly technical, like lower level sort of sort of things. So I was still writing code every day. Yeah. Yeah. It does it does really feel like like Claude Code and similar tools are, like, such a win for people who have just got, like, weird interest in lots of places. Exactly. Like, it's and and, like, suddenly all of these things that were, like, seventy percent finished on your GitHub, like, you can, like, actually get out into the world. It's there's never been a better time to be a curious human who wants to, like, create things. Yeah. Yeah. Yeah. Like, I made an iPhone app. Like, it's just a little timer for talks, but just, something that would be in my head Yep. For years, and, that's like a weekend project now. It's, like, it's really cool. Exactly. I I don't have it in my backpack right now, but there's one of my side projects is using there's something called a cheap yellow display. It's called the CYD ESP thirty two. And it's like this little tiny board with just a visualization screen on top. And to update it, you have to rewrite the firmware. So so it's it's not as flexible as something like a Raspberry Pi with a screen or but I would have never been able to to do anything with it. And now I'm using those to power these silent film birdhouses. So it's like you have a birdhouse, put the little screen on the inside, people look through the little hole, and you can watch the silent films. Oh my gosh. Yeah. Wait. Girl. That's incredible. It's I'm putting them so I'm putting them around San Francisco as art installations Yeah. Like next to the tiny, like, borrowing libraries. Yeah. Oh my gosh. You can you can have, like, site specific films for each location. And then, like, they just have, like, a little battery in them that last Yep. And they're also they're also, like, super, like, low energy requirements. So even if you had kind of, like, a solar panel on top of the birdhouse, like, that would be enough to power it for for quite a bit of time. 
Yeah. That's really cool. I feel like this really gets into an area that Isabel and me to some small degree spent probably too much time around holding it too deep down, which was every web domain you own for a side project. Yes. And the sheer scale of Fun. Honestly, incredible. Like your website is just such a beautiful, curious, joyful experience. Thank you. And like just the amount of fun side projects and whimsy that you're bringing to all of these like really ******** tools. I'm really curious like, what are you working on? What tools are you building? Whether they're the funkiest things like silent bird houses or what are you excited about right now? Oh, man. So so I also love all of my all of my side projects. Like, they're they're super fun to build. Like, I get a lot of joy out of them, but they're also kind of like sneaky ulterior motive is stress testing some of the features that we have coming out for the Gemini APIs in our models. You like hid vegetables in them for Well, it's the well, so so as an example, we released something just recently, our second iteration of the multimodal embeddings model, which is allows you to embed in the same space, video, audio, images, text, and code. So you can like pull together audio, video, like you could you could type in like Alan Kay and then see like every time Alan Kay utters words in a video. But also like every time he appears in a video, anytime he appears in a photograph, etcetera. And it's it's just really really cool to see that. So that's powering folklore dot dev which was pulling a lot of the the oral histories from the computer history museum into kind of like a contextual space. So you got to see the transcript, you got to see interesting anecdotes extracted, but you also got it grounded and kind of like, well, when they say this computer and they were able to do this on this computer, like, what kind of compute power does it have? How does that compare to your mobile device? 
Like, what would it cost in today's dollars? Those sorts of things. And it's the same with the stacks dot dev, which was pulling in all of the Byte and Omni magazines and more on the way to to kind of extract out insights using the Gemini models, whether it's like telephone numbers or addresses or whatever it is. And then also putting the covers and the articles in a similar embedding space. So if somebody like wrote in to complain about their experience using a Commodore sixty four, like what other articles are are similar in vibes to that? And the I also thought it was quite cool in the sense that like pulling in, you know, forty years or so of of those magazines and then mapping out the different, you know, places that people were sending letters from or like the stores and those sorts of things. When you visualize them in on a map, you can start seeing like, oh my god. Like in Houston, there were these pockets of people who all cared about this thing. And then, you know, like Sunnyvale or in Cupertino, there were like five people within a six block radius that cared about this thing. And you can start realizing that like, you know, geographic location might have impacted or influenced some of the creations of these products. Yeah. But all of that's like locked away otherwise, and now it's not, which is which is kind of amazing. It's it's interesting to hear it sounds like to you working on models and working so closely with a lot of these tools, like, gives you the opportunity to see, like, this is coming out. What new kinda, like, interesting thing can happen? Like, what can it unlock? Yep. And also the cost changes too. So, like, the Gemini 3.1 Flash Lite model, which came out just recently, ended up kind of like, it's like a ninety percent reduction in the costs associated. So significantly more powerful than you know, Gemini 2.0 Pro or 2.5 Pro, but at a fraction of the cost and much faster. And it's the same with the batch API. 
Like you can get a fifty percent reduction in costs just by using the batch API. So the cost of maintaining all of these websites over time has actually gone down even though like more features have been added and they've gotten much cooler just because the model prices are also going down pretty exponentially. Are you willing to tell us, like, how much does it cost to run one of those nowadays? Is it, you know, hundreds of dollars to have this all be running on your website? Yeah. So for so so I deploy everything via Cloud Run, which is which is pretty low cost. The and and some, I think, for for all of the things that you see on web page dot dev, now that I've replatformed to use Gemini three point one Flash Lite and Nano Banana two instead of Nano Banana Pro for the for the number of ones that use Nano Banana two. I think the most expensive service is the embeddings, but it's still been less than, like, two hundred dollars to maintain everything over the course of a year. Wow. Yeah. For as many prolific projects that are on there, I am shocked. Yes. That's impressive. It's it's super super cool. And, like, the there are also a lot of open models that are becoming frontier quality. So so I think the the cost is only going to go down more over time. Yeah. Yeah. It's really interesting to see too the like, all the projects that you're kinda like pace of putting out like really interesting projects and sharing out. It seems like the turnaround sometimes is just immediate that you might deploy a project Yeah. In hours. Is that This is true. Like, the the latest one, I haven't added it to my website yet, though this is a good incentive too. It's a traffic cam guesser. So it pulls in all of the live traffic feeds from around the United States for at least like a half dozen cities so far. And then like removes using Nano Banana to any street signs or identifying sites. And then you have to guess which neighborhood the traffic cam is from. 
And then, like, it gives you kind of GeoGuessr-style hints of like, here's what you should have seen to help guide your choice. And you can also use OAuth to log in and kind of track progress too. My favorite one though is the Wikipedia racer, which is like like I I am just I am enchanted by depths of Wikipedia and Wikipedia rabbit holing. And so having something that will just create Wikipedia rabbit holes for me is is pretty magical. Yeah. I love that. I I think I went on there too and there's like levels of difficulty as well. Oh, yeah. You really decked out it's not point a to b. There's actually you can customize your WikiRace journey. Is that how many cc's your Exactly. WikiRacer has. Yep. And then also you can commission your own adventures. If you want something that's like related to Kanye or like related to related to like meditative Gregorian chants or whatever it is. Like, it will create a Wikipedia race for you specifically specifically for that. And then it will also show you over time as more people try it like what their score is and how their paths diverge from yours. Oh, wow. Yeah. Do you think you've always been into this, like, speed run? Will it bring joy? Like, beautiful form of tinkering? Do you think like like when you're a kid, someone could have guessed this person's gonna speed run apps. Yeah. I oh gosh, no. I grew up in I grew up in like the tiniest little farm town in Texas. So like thirteen hundred people, fifteen miles away from the nearest grocery store sort of situation. And also in university, like, don't think any of my professors would have guessed it. Though they they though they probably would have said something to the effect of like, oh, Paige cares about computers too much. Like, she's gonna be a bad physicist. Like, it's it's like Yeah. 
So I think the world has changed kind of recently to appreciate people who are whole humans, who care about not just one thing but lots of things, and who are capable of seeing connections between lots of things. So a couple of favorite anecdotes. Like, Claude Shannon came up with information theory because he took a philosophy class. Like, there are other scientific breakthroughs that have been inspired by things that are completely outside of the sciences, that are in the arts. And then also, as an example, Ted Nelson introduced Alan Kay and his wife, because Alan Kay's wife Bonnie was trying to write the screenplay for Tron. And she ended up basing, like, one of the characters in Tron on Alan Kay. Oh, wow. And that was the first screenplay to ever be typed and printed off on a computer, because it was using the Alto at Xerox PARC. So it's like, one of the most beloved entertainment things, Tron, was only possible because of this interplay between, like, tech plus computers. And last one, Larry Tesler got inspired by zines for the copy and paste feature. Oh, wow. For Macs. Yeah. Right? Like, because he was making zines, cutting out things and pasting them together, and that inspired copy paste. So I feel like, you know, the magic happens in these margins, and people had historically kind of been pushed into these silos, and now we're getting to the place where we're bringing back people into appreciating whatever they think they're bringing... of Victorian, like, polymaths, who did all of these different... Yep. Fields. It kinda feels like we're heading back in that direction a little bit. And they also were, you know, just doing what enchanted them too. Because they would, like, play an instrument, or they would go hang out with their friend who was doing art in the park, and, you know, those things are important. 
Like, we should all be whole humans, not just like feel like we need to over index on one strength. How do you choose what to try? Like, how do you choose like, oh, I'm gonna stitch a to b. I'm gonna try this tool today. That's how are you figuring out? And like, how many, like, of the things you try? Like, how many do you are successes and like failures? Yeah. So some of some of them, like a lot of the time that the thing that actually gets published is not the thing that I originally wanted to do. I find that there's some sort of constraint in the tool or constraint in the API that I'm trying to call that means that I have to pursue an alternate path. So so as an example, one one thing that I that I had been working on was I also have a whole collection of just like emails that get sent to me every morning. One of which is pulling a new a new painting from the Art Institute of Chicago or like a new art piece from the Art Institute of Chicago describing it like like a rough and tumble Chicago and and like a in a cool way. And then also like peppering in like really interesting insights about the city or the author that you wouldn't expect, which is using the the Google search grounding feature. And then formatting it formatting it nicely. But the reason that that that happened was because the Art Institute of Chicago was one of the only ones that had an API that you could call that did not require kind of like username password and I didn't wanna store that in plain text. So so like that entire experiment was only possible because you could you could sort of ping the API and and get that response back. Like originally, had been like, oh, well, I should pull from a whole bunch of different museums and just do this generic thing. And then this gave it a little bit more of a flavor and personality. Yeah. Yeah. I think that's kinda one of the things I think I've learned from, like, making images of Nano Banana too is sometimes you've just gotta, like, roll with it. 
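Editor's sketch: the Art Institute of Chicago project described above hinges on a museum API that can be called without a username, password, or stored key. The snippet below is a minimal, hedged illustration of that idea, not Paige's actual code. It assumes the public endpoint at api.artic.edu/api/v1/artworks, the `page`, `limit`, and `fields` query parameters, and a response that carries a `config.iiif_url` base for image links; the helper names (`build_artworks_url`, `summarize`, `fetch_artworks`) are invented for illustration. The Gemini description and search grounding steps are left out.

```python
import json
import urllib.parse
import urllib.request

# Assumed public endpoint; no username, password, or API key is sent.
API_BASE = "https://api.artic.edu/api/v1"
# Assumed field names; trimming fields keeps the response small.
FIELDS = ["id", "title", "artist_display", "date_display", "image_id"]


def build_artworks_url(page: int = 1, limit: int = 1) -> str:
    """Build the listing URL for one page of artworks."""
    query = urllib.parse.urlencode(
        {"page": page, "limit": limit, "fields": ",".join(FIELDS)}
    )
    return f"{API_BASE}/artworks?{query}"


def summarize(payload: dict) -> list:
    """Reduce a decoded API response to the fields a daily email would need."""
    iiif_base = payload.get("config", {}).get("iiif_url", "")
    summaries = []
    for item in payload.get("data", []):
        image_id = item.get("image_id")
        # Assumed IIIF URL pattern; only built when both parts are present.
        image_url = (
            f"{iiif_base}/{image_id}/full/843,/0/default.jpg"
            if iiif_base and image_id
            else None
        )
        summaries.append(
            {
                "title": item.get("title") or "Untitled",
                "artist": item.get("artist_display") or "Unknown artist",
                "date": item.get("date_display") or "",
                "image_url": image_url,
            }
        )
    return summaries


def fetch_artworks(page: int = 1, limit: int = 1) -> list:
    """Call the API over the network and return summarized artworks."""
    with urllib.request.urlopen(build_artworks_url(page, limit), timeout=10) as resp:
        return summarize(json.load(resp))
```

Calling `fetch_artworks(page=3)` on a schedule would hand one artwork per run to whatever step writes the email. Keeping `summarize` separate from the network call means the parsing can be checked against a saved response without hitting the API.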
Yep. Like, you can't that that initial idea you had, you can't hold on it too hard because you're just just too frustrating to get to and just too good to be open to, like, what what the what the model brings you, what randomness brings you today. Exactly. Like, it's it feels one of the the things that's been most enchanting about AI as as a person who, you know, computer programmers historically have very much wanted to must control everything. Like, I must deterministically know the input and deterministically know the output. And now it's like, it's definitely not that. It's it's kind of like, alright, you can set the constraints or you can attempt to set some constraints. But what you're gonna get is real different. And so you either need to be able to kind of do that in an artistic way that appreciates randomness and serendipity. Or you're gonna get real frustrated and not wanna be a computer programmer anymore. Like, so That does make me wonder, like, if statisticians are kinda like have a bit of a heads up there because you're so used to the idea of like randomness. Yeah. Like, I always like found this idea of like, you know, like sort of quantum physics and it feels to like many people, I think that the foundations of the universe are probabilistic, that's just like horrible. But to me, I'm just like, of course it's that way. Totally makes sense to me. Like, why does it Exactly. Why does it need to be deterministic? Exactly. And and I think it it also is is a lot more inspiring in the sense that, you know, the real world is messy. Like, and and if you if you see something that's unexpected, then it it might inspire you to go a completely different way or to see something in a way that you wouldn't have imagined necessarily. And and it's also just a lot more fun. Like, I I feel like one of the the biggest challenges right now is that, you know, universities have kind of over indexed on problem sets. 
And now like if you have a problem set, like good luck finding a person that's actually gonna do the problem set as opposed to just like paste it into whatever their favorite chatbot is and get a response back. And so they're they're not quite learning anything unless you kind of set up the constraints for the assignment to be like, alright. Well, today, you have a whole bunch of different hardware. You're going to you build it and explain whatever physical processes are required and and throughout the the building phase. And if you're trying to connect something and it doesn't work, you have to explain why it didn't work in the physics behind that. Like, that's a much cooler assignment as opposed to just like, hey, you know, what is the velocity of a versus b? Yeah. When I like to think about teaching like data science today, it's like, you know, the cost of creating a shiny app or a little interactive applet to explore something is now like zero. Like you should be doing that all the time. Like we don't need to ask these like deterministic, is this right or not questions. We can be like, hey, create a tool to explore this and then tell us what you found. Yep. Or create your own data set. Like, out some like, figure out a really gnarly question that you wanna answer and then find a way to automate the creation of a data set that will help you answer the questions. Yeah. I actually had a really like, I had a use like, I tried this with like, I only realized, like, I flew to California, like, last week when the lines at Houston's airports were getting really, really long. And, like, only at that moment I was like, I need to be scraping the wait times and like recording them in a Parquet file on GitHub. And so I like I did like a little bit on the website to kind of find and I was like, Oh, there's actually a JSON endpoint I can call that gives me nicely formatted JSON. But I was like, oh, maybe, like, maybe there's like a historical API hidden here as well. 
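Editor's sketch: the wait-time scraper described here boils down to "poll a JSON endpoint, stamp each reading with the time, append it to a file in a repository." The snippet below is a rough illustration under loud assumptions, not Hadley's code. The URL is a placeholder, the payload shape (`checkpoints`, `name`, `wait_minutes`) is invented, and it appends to CSV rather than Parquet so that it runs on the standard library alone; with pandas and a Parquet engine installed, `DataFrame.to_parquet` could take the place of the CSV step.

```python
import csv
import json
import urllib.request
from datetime import datetime, timezone
from pathlib import Path

# Placeholder URL: the real endpoint is not named in the conversation.
WAIT_TIMES_URL = "https://example.com/api/wait-times.json"
COLUMNS = ["scraped_at", "checkpoint", "wait_minutes"]


def to_rows(payload: dict, scraped_at: datetime) -> list:
    """Flatten one JSON snapshot into timestamped rows (payload shape is assumed)."""
    return [
        {
            "scraped_at": scraped_at.isoformat(),
            "checkpoint": entry["name"],
            "wait_minutes": int(entry["wait_minutes"]),
        }
        for entry in payload.get("checkpoints", [])
    ]


def append_rows(rows: list, path) -> None:
    """Append rows to a CSV file, writing the header only when the file is new."""
    path = Path(path)
    is_new = not path.exists()
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        if is_new:
            writer.writeheader()
        writer.writerows(rows)


def scrape_once(path="wait_times.csv") -> None:
    """Fetch the current snapshot over the network and append it to the log."""
    with urllib.request.urlopen(WAIT_TIMES_URL, timeout=10) as resp:
        payload = json.load(resp)
    append_rows(to_rows(payload, datetime.now(timezone.utc)), path)
```

Running `scrape_once()` from a scheduled job (cron, or a scheduled CI workflow that commits the file) would build up the history that the endpoint itself does not expose.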
And I was like, hey, Claude Code, like, can you just, like, try this out? And it came up with, like, fifteen different plausible-looking URLs and tried them all out. Like, none of them did actually yield the historical data. One of them looked really promising, like, it was like the API slash date, but then it just returned the same day. But just that kind of exploration you can do now. Like, you don't have to know a ton about APIs. You need to know what an API is, a bit about how it works. But that ability to just, like, iterate, create datasets that matter to you, collect them, store them, I think it's really cool. Yep. And if they matter to you, they probably matter to somebody else too. In which case, like, suddenly the world is filled with a lot more datasets that can be used or expanded. I mean, my plan is to have, like, a hyper-specific IAH wait time website. So you can be like, this is what it's looking like today. And the, like, historical context, like, Tuesdays look like this. And you call it "Houston, we have a problem." Or like... There it is. Yeah. Yeah. Yeah. Yeah. I feel like there's a real critical role here. A good opportunity here. Yeah. Yeah. So we heard a lot about your projects, and we learned that your projects actually involve a lot of important things that are coming out. Like, you've hidden your vegetables in them and we just didn't know we were eating them. I'm curious how it ties into being a developer relations engineering lead at DeepMind. Are these all connected somehow? A little bit. Yep. So most of the models that I'm using are DeepMind-specific models. So they would be, like, the different flavors of Gemini that we have. Mostly not Pro, mostly Flash and Flash Lite and some of the others. The Nano Banana models, Veo for video generation. We also have an open model family called Gemma, which we also expose via the API.
So you can either download it and use it yourself, or you can kind of use it via our API for zero dollars, which is quite cool. And then also, there are some tinier versions of the Gemini models that are small enough to fit on mobile devices. So we've embedded one within the Chrome browser. We've embedded them in Pixel devices. And you can kind of use the onboard intelligence without a Wi-Fi connection for zero dollars. And that's getting better by the day as well. But so those are kind of the models that we support. I also mentioned JAX, like, the machine learning framework, and all of the affiliated libraries. And then we also have some Labs experiments, as well as a tool called Antigravity that's kind of like an IDE for experimenting with these things. But all of that is kind of the products that we would expect developers, and increasingly, like, people who are developer-adjacent, to use to build things. And I'm actually really, really excited about getting information workers to be able to create and to use these tools. So not just people who are coming from engineering backgrounds or who understand software engineering, but, like, all of the people outside of that world. You said information workers? Yep. What's an information worker? Yeah. So it's... So it's just me being a barbarian. No. No. No. Excellent question. So, like, information workers, I feel like it's such a nebulous term that could apply to pretty much anybody. But, like, people who, you know, are data analysts. Or people who might be working in the sciences who, like, take in primarily CSVs and do interesting analysis on them. Like, one of the first two things that I used TensorFlow for, back when it came out a long, long time ago, was one of the projects that they released with it, which was being able to classify five different kinds of flowers.
But you could also just, like, have five different kinds of things that you wanted to classify and do that classification process pretty simply. Just kind of like a move from the five different kinds of flowers to five different kinds of other things. And so in the earth sciences, as well as in the biological sciences, there were a lot of grad students, self included, who were having to hand-classify, like, different kinds of features in reefs or different kinds of, you know, shapes of things on petri dishes or whatever it is. And instead of having, you know, a grad student spend a month going through and doing that, like, you could do it really easily just with the most basic classification task. And so, like, all of these people who are working in the sciences, there are tons and tons of those kinds of use cases that are ripe for automation, where tools exist that would save time and energy and accelerate the scientific process, that just nobody knows about. Yeah. So, like, now that the models are good enough to do this kind of work, just blazing a path for people who are experts to, like, automate away the parts that are frustrating and focus on the parts that are fun. I think, like, inside businesses too, there's a lot of people whose jobs center around Excel spreadsheets and, like, painstakingly, carefully copying bits from one sheet to another sheet and making sure they line up, or, like, getting information out of a Jira ticket and putting it on a spreadsheet, and they've got a PDF and they're extracting three bits. There's all this stuff where, like, no one enjoys doing this. Like, it's often high stakes. Like, you know, it's really easy to make a mistake. No one enjoys doing that, and there's just so much potential to... Yeah. And it also is, like, very hard on the human body too.
Because if you're copying and pasting things, like, a lot of people use mice for that, in which case the potential for RSIs, repetitive stress injuries, is really high. And so anything that can take away the parts of work that are boring and not fun, I think, is a great use for these models. Yeah. Yeah. It's an interest... yeah. It's so helpful to hear, like, all the different tasks where people need this, and sometimes it sounds like you're saying also, like, it's gotta be right, but it's also a little bit mundane. Yeah. Like, these tools can really lift. Well, it's stuff where, like, if we had to do it, we could write a program to do it, like, really easily. But if you don't know how to program, you're just completely locked out of this automation. And I think there's still a lot of questions to answer about how do we make sure they can do this when it's, like, slightly unreliable, and put checks in place. But yeah, it's really exciting. Just a lot of human misery that we can hopefully eliminate. I'm really curious, like, as somebody who builds and uses and, like, combines all of these tools, how you decide, like, what tools you're using, and how to continue growing and learning and staying current, especially being in a world where things change on the day, on the hour. Like, how are you making decisions about where to go with the tools that you're using? All of these models that get released, they're basically like supercharged engines. Right? And so, like, one of the things that can be a little bit disheartening to hear sometimes is that people want one model to kind of be the one model that they use. Whereas I think that what we see is that it's a much more effective path to, like, pick a model based on what it's really, really good at or what you need.
Like, if you need a model that has really low latency and that is really, really cost-effective and that can get the job done for a given subset of tasks, like, pick that. Don't reach for, like, the highest-end model every single time. And I find, like, so for the Gemini models, they're really good at analyzing video, PDFs with images embedded in them, audio, like, translating between different languages, like, doing real-time interactions, those sorts of things. And I find that the Anthropic models are exceptionally good at kind of writing in a way that feels like I'm talking to a friend, and also writing code. And so if I couple together, like, Gemini's ability to understand video, to generate images, to understand PDFs with, like, images embedded, and then also Anthropic's ability to, like, write code and then to maybe plan out what the work should look like, then that's a really magical experience. Whereas if you were trying to use one model to do all of those things, like, you would just not have a satisfying journey. But that's not, I guess, answering the question, which is just, like, it is a twenty-four-seven sort of situation of, like, constantly trying out new things and seeing what works for you and what works for the different tasks that you care about. It feels exhausting sometimes. Right? Like, I've never really felt, like, FOMO... Yep. ...in my career, but now it feels like sometimes it's almost, like, paralyzing. Like, if I'm working, I'm like, my God, maybe there's a new model that just came out that would, like, do my work for me or, like, you know, do what I'm doing now better. It is, like, it's an exhausting time to be a... This is the AI psychosis? Yeah. It is definitely, like... and the cost of writing software is going to zero. But that means that taste is more important than ever. Right? Like, so the ability to create things that still feel human and special and serendipitous.
Like, one of the things that I loved most about, you know, like, reading Byte magazine or whatever else is that I would sometimes find a thing that was something I absolutely needed at the time, but I would have never known to go look for it myself. And now I feel like with AI kind of driving, you know, the content that you see on social platforms, or, like, the code that gets generated in a specific style or format or tone, that kind of, like, again, is forcing people into silos or one way of viewing the world, when the thing that's really magical is helping people understand that the world is bigger, to see it in a different way. And so, like, I try to view the models as just tools. And then part of my job is figuring out how to take their outputs and to craft them in a way that would resonate with people. That really speaks to me as, like, somebody who has developed open source and is telling people all the time, like, there's no language for... and you understand, like, you know, it's not R versus Python. It's the right tool for the job. And hearing you say, you know, that applies to models as well. And just trying to deconstruct that in my mind, like, this is the model I should be using for coding. This is the model I should be using for helping me with docs or something like that. It feels very empowering that we have these muscles already. Exactly. And it changes monthly. So, like, the ability to... if not weekly. Right? Like, the ability to just kind of try things out and to accept that things will change. And then, like, again, statisticians have the edge. And people who have worked with data have the edge, because we all understand data drift. So it's just never been a more exciting time to work in this space, but also, I absolutely agree. It is kind of exhausting. Yeah. Yeah. It's so interesting.
I know we're getting long and you have to hop a flight to LA for a hackathon and... Heck yeah. It's gonna be awesome. It's a generative media hackathon. It'll be, like, Veo, but also Nano Banana. And also hopefully, like, the Lyria API as well. So people can create music and, like... Oh, nice. I also don't doubt that on the flight, there's a high chance you push some kind of app. In the time you're in the air, I fully believe that's possible. Well, I do still need, for the webcam, like, the traffic cam guesser, I still need to add it to my main website, but I also want to add more street view data in more cities. Oh, nice. So maybe, like, Houston is next on the list. There's not enough Texas representation, so maybe that's... Yeah. I love that. I know we learned one really fun fact that we chatted about a lot too, which was that you have an app that roasts your last song choice. Oh, yeah. On Last.fm. So Last.fm, I am also, like, a super, super fan of. Like, I have a scrobble history that is, like, a decade deep. So if you look at Paige's listening choices from the last decade plus, like, there's definitely been an evolution. But it will tell me, like, based on what I've just listened to, what my estimated age is, and then also, like, how pretentious I am, or how basic, based on those choices. Yeah. If you had to pitch humanity on why they should get roasted on their last song choice, how would you do that? Well, so we all take ourselves way too seriously. And in all honesty, if you're listening to the same thing over and over again or reading the same thing over and over again or using the same model over and over again, like, part of the joy of being a human is, like, learning from new perspectives. So if you get roasted, like, it's a good incentive to kind of laugh at yourself a little bit and then maybe experiment with something new.
Yeah. That's a great takeaway message. That's what I was gonna say. Yeah. I can't... you know, I'm so inspired by your, yeah, just your ability to, like, code and mix things and just find serendipitous combinations. So I can't thank you enough for coming on, and I'm so excited for the next one or a dozen things you do. Thank you. Thank you so much for having me. And I really feel like, you know, there's never been a more interesting time to care about data and to care about the way in which information is communicated to folks. Yeah. Yeah. Hundred percent. Yeah. Awesome. Thanks. Yeah. Excellent. Thank you. Thank you so much. Yeah. Yeah. The Test Set is a production of Posit PBC, an open source and enterprise tooling data science software company. This episode was produced in collaboration with creative studio, ADJY. For more episodes, visit thetestset.co or find us on your favorite podcast platform.