Infinite Machine Learning: Artificial Intelligence | Startups | Technology

LLM Data Frontiers

January 22, 2024 Prateek Joshi

Curtis Northcutt is the cofounder and CEO of Cleanlab, a data curation platform for LLMs. They have raised $30M in funding from Bain Capital Ventures, Menlo, Databricks, and TQ. He was previously the cofounder and CTO of ChipBrain. He has a PhD in Computer Science from MIT.

(00:07) Data Curation in the Context of LLMs
(01:14) Connection between Language Models and Computer Science
(03:14) Importance of Data Curation for LLMs
(04:06) Challenges in Data Curation for LLMs
(06:09) Confident Learning and its Concept
(09:42) CleanLab and its Role
(12:42) Role of Open Source Datasets and Tooling
(15:08) Balancing Data and Privacy in Regulated Industries
(17:25) Feasibility of Federated Learning
(20:35) Decentralized Compute and Aggregating Compute Clusters
(25:19) Determining Model Size for Data Representation
(27:09) Advice for ML Engineers in Handling Data Curation
(30:20) Rapid Fire Round

Curtis's favorite book: The Bible (in the context of marketing)

--------
Where to find Prateek Joshi:

Newsletter: https://prateekjoshi.substack.com 
Website: https://prateekj.com 
LinkedIn: https://www.linkedin.com/in/prateek-joshi-91047b19 
Twitter: https://twitter.com/prateekvjoshi 

Prateek Joshi (00:02.007)
Thank you so much for joining me on the podcast.

Curtis Northcutt (00:05.407)
Hey, Prateek, good to be here.

Prateek Joshi (00:07.256)
All right, let's get right into it. Data curation, obviously you're building a company, but what exactly does data curation mean in the context of LLMs? And why is it crucial for the development of a modern AI application?

Curtis Northcutt (00:25.57)
We're jumping right in. So data curation for LLMs specifically is focused on the quality of the data that's fed into the LLM during training time. So a lot of people will download something off the internet, you know, the new Llama model or whatever off GitHub. And sometimes you just kind of don't think about the training that happened to create that model. And I think one of the beautiful things to think about, and this is actually a fundamental concept of where we're headed in artificial intelligence in general, is that the same key developments that occurred when programming languages were developed in computer science are now occurring in AI and LLM models. Specifically,

How would you build Python today? How could you do it from scratch? It needs to somehow work with a graphical user interface. It has to somehow display on the screen when you run programs. There's a lot to it, right? So you don't write Python from scratch. It's actually the product of 40, 50 years of development. You first had the earliest compilers, written directly in machine code. And then there's some kind of systems code on top of that, and then assembly, like MIPS, if people have used MIPS. And then there's Verilog at the hardware level. And you're just building up from that: assembly

languages, and then you have your very low-level systems languages, and then on top of that maybe you finally get to C, and then on top of that maybe you write a compiler in C++, and then maybe you get the first version of Python. Subsequent versions of languages are often written in prior versions of languages. So why did I share all that when we're talking about data curation for LLMs? The reason is because that paradigm of how we built computer science is now being used as the paradigm for how we build artificial intelligence.

And specifically, the first language models were very small and simple. In the early days, they were things like LSTMs. And before that, they were just simple classifiers trained on one-hot encoded vectors or something called TF-IDF, term frequency-inverse document frequency. And these things were very simple. They didn't take into account any of the context of the language around them. They would just make a prediction as if a sentence were a bag of words.
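
For readers who want to make this concrete, here is a minimal sketch of that bag-of-words setup, assuming scikit-learn; the sentences and labels are invented for illustration:

```python
# TF-IDF turns each sentence into a sparse vector of term weights; the
# classifier sees a "bag of words" with no notion of order or context.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["I love this", "This is terrible", "What a great day", "I hate waiting"]
labels = ["positive", "negative", "positive", "negative"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["what a terrible day"]))  # e.g. ['negative']
```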

We built on top of that, and we then started taking into account attention, and this paper comes out, Attention Is All You Need. And then we start making these things bigger and expanding them, parallelizing them. And then we use those language models that have been trained on a whole bunch of data, and we fine-tune them to the next one, and then we fine-tune them to the next one, and we keep building. And eventually you get the language models you have today, and that process is going to keep going. All right, so now you see the connection between language models and computer science. And now, where does the data come in?

Curtis Northcutt (03:14.06)
Every time that we fine-tune a model, we carry over all the errors from the data in the first iteration of the model. So every time that you build a new language on top of the previous language, or a new language model on top of the previous language model, you carry in all the error from the previous version. And so that means that any error that was there at the beginning ends up being propagated into the future, and we keep adding more error with each iteration. And so the reason why data curation is really important for LLMs

is because they're actually full of errors that have been propagated over many years.

Prateek Joshi (03:49.076)
That's great context. And when you think about the challenges we face in curating data for LLMs, what are some of the more prominent ones? And also, how are developers addressing these challenges?

Curtis Northcutt (04:06.19)
Oh yeah, this is easy. So if you've ever seen, for example, in a language model specifically, but this is true for image models, text models, simple text classifiers, tabular data. Let's focus on language models, people know them. Say you fine-tune ChatGPT to take in a sentence and produce a label based on the emotion of that sentence. Then...

When you were training that thing to do that task, you would pass in a bunch of data: a bunch of sentences and the emotion associated with each. Some of the errors would be things like this. You pass in a sentence and it's actually not a sentence, it's just a telephone number, which obviously has no emotion. That would be an outlier; that's out of distribution. Sometimes you'll pass in gibberish, it's not even English. That's not even an outlier, it's not even part of the entire concept; that thing should be completely tossed out. That's just

bad data, some people call it an outlier. You might also have something that is ambiguous. Ambiguous data is a new class of data that we have been pushing and inventing at CleanLab; it reduces the capacity of a model trained on that data to understand the data. So an example of that might be, say your classes are frustrated and anguished, and you pass in some text like, I'm going through a lot of pain and I'm really frustrated. So is that frustrated or is that anguished? That's an

example of something that's ambiguous. And then the final and most obvious example of an error is one where the label is just wrong. So you pass in an example that says, I can't wait to go to this party, it's going to be so fun. And then the label is sad. So those are some of the examples.
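
To make those four error types concrete, here is a hypothetical emotion-classification dataset in Python, echoing the examples Curtis gives; every sentence and label is invented:

```python
# Each pair is (text, given_label); comments mark the error type.
dataset = [
    ("I can't wait to go to this party, it's going to be so fun", "happy"),  # clean
    ("555-0123", "happy"),                 # outlier: a phone number has no emotion
    ("asdkj qwpoeiru zxnm", "sad"),        # gibberish: should be tossed out entirely
    ("I'm going through a lot of pain and I'm really frustrated",
     "frustrated"),                        # ambiguous: frustrated or anguished?
    ("I can't wait to go to this party, it's going to be so fun", "sad"),    # label error
]
```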

Prateek Joshi (05:51.793)
And at CleanLab, you have been talking about something called confident learning. And obviously, I've had the chance to go through it. But for listeners who don't know, can you explain the concept and how it came about?

Curtis Northcutt (06:09.886)
Yeah, this is a good one. So obviously that was my PhD at MIT. And I had this idea about 10 years ago. I'm from Kentucky, and education was how I went from being the son of a mailman in Kentucky to doing a PhD at MIT.

And so I really believed in the educational system. I worked at edX as one of their first research scientists, which is like Coursera, if people know it. And I discovered that a large proportion of the certificates being earned, around 10%, were earned just by creating two accounts: rapidly going through getting all the answers wrong on one account to reveal the answers, then copying them into another account, pasting them, and earning a certificate.

So when I discovered that, I thought, oh no, we need to catch all these cheaters, or we can't actually democratize education, because these certificates don't mean anything if everyone's cheating to get them. So I built this ML algorithm to detect cheating, and it turns out it doesn't work. It's really unreliable. It produces bad results. The accuracy is low. And I start

looking through all of the research. I'm in a research lab at MIT, it's very research focused. And I go through all of Google Scholar, I go through everything I can find, I ask all my ML friends who are the best and brightest, and none of us can find anything in machine learning that works with noisy labels beyond a very simple base case. And it turned out that machine learning wasn't ready for real world data, which blew my mind.

And so I'm watching the whole world try to use machine learning. And I went and worked at Facebook and Google and Microsoft and Amazon. And I was at Oculus for two and a half years as, like, a contingent research scientist, and I'm working with all these groups.

Curtis Northcutt (07:58.654)
And I'm seeing all of them try to use AI for everything customer facing. And yet it doesn't work for real world data. So with that motivation, I was working at the time in Isaac Chuang's group. And if you don't know Isaac Chuang, that doesn't surprise me if you're in data science or machine learning, because he invented the quantum computer. And he was my PhD advisor. And I noticed that a lot of the algorithms and techniques used in quantum computing,

NMR quantum computing, which is nuclear magnetic resonance quantum computing, the same concepts and ideas used to figure out what computation is happening on a machine, can actually be applied to datasets. And it turned out that we could take the information theoretic ideas from quantum computing and apply them to any arbitrary dataset of any type: images, visual data, tabular data, text.

And that's what inspired confident learning. And a lot of the ideas that field embodies are actually inspired by quantum computing. And it turns out that you can have theoretical guarantees to exactly say, for certain data sets, what all the errors are. And you can even guarantee a ranking over the likelihood of any data point being an error. And so we started building on that and building on that. And before long, we had an open source package, which then eventually became the company CleanLab, that

for any data set and any ML task could identify what the errors are in the data set and actually improve them. And then you get a more advanced AI (that's a very marketing way to put it), but basically a more reliable, higher-accuracy machine learning model trained on that data set.
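
A minimal sketch of that open-source workflow, assuming the cleanlab package; the labels and predicted probabilities below are toy numbers, and in practice the probabilities would come from a model scored out of sample:

```python
import numpy as np
from cleanlab.filter import find_label_issues

labels = np.array([0, 1, 1, 0, 2])   # given (possibly noisy) labels
pred_probs = np.array([              # out-of-sample model outputs (invented)
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.85, 0.10, 0.05],              # the model strongly disagrees with label 1
    [0.70, 0.20, 0.10],
    [0.10, 0.20, 0.70],
])

# Indices of likely label errors, ranked by how confidently the model
# disagrees with the given label.
issues = find_label_issues(labels=labels, pred_probs=pred_probs,
                           return_indices_ranked_by="self_confidence")
print(issues)  # likely flags example 2
```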

Prateek Joshi (09:42.024)
It's very interesting how a premise or a construct from one field turns out to be so useful in another. It happens all the time, but I think connecting those two areas and building something useful is very interesting. The premise is: you take the data, you locate the errors, you improve it. In practice, let's say you're working...

with a big company, a customer maybe, and they have all these years of data; most of it is bad, some of it is good. In practice, how does it work? Let's say I have a lot of text documents going back to 2005, for example. What goes in and what comes out, and how do I use that outcome in a real world setting?

Curtis Northcutt (10:34.89)
Yeah, and it's important to keep in mind that the answer to that question varies based on the development of the technology. So the answer now is: you just upload a data set and it spits out the errors. And that's it. Literally any data set, if you have data and it's labeled, then you just upload it to app.cleanlab.ai and it will tell you all the errors in that data set and even train a better model for you. And that's that. So that's today, 2023 going into 2024. Where we were, for example, five years ago

Prateek Joshi (10:50.768)
All right.

Curtis Northcutt (11:04.944)
was you had to train a model first on the data and then output a bunch of predicted probabilities, and it had to be trained in a very specific way. It had to be trained out of sample using cross validation with a large number of folds, and that wasn't automated at the time. You had to do that yourself, and then, using the predicted probabilities and the labels, you would pass those in and it would tell you what are the outliers, what are the label errors, et cetera. I think there's a key point to note here.

Curtis Northcutt (11:34.704)
The fact that the input to the confident learning algorithms was not the data itself, but actually the outputs of models, meant something really important. It meant that we can apply the Markovian assumption, which basically says that where the data came from, what the data was, even the model that generated the outputs, none of it matters. What matters is the outputs of that model. And that allowed confident learning to work with any model

and any data. And that's why I always emphasize the point that these solutions and these algorithms don't just work for text. They also work for Tesla images, right? Self-driving cars. They work for banks, financial institutions, tabular data, where you've got a bunch of columns like an Excel spreadsheet with a label column. Anything that can be represented and input into a machine learning model can be checked for errors and improved.
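
Here is a sketch of that older, manual workflow, assuming scikit-learn and cleanlab; the synthetic dataset stands in for any featurized task, which is the model- and data-agnostic point Curtis is making:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from cleanlab.filter import find_label_issues

# Placeholder features and (possibly noisy) labels.
X, y = make_classification(n_samples=300, n_classes=3,
                           n_informative=5, random_state=0)

# Out-of-sample predicted probabilities via cross-validation: every
# example is scored by a model that never trained on it.
pred_probs = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                               cv=10, method="predict_proba")

# Confident learning sees only the outputs and the labels, never the
# raw data or the model that produced them.
issues = find_label_issues(labels=y, pred_probs=pred_probs)
print(f"flagged {issues.sum()} suspected label errors")
```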

Prateek Joshi (12:29.744)
Amazing. And maybe this is a good stopping point to quickly talk about what CleanLab does. So for listeners who don't know, can you quickly explain what the company does?

Curtis Northcutt (12:42.14)
Yeah, there's sort of two personas.

The 10 year goal is to have anyone be able to build most AI solutions in hours or days, not months or years. So that's where we're headed. To achieve that, you have to recognize where the time and cost is spent in developing AI solutions. If you're familiar with Andrew Ng, he's commonly quoted as saying 80% of the time and cost is spent on the data engineering and the data quality. That's why we automate that.

That's our first step; you know, you don't go straight to Mars, you go to the moon first. So we go to the moon first and we make sure that we can automate that 80%.

What comes after that is then better AutoML, better model deployment. And the world that we're building is one where you can imagine, for example, a doctor in a hospital who sees: wait a minute, every day, me and my team, my staff, we analyze patients' cardiovascular health, their heart rate, their blood pressure. We analyze their breathing rate, their history, and we look at all that data, and then we say, hey, this person needs a check for cancer or not.

And when they start to recognize, hey, I feel like there should be a computer system that could do this for me. But the reality is lots of the nurses have made errors, the doctors have made errors, the data has all sorts of outliers, random notes they threw in there. It's a mess. So what would be done today in 2023 without CleanLab is you'd have a team of data scientists that would have to go through and improve all that, writing a bunch of ad hoc Jupyter notebooks and scripts. Then you'd send it off to some labeling company. They'd get you really good labels, and it would cost you a pretty penny.

Curtis Northcutt (14:24.108)
And now, after all this time, effort, and cost, you then have some machine learning engineer who figures out how to train the model, and then a systems engineer figures out how to deploy it. So that's the steps that it takes today. That's what we're automating tomorrow.

Prateek Joshi (14:38.648)
Right. Let's talk about the role of open source. You mentioned CleanLab started off as an open source project, and now it's a company. When it comes to building foundation models, not just text, but any modality, what is the role of open source datasets in building useful applications? And also, more than that, because data curation takes time and effort.

If you have the resources and means to take a large data set and clean it, there's less motivation to just put it out there in the open for others to use. So what's your view on open source data sets and also the tooling?

Curtis Northcutt (15:24.018)
Yeah, so first, something to remember if you're listening to this podcast: whenever anyone asks, what's the role of something, the reality is we haven't discovered all the roles it has. That means that my answer is limited to where we are today. That being said, I think that a very clear way to think about open source data is as a basis. So you train a model on data that's open source, and then you've got some stuff that's closed source,

private, that for some reason you can't share: it's user specific data, it has private information, there are HIPAA compliance issues. There are a million reasons why you wouldn't open source certain data.

So the way you can think about it is: find the open source data that's the most similar to your problem, and hopefully there's a lot of it. It's not going to be the same. It's not going to be perfect for you, but train a model on that. That does reasonably okay for your task. And then you fine-tune that model. If you're not familiar with fine-tuning: you do the first 80% of training on the open source stuff, and now you've got a model; save it at that point in time, then take that model and start training it on your specific closed source data. And now it sort of fine-tunes. Think

of it just like school. Everybody goes through the same generic K through 12 education in whatever country they're in, but then you go to college and you major in something. Majoring in something is like fine-tuning on your specific closed source data. And then if you want to fine-tune on an even smaller subset, you go to grad school. And if you want to know everything about nothing, then you get a PhD.
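
As a rough sketch of that pretrain-then-fine-tune pattern, assuming the Hugging Face transformers library; the checkpoint name is just an example, and `private_dataset` is a hypothetical tokenized dataset you would build from your closed-source examples:

```python
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# "K through 12": start from a checkpoint already pretrained on open data.
name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=3)

# "College": fine-tune on your task-specific, closed-source data.
# private_dataset is a hypothetical tokenized dataset prepared separately.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned", num_train_epochs=3),
    train_dataset=private_dataset,
)
trainer.train()
```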

Prateek Joshi (16:59.826)
Right. So, yeah, it's interesting. All right, so you talked about privacy. That last bit was funny. All right, if you look at the need for privacy, obviously, you mentioned many industries are regulated, and there are many reasons you can't just release stuff.

Curtis Northcutt (17:03.948)
I seem to have you lost for words, Prateek.

Prateek Joshi (17:25.268)
So when it comes to data curation, companies, for example companies in healthcare, need to balance the need for comprehensive data against the respect for individual privacy. So how does a company in such a regulated industry do this? And also, obviously, let's say you want to be compliant. How do you get people to...

collaborate? Because one of the big reasons why, for example, OpenAI works is just the sheer volume of data they have. So how do you get players in a regulated industry to collaborate on creating big, meaningful datasets?

Curtis Northcutt (18:07.014)
Yeah, you've got sort of two options. This is a very complex question, so let's keep it short, and if you want more info, we can go deeper. I think of it as two options. There's either a middleman or there's not. So you either go direct to the supplier or there's a middleman. If there's a middleman, you could think of it like: I'm a healthcare company with data. I could train on that data myself, except I'm a healthcare company; I don't have a bunch of ML researchers and engineers. So I can go direct to an ML company and have them do it for me, or I can go through a middleman.

The middleman would be like a consulting firm. That's your Deloitte, your McKinsey, or some smaller ones that are fantastic that a lot of people don't know about, like Berkeley Research Group, West Monroe, Alvarez & Marsal; these are fantastic tech consulting companies. The healthcare company can go to those tech consulting companies and have them build the solutions for you. What are those consultants doing? They're gonna use software like CleanLab. They're gonna use ML software that they have learned how to use, that they're trained to use, that makes their life simple, and they'll solve the problem.

The way that this gets to the question that you're asking is that the healthcare company has healthcare compliance issues, and if you're an ML company, dealing with that is very difficult, especially in the early days. For example, if a founder is listening to this, or a startup, someone interested in starting a company, or someone just curious how you would do this: it's not easy if you're a seed stage, early stage startup company to go and immediately start working with massive healthcare institutions that have HIPAA compliance. So what you do

is you support the consulting firms. The consulting firm then handles that compliance with you. They have a massive org to handle those kinds of things, tons of lawyers. And then those consulting firms, they need support on the machine learning side. So that's where you support them. Are you gonna get the full check? No, you're not. But are you gonna get a check? Are you gonna be able to move forward and grow your business? Yes. Start there, help. If it turns out that eventually you can support that customer directly, you can do that. But in the early days, I think that's a simple way to get by. The other

option is you go direct to the supplier, right? And in that case you have a lot more legislative things to figure out and a lot more regulations.

Prateek Joshi (20:15.556)
Right. And in a situation like this, what's your view on federated learning? Is it a useful premise? Is it practical? And also, can we build a real world system that big companies can use?

Curtis Northcutt (20:35.486)
Yeah, federated learning is a concept that makes sense. But being a concept that makes sense is not the same as being an applied, useful solution that everyone is using reliably today.

But I can give you examples. Like, let's go back to what we talked about with confident learning. The outputs of a model are used, not the data itself. So that's an example where every device in the world could be training some very simple machine learning model, and then the outputs of that model could be sent without the data being sent directly. And this gives you a way to do federated confident learning, where you have outputs, but no data, being shared from a whole bunch of devices. And you can imagine then taking all those outputs

and then combining them in an intelligent way, and then updating a model and sending that back. It's a very reasonable thing to do. Actually building the infrastructure to make that work, in a way that people understand and can trust, that's a totally different issue.
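
A toy sketch of that federated pattern, again assuming cleanlab; each simulated device here shares only model outputs and labels, and all the numbers are invented:

```python
import numpy as np
from cleanlab.filter import find_label_issues

rng = np.random.default_rng(0)

def simulate_device(n_examples, n_classes=3):
    """Stand-in for one device: it sends predicted probabilities and
    labels, never its raw data."""
    pred_probs = rng.dirichlet(np.ones(n_classes), size=n_examples)
    labels = rng.integers(0, n_classes, size=n_examples)
    return pred_probs, labels

payloads = [simulate_device(50) for _ in range(4)]  # four devices report in

# Server side: pool the outputs and run confident learning on them.
all_pred_probs = np.vstack([p for p, _ in payloads])
all_labels = np.concatenate([y for _, y in payloads])
issues = find_label_issues(labels=all_labels, pred_probs=all_pred_probs,
                           return_indices_ranked_by="self_confidence")
print(f"{len(issues)} suspected label issues found without sharing raw data")
```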

Prateek Joshi (21:35.472)
Right. Another construct that comes to mind, again, it's very early, but many smart people are trying to work on decentralized compute, where in theory they want to aggregate many little compute cluster nodes and then provide that as a service, just like you would buy from AWS or Azure or Google. Now, two-part question. One, is that even feasible?

Can you aggregate compute and provide it reliably? And two, even if it's feasible, are customers willing to buy such a service?

Curtis Northcutt (22:13.534)
It's absolutely feasible, and they can be willing to buy it. Think about, like, why do people buy a Tesla? It's a great car, but that's not the reason they buy it. They buy it because they believe in a vision. People care about the planet and the Earth. Does it really have that much to do with that? I mean, it's very, very bad for the Earth to build those batteries. The amount of energy and fuel it costs to build one of those batteries is pretty high. Now, the argument by Tesla is that, most of the time, the batteries outweigh that cost to the

earth overall, and that's great. But the point is that people don't buy it just because it's a great car, because a Mercedes-Benz is also a great car and a BMW is also a great car. They buy it because they believe in the mission, and it happens to also be a great car. So the reason I share that example is because there was a case where a group at the University of Washington, I don't quite remember the professor who led the group, I'm thinking Dan Weld but I'm not sure, a

Curtis Northcutt (23:13.448)
group of faculty at the University of Washington built the first system that could take a bunch of compute off everyone's computers and use it to identify protein structures. And this is pre-DeepMind. Now we have much more advanced ways, and actually LLMs are doing this like a thousand, probably more than a thousand times faster. But at that time it was a big deal, because we were able to identify protein structures much faster by spreading the compute across everyone's computers. And people did it, Prateek, they actually did it.

They got the program, and when their computer was sitting at home and they weren't working, they let it find protein structures, because they believed in the cause. Now I've thought to myself, why didn't those University of Washington faculty then use that same ecosystem and let people just get paid for their compute? Right? That's a reasonable next step. And I think it's because they were focused on the academic problem. But I think the buy-in was there, and that jump, with the right sort of email sent and a little bit of marketing,

could have gotten them there. And so that's an example where what you suggested could actually happen in a feasible way.

Prateek Joshi (24:20.116)
Right. Yeah, I think so, especially as the world is clamoring for more compute. And obviously, Nvidia is gearing up to supply all the chips you'll need, but you need cloud service providers to aggregate that hardware and offer the compute services. I think this could be a very interesting premise. And if you look at where

the world is heading, there was a very interesting paper a while back about training compute-optimal LLMs. And basically it talks about the balance: for a given data set of a specific size and shape, there's only a certain number of parameters that the model can usefully have, after which adding more just doesn't mean anything. So in this case, when you talk to your customers about, hey, you have this data, images, text, videos, whatever it is, how do you advise them about

the size of the model that can offer a reasonable representation of the data they have?
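
The paper alluded to here is likely "Training Compute-Optimal Large Language Models" (the Chinchilla paper). One rough reading of its result is about 20 training tokens per model parameter; the sketch below treats that constant as an approximation, not a prescription:

```python
# Rough Chinchilla-style heuristic: ~20 tokens per parameter.
TOKENS_PER_PARAM = 20  # approximate constant, not an exact law

def compute_optimal_params(n_tokens: float) -> float:
    """Approximate compute-optimal parameter count for a corpus size."""
    return n_tokens / TOKENS_PER_PARAM

# e.g. a 2-trillion-token corpus suggests a model of roughly 100B parameters.
print(f"{compute_optimal_params(2e12):.1e}")  # 1.0e+11
```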

Curtis Northcutt (25:29.098)
Yeah, there are two ways to handle it. One is you let them pay for bigger models, and two is you don't talk about it at all. You know, you just handle it for them, and you're giving them something that provides value, and you just obfuscate the fact that you could spend more internally to give them a better result; they found value with what they're already paying for. Now there's also a meta solution. The meta solution is, and this is what most people end up doing eventually, but as a

Prateek Joshi (25:32.952)
No.

Curtis Northcutt (25:59.252)
company, you usually obfuscate this and you just kind of give one solution that fits all. But over time, what people end up doing is creating a way for a customer to choose the ideal solution for them on what's called the Pareto frontier, which you can basically imagine as a curve: at one end, you're getting the best performance, but you're paying a lot of money and it's taking a lot of time.

And then as you go along that curve, it takes less time and less money, but the accuracy goes down. And so you're trading off cost, time, and accuracy. And somewhere on that curve is optimal for every customer. And so you just let them choose what's good for them.
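
An illustrative sketch of that Pareto-frontier menu: given candidate configurations scored on cost and accuracy (all numbers invented), keep only those not dominated by a cheaper and more accurate alternative:

```python
candidates = [
    {"name": "small",  "cost": 1.0,  "accuracy": 0.80},
    {"name": "medium", "cost": 4.0,  "accuracy": 0.88},
    {"name": "large",  "cost": 20.0, "accuracy": 0.91},
    {"name": "waste",  "cost": 10.0, "accuracy": 0.85},  # dominated by "medium"
]

def pareto_frontier(items):
    """Keep items for which no other item is both at least as cheap and
    at least as accurate, with at least one strict improvement."""
    def dominates(b, a):
        return (b["cost"] <= a["cost"] and b["accuracy"] >= a["accuracy"]
                and (b["cost"] < a["cost"] or b["accuracy"] > a["accuracy"]))
    return [a for a in items if not any(dominates(b, a) for b in items)]

for option in pareto_frontier(candidates):
    print(option["name"], option["cost"], option["accuracy"])
# prints small, medium, large; "waste" is dominated and dropped
```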

Prateek Joshi (26:41.484)
Right. I have one final question before we go to the rapid fire round, and this is mostly centered on ML engineers, or rather, how they should train themselves to handle the complexities of data curation in AI. Because, sort of as the premise: when you're in college, everything is nice and controlled. The data sets are super clean. They're organized for you. So

I feel like there's a bit of a shock as you go from an academic environment to a real world company, where the data mess is the biggest shock. So if you had to kind of coach or guide a new ML engineer, what would you tell them?

Curtis Northcutt (27:28.222)
Yeah, at a high level: at this point, we have an entire suite of tools available, and I would advise people to actually take a moment and look at what's out there. You're not going to know everything, and it's evolving faster than you can keep up with today, and that's something you have to recognize. But I'll give three key points of advice. One is: do not build everything from scratch. It's a total waste of time unless you're just trying to teach yourself and learn. If you're just trying to learn, then it's totally worth it. Otherwise, it's already been

done. Then go online and try to figure out what's there, and don't just search Google. You have to go to the GitHub repos, see who's following and starring them, look at other things in that same space, compare, try three different packages. It takes time, and I think it's difficult in a world today, where there is so much thrown at you non-stop and everything is in three-second bite-sized clips, to have the attention span to actually go through, say, ten different technologies and use the best one.

But if you're willing to invest that effort up front, you will often save yourself a lot of pain in the future. The second thing is, I highly recommend that if you're a new engineer, new to the field, you actually look at some of the online courses, because they're pretty good and they've improved a lot. There's a course at dcai.csail.mit.edu where you can, for free, learn about data-centric AI. And it's a great way to get started.

And the final thing is we need to recognize that the future of education, of knowledge, is not about knowing things.

It's about knowing how to know things. And that's important to recognize. The best engineer 10 years from now, the best programmer, will not be the one who has programmed the longest. It'll be the one who's programmed for a long time, for sure, but who is also deeply engaged with new tooling. And if you're using something like GitHub Copilot, which won't exist in 10 years, there will be something else. Whatever it is, if you are the best in the world at using that, and

Curtis Northcutt (29:37.32)
you also happen to be a good coder, the combination of those two will make you superior to someone who does not have AI assistance, because now you have access to all of the code that's ever been written, if you know how to wield that tool masterfully. And it's important to recognize that that's where you should be spending your time, instead of just learning how to code.

Prateek Joshi (29:58.008)
That's amazing. And a good portion of the listeners are either aspiring founders, early stage ML engineers, or people who are already running early stage companies. So I think this is a wonderful piece of advice. All right, with that, we're at the rapid fire round. I'll ask a series of questions and would love to hear your answers in 15 seconds or less. You ready?

Curtis Northcutt (30:20.754)
Alright, let's do it.

Prateek Joshi (30:22.068)
All right, question number one, what's your favorite book?

Curtis Northcutt (30:26.606)
Oh, you caught me off guard right from the start. For everything, it's different. But I would say for marketing, the Bible.

Prateek Joshi (30:37.416)
Right, question number two, what has been an important but overlooked AI trend in the last 12 months?

Curtis Northcutt (30:48.214)
Mm.

Curtis Northcutt (30:55.134)
I think innovations in some of the multimodal LLMs, but it won't be overlooked next year. It was just overlooked last year.

Prateek Joshi (31:02.704)
Yeah. What's the one thing about data curation that most people don't get?

Curtis Northcutt (31:16.394)
I think people tend to think that if you have a mass of data thrown into some file somewhere, it'll just somehow magically become exactly what you want it to be, without specifying ahead of time what you want it to be.

Prateek Joshi (31:30.268)
Right. What separates great AI products from the good ones?

Curtis Northcutt (31:37.746)
A good AI product is something that makes the company and the founders money. A great AI product makes the whole world money.

Prateek Joshi (31:48.496)
Amazing, love that. The next question, as a founder, what have you changed your mind on recently?

Curtis Northcutt (31:59.338)
If I could do it all over again, I'm not sure that I would be a founder, because I was super naive about how difficult it is. But now that I'm here, I'm grateful I am where I am.

Prateek Joshi (32:10.932)
Right. What's your biggest AI prediction for the next 12 months?

Curtis Northcutt (32:21.058)
Biggest is hard to define, but I think the world has yet to actually figure out the promise of where LLMs and generative AI are headed. My prediction is that we'll have more of an idea of how to use these things in the B2B case, not just the B2C case. So businesses, not just consumers.

Prateek Joshi (32:39.86)
Right. Final question. What's your number one advice to founders who are starting out today?

Curtis Northcutt (32:48.098)
Continue to be naive and continue to go to the gym and take good care of yourself. You have a long journey ahead of you.

Prateek Joshi (32:56.74)
I think that's a brilliant piece of advice, the gym thing. I think physical fitness is something people tend to ignore, or they'll think, I'll just put it off for the first few years. But I think it's absolutely important; that'll help you maintain the mental sanity that you need, because it's a long journey. So I think that's brilliant. Curtis, thank you so much for coming on the show. Love your sharp opinions and your depth of knowledge here. So appreciate you sharing your insights.

Curtis Northcutt (33:25.422)
Thank you, Prateek.

Prateek Joshi (33:27.288)
You got it.