Rui (00:21)
welcome back to Floating Questions. Today is a special one. We're bringing back a guest from our most-listened-to episode. You likely know Amine as the boy from Fez, Morocco who went from the Math Olympiad to MIT, or as the co-founder of the NGO Math&Maroc. But today we're talking about his next chapter. Amine completed his PhD at the MIT Operations Research Center
and is now a professor at Northwestern's Kellogg School of Management. He's here to discuss an idea that challenges the biggest trend in AI right now. What if the answer isn't more data, but better data? Amin, welcome back to the show.
Amine (01:04)
Thank you, Rui. Thank you for having me again.
Rui (01:07)
Yeah, it's been a little while since our last conversation, and I'm super excited about your latest work studying what the optimal set of data for decision-making is. Could you give a basic explanation of your hypothesis, the main motivation, and a little more context about your recent work?
Amine (01:28)
I know I'm going to dedicate the next years of my life to doing research, so I wanted to take some time to reflect on what questions really matter.
And my reflection went as follows. I've worked a lot on algorithms, designing algorithms to better use data to make decisions. One fundamental question is how to use data and information to make better decisions. What I have noticed is that the algorithms themselves have gotten really good now, and improvements on them do improve what you do,
but most of the quality of the decisions we make is attributed to the quality of the data itself. And what reinforced this thinking even more is that now, with LLMs, we've created this powerful tool that can really extract whatever information is in our data. We now know how to extract it really well. So a big question is:
what data really enables us to make good decisions? If I care about a specific decision, what data best informs my decision? And I think this is going to be a very important question going forward, because last year I was at NeurIPS, the biggest AI conference nowadays, and it was Ilya Sutskever, who used to be at OpenAI, who said something like, data is the fossil fuel of AI, right?
And we've kind of consumed all of it already. So the models are trained on all this data, and it enables us to have these foundation models that know language, know how to communicate, and can extract the basic human knowledge that's out there. But the question now, on what's coming next, is not scale anymore, because we're pretty much done with what we could scale, but what data should we create now?
Say I want to create some AI that is specialized in specific tasks I want to accomplish, some intelligence. What does this intelligence have to learn, and what data informs it best? And this data would not just be, again, existing databases stored somewhere on the internet, Reddit or whatever, but new observations of our world.
That's the way I think about it. Let me give an example; I think this will make it a little clearer. Let's say what I care about is designing a subway line from Northwestern to O'Hare Airport. There is no train from here to O'Hare, so I'd love for that to exist; I hope someone listens to this and acts upon it, but...
Amine (04:17)
Let's say we're trying to design that, and I'm giving this as an example of a decision-making task we want to accomplish. Let's say I have the most powerful LLM that exists out there, take GPT-whatever, the best that's out there. And I want to say, OK, I want to deploy the line, and I want to know
what route the subway line should take so as to minimize the construction costs and accomplish whatever objectives we care about. Maybe we want to minimize the eventual environmental impact, minimize the building costs, which depend on the soil, the houses we need to demolish and the property we need to buy, the disruption to traffic. There are a lot of factors involved. So the best decision here is the decision that finds the
best plan for building my subway line. So this is a well-defined problem, and if I ask my LLM to solve it, what it will do is look at all the data out there that we fed it from the internet, try to estimate all the possible building costs across the map of Chicago, and then it would
maybe, if it's really good, use some optimization model to find the optimal path, right? But no matter how good my LLM is, and here I assume the best possible superintelligent model, there are things it doesn't know, just because this knowledge doesn't exist: if I select a specific part of the city and I want to know the building cost there, that's just something we don't know.
We have to go and study it, we have to go and collect the information. It's data that does not exist yet. So an important step in this application, and this is a concrete thing we already do now, is that before building, we need to go ahead and collect relevant data for our decision-making. We need to dispatch our engineers to certain places in Chicago and collect the data.
And the quality of this data is really what's going to define how good our subsequent decision is. Of course, it will also depend on how we use this data, but assuming we have that capability, which we mostly do now, the real question is: what data should we collect to make the best decision? Again, the most important thing I want to highlight is that no matter how good the models are, models are not magic.
If they don't have the right data, they cannot create knowledge out of nowhere. And as a parenthesis, I always find it kind of fascinating when people say these models are sentient or creative. It's really just a function that takes something as input and cannot create anything outside of that. It can mix it in multiple ways, but
if some knowledge does not exist, if we did not collect it, the model is not going to create it out of thin air. So the key question then is: given the decision-making process we care about, what data matters? And this question is highly non-trivial. When I thought about this, I couldn't find
any algorithm or any approach that would answer that question. This is, again, a simple thing: I want to find the shortest path from one point to the other, and I want to know where I should go and collect my data to best inform this decision. I couldn't find many satisfactory answers to that. And I thought, I believe this is an important question and it should be addressed. And to show you how non-trivial it is:
again, let's say I want to build a subway line from Evanston to O'Hare Airport. These two locations, for those of you who don't know Chicago, are on the northern side of Chicago. Now, if I want to build that line and I want to know where to send my engineers, I know intuitively that I should not send them to the south side of Chicago, because it's likely irrelevant to the decision I'm trying to make.
Maybe also not the extreme north, maybe not the west, something in between. Which means the answer is non-trivial. There is some strategy there, some way to link the decision that we want to make with what I call the uncertainty, meaning: what do we already know, what do we not know? Maybe some places were studied before, maybe we have a lot of data on some regions. So the relationship between the decision-making, the uncertainty,
and the data is quite non-trivial. This is what I was excited to study and have been studying for the past year.
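To make the intuition concrete, here is a minimal sketch of that relevance idea. The place names, cost intervals, and the one-segment-at-a-time test are illustrative assumptions, not anything from the paper: a segment is only worth surveying if sweeping its cost over its uncertainty interval can change which route is optimal.

```python
# Candidate subway segments with uncertain construction costs, as [low, high]
# intervals. Names and numbers are made up for illustration.
segments = {
    ("Evanston", "NorthBranch"): (4.0, 9.0),
    ("NorthBranch", "OHare"):    (3.0, 8.0),
    ("Evanston", "WestLoop"):    (6.0, 8.0),
    ("WestLoop", "OHare"):       (4.5, 6.5),
    ("Evanston", "SouthSide"):   (2.0, 3.0),
    ("SouthSide", "OHare"):      (20.0, 25.0),
}

# The three candidate routes (paths) from Evanston to O'Hare.
routes = [
    [("Evanston", "NorthBranch"), ("NorthBranch", "OHare")],
    [("Evanston", "WestLoop"),    ("WestLoop", "OHare")],
    [("Evanston", "SouthSide"),   ("SouthSide", "OHare")],
]

def best_route(costs):
    """Index of the cheapest route under a given cost assignment."""
    totals = [sum(costs[e] for e in route) for route in routes]
    return totals.index(min(totals))

midpoint = {e: (lo + hi) / 2 for e, (lo, hi) in segments.items()}

# A segment is (marginally) relevant if sweeping its cost over its own
# uncertainty interval, holding the others at their midpoints, can change
# which route is optimal. Irrelevant segments are not worth a survey crew.
for seg, (lo, hi) in segments.items():
    choices = set()
    for value in [lo + (hi - lo) * k / 20 for k in range(21)]:
        costs = dict(midpoint)
        costs[seg] = value
        choices.add(best_route(costs))
    status = "worth surveying" if len(choices) > 1 else "irrelevant to this decision"
    print(f"{seg}: {status}")
```

Running it flags the two northern corridors as worth surveying and the South Side segments as irrelevant, which is exactly the "don't send the engineers south" intuition.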
Rui (09:24)
I have so many questions. I'm also super excited about this topic because it's something that I've been thinking about a lot at work.
I'm curious, how do you define data quality? And what is the relationship you are trying to draw between data and the final decision-making?
Amine (09:38)
Right, so let's start with the data quality question. Maybe before even talking about quality, the first-order question is the data itself: which part of my system should I go and study? So the first question is which data, maybe I should say it this way,
assuming that once I pick a thing to study, to explore, and to collect data on, I will get good-quality data.
Now let's assume that I have a certain capacity to collect data, which means I can send my engineers to study the system for six months or for two years. Six months will give me some quality; two years will give me more. The way we can define this is in terms of how much information about
the unknown the data captures, and how much variance it does not capture. So it's how much it reduces the uncertainty about that part. And it's funny you asked this, because the second kind of paper we're doing now is about exactly that: not only do we want to answer what data, but also how much data, or put another way, how much should we invest in
the quality of the data. Perhaps some parts are less important; you just need rough knowledge about them, you just need to reduce the uncertainty a little bit. But some parts are really crucial; you want to invest more there and get higher quality. So this is to your first question. And remind me, what was the second question? I think I lost it.
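Here is a small sketch of the quality question just described, under a standard Gaussian assumption (a toy formulation, not the paper's model): each additional survey month shrinks the posterior variance of a region's cost, and a fixed budget of engineer-months goes wherever it buys the largest reduction in decision-weighted uncertainty.

```python
# Each region has a prior variance on its unknown construction cost, a
# per-month measurement noise, and a weight saying how much the routing
# decision depends on that region. All numbers are illustrative.
regions = {
    "north_branch": {"prior_var": 9.0,  "noise_var": 4.0, "weight": 1.0},
    "west_loop":    {"prior_var": 4.0,  "noise_var": 4.0, "weight": 1.0},
    "south_side":   {"prior_var": 16.0, "noise_var": 4.0, "weight": 0.0},  # irrelevant to this route
}

def posterior_var(prior_var, noise_var, months):
    """Posterior variance after `months` independent noisy measurements (Gaussian model)."""
    if months == 0:
        return prior_var
    return 1.0 / (1.0 / prior_var + months / noise_var)

BUDGET = 12  # total engineer-months available
months = {r: 0 for r in regions}

# Greedy allocation: each month goes where it buys the largest reduction in
# decision-weighted uncertainty.
for _ in range(BUDGET):
    def gain(r):
        p = regions[r]
        before = posterior_var(p["prior_var"], p["noise_var"], months[r])
        after = posterior_var(p["prior_var"], p["noise_var"], months[r] + 1)
        return p["weight"] * (before - after)
    best = max(regions, key=gain)
    months[best] += 1

print(months)  # the budget is split between the two decision-relevant regions
```

Because the objective here is separable with diminishing returns, the greedy month-by-month allocation is a reasonable strategy, and a region with zero decision weight never receives any survey time, no matter how uncertain it is.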
Rui (11:16)
The second
question is, how do you establish the relationship between data and the final decision? We know that data is the input to the final decision. But I guess I'm trying to understand: are you trying to assert that, for this type of problem space, this size of data would be sufficient for this type of problem?
Is that the angle you're taking, or is it another dimension that you're really looking at?
Amine (11:43)
Right. That's exactly the angle. We're trying to develop a general framework where, once we're given a well-defined problem or decision-making task, we understand in that context what data matters to this specific problem. And the way we relate them, maybe I can be a bit more formal, is that a data set informs a given task if there exists
an algorithm that, just from the information in this data, can find the optimal decision or solve that task. This means we're trying to look at the fundamental information inside the data. Think of the subway example as one use case. The idea is: from the knowledge of the construction costs I collected all over town,
knowing that I didn't collect some of them, so there's some uncertainty remaining, is that enough for me to find the optimal solution of the task that I set?
Rui (12:53)
And how do you go about that? Are you looking at incremental information gain out of each data point? How do you approach that part?
Amine (13:03)
That's a great question. Actually, the first approaches people try to take, especially in the machine learning community and the more empirical side of things, take this incremental perspective: I want to look at the next data point and ask how much value it adds. But there's a very important point here, because the value of data is really not additive. If you think about it, let's say
we assume we can evaluate a data set by just adding up the value of each data point. Now I add a given data point and I say, this is very valuable for my data set. Then I add a second one, but let's say this one is the same as the first one; it already exists in my data, so its value becomes zero. Clearly there's a high degree of non-additivity there. So the value of data has to be seen as
how a collection of data points jointly informs some decision-making task, rather than as incremental values. Maybe this is a controversial opinion, but I'm quite critical of the typical approximate approaches; I don't think this incremental-value view can scale to complex problems, and actually we demonstrate that. For simpler
problems, sure, but the moment it becomes a more complex problem, you really need to take this joint aspect into account. To make things a little more formal, the way we address it, and we really want to focus on this joint-information part, is that we would say a data set is informative, or we call it sufficient, for a given decision-making task if...
And again, I should highlight that this notion is defined by two parts. It depends on the decision task itself and its structure: what are we trying to accomplish with the data? The other important part is the uncertainty: what do we know, what do we not know? Because again, the value of the data is related to what we already know. If we knew everything, the data would have no value, because we already know. If we know very little, then...
And depending on what we know, if we already know a certain part, then data on that part is not relevant. So these are the two things that are critical: the task structure and the uncertainty. Once these are well defined, we would call a data set sufficient if there exists an algorithm such that, once you query the data and you have your outcomes back, you observe
the values of these queries from this data, and then there exists some way to use this data to make the optimal decision. Again, you have some uncertainty that will not be fully revealed. Think of the whole map of Chicago: clearly, I'm not going to study the whole map. So once I study the part that my data gives me, and the remaining uncertainty remains, I am still able to make
the optimal decision on that specific task.
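Here is one way to render that sufficiency notion as a brute-force check on a toy problem (the values and the tiny task are illustrative assumptions, not the paper's formal setting): a set of unknowns to query is sufficient if, for every answer those queries could return, some single decision is optimal under every remaining scenario of the unqueried unknowns.

```python
from itertools import product, combinations

# Unknown construction costs (the "uncertainty"): each can take a few values.
# Values are illustrative; "south" is so expensive it should never matter.
unknowns = {
    "north": [5, 12],
    "west":  [8, 14],
    "south": [24, 26],
}

# The decision task: pick the cheapest of three routes. Each route's cost is
# exactly one unknown here, a deliberately tiny task structure.
routes = {"via_north": ["north"], "via_west": ["west"], "via_south": ["south"]}

def route_cost(route, scenario):
    return sum(scenario[u] for u in routes[route])

def is_sufficient(queried):
    """True if, for every outcome of the queried unknowns, some single route
    is optimal under *every* realization of the remaining unknowns."""
    hidden = [u for u in unknowns if u not in queried]
    for obs in product(*(unknowns[u] for u in queried)):
        observed = dict(zip(queried, obs))
        candidates = set(routes)  # decisions optimal in every hidden scenario so far
        for hid in product(*(unknowns[u] for u in hidden)):
            scenario = {**observed, **dict(zip(hidden, hid))}
            best = min(routes, key=lambda r: route_cost(r, scenario))
            optimal = {r for r in routes
                       if route_cost(r, scenario) == route_cost(best, scenario)}
            candidates &= optimal
        if not candidates:
            return False  # no single decision works for this observation
    return True

# Brute-force the smallest sufficient query set.
names = list(unknowns)
for size in range(len(names) + 1):
    found = [set(q) for q in combinations(names, size) if is_sufficient(q)]
    if found:
        print("smallest sufficient query sets:", found)
        break
```

On this toy instance the check reports that querying the north and west costs is the smallest sufficient set; the expensive south corridor never needs to be studied at all.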
Rui (16:17)
OK. I have a few questions. The first one: I think your point is that if we study the incremental information added per sample, it might not scale well, and it also may not actually give good performance, given that there are complex joint relationships among all of these training samples.
I think your point about the complex joint relationships is the one that really hits home.
And I think the problem isn't unique to one domain. We constantly have to struggle with:
how much data do we have to retain in order to train this model? Even if right now we do some empirical study and, OK, we can cut the data by this much and the performance doesn't degrade, we actually don't know why it doesn't degrade. And we don't know whether these data points would become something useful, valuable down the road. It's just that right now, given the environment we're in, the performance doesn't seem to be degrading.
That's the very practical side of things I'm wrestling with, and I would like to hear your thoughts on that.
Amine (17:37)
All right, that's so interesting.
There's the data attribution question, understanding which data points influence things. But the interesting part
is that it turns out it's not one data point, it's a collection of data points. And isolating the impact of one data point is very hard, but also misleading in a way.
Rui (18:00)
Hmm.
Amine (18:02)
And to give you an idea, if I take one data point within a given data set, and then I take the same data point within another data set, its value will be completely different, even with respect to the same exact task.
So thinking of the value of single data points might be misleading. Within a specific data set and a specific task it does make sense, it is well defined, but it does not inform us well about the new data that we should be collecting next, right? Because let's say you collect one more data point, the most valuable one somehow, if you could define that.
Rui (18:25)
Mm.
Amine (18:49)
And then the next one, the most valuable again, you collect it, and maybe that new data point makes your previous one obsolete. So it turns out that the step you made previously was not the right one, just because you did not plan ahead for what's going to come next. This is where it becomes a combinatorial problem; the combination is very important.
So, to me, it seems the way to think about it is to think jointly about the next combination, the next batch of data that I should collect. And the other point you mentioned that is important is that typically we don't have one task, but tasks that change in the future. And I think one thing that seems important to think about is
data that jointly informs all of these tasks. Because if you think about it separately, task one, task two, data set one, data set two, there's often enough overlap. You might have two very good data sets, one for each task, but if you thought about it jointly, wanting a single big data set that informs both tasks, you might be much more efficient, because there might exist a data set where you only need 10 extra data points, just a little bit
more, and then you would be able to inform both. So the data that jointly informs multiple tasks might be very different from the separate data sets; it's not the union of the data sets, it's more elaborate than that. So perhaps it should also be thought of that way.
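A small numerical sketch of that non-additivity, using a Gaussian value-of-information toy example (the numbers and measurements are illustrative assumptions, not from the paper): a duplicate data point looks just as valuable as the original when scored on its own, so ranking points by standalone value picks a worse batch than choosing the batch jointly.

```python
import numpy as np
from itertools import combinations

# Toy setting: the decision hinges on q = u1 + u2, with independent Gaussian
# priors Var(u1)=4, Var(u2)=1. Candidate data points: A and B both measure u1
# exactly (B duplicates A), C measures u2 exactly.
prior = np.diag([4.0, 1.0])          # covariance of (u1, u2)
w = np.array([1.0, 1.0])             # q = w @ u
measures = {"A": np.array([1.0, 0.0]),
            "B": np.array([1.0, 0.0]),   # exact duplicate of A
            "C": np.array([0.0, 1.0])}

def residual_var(selected):
    """Variance of q left after observing the selected (noiseless) measurements."""
    if not selected:
        return float(w @ prior @ w)
    H = np.stack([measures[m] for m in selected])     # observation matrix
    S = H @ prior @ H.T                                # covariance of observations
    K = prior @ H.T @ np.linalg.pinv(S)                # Gaussian conditioning
    post = prior - K @ H @ prior
    return float(w @ post @ w)

base = residual_var([])
standalone = {m: base - residual_var([m]) for m in measures}
print("standalone values:", standalone)                # A: 4, B: 4, C: 1

# Ranking points by standalone value picks {A, B}; selecting jointly picks {A, C}.
top2_by_score = sorted(measures, key=standalone.get, reverse=True)[:2]
print("top-2 by standalone score:", top2_by_score,
      "-> joint reduction", base - residual_var(top2_by_score))
best_pair = max(combinations(measures, 2), key=lambda s: base - residual_var(list(s)))
print("best pair chosen jointly:  ", list(best_pair),
      "-> joint reduction", base - residual_var(list(best_pair)))
```

The standalone scores say A and B are the two most valuable points, yet together they reduce no more uncertainty than A alone; the jointly chosen pair {A, C} does strictly better.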
Rui (20:33)
Hmm, interesting. So far, it seems like the work you have been focusing on is: given a well-defined problem or optimization decision framework, you are trying to define a few things. One,
what kind of data to collect in order to solve this problem. Two, how you are going to collect the data, which has to factor in financial expense, because oftentimes there's a cost to it, and also time expense, because how much data you collect also determines how much confidence you might have about certain observations.
And essentially your work is trying to prove, maybe from a theoretical standpoint, that there is a minimum set of data you can collect in order to optimize the final outcome. Am I describing this correctly, or is there some nuance I should pay more attention to?
Amine (21:41)
Right, you're describing it perfectly.
Rui (21:45)
Okay,
How are you quantifying the uncertainty in the data itself, and how does that impact your decision quality?
Amine (21:52)
That's a great question. And that's the whole point of the paper, actually, trying to understand that. And what we find is that, and again, I should put an asterisk on everything I say, we study well-defined settings theoretically: a well-defined task, well-defined uncertainty. In practice, things are a little more complicated than that. So
Amine (22:18)
the goal is to start developing algorithms and slowly build toward things that are more practical, closer to the messy reality. For now, I would say it's a very cool perspective that gives us good intuition about what's really going on. What we find is that a given task and a given uncertainty
jointly define a certain set in the uncertainty space, in the unknown-parameter space, which we call the set of relevant directions. The intuition is that it's the fundamental information that you need to know. So in your uncertainty space there is a part of it
that is the fundamental information you really care about. And the result says that as long as your data captures this information, you will be able to make the optimal decision. What this kind of result allows you to do is, if you have some sort of cost of data collection, you can try to find
the smallest data set, or the least costly data set, that captures this information. And the whole challenge, kind of the whole work that we do, is in finding this fundamental set of information: how it is defined from the uncertainty and the task structure, how to compute it, how to have algorithms that compute this fundamental information.
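In the simplest linear case, the relevant-directions idea can be sketched like this (a stripped-down illustration of the geometry, not the paper's actual setting or results): if the task is to pick the decision d maximizing theta · a_d for an unknown parameter theta, only the differences a_d - a_d' affect the argmax, so a measurement design is sufficient exactly when those difference directions lie in the span of what it measures.

```python
import numpy as np

# Toy linear task: choose the decision d maximizing theta @ a[d], where theta
# is an unknown 4-dimensional parameter. Decisions and designs are made up.
a = {
    "route_1": np.array([1.0, 0.0, 2.0, 0.0]),
    "route_2": np.array([0.0, 1.0, 2.0, 0.0]),
    "route_3": np.array([1.0, 1.0, 2.0, 0.0]),
}

# Only differences between decisions matter for the argmax, so the
# "relevant directions" are spanned by a[d] - a[d'] over all pairs.
names = list(a)
diffs = np.stack([a[d1] - a[d2] for i, d1 in enumerate(names) for d2 in names[i + 1:]])

def captures_relevant_directions(B):
    """True if the relevant subspace lies in the row space of the measurement
    matrix B, i.e. stacking the difference vectors does not raise the rank."""
    return np.linalg.matrix_rank(np.vstack([B, diffs])) == np.linalg.matrix_rank(B)

# Design 1: measure theta_1 and theta_2 individually.
B1 = np.array([[1.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0, 0.0]])
# Design 2: measure theta_3 and theta_4, components shared by every route.
B2 = np.array([[0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 1.0]])

print("design 1 sufficient:", captures_relevant_directions(B1))  # True
print("design 2 sufficient:", captures_relevant_directions(B2))  # False
```

Design 1 spends its two measurements on the components the routes actually differ in and is sufficient; design 2 measures components that are identical across all routes and is useless for this task, however precisely it measures them.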
Rui (24:04)
Interesting.
So the direction you're heading in applies when we know exactly what we want to optimize, right? But with general artificial intelligence, the goal is to be able to do a lot of different, general things all at once. So are you mostly focusing on the scenario where the objective
we're trying to optimize for is very clear? Or are you going to try to generalize this to a wider environment or context?
Amine (24:41)
I think the first thing that is important to realize is: let's say we create the best, biggest intelligence, AGI is accomplished, congratulations, humanity. Still, it just isn't realistic to expect that this AGI can predict the future and
just be omnipotent, be a god, and know everything. As long as there's something we did not observe in our world, that is not in its data, and that none of us know, it's not going to know it. For example, we created AGI; does it know where my fork is placed in my kitchen? No, unless there's a camera in my kitchen. Or think of the smartest human that you just recruited into your company.
If you don't explain to them what's going on and what the problem is, and put the data in their hands, they cannot solve your problem. So intelligence is one thing, but solving a task always requires collecting data on that task; that will not go away. So there's no such thing as a machine that can solve every single problem and we're done.
Right? What we can create is a general notion of intelligence, meaning a machine that, when given the right data on the problem, can figure out how to solve that problem. That's what we're really trying to do now. But this machine will not just magically discover things that are unseen. We'll have to experiment in our world. We'll have to collect information.
Now, another twist to your question is: okay, sure, we create this intelligence, and we admit that whatever task we want to solve, we will still have to collect data on that task; we cannot do much about that. But what if we don't know exactly what task we want to solve, and there is a collection of tasks that we think we would be interested in solving? Then, in this case, what's important is to define a data requirement for this collection of tasks.
And that's the data we should collect. But then you run into a trade-off, which is quite a natural one: if I'm super specialized, then I will need less data to accomplish this task. Or, another way to put it, within the same data budget I can do more, as opposed to: if I want something to do well on a lot of tasks, then I will need a lot of data.
Or the dual of that: if I have a limited data capacity, then I can do less. And that's just a natural trade-off. If we want a highly specialized machine, we can do this more efficiently with less data. If we want a general machine, the trade-off is either we collect more data, and sure, then we can do that, or, with the same data budget, naturally we'll not be able to do as well. And the interesting thing also is that
when we care about multiple tasks, the data requirements will scale depending on how similar these tasks are. And that's also something we try to understand: how data informs similar tasks. If the tasks are similar, we should expect that the data requirement would not grow that much. This is also something we're trying to understand: if you have multiple tasks, what data jointly informs all of these tasks.
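Continuing the same linear picture (again an illustrative sketch, not the paper's model), the data requirement for a collection of tasks behaves like the dimension of the span of their relevant directions: similar tasks overlap, so covering both costs barely more than covering one, while unrelated tasks simply add up.

```python
import numpy as np

# Suppose each task's "relevant directions" form a subspace of the
# unknown-parameter space, given here by basis rows. Vectors are illustrative.
dim = 6
task_A = np.eye(dim)[[0, 1, 2]]          # task A cares about directions e1, e2, e3
task_B_similar = np.eye(dim)[[1, 2, 3]]  # shares two directions with A
task_B_distant = np.eye(dim)[[3, 4, 5]]  # shares none

def joint_requirement(*tasks):
    """Dimension of the span of all tasks' relevant directions, i.e. the size
    of a minimal measurement design that informs every task at once."""
    return int(np.linalg.matrix_rank(np.vstack(tasks)))

print("A alone:              ", joint_requirement(task_A))                    # 3
print("A + similar task B:   ", joint_requirement(task_A, task_B_similar))    # 4, not 6
print("A + unrelated task B: ", joint_requirement(task_A, task_B_distant))    # 6
```

Here a task needing 3 directions plus a similar second task needs only 4 in total, while an unrelated pair needs all 6.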
Rui (28:17)
Hmm.
Interesting.
I think throughout what we've talked about so far, one thing I grabbed onto is that, one, you don't know which data point in which context will become very valuable; it really depends on your application area or use case. And two, it really depends on how narrowly you're defining the task itself.
I wonder how you see a future where data pricing will shift depending on the information value gained. I imagine a lot of data selling and buying nowadays is more about how much data you have, or how specialized the domain of that data is, right?
But I wonder whether there will be a world where we're going to have to price this. For example, I give you my personal data to do certain research; you obviously get value out of it, even business value. How do you price my data?
Amine (29:35)
That's a great
question. A couple of weeks ago, I went to give a talk at UChicago, and the faculty member hosting me does great research on data ecosystems, what he calls data ecology. I was asking him this; he looks at these kinds of questions about the pricing of data, and a lot of people are looking at it now. And he was telling me that it's already happening now, but
really terribly, because nobody really understands how to do it well. It's very approximate, and I think there's not much rigor in it for now; it's kind of just demand-driven, seeing how much people can be charged. But this will be an important thing, and I heard this from many people: this will become an important thing. It is the fossil fuel of AI, right? So people will start withholding more data because they realize its value.
And there will be huge data marketplaces, because whoever owns the data can create the intelligence to solve the task. But the question of the pricing of the data will be quite intricate, because the value of data depends a lot on the application; different people will value it differently, and how to price it in that regard is
going to be interesting, to say the least. It's going to be a complicated problem. Usually these pricing problems all come from some equilibrium of how much people are willing to pay, but I'm quite interested to see how that happens. One interesting aspect of it is individual data, in the sense that for now we all kind of give up our data for free, but I don't think this will last
long. Eventually people will start, I mean, people realize it already, but eventually people will start doing something about it. I feel that all the time; I always say no to the cookies and all this stuff, because I realize I have zero incentive to give up my data now. What do I gain from that? They tell you personalization of the ads, but do I really care about having good ads? Personally, I really don't.
I don't even look at the ads. So our data is clearly undervalued, and soon enough
things will change. You know how now everything is priced with ads: if you don't want to pay Spotify, then you just watch ads. Maybe later on it will be the data: as long as you share your data, you don't need to pay. This seems to me a very realistic future. Or there might be some different sort of reward system. For now, the companies just rely on the fact that we don't pay too much attention and most of us just accept
all the terms and conditions and share the data.
Rui (32:40)
No, yeah, 100%. For example, even when I chat with ChatGPT, since I already paid the monthly membership, why do I need to share my data for your better training? If you want better training data out of my conversations with your product, and that's a lot of my thinking process, right, then you should compensate me.
So that's definitely a sentiment that I have. But it's also interesting: I actually do allow Instagram to track my activity across apps. The reason being, a lot of times I want to do some shopping, but I don't know where to find those items, or I want a slightly more niche brand, a different quality, a different look, and I don't want to spend time searching online. So what I do is search on Google for the item I want,
and I just wait for a day; Instagram will start giving me advertisements relevant to that search. And oftentimes, I actually find whatever they recommended to be something I want to at the very least take a look at, if not just directly buy. So...
Amine (33:53)
But let me ask you this: if you made the same effort, but instead of searching on Google and waiting for Instagram, you directly asked ChatGPT, let's say, would it be as good at giving you recommendations, or do you still think Instagram is better?
Rui (34:03)
Mm-hmm.
I don't know, I haven't really done this comparison. But what's alarming to me is that the other day, and I actually took a screenshot of this, I noticed ChatGPT is right now, I think, experimenting with also inserting ads into their recommendations.
Rui (34:29)
That is terrifying to me. I don't know whether they're officially doing this, but you can imagine how this could really influence the way we perceive advertising. Before, we knew something was an advertisement in an obvious way; now the recommendation engine just inserts certain things. That is terrifying to me. And I...
Amine (34:30)
That would be... they'd present it as the truth, like the best response. But...
Rui (34:56)
Yes, yes.
It's more insidious because you are so used to chatting with it.
Amine (35:04)
That would be terrible.
I asked the question because, you know, when you do this process where you give Instagram the data and wait for the ads, those ads will be relevant to you, but they will be influenced by the highest-paying advertiser. So the sweater that is recommended to you is from a pool of maybe relevant ones; it's not necessarily the best for you, but
Rui (35:22)
Hmm.
Amine (35:32)
just the one someone paid the most to advertise to you. So it's not in your best interest. And I thought that ChatGPT, for now, doesn't have those incentives, so it would give you what it thinks is in your best interest, and you can refine it. But if they do put ads in it, then it's quite problematic.
Rui (35:52)
Yeah. What I hypothesize is that in the future, we're going to have a device, maybe it's our phone, maybe it's a device in a form we haven't even imagined yet. And that device should have at least the highest security and privacy protocols and cutting-edge technologies. And you would selectively
pick what kind of data you want to reveal to the different parties you interact with. So we could store things like passport information, personal identity, medical records, your interests, hobbies, and all those things. But I can't imagine that type of data being controlled by a single organization that monopolizes it. I almost feel like there has to be a third-party provider
whose whole job is to protect your data and help you store it. And then you can selectively figure out how you reveal the data to other people and interact in an agentic AI system, whatever that means.
Amine (37:03)
Right, absolutely. This should happen, but the problem is that it's only a handful of companies that have control over these devices and how they're designed, and there aren't yet strong enough incentives for them to do that. I think Apple is doing some of that.
I say this to myself: I just bought a Google phone, maybe I should have bought an Apple. I advocate for things I didn't do myself. But yeah, certainly there needs to be a push. It's something I think about a lot; it's about giving power back to the people, in a way.
Rui (37:37)
Yeah,
it's very interesting to think about the future, about how you price your own personal data, how you really govern it, and how you interact with the rest of the network in a new fashion. But coming back to reality a little bit: you have now officially started as a professor at the Kellogg School of Management, and I presume you're going to teach many classes in the business school, including to MBAs.
How do you translate your mathematical robust optimization, and optimization thinking and knowledge in general, into business strategy? Are there specific case studies that you share with the students? How do you really bridge the gap between your highly technical, deep knowledge and a business school where a lot of students don't have that background?
Amine (38:32)
That is a great question that I have to figure out myself. I'm teaching next spring, actually. I taught before at MIT, but this will be my first time teaching MBAs here at Kellogg. And it's quite an exercise. In ops here, we are a theoretical group, and operations in general, while it has empirical parts, is quite a technical field.
Rui (38:35)
Okay.
Mm-hmm.
Amine (39:03)
But we also have to teach MBAs, and MBAs don't necessarily care about the math of things; they care about the insights they will learn from the problem. So doing that translation is a very non-trivial task. It's very hard, and it requires skills that not everybody has. And it's something we, or at least my colleagues, constantly think about.
I'm new to this, so I'm quite excited to do it, but it requires a lot of work. And I think this is generally a teaching skill, maybe more extreme for MBAs, because you're someone who does such deep theory and then you're talking to people who are not technical at all; a lot of them are not engineers, they come from various backgrounds. But I think that exercise is
important not only for the teaching, but also for myself, in the sense that if you are able to express these complex ideas in very simple words and extract the main insights, it really indicates that you understood very well what you were addressing. The ultimate level of understanding something is being able to teach it to non-experts.
Rui (40:28)
Interesting. I think maybe one approach would be to collaborate with folks in industry who are actually doing the optimization work to build case studies. That might be the easiest way to motivate why you would want to understand the optimization side of things a little more deeply.
Amine (40:52)
Absolutely. You're almost predicting my thoughts. In the next couple of weeks I'll go to Morocco, and I've arranged a visit with the big company that manages the phosphates of Morocco, the biggest mineral resource we have. A big player there, and their CEO is quite a famous person in Morocco; he kind of turned the company around and made it very profitable.
Rui (41:11)
Hmm.
Amine (41:20)
He is an operations research PhD, and he did a lot of very cool stuff. So I thought it would be cool to visit; I have some friends there, and some executives, who are taking me around and showing me what they've accomplished and how, going to the ports, going to the mines, seeing what problems they deal with every day. And the goal is exactly what you said: trying to see, from that perspective, what's needed in practice, in real life and in industry,
Rui (41:23)
Mm. Mm-hmm.
Amine (41:50)
and how thinking deeply about these problems allows us to do better.
Rui (41:56)
Okay. Even in our last conversation, you mentioned how you're trying to figure out how to bridge the practical, industry side of applications with theoretical study. And it sounds like you're also planning a tour to understand the industry a little bit more. In the next few years, do you still plan to allocate the majority of your
time to theoretical research, or are you going to take this framework into that specific company's problem space?
Amine (42:28)
Yeah, absolutely. Absolutely. And this is also something I think a lot of academia has started thinking about more, with all the Trump administration's attacks on funding, the cuts and everything. It makes people start thinking about how relevant academia is to society and how much we contribute. And it's a valid question, because
Rui (42:38)
Hmm.
Amine (42:55)
a lot of research is not impactful, and some of it is impactful. So it makes you think about what impact means to me. And again, maybe this is my idealistic, naive perspective, but I do believe that having impact for industry doesn't necessarily mean doing very applied, engineering-style research that will be immediately useful. That would accomplish things, but
if we're really ambitious, if we're hoping to create the neural networks of the future, what Hinton did, we have to dream bigger and think more deeply about the questions, not just stay at the surface, solving the immediate problems and putting tape everywhere. Try to understand industry, understand the problems, model them carefully while trying to
make these models as close as we can to real life. Obviously they have to be somewhat idealized in order to be able to say things, and then think deeply about these questions and derive more non-trivial, more elaborate insights about them. This is how we can aim higher and find solutions to these industry problems, these real-life problems, that are more disruptive and open up deeper questions. If you think about
Hinton's work that led to neural networks and LLMs and all the frenzy that we have today: when he was doing that, nobody really believed in it. It was more of a fantasy, something very far from being immediately applied. So one could have made the argument, what's the point, this is not impactful research. But it's precisely this kind of ambitious research
that might or might not work, but at least has a solid grounding and a clear vision that actually can lead to more revolutions in the future.
Rui (45:01)
I love that. And actually, it's a little bit surprising for me to hear that in the academic world, a lot of research projects might still be too short-sighted, too much patchwork. I can see that in industry, especially at a profit-driven company that is going to go public, they oftentimes optimize for short-term optics.
I didn't know that such a problem also exists in academia. What kind of incentive structure really leads to that type of choice?
Amine (45:36)
That's a great question, actually. There's some sort of crisis in academia; people are thinking about that, because the problem is, what is the game in academia? The game is publications. So you get recruited, you want to get your tenure, you have six or seven years to prove yourself, and the objective way to prove yourself is publications. You need to get papers out there, you need to get them accepted in venues.
Rui (45:49)
Hmm.
Amine (46:04)
If all you care about is getting your tenure, there is an incentive to produce fast: get something publishable, put it out there. And a lot of people do that, unfortunately, because, can you criticize them? People want to have a job. So the incentive structure is like that. And the people who evaluate them are not necessarily reading the details of how impactful the work is.
For example, let's say you write a very ambitious paper that takes you four years to do. You struggle to get it published, and then someone will say, oh, this person has only three papers, but the other has 10. So the lazy reviewer, the person who gives promotions and whatnot, would not really read the ideas and their potential and see that this is higher quality. They might just
give the tenure to the person who has the 10 papers. So if you just play the academia game, in the long run you're not going to be a legend; people will forget about the research you did, but you will get tenure, you will get promoted, you will be able to say, I have 100 papers, 100,000 citations, or whatever. So the incentives are not necessarily aligned. But
I have to say, I don't want to just criticize, because this is also a hard problem to solve. At the same time, universities, when they give tenure, need some reassurance. You might say, I'm a great researcher, I've been spending five years on this idea because it's a great idea. But is it a great idea? Giving tenure to faculty is a big risk for the university. It's a job for life. It's marriage. Marriage is actually not necessarily for life...
Rui (47:28)
Mm-hmm.
Hahaha
Amine (47:52)
It's more than marriage; it's a commitment. So this is why they need some reassurance, and that's what the tenure system was made for. And after tenure, you do have the freedom to do things, so some people then get there and pursue bigger ideas. But the problem is that after 10 years, people also get tempted by the fame and
the immediate reward. They will get a lot of papers, they can say, I have five papers accepted at NeurIPS, and they will post it on LinkedIn and everybody will clap. So people get tempted by that too; it's somewhat the easy way. So incentives are not exactly aligned, I would say, and this is something academia has to figure out. I think it's very important, because in the long run we will need, yeah, we'll need publications to matter.
90% of publications don't matter. There's an argument to be made that you need volume for something to work out, and this is one of the biggest arguments, which is true, but I'm also a believer that we need to reduce noise, because the NeurIPS conference was 20,000 people this year. 20,000. That's immense. And that's just one of the AI conferences. 5,000 papers accepted, which means in a year,
Rui (48:55)
Mm-hmm.
Amine (49:18)
with the three main conferences, let's see, in a year you have 10,000 AI papers that appear. I don't think there are 10,000 relevant ideas that come up in one year. So there's a lot of noise, and we need a way to reduce that.
Rui (49:36)
Interesting. And I can find the parallel in industry's incentive structure. You can do a lot to gain your promotion, to get more salary; even just hopping between companies lets you do that quickly. And that's what I observe in a lot of people.
But fundamentally, what you should be doing is
developing your skills, developing your knowledge, developing your capability in interacting and communicating with others. That's a very difficult thing. I think it takes a lot of self-awareness for an individual to say, okay, I'm fundamentally optimizing for my own self-development and growth, not for an external metric that will get me immediately promoted. I think it's very difficult
to resist that temptation to just game the system. Because there is reality: you might have to provide for a family, you might feel like you have been poor for too long. There is a lot of emotional burden around choosing the right thing to do. And so on some level, it doesn't really matter what kind of incentive system we create, because one way or another you can game it, as long as individually we don't have clarity about what is important to us,
what we're optimizing for. Is it the publication metric? Is it the salary metric? Is it the job-title metric? I think this is a long way of coming back to my thesis for why I wanted to start this podcast: just know yourself. If all of us knew ourselves a little bit more, maybe,
even though the incentive structure is an ill-formed one, people could still do the right thing for themselves and for other people.
Amine (51:21)
Yeah, this leads to a deeper question. But maybe one thing I want to say: it's easy to be idealistic. I think I'm an idealistic person, but it's also important to be realistic. For me, I really, deeply believe I want to do impactful research, and I believe in the papers I write, and I really try to go the extra mile
to make sure they are. I would never submit a paper that I don't believe in. But at the same time, I know that I have to get the papers published, I have to play a little bit of the game, I have to structure my research in a way that makes sure things get out on time for my tenure to happen. So I think there has to be a little bit of both. And the deeper question is, yeah, just know ourselves and think about what our goal in life is.
I feel like most people don't really think about that. Life is just too busy to think about it, which is ironic, because at the end of the day, why is it busy? What's the purpose? Why are we doing all of this? We're busy, but on a path we don't know where it's really going, and we don't even take the time to think about where this path is going. I think this is maybe the biggest crisis of our generation. There are so many
things that keep us occupied: you have work and everything, but when you finish work, you have your phone, you have Instagram, you have Netflix. You're constantly occupied, so you never have time to think, while the generations of our parents and before, when they went home, they just looked at the walls and had time to think, you know, to reflect on life. But not anymore. And I think
this is probably the reason why there's so much mental illness, why people are not happy, why there's more wealth but less happiness: people just reflect less on what really matters.