Akshay Tells Us If Machine Learning Is Overhyped

Akshay Agrawal is a PhD candidate at Stanford studying convex optimization. Before that, he was a software engineer at Google, working on the world’s most famous machine learning framework, TensorFlow. He also interned at a self-driving car company called Aurora.

Personal Website: https://www.akshayagrawal.com/

CVXPY: https://www.cvxpy.org

TensorFlow: tensorflow.org

Twitter: https://twitter.com/akshaykagrawal

0:00

What comes to, what can you tell me about software? My name is and I'm a graduate student at Santa Clara University where I study data science, technology and software. And my name is and I spent six years as the head of software at a tech startup for us, you know the self-driving cars, man, that'd be kind of cool, they drive themselves. That's what I've heard. You know, what I think is really interesting is we hear all this hype about self-driving cars, but when the day comes, when let's say Elon Musk says it's ready, are you going to put your life in the hands of a machine? I mean, we have the easy self-driving mode right now, but when it's truly autonomous, how long will it be before you're like, okay, for sure, the car will drive, I'll close my eyes and take a nap? I'm sort of like a disciple of Elon's. In terms of things that I value in my life, it's sort of like food, shelter, Elon, The black IPS, my family, et cetera. So I feel like I'd be one of the early adopters of his stuff, his self-driving car, if I could afford one. I mean, tentatively, I think he himself has publicly said it'll take about five years. So if that day comes and he's like, hands off the wheel, I'll be like, yes, sir, and I'll just go for it. Well, there's a lot to unpack with that statement. But yeah, I would probably not be an early adopter. I think I would wait a couple of years until the kinks had been worked out, 'cause I don't want to be one of the kinks. So, yeah. You know, do you really think that really for us, do you really think they will push out a car that's fully autonomous? You know, will the FTC allow it? Will the government? The government at this point, it's probably approved it. They're going to let something on this road that you think will potentially lead to some fatalities, at least in the beginning? I really think that, well, look at the early car adoption curve. 
You would say the exact same thing about going from horses to cars. These engines could explode. If you crashed into them, they were literally like a house of cards that would break. So early technologies are dangerous. No, I don't know about that. I mean, people always try to connect things that are happening now to things that happened in the 1800s. But obviously things are different now. We have a humongous government, and they have their hand in practically everything. The number one cause of death already is cars that are not autonomous. Right. Right. So I think, do I trust myself to drive a car, or do I trust Elon and his people to do it for me? And I think I would trust them more. Okay. So I think that having the car self-drive with strong safety features is going to greatly benefit all of us. I think the question is, at what point does the government realize that having people on the road is way more dangerous than having self-driving cars on the road, and they say, you know what, go for it. So I'll try having this ready. It doesn't mean that self-driving is still going to be safer than a person plus self-driving, you know what I mean? Ooh. Okay. I see. I would keep my eyes open and my hands on the steering wheel for a long time before I was willing to punch it. I would not be someone who's just sitting, even if they say it's ready to go. That's a funny thought, where when we're older, you know, we have kids, and they're like, Dad, why do you have your hands on the wheel? Son, you wouldn't understand. I'll say I was from a different era, Silicon Valley, where the motto was move fast and break things. So yeah, I don't know about self-driving cars in an environment where that comes out. That's true. You make some valid points. I want to be an early adopter. 
I don't want to be a boomer of my generation, so it's something we've all got to, I guess, juggle when we finally get there. So I'm really excited to talk to our guest today. His name is Akshay Agrawal and he's a PhD student at Stanford studying convex optimization. Before that he was a software engineer at Google working on the world's most famous machine learning framework, TensorFlow. He also interned at a self-driving car company called Aurora. Let's get into it. So Akshay, welcome to the podcast. Thanks Ross. It's great to be here. Cool. So let me tell you guys a quick story. So when I was at my last job, we had this feature that we put out. It was just like a really hyped feature. We'd been working on it for some time, and our marketing department was going to make this big push, this big announcement about it. So they blast out this email that said machine learning based, and all the engineers were like, we never told them to say that. As a matter of fact, they did that a few times. So anyways, the reason I'm telling you this story is because there are some people who say that machine learning has kind of turned into this overhyped marketing term. And there are other people who are saying that machine learning is ushering in the fourth industrial revolution. So as someone who's fairly qualified to speak on this, Akshay, do you agree with either of these camps? I love that story. I mean, it's pretty obvious to me that it's overhyped. I think it's pretty obvious to everyone that it's overhyped, including the people who make that marketing, or especially the people who just slap machine learning based onto a product that has nothing to do with machine learning. But something that's overhyped can still be useful. But I mean, I don't know, if you tried to count, like, what are meaningful changes or new products brought on by machine learning? I don't know. Can you name some there? 
Some, but like, are they revolutionary? Well, I'll give you one example, although I still don't think it's quite where we hope that it will get. So I've been learning Mandarin for about a year. And by far the most useful tool when I'm seeing text that I'm unfamiliar with is pasting it into Google Translate, and it will translate it really solidly. Like, I ran it by my friends who are native Mandarin speakers and they're like, yeah, that's completely correct. Yeah, no. So Google Translate is probably one of the best examples, I think, of machine learning being put to very good use. It's just that, you know, when people describe machine learning, eventually you have someone in the room who starts talking about the singularity and how we're actually teaching machines to learn, and I'm like, that stuff is just kind of nuts. I think maybe because it's so removed from the research that is actually happening and the things that are practical. I wouldn't call it philosophy, but it's sort of more like, you know, speculative, almost cult-like worship of an idea that no one is working on. Right. Well, okay, some people claim to be working on it. So I don't want to diminish the impact that machine learning has had. It has had a very large impact on software. Just the idea that you can differentiate through basically any program and tune its parameters to get better performance on whatever task you have at hand, if you can quantify a performance metric, I think that's pretty big. And, you know, the most obvious application of that, I think, has been neural networks, you know, trying to predict what an image is or to translate a sentence or something like that. So yeah, I think machine learning is great, but I just don't think it's, you know, the new electricity or the new fire. 
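The idea described above, tuning a program's parameters by following the derivative of a quantified performance metric, can be sketched in a few lines. This is a minimal illustration using stochastic gradient descent on a one-variable linear model; the function name `sgd_fit`, the learning rate, and the synthetic data are assumptions made up for the example and do not come from the conversation.

```python
import random


def sgd_fit(points, learning_rate=0.1, epochs=200, seed=0):
    """Fit y = w * x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(points)
    for _ in range(epochs):
        rng.shuffle(data)                 # visit the samples in random order
        for x, y in data:
            error = (w * x + b) - y       # prediction error on one sample
            # Derivatives of error**2 with respect to w and b.
            w -= learning_rate * 2.0 * error * x
            b -= learning_rate * 2.0 * error
    return w, b


# Synthetic, noise-free data generated from y = 3x + 1 (an assumption for the demo).
points = [(i / 10.0, 3.0 * (i / 10.0) + 1.0) for i in range(-10, 11)]
w, b = sgd_fit(points)
print(w, b)  # the fitted values should land close to 3 and 1
```

The performance metric here is squared error, and the "program" is a single multiply-and-add; neural network training follows the same loop with a much larger program and automatic differentiation in place of the hand-written derivatives.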
I mean, I think he said machine learning is more profound or as profound as fire, which, I mean, is ridiculous, but it is very cool. It's very cool technology. I do some work related to it. So, yeah, what did machine learning replace? I mean, how were people solving these problems beforehand? Like, what kind of problem? Do you have a specific one? Translation, image recognition? I don't think it has sort of replaced itself. I mean, so like translation. I'm not an expert in translation, but I mean, people have been taking statistical approaches to translation for a long time. But I think a couple of things have changed, which have made it a lot more successful, which is, one, a lot more compute power. I don't know the specifics, but the compute required to train and even to just run inference on large language models is, you know, very, very large, and having access to so much data, I think, has really changed the game. I mean, the language models, I don't have a specific number, it's like billions or trillions of parameters. So they're very large. So that wasn't possible in the past, and no one really had the guts to even try that. Right. I think even for self-driving cars, in most, I think, sensible implementations of self-driving cars, a lot of the machine learning is restricted to the perception and sensing and trying to figure out what's out there in the world. And people have been using machine learning for those tasks for a long time. It's just that I think specifically for images, for example, the release of, or the creation of AlexNet in, what was it, 2012 or something like that, which was a particular convolutional neural network that crushed the ImageNet challenge. Right. 
I think sort of caused a lot of people to revisit older techniques, like neural networks, and they found out, oh, wow, these things are actually, you know, you can just make them bigger and bigger and give them more data and they'll fit, you know, any sensible function you're trying to fit. So a lot of people use the term machine learning when they really are just talking about traditional statistical techniques, like old school Bayesian statistics. So can you kind of expand on that? Where does machine learning differ from traditional Bayesian statistics, and maybe where does deep learning fit into all of that? Hmm. Where does it differ? I mean, so I am not a statistician, but I can give you my understanding of it. So there are some people who actually do statistics. I think people who do statistics and people who do machine learning don't even really talk to each other, or a lot of the time they don't. So they're like totally different communities. So people who do Bayesian statistics to better understand the outcomes of election results in different counties, these are the kinds of people who will, you know, look at a small data set and try to really understand it, looking at a lot of questions in social science, and understand, like, is there a bias in policing in a specific district or something like this. There are people out there like Andrew Gelman who do things like this. And it's funny. I went to a talk by a pretty famous statistician recently, and he was talking to an AI audience, and these are a bunch of AI engineers. So someone asked this statistician, he was like, so what's your opinion on, you know, AGI? You know, all these statistics you talk about are really great, but what's your opinion on AGI? 
And the statistician was like, what is that? Are you asking me? Yeah. Okay. It's artificial general intelligence. And that is exactly what the statistician said. He's like, what's AGI? So, I mean, people in the AI community, everyone knows what AGI is. That's what OpenAI claims to be building. And I think that's what DeepMind, maybe, I don't know if they claim that anymore. But yeah, the statistician is like, what? CGI? And so he had to explain it, and then the AI engineer goes on and says, oh, but what do you think about the idea that, you know, any kind of learning, human learning, machine learning, is just SGD, which, you know, for you and your audience, stands for stochastic gradient descent, which is a type of algorithm used to fit neural networks, or to learn them, in quotes. And so every engineer will know what SGD is. The statistician says, what's SGD? He's just like, yeah, stochastic gradient descent, but he doesn't think in those terms. So I don't know if I answered your question, but there's not that much overlap in those communities, I think. And the way that it differs is, people who do machine learning today, there's a big group of them, and they do a lot of different things. But I think most of them are interested in, you know, fitting a model to predict something, and they care about the performance of their model on some very specific metric. Statisticians are sort of more scientifically inclined and are, you know, using their tools as a way to actually understand some system, how it works. Right, and to predict the future with confidence intervals. Got it. So I want to talk about your work at TensorFlow. It's pretty crazy. I was looking it up on Wikipedia and it was only released like five years ago. It's insane. 
I feel like TensorFlow is taking over the world. Like every software engineer I know is aware of this one framework. So I mean, pretty unbelievable. Yeah, no, it's a pretty influential piece of software. So I joined the TensorFlow team straight out of college, like out of my undergrad, or I guess out of my master's. I did a five-year coterm master's program, but yeah, you know, I had applied to Google. I had done a couple of internships there before, and I just wound up on the TensorFlow team, and I guess this was 2017, and I was there for like a year, or like a year and one month. What was it like working with those guys? It was fun. The TensorFlow team was quite large. I don't know, I guess it probably still is large. The engineers were really, really good at what they did. A lot of the engineers, or maybe, I guess, most of them, came from a computer systems background, and many had PhDs. And so it was kind of like this, I think, utopia for engineers who liked working on these kinds of problems, 'cause you got to work on sort of whatever aspect of TensorFlow you wanted. And for an undergrad, having access to, you know, people who had been at Google for 10 plus years or had storied academic careers as mentors was pretty awesome. Yeah, absolutely. So what was the biggest change that you saw from when you started the project to when you ended? Like, what was the biggest feature, the most exciting thing? Oh, yes. I joined at a very pivotal time in TensorFlow's life cycle, I guess. So it was 2018. And as you mentioned, TensorFlow is pretty young. But there were competing frameworks out there that were gaining more traction, specifically PyTorch and. 
I guess, so the main way that PyTorch differed from TensorFlow, or a really big way, was that PyTorch felt like any normal library for linear algebra or mathematics, where, you know, it was a Python library. Right. And, you know, the basic concept in PyTorch is the multi-dimensional array of numbers, and then PyTorch has a bunch of functions related to training neural networks, but also just the linear algebra for, you know, performing operations on those multi-dimensional arrays. TensorFlow was really different because it was not an imperative programming language, which PyTorch was. It was declarative. You specify what it is that you want to be done. You say, this is what I want to happen, and figure it out. And then, yeah, then the system takes whatever you have, you know, laid out, and then goes and executes it in whatever way it wants. I mean, so the line between these two things is blurry, right? Because imperative programs, at the end of the day, are compiled by a software compiler or interpreted. But I mean, there is, I think, a meaningful distinction you can still make. So specifically. Yeah. So maybe this is oversimplifying it, but it seems like PyTorch's approach was more like, as a programmer, you lay out everything that you need to do, in order. Whereas TensorFlow's approach was, here's our goal, and we'll give some details about the goal, but the program is going to kind of figure out a lot of that for me. Is that fair? Maybe. There's elements of both, because in TensorFlow you could also still lay out a sequence of operations, but it would still, so, okay. The best way to think about it, I guess, with PyTorch, is when you use PyTorch, you're. 
You know, you're using Python to operate on multidimensional arrays, to take derivatives, and so to create a neural network and train it. But at least when you were using TensorFlow prior to 2018, you were using Python as a metaprogramming language. So you were using Python to write a program in another language, which was secretly TensorFlow. So TensorFlow was sort of its own language. So if you wanted to multiply two matrices in TensorFlow, what you would do is you would create a symbolic array, or a symbolic matrix, and it was called a placeholder. And then you would create a symbolic node, which was a matrix multiplication node. And you would give it symbolic inputs, which would be the symbolic tensors. And then you would say, okay, here, this is my data flow graph, TensorFlow, that I want you to execute. And then you would literally call a Python function called session.run, which would take the data flow graph, which actually is a program written in TensorFlow's own strange language, and then it would execute it by itself. So it was like exposing all the guts of this runtime system. So there was literally this, you know, session.run, it's like, you know, execute my program. There was nothing like that in PyTorch. Like, that's not a natural way that people program. In PyTorch, you would literally instantiate a concrete array of numbers, then multiply them. You could print it out. You wouldn't have to call session.run. So, yeah. So you said that this was a pivotal time in history. So what happened? So, because of this distinction, TensorFlow was notoriously really hard to use, whereas PyTorch was like how people thought. So I think TensorFlow was losing some traction to PyTorch. People were just absolutely in love with it. Universities were switching to teaching PyTorch instead of TensorFlow. 
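To make the contrast above concrete, here is a toy sketch in plain Python. It is not TensorFlow or PyTorch code: the classes `Placeholder`, `MatMulNode`, and `Session` are hypothetical stand-ins invented for this illustration, loosely named after the concepts mentioned in the conversation.

```python
# Toy illustration only: these classes are hypothetical stand-ins,
# not the API of TensorFlow or PyTorch.

def matmul_eager(a, b):
    """Imperative style: multiply two concrete matrices right away."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


class Placeholder:
    """Symbolic input that has no value until the graph is run."""
    def __init__(self, name):
        self.name = name


class MatMulNode:
    """Symbolic node that records a multiplication without performing it."""
    def __init__(self, left, right):
        self.left = left
        self.right = right


class Session:
    """Tiny runtime that walks the graph and computes concrete values."""
    def run(self, node, feed):
        if isinstance(node, Placeholder):
            return feed[node.name]
        if isinstance(node, MatMulNode):
            return matmul_eager(self.run(node.left, feed),
                                self.run(node.right, feed))
        raise TypeError(f"unknown node type: {type(node).__name__}")


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

# Imperative: the result exists as soon as this line runs and can be printed.
eager_result = matmul_eager(a, b)

# Declarative: describe the computation first, then hand it to a runtime.
x = Placeholder("x")
y = Placeholder("y")
product = MatMulNode(x, y)  # nothing has been computed yet
graph_result = Session().run(product, {"x": a, "y": b})
```

Both paths produce the same matrix. The difference is that in the second one the Python code only builds a description of the work, and a separate runtime decides how to carry it out.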
So the team that I joined was called the TensorFlow Eager team, which was, you know, basically, I guess to put it simply, trying to figure out how to transform TensorFlow into something that was a lot more like an imperative, PyTorch-style programming paradigm. So basically what we were doing on this team was building an alternative frontend to TensorFlow's backend. So we were trying to make it so that instead of creating a graph ahead of time and then telling TensorFlow to run it, you know, when you type matmul, if TensorFlow eager execution is enabled, which is what you were working on, we just execute it eagerly. And so then, you know, people realized that for usability, it really makes a huge difference to support this imperative programming style. So all of TensorFlow rallied around TensorFlow Eager, and this turned into TensorFlow 2, which is, you know, the main version of TensorFlow that's out today. And TensorFlow 2 and PyTorch have basically converged in terms of how they're being used. Yeah, the programming paradigm as well. So both of them are imperative, you know, by default, meaning they work just as you would think any other library would, but both of them also have this sort of special, you know, just-in-time tracer that's built in, like a tracing compiler, where if you write your code in just the right way, follow some rules, some restrictions on the grammar of what you write, you can add a Python decorator to your functions, and then TensorFlow or PyTorch will transparently compile that function just in time into a unified graph and then execute it more efficiently. Got it. 
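The decorator-based tracing described above can be illustrated with a toy tracer in plain Python. This is only a sketch of the general idea, under the assumption that a function is run once with symbolic stand-ins to record its operations and that later calls replay the recording; the names `Graph`, `Tracer`, and `trace_once` are invented for this example and are not how TensorFlow or PyTorch implement their tracers.

```python
import operator


class Graph:
    """Recorded list of operations; inputs fill the first value slots."""
    def __init__(self, n_inputs):
        self.n_inputs = n_inputs
        self.ops = []  # each entry: (function, left operand, right operand)

    def add(self, fn, left, right):
        self.ops.append((fn, left, right))
        return Tracer(self, self.n_inputs + len(self.ops) - 1)


class Tracer:
    """Symbolic value that records arithmetic instead of computing it."""
    def __init__(self, graph, slot):
        self.graph = graph
        self.slot = slot

    def _operand(self, value):
        if isinstance(value, Tracer):
            return ("slot", value.slot)
        return ("const", value)

    def __add__(self, other):
        return self.graph.add(operator.add, ("slot", self.slot),
                              self._operand(other))

    def __mul__(self, other):
        return self.graph.add(operator.mul, ("slot", self.slot),
                              self._operand(other))


def trace_once(fn):
    """Decorator: trace fn on its first call, replay the graph afterwards."""
    cache = {}

    def wrapper(*args):
        if "graph" not in cache:
            graph = Graph(len(args))
            out = fn(*(Tracer(graph, i) for i in range(len(args))))
            cache["graph"], cache["out"] = graph, out.slot
            wrapper.trace_count += 1
        slots = list(args)

        def lookup(operand):
            kind, value = operand
            return slots[value] if kind == "slot" else value

        for op, left, right in cache["graph"].ops:
            slots.append(op(lookup(left), lookup(right)))
        return slots[cache["out"]]

    wrapper.trace_count = 0
    return wrapper


@trace_once
def affine(x, w, b):
    return x * w + b


first = affine(2, 3, 1)   # traced on this call, then executed from the graph
second = affine(5, 2, 0)  # replayed from the recorded graph, no re-tracing
```

The toy also shows why the function has to be written in just the right way: this tracer only understands `+` and `*` with a traced value on the left, so anything else in the function body would fail or would not be captured in the recorded graph.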
So what exactly did you work on? Are you allowed to say that? Yeah, sure. So I worked on TensorFlow Eager, which is the thing that we've been talking about. Cool. And I worked on the just-in-time tracer, which is the thing I just talked about. So if anyone uses TensorFlow 2 today, they'll know what it's called. It's called the tf.function decorator. And so it was this idea of, basically, a mini just-in-time tracer where, you know, you have a Python function that has a bunch of TensorFlow operations, and you could decorate that function with, you know, tf.function, and it would basically compile it when you first ran it. So the first call is slow because of compilation, but the promise is, you know, if you wrote your function in just the right way, we'll actually be able to compile it, and we'll be able to get you speedups. We'll be able to run your code on a TPU or a GPU or something like that. Cool. All right. So just to kind of take a step back here, you have experience both working in, you know, some of the world's top academic institutions and one of the world's top tech companies. So how do you view innovation taking place at both of these? How are they similar? How do they differ? Yeah, the innovation. Okay. Let's see. That's a really interesting question. So I spent a year at, I guess, Google Brain, which, you know, was not representative of Google writ large, but I can talk about what I saw there. It felt a lot like a research lab. So people could more or less work on whatever they wanted to. Maybe there were some unwritten guidelines that, you know, you weren't supposed to cross. Everyone on that team, I think just because of the kinds of people that they hired, was really interested in neural networks and reinforcement learning specifically. 
And there was also, I guess, a bias towards doing research that somehow translated into a software artifact. It wasn't always necessary, but there was a bias towards that. And that was, I think, really valued. So you would contrast that with a research lab at Stanford, where, yeah, you know, you just do research that you think is. Well, how do you even, I guess we can get into that later. But actually, I think this is quite surprising, that they didn't mandate that. They said, you guys can do research, and if it doesn't turn into software, we don't necessarily want to encourage that, but that's not a big deal. Now. Yeah, they had some theory. They had a few, not many, but they had a few people that just did theory. I think maybe, maybe not, but, you know, what you chose to work on could actually maybe have repercussions on whether or not you got promoted and things like that. But there was nothing like saying, oh, you should definitely work on, you know. What was Google's end game for just doing pure theory stuff? Oh, I don't know. I mean, they didn't have many who did pure theory. Gotcha. So what do you think are the main drawbacks or weaknesses of machine learning? Drawbacks or weaknesses of machine learning. So, I mean, let's see, I guess it really depends on, you know, what application you're using it for. So maybe one way to ask that question is, you know, when would you use machine learning and when wouldn't you, maybe. So, let me actually, let me maybe tell you what I'm thinking. 
So we've heard about this a lot, where people go on Facebook and they're just kind of normal people who maybe don't have great media literacy, and suddenly they get exposed to more and more engaging content, which is actually like a path towards indoctrination into these hardcore right-wing beliefs. Same thing with YouTube. You might start out watching gaming videos and it takes you down this rabbit hole of just hardcore right-wing nationalist content. And it seems like this is because, the way machine learning works, it's very difficult for someone to peek under the hood as to exactly what the algorithm is doing. Well, what is your take? Yeah, yeah, it's horrible. I hate that stuff. Yeah. Recommender systems, you know, have been, I think, shown to be pretty pernicious at radicalizing people. There was an interesting study where some researcher looked at lots of different trajectories in YouTube recommendations, and they all ended up at crazy radical content. Well, why does this happen? So, okay. I think it's important to also, you know, not treat machine learning as this one monolithic thing. Recommender systems are a really specific application, and it's really not that related to fitting a neural network or a linear model to predict a specific quantity. It is somewhat related, but they're going to be using things like reinforcement learning or contextual bandits or something. But stepping back, I think the main problem here is that we have no good methods right now to deal with feedback loops in user-facing software. Like, it's just a total indoctrination cycle. You can't even catch that, they have no tools to even catch that that's happening. 
Oh, well, so I guess what I was going to say is, the problem that recommender systems are working on is already, you know, very hard. So let's see. So here's a user. They are interested in some things. I don't exactly know what they're interested in. I don't know how they're going to react if I show them this content versus that content. So let me try and figure out what they'll like the best. So that's already a hard problem. 'Cause what does it mean for a user to like something more than something else? How do you know what's good for a user and what's not good for a user? So it's not even clear what the objectives are. The algorithms used to actually figure out, you know, what to show people are also not that well understood, or maybe if they're well understood, the dynamics of how it's actually going to affect the user's behavior are not understood at all. So yeah, I can't imagine that the people working for YouTube want, you know, people, especially young kids, to get indoctrinated into the alt-right. But that's what's happening to some extent, that's happening on some level. So why is this hard for them to stop? So I don't know. So, okay. Full disclosure is that I have no idea what the numbers are, how often this happens, et cetera. I mean, I don't know, how would you do it? How would you stop it? You could write some kind of random kludge code that's like, don't, you know, recommend a specific set of videos, but that's already, you know, it's very hard to tell whether or not a video is, you know, who's to say a specific piece of content or information or opinion, you know, deserves the light of day or not. So I think it's more of a question of, you know, what metrics are they optimizing for? 
And it seems like maybe in some cases it's not a great metric, or maybe, you know, these people actually want to see this content, and it sort of elevates the question above, you know, an algorithmic problem to. Yeah, no, not even just censorship, but editorial problems, and, you know, at what point do these media platforms have to make editorial decisions. I mean, clearly they are now starting to have to make editorial decisions, and it's becoming pretty interesting. Sometimes sweeping editorial decisions. I mean, this is not related to machine learning, but Facebook just banning all Australian news from its entire platform was quite interesting. Right? Yeah. So I don't know if I have a good answer for why they can't stop it. It's like, you know, if push came to shove and someone said, you know, never recommend content that has these specific characteristics, maybe is related to the alt-right, you know, you can maybe do that, but I don't know. Are they going to do that? Maybe, maybe not. Should they do that? Interesting question. And they're always going to find ways around it. The people making this type of content are always going to find the little peripheries of the, and maybe one thing to highlight. You know, I've seen this misconception made in the media a lot when people talk about systems like YouTube. So on the topic of, you know, quote, fake news, they'll say something like this. They'll say, man, so, you know, Google and Facebook are so reluctant to take down fake news and the spread of disinformation. And yet when someone posts, you know, copyrighted content on one of their platforms, and the copyright owner is big and famous, such as Beyonce or, I don't know, some media corporation, you know, YouTube will take it down immediately. 
So this shows that, you know, YouTube and Google and Facebook are not acting in good faith, because they could take down misinformation immediately if they wanted to. And that's a really bad analogy to me. Those two tasks are not equivalent at all. It's extremely easy, if you know a specific song is copyrighted and it should not be uploaded by anyone other than the copyright owner, to figure out, oh yeah, this is the song that is copyrighted. You can just fingerprint it and, you know, ban it from being uploaded. It's infinitely harder when, you know, you have a video, you have an article, and the computer's job is now to determine semantically whether this thing is, you know, telling the truth or arguing in good faith. I mean, humans can't do that reliably. We're nowhere close to having computers be able to make these decisions. So the indoctrination spiral problem, that's not necessarily just a machine learning problem. It's a broader problem within the recommendation algorithm. I guess so. I mean, I would totally preface all of this by saying I am not an expert on indoctrination problems. I mean, I personally don't find recommender systems interesting, and so I don't think about them often. I think they do pose problems for society. I don't like it when computers recommend things, you know. It's almost like a weird rebellious instinct. If a computer tells me, hey, you should have this, I'm like, screw you. You know, you don't know. I determine my own taste. Right, right. So I just don't find it interesting. You know, it just ends up making everyone think the same thing, and all opinions become bland and everyone is parroting each other. Right. There was a world. 
There was a world that existed not too long ago where these types of systems didn't exist at all, or where it was very, very obvious how they worked. I remember when I was, I think, in high school, YouTube recommendations on the side were just other videos from this channel. Yeah. Cool. Yeah, that would be useful. Maybe we should go back to that. I don't know. I don't spend too much time on it. If I am on YouTube, I look at my specific subscriptions and I don't really look at what YouTube recommends me. Right. Yeah. So we've talked a lot about machine learning, and in some senses we can say that was your past life. Now you're working on some new problems in the space of convex optimization. So can you tell us how you got into that and then explain what convex optimization is? Yeah, sure. I'd love to. I guess I would just clarify that machine learning is not necessarily my past life. As you'll see, convex optimization, and optimization more broadly, is really intimately related to machine learning, or at least machine learning is one of its biggest applications these days. So I still do work in machine learning, but I just have broader interests, I would say. So I can talk about optimization, and convex optimization specifically. Optimization is a really interesting subject. I guess the broadest term for it is mathematical optimization. And actually, maybe the discussion we had on declarative and imperative programming is helpful here. Mathematical optimization is almost like a declarative approach to solving problems, where you articulate exactly what it is you want. Typically, in a mathematical optimization problem, you have a variable, which is a list of numbers, and which can represent one of many things. It can represent all kinds of things.
It can represent trades that you're going to make in a financial portfolio. It can represent commands to actuators. It can represent how you are going to allocate resources across many different entities, things like this. But you have this variable, and you want to choose it optimally. So your job in creating a mathematical optimization problem is to define what it means for a choice of this variable to be optimal. That's done through a mathematical objective function: you write down a function that associates to each value of the variable a number, which says how good that choice was. And you may also have some constraints on the variable, which are inviolable, which say something like, you can't spend more money than you have. So in mathematical optimization, you articulate what needs to be chosen, what makes one choice of the value better than another, and the constraints on the value. Once you write down those three things, if you can write them down using math, you all of a sudden have access to hundreds of algorithms that can be applied to numerically solve this problem and find you, if not the best choice of the variable, a very good choice of the variable, and, importantly, a choice of the variable that you, a human, would not have been able to come up with yourself. So I think maybe we can illustrate this by talking about, say, landing a spacecraft. Let's say I've launched a spacecraft and I want to land it on the surface of the moon. What would an ML solution look like? And why might that not be so good compared to an optimization solution? Oh God, I don't know what an ML solution would look like.
So let's talk about what an optimization solution would look like. Okay. So you want to land a spacecraft. There's going to be a lot of things going on, but at the end of the day, you're going to know where you want to land, or a region of space that you want to land in. So you know what your final position should be, and you know what your final velocity should be. Yeah. Right, because you want to land. You know your current velocity, you know your current position. Now, I guess there are two things. You want to plot a trajectory that will take you to the specific location where you want to land with velocity zero, and you want to track that trajectory. The way that you're going to create that trajectory and actually move your spacecraft is by coming up with a sequence of forces to apply to whatever actuators are on your spacecraft. So these forces are like, use the right thruster, push that a little further, push the left thruster a little faster. So the order in which you do that is what the optimization problem is trying to figure out. Yeah. So when are you going to actuate? Maybe you've already decided: you've discretized time and you actuate every, I dunno, millisecond, maybe microsecond. And then you're going to decide at each time step how much thrust each actuator will generate and in what direction, things like that. In doing that, you're going to create a sequence of positions and a sequence of velocities that your spacecraft is going to track through time. So here, the goal might be something like: find the sequence of thrusts, or actuator allocations, through time that will allow you to land safely.
But the objective will be to use as little force as possible. Say, use as little fuel as possible, minimum fuel, or, if you're weird, you might want to land as fast as possible, right? Yeah, in as short an amount of time as possible, something like that. The constraints are going to be the laws of physics, right? The constraints are going to be like, if I apply this force here, then at the next time step I'm going to be in this position and my velocity is going to be this. So once you quantify these things, the physical constraints and what you want out of the landing profile of your vehicle while it's landing, what you want to achieve, you have a mathematical optimization problem in which the variables are, in this case, the decisions you're making for each actuator at each time. Okay. So we've laid it out. We have this problem, which is that we want to land this spacecraft. In order to turn it into an optimization problem, first we say, what is the goal? The goal is to land it, which means reach position X with velocity zero. We have the thing that we're optimizing for, which is that we can either minimize the amount of fuel or make sure that we're landing as quickly as possible, or whatever else we want. And is it possible to choose more than one of these? Yeah, you can have multiple objectives that you're optimizing for, in which case you're trying to find an optimal trade-off between them. The degree to which you're trading off between them is represented by a hyperparameter in your objective. So you'll say, I care about this sub-objective more than that one. It's a hyperparameter that you would tune, ultimately, to get good results. Okay. So then we've got that.
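The landing problem laid out above can be written down compactly. What follows is only a simplified sketch under assumptions that are not in the conversation: time is split into T steps of length h, the mass m is held constant, gravity g is a fixed vector, and each thrust is capped at u_max. The symbols p_t (position), v_t (velocity), and u_t (thrust at step t) are illustrative names, and this is not claimed to be the formulation any real lander uses.

```latex
\begin{aligned}
\text{minimize}   \quad & \sum_{t=0}^{T-1} \|u_t\|_2 \\
\text{subject to} \quad & p_{t+1} = p_t + h\, v_t, \\
                        & v_{t+1} = v_t + h \left( \tfrac{1}{m}\, u_t + g \right), \qquad t = 0, \dots, T-1, \\
                        & \|u_t\|_2 \le u_{\max}, \qquad t = 0, \dots, T-1, \\
                        & p_0 = p_{\text{init}}, \quad v_0 = v_{\text{init}}, \\
                        & p_T = p_{\text{target}}, \quad v_T = 0 .
\end{aligned}
```

The objective stands in for fuel use, the two update equations are the "laws of physics" constraints, and the last line is the landing goal. A second objective f_2 could be traded off against fuel by minimizing \sum_t \|u_t\|_2 + \lambda f_2 with \lambda \ge 0, where \lambda plays the role of the hyperparameter mentioned above.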
And then finally we have: what are the constraints that we cannot violate under any circumstances? The law of gravity cannot be violated. You can't touch the ground and still have a velocity. Exactly. There are just some rules that can't be broken. If you break that constraint, you're going to crash. Right, right. So it sounds like this is the only game in town then, right? Or this is the best? I mean, it's great. Yeah. So this is broadly in the topic of control, by the way. SpaceX does this. I don't remember what the name of the rocket was. My advisor loves this example. It could have been the Falcon. It was one of the rockets that lands, you know, vertically. There was a guy at our lab who worked on these kinds of things, and when he graduated, he went to SpaceX full time. He writes to my advisor, Stephen Boyd, from time to time, giving him updates. And they solve, I think it could even be at kilohertz frequency, I don't remember, tons of convex optimization problems, sequences of optimization problems, while their vehicles are landing. They use something which is called model predictive control. Yeah, it's awesome. Just to mention a few things: there's this code generator called CVXGEN, and unfortunately it's a commercial tool. I mean, we're working on building code generation into some of our own software. We can talk about that later. But the idea is, it's a web interface. You type in the problem that you want to solve, and it generates flat C. In particular, that C will take as inputs whatever data specifies the problem.
So for example, it could be your current velocity, your current position, some things that you know about your environment. All of those inputs to the convex optimization problem, you just throw them on this website and it will spit out C code that actually gives you the answer? Well, in that website, you type in the problem symbolically and it spits out C code. The problem is going to be parameterized by some input, right? You can imagine solving this problem many times a second, and what's different each time you solve it is that you have a different velocity and a different position. So it'll spit out code that takes the input data, it's just a function really, and gives you an answer. You take that flat C code and you put it on your spaceship, and now you have a controller. So I think they actually use CVXGEN, maybe. I don't know. There's a paper by Lars Blackmore that describes how they do this in, like, two sentences. It's mentioned in passing. So people talk a lot about how Tesla is really powered by machine learning. And that's probably true for the image recognition stuff, figuring out what's going on around the car, but it sounds like that feeds in as input for, I don't know. I mean, I'm sure there's a way to do the whole thing with ML. Oh God. Yeah. Some people are trying to do that. Comma.ai, I think, claims to. Have you heard of them? No. They're this open source self-driving car company, I guess. I don't know, all their software is open source, and they're trying to do self-driving end to end, from vision to steering and acceleration actions. But no, what you're describing is exactly how all the major self-driving car platforms, or whatever you want to call them, work today.
You have machine learning that does the sensory processing, right? So you have a bunch of sensors. You have lidar, you have vision, you have radar, whatever else. Machine learning will sort of make sense of the world and say, okay, there's a pedestrian over there, there's a stop sign over there, here's the road, here are the other things that are around you. But now the car has to make decisions. And in my opinion, if you have a sane software stack, your decision is not going to be made by a machine learning model. It's going to be made by solving an optimization problem. And that's how, at least I think, many of these places do it. So the first decision you have to make is, okay, here's the scene: how do I plot a safe and comfortable trajectory through the scene? Which, by the way, is an extremely hard problem, because you also have to predict where everyone is going to be, and you have to make sure it's safe. First of all, semantically figuring out what's in the scene is hard, and then figuring out what to do safely, and working 100% of the time, is even harder. But so that's the optimization problem to plot a trajectory. And you have another layer of optimization below that. Once you have a trajectory that you want to achieve, you have something that's probably running at a kilohertz or faster, which is actually generating control inputs for the car to track the trajectory. So it's generating accelerate, decelerate, steer, all these types of commands, and probably more things that I don't know about. But you've got this multi-level decision making. Yeah. So it sounds like maybe one of the most impressive things, or the most useful features, of optimization is constraints.
So the fact that under no circumstances is this car allowed to crash. There are just some laws that we cannot violate. Is that also doable in ML, or is that why convex optimization is really the only solution? I mean, let's see, is that doable? In ML, there are lots of ad hoc ways to put constraints on your output, right? You can parameterize something so that the output is always positive, or something like this. But more broadly, if you are able to articulate what it is you want and what constraints it has to satisfy, I see no reason not to use an optimization problem. Because you will get exactly what you asked for. So you'd better know what you're asking for, and you'd better be sensible in crafting this mathematical program, but you'll get exactly what you asked for. Yeah, this is modulo some assumptions. We haven't talked about convexity yet, but we can talk about that later. Basically, convex optimization is a subset of problems that are efficiently and globally solvable. So you have guarantees on your output. It's sensible, it's interpretable. You know what your model is optimizing for, and you know it's never going to do something crazy. Here's my advisor's favorite example when he gives talks these days. He's like, your system will never characterize a stop sign as a banana, which is a famous example of a neural network mistake, which might happen with a machine learning model. Now, you're not going to use a convex optimization problem to do vision, but that's the general idea. It's not going to do anything stupid, unless you, the modeler, were stupid and created a stupid problem, in which case it's your fault, not the technology's fault. Right? Yeah.
So with convex optimization, as compared to machine learning, it's easier to find stupidity upstream than to discover it downstream. Exactly. Yeah. With machine learning, once you realize you have stupidity in your model, oh God, good luck. And I think that's part of the reason for the whole conversation around bias in machine learning models. It's like, okay, you've detected that your machine learning model is making biased predictions, or harmful predictions. Have fun retraining it and reprocessing all your training data. I mean, yes. Yeah. So the beautiful part of optimization is that it really requires you to think clearly about what the problem is that you're trying to solve. Not how you are going to solve it. You first need to think really clearly about what you want to solve. And I think this is a gap, maybe especially in computer science education. That was my undergrad. There's a lot of emphasis on how you are going to solve a problem, but not so much emphasis on what the problem actually is. And the beautiful thing about optimization is that it decouples these two things. First you have to articulate your problem. And then there's a whole family of optimization algorithms, which you can think of as gradient descent plus more sophisticated things, for actually solving these problems. So the concept of stupidity in models, is that an academic term? Because if not, you have a great opportunity to coin something popular here. Yeah, I don't know. Sure. Maybe I should, you know, stupid models versus not stupid models. I'm being facetious, but yeah. Sorry. Well, also, to make that a specific metric, like the percentage of stupidity. Yeah, yeah. No, that would be great.
Unfortunately, I think it's very application specific. So it would be a very laborious process to quantify that in all the different fields. You'd need people to go and work on this. I mean, optimization has tons of applications. So my advisor Stephen Boyd has an optimization course. His book is all about applications of optimization, and it's already 700 pages long, but he also has a PDF of additional exercises, which are all self-contained applications of convex optimization. It's actually really beautiful. You don't really need to know anything except optimization. And it's like 200 pages of applications in machine learning and geometry and circuit design and signal processing and control and finance, mechanical and aerospace engineering, energy and power management. There's just tons. So anyway, it would be very hard to develop a stupidity index for all of these. I mean, this sounds like pretty amazing stuff. ML is getting all of this hype. Why is optimization not getting the spotlight? I mean, machine learning, artificial intelligence, is very attractive and, like, sci-fi, I dunno. I'm sure there's some part of our brain that is stimulated when we talk about AGI or something, right? Optimization is interesting. So I remember, okay, there are some people in the machine learning community who, when they hear convex optimization, kind of raise their eyebrows, like, oh, you work on convex optimization, that's pretty old stuff. What do you think now that we're solving all these non-convex problems? Which is like, wait, should we talk about what convex means, actually? Yes. Yes. Okay. Like, geometrically, a convex problem. Okay.
So tell me what it is. Basically, if a mathematical optimization problem is convex, it's sort of like a condition that guarantees that we can solve it efficiently and globally, in polynomial time. And really what it means is, at the end of the day, what you're doing in all these optimization problems is finding a minimum of some function. This function is your objective function, which specifies the cost of each of the different values for your variable. So you want to find the assignment of lowest cost. So you're minimizing a function. Convex refers to the function of the goal? So you have a goal, which is, we want to use the minimum amount of fuel to land this spacecraft. Yeah. Right? So that is convex? Yeah. That minimum-fuel function is convex, which means, roughly speaking, that it's shaped kind of like a bowl, or something like that. It has nice curvature that makes it amenable to algorithms for finding its minimum. Yeah. So there are some ways to do this where we use a ton of fuel, but there's one that is best. Okay. Does it have to be the case that there's only one perfect solution? Can there be multiple? There could be many, but what is true is that any solution that appears to be locally optimal, so any solution that is better than anything nearby, is also going to be globally optimal. Okay. Yeah, it's a good convex shape. I'm trying to describe this over a podcast. How do you describe it? The bowl can have a flat bottom, in which case there are many solutions, but they're all the same. What's nice about the semantics of the optimization problem is that you don't care about which solution you get. If you did your job correctly in specifying the problem, you should be equally happy with any solution that is furnished to you, right?
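For readers who want the "bowl" picture in symbols, here is the standard textbook definition, followed by a short argument for the claim that a locally optimal point is globally optimal. The symbols f, x, y, z and \theta are just notation for this sketch and do not come from the conversation.

```latex
% f is convex if, for all x, y in its (convex) domain and all \theta \in [0, 1]:
f\big(\theta x + (1 - \theta) y\big) \;\le\; \theta f(x) + (1 - \theta) f(y).

% Local implies global: suppose x is locally optimal but some feasible y has f(y) < f(x).
% For small \theta > 0 the point z = (1 - \theta) x + \theta y is feasible and close to x, and
f(z) \;\le\; (1 - \theta) f(x) + \theta f(y) \;<\; f(x),
% which contradicts the local optimality of x. So no such y exists.
```

The argument relies on the feasible set being convex as well, which is part of what makes a problem a convex optimization problem. A flat-bottomed bowl satisfies the definition too, which is why several solutions with the same objective value can coexist.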
You cannot complain if you get one solution compared to another, because they both satisfy the constraints and they both have the exact same objective value. So if you still have an opinion that one is better than the other, that means you need to go and fix your specification of the model, because you have not told the optimization algorithm enough about what you care about. So my understanding is that for these solving techniques that are used for convex optimization problems, if you apply them to a non-convex optimization problem, you're still going to get pretty good results a lot of the time. Is that correct? So that's actually where I was going, and it's a really good insight that you had. Okay. Let's see, we should be polite. So some people who are used to fitting neural networks, and neural networks have non-convex objective functions, their loss functions are non-convex typically, are sort of like, oh, you do convex optimization, does it scare you that we are doing non-convex? The thing is, literally the only things we know how to sensibly optimize, loosely speaking, are convex functions with convex constraints. So algorithms for minimizing non-convex functions are based off what we know about convex functions. So, gradient descent. Should we define gradient descent, or should we just move on? Let's move on. Okay. So gradient descent, right? That works for convex functions. You will always find the best solution. Will you always find a solution if you apply it to a non-convex problem? No, but sometimes you will. It's what people use. It's fine.
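Since gradient descent was left undefined, here is a minimal sketch of it in plain Python, run on one convex and one non-convex function. The two functions, the step size, and the starting points are made up for illustration. On the convex function both starting points reach the same minimum; on the non-convex one, the answer depends on where you start.

```python
def gradient_descent(grad, x0, step=0.01, iters=2000):
    """Plain gradient descent on a one-dimensional function."""
    x = x0
    for _ in range(iters):
        x = x - step * grad(x)  # move a small step downhill
    return x


# Convex example: f(x) = (x - 3)^2, whose only minimum is at x = 3.
def convex_grad(x):
    return 2.0 * (x - 3.0)


# Non-convex example: f(x) = x^4 - 3x^2 + x.
# It has its global minimum near x = -1.30 and a worse local minimum near x = 1.13.
def nonconvex_grad(x):
    return 4.0 * x**3 - 6.0 * x + 1.0


convex_from_left = gradient_descent(convex_grad, -2.0)
convex_from_right = gradient_descent(convex_grad, 2.0)
nonconvex_from_left = gradient_descent(nonconvex_grad, -2.0)
nonconvex_from_right = gradient_descent(nonconvex_grad, 2.0)

print(convex_from_left, convex_from_right)        # both end up at the same point
print(nonconvex_from_left, nonconvex_from_right)  # two different stationary points
```

Starting the non-convex run from the right lands in the worse local minimum, which is the "no, but sometimes you will" case from the conversation.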
And more broadly, for non-convex optimization problems, I spent the last year thinking about some very specific non-convex optimization problems. We appear to solve them almost globally, or, I mean, the solutions are good enough for our use case. And we use algorithms that were developed for convex optimization. Right. So the algorithms are developed for convexity, and then you just apply them in the non-convex space. So let's get a little technical now about how you solve a... well, actually, let me take that back. We've been talking about theory, so let's talk about practice. If you want to solve an optimization problem, I've done some digging and I found that there's a wonderful library in Python, CVXPY. It's got 5,000 stars on GitHub. That's more than I thought. And I saw in the contribution history that you actually wrote 90% of the code base. So my question is, how would you... That doesn't sound right. You wrote, like, 80,000 lines or something. Your contribution is way more than everybody else's. No, well, I've written a lot, but the original developer was Steven Diamond. So I would guess he's done more than me, and I don't want to take credit. Fair enough. Maybe I'm second. Okay. So you've written a huge percentage of the code base, and it's a very popular library. So my question is, how has your life changed since becoming a celebrity? Yeah, I think if I was a celebrity, I would know it. So my life has not changed yet, except I guess I can go on podcasts now. This one. You know. So if someone is interested in getting into this field and playing around with your library, of which you're the number two or number one contributor, maybe number two.
What is kind of a fun pet project that you might recommend for them to play around with in CVXPY? Yeah, it really depends on what their background is and what their interests are. So maybe we can do a few different case studies of what people are interested in. The first thing you would do is go to cvxpy.org to actually figure out what the heck this is. And I guess we can say what this is. It's a Python library for specifying and solving convex optimization problems. I think it's a really beautiful embodiment of the thing that we've been talking about, declarative versus imperative programming. CVXPY can be thought of as a domain-specific language embedded in Python. What you do in CVXPY is, with really natural syntax, you specify a convex optimization problem. CVXPY has a really small grammar, in the programming languages sense, that guarantees that, as long as you stay within the grammar, the problems that you specify are convex, meaning we can solve them efficiently and globally. So all those things we talked about, you just say the loss function equals this, and the constraints are this? You say the objective function equals this, and the constraints are a Python list of inequalities, basically, or equalities. Literally, the norm of a variable is less than or equal to one, or whatever your constraint is. Less than seven, whatever. Exactly. A better example: final position equals-equals zero. We overload the equals-equals operator. Anyway, you specify your problem, you create a problem object, then you call problem.solve().
And that's going to compile your problem into a format that a low-level numerical solver will accept, and then return a solution. So that's what CVXPY is. If you're interested in trying it out, here are a few different classes of problems that convex optimization can be used for, and which you can try it out on. Let's see. One really big one is resource allocation. Let's make this really concrete. I guess we can talk about allocating compute resources in a data center to a collection of jobs. Okay. So say I'm like an AWS server, and there's a bunch of different companies who are all renting out a bunch of GPUs. So I need to figure out, I'm going to give this percentage of GPU time to this one guy, and this percentage to this other guy, so everybody's happy. They don't have to wait too long and they get good results. Exactly. Yeah. So the things that you're allocating are CPU, GPU, memory, disk, network bandwidth, probably a five- or six-vector like that. And you have a bunch of people coming in asking for different amounts of resources, and you figure out what's the best way to allocate these resources to all these different people while maximizing effective throughput, or some notion of fairness, or something like this. So that's one example you can get started with. I don't know if we have one. We have a library of example notebooks at cvxpy.org, and there's an examples link. Maybe it's there. If it's not, I should add one. But that's one really good example and an easy way to get started.
This is also related to a famous, kind of silly problem called the diet problem, where you, as an individual, are choosing what you should eat on any given day, and you want to maximize your nutritional value or, I don't know, minimize the amount you spend or something, while the constraints are that you meet your minimum nutritional requirements and your budget. So anyway, if you were really weird, you could solve a convex optimization problem to decide what to eat on any given day. I feel like people around you would not be very happy with you if you went to a restaurant and said, oh, you can't order that, you have to order this. I think you would be ridiculed, so I don't recommend it. But anyway, resource allocation is a broad class. Another one is finance. Finance is probably, for better or worse, the most salient example of convex optimization having a really big impact in industry. Here, the variable represents how much money you're going to invest across a universe of assets. You might be long some assets, you might be short some assets, and the objective is to maximize your expected return while also minimizing risk. So you're trading off return and risk, right? You can get started with that pretty easily. You get some data online for, say, mean historical returns of different assets. And you can get a covariance matrix by making it yourself. A covariance matrix specifies how risky different investments are. There are literally companies that will sell you a covariance matrix for, like, a million dollars. It's kind of ridiculous. I thought the whole point is that these are public markets, so all this stuff, anyway. But that's an easy way to get started.
So you're saying I can use CVXPY to become a Bitcoin millionaire. Yeah, maybe. I mean, there have been some cryptocurrency papers that have been citing our paper. I don't know, some people might laugh. We'll talk more about this after the podcast. You can't share too much. But yeah, finance. People also use CVXPY to do things like design airplanes. I dunno, maybe that's hard to get started with. I don't know how many people are interested in designing airplanes, but you could do it. I guess I would say the best way to get started would be to go to the site and look at the examples. You would get a sense of the different things you can apply it to. Very cool. So what are some of the open questions in the field of convex optimization? Yeah, there's a lot there. Let's see. In my opinion, the more interesting ones have to do with applications of convex optimization. So this resource allocation one that we've been talking about: how can a data center provider, or someone who maintains a data center, allocate their resources optimally? That's one that I'm pretty interested in, just because I think it's low-hanging fruit that people just haven't picked, because not that many people know about convex optimization. It's kind of weird, right? Convex optimization builds on linear programming, which is this very old-school technique, or some people think it is, that was developed in the 1930s. But what people don't realize is that convex optimization really only came into mathematical maturity in, like, the 1990s. So software is really lagging behind still. Anyway, I think this resource allocation example is a great way that convex optimization could have a really big impact in industry. Because I think data centers run at something like 20% utilization, or something ridiculous.
Like, they really waste their resources. And I mean, you know, it's more important that you can actually run the workloads that need to run, and then you figure out how to optimize it, so fine. But that's a great example. Right. Okay. There are other examples. There are some more things related to algorithms for convex optimization that I think are worth researching. One question that I'm interested in is how far we can push gradient-based methods for solving optimization problems with constraints. If you have no constraints on an optimization problem, you can always do gradient descent, more or less. When you have constraints, that becomes a little bit more interesting, a little bit more difficult. And now that we have just amazing software like PyTorch, JAX, and TensorFlow, that has really led to a mini revolution in optimization. Like, oh my God, we can differentiate through everything. So people are looking at how that informs our algorithms. Plus, you said that the math has gotten better, but I mean, the fact that computers have gotten so much better means that stuff that was not on the table before, you can now do in seconds, which would have taken... That is definitely a similar paradigm. And I think that's only going to become a lot more important in the future. So right now the software for convex optimization is okay. It's not excellent. CVXPY is great, but, you know, there are some bottlenecks right now: if your problem is too large, we will have some trouble compiling it. It'll take a long time, it won't be suitable for real time, and you may be discouraged and just stop. But I'll give it another ten years, maybe five years, and I think we're going to find that it's going to be so much easier to solve large-scale convex optimization problems very quickly.
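One simple member of the family of gradient-based methods for constrained problems mentioned above is projected gradient descent: take an ordinary gradient step, then move the point back into the allowed set. The sketch below is only an illustration under stated assumptions, not the speaker's method. The objective, the bounds, the step size, and every function name are hypothetical, chosen so that the answer can be checked by hand.

```python
def project_box(x, lower, upper):
    """Clip each coordinate into [lower_i, upper_i] (projection onto a box)."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]


def gradient(x):
    """Gradient of the toy objective f(x) = x0^2 + x0*x1 + x1^2 - 3*x0 - 3*x1."""
    return [2 * x[0] + x[1] - 3, x[0] + 2 * x[1] - 3]


def projected_gradient_descent(grad, x0, lower, upper, step=0.3, iters=200):
    """Repeat: gradient step, then project back onto the box constraints."""
    x = project_box(x0, lower, upper)
    for _ in range(iters):
        g = grad(x)
        x = project_box([xi - step * gi for xi, gi in zip(x, g)], lower, upper)
    return x


# Hypothetical constraints: 0 <= x0 <= 0.5 and 0 <= x1 <= 10.
# Without constraints the minimizer would be (1, 1); the cap on x0 changes it.
solution = projected_gradient_descent(gradient, [0.0, 0.0], [0.0, 0.0], [0.5, 10.0])
print(solution)
```

With the cap on x0 active, the remaining coordinate settles where its own partial derivative is zero, which gives x1 = 1.25. Projection is cheap here only because the constraint set is a box; for more complicated constraints the projection step can itself be a hard problem.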
And I think that is the key thing. No one knows they're waiting on it, but I think the key thing that is bottlenecking progress in applications is the quality of the software. But I think it will come. Like I said, the software is really lagging, and there isn't that much open source software, but our lab is trying to change that, and other labs are too. Well, the other aspect of it, which is quite surprising, is that it's really easy to lay out, to present a convex optimization problem. I mean, you need to understand what the problem is, but it's very intuitive: these are the constraints, this is what we're optimizing for. That spacecraft example, I think, really shows that once you are aware of convex optimization, applying it is not bad. Yeah, definitely. That is totally true. And that's, I think, the beauty of it, right? All that it requires from you, and this is sometimes a big ask, but it's really important that you do it, is that you figure out what you want. You figure out what you want and what is not allowed. And so many times people don't figure that out, which leads to real confusion. I think a lot of times people are confused about algorithms for various problems because they haven't figured out what... but yeah, like you said, if you can figure those things out, it's not that hard to take it a step farther, write down your problem, and have it solved. I think it's more intuitive than ML, at least for someone who's just getting started with either one of these. Definitely. And there's no false analogy. You don't have to worry about things like that. Yeah. Let's dig into that a little bit. Everyone talks about how machine learning, and neural nets in particular, use learning in the same way. Why, wait, why don't you buy that?
I mean, I could say that when I solve a convex optimization problem, I'm not solving it, I'm learning the parameters, or my computer is learning what the best variable is, which is fine. Sure. I mean, it's just kind of silly, I think. But I think it's cultural, or it's like an aesthetic. I mean, some people like that verb, I like it. So if someone is in optimization, we wouldn't say we're learning the parameters for a neural network. We'd say we're fitting it, finding the best parameters to minimize the loss function. I see. It's like a, yeah. I mean, yeah. I saw a meme that said, "Machine learning is just if statements, change my mind." Well, I don't think it's interesting. Yeah, no, I don't think it's that. All right. I don't think we have enough time to get into embedding, and MDE in particular, so we'll definitely have another episode where we dig into that. But let's go on. I like to wrap the interviews up by asking: in your opinion, what is the best piece of software written, either in recent history or of all time? Yeah, that's a tough question. I don't usually think in terms of... it's funny, I work in optimization, but I don't think about my life in terms of optimal things. I don't really optimize my life, I think. But anyway, I can talk about maybe a piece of software that I just find really cool, and software that I've recently found to be really influential for my career. So at a high level, and I think this sort of fits our discussion, I just think software compilers are super, super cool, because I think they are what has really unlocked so much productivity in software development. Right. Because it allows humans to think in ways that are
natural for them, instead of writing assembly or something even lower level. And the idea of taking a high-level description of something, rewriting it through a series of reductions into a series of equivalent forms, and then giving it to a computer at the end is just really, really cool. And that general idea has so many applications, not just in software compilers strictly. CVXPY can sort of be thought of as something like a software compiler: we let humans think in ways that are most natural for them, and then we do the work of getting them a solution to their problem, which requires thinking in ways that are unnatural, sort of. Right. Yeah. So I think this paradigm of domain-specific languages, and just the abstraction that is enabled by things like compilers, is just really, really powerful. So that's what I'm going to give you. Well, let me, I want to come back to that, because I think that's a really great observation. Yeah. I took computer architecture courses in college. In the beginning, when you were writing code, you had to understand exactly how a computer works. You had to know what a register is. You had to know how a CPU handles instructions and clock cycles. You had to know whether the CPU is pipelined or not. You had to know all of those things so you could give the computer instructions that would say, like, branch if this happens. But once people were able to write higher-level languages that would just automatically figure all that stuff out for you... The fact that you don't have to spend, what, months or years learning how a computer works in order to write programs for it makes it accessible to so many more people. And now it seems like it's almost like compilers are a spectrum, right?
So you have a compiler where you don't have to know how the computer works. You have a compiler where you don't have to know how memory works. And you have a compiler where you don't even have to know how, maybe, object orientation works, and you can make a nice, simple little domain-specific language. So it does make sense that you guys are kind of doing that with optimization. Maybe you don't even have to be a coder at all to lay out an optimization problem. Yeah, no, definitely. Yeah, it's beautiful and so powerful. Yeah, I think you put it beautifully. Cool. You can give the other answer if you want, but I think that's a really good answer already. Yeah. So the other... I was going to say, you triggered something in my mind when you mentioned something about compilers related to convex optimization, but I've forgotten it, whatever. So the other thing that I was going to mention, which I think has been really personally influential, because I actually don't know too much about how compilers work, but whatever... What I've found personally influential is software for automatic differentiation. These days, a lot of people put a lot of focus on how TensorFlow and PyTorch are software for machine learning and fitting neural networks. I think the thing that's been more valuable is the fact that these things just let you take derivatives of arbitrary functions, more or less arbitrary code. And I think we're going to see that that's going to have a really big impact, and maybe even a more lasting impact than neural networks specifically. Because the idea that we can take a derivative of arbitrary code and tune its parameters is, I think, really powerful, and we'll see applications in many, many places.
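To make the idea of differentiating more or less arbitrary code concrete, here is a minimal sketch of forward-mode automatic differentiation using dual numbers in plain Python. It is not how TensorFlow, PyTorch, or JAX are implemented; those libraries are far more general and mostly rely on reverse mode. The `Dual` class and the `derivative`, `sin`, and `f` helpers are hypothetical names made up for this illustration.

```python
import math


class Dual:
    """A number that carries its value together with its derivative."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants, whose derivative is zero."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x):
    """Sine with the chain rule: d/dx sin(u) = cos(u) * u'."""
    x = _lift(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def derivative(f, x):
    """Evaluate f at x with a unit derivative seed and read the result off."""
    return f(Dual(x, 1.0)).deriv


def f(x):
    """Ordinary-looking code with a loop: computes 6*x^2 + sin(x)."""
    total = 0.0
    for k in range(1, 4):
        total = total + k * x * x
    return total + sin(x)


slope = derivative(f, 2.0)  # analytically 12*x + cos(x) at x = 2
print(slope)
```

The function `f` is written as normal code, loop included, and the derivative falls out of running it on `Dual` inputs instead of floats. That is the sense in which automatic differentiation combines elementary calculus rules with elementary programming.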
So basically you just have a bunch of data, you throw it into this, and then it will map a function to that, which you can use for any of these problem-solving techniques, be it optimization, be it ML, or anything else. Yeah. Anytime you want to understand how your function will change if you change the parameters a little bit or change the data a little bit, you can do that with automatic differentiation. It's this beautiful combination of elementary math and elementary programming. Cool. Well, that definitely sounds like something people should look up, a burgeoning field. Akshay, it's been a very fascinating conversation. I can feel my brain expanding every time I talk to you. So thank you very much for coming on and sharing your knowledge and wisdom. We'll describe Akshay's research and link to everything that he talked about in the show notes. Any other comment that you have for the people out there? No, I don't think so. I mean, this was an absolute pleasure. Thanks for having me on. Cool. Well, thanks for coming on. All right.

Age of Information
