ADSP: Algorithms + Data Structures = Programs
A programming podcast hosted by three software engineers (two at a time) that focuses on algorithms, data structures, programming languages, the latest news in tech, and more. The podcast was initially inspired by Magic Read Along.
Episode 277: High on AI Update
In this episode, Conor and Bryce give a "High on AI" update, chat about the AI tools they're using, their workflows and more!
Socials
- ADSP: The Podcast: Twitter
- Conor Hoekstra: LinkTree / Bio
- Bryce Adelstein Lelbach: Twitter
Show Notes
Date Recorded: 2026-03-10
Date Released: 2026-03-13
- ADSP Episode 244: High on AI (Part 1)
- ADSP Episode 245: High on AI (Part 2)
- Cursor
- Claude Code
- Artificial Analysis
- Enter The Matrix
- podgod.ca
Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
5.3 Codex, or I guess 5.4 Codex now, and then, you know, Sonnet and Opus. And whenever you do that, one of them always solves it. And so that's where I'm at. What is your daily workflow? We know you're on Cursor. What's the model that you use?
SPEAKER_01: I use the Claude 4.6 Opus high thinking. I use like the slowest, best thing, because I don't care how long it takes. I want it to be as good as possible.
SPEAKER_00: Welcome to ADSP: The Podcast, episode 277, recorded on March 10th, 2026. My name is Conor, and today with my co-host Bryce, we revisit the topic of AI and do a deep dive, the first time since August of 2025.
SPEAKER_01: Sorry, I was distracted by the AI. I was too busy talking to it to notice you were back.
SPEAKER_00: Understandable. Understandable. I got auto-run working. Maybe. I don't actually know, because it still asks me quite a lot.
SPEAKER_01: So there's a couple weird things with the new auto-run. First of all, the old, sorry, the non-auto-run mode, if it needed to search the web...
SPEAKER_00: We're talking about Cursor, folks, we're hopping right into it. We're gonna talk about AI, lots of stuff today, but Bryce was telling me I needed to focus all my energy on getting sandbox auto-run with Cursor two point whatever and higher. I'm on 2.6. And I finally, thanks to, well, actually, let's get their names. Just their first names. Thanks to... why isn't Slack open? Slack's not open because we got issues with my computer, folks, and I had to restart it.
SPEAKER_01: Was that the root cause of the problem?
SPEAKER_00: No, no, no. I thought I'd been having SSD issues, but I asked Cursor about it. It said that my SSD is fine. It ran the fstrim command and that freed up 76 gigabytes of stuff, and it said, you don't have fstrim running. And so I set that up. But every once in a while I run into an issue where I lock my computer screen and then the UI becomes awfully slow. Like, I will type a character and then I'll see it show up a minute later, and so I just have to restart. I thought it was an SSD issue; Cursor seems to think my SSD is fine. Anyways, thanks, shout out to Pavo and Scott on the CDD code assistant Cursor thread in our Slack. The main Cursor individual just pointed me at the docs. I'd already seen the docs, folks, and was also telling me that my kernel wasn't new enough, which it definitely was. Well, I don't know, it is confusing: .11 versus .2, you know, which is newer? What do you call that? .11 is lexicographically less than .2, but if you interpret it as a two-digit number, then it is larger. Anyways, thanks to these two individuals at NVIDIA, I got the sandbox auto-run working. However, at first I thought it was working great because it wasn't asking me for stuff, but it's still asking me to run Python commands every once in a while, and I'm like, that should definitely be inside the sandbox rules.
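The ".11 versus .2" confusion Conor describes is the classic string-versus-version ordering problem: as text, "6.11" sorts before "6.2", but as a version, 6.11 is newer. A minimal sketch (the `version_key` helper and the version numbers are illustrative, not from the episode):

```python
def version_key(v: str) -> tuple:
    """Split a dotted version string into a tuple of ints,
    so components compare numerically instead of character by character."""
    return tuple(int(part) for part in v.split("."))

versions = ["6.2", "6.11", "6.9"]

# Plain string sort compares '1' < '2' < '9' at the third character:
print(sorted(versions))                   # ['6.11', '6.2', '6.9']

# Component-wise sort treats 11 as the number eleven:
print(sorted(versions, key=version_key))  # ['6.2', '6.9', '6.11']
```

This is the same logic `sort -V` applies on the command line: compare each dot-separated component as a number, not as a string.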
SPEAKER_01: But anyways, my experience with it has been interesting, because it used to be that the way our Cursor was set up, if it wanted to search the internet for something, it would just do that. But any command that it needed to run, it would ask for permission, which was super annoying, because my job just became sitting there and waiting two minutes for it to churn out the command and then pressing yes. And then inevitably I would go do something else, come back like 30 minutes or an hour later, remember it, and this thing has just been waiting for my approval. But now what I find happens is that it'll run most stuff in the sandbox. The way that it works is basically it tries to run everything in the sandbox, and if it doesn't have enough permissions in the sandbox, then it will ask you to run it outside of the sandbox. But what I've noticed is that with sandbox mode enabled, I get these requests to fetch content from the web. And I don't think it's downloading scripts; it's like, oh, I want to look at the docs for this thing. So I think if you have it in sandbox mode, any network request is gonna ask for permission. That's annoying me a little bit, because I don't mind approving the commands that actually do need approval. But basically, if you're encountering that, you should look at the reasoning trace. It probably means that the Python commands it's running in the sandbox for whatever reason don't have the right permissions. Sometimes that happens if it needs to access something that's outside of the repo root. So just look at the reasoning trace; it should explain it. You should see in the reasoning trace the reason why it failed to run it in the sandbox. But has it been a force multiplier for you yet?
SPEAKER_00: Not really, not really. I mean, don't get me wrong, Cursor itself is a hundred-x force multiplier, but the auto-run, until I don't have to click any buttons... I don't care whether it's 10% of some set of stuff or 50% of some other set. If it's any percentage of stuff that is in the common path, aka network requests, aka running scripts locally. I mean, I think you should still be able to locally remove files if you're in a git repo. You can just git checkout dash dash dot and you're back to where you were. Removing stuff outside of the repo, yeah, that should require approval, because that is something it asks for every once in a while now. Oh, we should back up in a second and give a full AI update, because I entitled last episode the mini Cursor slash AI update. But anyways, we'll step back in a second and do that. We'll talk about the models we're using.
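Conor's point that in-repo file changes are low-stakes rests on git being able to restore any tracked file from the last commit, which is exactly what `git checkout -- .` does. A self-contained sketch in a throwaway temp repo (the file name and contents are made up for illustration; requires `git` on the PATH):

```python
import pathlib
import subprocess
import tempfile

def git(*args, cwd):
    """Run a git subcommand in the given repo, raising on failure."""
    subprocess.run(["git", *args], cwd=cwd, check=True, capture_output=True)

repo = pathlib.Path(tempfile.mkdtemp())
git("init", cwd=repo)
git("config", "user.email", "demo@example.com", cwd=repo)
git("config", "user.name", "demo", cwd=repo)

f = repo / "notes.txt"
f.write_text("important\n")
git("add", ".", cwd=repo)
git("commit", "-m", "snapshot", cwd=repo)

f.unlink()                           # the "agent" deletes a tracked file
git("checkout", "--", ".", cwd=repo) # the recovery Conor mentions
print(f.read_text())                 # -> important
```

Untracked files the agent created are a separate matter (`git clean` handles those), which is why deletions outside the repo are the genuinely risky case.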
SPEAKER_01: I'm trying to get statistics right now about how much more code I'm outputting with the auto-run than before. I was scared of the auto-run for a long time. You know, we're like old men basically in the world of AI, because we're still using Cursor; we're not using Claude Code like everyone else.
SPEAKER_00: No, no, that's not true. First of all, I'm not an old man. You're an old man. I bootstrapped an auto-run OCR screen capture. I was working on a project where I really needed it to be running while I was away, and we didn't have access to what they called yellow mode at the time. So I bootstrapped, using Cursor, I think it was Claude 3.5 at the time, a script that would take screenshots of my main monitor. It would identify, I think there were like three different things: one was run, one was approve, and it changes over time. It would find that, then it would use AutoHotkey, I believe, or some Python move-your-mouse module, to move the mouse and go click the button. Beautiful, beautiful. I'm not an old man. And I actually told someone this a year-plus ago, and they were like, well, aren't you worried it's gonna rm dash rf your system? And I was like, that is a risk I am willing to take because of how useful these tools are. And it never did, folks. It never did. So you may be an old man. Does it make us old? You said you tried Claude Code, yet you're back to Cursor. So what happened?
SPEAKER_01: So okay, no, no, no, I didn't try Claude Code. I downloaded Claude Code and tried to use our corporate login to log into it, because I couldn't get Cursor auto-run working, similar to you. I followed all the instructions, but it wasn't showing up. And so I was like, all right, I'll just try Claude Code instead. Actually, I think I didn't know that we had Cursor auto-run; I just knew about the Claude Code auto-run. And so I was trying to download and install Claude Code with our corporate account. And when I tried to do the single sign-on through Claude Code, it gave me an error, and I went to the Slack channel and like five other people had reported the error. So I determined Claude Code's SSO integration with us was not working that morning, but I had to do stuff. So then I searched the internet and I was like, oh, there is a Cursor auto-run. I don't have to go learn any new things. Also, my first couple minutes of interaction... because Claude Code is a TUI, a text user interface. You start it from the command line and it pops up a little text windowing system instead of being a GUI. And my initial interactions with it, solely from starting it to the first few steps of the login process, I was like, oh, this is so awkward. And keep in mind, I'm somebody who for many years did everything in Vim and a command line. I never used an IDE before AI whatsoever. But my initial reaction to this was like, ugh, I want to go back to Cursor. I feel like I'm gonna give it another shot, probably after GTC, because everyone seems to be using it.
And somebody made the point to me that because there are so many people using it, there's a ton of useful skills out there for Claude Code, and one of the benefits to using it is that you get the ecosystem effect of everybody else using it. But I'm very happy with the Cursor experience generally. I like having an IDE for this type of development, because I find that the thing I spend the majority of my time doing is not writing code. If I was just writing code, I would probably want to do that in Vim. But the thing I'm spending the majority of my time doing is browsing through the code base, browsing through the structure of the code base, looking for particular files or something to give it a task to do, or reviewing code. And if I'm spending most of my time reviewing code or browsing through the file structure, I want an IDE, I want a GUI for that, not solely a text interface. So I'll give it another shot and we'll see what it's like.
SPEAKER_00: Yeah, we should both put it on our lists of things to do. Admittedly, I've tried Cursor CLI, which I think is modeled very closely after Claude Code, and I have heard this about Claude Code, that anytime you make a request, it gives you a couple questions like, do you want me to use React or Flutter, or something like that. And with the Cursor CLI experience, I was like, this sucks compared to Cursor, and I just went straight back to Cursor. No offense. I mean, the lovely folks, Michael Truell, CEO of Anysphere, and everyone, you're doing great work. I'm just saying the Cursor CLI experience was subpar compared to Cursor. And I'm saying I have no need to switch, yet here I am whining about the fact that we don't have auto-run basically 100% of the time.
SPEAKER_01: But I don't think you'll have that with Claude Code either. The corporate auto-run... well, I do know that they have a pilot program. I don't know if we're allowed to talk about it or talk about any of this, but I know that they're working on rolling out auto-run with network access. So sandboxed, but with some amount of network access via an allow list. I've seen a couple emails about that, so we may be able to do that soon. As for the way it works in sandbox mode: all these sandbox modes use built-in OS or kernel features to run your stuff in. I don't want to say it's like Docker, but it's kind of similar to containerization like Docker, where it's running on your OS but in a separate environment. I think on Linux, the thing that Cursor and Claude Code use is called Landlock. The difference with Docker is that Docker runs things in a different environment; it's a sandbox, but a different environment. The thing that Claude Code and Cursor are doing when they're sandboxing is not a different environment. It's your same environment, but still sandboxed. So it has a different set of permissions and it's got isolated access to your file system and your network. When it runs in your repo, it's gonna give it access to the files that are within your repo, but not the whole rest of the file system. But I think Landlock is the thing that they use. And I think the idea with the auto-run network allow list is: by default, right now today, if you try to do anything in the sandbox that has to access the internet, it won't work in auto-run, it'll fail, and then it'll ask you for permission.
And with the allow list, it gives you a blessed set of things. And one of the other things that's specific to NVIDIA is that I think in both Cursor and Claude Code, the default sandbox setup does not have GPU access. So if you need to run your tests, and because you work at NVIDIA, your tests need the GPUs...
SPEAKER_00: That's only a small part of our business though, right?
SPEAKER_01: So, the GPUs.
SPEAKER_00: But waiting for the joke to register on Bryce's face there was hilarious. I could see the gears turning. It's rather early in the morning right now, and Bryce was like, uh, I don't know, wait, is he talking about the GPUs?
SPEAKER_01: So this is just a common problem with GPU development and containerization: you need some amount of privileges to access the GPU. And I think by default in these tools, the sandbox environment doesn't have access to the GPUs. So anytime you go to run your tests, it has to ask for permission. I've noticed that I have some pre-commit hooks in the repo that I've been doing most of my work on recently. The pre-commit hooks don't run in the sandbox, but I actually kind of like that, because I will have Cursor write my commit messages for me. I'll tell it, commit this change, push it to GitHub, open up a GitHub PR, et cetera, because I don't want to deal with the git command line. But I have noticed that sometimes if I tell it, hey, go fix this thing, sometimes it will go fix the thing and then ask me if it can commit the change. And sometimes it will just commit the change, and it will do that before I've had a chance to review it. So it is actually useful that the pre-commit hook fails in the sandbox, because then it means that before it commits anything, it goes and asks me.
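The dynamic Bryce describes can be seen mechanically: a pre-commit hook that exits non-zero blocks `git commit`, so an agent that can't satisfy the hook inside the sandbox has to stop and ask. A sketch with a stand-in hook that always fails, simulating a check that needs access the sandbox lacks (everything here is illustrative; requires `git` and a POSIX shell):

```python
import pathlib
import subprocess
import tempfile

repo = pathlib.Path(tempfile.mkdtemp())
subprocess.run(["git", "init"], cwd=repo, check=True, capture_output=True)
subprocess.run(["git", "config", "user.email", "demo@example.com"],
               cwd=repo, check=True, capture_output=True)
subprocess.run(["git", "config", "user.name", "demo"],
               cwd=repo, check=True, capture_output=True)

# Stand-in for a hook whose check (network, GPU, etc.) fails in the sandbox:
hook = repo / ".git" / "hooks" / "pre-commit"
hook.write_text("#!/bin/sh\necho 'pre-commit: required access missing' >&2\nexit 1\n")
hook.chmod(0o755)

(repo / "a.txt").write_text("x\n")
subprocess.run(["git", "add", "."], cwd=repo, check=True, capture_output=True)
result = subprocess.run(["git", "commit", "-m", "try"],
                        cwd=repo, capture_output=True)
print(result.returncode != 0)  # True: the hook blocked the commit
```

The non-zero exit from the hook is all it takes; the commit never happens, which is the approval gate Bryce is relying on.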
SPEAKER_00: Yeah, well, at one point, not on my workstation but on my work laptop, I somehow magically, for like a month period, downloaded some Cursor and it just had yellow mode. And I was so fearful that Cursor was gonna force me to update at some point. But this was back when it was Claude 3.5. So it's like, you win some, you lose some, right? What we really need is that with 4.5 Opus right now. 3.5 Sonnet wasn't as great, but still, it was amazing to not have to hit that button. Except for one time it did write the commit message and push, and I was like, whoa, whoa, whoa. I was a little bit scared there. I was like, I did not ask you to do that. I do not know why you think I wanted you to do that. And then I immediately added it to the custom rules, or whatever they're called, and was like, please never git push. Even writing a commit message is confusing, right? Because if it runs git add and git commit, the changes disappear from the diff, and then you're like, what happened? And you have to go git log, and if you don't go git log, or git show, you're a bit confused. Like, did it just yeet all the changes? And then you're looking at the files and you're like, no, the changes are there, and it's like, did it go commit these? Anyway, it happened once to me and it was a very terrifying experience if that's not what you're expecting. I know that there are some people that are just letting the AI do everything. But I do like writing my own very terrible commit messages. They're always just two or three words, but it gives me a little bit of joy running git add, git commit, and git push, which is the only reason I do it now.
SPEAKER_01: So it's funny, I've gone down a slippery slope here. Historically, the commit messages that I've written have been one to two sentences, fairly descriptive, but not a paragraph. And typically what I do is write what section of the code base the change is in. I've been working a lot with tutorials lately, so if I'm changing a particular tutorial, I would put tutorials slash accelerated-python slash the name of the notebook or the stuff I'm working on, then a colon and then a description of the change. And I had a Cursor rule that basically said, when you write a commit message for me, look at the previous commit messages and follow the format that I used, and state it in my voice, basically. Follow the style of what I've been doing. And I did that for a while, and then starting last week, I've been doing a lot of changes to this repo and I started using the auto-run. It would follow that format for the first sentence, but you know how with git commit messages you can have one sentence, then a blank line, and then a paragraph with a longer description? It started writing slightly longer descriptions, and one time I was like, all right, this longer description is actually helpful, and I just approved it. And then, because my Cursor rule told it to look at the previous commits, it became verbose. Instead of telling it exactly what format I want, I told it to just look at the style of my previous commits. But I had approved that one commit written the longer way, and it was writing all my commit messages. So over time it started to evolve, the commit messages have gotten a little bit longer, and now it's totally doing its own thing.
And honestly, I don't look that much at the commit message descriptions it's writing, because they've usually been good. I'm usually not that concerned about whether it's written a good commit message or not. I'm too busy reviewing its code to spend time reviewing its commit messages.
SPEAKER_00: Yeah. Well, I was just thinking you could call that extreme commit message drift, based on the AI models. But we haven't actually talked... so let's now, I mean, we're doing this in the reverse order that we should have, but c'est la vie. Chaos with sprinkles of information. I'll go first and then you go. So what is the update? Actually, before we do that: when did we do the High on AI episodes? I think it was August of 2025. Let's see if I am correct about that. High on AI part one and part two were July 25th and August 1st of 2025. So that's the last time. Obviously it comes up from time to time, but we haven't really done a deep dive being high on AI again. So I'm probably not gonna call it that, or maybe I will. I don't know. Maybe we're gonna reuse this.
SPEAKER_01: I thought you were gonna call it Idea Person.
SPEAKER_00: No, no, no. Because we're gonna split this up, we'll call the next part the Idea Person one. This one, we're just talking about our workflow. So let's go to the... what do they call it? The intelligence index. Artificial Analysis is what it's called. I don't want to resume browsing; just bring me to the website. And actually, I'll share this. Let me move some stuff around. Let us take Bryce's head, move that over here, and now I'm going to share, because we've shared this site before, but we will link it as well. And now we are going to scroll down to the one we want. Why they show 16 different LLMs by default is beyond me, folks.
SPEAKER_01: Korea Telecom is an LLM.
SPEAKER_00: And all we care about is Google, Anthropic, and OpenAI. So right now we are staring at, I showed this in a talk that I gave back in December, the evolution of when they introduced the different models. Right now we're on Gemini 3.1, GPT 5.4, which is the most recently released, haven't taken it for a spin, folks, but I typically don't like the OpenAI models, and then Claude Opus 4.6. Back in July of 2025, we were clearly on Claude 4 Sonnet, Gemini 2.5, and o3 Pro. So back then we were using Claude 4, but it was Sonnet 4, and there's a huge difference. Huge difference, folks. When did they... it doesn't really show when they released... so 4 was good, but 4.5 was the huge jump in the tail end of November. So obviously we loved these tools, were using them daily, and then they released Claude 4.5, Sonnet and Opus, in late November. What was the actual date? It says Sonnet was December 28th, but Opus was November 23rd. And I remember this. And look at how close these were: GPT 5.1, Gemini 3 Pro, and Claude 4.5 Opus. The dots are literally on top of each other. So OpenAI probably released first on the 12th, and then they all dropped their model increases within a couple weeks. It was a massive jump. And the first thing that I ended up doing was basically creating my whole slide deck for that talk that I gave in December. I started by creating the title slide, and then I was like, ah, maybe I'll do the about-me slide, and then I was like, wait a second, I can just keep on asking it to create a new slide. It created this nav.js thing; the whole thing was done in JavaScript. Oh my god, it was beautiful. But that was not taking it to its full potential, folks.
And since then, 4.6 came out just a month ago. And here's one of the big things: I used to use Sonnet on a daily basis because Opus took too long, you couldn't interrupt it as easily, and I felt that working with Sonnet you could actually get further by redirecting it whenever it went off the railway that you wanted it to go down. Now the models are so good, you gotta use the big ones, folks. You gotta use the big ones. They one-shot everything so well, and then you obviously want to design it a little bit differently, but probably up next in the idea episode, we're gonna talk about all the different things. Last episode I mentioned that probably in a weekend, on a weekend day, you could build a podcast app. That podcast app has been built. It is beautiful, it's better than all the other podcast apps out there, and I've been using it. There are still a few... oh yeah, so if folks leave comments on the GitHub discussion, maybe I will find a way to give people access to just the APK. I'm not releasing it yet. I want this thing to be perfect, or at least perfect for my daily use cases. Obviously, there are gonna be edge cases that I don't run into because it's not the way that I use podcasting apps. But still, I use this. Right before I shared it with Bryce, I added a stats page, and that was last Friday. So, recording this on Tuesday morning: since then, I've listened to 20 episodes, 14 hours and 25 minutes, and it's got beautiful stats. But every day I notice something. Like this morning I woke up and my podcasts were not there. And this is a common issue for other podcast apps.
I imagine Apple Podcasts doesn't have this, but because it requires background checking for new updates, a lot of phones, in order to conserve battery, will aggressively shut down apps' abilities to do different things while you're not actively using them. The weird thing is, when I opened it, there's supposed to be some code in there that says, oh, you've opened the app, definitely go and do a check. But I had to shut down the app, reopen it, and then only one of the three new podcasts had downloaded at that point. I had to do the rest manually. So there are still little bugs. 98% of the time it works perfectly, and it works better than Castbox, which I was using before, and other ones, just because I designed it the way I want to use it. Anyways, we're gonna talk about that. We'll talk about ArrayBox.
SPEAKER_01: You did give me access to the repo, right? I was hoping over the weekend to file some issues and fix some things, but I didn't have time.
SPEAKER_00: I didn't add you, but there are two different repos right now. One of them is a public one, so I just said file them on that. That is the one that backs the podgod.ca website. There's another one that's private, which I'm probably gonna keep private for now, because I think this could actually take off, because it is going to be a podcast player for podcast listeners. All these other podcast apps are not for listeners; they're for these companies that are trying to make money. Castbox, the one that I was using, is a VC-backed podcast company that is spamming you with ads all the time. That's why when you're looking at the artwork, it's always rotating between an ad and the actual artwork of the episode, and when you go between episodes and downloads enough times, it shows you a 30-second ad that you have to skip. And then they've got their premium thing, which is, pay money. Guess what, folks? I don't want to make money off of this. I mean, don't get me wrong, I'd be happy to make money off of this, but the same way that ADSP has never had an ad... I'm not gonna say we never will have an ad. If there's a life-changing amount of money, I'm happy to sell out, folks. I'm happy to sell out. You tell me a number, and I'll let you know if that number's big enough. But I don't think anyone's gonna give me a number where I'm gonna be like, oh, subject the thousands of listeners that we have to an ad, because I personally hate ads. That's one of the other issues I ran into the other day. I was trying to skip an ad from the lock screen, and I accidentally hit the tracker to the end, and when my podcasts hit 100%, they automatically get deleted. And that was very unfortunate. But then what did I do? I just went and blocked that ability. Like, how often do you ever use the tracker, the slider, on the lock screen of a podcast in order to skip ahead? You never do that.
The only time you're trying to skip ahead is when you're trying to skip ads, and you use the little skip-30, skip-30. Anyways, we're talking too much about PodGod. But the point is, Claude 4.6 Opus is basically what I use as my daily driver. It's amazing. You have to go try it if you haven't tried it. Every once in a while, if I give it too difficult a task, because I am giving it the world now, it will fail. And after a couple minutes, if I can't get it to get the right answer, I just go into multi-agent mode, where I'll give it Gemini 3.1, you know, 5.3 Codex, or I guess 5.4 Codex now, and then Sonnet and Opus. And whenever you do that, one of them always solves it. So that's where I'm at. What is your daily workflow? We know you're on Cursor. What's the model that you use?
SPEAKER_01: Well, I was just checking. I use the Claude 4.6 Opus high thinking. I use the slowest, best thing, because I don't care how long it takes. I want it to be as good as possible, and I am not paying the bill. I'm using it for work stuff, and they didn't tell me to worry about the costs, so I just use whatever the top thing is, which I think is 4.6 Opus with the high thinking. Is that what you're using too? Presumably.
SPEAKER_00: I believe so. I've gone to the Cursor dashboard, which at one point definitely used to tell you. I mean, there is a usage thing, but the usage just shows you the cost of every single request that you make.
SPEAKER_01: I clicked on analytics on the left and it showed it there. When it showed my ranking, it showed me the model I used most commonly.
SPEAKER_00: And see, mine shows me the activity graph. I mean, you're looking at my own. Scroll down, scroll down, scroll down to the bottom.
SPEAKER_01: Yeah, you see right there it should say Opus high. So you're not using thinking mode as frequently.
SPEAKER_00: Wait, where does it say Opus high?
SPEAKER_01: You see in the list of rankings, it shows where you are in the rankings.
SPEAKER_00: Oh yeah, yeah. That's sad though; they used to actually give you a pie chart. I mean, I do it mostly for... actually, I don't know why. I probably just clicked it. And maybe, actually, I think I have switched back and forth, because thinking does have... well, I don't actually know. That's why I want to see a pie chart, because I'm not sure why I'm not using the thinking. Maybe with those couple problems that I had, thinking would have fixed it for me.
SPEAKER_01: I never really change the model. I just always use thinking. I don't even play around with other models that much. I'm too busy for that. I'm too busy for that shit. I got too much stuff to do. But yeah, I always just use the one that's the slowest and best. But what I do is... have you started using the git worktrees?
SPEAKER_00Yeah, that's easy. You just click a button and it sets it up. And then you need that for the multi-agent mode.
SPEAKER_01Yeah. So the multi-agent mode is the one where you have multiple agents working on different solutions to the same problem, and then it shows you the best one. I haven't used that. Maybe I could try it, but my primary mode of operation is to work in parallel. I'll typically have two or three different tasks that I'm working on at a time, so I'll have two to three different chat windows open. Sometimes four, but usually two to three, because that's roughly the length of the pipeline: I write a prompt for the next thing to go do, and once it starts churning on that, I go to the next window and start feeding in the next prompt. And then I ping-pong back and forth. I've found that two to three is about the number of tasks I can have running concurrently, because each one needs my attention at some point: either to approve something, or because it's finished its response and needs more input from me. And I'll typically use the git worktrees, though I have run into a couple of issues with them. The other thing is I try to find separable tasks, tasks that are not going to have overlap within the code base. That's actually challenging, because typically when you're working on one thing, you'll realize that something else has to be done in the same place in the code. But you don't want to serialize your process. So say I'm working on one file, and I realize some other change needs to be made within that file, in the same place the current agent is working.
If I have another agent in a separate worktree work on that related task in the related code, then I'm probably going to end up with some sort of merge conflict later. And sure, it can resolve the merge conflict. But if I end up with a ton of merge conflicts, it takes time to resolve them, sometimes more time than the tasks themselves. So I typically want to avoid that, and I try to find separate places in the code base where I can work on two completely separate tasks at the same time. And then if I'm working in one place and realize, oh hey, there's this other thing I need to do, I'll buffer up that other task in some way. I feel more like a project manager, because I have these lists of to-do items or action items or GitHub issues or things that need to be done, kept in various places. I was thinking about filing a GitHub issue for each thing, but honestly, that would be too time consuming. That would be too process heavy.
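[Editor's note] The parallel-worktree setup described here can be sketched with plain git commands. This is only a sketch: the branch and directory names are illustrative, and Cursor's worktree button presumably automates something similar rather than exactly this.

```shell
# Start from a throwaway repo so the sketch is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# One worktree per concurrent agent/task: each gets its own branch
# and its own working directory, so parallel edits never touch the
# same checkout.
git worktree add -q -b task-1 "$repo-task-1"
git worktree add -q -b task-2 "$repo-task-2"

git worktree list   # main checkout plus the two task worktrees
```

The merge-conflict concern raised above is exactly why the concurrent tasks should touch disjoint files; once a task's branch is merged, `git worktree remove <path>` cleans up its tree.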
SPEAKER_00It is so funny that you say that, because that used to be my workflow. Anytime I had a project and encountered something that either needed to be fixed or changed, or a feature I wanted to add, I'd usually have a plan to-do list, and then I would create sub-issues that would all link, because it looks nice, and whenever you finish one they turn purple on GitHub. And now I don't create those anymore. I just have a to-do text file, and I put things in there, because you can finish them so quickly. One time I looked at the list and thought, actually, you know what? I just pointed Opus 4.6 at it and said, there's my to-do list. I was in plan mode, and I said, make a plan to do all these things, and as long as they're not too big as tasks, it'll go and create a really nice checkbox list and do them one at a time. I remember one time I had 14 things in a to-do list, where typically I would have created 14 different issues and closed them one at a time, and I think it one-shotted 13 out of 14 of them. You still have to go and test them and verify that it did it. But it's so funny that you're saying you used to use issues and now you just use a to-do list, because that's exactly what I've started realizing too. It's a waste of time to set these things up, because you can fix them and add features so quickly.
SPEAKER_01So you said 14. That's cute.
SPEAKER_00That's cute.
SPEAKER_01Now, admittedly, this one is not in the repo, because this is a project I'm working on with other people, and I needed to share it with them, and we use Google Docs within NVIDIA. Although there are now very good tools for hooking up Google Docs to agents. But this is my list of things. How many items do you think are here?
SPEAKER_00I mean, there are like 10 pages, so definitely in the hundreds.
SPEAKER_01Yeah. So admittedly, this is a backlog. Originally this was my personal to-do list from the last four or five months, of all the various stuff I'd written down. This is for our training material. Basically every time I do the training, I make a bunch of notes, we need to fix this, fix this, fix this, and I write it down. Then last week I took them all, merged them, and organized them a bit. But just this set of things under "completed" is maybe 40 items, and each one is basically a sentence or so. Some are descriptive sentences that say exactly what the change should be, and why. Some are just, you know, "kernel correctness checks". But these are 40 individual things, and they're all things I fixed in the last two days. And that's not even all the things I actually fixed, because there's a set of things I notice while doing other things that I just queue up in Cursor itself. One of the nice things with the Cursor chat is that if it's working on a prompt, you can send a follow-up. The way it works is it has a queue: by default, if you type something while it's working and press enter, it doesn't send it to the agent immediately. It puts it into a little queue, and once the agent has finished its current prompt, it sends the next thing in the queue in the same chat session. But there's also a "send now" button, so you can press that and it will send the message immediately. So if you're watching what it's doing and you see in its thinking trace that it's going off in the wrong direction, you can send it a message immediately, saying, don't do that.
And I do do that every now and then, where I'm like, oh, you're heading in the wrong direction, don't do that, do this instead. But what I will often do is, as I said before, sometimes you'll be working on one particular task and notice there's something else related in the same place. So if I notice that while it's working on something, I'll just put the follow-up in the same chat, and I'll queue up a couple of them. Those to-dos I don't even write down; I've just queued up a series of prompts going out to the agent. I am a little cautious about doing that, because I don't want one chat context that has a bunch of unrelated tasks. When I first started using Cursor last year, around spring, maybe earlier, I would very frequently create a new chat, because as the chats got longer, the performance of Cursor itself got worse. And I also found that it went off on more tangents. So for a new task I would almost always start a new chat, mostly because of a problem where Cursor's UI, in the early days, would freeze up on me all the time. But now I find myself more likely to continue work in an existing chat, because usually there's relevant context there. Say I've told it to do some task, and I want to tell it to extend or fix something related, some function foo or some feature X. If I'm in a new chat, I can't say something vague.
If I'm in a new chat, I can't just tell it, go add... let me give a better example. Let's say I'm modifying some function to add a new parameter. I modify it in one place, and then I realize there's this other function I should add the same parameter to. If I do that in a new chat, I have to tell it: go add a parameter to function foo, just like the parameter that was added to function bar. But if I do it in the existing chat, I can just say: go add the same thing to bar. I don't have to explain more, I can type it out quicker, and I don't have to give it as much context. So I find myself doing that more and more, and I've had fewer issues with it getting confused by large contexts.
SPEAKER_00Yeah, context degradation is not as big a deal anymore. It's funny, I think a lot of people that use these tools daily end up learning the same things, because I have the exact same experience. Unless it's an entirely new thing that's going to require a long conversation, I almost always just keep it in the same chat, because there's always at least 10% overlap with what you were just doing. For me, because I've been working on the podcast player, that's things like building the release APK and launching the emulator. Once a chat window is open and you've done it once, it's so fast. Whereas anytime you open a new chat, you watch it go read the docs and figure out how to do it all over again. Long conversations just don't lead to degradation the way they used to, so what's the point in starting a new one unless there's some good reason?
SPEAKER_01There is one disadvantage, which I've found with Cursor rules that are set to always apply, which is most of mine, because most of my Cursor rules are very short. If you have a long Cursor rule, you want a little header that tells it in what situations the rule applies. But a lot of my rules are very short. My rule for commit messages is just: when you make a git commit, look at the existing commits to understand the format and style used in this repo. It's literally one sentence, so there's no point in having a short description of it at the top; I just set it to be an always-apply rule. And what I found, at least two or three weeks ago, before the latest Cursor updates, is that in longer chats it would just completely ignore the Cursor rules. What I think was happening is that the always-apply rules get included in the initial prompt, but in a long chat, the rule was falling out of the context. I've noticed less of that recently, but the main Cursor rule I had was the git commit message one. I don't know that I have any other Cursor rules that are super impactful. I know everybody talks a lot about skills and Cursor rules and things like that. Honestly, maybe I'm doing something wrong, but I have not found as much of a need for them recently. Maybe it's just the sort of tasks I've been doing. The reason you would need a rule or a skill is if you're having to explain something multiple times to the model.
If you tell it to do something and it does it wrong, you have to tell it again in a more descriptive way. It used to be that whenever I told it to make a git commit, it would write a message I didn't like. Another example: it used to be that when I told it to commit a change, push it, and open a PR, it would push to my forked repo instead of origin, and I like to have my branches on the main repo, not on my fork, for a variety of reasons we can get into some other time. That would drive me nuts. That's something I would create a Cursor rule for, so I don't have to explain every time that I want the branch pushed to the main repo, not to my fork. But I have not found many things I'm asking it to do where I need to explain how to do the thing in some way I find myself repeating a lot. I typically write some description, and honestly, my prompts are not very good. Sometimes I can write something very vague, and that's usually good enough. I don't find myself spending a lot of time writing Cursor rules or skills or anything like that. I haven't felt the need.
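[Editor's note] The one-sentence always-apply rule described above might look roughly like this as a Cursor project rule. This is a sketch based on Cursor's documented convention of storing project rules as `.mdc` files under `.cursor/rules/`, with `alwaysApply: true` in the frontmatter marking a rule that is attached to every chat; the filename is illustrative.

```
# .cursor/rules/git-commit-style.mdc
---
description: Commit message style
alwaysApply: true
---
When you make a git commit, look at the existing commits in this
repository to understand the format and style used, and match them.
```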
SPEAKER_00Yeah, I have zero Cursor rules as of today. Although there should probably be one, which would be: anytime you're using Python, activate the local uv venv.
SPEAKER_01I do have that one, yeah.
SPEAKER_00Because I constantly am typing that. But I don't know, I like my zero-Cursor-rules workflow, because half the time now the venv I want is actually in some other folder. It used to be painful, if you're in the current repo, to navigate somewhere else where I've installed multiple gigabytes of CUDA modules for Python. But now I'll just say, go get the venv from that folder, and it takes only the time it takes to type that folder. Most of them are just sitting in home, so it's just tilde slash the name of the folder. I don't have to do all the cd-ing and source .venv/bin/activate, whatever the command is. And that's the funny thing too. But maybe we should, since we're at the 46-minute mark, end episode one here and start episode two. Be sure to check the show notes, either in your podcast app or at adspthepodcast.com, for links to anything we mentioned in today's episode, as well as a link to a GitHub discussion where you can leave thoughts, comments, and questions. Thanks for listening. We hope you enjoyed, and have a great day.
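[Editor's note] Activating a venv that lives in another folder really is one `source` line, no cd-ing required. A minimal sketch, assuming a stdlib `venv` in a temp directory for self-containment; with uv the environment would instead be created via `uv venv`, and in practice the folder would sit under `~/` as described above ("cuda-sandbox" is a made-up example name).

```shell
# Create an environment somewhere outside the current repo
# (a temp dir here; "$HOME/cuda-sandbox" in practice).
envdir=$(mktemp -d)/cuda-sandbox
python3 -m venv --without-pip "$envdir/.venv"

# Activating it from anywhere is a single line.
. "$envdir/.venv/bin/activate"
python -c 'import sys; print(sys.prefix)'   # prints the venv's path
```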
SPEAKER_01Low quality, high quality. That is the tagline of our podcast.
SPEAKER_00It's not the tagline. Our tagline is chaos with sprinkles of information.