#175 - Charlie Marsh on Ruff, uv and designing fast + ergonomic Python tooling

Pybites Podcast

The Pybites Podcast is a podcast about Python Development, Career and Mindset skills.

Hosted by the Co-Founders, Bob Belderbos and Julian Sequeira, this podcast is for anyone interested in Python and looking for tips, tricks and concepts related to Career + Mindset.

For more information on Pybites, visit us at https://pybit.es and connect with us on LinkedIn:

Julian: https://www.linkedin.com/in/juliansequeira/
Bob: https://www.linkedin.com/in/bbelderbos/


November 06, 2024 • Julian Sequeira & Bob Belderbos

Who isn’t using Ruff for its blazing speed? Who hasn’t yet tried uv to make project management seamless and fast?

What goes into building these tools, managing their increasing popularity + community of adopters?

Hear from the creator himself, Charlie Marsh, as he shares insights on designing fast, ergonomic Python tooling that elevates the developer experience. 😍 📈

Chapters:
00:00 Intro
01:45 Charlie's background
03:32 2 reasons to work on new Python tooling
07:10 Inspiration from Rust / Cargo
11:00 Thinking about software design (uv)
15:00 uv's two use cases (low vs high level)
17:15 Balancing feedback vs roadmap while being nice
23:00 How shipping evolved
24:28 Managing open source + quality / testing tooling
32:31 Pybites coaching ad segment
32:57 Astral's vision / what's coming (type checking 🎉)
37:50 Support Conda? uv can be embedded
39:53 What helped you to learn Rust (build!)
45:25 Book tip, CTA and how to reach out
49:12 Wrap / outro

Reach out to Charlie on X or LinkedIn.

Join our Python developer community

Take your Python dev skills to the next level? Join our coaching program

SPEAKER_01: 0:00

In terms of automating the open source management, I would say we have almost no tooling on that, around contributors and issues and the contributor experience. We triage everything, you know, ourselves. And it is a lot to balance on the team because we're both trying to push the company and the project forward and build totally new things, and maintain and continue to improve these things that exist today. Right. So Ruff is out there. We want Ruff to keep getting better, but we're also investing a lot of time in what we think is our next big release. And that's probably pretty separate from the day-to-day maintenance of Ruff. Again, it's really a question of prioritization and basically going to people on the team. We say, hey, Ruff is absolutely critical. I think it makes sense for you to be spending like 40% of your time.

SPEAKER_00: 0:44

Hello and welcome to the Pybites podcast, where we talk about Python, career and mindset. We're your hosts. I'm Julian Sequeira.

SPEAKER_03: 0:51

And I am Bob Belderbos. If you're looking to improve your Python, your career, and learn the mindset for success, this is the podcast for you. Let's get started. Hello and welcome back everybody to the Pybites podcast. This is Bob Belderbos and I have a very special guest with me here today. It's Charlie Marsh. Charlie, welcome to the show. How are you doing?

SPEAKER_01: 1:12

Yeah, thanks so much for having me. I'm doing great. I'm very excited for this. Are we recording? Yes, we are.

SPEAKER_03: 1:18

Glad to have you on. Very excited about the work you do at Astral and with Ruff and uv. And from a personal user perspective, I can really say that these tools are having a big impact. So I want to talk about that. I want to talk about how you develop these things, Astral's vision, balancing open source life and all that good stuff. But maybe before we dive into it, in case people don't know yet, I mean, I think most people do, but can you give a short introduction? Yeah.

SPEAKER_01: 1:51

Yeah, of course. So my name is Charlie. I've been building Python tooling full-time for almost two years now. Before that, I've really been a software engineer my whole career, writing Python but also a lot of other languages. So I started my career at Khan Academy, which is an education technology company. And then I worked at a computational biology company for a few years where I was in charge of a lot of the data infrastructure, machine learning infrastructure, software infrastructure. So I was writing a lot of Python, and some of the experiences I had there, being part of a small team that was trying to get a lot done, and just some of the friction that I had with the tooling, and a lot of the innovation and tooling I was seeing in other ecosystems, led me to start working on this stuff. So, yeah, I released Ruff probably about two years ago, and that's a Python linter and code formatter. So it'll kind of look at your code, try to identify problems with it, and maybe fix them automatically if it can. And then we released uv in February. So that's a bit more recent. That's a package manager. So if you've used tools like pip in the past, it's a way to manage your Python dependencies and Python environments. So yeah, now I work on this stuff full-time as part of a company. That company's called Astral. We're a team of eight people, fully distributed, fully remote, just right now building as much good Python open source tooling as we can. That's awesome.

SPEAKER_03: 3:31

So yeah, it's clearly like you're now the Python and Rust tools guy in the Python space, right? So it seems that some of these frictions came from previous work experience. And because, you know, there were good tools, there are good tools, Black and Flake8 and all that. So what drove you? What were those frictions? Was that speed or was it maybe that there were too many tools? What motivated you to really branch into this?

SPEAKER_01: 4:00

Yeah, totally. I think there were two primary issues that I was having with Python tooling. And, you know, obviously I'll talk about what they are. I think the first one I sort of set out to solve when I started building Ruff, and the other I think we've made a lot of progress on almost by accident. So those two things were, one, performance, and two, what I would call fragmentation. So the performance was: we were a team of eight, maybe at most 10 people working on a big machine learning code base. And I just felt like it was taking a lot of time to run our tools, even on a project of our scale, which was medium large. And I also felt like I was having to invest a lot of time in the tooling itself in order to get it working for the team. But really, it was performance that first inspired me to look at writing Python tooling in Rust. And I was sort of seeing this happen in other ecosystems, like in the JavaScript ecosystem in particular, or the web ecosystem, whatever you would want to call it. They just had a lot more of these local, machine-centric performance problems over time because they have all these steps that they run on their machine. They have transpilation, minification, bundling, all these things that happen when you run a JavaScript application. And applications were getting bigger and bigger and bigger. And so these problems were becoming more and more a part of the friction in the development workflow. And that led to a lot of JavaScript tooling being written in not JavaScript. And in Python, we don't have transpilation or minification or bundling or those problems. We do have other things that we're running on our machine all the time, whether it's a linter or a formatter or a type checker or a language server like Pylance.
And so what I was wondering was, basically, what would the trade-offs be if we started to build some of that tooling in Rust? How much faster would it be? What would the downsides be? How hard would it be? All those questions. And that's really why I started working on Ruff. It was sort of built to try and answer that question of, could this work and what would the trade-offs be? And the first prototype was really aimed at that. It didn't really support very many rules, but it could parse and traverse Python source code. It could do some analysis. It could find unused imports. And the goal wasn't to implement 1,000 rules on day one. It was to say, well, is this possible? And what would it feel like? And what would the trade-offs be? So yeah, the first problem was really performance. And that's kind of what I set out to explore. And then the second was, and I don't think you notice this as much unless you work in a different ecosystem where this isn't the case, but in Python, you just needed a lot of different tools. You need to find ways to glue together a lot of different tools in order to get things done. And I cite Rust very heavily in terms of the influence on what we're building, but in Rust, tooling is a lot more streamlined. They're super different ecosystems. Rust has tremendous second mover advantage compared to a lot of other programming languages that are around today in that they can look back at all these decisions that have been made and make slightly different decisions and invent an ecosystem from scratch, which is a huge advantage, you know, a huge benefit that a lot of other ecosystems don't have.
But one thing they did is they have a very centralized and consolidated tool chain. So they have a tool called Cargo, and basically anything you do in Rust, you do it through Cargo. And if you clone a Rust repository, you kind of already know how to work with it because you just know it's using Cargo, and you're very confident that you can get it working because you know the commands, and the package manager is just very well done. So that's kind of another part of the experience that we're trying to build towards. The work we do is not just about performance. It's also about a lot of the user experience that comes from trying to build what I would consider a more unified tool chain. So, you know, a package manager that understands everything from how you install Python to how you create your virtual environments, what your dependencies are, how you actually run Python code. If you build something that understands all those layers together, I think you can both build a tool that solves more problems for users and hopefully build something that's easier for users to learn and understand and master. So those are really the two things that I think about a lot. Performance is more or less the flagship feature of what we build, but we're not just trying to build things that are about performance. We're also trying to build things that I hope solve other problems that people have when they're working in Python around just the complexity of learning a bunch of different tools and figuring out how to get them to work together. Yeah.

SPEAKER_03: 9:15

Developer ergonomics, right? Definitely has been a challenge. Yeah, exactly. In Python and... Yeah, I was learning Rust for a month and working with Cargo has been awesome. It really is like one tool and it works really well. So if that's the vision, then I'm all for it.

SPEAKER_01: 9:33

Yeah, yeah. And there are downsides to that too. I think, you know, in Python, the fact that there are lots of different tools leads to more exploration, probably. And you have a bunch of different package managers and they do slightly different things and we can learn from those experiences. But a lot of that is also because the standards have evolved a lot in Python too and have enabled us to build things that are very compatible with other tools, effectively. So we've been able to build tools that follow the standards, and they're easier for people to kind of swap in and out with other tools. So, you know, we're trying to build something that works well in an ecosystem and also that can be adopted piece by piece. So for example, Ruff has a formatter and a linter. You can just use the linter and use Black for your formatter. Or you can just use the formatter and use a bunch of other stuff for linting. So we're trying to both build something that works better together, but can be slotted into the way you work today or used alongside other tools, which is very hard. It's hard to do both those things at once, but that is what we want to be doing.

SPEAKER_03: 10:58

Yeah, which brings me to the next question. How do you think about software design and managing complexity? Because you said on another podcast that with the uv CLI you left some space there. You didn't do uv install, but you made uv pip and then another uv subcommand. So kind of namespaces, basically. So yeah, it seems like a really well designed tool. But usually in software design, it's messy, right? You don't have the full picture, unless you have the Cargo vision, maybe that does give you structure. But overall, how do you think about design and extending systems, making them easy to use and extensible? Because that's often really a challenge.

SPEAKER_01: 11:45

Yeah. So, I mean, I think one thing I value a lot is I want to make things that are easy to adopt. And I think about that a lot. And that is in large part why, you know, I'll maybe talk a little bit more about the way we structured the uv CLI, because I think it helps illustrate that we were trying to thread a very fine needle. And, you know, we'll see if we succeed. When we first shipped uv, the commands it supported were like uv pip install, uv pip uninstall. And a lot of people were like, hey, why can't this just be uv install? It would be so much easier. I don't want to type out the pip. And I'm very empathetic to that because it is annoying. But we did it for a very specific reason, which is we wanted to have a very different interface that wasn't coupled to how pip works today. So, you know, more recently we did this big uv 0.3 launch and it has commands like uv sync, uv lock, uv run. It has these top-level commands. And so from our perspective, those are kind of the first-class uv APIs that are coupled to a more opinionated workflow. If you want to use those, you probably have to change how you work. You have to fit into some of the constraints that we're gonna give you. And then in return, you get a bunch of benefits. It should just work, quote unquote, right? You get better reproducibility. We'll do all these things that hopefully lead to a better workflow, but you have to actually change how you work. So the thing is, we wanted to build something that's very easy to adopt, and having people change their workflow, that's actually asking quite a lot. It's what we want to do in the long run. We want people to be working in a different way that we think will lead to a better developer experience. But it's hard to just get people to adopt something that requires them to drastically change their workflow.
So from our perspective, we were like, well, why don't we give people something that they can more or less just drop into their workflow? And then we'll build the thing that requires them to change their workflow and make that separate and something that you can transition to over time as you come to learn about what we're building and maybe even use it in this uv pip form. And so the benefits there were, one, when we launched just the uv pip interface, the amount of growth we saw from February or March, whenever we launched, I can't remember, to August, which is when we launched the new set of APIs, I think. So in that six-month period, we just saw tremendous growth in uv. So a lot of people were using it. A lot of people were hopefully liking it. It also gave us a ton of testing. So we just made the core internals way, way better over that period of six months. And people could basically just drop it into their workflows without having to change much. Probably they had to change some things, but ideally not much. So from our perspective, we want to give people things that are ideally very easy to adopt, but also provide a path to actually changing their workflows, because often to solve some of the core problems, people do have to change their workflows. And so in that example, from my perspective, pip is actually a pretty low-level tool. When you use pip, you're kind of manipulating a virtual environment by hand almost. You're like, install this one package, uninstall this other package. And by uninstalling that other package, you might have actually left things in kind of a broken state. Like with pip, maybe you uninstall the package that someone else needs. You probably shouldn't be able to do those things. So the more opinionated workflow is: tell us what you want the state of the environment to be.
And then we'll just make sure that everything you do matches that state. So, you know, overall, I would say we try to have a really big focus on building things that are easy to adopt, because I think if you build something that's very easy to adopt and has a good value proposition, like, hey, it's way faster, for example, that's very attractive to people and it helps grow the software. It helps provide immediate value to people, but ultimately we want to be solving these other problems. So we have to find a way to go from "we gave you the uv pip interface" to "we're giving you this other thing." And that's very much a part of what we build with Ruff too. The formatter that we shipped, we really focused on trying to ship a Black-compatible style. And there were a lot of reasons for that. We think Black has a good code style, that's one of them. We think it's beneficial that the ecosystem has more rather than less convergence on what code style looks like. But also, we wanted to make it easy for people to adopt. And if you give people a formatter that's dramatically different from what they do today, that's going to be a lot harder. On the other hand, if you give them something that's a similar code style but solves other problems, it's much faster and it's the same tool as your linter, so you can learn one less tool, install one less tool, that to me is a really good value proposition. So I would say I just focus a lot on trying to be, I don't know, ruthless is maybe not a good word, but just being ruthless about, how can I make this easy for people to use? And how can we help them go from point A to point B?
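Editor's note: the contrast Charlie draws between editing an environment by hand and declaring the state you want can be illustrated with a small sketch. This is not uv's implementation (uv is written in Rust and works from real package metadata). The names `Plan`, `plan_sync` and `apply_plan` are hypothetical, and the environment is modeled as a plain dictionary of package names to versions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Changes needed to make an environment match the declared state."""
    install: dict[str, str]  # package name -> version to install or replace
    uninstall: set[str]      # package names to remove


def plan_sync(installed: dict[str, str], desired: dict[str, str]) -> Plan:
    """Compare what is installed with what was declared and return the difference."""
    install = {
        name: version
        for name, version in desired.items()
        if installed.get(name) != version
    }
    uninstall = {name for name in installed if name not in desired}
    return Plan(install=install, uninstall=uninstall)


def apply_plan(installed: dict[str, str], plan: Plan) -> dict[str, str]:
    """Return the environment after applying the plan (pure function, no I/O)."""
    result = {n: v for n, v in installed.items() if n not in plan.uninstall}
    result.update(plan.install)
    return result


# Hypothetical environment: one outdated package and one stray leftover.
installed = {"requests": "2.31.0", "urllib3": "1.26.0", "leftover": "0.1"}
desired = {"requests": "2.31.0", "urllib3": "2.2.0"}

plan = plan_sync(installed, desired)
print(plan.install)    # {'urllib3': '2.2.0'}
print(plan.uninstall)  # {'leftover'}
assert apply_plan(installed, plan) == desired
```

The point of the sketch is that the user never issues individual install or uninstall commands. They only state the target, and the tool derives the steps, so the environment cannot drift into a state nobody asked for.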

SPEAKER_03: 17:12

Yeah, that's fascinating. And a fine line, right? Especially in packaging, there are so many opinions and different ways of doing things. And then to come up with one default while still being backward compatible in a sense, and partly incorporating a lot of feedback and opinions and stuff.

UNKNOWN: 17:32

Yeah.

SPEAKER_03: 17:33

Like, it seems like you guys have a pretty clear vision, but what we might not see is that there's a lot of feedback that doesn't make it or doesn't get implemented. So how do you manage, like, having a vision? And because these are pretty high-profile GitHub projects, right? So I can imagine you get hundreds and hundreds of requests. Like, how do you balance that?

SPEAKER_01: 17:52

Yeah, yeah. We get, like, hundreds of notifications a day. It's just... Yeah, it's constant, right? It's, like, 24-7. And... Do you even sleep? I do sleep. I mean, look, we're lucky in that we get to work on this stuff full time. So that helps a lot. I mean, it's still honestly pretty overwhelming, just the amount of engagement that there is. And, you know, some days you wake up and you have a list of things you want to do. But there's like 20 new issues, and two or three of them are actually pretty critical, and suddenly your day is totally different than what you expected. Sometimes people open an issue and they're really rude or really mean or really demanding, and you know, you've got to try and figure out how to approach that from a perspective of empathy, I would say. We can talk more about the open source dynamics and how to manage that, but I think the real question is, how do we say no to stuff? Right. And how do we figure out what not to do? I mean, it's very hard. Generally there are sort of different archetypes of interactions. Like one is, in Ruff, just as an example, people will often ask for us to implement a rule, you know, a new lint rule. And we might say, hey, we'd love to see evidence that this is useful to more people than just you. We understand what the value is, but we'd like to see more evidence that it's useful. So could you show us other projects that would benefit from it? Maybe can other people chime in if they would find this useful, et cetera, et cetera. I think in uv, or similarly in Ruff, often people will ask, could we make this a setting? Could we add a setting for this, blah, blah, blah.
And it is very reasonable, but every setting you add, I mean, you should have some settings, but every setting you add increases the maintenance cost, and it increases the cognitive burden for the user. So when people ask for a setting, my first reaction is typically, how can we just make the rule better to satisfy what you need? Sometimes the reaction is you may just have to ask the user to do something different and maybe explain why. And when you do have to do that, when you do have to ask them to do something different and say, hey, actually, we don't think the rule should behave this way for this reason, I think the key thing is you want to show that you actually are listening to what they're saying, and you respect what they're asking for, and you explain why you can't do it. And honestly, that's pretty taxing. That takes way more time. It takes way more time to write a good, thoughtful response to a user request saying no than it does to just close the issue or to just say, no, we're not going to do that. And it's hard to keep that up over time. So for us, too, a lot of it is about encoding our values as a team. And one of the things that we've decided we value is we want people to have a really good experience when they come to the repo, even if we're going to say no to what they asked for. And that has trade-offs. It means we get less done in other dimensions because we're going to invest more time in it. But it's a matter of, what do we value? We want to build a community where the vibes on the repository are good. I know it sounds silly, but that's what we want. And so even when people come and we say no, they will probably be disappointed, but they should have a good experience. So yeah.
Anyway, I think it's both about how you make decisions and then how you communicate those decisions and make them alongside the people who are asking for them. And it's hard. When I first started the project, I said yes to everything because I had no users, and people would show up and I'd be like, this is amazing. These people care about what I'm doing. That's so cool. Yeah, of course I'm going to do whatever you're asking for. And, you know, there are probably features we have in Ruff today because of that. Right. And you still have to support those. So, you know, it's also about the mentality shift of the project over time. You go from this initial phase of you'll do anything because there are so few people, to everything we ship now, we basically have to be saying we're comfortable supporting this for the next decade, the next few decades. Right. So it changes with the maturity of the project. But it is kind of a constant struggle. And I'm sure we make some wrong decisions too. But it is something that we've decided we just care a lot about: we want to try and make good, responsible decisions. And we want people to understand why we make those decisions.

SPEAKER_03: 22:34

Yeah, that's really interesting how that has changed. Like, indeed, when you started, you want all the user feedback you can get, and you might be lenient with features, but now it has grown 10,000x, and that's pretty overwhelming, right? Just the amount of interactions, and then also taking the time to respond, and that must be a hard balance. It is, yeah.

SPEAKER_01: 22:57

I mean, that dynamic is actually similar to shipping. Early on, when it was just me working on the project and I didn't really have many users, I was doing multiple releases a day and I would just ship breaking changes and, blah, blah, blah. And then eventually we got to the point that as soon as we cut a release, we'd get three or four issues. And it would be people saying, you broke this, or this doesn't work, or blah, blah, blah. And I realized at that point in time, the way that we were shipping the project no longer made sense for the maturity and scale of the use. We were still on version 0.0.200 or something, but real projects and real companies were using it. So again, what makes sense for one phase of the project doesn't make sense for another. And we had to adapt. We added a versioning policy. We spent a lot of time thinking about what that looks like. We cut our release cadence down. I mean, it's still once a week, which is pretty fast, or maybe somewhat overwhelming, but, you know, every release has things people want and we want to get those out to people. But again, it's about understanding the life cycle of the project and what makes sense at different points in time. And a lot of that's learning for me because I'm not a career open source maintainer by any means. All these progressions that the project is going through are also progressions that I'm going through personally.

SPEAKER_03: 24:24

Yeah. Tell me a bit about the open source aspect. How do you manage that whole dynamic with contributions? And actually, another question I had, like the software, of course, grows more complex over time. What tooling have you put in place to guarantee the quality? I mean, I guess unit testing, end-to-end testing, maybe you have manual testing? I don't know. So these two aspects, the growing complexity of the project, how have you automated that in a sense, and how do you manage the whole collaboration of different people working on a growing code base?

SPEAKER_01: 24:57

Yeah, yeah, totally. So in terms of automating the open source management, I would say we have almost no tooling around that, around contributors and issues and the contributor experience. We triage everything ourselves. And it is a lot to balance on the team because we're both trying to push the company and the project forward and build totally new things, and maintain and continue to improve these things that exist today. Right. So Ruff is out there. We want Ruff to keep getting better, but we're also investing a lot of time in what we think is our next big release. And that's probably pretty separate from the day-to-day maintenance of Ruff. So again, it's really a question of prioritization and basically going to people on the team. We say, hey, Ruff is absolutely critical. I think it makes sense for you to be spending like 40% of your time just on maintaining Ruff. And maintaining Ruff means fixing bugs, but also responding to user questions, reviewing contributor pull requests, making sure they get merged, all that kind of stuff. So there's not really a better answer than we just put a lot of time into it. And we set goals each month around what we want. We have some subjective goals. Sometimes we'll set a goal that's like, we want the project and the repository to be in as good a place at the end of the month as it is today. And what that means is typically the number of issues probably will still go up because that is just the arc of the universe. Issues are just going to go up even if we close a bunch of them. But maybe the number of open contributor pull requests goes down. Maybe people on the team subjectively don't feel overwhelmed by the amount of maintenance that they've done. So we use some sort of quantitative ways of thinking about it and some qualitative ways of thinking about it.
We make it a very explicit goal that we want to be good stewards of the project. The testing component, we've invested a lot more in. So in Ruff in particular, shipping changes is really high confidence now, partly because we have a really big test suite in Ruff itself, and it's all based on snapshot testing. So it's effectively like we run a command in Ruff and then the test snapshot is the output of the CLI. It's not exactly that, there are some modifications, but the idea is basically, to run a test, you give us a Python file, and then you tell us what rule you're testing. We run Ruff over that and it spits out a bunch of output about what diagnostics it saw, what the auto fixes were. And so if you change something in Ruff, we run all those tests and then you see all the diffs in the snapshots. And that alone is very high confidence. The problem is Python is such a diverse ecosystem and a language that you can use in so many different ways that no amount of snapshot testing is actually going to cover all the things we can get from the linter and the formatter. So we built this ecosystem continuous integration component. So every time you have a pull request to Ruff, we have a list of, I don't know how many it is now, maybe like 50 projects. And we actually clone those projects and we run Ruff over all of those projects. And we post back on the PR diffs of the changes. So if you make a formatter change, you'll actually see the diffs of the before and after of the code style in all these projects. And we basically couldn't build Ruff without this today because we would just have no visibility into how users are actually being affected by changes. And now if we add a new lint rule, we can see all the violations in, I don't know what projects we have in there right now, like Zulip and Apache Airflow. And then we have some smaller projects.
We have some projects that are more like web applications, some that are like data science. We basically try and have a representative sample of all Python code out there. That's the idea. So if you're trying to fix a bug in the formatter or you're trying to refactor in the formatter, you want to see no ecosystem changes. And if you see no ecosystem changes, you can be pretty confident in that. So we've invested a lot of tooling there. In uv, it's been a lot harder to do that kind of ecosystem integration. And part of that is because Ruff is very easy. You give it a bunch of Python source code and we just analyze it. In uv, what is a project's packaging setup? Every project is structured a little differently. It might have slightly different rules around how it puts things together. How do you make sure that the environment you create actually works? Every project has a slightly different way of running their tests or importing their code. So it's been a lot harder to build that standardization. It's actually something that uv could help with in the long run. If projects were more standardized about how they were built and run, we could actually do more wide-scale testing of package management. But there again, it's a lot of snapshot testing. And then, I know I'm talking about this a lot, but I think it's very interesting. There is one piece of infrastructure we built there that I think is very, very cool, which is we have a tool called packse for creating packaging scenarios. And the idea there is that often you want to test, not obscure, but very specific scenarios in dependency resolution. Let's say you wanted to test a package that's only ever been published as pre-releases. So if you want to test that on PyPI, you have to go find a package that has only ever been published as a pre-release and, blah, blah, blah.
We have a tool where you basically define the dependency graph that you want in JSON, and then we create packages that model that dependency graph. And so we can write tests for extremely obscure scenarios, like you have a package that depends on a package that was only ever published as a pre-release. Should that be allowed? How should it resolve? So we have built a lot of infrastructure there around creating dedicated scenarios for packaging. And that's actually a tool that other package managers could use. There's nothing about it that's specific to uv. Ultimately it publishes those packages to a registry and you just test against the registry. But again, it would be very, very hard for us to ship uv without the amount of test coverage we have. So yes, we invest a lot into those things. I'd actually like to invest more, because at this point we're just in a lot of projects and a lot of companies. And so if we ship something bad, I mean, we find out about it pretty quickly, typically. But yeah, the more coverage we can have for those kinds of scenarios, the better.
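
The scenario idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the JSON layout, the function names, and the "allow pre-releases only if nothing stable exists" policy are all invented for this sketch. It is not the real scenario schema and not how uv resolves dependencies.

```python
import json
import re

# Hypothetical scenario format, invented for this sketch (not a real tool's schema):
# the root needs "a", and "a" needs "b", which has only ever published pre-releases.
SCENARIO = json.loads("""
{
  "name": "transitive-prerelease-only",
  "root": {"requires": ["a"]},
  "packages": {
    "a": {"versions": {"1.0.0": {"requires": ["b"]}}},
    "b": {"versions": {"1.0.0a1": {"requires": []}, "1.0.0b1": {"requires": []}}}
  }
}
""")

VERSION = re.compile(r"^(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?$")
PRE_RANK = {"a": 0, "b": 1, "rc": 2}


def sort_key(version):
    """Order versions so that 1.0.0a1 < 1.0.0b1 < 1.0.0 (simplified, not full PEP 440)."""
    release, pre, num = VERSION.match(version).groups()
    parts = tuple(int(p) for p in release.split("."))
    return (parts, PRE_RANK.get(pre, 3), int(num or 0))


def is_prerelease(version):
    return VERSION.match(version).group(2) is not None


def pick_version(versions, allow_prerelease_if_necessary):
    """Prefer the newest stable version; optionally fall back to a pre-release."""
    stable = [v for v in versions if not is_prerelease(v)]
    if stable:
        return max(stable, key=sort_key)
    if allow_prerelease_if_necessary and versions:
        return max(versions, key=sort_key)
    return None


def resolve(scenario, allow_prerelease_if_necessary):
    """Pin one version per package reachable from the root, or return None on failure."""
    packages = scenario["packages"]
    pinned = {}
    todo = list(scenario["root"]["requires"])
    while todo:
        name = todo.pop()
        if name in pinned:
            continue
        versions = packages[name]["versions"]
        choice = pick_version(list(versions), allow_prerelease_if_necessary)
        if choice is None:
            return None
        pinned[name] = choice
        todo.extend(versions[choice]["requires"])
    return pinned


print(resolve(SCENARIO, allow_prerelease_if_necessary=True))
print(resolve(SCENARIO, allow_prerelease_if_necessary=False))
```

The point of writing the scenario as data is that the "should that be allowed?" question becomes a single test case with an explicit expected answer, instead of a hunt for a real package on PyPI that happens to have the right history.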

SPEAKER_03: 31:59

Yeah. The stakes are high. So maybe share a bit about what Astral's vision is. You have Ruff and uv, of course, a lot of development there and plenty to do.

UNKNOWN: 32:14

Uh,

SPEAKER_03: 32:14

Is the strategy to grow out these projects to get uv closer to Cargo, which might mean having test integration and package uploading, maybe? Or are there some completely new projects coming as well?

In just 12 weeks, Pybites elevates you from Python coder to confident developer. Build real-world applications, enhance your portfolio, earn a professional certification showcasing tangible skills and unlock career opportunities you might not even imagine right now. Apply now at pybit.es.

SPEAKER_02: 32:53

Yeah,

SPEAKER_01: 32:57

I mean, it's a mix of both. I kind of think of the tooling right now as having two major parts, right? We have Ruff and uv. And when I think about Ruff, I think about this world of static analysis tooling. So that's the linter and the formatter. And there we have about half the team working on that right now. And a lot of the new development there is focused on trying to build much smarter and more powerful infrastructure for type inference. So if you think about the three legs of static analysis tooling, and I'm using that term very loosely, I think most people would say they have a linter, they have a formatter, and they have a type checker. And so the type checker is the thing that we're really missing. And it's also a thing that's kind of missing from the linter. Imagine if the linter had access to all this type information about your code. It could enforce all sorts of interesting rules that are very hard to build today. Maybe you have a rule that's specific to dictionaries and it should only be enforced on dictionaries. Well, it'd be nice if we could know if a given object is a dictionary. In Ruff, we tend to make guesses at that. We do some very basic analysis, but imagine if it was fully integrated with a type checker. The linter itself would actually get much better, in addition to being able to do type analysis. So yeah, I would say that broad category of improvements is where we're spending most of our time right now in Ruff: we're trying to build a much better type inference and semantic analysis system, because we think it will enable us to build a type checker, but also a bunch of other more powerful improvements to the existing tools. So that's a big piece: we're just trying to build that kind of infrastructure.
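
To make the dictionary example concrete, here is a minimal sketch of a lint rule that has to guess at types. It uses Python's standard `ast` module rather than anything from Ruff, and the rule itself (flagging `key in d.keys()`), the helper names, and the guessing heuristic are assumptions made for illustration. The heuristic only trusts names that are assigned a dict literal, a dict comprehension, or a `dict(...)` call in the same file, so anything else goes unchecked.

```python
import ast

SOURCE = '''
config = {"debug": True}
rows = load_rows()
if "debug" in config.keys():
    pass
if "id" in rows.keys():
    pass
'''


def names_bound_to_dicts(tree):
    """Guess which names are dictionaries from simple, same-file assignments."""
    known = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and len(node.targets) == 1:
            target, value = node.targets[0], node.value
            is_literal = isinstance(value, (ast.Dict, ast.DictComp))
            is_dict_call = (
                isinstance(value, ast.Call)
                and isinstance(value.func, ast.Name)
                and value.func.id == "dict"
            )
            if isinstance(target, ast.Name) and (is_literal or is_dict_call):
                known.add(target.id)
    return known


def find_in_keys(source):
    """Report (line, name) for `x in name.keys()` where name is a guessed dict."""
    tree = ast.parse(source)
    dict_names = names_bound_to_dicts(tree)
    hits = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Compare):
            continue
        for op, right in zip(node.ops, node.comparators):
            if (
                isinstance(op, ast.In)
                and isinstance(right, ast.Call)
                and not right.args
                and isinstance(right.func, ast.Attribute)
                and right.func.attr == "keys"
                and isinstance(right.func.value, ast.Name)
                and right.func.value.id in dict_names
            ):
                hits.append((node.lineno, right.func.value.id))
    return sorted(hits)


print(find_in_keys(SOURCE))
```

Only `config` is reported. `rows` may well be a dictionary too, but without knowing what `load_rows()` returns the rule cannot tell, which is the gap that real type inference would close.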
And I guess I'm not trying to promise that we're going to ship any specific products on any specific timeline, but that's what we're working on. And that's all happening publicly. That's all in the repo. I don't know if it's a code name, it's not secret, but right now we're calling it red-knot. And so if you go to Ruff and you look for red-knot, you'll see that we're building a type checker. And that'll ship when it ships. But that's a lot of the focus right now on that side. And uv, I would say the past month or so has really just been reacting to user feedback. We did a big release in August, and every time we do that, we just get a lot of user feedback. And there's a lot of reacting to what people want. We've tried to figure out: where did we miss? What are the things that we didn't do well? What are the things that are working? What are just actual bugs? And from there, I mean, we're definitely thinking about the Cargo model. There are pieces of that that we're missing, and there's something we actually have to make decisions on: do we want to have a fully integrated linter, formatter, or test runner? Cargo ships with those things. It has cargo fmt, and that actually exposes a separate tool called rustfmt. That is not necessarily part of Cargo, but it is exposed by Cargo. And so in theory, we could have uv format, uv lint, uv test, right? We may do those things. We may not. I wouldn't say we have a strong answer on that, but that's definitely one direction we could consider taking things. That's sort of what Rye does. Rye is maybe slightly more experimental and opinionated about that stuff. So Rye actually does expose rye fmt, which is Ruff, and rye test, which uses pytest.
And so we could think about building some of those more first-class things. I think the other part is that there are a lot of very hard problems in packaging that we haven't been able to solve yet and that I would like to make easier for users. For example, we get a lot of issues about using PyTorch with uv and doing anything related to GPUs, especially if you want to be able to have mixed GPU and CPU setups. If you haven't worked with this stuff, great. If you have worked with it, you probably understand some of the kinds of pain points that I'm describing. So I wouldn't say that we have a solution to that problem, but there are actually a lot of problems in packaging that I think we haven't been able to help with as much as we'd want. And so we're thinking about how we can advance those, whether it's through standards, or through building our own things, or through building our own things that somehow become standards, maybe, very optimistically. There are still, I think, some big problems in packaging that we want to be able to solve.

SPEAKER_03: 37:48

Awesome. What about other initiatives like Pixi? I think that tool focuses more on the Conda part. Is that a part you're also looking at?

SPEAKER_01: 38:00

The Conda ecosystem specifically? Yeah, people ask us about it quite a bit. I would say it's something where we need to figure out what we want our answer to be. Do we want to support Conda? We could, but I would want to see pretty strong user demand for it, because I think part of the strength of uv right now is that it's very focused on what I would call, well, I don't want to say anything negative about Conda by putting a different term on it, the Python-native ecosystem, or the PyPI ecosystem. We're very focused on that experience. And I think it leads to us being able to build a better tool for that experience. But we do need to figure out what we want our position on that to be, and whether we want to support managing those kinds of system dependencies. One of the really interesting things about uv that I didn't anticipate, but I think is very cool, is that it ended up being fairly composable. So there are actually a lot of tools now that use uv under the hood. Pixi, for example, uses uv whenever you install packages from non-Conda sources, like from PyPI; it actually uses uv as a library. And we've seen that with other tools too. Sometimes it's tools like pypa/build or tox; those tools have uv backends now, which is really cool. And then sometimes it's other packaging tools that wrap it. That's been a thing that I didn't really anticipate happening, but I think it's actually a really good sign that we built something that can be composed like that. And it's been cool to see it not just being used as a first-class tool, but also being embedded in other tools. Gotcha.

SPEAKER_03: 39:48

Cool. Well, we're coming up on time, but I have two or three more questions. Yeah, sure. Yeah. I could ask so many more things. I can try and keep the answers brief. No, that's fine. There's just so much that we can talk about. But yeah, on the Rust side, I can imagine some Python developers being concerned that all these tools are now in Rust and they don't necessarily know Rust or want to learn it. Because Rust is a requirement, right? If somebody wants to go in today and make a contribution, this is all in Rust, so you would have to know Rust. But if people are interested... because you also learned Rust as an nth programming language, right? It was not necessarily your first one. So how did you learn it, apart from just building these tools and learning it on the fly, which is close to Pybites' heart as well, right? Learning just in time. But if you have any tips for interested or aspiring Rustaceans, let me know.

SPEAKER_01: 40:45

Yeah, yeah, totally. So I had not really done what I would consider systems programming before. I wrote some C in college. I've never really written any C++. Most of my career has been Python and TypeScript and some Java. So that was my background coming into Rust. I found Rust to be a fairly hard language to learn. I mean, I think the learning curve is quite steep. I think you go through a lot of different phases of actually learning and understanding Rust. And eventually you get to the point where your code doesn't compile, but you understand why. That's the real moment of insight, I think, when you start to understand the language. I would say, coming from Python especially, it's pretty hard because the language is just so different. And the form of thinking, especially around the borrow checker, is just really, really different. For me, the key thing was, in my last job, I'd started being exposed to Rust at work, but someone else on the team introduced it. And I was mostly going in and doing bug fixes, and I was doing that alongside the rest of my job. And so every time I went into that code base, I was trying to get out as quickly as possible. I was like, I just need to fix this. I just need to get this to compile. I wasn't really learning the language. So there was really no substitute, to me, for just putting in a lot of time. Part of the reason I started working on Ruff was because I wanted to build something in Rust from scratch so that I was really forced to learn the language. And some of that was just putting in a day trying to understand lifetimes so I could get the AST to compile. There was no substitute for putting in a lot of time.
And I do think it was helpful, for me at least, to try and build something myself from scratch. I think it helps with understanding the ecosystem and the tooling to be forced to figure out how to glue a lot of those things together. I generally encourage people to try and build something that solves a problem they have and that is well scoped. So build something that you can actually ship. Like going in and saying, I'm going to build a type checker. It's a bad example because we're trying to do it, but that's a big project that takes a lot of time and a lot of expertise. It's very hard to go in and build that. But maybe you build something that builds on top of Ruff. Maybe you use Ruff as a library in your Rust project and you're trying to build some kind of analysis tool for Python. Maybe you want to be able to detect cycles in your modules or something. Build something that's useful to you and solves a narrowly scoped problem so that you can be in a position to actually finish and ship it. That, I think, is the best outcome: you learn the language and you get something out of it that's actually useful to you and that can be useful to others. We have had a lot of success, I would say, with people contributing. Ruff has, I think, over 400 contributors. I would like to see more contributors in uv, because we need more help there. But I would say Ruff tends to be an easier project to contribute to because it's slightly more compartmentalized. Implementing a lint rule is mostly isolated to the file you're working on. It's like, implement this contract. Whereas in uv, things tend to be slightly more interconnected. There's fundamental infrastructure on resolving and installing.
I would say uv, I think, has somewhere between 100 and 200 contributors, but Ruff has a lot more. And for people that are interested in learning Rust, I think Ruff has been a pretty reasonable entry point for people from the Python community. One, because you understand the language and the semantics of Python, which is helpful for what you're trying to do. And two, the community around it is people who have worked in Python and worked on Python. And so we see a lot of people coming in and saying, I wanted to learn Rust, this is my first entry point to it. So I think, yeah, either work on Ruff... no, I'm kidding. I'm not trying to advertise our own project. I'm just saying we have had success with that. And I've actually been really impressed by a lot of the contributions we've gotten from people who claim to be writing Rust for the first time.
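
As an illustration of the kind of narrowly scoped tool suggested above, here is a rough sketch of detecting import cycles between modules. It is written in Python with the standard `ast` module for brevity, although the suggestion in the conversation was to build such a tool in Rust. The module names and sources are made up, relative imports are ignored, and the function names are hypothetical.

```python
import ast

# Made-up project: module name -> source text. A real tool would read files from disk.
MODULES = {
    "app.models": "from app import db\n",
    "app.db": "import app.settings\n",
    "app.settings": "from app import models\n",
    "app.cli": "import app.db\n",
}


def imported_modules(source, known):
    """Return the known project modules that this source imports (absolute imports only)."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found += [alias.name for alias in node.names if alias.name in known]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            for alias in node.names:
                # `from app import db` may refer to the submodule `app.db`.
                candidate = f"{node.module}.{alias.name}"
                if candidate in known:
                    found.append(candidate)
                elif node.module in known:
                    found.append(node.module)
    return found


def find_cycle(modules):
    """Depth-first search over the import graph; return one cycle as a list, or None."""
    graph = {name: imported_modules(src, modules) for name, src in modules.items()}
    done = set()

    def visit(name, stack):
        if name in stack:
            return stack[stack.index(name):] + [name]
        if name in done:
            return None
        for dep in graph[name]:
            cycle = visit(dep, stack + [name])
            if cycle:
                return cycle
        done.add(name)
        return None

    for name in sorted(graph):
        cycle = visit(name, [])
        if cycle:
            return cycle
    return None


print(find_cycle(MODULES))
```

The scope is deliberately small: one graph, one search, one answer. That is what makes it something you can finish and ship while learning a new language.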

SPEAKER_03: 45:10

Awesome. That's good to know. And yeah, I 100% agree. You need to build something, scratch your own itch. You can read books about it, but you need to build and get into the nitty-gritty. Right. Okay. So yeah, thanks for sharing. That's a lot of great insight. And again, super exciting where this is going. And I use those tools. I put Ruff on the coding platform for the linting, and I use uv every day. And yeah, I think the speed there is a big win. I thought it wouldn't be that much of an impact, but once you start to run Ruff with pre-commit on every commit, it's like, oh yeah, this better be fast. And same with pip install in a sense: you cache those dependencies, and once they're already pulled in, your subsequent uv syncs or whatever are just so much faster.

SPEAKER_01: 46:03

Yeah, yeah. It can be hard to go back. But in a year, everyone will be saying that they're used to the speed and they want something even faster. So I have to start planning for that. Can it even be faster? Yeah.

SPEAKER_03: 46:16

Yeah. Um, yeah. So thanks for coming on. We could do a quick book tip if you have one, if you're reading something interesting, otherwise we could skip it. But are you reading something cool, or do you have enough time to read?

SPEAKER_01: 46:32

Does it have to be nonfiction, or as long as it's not a pull request? No, I mostly read sci-fi and fantasy. And I've just been rereading a lot of the Brandon Sanderson books now, which may not be very new to most people who read or listen to sci-fi and fantasy. But I just re-listened to all the Stormlight Archive books, which are excellent. And now I'm listening to the whole Mistborn trilogy. So yeah, I like that kind of stuff a lot. And it fuels a lot of the work that we do. So yeah, thank you to people who write great epic fantasy, because it's a lot of how I spend my time. Awesome.

SPEAKER_03: 47:15

And maybe, do you have a final piece of advice or a shout-out for our audience of Python developers?

UNKNOWN: 47:24

Yeah.

SPEAKER_02: 47:28

What could I say here? That would be interesting.

SPEAKER_03: 47:38

Apart from checking out your tools, of course.

SPEAKER_01: 47:40

Yeah. I mean, I'm pretty interested in hearing from people who are using our tools, especially people who are using them outside open source, because it's a lot harder to get visibility into how people are using your things outside of open source. And I think in general, it turns out most of your users aren't open sourcing all the code they're working on. So I'd just love to hear from people, whether it's on Twitter or through GitHub issues, about how they're using our stuff and also the problems that they're running into, because our whole goal is to try and solve those problems. So I try and keep the barrier to interaction pretty low on our repo and put the responsibility on us to manage the barrier to interaction being low. But yeah, I would just love to hear from people who are using our stuff, about the problems that it's solving and also the problems that it's not.

SPEAKER_03: 48:32

Okay. And maybe one good question is, how do people reach out to you? Do they just contact you directly, or can they also use issues for that? Because sometimes I guess people associate that with actual work and tasks. Can you use issues to give feedback?

SPEAKER_01: 48:47

Issues should typically be scoped to some kind of specific problem that you're trying to solve. For general feedback, I think people can either DM me on X, and I check all of those, it's just @charliermarsh, or join the Discord, where they can just chat with us directly. And yeah, we respond to everything there too, or we try to. Okay. That's good to know.

SPEAKER_03: 49:13

Yeah. Awesome. Well, thanks for hopping on and thanks for all the work and, uh, yeah, keep it up. Yeah. Thanks so much. No, it was entirely my pleasure. Okay. Awesome. Have a good day.

SPEAKER_00: 49:23

You too. Hey, everyone. Thanks for tuning into the Pybites podcast. I really hope you enjoyed it. A quick message from me and Bob before you go: to get the most out of your experience with Pybites, including learning more Python, engaging with other developers, learning about our guests, discussing these podcast episodes and much, much more, please join our community at pybites.circle.so. The link is on the screen if you're watching this on YouTube and it's in the show notes for everyone else. When you join, make sure you introduce yourself, engage with myself and Bob and the many other developers in the community. It's one of the greatest things you can do to expand your knowledge and reach and network as a Python developer. We'll see you in the next episode and we will see you in the community.