Runtime Arguments
Conversations about technology between two friends who disagree on plenty, and agree on plenty more.
20: Git protects you
Git is an amazing tool for managing your source code. Lots of people use it every day, but most people barely scratch the surface of what Git can do.
In this episode, we dive in and explain lots of features from the simple to the complex.
Links:
Julia Evans Wizard Zines
https://wizardzines.com - The main page
https://wizardzines.com/comics/inside-git/ - The Git-specific zine
Hosts:
Jim McQuillan can be reached at jam@RuntimeArguments.fm
Wolf can be reached at wolf@RuntimeArguments.fm
Follow us on Mastodon: @RuntimeArguments@hachyderm.io
If you have feedback for us, please send it to feedback@RuntimeArguments.fm
Check out our webpage at http://RuntimeArguments.fm
Theme music:
Dawn by nuer self, from the album Digital Sky
Jim:Welcome to another episode of Runtime Arguments. I'm Jim McQuillan, and with me today, like always, is my best friend Wolf. Say hello, Wolf.
Wolf:Hey everybody.
Jim:So how was your week? I think you got some interesting news to tell us.
Wolf:I had a good week. I think the high point was this: last episode we talked about an editor theme, a syntax coloring scheme that I like, called Alabaster, designed by a guy who goes by the handle Tonsky. It exists for lots of other editors; all I did was port it to Helix. It was in my dotfiles, I turned it into its own repo, and I got feedback from outside. People generally liked it, and there were incredibly useful suggestions. I made screenshots, I did updates, I copied those theme files into a brand-new branch of a fork of Helix, and I opened a pull request. In the time between talking about it last episode and now, that pull request has been accepted. So my ports of the Alabaster theme are now in Helix main. How long it actually takes for main to turn into a production build, that I don't know, I'm sorry to say. But I'm pretty happy.
Jim:That's pretty cool. It really is. And we're going to talk about pull requests a little bit today as part of our topic. We'll get into that in just a minute. In fact, let me say what our topic is. We're talking about Git and some of its superpowers. A lot of people are using Git. A lot of people are just barely using Git, kind of like me. Wolf is going to tell us all kinds of cool things you can do with Git. Some of them are advanced, and some of them are things we all should be doing. But before we get into that, I've had a pretty good couple of weeks. I talked last time about how I moved my production database. I took it out of Azure's managed service, and I took over hosting it myself in the VM that I have on Azure. So it's not Azure-managed Postgres, it's Jim-managed Postgres. I did that for all kinds of reasons; if you go back to the last episode, I talk about that. But the customer loves it. The performance is about double what it used to be, which is great, and it's so much more flexible. I'm loving that. But something else I'm doing, and this is really going to excite Wolf, I think. I haven't told him anything about this yet. I created an account on Claude and I downloaded Claude Code. I did. You know, I avoided AI for a long time, and I've got reasons or excuses, whatever you want to call them, for why I avoided it. I enjoy programming so much. Part of what I enjoy, or maybe the big thing for me, is the endorphin release that I get when I solve a problem. I tinker with programming and I figure something out, and when it works, it's like euphoria for me. I don't know if everybody gets this. I think Wolf probably does. But when you work on something and you get it working, and maybe you struggled for a while, there's just something about that that is so pleasurable for me.
There are other things in my life that do that too. I used to play blackjack a lot. Maybe I played too much, but I did pretty well at it. I won some money, I had a lot of fun, I lost a little bit of money, but overall I'm up. I haven't played in a couple of years, really since COVID, but I love playing blackjack because I love the numbers. I love adding up to 21, finding some combination of numbers that equals 21 or close to 21, and hoping it's a better score than the dealer has. That releases the same kind of endorphins for me. Jigsaw puzzles too. I know, Wolf, you've kidded me a little bit about why I like jigsaw puzzles, but it's the same kind of thing. When I figure out a jigsaw puzzle, when I finish it, I get this little shot of euphoria, and it's just great. We play some other games too. When a bunch of us go to Maine, we play Mexican Train, and you guys probably think we must be a bunch of old people living in Florida. But I love it because it's numbers. You're stringing dominoes together in long strings and solving a problem, and I just love that. So I've been afraid that by going with AI, I was going to miss that if I let AI generate my code for me. Now that I'm getting into it a little bit, I realize that's just ridiculous. I'm still involved, I'm still driving the car. One of the problems keeping me from AI, though, is that I like understanding how my tools work. I know how compilers work. I've written compilers, I've worked on compilers, I know deep down how they work: how the parsing works, the lexical analysis, and all that stuff. I've got a pretty good understanding of that. I know how network stacks work, and I know how mobile apps work.
I've written some of those, and it's not really rocket science how all that stuff works. And I know, Wolf, you've worked on browsers. You know intimately how Mozilla Firefox works because you were an engineer on it way back in the day, in the 90s. So you know what it's like to work on things where you understand how they work. AI, I don't have a clue how it works. People throw out things like, "Well, it's just this graph that you're traversing," and that means nothing to me. I think it probably means nothing to almost everybody. There are probably very few people who really understand how these models work, and that turns me off a little bit. I'm not excited about that part of it, but I am looking forward to the help it can give me. There are other reasons why I've avoided it too. I'm a little concerned about the plagiarism aspect of AI, but Wolf assures me that Claude is using licensed data for their models.
Wolf:I think so, in this case, according to their messaging.
Jim:Yeah, and I hope that's true. At this point, I'm going with plausible deniability. I'm going to jump into Claude, and if they say it's legit, I'm going to have to go with it, because you know what? So many of my friends are using it, and so many of the smartest people in the world are using it. Wolf is at the top of that list. Marlon is telling me I really have to be using AI, and he's telling me how he's using it. Justin is using it. Lots of our friends are using it. So I've decided I'm jumping in. I installed it yesterday. I'm completely lost. I'm going to be looking for help from you, Wolf. But I'm not going to get left behind. I am going to give it a try.
Wolf:I'm super happy that you have decided this is a journey worth exploring.
Jim:Yeah. You know, I've got a couple of little iOS pet projects I've been wanting to start for like two years, and I've just not found the time to do it. I'm thinking this is a great way to get started. I'll integrate it with Xcode and see if I can generate a lot of the boilerplate for these little apps. There's nothing complex in what I'm doing, but there's a lot of mundane stuff, and I'm thinking that Claude Code is going to be my pair programmer to help me out here.
Wolf:Yeah, for me, using Claude Code is about reducing friction. It's not about Claude writing new code for me. I do let it generate code for tests and things, but everything that it does, I'm very careful to look at and understand. For the most part, for real code, for stuff that does your thing, it's like working with a very junior programmer. I'm not happy with the kind of code Claude gives me when I let it give me code. But once I tell Claude exactly what I want, then it's like a really good editor. It can take that change and apply it in the 13 places that you need it in this directory, for instance. It can decide which files need to be added. It's great at making a useful commit message, which is nothing but drudgery for me. I've got it hooked up to our ticketing system. We use Jira. So it makes sure to keep my tickets up to date, to know my priorities from the tickets, and to move the tickets from one column to the next. This one is in the backlog, now it's in progress, now it's moved to PR, oh, it's done. It just keeps track, which is the thing I'm bad at by myself. That's a growth area for me, and it's a place where I need to be communicating with the rest of my team and my boss. They need to see my progress by seeing those tickets. Having Claude do that makes me set expectations better. It was drudgery. And nobody would look at that use of Claude and say, "Oh, you're cheating. You're letting Claude do all the hard..." No. It's reducing friction, that's all. I do want to slip in one thing before we let it go, and that is with respect to blackjack. If you go in with the mindset, "I'm going to have some fun with eight hundred dollars or two thousand dollars or whatever," and you don't care whether you come out with zero or twenty thousand, I support you a hundred percent.
If you decide that blackjack is your profession and your intent is to make money, just like the people who play the lottery this way, then it's a tax. Mathematicians don't gamble. And the reason is a thing called gambler's ruin. It's not about odds. It's about the fact that the casino has enough money that they last longer. In real sequences of wins and losses, there are going to be long runs of wins and long runs of losses.
Jim:Sure.
Wolf:And the fact is, the casino has enough money that they could lose for days and still keep going. You can't. So no matter what the odds are, gambler's ruin can cut your winning short.
Jim:Sure. Sure. And for me, it was about the fun. It was never a way to make money, because believe me, I wasn't that good. I don't know if anybody's really so good at blackjack that they can consistently make money. But it was the rush, the adrenaline or the endorphins of solving problems, that I just love. And my rule was always: if I come home with enough money in my pocket to get my car out of the parking lot, I've done all right. I never gambled all that much money-wise. It was just the fun. And eventually I got tired of it. I thought, there are other ways I can get these endorphins. So I hear you about the gambling, and like I said, I haven't played blackjack in six years, and I don't have any plans to do that in the future, but I do enjoy it.
Wolf:As for Claude, before you go on, I'll just say one quick thing. Claude takes direction in the same way that Git stacks up .gitignore files. Claude uses CLAUDE.md files. There's one in your home directory, inside .claude. There's one in your project. It'll find anything above your project. My root CLAUDE.md file is public. It's in my dotfiles. I need to go get that.
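The .gitignore stacking Wolf mentions can be sketched in a throwaway repository. This is a minimal illustration, not anything shown in the episode: the file names and patterns are made up, and it only covers the Git side of the comparison. git check-ignore -v reports which .gitignore file and which pattern matched a path.

```shell
set -eu
repo=$(mktemp -d)   # throwaway repository for the demo
cd "$repo"
git init -q

# Top-level rule: ignore every .log file anywhere in the repository.
printf '*.log\n' > .gitignore

# Nested rule: inside docs/, also ignore the build/ directory.
mkdir -p docs/build src
printf 'build/\n' > docs/.gitignore

touch app.log docs/build/index.html src/main.c

# -v prints the .gitignore file, line number, and pattern behind each match.
git check-ignore -v app.log docs/build/index.html

# A non-zero exit status means "not ignored", so src/main.c can still be added.
if git check-ignore -q src/main.c; then
  main_ignored=yes
else
  main_ignored=no
fi
echo "src/main.c ignored? $main_ignored"
```

Rules from the top-level file and the nested file both apply, each to the part of the tree it sits above.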
Jim:Yeah, there are things in there that might be useful. And we did a whole episode on AI. It was episode eight, and today is episode 20, so that was quite a while ago, three or four months. Go back and listen to that. I think that was before you were really using Claude Code. Maybe you were starting to use it, but now you've got a lot more experience with it. Maybe we'll do another episode sometime in the future, getting a little more in-depth on how we use Claude Code and AI for what we do. But let's get on to the feedback from previous episodes. The last episode, 19, was about data centers. That was a fun episode. We got a lot of feedback on it, and a lot of people really enjoyed it. There are a couple of points I made that maybe weren't exactly accurate. One of them was when I mentioned 1U servers in a rack. I had said a 1U server is an inch and a half. No, it's an inch and three quarters. So a 1U server in a rack takes an inch and three quarters. You can fit at most 42 of those in a standard rack. Most people put in no more than 38 because they need room for a network switch and a little bit of cooling space. So that was one little thing. And Wolf had asked a question during that episode, when I was talking about the data center being spec'd out at 1.4 gigawatts. He had asked, is that gigawatt-hours? Well, no. The spec is the most power it can draw at any given instant. For instance, my house has 200-amp service. That means the wires coming into my house are sized to deliver up to 200 amps. At 240 volts, that's 48,000 watts, or 48 kilowatts. A data center at 1.4 gigawatts is several orders of magnitude larger than what my house will draw.
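The arithmetic in that correction can be checked with shell integer math. This is a sketch using only the figures quoted in the episode (200 amps, 240 volts, 1.4 gigawatts); the variable names are made up for the example.

```shell
# Power available from a residential service: amps x volts = watts.
amps=200
volts=240
house_watts=$((amps * volts))
echo "House service: ${house_watts} W"

# The data center figure from the episode, 1.4 gigawatts, written in watts.
datacenter_watts=1400000000

# Integer ratio of the two (the fractional part is dropped).
ratio=$((datacenter_watts / house_watts))
echo "Data center / house: about ${ratio}x"
```

The ratio comes out a little over 29,000, which is between four and five orders of magnitude.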
But that says nothing about per hour. That's just how much power that thing can draw. When you start talking about gigawatt-hours, or kilowatt-hours like you see on your electric bill, that's just the electric company charging you for it. They charge you for how many kilowatt-hours you use, meaning the kilowatts you draw multiplied by the hours you draw them. You might see your electric bill say you used 300 or 500 kilowatt-hours for the month. That's all. The kilowatt-hour is an electric company thing for charging you. One more thing about that episode. I mentioned Marlon helped me with a lot of that information because he is so freaking smart with that stuff. He wanted me to point out that I talked about how communities don't like data centers coming in, for obvious reasons: the noise, the electric usage, the water usage. The thing is, they all love what data centers can do for them. Take away a data center, take away Facebook, take away Instagram, take away Google, and people would be screaming, because they love what they get out of these data centers. It's kind of like an airport. Nobody wants an airport built near them, but they all want airports so that they can travel. That's how data centers are. So that's a good point that he made, and I'm glad to repeat it for you. I don't think we have any other feedback from other episodes, so I think we can get right into the meat. You know, 18 minutes in, we're finally talking about what we're talking about. Today we're talking about Git. It's a fast, scalable, distributed revision control system. If you look at the man page for Git, it's kind of funny because it says "git - the stupid content tracker." Because that's all it's doing: tracking content for you. And in our case, the content is source code for programs.
But you can do a lot with Git. You can store your RC files, you can store documents. If you're an author and you're using a text editor, you could use Git to track your writing. I guess even if you're using something like Word, you can use Git to track that.
Wolf:Although diff is harder.
Jim:Yeah, diff on a blob of data is difficult. But anyway, Git's been around since 2005. It was written by Linus Torvalds, the Linux kernel guy. He had an itch to scratch. They were managing kernel development using BitKeeper, and there was a falling-out with the company that provides BitKeeper. It's a proprietary source code management system, and they were granted a license to use it for Linux kernel development. I think the license extended to anybody who wanted to develop on the kernel. But then Andrew Tridgell, the guy behind Samba and rsync, tried to reverse engineer the BitKeeper protocol so that he could write some tools that could interact with the BitKeeper repository. The BitKeeper guys got angry and revoked the license, which is not good if you're a kernel developer relying on it to manage your source code. So Linus did what Linus does: he just wrote his own. Apparently it took him about two weeks to get it into a working state, and once he had it up and working, he handed it over to somebody else to maintain. He's not really involved in it anymore, although he's probably a huge user of it. But it's a brilliant tool, and almost everybody writing software is using it. Not completely everybody, there are other solutions out there, but an awful lot of people are using it. And I think a lot of people are like me. They kind of use it and they don't really understand every aspect of it. So here we have Wolf. Wolf knows more about Git than anybody I've ever met. When I have a problem, it is so comforting to know I can ask Wolf what's going on, and he explains it to me. He turned me on to Git, I think, in 2017, so I've been using it for about eight and a half years. And every time I've had a problem, he's been there to help me out.
I don't call him so much about this stuff anymore because he did such a good job of helping me. But I have to think that a lot of people out there have questions similar to the ones I've asked him over the years. So we're going to talk about those, and we're going to discuss some features of Git that you probably don't know exist, but once you do, you're going to think, "That's pretty cool. I'm going to start using that." So how do we start this thing, Wolf? What do you want to say?
Wolf:There is something I want to start us off with, if that's okay. Git has its own philosophy and its own reasons, but I want to talk for just a second about why. Why Git? Why should you use source code control at all, and Git in particular? The reason is that source code control systems, and I have experience with how and why this is true for Git, provide a kind of safety that is super, super high level. Everything that you commit is safe. If you also push it, which we'll talk about later, then it's safe even from your disk failing utterly. Now, things that you haven't yet committed, uncommitted changes, the modifications in your working directory, those are completely unsafe. So don't get confused. If you delete those, they're gone. If you commit them, they're available forever, no matter what happens next. As long as there's a repository with those commits in it, you can get them back. So safety is the core. And because of that safety, because you can get anything back the way it used to be, you can experiment. You can try things, and when they don't work out, you can go back, and when they do work out, great, that's the new world. That's where you are. So Git is the answer to all of these problems that you didn't realize you were having. That's the whole reason to listen to the rest of this episode. Your source will be safe, you can try things you didn't realize you could try, and you can get back to where it was still good. That's something I wanted to get off my chest right away. From here, I feel like we should talk about the things that you encountered on your Git journey and what's important about those things.
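The difference between committed and uncommitted work can be sketched in a throwaway repository. This is a minimal illustration with made-up file names and a placeholder identity; it assumes a Git version that has the restore command (2.23 or later).

```shell
set -eu
repo=$(mktemp -d)   # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

printf 'int main(void) { return 0; }\n' > main.c
git add main.c
git commit -q -m "Add main.c"

# Simulate an accident: the committed file disappears from the working directory.
rm main.c

# Because the content was committed, Git can put it back.
git restore main.c
cat main.c

# An uncommitted file has no such safety net: once deleted, Git never saw it.
printf 'scratch\n' > notes.txt
rm notes.txt
```

After this runs, main.c is back exactly as committed, while notes.txt is simply gone.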
Jim:Why don't you tell me what it is? The top of the list, you've already mentioned it, so let's get into it a little bit: commit. What does it mean to commit?
Wolf:Git is different from the source code control systems that came before it. Since Git, other things do this too, but as far as I am aware, Git was one of the first, and it is this: when you get to a point in time where things are good, Git takes an atomic snapshot. When you say the word commit, the things that you have collected together to be part of that commit, that exact version, gets saved just the way it is. Before this, in the olden days, commits weren't atomic and they weren't snapshots. Instead, CVS, for instance, would say: okay, we're working on the file main.c. main.c got these changes, and now there is a file that essentially is main.c, but with hunks that are these changes, and then these changes, and then these changes. Not at all a snapshot. With Git, you can actually look in the thing it calls the object database, which is what's inside that .git folder, and you can pull out individual objects. A file is an object called a blob. They're all compressed, and there are things layered on top of that, which aren't really part of the underlying model, that make it even smaller. But the whole history is there, and except for these things layered on top, it's not about deltas. Yes, you can diff two things and find out what actually changed between them, but that's a thing Git does on the fly. So that's what a commit is. A commit is a snapshot, and it's a snapshot at an instant in time. There's important stuff to know about commits. Commits are the fundamental unit in Git, and every problem you solve has something to do with commits. So there are a couple of things that you want to make sure you get right when you're building a new commit, so that later, when you have to use one of these techniques to solve a problem, your commits help you instead of hurting you.
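Pulling objects out of the object database, as described above, can be sketched with git cat-file. This is an illustration in a throwaway repository with a made-up file and a placeholder identity; -t prints an object's type and -p pretty-prints its content.

```shell
set -eu
repo=$(mktemp -d)   # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

printf 'hello\n' > main.c
git add main.c
git commit -q -m "Add main.c"

# A commit object names a tree (the snapshot) plus author, date, and message.
git cat-file -p HEAD

# The file inside that snapshot is stored as a blob object.
blob_type=$(git cat-file -t HEAD:main.c)
echo "HEAD:main.c is a $blob_type"

# Printing the blob gives back the exact content that was committed.
git cat-file -p HEAD:main.c
```

The commit output shows a tree line rather than a list of line-by-line changes, which is the snapshot model at work.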
One thing is that there are standards, or suggestions, for how to lay out your Git commit message: what to say, what to mention, that the summary is a short line so that it can show up in all the tools that show short things, what kinds of things you should put in it, and how to reference things. So definitely format your message correctly. In your own work, before you have arranged your commits for anyone else, you can use commit the same way you might use save when you're working on a Word document or a text file. You can just commit over and over again, and each commit is just your current save; there are tools for arranging them afterward. The thing you want to present to the world is a chain of commits, whether it takes three or you can do it all in one, where each commit in that chain is one thing: a complete idea that runs and builds by itself, so that if you went back in time with your repo to that commit, whatever you're building could run. And if you take the whole branch, the chain of three, you've added a complete thing that took three features to build. So each commit does one complete thing, each commit builds, and you try to go in a straight line. When you make a commit, there's the easy way: you can say git commit -a, which makes everything you've changed in files Git already tracks become part of the new commit. I don't do it that way, and as you get more advanced, you won't do it that way either. Instead, there's this thing called the staging area. It has a couple of different names. The staging area is an exact picture of what will be committed when you say commit. So you have modifications. You changed this file and that file and the other thing, but you only add main.c to the staging area. You stage it. You say git add main.c.
If you say commit right at that moment, you make a brand-new commit that only has the changes to main.c. And if that was the right thing for your commit, that's the right thing. So learning about this extra thing that you never had to worry about, the staging area, is part of your journey. When you want to undo something, modern versions of Git offer the restore command. If it's a bigger thing, there's another command called reset that sometimes comes into play. And this is going to be the last thing I say about commits: as you're building up that staging area, as you're putting together what's going to be your commit, maybe you were editing main.c and you were actually sticking in two different things, two different features, at the same time. You're making it take parameters, and you're also making it not work on top of map tiles that are in Florida. I don't know, something like that. But two totally different features. You just happened to edit them at once. Instead of just adding and committing main.c, you can add interactively, and your terminal will step you through the hunks of changes in that file. Each hunk is a touching, connected set of lines that changed. For each one of those hunks, you can add it or skip it, or sometimes you can split it into smaller hunks, though not always. So add interactive lets you make the two commits that you meant instead of the one commit that would have been easy.
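Staging one file out of several, as Wolf describes, can be sketched like this. It is a minimal illustration in a throwaway repository; util.c, the file contents, and the identity are made up for the example.

```shell
set -eu
repo=$(mktemp -d)   # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

printf 'v1\n' > main.c
printf 'v1\n' > util.c
git add main.c util.c
git commit -q -m "Initial commit"

# Edit both files, but stage only main.c.
printf 'v2\n' > main.c
printf 'v2\n' > util.c
git add main.c

# The commit contains exactly what was staged, so util.c is left out.
git commit -q -m "Update main.c"

# List the files the last commit touched.
committed=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "Files in last commit: $committed"

# util.c still shows up as modified and uncommitted.
git status --short
```

Running git commit -a at the same point would have swept the util.c change into the commit as well.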
Jim:I've used this a few times and it amazes me every time. The way I see it work is you do a git add -i and it pops up a little user interface, and it's not very intuitive, for me anyway. I have to relearn it every time I do it. But the -i makes it interactive, and then it steps through the hunks and shows you each little hunk, and you can decide whether you want to add it. When you're all done, you've added some hunks, maybe not all of them, and then you do your git commit, and that commits that change. Then you go do the git add again. Maybe you want to get all the rest, so you just say git add, or maybe you want to get part of it again, so you do git add -i. You do this until you've committed everything, but it's all separate commits, and it's pretty slick. I like it. That's a superpower. So the first thing you've got to do if you're using a source code control system is make changes and commit them. Once you've committed a few things, then, like me, you want to look at what's out there. What did I change? What did I commit? What are my commits? What's the history of what I did? There are a number of commands you can use for that. Tell us a little bit about those.
Wolf:There are tons of ways of looking at things. Probably chief among those is git status. Let me say right now, git status prints a lot of stuff. It prints the things that you've added that are ready to be committed right this second. It prints things that are not added. If you ask it nicely, it can also print the things that are ignored, because we talked about the ability to ignore things. Used with no options, git status prints instructions on how to take things you've already added and make them not be added. "Oh, it was a mistake. I don't want to commit main.c. How do I make it not be added?"
Jim:Right. You did a git add on a file and you realized you didn't really want to add that file. So you can undo that with... I forget the command. I don't bother committing it to memory. It's git restore --staged. Yeah. Git status always tells me what to do.
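Unstaging a file added by mistake can be sketched as follows. This is a minimal illustration in a throwaway repository with a made-up file and a placeholder identity; it assumes Git 2.23 or later, where git restore is available.

```shell
set -eu
repo=$(mktemp -d)   # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

printf 'v1\n' > main.c
git add main.c
git commit -q -m "Add main.c"

printf 'v2\n' > main.c
git add main.c                     # staged by mistake
before=$(git status --short)

# Take main.c back out of the staging area; the edit itself is kept.
git restore --staged main.c
after=$(git status --short)

echo "before: '$before'"
echo "after:  '$after'"
```

In the short status format the "M" moves from the first column (staged) to the second (modified but not staged), and the file on disk still contains the edit.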
Wolf:That's right. Now, I say git status constantly, and I know those instructions. I can always get them by saying git status. But there's a thing you can do in Git, which is a configuration option. You can set it in your .gitconfig. It's called an alias. I have aliased st, so git st means status -s. I think you can also say --short. That doesn't print any messages. It's a very compact list of what the changes are. I do that constantly. I absolutely do it before a commit. I absolutely do it when I get into a directory that I was working on yesterday or whatever. So git status is super important. A thing that you might use git status for is to see: do I have any modified files at all? Do I have any added files at all? That's so important, to me anyway, and to everyone that I teach this to, that it is worth modifying your prompt. My prompt shows me the branch that I'm on, and we're going to talk about branches in a second. If I have any modified files at all, I'll see an asterisk. If I have any added files at all, I'll see a plus. If I have any brand-new files that could be added but haven't ever been part of the repo yet, I see a percent sign. There are multiple ways to get your prompt to show you these things. Git comes with one: if you have the actual Git repo, if you've cloned Git itself, you can grab the Git PS1 prompt, which is a shell function that you can call from your PS1. Or you can do the easy and, in my opinion, significantly better thing, which is to use one of the existing prompts in the world. I use Starship. It's super fast and it has tons of modules. Git is one of them, and it does all the things that you want.
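The alias and the compact output can be sketched like this. The alias is set for one throwaway repository only, so nothing in your own configuration changes; adding --global to the git config command would write it to ~/.gitconfig instead. File names and identity are made up for the example.

```shell
set -eu
repo=$(mktemp -d)   # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

# "git st" now means "git status -s" in this repository.
git config alias.st 'status -s'

printf 'v1\n' > tracked.c
git add tracked.c
git commit -q -m "Add tracked.c"

printf 'v2\n' > tracked.c       # modified, not staged
printf 'new\n' > staged.c
git add staged.c                # new file, staged
printf 'tmp\n' > untracked.c    # never added

# One line per path: "A " staged new file, " M" unstaged edit, "??" untracked.
git st
```

The two status columns are the staging area and the working directory, which is why the same letter can appear on the left or on the right.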
Jim:Yeah, this is the prompt that you're sitting at in bash or Z shell or whatever. Normally it would show your current directory and maybe how you're logged in, whether you're logged in as yourself or as superuser. But you can show additional stuff in there just by modifying the PS1 environment variable. There are a couple of other things you can change too. In your case and in mine, if we're in a subdirectory of a Git repo, it'll tell us a little bit of information about that repo right at the command line. We don't have to type git status, it's just there.
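A bare-bones version of a branch-in-the-prompt setup can be sketched as below. This is not the prompt function that ships with Git, nor Starship; git_branch_for_prompt is a hypothetical helper written for this illustration, and the PS1 line assumes bash.

```shell
# Print " (branch)" when the current directory is inside a Git repository
# with a branch checked out, and print nothing otherwise.
git_branch_for_prompt() {
  branch=$(git symbolic-ref --short HEAD 2>/dev/null) || return 0
  printf ' (%s)' "$branch"
}

# Single quotes keep $(...) unexpanded here, so bash runs it at every prompt.
PS1='\u@\h \w$(git_branch_for_prompt) \$ '

# Try the helper in a throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b try-feature

shown=$(git_branch_for_prompt)
echo "prompt fragment:$shown"
```

A fuller prompt would also add markers for modified, staged, and untracked files, which is what the ready-made prompts mentioned above provide.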
Wolf:Right. Stuff you really want to know. This isn't everybody yet, but maybe by the end of this podcast it will be: in a world where there are multiple branches, you want to know which branch you're on. Am I committing to the right branch? Because usually a branch is "I'm going to try to do this thing," and when you're editing a file, it's probably about trying to do this thing. So you want to be on the right branch.
Jim:Yeah, I've got the branch in my prompt and it helps. It helps me know which system I'm on, because we use Git to deploy our software for our clients, and if I'm on their system, they're on the main branch. So that's my indicator: I don't want to be making changes here. If I forget which system I happen to be logged in on at that moment, the prompt helps me figure that out.
Wolf:Uh so uh the actual way that commits are all arranged is uh more complicated than I'm about to say. But it's I don't want to get into it because I don't want to uh lead you guys down a path that has nothing to do with the things we want to talk about. But as we've been saying right here, uh often what you care about is that the sequence of git commits that you have made is just that. It's a sequence, it's a chain. There was the very first thing you committed, and then when you made changes and committed that, that came on top of the previous commit. So if you're at commit number eight, you could say git log, that's one of the commands, and see all eight of the commits that you go back to. This is uh these commits define what is in the source code that you are looking at right now. So git log um sounds simple, uh, and I'm not gonna go into all its powers, but if you do help about log, log can be used for so many things. Um you can find commits by specific authors, commits that changed specific functions, added or removed specific um strings, um, happened in a particular date range. You can show just the summary line, you can show the whole message, you can show only commits whose uh commit message named a specific uh ticket, whatever your ticketing system is. Log is gonna be your friend. Um another thing that is your friend is uh git show. Um with no uh anything besides uh git show, it's gonna basically uh show you exactly what happened in the commit that is the thing that just got committed.
Jim:Right. Let me interrupt you for a second. You do a git log and it shows you a commit ID and then the commit message with some other data, like the time and who did it and stuff. Primarily it's going to show you your commit messages, and it'll show you the most recent one at the top, and then the next one, and the next one, et cetera. So you've got this commit ID, and with git show, you can say git space show space that commit ID, and it'll show you the code that was in that commit. It's like a big diff, right?
Wolf:That is exactly what it is. And you can do that with any commit anywhere in history on any branch. Yeah. You can show uh individual files. Um Git show is super powerful, and Git Show takes a lot of the same uh option flags that diff does.
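The log-then-show workflow from this exchange might look like the sketch below. It assumes a scratch repository with two made-up commits; the --grep filter stands in for the "commits whose message names a ticket" case, with "Second" as a hypothetical search term.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "First commit"
echo "v2" > app.txt
git commit -q -am "Second commit"

# Newest commit first, one line each: abbreviated commit ID plus summary.
git log --oneline

# Only the summary lines, and only commits whose message matches a pattern.
subjects=$(git log --format=%s)
matching=$(git log --format=%s --grep="Second")

# Show one commit by ID: its message plus the diff it introduced.
latest=$(git rev-parse HEAD)
git show "$latest"
```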
Jim:Yeah, and you you mentioned uh if you do if you look for help on git, there's a couple of ways to do that. Uh the one I use all the time is uh if you're on a uh a Linux box or a Mac, um the man command, and I'll type it like man space uh git hyphen show. And it's the sh it's the man page for the show, the git show command. Uh man space git hyphen log and it'll show you that. Um you can also isn't isn't there a built-in help in git? Can't I say like git show help or git help show or something?
Wolf:Git show minus minus help. And by the way, the thing you just said is very interesting, because if you happen to be a scripter and you write a script or program, whatever, named git-foo, and you make it executable and put it on your path somewhere, then if you're in the terminal in a repo and you could call git, you can say git space foo, just like foo is a git command, and it'll run your script.
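Here is a small sketch of that trick. The command name git-hello and the temporary bin directory are invented for the example; any executable named git-something on your PATH should behave the same way.

```shell
set -e
bindir=$(mktemp -d)

# An executable named git-hello becomes available as "git hello".
cat > "$bindir/git-hello" <<'EOF'
#!/bin/sh
echo "hello from a custom git command"
EOF
chmod +x "$bindir/git-hello"

# Put the directory on the PATH for this shell session only.
PATH="$bindir:$PATH"

hello_output=$(git hello)
echo "$hello_output"
```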
Jim:Really?
Wolf:Yeah.
Jim:I did not know that. Anyway, keep keep going. I didn't mean to interrupt you there.
Wolf:So diff is fundamental. Um when you show something, if you're just showing a single commit, that's gonna show you the diff between that commit and the commit right before it. What did this commit introduce? Now remember, a commit is actually a snapshot. So git is calculating that diff on the fly. Uh and just like show and just like log, diff takes a ton of different interesting parameters that can um make it uh do what you want. Uh in particular, uh minus minus uh name only, uh instead of showing you uh changes inside files, it just shows the names that changed.
Jim:Yeah, so if you do a git show, a commit ID, dash dash name only, and and I forget which order those go in. Does the name dash dash name only go after or before? I don't remember. But uh you do that, and instead of showing you the diff in in each file, it just shows you the name of the file. So you can see what files did this commit touch.
Wolf:And if you only want the Perl files out of that, then a thing you could do is pipe that into grep and then give an expression that said dot pl. And then you would have just the Perl files. Anyway, a thing that people love to do in Git, and then once they run it, they are sad, is git blame. Yes. I think it was called annotate. I don't know what people want to call it. Blame seems very, I don't know, political to me. But what blame does is it will show you the whole file, and for every line it'll tell you the commit ID that that line was last changed in. Usually it's integrated with editors as well, so that you can actually see the lines and scroll around. And from that commit ID you can know, if you wrote your commit message right, why did this line change? Who changed it? Now, the reason most people run git blame is because they're like, oh my god, who did this? This is ridiculous. Why would they write that? This is me a couple of times a week. They say git blame just to find out who the, you know, criminal is. And it's them.
Jim:It's like, who would do that? And then it turns out it was me, yeah, just like the day before yesterday, right?
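To make the last few turns concrete, here is a hedged sketch of listing the files a commit touched, narrowing that list to Perl files with grep, and then asking blame who last changed a line. The repository, file names, and author are all made up; the empty --format= is one way to drop the commit header so only file names remain.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo 'print "hello\n";' > hello.pl
echo "notes" > README.txt
git add hello.pl README.txt
git commit -q -m "Add script and notes"

# Names of the files the commit touched, then only the Perl ones.
git show --name-only --format= HEAD
perl_only=$(git show --name-only --format= HEAD | grep '\.pl$')

# Per-line history: which commit and which author last changed each line.
git blame hello.pl
blame_author=$(git blame --line-porcelain hello.pl | sed -n 's/^author //p')
echo "last changed by: $blame_author"
```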
Wolf:Yeah. So all of these things are non-destructive. They show you information. Git log shows you a list of commits. Strongly related to git log is a thing you absolutely need to remember, and that is git reflog. R-E-F-L-O-G. Why am I telling you to remember this? Because git reflog is the thing that helps you get back to a place that works. If you didn't do the things you need to do to make experiments be okay, you know, do your experiment on a branch or whatever it takes, and make your commits the right way, the reflog will show you where it was good and help you get back to that place. And it turns out there's two kinds of reflog. They both look and act the same. But one is for this branch: tell me what commits happened on it. Where was the head of this branch? What things did you do? Oh, I merged, oh, I rebased, oh, I did other stuff you haven't heard about yet. But there's also one, if you just say git reflog, that's not the reflog for any particular branch. That's all the places your directory has been checked out to. So if you have three branches, main and develop and weird experiment, and you switch between those, so first you're on one and then the next one and then the third one, there's three entries in the reflog. And they say why. One says on main and one says on develop. So you can see where you've been, where you're actually looking, what the editor says. Reflog is gonna save your butt. So I think that's everything I want to talk about about showing things, and we're not going nearly fast enough.
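As an illustration of getting back to a place that works, the sketch below makes a good commit, then a bad one directly on main, and uses the reflog to return. The scenario and file names are invented, and reset --hard is only one way to go back; it discards uncommitted work, so treat it as an assumption about what you want.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the first branch main on any Git version
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "good" > app.txt
git add app.txt
git commit -q -m "Known good state"
good=$(git rev-parse HEAD)

echo "broken" > app.txt
git commit -q -am "Experiment gone wrong"

# Every place HEAD has been, newest first: HEAD@{0}, HEAD@{1}, ...
git reflog
# The same idea, but only for the main branch.
git reflog show main

# HEAD@{1} is where HEAD was one move ago: the known good commit.
git reset -q --hard 'HEAD@{1}'
recovered=$(git rev-parse HEAD)
```

The "broken" commit is not gone at this point. It is still listed in the reflog, so the same trick works in the other direction.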
Jim:No, this is gonna be a long, uh, a long episode. Um, but it's still interesting stuff. Um so one thing you you sort of drilled into me way early on when I started using Git was to use branches. Now, primarily we have a main branch and a dev branch, uh, but occasionally I'll create a uh a feature branch. The the idea between main and dev is uh main is what I give my clients, dev is what we're working on. When we're when we've made a bunch of changes, committed them to dev and we're happy with it, then we'll merge dev into main and deploy that to the client. Uh but you you keep telling me I should be using all kinds of branches, and I do somewhat when I'm working on a feature. So tell us a little bit about that.
Wolf:Branching at all is a superpower. Um this gives you uh at least this one really important uh feature, which is instead of a world where all you have is I'm working on the source, and if an experiment goes wrong, I can go back in time with the reflog, instead of that world, you can have this world where you have a branch, which is one course of development, um, and let's call that branch release or main or whatever you want, and you make promises. Git doesn't help you with these, this is just a promise you and your team make together. You say release always builds and passes all tests. If at any time we need to give somebody the very best uh working instance of our product with every bug that we know of fixed, it may have bugs in it, but n we don't know about them yet. We can just build main and hand it to them. Or release, whatever we call it. And then you could have another branch that is develop. And develop is everything that's in main plus some other stuff. Whatever you're working on that isn't ready to be released yet. Um, isn't finished or doesn't you know, might might have bugs.
Jim:You're not ready for other people to see this work yet. You're you're in the middle of making changes. It's a draft. Exactly. Yeah, it's a draft.
Wolf:Yeah. Now if you're on a big team, you know, four people, let's say, it's totally reasonable that each person has branches, especially if you're in a situation where you have to do multiple things at the same time. So you would have a thing we call a feature branch. And a feature branch is just what it takes to implement this one thing. And when that feature works, maybe you merge it directly back into release after your whole team has code reviewed it. Or maybe that means it's ready for develop, to see how it interacts with all the other stuff people are working on. Are there any logical conflicts? But the notion is that if you only look at this one branch, nothing else is there to interfere. It's main plus the stuff you are doing just for this one feature. Another thing you might use a branch for is there are points in time, like you have customer A, and customer A got a release in March. But, you know, now it's December. So main has moved on, but for whatever reason, they don't have a December release. What they've got is from March. And now they need a hot fix. There's some bug, it's a problem, you didn't know about it before, and you need to fix it. What you do is you go to the commit that is their release. This is the soul of a thing we call reproducible releases. And your release is probably tagged, that's a thing we can talk about later, and you make a branch off of that. That branch is a hot fix branch. You figure out how to fix that bug on that branch. And then, when it works, you ship them a release from that hot fix. So now they have exactly the release they already had, plus only the fix for that one bug. But that bug fix, a commit, isn't on release itself yet. It's branched off of release. So now you also do a thing that we're gonna talk about called cherry pick. You cherry pick that fix, or merge, either one of those is fine, back onto the main release branch. Now everybody has it. 
They have exactly the release they want, you've got the bug fix, everything is good.
Jim:I've done that. It's very, very powerful. It's really cool.
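The hot fix flow described above could be sketched like this. The tag v1.0, the branch name hotfix-v1.0, and the files are hypothetical stand-ins for the March release and the December work.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the first branch main on any Git version
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "March release"
git tag -a v1.0 -m "March release"         # annotated tag marks what the customer has

echo "feature" > feature.txt
git add feature.txt
git commit -q -m "December work on main"

# Branch off the tagged release and fix the bug there.
git switch -q -c hotfix-v1.0 v1.0
echo "v1 fixed" > app.txt
git commit -q -am "Fix customer bug"
fix=$(git rev-parse HEAD)
# (Ship the customer a build from hotfix-v1.0 here.)

# Bring only that one fix back to main.
git switch -q main
git cherry-pick "$fix"
```

After the cherry-pick, main has both the December work and the fix, while hotfix-v1.0 still has only the March release plus the fix.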
Wolf:Yeah. There's two other tiny things I want to say about branches. I told you enough that you would understand why you want to have them. But how do you focus on one branch or the other? How are you making changes on this branch, but not on that branch? And how you do that is you use the command git switch, and then you name the branch you want to go to. Right. So you've created a branch. When you do that, that's your current branch now. And you can switch back. Make your commits, make sure that you don't just leave changes hanging around. Right.
Jim:I mean, so now you've got you've got uh your main branch and you've got maybe your dev branch for development, and then you've created a feature branch. You're working on this. Now you've got three branches. You can do the git branch command to list them, right? It'll it'll list all the branches you have. And then you can do git branch uh git switch and then name that branch if you want to get back to main, git switch main. And that takes you back to main. And it's pretty cool.
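A short sketch of the three-branch setup Jim describes, in a scratch repository. The branch name feature-login is made up; git switch -c creates a branch and moves to it in one step.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the first branch main on any Git version
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "start" > app.txt
git add app.txt
git commit -q -m "Initial commit"

git branch dev                     # create dev, stay on main
git switch -q -c feature-login     # create a feature branch and switch to it
on_feature=$(git branch --show-current)

git branch                         # list branches; * marks the current one
branches=$(git branch --format='%(refname:short)')

git switch -q main                 # and back to main
on_main=$(git branch --show-current)
```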
Wolf:And it's now a thing to know is this if you have modified files in your working directory and you switch back to main, but you didn't add or commit those changes, well, they're not part of a commit. They're files in your working copy. And that means they're still going to be modified when you're back in main. Right.
Jim:They're still in your working directory.
Wolf:Right. In other words, changes follow you. I mean, that's not really what happens, but that's the way to think about it. Changes follow you. Yeah. So if those changes aren't actually appropriate for main, there's two answers, a good one and an okay one. The good one is commit them, because maybe you're ready to commit them. And the okay one is stash them temporarily. Git stash is a command. It doesn't matter how git stash works, just the fact is those files will be safe, as safe as if they were committed, because it turns out they are. But they're in a little holding place called the stash. A thing to know about the stash is don't use the stash for long-term storage. Set some things aside with the stash so you can do some operation and then bring them back quickly. If you have stuff you want to keep around for a long time, commits and branches are the way to do that. And finally, there is one other situation that is super confusing, and it sounds scary. It does sound scary. And that is this. First, I have to give you this little backing piece of information. A branch is really just a name that points to a specific commit. So when you switch to a commit, all you're doing is saying, hey, working copy, you now need to be a reflection of what the snapshot looked like in such and such a commit. But the magic of a branch is, if you're on a branch, if you've switched to it, if you've checked it out, whatever, and you make a new commit with changes that you've added, that name, the branch name, which is called a ref, by the way, gets updated to point to the new commit. It advances as you grow the chain of commits. Okay. But pointing your working copy at a specific commit doesn't require a branch. So I could check out some commit in history, but now I'm not on a branch. 
And if I make another commit, there's no magical name that points to that commit that gets updated and moved. And more importantly, because nothing long-lived points to that brand new commit, it is dangling. That means that if you don't act, that commit will be garbage collected in some amount of time, 30 days, 90 days, and you can set this in your config. When you do this, the instant you do it, git reports to you the danger of your situation, and it tells you with completely obvious terminology. It says, you are on a detached HEAD. What does that mean? How are you supposed to know what that means? I don't know. But that's what that means. You're not on a branch. When you're on a detached HEAD, you can make a branch right where you are, and that's probably what you want.
Jim:Yeah, the safe thing would be to do a git switch and then some tag name or some commit ID to get to that point in time, and now you're on a detached HEAD, so now you create a branch and now you're safe, right?
Wolf:And in fact, you can do that all in one step if you want. You can say git switch minus C, the new name. Uh minus C means create. Um then as a final argument, where. Um and where is any other ref. It could be a tag, it could it could be a commit hash, whatever.
Jim:Okay, so now you've got a branch, and now you can make your commits and stuff, and uh and uh you're safe, it'll be saved. Um and then you can decide to either throw it away or or merge into dev or into main or whatever you need to do.
Wolf:That's exactly right.
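A hedged sketch of the detached HEAD situation and both ways out of it. One detail worth knowing: git switch refuses to move to a tag or commit ID unless you pass --detach, so the example spells that out. The tag and branch names are invented.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the first branch main on any Git version
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "one" > app.txt
git add app.txt
git commit -q -m "First"
git tag -a v1.0 -m "First release"
echo "two" > app.txt
git commit -q -am "Second"

# Point the working copy at the tagged commit without being on any branch.
git switch -q --detach v1.0
on_branch=$(git branch --show-current)     # empty: no current branch

# A commit made here has no branch name following it.
echo "fix" > app.txt
git commit -q -am "Work done while detached"
detached_commit=$(git rev-parse HEAD)

# Way out 1: make a branch right where you are.
git switch -q -c rescue-work

# Way out 2, the one-step form: new branch name, then where to start it.
git switch -q -c fix-from-v1 v1.0
```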
Jim:And if you do throw that branch away, you don't lose your original code. You only lose whatever you committed on that branch, right? If you throw that branch, if you delete that branch. All right, so you've talked a lot about references. You mentioned reflog and and commits and and and stuff, so why don't we talk a little bit about really what references are?
Wolf:Okay, so we talked about the idea of snapshots. And a snapshot means that the entire picture of your working directory, your working copy, is stored in individual objects in the object database. This actually turns out to be way more efficient than you thought. The reason is maybe you commit three files, but every other single file is exactly the same. Well, in Git, a file doesn't have a version. There's a piece of content, and if the content is the same in the previous version and this version, then it's just the one object, it's the same object. So all that changes when you make a commit is the chain from the deepest file of this tree of objects all the way up to the root, and then there's a new object added that is this commit object at the top. So the kinds of objects that are in the object database for git are blobs, that's a file; trees, that's a directory; commits, that's the root of one of these snapshots; and this other thing, which is a tag. And that's only annotated tags. There are tags that don't have objects and they can go away pretty easily. So if you really want a tag, you probably want an annotated tag. Now a ref is any way, any way at all, of choosing one of those objects. Here are some example refs. Every one of those objects has a hash. The object hash for a commit, we often refer to it as the commit hash or the commit ID, but essentially it's the hash of the underlying data that is stored in that blob or tree or whatever. So spelling out that hash, that's a ref. A branch name is a ref. You can modify any other ref with tildes. For instance, if it's a commit, tilde goes back one. Like there was the root commit, and then you made a new commit, and there's another name for that new commit. It's the word HEAD in all caps. HEAD is a ref. 
HEAD tilde means one commit back from HEAD. That's the original commit we just talked about. So HEAD is a ref and HEAD tilde is a ref. It turns out sometimes it's not a chain. Sometimes, you know, like when you merge or when you branch: if you merge, a commit can have two parents. If you branch, then a commit can have several children. So there are other characters that you can use, like the caret, you know, the thing that's shift six, lots of things that can refer to objects in the object database. Every one of those things is a ref. There are more complicated things that you can assemble from refs. For instance, you can assemble a ref range. That usually only applies to commits. And a ref range, there's two different forms of it. So you would say old commit ID dot dot new commit ID. And what that means is all the commits that new commit ID is based on, everything that came before it, everything you could reach if you followed the links of parents back in time. And the word for that is contains. And then you take away everything that old commit ID contains. So it's everything you can reach from new commit ID, including new commit, but can't reach from old commit ID, and that leaves out old commit itself. You see this a lot. Git log a dot dot b. That's everything in B that's not in A.
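The tilde and dot dot notation can be tried out with a sketch like the one below. The commit messages and the topic branch are made up; the range results follow from the "reachable from B but not from A" rule described above.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the first branch main on any Git version
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

# Three commits in a row on main.
for n in one two three; do
  echo "$n" > file.txt
  git add file.txt
  git commit -q -m "main: $n"
done

head_subject=$(git log -1 --format=%s HEAD)       # the newest commit
parent_subject=$(git log -1 --format=%s HEAD~)    # one commit back
older_subject=$(git log -1 --format=%s HEAD~2)    # two commits back

# Start a branch one commit behind main and add a commit to it.
git switch -q -c topic HEAD~
echo "extra" > topic.txt
git add topic.txt
git commit -q -m "topic: extra"

# A..B means: reachable from B, but not reachable from A.
only_on_topic=$(git log --format=%s main..topic)
only_on_main=$(git log --format=%s topic..main)
```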
Jim:Right. A place I've used it: we talked about git diff, and that shows you the difference between the file on your disk and the previous commit. If you just say git diff file name, it'll show you the differences, how you've changed the file since it was last committed. You can also do git diff and then a range of commit IDs so that you can look at the changes. Like, you know where you're at now, but let's say you want to see what changed between two commit IDs further back in your history. You can do a git diff commit ID dot dot commit ID, and it'll show you those changes from history.
Wolf:And it's obviously I actually have to tell you something kind of embarrassing.
Jim:Yeah?
Wolf:And that is this. Ordinary human beings want to use that dot dot in git diffs so much, but the actual syntax is git diff, one commit, another commit. That's it, no dots. Because you don't need the range. It's just gonna look at the snapshot from one and compare it to the snapshot from the other. Like, the idea that it's a range is not a thing. But people do it so much, like they do it more than they do the thing that it was designed to do, yeah, that they made it work. And then on top of that, I want to add this. If you say git diff and you give two commit IDs, and then you say dash dash with spaces around it, so it's a dash dash all by itself, the thing that says, and now other stuff that aren't options. Right. And then you name a file or directory, it narrows the diff down to just be about that file.
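For example, the two-commit form of git diff, the dotted spelling people reach for, and the dash dash path filter could be compared like this. The directory layout is hypothetical.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

mkdir src docs
echo "a" > src/app.txt
echo "d" > docs/guide.txt
git add src docs
git commit -q -m "Old snapshot"
old=$(git rev-parse HEAD)

echo "b" > src/app.txt
echo "e" > docs/guide.txt
git commit -q -am "New snapshot"
new=$(git rev-parse HEAD)

# Compare two snapshots: one commit, then another, no dots needed.
plain=$(git diff --name-only "$old" "$new")
# The dotted spelling is accepted too and gives the same answer.
dotted=$(git diff --name-only "$old..$new")

# Everything after the bare -- is a path, not an option.
src_only=$(git diff --name-only "$old" "$new" -- src)
git diff "$old" "$new" -- src/app.txt
```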
Jim:Right, right. Very handy. So yeah. Um I I think we can move on, eh?
Wolf:Yeah, let's do that.
Jim:There's a lot more to refs than what we just talked about. It's it's uh I encourage the listener to go dig into it, uh, take a look at the man page or look online for the information there. Uh, but uh once you've made your changes uh locally and uh you want to share them with the world, uh you would push the the changes. You'd push your commits, right? Is that am I saying that right? You you've committed your changes and now you want to push them up to a uh a canonical repo.
Wolf:Yeah, so there's a couple things to say about this. One is, Git doesn't have a model in mind for how you're going to use more than one instance of the same repo. It's 100% appropriate, more appropriate than a normal person thinks, to just have one git repo. To go into a directory, git init, and start editing, and that's all you have. If that's the case, you'll never say push or pull or any of those things. Right. Another scenario is everybody has a canonical repo and they're all doing work on this same thing. For instance, Git itself is a project that's shared by everybody, and there is one repo that is the truth about Git, and it lives on GitHub. And you can fork it into your own copy and then clone that, or you could directly clone it. Clone is a word you only use when there's already the thing you want somewhere else. And when you do that, it fetches it. Fetch has specific meanings, but it turns out that meaning is also true here. So there's the case where you're not working with other people or sharing at all, and then there's this model where you have a specific what we call upstream repo, the one you came from, and typically that upstream repo has a special name to you. You often call it origin. Yeah. And if you have one of these, which is called a remote, then you can configure exactly what happens, but you can grab the latest changes that other people have put there, people who are just like you and have their own copy on their own machine and they're pushing and pulling. You can grab their changes, which you do by saying git pull. You can take your changes and push them somewhere, you know, back up to origin, which you do by saying git push. If you have a remote, it turns out you have some additional refs in your repo which have special names. 
For instance, if there is a main branch at the canonical repo, then once you fetch from them, you will have a ref in your system, origin slash main. Now, it counts as a branch, a special branch, a remote branch. Um you you can't edit it. If you want to check out um main, you can check out main, and if you didn't already have your own actual local branch, it sees that there is origin main and says, Oh, you meant that. I'll make you one. And it makes a brand new branch named main that starts at um origin slash main, the r the commit ID that that refers to. And it does this magic thing that it uses metadata, which it stores in the config file for your repo, for this repo, that says this branch pushes and pulls to this remote branch. Um now they're tied together.
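To see those remote refs and the "this branch pushes and pulls to that remote branch" metadata, here is a sketch that clones one local scratch repository from another. The directory names upstream and mine are made up, and a local path stands in for a hosted URL.

```shell
set -e
work=$(mktemp -d)

# A stand-in for the canonical repo, with main and dev branches.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/main      # name the first branch main on any Git version
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"
echo "hello" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git branch dev

# Clone it. The place we cloned from is remembered as the remote "origin".
git clone -q "$work/upstream" "$work/mine"
cd "$work/mine"
remote_name=$(git remote)
git branch -r                              # remote branches: origin/main, origin/dev

# The local main was created from origin/main and is tied to it.
main_upstream=$(git rev-parse --abbrev-ref 'main@{upstream}')
tracking_remote=$(git config branch.main.remote)

# No local dev exists yet; git sees origin/dev and makes one that tracks it.
git switch -q dev
dev_upstream=$(git rev-parse --abbrev-ref 'dev@{upstream}')
```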
Jim:That remote might be GitHub or, what are the others, GitLab or Bitbucket. Or you might just run your own repo served up through SSH or HTTP. But yeah, when people talk about cloning from GitHub or pushing to GitHub, that's what they're talking about.
Wolf:They are. Now Git itself doesn't enforce this model. It doesn't have a model, like I said. So it could be that you and I are connected over SSH, and I happen to have an IP address that leads to you, or maybe a name or something, and I know the path that leads to the Git project that you and I are both working on. Often the thing that's up on the server has no files. It's essentially just what's in the .git directory, it's just the object database and some other things. But there's no place to put actual source files. That's called a bare checkout. Minus minus bare. If you and I are both working on a project and it's not bare, we've got actual files, that doesn't matter. I could still clone your repo. You would be my upstream.
Jim:I'd be your canonical repo and you're pulling from me. Yeah, you're cloning from the remote.
Wolf:Or we sure could add remotes where we're both looking at each other. Are these good ideas? Probably not. This canonical thing on GitHub works pretty good and is almost certainly what you want in most cases. Yeah, there's funny cases, but okay, whatever.
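A sketch of the bare setup, assuming local directories in place of a server: one bare repository that holds only the object database, and two working clones named alice and bob (both names invented) that push to and clone from it.

```shell
set -e
work=$(mktemp -d)

# The shared repo: no working files, just what would live inside .git.
git init -q --bare "$work/shared.git"
git -C "$work/shared.git" symbolic-ref HEAD refs/heads/main
is_bare=$(git -C "$work/shared.git" rev-parse --is-bare-repository)

# First developer: a normal repo that treats the bare one as origin.
git init -q "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/main      # name the first branch main on any Git version
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"
echo "shared notes" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git remote add origin "$work/shared.git"
git push -q -u origin main                 # -u ties local main to origin/main

# Second developer: cloning the bare repo gives a full working copy.
git clone -q "$work/shared.git" "$work/bob"
bob_notes=$(cat "$work/bob/notes.txt")
```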
Jim:Now, I mentioned that for my client, I use Git to manage their release. I actually have a Git repo checked out on their server in production. And when I'm gonna do some cherry picking, I might have a hot fix and I want to make sure that I can do this cherry pick without a problem, without a conflict or something. They've already checked out or cloned from GitHub, so what I'll do is I SSH to their machine and I get into the temp directory and I do a clone from their production tree on that disk to the temp directory. And then I apply my hot fix to that version in temp just to make sure it applies cleanly. And if it does, then I go into the production directory and I do the cherry pick there. It's just my little safety thing so that I don't break production by installing something that had a merge conflict or some other disastrous thing. That seems pretty awesome. Well, you taught me that.
Wolf:Uh I guess uh pushing and pulling, I'll just say one last thing about that. Yeah. This is another place where, just like switching branches, it's another place where your loose files that haven't been added or committed yet can get in the way, can obstruct you. This is another place to use the stash. Yeah.
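Using the stash to get loose changes out of the way might look like this sketch. The file and the stash message are made up, and the comment in the middle marks where the switch or pull would go.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "half-finished idea" >> notes.txt     # a loose, uncommitted change
before=$(git status --porcelain)

# Set the change aside; the working copy is clean again.
git stash push -q -m "WIP on notes"
during=$(git status --porcelain)
git stash list

# ... this is where you would git switch or git pull ...

# Bring the change back and drop it from the stash.
git stash pop -q
after=$(git status --porcelain)
```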
Jim:Yeah. You know what we we didn't talk about, and I think we kind of need to, is I don't even have it on our outline, and that is uh merge conflicts. Oh, we yeah, we do have that. I'm sorry. We're gonna talk about merge conflicts in a couple of minutes, and Wolf's got some interesting things to say about that. Uh, but why don't why don't we go into let's start talking about merging and and that kind of fun. Okay.
Wolf:So even if you don't have multiple branches, but you are working with other people, in a way, you do have multiple branches. There is this idea of combining code that took two different paths. So an example is if you cherry pick a commit from someplace else, or if you have two branches and you're ready to incorporate the changes from one branch into the other, or if you have been working on main or on develop or whatever, and you pull from the remote and the remote has other people's work that is also on that same branch. That's like having two branches, their work and your work. And now you have to combine that work. This is a fundamental problem in every single source code control system. And there is a thing that happens, and not only is it universal, but it is also the worst and most annoying problem that every source code control system has, and the worst thing about it is it requires human knowledge. And this problem is merge conflicts. In a merge, a good source code control system, and there are many, says, well, you both changed this file, but you only changed line 10. And this code you're trying to merge into it only changed line 50. I'll just take them both. No problem. That might not actually be no problem, because maybe your line says don't run at all, and their line says something that's logically different, but in a totally different place. Well, that merges. You know, the thing won't run, but you got what you got.
Jim:The the the bigger problem for me is uh you changed line 10 and I changed line eight through twelve. So we both changed overlapping space, and uh uh I I do my uh merge and then you try to you do yours, and and there you get the merge conflict. Uh it doesn't know what to do because we both changed the same lines.
Wolf:So the thing that you have to do when you get a merge conflict is this. Your goal is one file that does the thing you want, which is whatever features the one side made that are worth keeping, plus whatever features the other side made that are worth keeping. And to make that fix, to combine those things, you need knowledge of both sides. A human being, or something that can have knowledge, has to look and say, you set it to 10 and I set it to 15. Is one of those right? Oh no, it turns out we don't want that thing at all. What we want is a buffer. Or whatever. Like, it's a thing you have to do. Yeah. Right. So you have to resolve a merge conflict that way. But this leads to a couple of things that are absolutely worth saying. There is a category of, I don't know, what's the exact opposite of team player? There's a thing you can do. You can merge someone else's changes, but when you actually get them, essentially just throw them away. Or don't even merge. Just say, oh no, I'm right, and push yours up to the server, completely replacing theirs. We call that a force push. What you're saying is, oh, forget everybody else's work. I mean, that's the whole thing. That's what you're saying. Yeah. Forget everybody else's work. First of all, if you're the kind of person who is saying that, probably your work was already not good enough to substitute for their work. So you're probably not a good programmer. It says more about you than it does about the code, right? Right. And number two is, yeah, you're not working for the team. You're not doing the right thing. So force pushing, usually bad, especially in a team scenario. And merge conflicts apply to every source code control system. They are the hardest problem to fix, and, you know, maybe you know enough to fix it yourself, or maybe you need the other person to come and help you. It can be annoying. So that's merge conflicts, and it happens all the time. 
It happens when you pull the branch you're on and somebody else did work. It happens when you're doing an actual merge, putting develop into main. It happens when you are cherry picking. I just want this bug fix that I put as a hotfix for that release back in March, and you're bringing that to the tip of main. Uh, it happens um when you do a thing called rebasing, which I'm not sure we're gonna talk about, but it's uh a special kind of combining. Um Anyway, I think that's everything we need to say about combining.
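Here is a small, hedged reproduction of the "you set it to 10 and I set it to 15" conflict. The file settings.conf, the branch their-work, and the final decision to keep 15 are all invented; the point is only that the merge stops, a person picks the result, and then the merge is committed.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the first branch main on any Git version
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo "timeout = 5" > settings.conf
git add settings.conf
git commit -q -m "Base"

# Their side changes the line one way...
git switch -q -c their-work
echo "timeout = 15" > settings.conf
git commit -q -am "Raise timeout to 15"

# ...and our side changes the same line another way.
git switch -q main
echo "timeout = 10" > settings.conf
git commit -q -am "Raise timeout to 10"

# The merge cannot pick a winner, so it stops and reports a conflict.
merge_status=clean
git merge their-work || merge_status=conflict
conflicted=$(git diff --name-only --diff-filter=U)
cat settings.conf                          # both versions, between conflict markers

# A human decides what the file should say, then finishes the merge.
echo "timeout = 15" > settings.conf
git add settings.conf
git commit -q -m "Merge their-work: settle on timeout = 15"
```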
Jim:Yeah. Okay, so we've got a couple of advanced topics. We're already an hour and twenty minutes into this, so I don't know how deep we're gonna go into these. There's a few of them. We should at least mention them. Uh you want to talk about uh those advanced things?
Wolf:Sure. Um, a thing that you're gonna hit on a team: if you're using a full-featured web UI system to manage your canonical repo, then when somebody says, my change is ready to be released, I want it to go onto main, um, they're not just gonna merge it to main, because who the hell trusts them? They're gonna make a thing that, depending on what system you're using, is called either a pull request or a merge request or something, but it's the same thing. The web UI understands that this is going to be a merge and that some criteria have to be met. Usually your team has to approve it, has to go over it, and you need at least two of the other engineers to say, yeah, this is good enough. And the web UI, looking at the canonical repo, has to be able to tell that there won't be any conflicts. There's a couple ways it can tell that. And sometimes the tests have to pass, whatever. It's not a git function, it's a function of GitHub or GitLab or Bitbucket or whatever. Pull requests are not git.
Jim:It's GitHub or or whoever's managing your Git repo. Exactly.
Wolf:Okay. Uh, another thing, and this is a superpower, no question. If you have managed your commits correctly, um, and you've done the thing I said where each commit is a full, complete, contained thing, and you've worked hard to make them be a straight line, but now you've suddenly noticed a problem, a bug. Something isn't right. And you know for sure it's wrong here in the latest version, but you know for sure it was right in March. Yeah. You can do a thing called git bisect. And what git bisect does, with varying degrees of success, and you can automate it and you can make it run tests by itself and whatever, but at its base, it does this. You tell it where to start. What is the bad commit? That's the earliest one you know of that has the wrong behavior. And you tell it, what is the good commit? That's the very most recent commit you know was right, and this is gonna be older than the bad one. If it's not, that means the problem's already fixed, so forget it. Uh, so March is the good commit, and now is the bad commit.
Jim:And once you've told it that, well, you can say what you're saying is it worked in March and now it doesn't, and I want to figure out which commit caused it to not work.
Wolf:That is the problem. That is what we're gonna find the answer to. And so it's git bisect start, uh, I think, and then you say git bisect good and git bisect bad on the ones that meet that criteria, and you can actually give those different names if that's not the thing that's gonna make it easy for you. If, for instance, it's, you know, does logging or has a red board or whatever. You can make the command be what you want. Um, git starts up and git finds the middle of that list and checks that out.
Jim:Right.
Wolf:Um, and if you have set it to run a test script, it'll run that test script on the thing it just checked out. Um, and if it doesn't need human input to decide whether that was right or wrong, it'll know that it was right or wrong from the test script, and it will then take the half that is left. Well, that one was good, so the bad one must be newer. So it's now between that one and the bad one. Yeah, it's doing a binary search through your commits. That's exactly what it's doing. That's so cool. If you don't write a test for it and make it automatic, then yeah, you have to look and you have to say, oh, this one's good or this one's bad. Um, and that'll narrow its focus to the range where the bad commit must be. Uh, but the instant you decide whether it's good or bad, and you say so, git bisect good, it instantly finds the next commit to check out and does it, and you're set to test again. A hundred commits could be wrong in this range, and it takes, you know, maybe six tests or something to find the bad one. And when it has found the bad one, it leaves you there and prints out the information, the commit ID that was wrong, and you can look at the diff. And by the way, if you made your commits the way I said, that commit is probably going to be fairly small, and whatever is wrong, it's probably gonna stand out in that diff. Um, bisect is a reason to make good commits.
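The binary search described here can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: history is a straight line, ordered oldest to newest, the first entry is known good and the last is known bad. The function name `find_first_bad` and the `is_bad` callback (standing in for your test script or your own judgment) are hypothetical, and real `git bisect` handles branching history as well.

```python
def find_first_bad(commits, is_bad):
    """Return (first_bad_commit, tests_run) for a linear history.

    commits: list ordered oldest to newest; commits[0] is known good
             and commits[-1] is known bad.
    is_bad:  callable that checks one commit and returns True if broken.
    """
    good, bad = 0, len(commits) - 1
    tests_run = 0
    while bad - good > 1:
        middle = (good + bad) // 2
        tests_run += 1
        if is_bad(commits[middle]):
            bad = middle      # the culprit is at or before the middle
        else:
            good = middle     # the culprit is after the middle
    return commits[bad], tests_run


if __name__ == "__main__":
    # Hypothetical history: c0 is the good commit from March, c100 is today.
    history = [f"c{i}" for i in range(101)]
    broken_since = 73  # made-up culprit for the demo
    culprit, tests = find_first_bad(
        history, lambda commit: int(commit[1:]) >= broken_since
    )
    print(culprit, "found after", tests, "tests")
```

Each test halves the remaining range, so the number of tests grows with the logarithm of the number of commits, not with the number of commits itself.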
Jim:Yep. It's one of the reasons. It's fantastic. All right, quickly, uh, you wanted to mention work trees.
Wolf:I absolutely do. Uh, so what is a work tree? Um, let me give you a sense of why you'd want one first, then I'll tell you what they are. If you're doing something to the canonical repo, um, and in one case you specifically want the release branch, and you're going to do a build, and a build takes eight hours, maybe a thing for you to do is just clone it twice and set one clone to the release branch and one clone to the development branch, make the release branch start doing its build, and you just keep working. Maybe that's what you want to do. But a modern alternative is to make a work tree. And what a work tree is, is it's a real physical working directory, just like a clone would be. Except instead of having a, um, .git directory with the object repository in it, which by the way is at least half the total size of your complete working copy. Half of it is the files you have checked out and half of it is the object directory. Instead of a .git directory, it has a .git file, which is, uh, sort of like a symbolic link. It's a text file that has a path back to the original repo. They share. Um, if you make a change in some branch, if you pull or fetch, whatever you do, anywhere in any of those, the thing that gets updated is the objects associated with the, uh, principal clone. And the working directory just automatically gets all that stuff, but it's half the size or smaller. Um, there are some restrictions. Let's count the original clone as a work tree for a second. Of all the work trees, only one of them can have a specific branch checked out at a time. You can make a work tree check out any branch you want, but among that set, there can never be a case where one of those is checked out to the same branch as another. Um, but a great thing about work trees for me is, because I'm working in Python, there are virtual environments.
In a work tree, I don't have to rebuild the virtual environment every time I switch branches if that branch introduces new version requirements. Um, that's all I'm gonna say about work trees at the moment. It can help you. It's a really good thing, it's worth looking up. Um, there is one last advanced topic I want to talk about.
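To make the ".git file instead of a .git directory" point concrete, here is a small Python sketch that follows that pointer. In a linked work tree the `.git` file holds a single line starting with `gitdir: ` followed by a path into the main repository. The function name `resolve_git_dir` and the paths in the demo are made up for illustration.

```python
import tempfile
from pathlib import Path


def resolve_git_dir(working_dir):
    """Return the directory holding Git's data for a working directory."""
    working_dir = Path(working_dir)
    dot_git = working_dir / ".git"
    if dot_git.is_dir():
        # An ordinary clone: the object store lives right here.
        return dot_git
    # A linked work tree: .git is a small text file pointing elsewhere.
    text = dot_git.read_text().strip()
    prefix = "gitdir: "
    if not text.startswith(prefix):
        raise ValueError(f"unexpected .git file contents: {text!r}")
    target = Path(text[len(prefix):])
    # Treat a relative pointer as relative to the working directory.
    return target if target.is_absolute() else working_dir / target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        release = Path(tmp) / "release"
        release.mkdir()
        # Hypothetical pointer, shaped like the one Git writes.
        (release / ".git").write_text(
            "gitdir: /repos/project/.git/worktrees/release\n"
        )
        print(resolve_git_dir(release))
```

Because every work tree points back at the same object store, a fetch in one of them is visible from all of them.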
Jim:This is what I like. I like it a lot.
Wolf:Um, I'm all about reducing friction. Um, and there's lots of points of friction. You have to remember to run the tests, you have to remember to type check your code, you have to remember, um, to run the linter to see if it's all formatted. You have to remember to check the spelling everywhere. You have to remember to look for white space at the end of all the lines or extra trailing new lines. You know, a file should end with exactly one. Well, um, you can install pre-commit in any Git repo and say what checks it needs to run, and those things become part of a list of things that happen automatically, without you, um, doing anything, before a commit will be accepted. So you add all the stuff you want, and you say git commit, blah blah blah, and pre-commit takes over and says, guess what? You had too many, um, trailing new lines, um, you can't commit. Not you can't commit this file. You can't commit. Um, everything is refused. Um, there are behaviors you can use to make the sequence of steps you have to take shorter. Like, for instance, uh, you can ask pre-commit to run by itself without actually being part of the commit, and then add everything once you know it already passes. Um, but pre-commit, super useful. I always use it. You have good experiences, Jim?
Jim:Uh yeah, I've only been using it for maybe six weeks. Um, and what it does, you do a git add to add your file. Let's say you're just working on one file. You do a git add, and then you go do git commit, and it runs these tests. Uh, and in my case, it might find, uh, trailing white space. That's one of those things that really annoys me, leaving trailing white space. So it not only finds the trailing white space, but it modifies your file and removes the trailing white space. So now you're left there with something in the staging area, uh, ready to commit, and the same file in your current working directory that's been modified, it's had the trailing spaces removed. So you have to git add that again. So now you've done two adds on the same file, and then you can do your git commit and it'll run the tests again. Uh, but this time it'll succeed because there is no more trailing white space, uh, on the lines. Um, and I like it.
Wolf:And there are ways you can shorten that up. If you run it before you do the adds, if you say pre-commit.
Jim:Yeah, I should do that.
Wolf:Then you only have to add once. But, um, if it's the case that, um, your adds are special, you didn't want to commit the whole thing, you were gonna use add interactive, um, then the way Jim is doing it is the right way.
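The fix-then-refuse behavior Jim describes can be sketched in Python. This is an illustration only, not pre-commit's code: `fix_whitespace` and `run_hook` are hypothetical names, and files are modeled as a plain dictionary of name to contents instead of a real staging area.

```python
def fix_whitespace(text):
    """Strip trailing white space and end the file with exactly one newline."""
    lines = [line.rstrip() for line in text.split("\n")]
    # Drop blank lines at the end of the file.
    while lines and lines[-1] == "":
        lines.pop()
    return "\n".join(lines) + "\n" if lines else ""


def run_hook(staged_files):
    """Check every file and return (exit_code, fixed_files).

    A non-zero exit code means "refuse the commit": something had to be
    fixed, so the fixed files need to be added again before retrying.
    """
    fixed = {name: fix_whitespace(body) for name, body in staged_files.items()}
    changed = [name for name in staged_files if fixed[name] != staged_files[name]]
    return (1 if changed else 0), fixed


if __name__ == "__main__":
    first_try, fixed = run_hook({"app.py": "x = 1   \ny = 2\n\n\n"})
    print("first commit attempt, exit code:", first_try)
    second_try, _ = run_hook(fixed)  # after adding the fixed file again
    print("second commit attempt, exit code:", second_try)
```

The second attempt passes because the hook has nothing left to change, which matches the add, commit, add again, commit again sequence described above.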
Jim:And it's uh and a lot of these tools. I'm scratching the surface of what it can do. For me, it's primarily uh removing trailing white space. I need to put some tests in there. We we are not a test-driven shop. Uh we should be, and this is our chance to put some tests in there to make sure that the code passes some level of tests before you actually commit it. But it's it's way cooler. Yeah. All right, man, we've been talking for an hour and 30 minutes or so. Um we uh I think we've covered all the topics I wanted to cover. One final thing is well, Git is an awesome tool. Uh tell us about some of the alternatives to Git.
Wolf:Um, well, first of all, um there that just like anything else, there's a lot of people for whom Git is a religion. Um it's not a tool, it's a um, I don't even know what to call it, a way of life. I happen to be knowledgeable about Git, but it's just a tool to me. If something better comes along, I'm I'm gonna use that. Um, just like I am with everything else, as you well know and kind of despise.
Jim:And I and I tend to lag behind quite a bit.
Wolf:Uh, so in the old days we used things like CVS, uh, which was, uh, lacking in a lot of ways. I'm not gonna go into them. Perforce, which is still in use today because Perforce can do some really important things that Git can't do, doesn't want to do, or isn't convenient for. Perforce is great with big files and binary files and things like that. Subversion, uh, was a big deal. Um, one of the things that Subversion brought into the world was this idea of easy branching. Um, now the way Git and Subversion branch is very different, um, and Git's way I feel like is better, but Subversion was a huge jump. Uh, there's Mercurial, which was invented right around the same time as Git. Uh Mercurial is for the same reason because it's by the way. Yeah, for the same reason. Uh, Mercurial is written entirely in Python, so if you wanted to do something, super easy to fix and make it do that thing. Uh, the people who wrote Mercurial, I don't know. But the places where I encounter this, um, the people who are on the Mercurial side, and as far as I know, none of these are creators, um, are angry that Git somehow is the winner. Um, and winner is not a good word. Git has the most momentum at this moment.
Jim:But new things businesses have launched around it.
Wolf:Yeah, that is true. Uh, new things are coming around. Um, I'm gonna name some that seem like they could be something, I don't know, and some that probably won't. Um, there's this thing, um, I'm just gonna start at the top, Jujutsu. Um, and the thing about Jujutsu is there are hard things about git. The guy who wrote Jujutsu started with the idea that, okay, git. Yeah. But not hard. Let's do git, but not hard. So it works on top of a git repo, and my understanding is that it's possible, although maybe not implemented, to work on top of others. But things that he thought in the model were too hard, he changed the way it looks to a user. But apparently with Jujutsu, I could be using git, the canonical repo could be a git repo, and locally you could be using Jujutsu, and I would never know. That's the promise. But in git, you stage, you build, you craft your commit, and then you make it. In Jujutsu, you're always at a commit. You can undo things, um, but you only move on to the next commit when you say, oh, and by the way, um, yeah, that commit's complete. But there's no adding. Everything is already a commit. It just takes saying that you're moving on to be that thing. Um, so Jujutsu is worth exploring. It's interesting. And on, uh, Mastodon, I see people that I consider to be super smart people, and they are raving about Jujutsu. And it is not jujitsu, it is jujutsu. J-U-J-U-T-S-U. T-S-U. That's right. Um, of course, there's Fossil from the SQLite guy. I never hear about Fossil. I don't know if anybody is using it but him. It may be just him. There is Sapling. I don't know anything about Sapling at all. Um, so can't tell ya. And there's, uh, I don't even know how to say this, but Pijul, P-I-J-U-L. I've heard about that. I hear some good things, I don't see real broad acceptance, but it's new. It has learned things from Git. It might be interesting. Um, there's those four things. Uh, and it's just like anything. Like languages, like cars, like everything.
Um Git is constantly changing. New things get added, old things get deprecated. Um learn it. Keep learning it. Um, and if something better comes along and the cost to switch fits into the ROI equation. Yep. Then then maybe it's time to switch. Yeah.
Jim:Alright, so that's a long, uh uh great episode. Uh let's let's wrap this up. What do you think the takeaways are from this episode?
Wolf:I think there's only one takeaway. And that takeaway is Git gives you a place where your work is safe and you have room, because it's safe, and tools, to experiment. Um, that's what Git is all about. The time machine that can take you back to before the experiment failed, the, um, reproducible releases, the easy way to find fixes, the safety. Um, that's what git is about. There is a thing we should mention. Yes. And that is, um, I want to say our friend, Julia Evans. I don't think I've earned the right to say our friend, because I don't think she knows who I am from Adam. But, uh, Julia Evans, uh, from Wizard Zines, um, a super, super smart, uh, computer professional who knows lots of stuff about lots of stuff and teaches it to other people, which means she knows it even better than you thought she knew it, because she knows it well enough to teach it. And she teaches it with these hand-drawn, illustrated, what she calls zines, um, that not only tell you what you need to know, but tell you in a way you can, uh, comprehend it and move forward. Um, I have every single one of her zines. I have learned stuff from every one. Um, man, she's great.
Jim:Yeah, they're fantastic. We will include a link to her site and specifically the the Git zine in our show notes.
Wolf:There's actually two. Uh, the specific one about git is called How Git Works. Everybody should have this zine. How Git Works is great. And then she has an older one, um, which she might have done in collaboration. Uh, it is called, uh, and I'm gonna just say the title, even though it's slightly, uh, expletive, Oh shit, git. Um, and it talks about when bad things happen and what to do about it.
Jim:And I know recently she's taken on a project of of uh uh updating the Git man pages. She's working quite a bit on that. So yeah, she knows this stuff and you can learn from her. Uh so check that out. Like I said, we'll include a link in the show notes. Uh so that takes us to the end. I want to thank everybody who's listened, uh, all of our friends out there. Uh and Wolf, thank you. This is uh some great, great information. You've been a mentor to me uh for this and many other things. And uh uh I think we've had a chance today to uh share some of that knowledge, uh things that I learned uh from the master. So thanks everyone for listening.
Wolf:You know, you you mentioned that Git was invented in 2005. Yeah. I I'm not a hundred percent sure that I started using Git in 2006. Yeah, earlier. It might have been 2007. Yeah. Uh but I've been using it for a while.
Jim:A lot longer than me. Anyway, thank you for, uh, sharing that knowledge with us. Uh, as always, uh, we have show notes at the end that include, uh, links, uh, this time to Julia Evans' work and, uh, links to our webpage. And, uh, if you want to send us feedback, we would love to have your feedback. Uh, send it to feedback@RuntimeArguments.fm. Uh, you can contact us on Mastodon. All that information is in the show notes. So in your podcast player, take a look at the show notes. You'll see all that kind of stuff. So, uh, thank you.
Wolf:Don't forget to mention the transcript.
Jim:Uh yes, we're doing transcripts now. I have uh done transcripts for all the past episodes. Um, some of them aren't that great because the tool we're using isn't perfect, but it makes us Googleable, which is good. Um uh so yeah. So thanks everybody. Um look forward to uh uh doing this again in a couple of weeks, and uh tell all your friends how much you enjoyed this podcast. So thanks, Wolf.
Wolf:Uh, my pleasure. This is fun. Bye, everybody.