Picture Me Coding

Standing on the Shoulders of Giants: Edsger Dijkstra

Erik Aker and Mike Mull Season 2 Episode 31

This week we talked about the greatest philosopher of our field: Edsger Dijkstra. Most software engineers will immediately recall Dijkstra's Algorithm for finding the shortest path between two nodes in a graph, but Dijkstra's work spans five decades and a wide range of topics, from code quality and complexity to concurrent programming and programming languages.

In this episode, we talked specifically about several of his works, including the original paper where Dijkstra first published what we now call "Dijkstra's algorithm."

This episode is part I of our Dijkstra discussion. We'll be back next week for more.


[MUSIC] >> Hello, welcome to Picture Me Coding with Erik Aker and Mike Mull.

Hi, Mike.

>> Hey there.

>> Mike, how's your life?

>> It is completely sufficient.

>> Oh, that's great.

Great to hear.

>> Yeah, that's pretty good for me.

>> That's pretty good.

>> Every week we start by talking about music.

What music do you want to tell me about?

>> So the thing I've probably been listening to the most is the latest solo album from Adrianne Lenker called Bright Future.

She's the front person of the band Big Thief, but she also has done quite a few solo records over the last decade or so.

I am not always a big fan of her solo work.

Sometimes it's maybe a little too introspective for me, but I like this one a lot.

It's kind of folky and country-ish and from what I understand, it was recorded with a group of friends in live sessions.

So not a lot of preparation or post-production.

I like the sound of it.

Interesting harmonies and she has an interesting way with word play sometimes that I enjoy.

It probably would have been my favorite album last month if it wasn't for the Waxahatchee album that came out on the exact same day.

I've been listening to this a lot.

>> I'm not that excited about the Adrianne Lenker stuff.

I like Big Thief.

>> So you've said, yeah, fair enough.

>> It's okay.

Not here to throw rocks.

I've been listening to a band called Grinn from Berlin, and they have an album called Hush.

I told you about this.

I think you might have heard it.

There's something really surprising about them.

They're dark and heavy and they have these songs that evolve in surprising ways.

They have a few really good songs in there I think you'd like it.

There's one that isn't just growling but has some singing on it.

>> That is a super heavy song.

It's only a minute and 20 seconds long.

It's called Midnight Blue Sorrow.

So you can try it out and you only have to sit through a minute and 20 seconds if it doesn't hit for you then it's over quickly.

But it's diverse and it's fun, very dark.

>> I have actually listened to it after you recommended it to me.

I do like it.

The music is not uniformly the same from song to song.

So some of it sounds very traditionally doomy.

Other parts are almost death metal-y.

>> Yeah.

It's got some surprises in it.

I like it.

>> Yeah.

I enjoyed it too.

>> Mike, this week you wanted to talk about Edsger Dijkstra.

>> Dijkstra, one of the greatest philosophers of our field.

My response is yes, yes, yes, I want to talk about Dijkstra, but can I have a year to prepare?

>> Yeah.

This is, I guess, the third episode in our Standing on the Shoulders of Giants series.

Yeah.

Dijkstra is definitely one of the giants of the field.

I think as you imply he's hard to cover because his work is so diverse and expansive.

A lot of the stuff that he did was foundational.

You read some of these papers and it's hard to imagine a time when people hadn't already thought of this stuff.

But yeah, definitely an important character in our field and also he's a personality that I enjoy.

He's highly opinionated and a little bit curmudgeonly and not at all unwilling to offer his very strong opinions on things.

>> Yeah.

I think Dijkstra is one of those characters where the quality of his thinking is so solid.

You read this stuff and you think, wow, this is still incredibly relevant today.

And he also seems like someone, Lamport's kind of like this too, where he has these personal attributes that guide his methods of thinking.

I look at it and think, wow, I can never live up to this.

The way in which he thinks about programming is on another level; the way he thinks about proving things and the way he thinks about mathematical induction and the way he thinks about how this relates to the day-to-day act of programming is so far beyond what I do on a day-to-day basis.

I find his work really inspiring and I'm in awe of it.

And there's still so much to learn from.

>> Yeah, and I think he writes quite a lot better than the typical computer scientist.

I don't always understand what he's saying, but he seems to write pretty elegantly.

I do think you're right that he probably influenced Lamport quite a bit, at least in terms of the sort of rigorous mathematical approach they try to take to software.

>> Mm-hm.

Well, they were both trained in math. A little bit of bio about Dijkstra:

He was born in 1930.

He died in 2002, and taught for a long time at the University of Texas at Austin.

Originally comes from the Netherlands.

He originally studied mathematics and physics and then started working in computing, and because of that mathematics background he really approaches computer science from the perspective of mathematics.

We wanted to talk today about some of his papers, at least one of his books, and then he's got a whole lot of essays and opinions.

I read somewhere that he didn't like to use a computer to type these out.

He would handwrite them or use a typewriter.

He was really resistant to using a computer, which seems pretty funny.

Even some of these PDFs that we have collected for this episode are handwritten.

>> All right, Mike.

I want to talk to you about the book, Structured Programming by Dijkstra.

This is actually one of my favorite books.

I like to read it even now.

I think it's relevant to what we do.

It's very interested in the burden of complexity on the brains of us programmers.

I have a clip from one of Dijkstra's lectures that I pulled for you.

I'm going to play it for you.

Here it goes.

>> Welcome to this talk on reasoning about programs.

We all know that machines are so fast and storage is so big that they give us plenty of latitude to screw things up.

In order to prevent that, we have to prove that our programs indeed produce the results we would like them to produce.

>> Prove that our programs produce the results we would like them to produce.

It's a good line.

>> I got to say that he's legendary at the pause.

>> Yeah, I like that.

It's nice to think that people as brilliant as he also have deep pauses in their speaking.

>> He was notorious for this, for lecturing in his classes this way.

He would extemporaneously come up with a lecture and pause to really think about what he was doing.

I wanted to start with this quote because he's saying computers have gotten so big in their capacities that it's incumbent upon us to prove that our programs do what we think they do.

Structured Programming talks about this, and a later book talks about this as well.

Structured Programming says: what I'm concerned about is the composition of large programs.

Structured Programming is from the '60s.

How big were their programs at the time?

I don't know.

Obviously now we have some pretty fabulously large programs.

He's interested in scaling programs.

How big can they get?

How can we keep them manageable?

In a later book, he had this really interesting quote I wanted to give to you.

He writes, "In computer programming, our basic building block, the instruction takes less than a microsecond, but our program may require hours of computation time."

Earlier he has this analogy about a grain, which is like an order of magnitude thing.

He says, "I do not know of any other technology than programming that is invited to cover a grain ratio of 10 to the power of 10 or more."

So going from a microsecond to hours of computation is a massive scale, massive order of magnitude difference.
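
A quick back-of-the-envelope check of that ratio, assuming "hours" means roughly three hours of wall-clock time:

```latex
% One instruction at ~1 microsecond versus a computation lasting a few hours:
\frac{3\ \text{hours}}{1\ \mu\text{s}}
  = \frac{3 \times 3600\ \text{s}}{10^{-6}\ \text{s}}
  \approx 1.1 \times 10^{10}
```

So a sub-microsecond building block versus an hours-long computation really does span about ten orders of magnitude.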

The automatic computer, by virtue of its fantastic speed, was the first to provide an environment with enough, quote, "room" for highly hierarchical artifacts.

And in this respect, the challenge of the programming task seems indeed without precedent.

For anyone interested in the human ability to think difficult thoughts by avoiding unmastered complexity, the programming task provides an ideal proving ground.

So I wanted to give you these two quotes because on the one hand, we're talking about scale, and maybe orders of magnitude in time, and maybe very large programs.

On the other hand, we're talking about complexity.

And Dijkstra's suggestion is we have to master complexity, we have to demonstrate to ourselves that our programs are correct, and it gets very hard.

And he says programming is different from other technologies because of these order of magnitude differences.

What do you think?

Yeah, I think it's an interesting way to think about the problems in this profession.

He's probably most famous for his "Go To Statement Considered Harmful" note, which again is sort of an aspect of structured programming.

Yeah, I understand "Go To Considered Harmful" within the context of structured programming.

Like it makes sense to me as a critique, as a subset of the critique in structured programming.

Yeah, I have some difficulty actually following what he's trying to say in that note.

But he has this idea of if you want to follow the sort of state of a program, you can build up this kind of index of where you are in the execution of the program when it's running.

And if you insert go-to's in there, it becomes very difficult to keep track of that index.

So I like what this quote is saying, especially about the differences in scales.

And I remember another line in there where he talks about how there are some tasks that we program where it would be literally impossible to test every possible option.

It would take thousands of years to run through every computation.

You can't, you have to think about that in a pretty rigorous way because you may never execute lots of the potential options that that thing could possibly execute, but you have to rely on the idea that if you did try to execute them, it would actually work.

So we start with this idea of human limitations.

He has a later talk called "Programming Considered as a Human Activity."

So he's interested in human limitations.

How is it possible for humans to comprehend the programs that they're writing and use this tool, a computer, which has this vast order of magnitude capability?

So in the first part of Structured Programming, you just mentioned this, he talks about human reasoning.

And he says, I'm going to talk about three of the most common ways to reason about our programs.

One would be to enumerate all possible states: if you look at the domain and range of your program and think about every possible state it could ever achieve, you could try to enumerate those.

That's probably impossible.

At least for most interesting programs, right?

The next thing you could do is you could use mathematical induction.

Like you could write out a formal proof for every program you write.

Are you going to do that?

No, you're probably not going to do that.

It's very long.

It's very involved.

It's not practical.

It's not pragmatic.

The last way to reason about our programs is purely through abstraction, informal reasoning.

You just kind of put together the pieces of what's possible, what's an invariant, and then you say, well, if these are my invariants, then we should be good to produce the result we're seeking.

This is like a Goldilocks problem.

Enumerating all possible states is way too much.

We'll never get through them all.

Induction is way too long.

Informal reasoning, just right.

That's our Goldilocks answer.
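
To make the invariant idea concrete, here is a small, hypothetical Python example (ours, not Dijkstra's): the comment states a property that holds on every pass through the loop, which is the kind of informal argument being described.

```python
def max_value(items):
    """Return the largest element of a non-empty list."""
    best = items[0]
    # Invariant: before each iteration, `best` is the maximum of the items
    # examined so far. If that holds every time around the loop, then when
    # the loop ends, `best` is the maximum of the whole list.
    for x in items[1:]:
        if x > best:
            best = x
    return best
```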

Then he says, okay, how can we assist our informal reasoning in our programs?

This is where structured programming is still totally relevant today.

Written in 1968, it's 88 pages.

You can read it now and learn a lot from it, and to most contemporary programmers it will feel completely intuitive, completely right on.

The thing that fascinated me, and I think this is true in the "Go To Considered Harmful" paper as well, is that what Dijkstra's suggesting, ultimately, is: let's not use all the tools possible.

Let's artificially restrict the tools we're going to use in order to assist our ability to reason about our programs.

Let's keep it simple.

What is that simplification?

Well, in a program, he's saying we should really just think about the sequence of instructions.

This happens, then that happens.

We should think about branching, if then else, and we should think about iteration, looping.

The order in which things happen, the branches within the program, and the loops iterating over collections in the program.

Also, we should use subroutines.

The contemporary recommendation is you should write a lot of functions and you should call them in a really obvious order.

If/then/else should be pretty obvious, and then you should iterate.

Judiciously.
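
As a rough illustration (our own toy example, not anything from the book), here is a small Python program built only from those pieces: sequencing, branching, iteration, and a call to a subroutine, each with a single entry and a single exit.

```python
def is_prime(n):
    """Subroutine: True if n is prime (single entry, single exit)."""
    result = n >= 2                     # sequence: set up a result
    d = 2
    while result and d * d <= n:        # iteration: a loop
        if n % d == 0:                  # branching: if
            result = False
        d = d + 1
    return result                       # one exit point

def primes_below(limit):
    """A sequence of steps, a loop, a branch, and a subroutine call."""
    found = []
    for n in range(2, limit):           # iteration
        if is_prime(n):                 # branch + subroutine call
            found.append(n)
    return found

print(primes_below(20))                 # [2, 3, 5, 7, 11, 13, 17, 19]
```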

Here's what's interesting to me.

It's an intentional restriction of the tools that you could have.

Saying GoTo is harmful is saying, "Hey, there's this thing called GoTo.

It's a programming construct.

Let's not use that.

It makes our programs hard to reason about."

When I started programming professionally, which was the late 1980s, structured programming was still pretty much the state of the art.

Not that I mean it's irrelevant now.

I just mean that at the time, there was not a lot of people doing object-oriented programming.

There weren't a lot of people doing functional programming except for a few Lisp enthusiasts.

People were writing programs in C.

People were writing programs in Fortran, languages like that, probably Algol, Pascal, things like that.

Structured programming was what you were trying to attain.

I remember there was a point in my first job when I was interviewing these people to become interns.

I had this question about structured programming that I would ask them.

It was along the lines of what's the fundamental rule of structured programming.

The answer that I was looking for was a procedure should have only one entry point and only one exit point.

The first couple of people just had no idea.

Then as the day went on, people started to get the answer right.

I was like, "Wow, these people seem to be better than the people I talked to this morning."

One of the older guys I was working with said, "You realize that they're going out in the hall and telling the next person to come in?"

The questions that you asked them, right?

It's funny.

To this day, when I see a Python program with an early return, it gets my hackles up.

Even though I do it myself all the time, there's just lots of cases where trying to structure things where you only have one return is uglier than the alternative, but it still gets my attention.
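
For comparison, here is the same toy guard written both ways in Python (a made-up example): an early return at the top versus the single-exit form the structured-programming rule would ask for.

```python
# Early-return / guard-clause style: bail out at the top.
def discount_early(price, is_member):
    if not is_member:
        return price            # early exit
    # ... imagine many more lines of member-only logic here ...
    return price * 0.9

# Single-entry, single-exit style: one return at the bottom.
def discount_single_exit(price, is_member):
    result = price
    if is_member:
        # ... member-only logic ...
        result = price * 0.9
    return result
```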

Yeah.

I wanted to talk about early returns in particular.

Structured programming really became, I think, the de facto way to write and think about programs.

People who are obsessive about code cleanliness and code quality are often really reaching for these same ideas that Dijkstra was talking about.

Go-to then becomes an example of unstructured programming.

As you pointed out, it gives you this escape hatch from your computation.

Now, you look at most contemporary programming languages.

They're written in a way where they've inherited this de facto way of thinking about programming, structured programming.

Most programming languages now, they have a call stack.

There's a hierarchy.

There's a structure to it.

That's just baked in.

That's just part of our expectation.

Think about examples of violations of structured programming, which would be unstructured programming.

For Dijkstra, it would be multiple entrances, multiple exits.

Sometimes people complain about raising an exception from within a function as an early return or multiple exit as a way to make it more confusing.

One thing that I think about a lot when I think about structured programming is the coroutine.

I don't know when I learned coroutines, but it's one of those things where probably once or twice a year, I'll be working on a problem and I'll think, "Oh, this might make sense for a coroutine.

I can just feed it an answer back in.

I could pause this computation and kick it over here, get an answer back."

And inevitably, I never ship that code.

I always find some easier way to write it.

So, early returns.

What do you think?

Now, here's what I think about early returns.

People write early returns a lot and still find that to be comprehensible.

So, I think there must be a way in which an early exit from a function is intuitive for developers.

It makes sense that in that way, it's not such a bad unstructured-programming trait.

What do you think?

In general, I think that's probably true.

I think, at least in terms of code comprehension, in most cases where people do this, it doesn't make the code that much harder to read.

There are some cases that still kind of bug me and do, I think, add to complexity.

You know, I'll see, I'll see occasionally functions where the function itself is doing quite a bit.

There might be tens or hundreds of lines in that function.

But if you look really closely, there's a little bit at the top that will check some condition.

And if it's not true, it'll return.

Now, I do that a lot.

Yeah.

I find that a little bit confusing.

And it's easy to miss, I guess, you know, that this function, in certain cases, this function is not really doing anything because you short-circuit it.

But I can sort of see, you know, back in the good old days when you were using go-tos to sort of jump out of the middle of code to some label, I can see how that would make things extremely hard to follow.

But especially if you don't have a real call stack that is getting unwound or something like that.

What about async code or threading code?

Those seem like pretty glaring neon examples of unstructured programming to me.

We're going to pause our computation.

While we play some commercials.

Yeah.

Well, I think there's no question that threading and, well, essentially any type of concurrency makes programs harder to understand.

I think it was back in the days when I was doing supercomputer stuff and working with Crays.

I think it was Seymour Cray who was adamantly against any sort of concurrency and kind of tried to work around it in his hardware designs.

But yeah, I mean, it seems to have proven itself valuable in terms of delivering certain types of performance.

But there's definitely no question that it makes things harder to understand.

Well, there's a whole push in certain areas.

There's a whole argument that has emerged for something called structured concurrency.

Does that sound familiar?

I've heard the term.

I'm not really sure what it refers to, to be honest.

It's an attempt to build a structured programming model into async code.

So there is a hierarchical call stack for async code in a more obvious way.

So cancellations can unwind to parents.

So it's almost like an async invocation is a subroutine of the parent that allows that.

You can kind of imagine that this chunk is only allowed to take up this much time.

And if I want to cancel, I will cancel up to this hierarchical parent, that type of stuff.
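
One concrete illustration of the idea, assuming Python 3.11+ and its asyncio.TaskGroup (our example, not something from Dijkstra): child tasks live inside a block, the parent waits for all of them, and if one fails the siblings are cancelled and the error propagates up to the parent scope.

```python
import asyncio

async def fetch(name, delay):
    await asyncio.sleep(delay)
    return f"{name} done"

async def main():
    # The TaskGroup block is the "parent": it doesn't finish until every
    # child task inside it has finished, and if one child raises, the
    # others are cancelled and the exception propagates to this scope.
    async with asyncio.TaskGroup() as tg:
        a = tg.create_task(fetch("a", 0.1))
        b = tg.create_task(fetch("b", 0.2))
    print(a.result(), b.result())

asyncio.run(main())
```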

I'm not sure that makes it simpler for me upon immediate consideration.

All I want to do is point out that structured programming is hugely influential, the de facto way that we think about code quality and clean code.

And if you're like me, where you're obsessively like, this code doesn't make enough sense.

Can I make it dumber, can I make it simpler?

We really have Dijkstra to thank for that.

And we have people who are inspired by Dijkstra's structured programming even today and trying to say, let's build structured concurrency.

Now, here's another element of structured programming I want to think about.

Let's say you're writing within structured programming style, you're really trying to adhere to it, single entrance, single exit.

What are ways in which your code could still be hard to comprehend, do you think?

Well, I think things we've talked about in the past that are related to complexity, just sheer code size.

And an algorithm like the one I've tried to describe on a podcast is clearly not something that's easy to follow on a first pass.

Yeah, I remember once working on a piece of code that was 6,000 lines of C that was all one procedure.

You can write completely incomprehensible code that's more or less structured.

Yeah, I thought about clean code guidelines, magic constants, magic numbers, magic strings.

You could write structured programming that makes liberal use of those and you're like, why do we have a 24 on line 840?

What does that mean?

Spaghetti code.

You could still write spaghetti code.

You could write tornado code.

This is why people reach for early returns; tornado code is where it just keeps getting indented further and further and further to the right of the screen.

Ravioli code: we're going to make liberal use of calling functions, and I could end up calling a huge number of functions.

And then here's a potentially controversial one.

Object-oriented programming.

I think we will probably end up talking a little more about this in a minute.

But I gather that Dijkstra was not a huge fan of object-oriented programming, where you've got hidden state you're potentially updating.

That sounds harder to reason about.

Yeah, I don't remember the exact quote, but I do remember a line from him about only somebody from California could have invented object-oriented programming.

I think that might be spurious.

I've seen that too.

I couldn't actually track that quote down.

So as an alternative here, if I'm writing Rust and I write the return keyword, return something semicolon in Rust, I really feel like, oh, no, I'm doing something dirty in this Rust function.

And Rust is an expression-based language, like Haskell.

So all expressions evaluate down to a value.

This seems to me like the most beautiful instance of structured programming.

Every time you see a chunk, you evaluate that chunk, you will get a value as a result of evaluating that chunk.

Early return is bad style in this case.

So you're sort of coerced into following that structured programming style more.

There's definitely an extent to which the language you are using encourages it or not.

Okay, so he is probably most famous for Dijkstra's algorithm, a graph algorithm, finding the shortest path.

It's a thing that pretty much every undergraduate computer science student will learn at some point.

And even people like me who didn't study computer science have had to learn this at some point.

It's actually a pretty surprising paper.

Do you want to talk about that first?

Sure, yeah.

The paper itself is called "A Note on Two Problems in Connexion with Graphs."

And there are, as it says, two problems discussed in the paper.

We're actually only going to talk about the second one, which is what is now called Dijkstra's algorithm or Dijkstra's shortest path algorithm.

The first problem is sort of a minimum spanning tree algorithm, which is now called Prim's algorithm or the Prim-Dijkstra algorithm, something like that.

But in any case, it's problem two that we're interested in, which is the shortest path in a graph that's either directed or undirected and has positive edge weights.

You could think of this almost like distances between cities is like a common example.

Right, yeah.

And the paper is only two and a half pages long, by the way.

If you're thinking, "Can I read this paper?"

It's very short.

Right, it's very short.

It's a little dense, but two pages for two problems, it's easy to get through.

So the shortest path algorithm is problem two.

And the way that it's discussed in the paper is not the way that people would typically implement it now.

Yeah, I didn't see anything about previous nodes.

That's what confused me, because I always thought that for Dijkstra's you have, like, "Here's my last node for this shortest path."

That's what I thought.

There's quite a few details that are left out here.

And I'll try to go over the way it's described in the paper and then I'll try to mention some of the ways that it's typically implemented now.

So in the paper, he starts out with these six, what he calls sets.

Three of them are to hold vertices of the graph, A, B, and C.

And three of them are to hold edges of the graph, which he calls one, two, and three with Roman numerals.

Maybe I'll also note that this algorithm does not work if any of the weights are negative.

So, oh, okay.

Well, we've got to throw it away then.

Sorry, sorry, sorry.

Poking holes in it.

So you're looking for the shortest path between two nodes in this graph and he calls them P and Q.

The first thing you do is you put all the nodes into set C and all of the edges into set three.

This sounds like a little bit like breadth-first search.

Yeah.

And I'll talk about that a little bit later.

But you can think of this in terms of breadth-first search as sort of a way to simplify the idea.

So we start with, we start with the start node, which is P, and we add it to set A.

And then we take, he takes all the edges that start from that node, node P.

And he adds those nodes to set B and he adds the edges to set two, Roman numeral two.

Okay, wait.

So I'm getting lost with this terminology.

I apologize for jumping in.

So I think I understand A, B, and C, but I would probably use different terms that are like breadth-first search terms.

I would probably have unvisited and known, discovered, and then visited.

That would be A, would be visited, like I went to this one, I found its edges.

B would be, I know this one because I've seen it from visiting one nearby.

And C would be, I haven't even gotten to this one.

That's pretty close to right.

I've been thinking of set A as things that we've already determined are part of the minimum path.

Well, I don't understand the sets for edges then.

If I can map the A, B, and C to my BFS implementation, why do we have three sets for edges?

Yeah, so it's a good question.

And if you look at most modern implementations now, they don't actually keep track of the edges in separate sets like this.

And the way he's implementing it, let me move on to the next step here.

So at this point, we've got the initial node in set A.

And in set B, we have the nodes that are connected to the original initial point.

And then we've in set two, we've got the edges that connect those vertices to the original point.

At that point, what we do, we look in set B and try to find the vertex, which has the shortest distance from the original point.

And then we move that node to set A and that edge that connects to that node to set one.

And so what we're left with is we still have vertices, potentially at least have vertices in B and edges in set two that were not added to the minimum path, but are still there to be considered later.

And that makes a little bit more sense when you move on to the next step, because in the first step, there's nothing in B, so it's kind of uninteresting.

But in subsequent steps, there may already be nodes in B and edges in this set two.

So when you're looking at a new node, something connected to this new node that you're considering might already be in the set B.

And so you need to connect, you need to compare the connection between this new node that connects to B and the original connection to that node in B and see which one is shorter.

And I think this is why he keeps these sets of edges, because the edges are there for him to compare newly encountered edges with edges that were previously encountered to compare which one is the longer and which one's the shorter.

Because you might end up discovering a shorter path to a known vertex.

Correct.

Yeah, that's, you know, can I ask though: if we have edge sets one, two, and three, and vertex sets A, B, and C, is it a plain mapping where A and one are going to be my answer ultimately, and B and two, like two is the edge version of the discovered nodes?

Yeah.

So when you reach the end, which in his formulation is when node Q, the one you're trying to find the path to, gets transferred to vertex set A, you're done.

That means you, you have the shortest path now.

And the shortest path is going to be determined by the nodes that are in set A and the edges that are in set one Roman numeral one.

In the way people implement it now, we typically don't keep track of the edges independently.

The edges are implicitly there based on the distances in the priority queue.

When I was reading the paper, I thought, wow, why does he have these separate sets for vertices and edges?

Because this was written in 1959 or 1958.

Maybe the way in which they would represent a graph at that time required it. Oh, it's a bit pre-Algol 60, right?

I don't know.

It's a good point.

And I think probably, I don't know, like in 1956, when he first thought this up, and in 1958 or '59, when he wrote the paper on it, it's not clear to me that things like, you know, a min-priority queue or a heap were common things that were used in programming.

So may just be that that wasn't something that he had immediate access to in his brain to use for this purpose.

I honestly don't know that.

So anyway, I forget exactly where I was with that, but, you know, essentially you repeat. We're doing a great job though.

Yeah, it's impossible to explain a graph algorithm without, you know, having a picture, which is hard to do on a podcast.

But I'm giving you a C plus on this one.

You're doing great.

Yeah, I hope I would get a passing grade, but probably not an A.

So anyway, you repeat this process and you eventually end up with the node that you're seeking in the set A, and then things are done.

The one thing that was not clear to me about this before I started looking at it is that you're not really generating one unique path from node P to node Q.

You're actually building a sort of tree that has the minimum path to every node that you encounter on the way as well.

And so, you know, to find that one unique path from P to Q, you'd have to go through the vertex and edge sets to find that particular path from Q to P.

That wasn't clear to me actually, reading this paper, that you're building a tree.

Yeah, it wasn't obvious to me either.

Once you write down a few graphs and start running through them, it becomes clear.

And one thing that happens a lot in implementations of this algorithm now is that instead of specifying node P and node Q, you just specify the starting point, and then you do this process across the whole graph.

And so, you can pick essentially any node in the graph at that point because you will have data on the shortest path to that node since it's building this tree for you.

I suspect that was probably not at all illuminating, but...

So, with Dijkstra's, I have this gut level intuitive feeling like, wow, isn't it possible to trick that algorithm?

And I know it works.

I know it's right.

I know it's proved correct.

I just said, I always look at it and go, what if I could create a pathological example that tricks it into finding not the shortest path?

There are some issues, some places I got stuck trying to trace through what he describes.

For instance, if you start off at P and the initial closest node that you find is a terminal node, the way he describes the algorithm is a little bit unclear because if you arrive at that terminal node, the description in the paper is that after you've done the first node, you find the shortest path and transfer that to the A set.

And then you start from the A set and you find the things that are connected to it and find the next shortest path.

And so, you're working on this process where at each step, you have some node that you just transferred into the A set and that's where you start from in the next step.

But if you reach a terminal node, there's nothing connected to it and so there's nothing getting transferred into the A set.

So, it helps to think of it in terms of breadth-first search.

Then the other thing that is still not entirely clear to me is you can, since these are weighted edges, you can have paths which are essentially ties.

But then it doesn't matter.

We're looking for the shortest path and I don't care.

I could pick either if they're equivalent.

Probably what you end up with is one or the other.

And it doesn't matter in terms of both paths are equivalently short.

It's just it's not something that's explicitly discussed in the algorithm.

So, if you look in an algorithms book now, this is probably going to be implemented with a priority queue.

So, instead of having all these different sets, you have all the edge distances between the vertices set already and that's one of your inputs.

And then you put everything into this priority queue.

You just start pulling things off the priority queue in terms of their shortest distance from whatever the starting point is.

Right.

So, the goal there is like we'll just evaluate the shortest first.

Right.

And as you go through the graph, you know, the distance from a particular node might have to get updated to a shorter distance because you've now found a path to that previously discovered node that is shorter than the original one.
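
For reference, here's a minimal sketch of the modern formulation Mike is describing: Python with heapq as the min-priority queue and a predecessor map standing in for the paper's edge sets, so the path can be reconstructed afterward. The graph itself is made up for the example.

```python
import heapq

def dijkstra(graph, start):
    """graph: {node: [(neighbor, weight), ...]} with non-negative weights.
    Returns (dist, prev): shortest distances from start, and each node's
    predecessor on its shortest path (roughly the role of the edge sets).
    """
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]                 # min-priority queue of (distance, node)
    visited = set()                     # roughly Dijkstra's set A
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue                    # stale entry; a shorter path was already found
        visited.add(node)
        for neighbor, weight in graph.get(node, []):
            new_d = d + weight
            if new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d  # found a shorter path to a known node
                prev[neighbor] = node
                heapq.heappush(heap, (new_d, neighbor))
    return dist, prev

def path_to(prev, start, goal):
    """Walk the predecessor map backward from goal to start."""
    node, path = goal, [goal]
    while node != start:
        node = prev[node]
        path.append(node)
    return list(reversed(path))

# Example with made-up distances between points P, A, B, Q.
graph = {
    "P": [("A", 4), ("B", 1)],
    "B": [("A", 2), ("Q", 6)],
    "A": [("Q", 1)],
}
dist, prev = dijkstra(graph, "P")
print(dist["Q"], path_to(prev, "P", "Q"))   # 4 ['P', 'B', 'A', 'Q']
```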

So, the time that this algorithm takes is going to be dependent on the implementation of the priority queue.

And so, people have done a lot of work on trying to optimize the thing that's serving as the priority queue.

And there's this paper from the 1980s that seems to be the optimal case.

It uses something called a Fibonacci heap.

I saw that.

I saw that, yeah.

Which I guess people don't use in practice very often because it's complicated.

I asked one of the LLMs to write me a version of Dijkstra's that uses the Fibonacci heap and it imported a library that doesn't exist.

So then I tried to write that Fibonacci heap class it was calling, and even once I had it, and I was reasonably sure I had it, the code that it produced was still kind of broken.

I was surprised.

I asked Copilot.

I just asked it implement Dijkstra's shortest path algorithm.

I was working at Rust and it did a pretty good job.

See, that doesn't surprise me because I would think it's a really commonly implemented thing.

So, there's a lot of examples.

There's a lot of things that LLM can just dream up.

Yeah.

I mean, it's probably not the implementation that most people would hand code in particular because it kind of invents its own graph data structure.

You wonder if people are going to write this type of stuff in the future.

Okay.

Dijkstra's algorithm.

Any final comments on it?

If you find yourself in a situation where you need to implement this, I would probably not start with the paper.

I was like, I think I get Dijkstra's, I'm reading the paper like, "Wow, I guess I never understood this.

This is horribly confusing me."

Yeah.

I can't imagine anyone actually writing this algorithm the way it's described in the paper.

But once you've sort of got a feel for it, you can go to the paper and see how it is equivalent and what a pretty clever thing he devised.

Well, here's something that should encourage anybody listening to this.

Somebody says, "Hey, can you write Dijkstra's algorithm?

Just keep in mind that it took Dijkstra two to three years to actually write this down in paper form.

Okay.

So, give yourself two to three years."

Right.

Mike, have you ever used Dijkstra's algorithm at work?

I never have.

Yeah.

Not consciously.

No.

I think the only time I've actually coded it was in a Coursera class.

I think it's probably pretty likely that at some point in my career, I have used an implementation of it that already existed.

Oh, sure.

Yeah, that makes sense.

Yeah.

There was a time in my life when I did a lot of work with molecules and they are almost always described coded up as graphs.

The carbon and oxygen are connected.

The atoms are vertices and the bonds between them are edges.

Bonds.

That's the word the chemists use.

It's almost certain that I've used code to find the shortest path from one atom to another atom.

But I don't think I had to code the algorithm for it.

So, probably already existed somehow.

So, that's the Dijkstra's algorithm paper, 1959.

You also raised this paper where Dijkstra really invents the mutex.

The paper is called "Solution of a Problem in Concurrent Programming Control."

He uses these terms for the first time in this paper as far as I'm aware.

Maybe they were used elsewhere.

But he talks about "mutual exclusion," which we now call a mutex.

And he talks about "critical sections of work."

And the first time I encountered that was in Lamport's work, Lamport's clock paper, "critical sections of work."

And Lamport actually uses that in his Bakery algorithm paper too.

So, Dijkstra coins these terms.

The mutex paper, it's even shorter than the shortest path paper.

It's only a page; it fits on one page.

I actually really like this paper.

I find it much more comprehensible than the shortest path paper.

But I have to admit that when I look at his pseudocode sample, based on, I don't know what, COBOL or something?

It's like L0, do this.

L1, do this.

L4, go back to L1.

Else, begin, end.

So, I asked another LLM for translation of this one.

It's like a six-line function.

And I went, oh, yeah, okay, yeah, I get it.

That's a mutex.

Yeah, it is difficult to follow the code in there.

And it's maybe slightly ironic that it uses go-to's.

But that was before he decided that we should never have to.

Well, I think the mutex implementation is roughly: if I have an array of bits that represent concurrent processes, I can set them all to true, roughly meaning they're all available.

And then I can, when a concurrent process wants to run, it can toggle its bit to false.

And if anybody else is false, then it just doesn't run.

If it's the only false, then it can enter its critical section of work.

That's pretty much it.

That's pretty simple, right?

It's simple.

It's, again, kind of interesting to me.

There was a time when people needed to solve this problem.

Back then a mutex did not exist; now I just import it and use it.
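
A minimal sketch of what "just import it and use it" looks like today, using Python's threading module (our example, not code from the paper), with a lock guarding the critical section so only one thread runs it at a time.

```python
import threading

counter = 0
lock = threading.Lock()              # the modern, library-provided mutex

def worker():
    global counter
    for _ in range(100_000):
        with lock:                   # enter the critical section; others wait here
            counter += 1             # only one thread at a time touches this

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)                       # 400000: every increment accounted for
```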

I had to look up what year this paper came out; it was 1965. But I am uncertain about what the hardware and software environment was at the time and how this became a concern.

Yeah.

I mean, in my head, I'm thinking of machines with multiple threads and processors with multiple cores and stuff like that.

But I assume that's not what he was dealing with in the 1960s.

It's one of those papers where you read it.

It's short.

It's pretty, for me at least, I felt like, okay, I get it.

There is still an intuition thing here where I think, wait a minute, if I've got a bunch of independent processes that are all using the same array to toggle bits, and then they're going to iterate through, isn't there a deadlock possibility?

Yeah, one of my favorite aspects of this paper is that his description of deadlock is "after you," "after you": he's imagining two people trying to go through a door or something.

I guess deadlock was not a common term yet.

All right.

Well, this has been Picture Me Coding with Erik Aker and Mike Mull.

This has been episode one of our talk about Dijkstra, the inspiring figure with a lot of hot takes.

And some takes that are kind of Mean Girls quotes.

We will return next week with more commentary on Dijkstra.

Stay tuned.

Thanks, Mike.

Thank you.

See you next time.

Bye-bye.
