DX Today | No-Hype Podcast & News About AI & DX

Claude on Mars: Anthropic's Frontier Model Just Planned a Real Drive for NASA's Perseverance Rover - May 5, 2026


Anthropic's Claude wrote the actual command file for two NASA Perseverance rover drives across 455.9 meters of Jezero Crater, generalizing its coding skills into a niche XML dialect called Rover Markup Language. We unpack the engineering choreography behind the first AI-planned drives in interplanetary history, the simulation safety stack that vetted them, and what this template means for Mars Sample Return, lunar operations, and high-stakes agentic deployments back on Earth. Hosted by Chris and Laura. The DX Today Podcast brings you daily deep dives into the most consequential stories in the AI ecosystem. Send us fan mail: https://dxtoday.com/contact #AI #Anthropic #NASA #Mars #AgenticAI
SPEAKER_00

Welcome to the DX Today Podcast, your daily deep dive into the AI ecosystem. I'm Chris, and joining me as always is Laura.

SPEAKER_01

Hi, Chris, and hello to everyone listening. Today we have one of those stories that genuinely sounds like science fiction, but happened in the last few months and is finally getting the attention it deserves.

SPEAKER_00

We do, and I'll just tease it now because I want everyone to lean in for this one. We are talking about an AI model that planned and executed an actual driving route for an actual rover on the surface of Mars.

SPEAKER_01

That model is Anthropic's Claude. The rover is NASA's Perseverance. The terrain is a boulder field on the floor of Jezero Crater, and the drives happened on December 8th and December 10th of 2025.

SPEAKER_00

When I first read this story, I had to pause and reread it because there is a meaningful difference between an AI assisting with route planning and an AI literally writing the commands that drive a vehicle on another planet across hundreds of meters.

SPEAKER_01

That is exactly the right framing, and it is the part most coverage has actually understated. Claude did not advise. Claude did not summarize options. Claude wrote the actual command file in the language that Perseverance speaks.

SPEAKER_00

Let's start there because that detail is wild on its own. The language is called Rover Markup Language, or RML. Tell people what that is, because it is not a programming language any of us have ever touched in our lives.

SPEAKER_01

RML is a bespoke XML-based dialect that was originally created for the Mars Exploration Rover program back in the early 2000s. Spirit and Opportunity used it, Curiosity uses a descendant of it, and Perseverance still uses it today.
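
RML's actual schema has never been published, so as a purely illustrative sketch, here is what an XML-based drive command sequence of this general shape might look like, built with Python's standard library; every tag, attribute, and value below is hypothetical.

```python
# Purely illustrative: RML's real schema is not public, so every tag and
# attribute name below is hypothetical. The point is only the general shape
# of an XML-based drive command sequence, built here with Python's stdlib.
import xml.etree.ElementTree as ET

sequence = ET.Element("drive_sequence", sol="1720")  # illustrative sol number

for i, (x, y) in enumerate([(12.4, -3.1), (21.9, -5.6), (31.2, -8.0)]):
    wp = ET.SubElement(sequence, "waypoint", id=str(i))
    ET.SubElement(wp, "target", x=f"{x:.1f}", y=f"{y:.1f}", frame="site")
    ET.SubElement(wp, "max_slip", percent="30")        # hypothetical abort threshold
    ET.SubElement(wp, "hazard_check", mode="autonav")  # hypothetical attribute

print(ET.tostring(sequence, encoding="unicode"))
```

The strictness is the point: any model drafting a file like this has to hit exact tags, attributes, and value ranges, with essentially no public corpus to learn them from.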

SPEAKER_00

So this is a domain-specific language with an extremely small public footprint, very little training data on the open internet, and the kind of strict syntax and semantics that planetary safety requires. That is not a forgiving environment for any model.

SPEAKER_01

It really is not, and that is the part of the story I find most technically interesting. Claude generalized its coding skills into a niche dialect with virtually no public corpus, and then it generated commands precise enough to clear ground review.

SPEAKER_00

For anyone listening who works in software, think about the worst legacy system you have ever inherited. And now imagine the bug surfaces on another planet, 200 million miles away, with a round trip light delay of up to 40 minutes.

SPEAKER_01

That latency is actually the design constraint that makes this whole approach plausible. You cannot drive a rover in real time. You batch a day's worth of activity into a command sequence, you uplink it, and the rover executes it autonomously.

SPEAKER_00

Which means whoever or whatever writes that sequence has to be incredibly careful because there is no joystick override coming from Earth in time to save you. So how exactly did Claude approach the route planning problem on the ground?

SPEAKER_01

According to JPL, Claude was given orbital imagery and the local terrain data the rover already had on hand. It identified hazards in the imagery, classifying things like boulder fields, loose regolith, and slip-prone slopes that could endanger the wheels.

SPEAKER_00

That is the vision language piece. Claude looking at an actual image of the Martian surface and saying, that patch is a hazard. That patch is traversable. This slope might cause the wheels to slip. What was the reported accuracy on that hazard identification?

SPEAKER_01

The figure JPL has shared publicly is 98.4% on hazard classification across the test corpus they ran. That is high enough to be useful, low enough that you absolutely still want a simulation step and human review before you actually drive.
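
JPL has not published the hazard-labeling interface, so the sketch below only illustrates the kind of grid-based hazard map the hosts are describing, where each terrain cell gets a class and only traversable cells are routed through. The class names, thresholds, and cell fields are assumptions.

```python
# Sketch of a grid-based hazard map like the one described above. The class
# names, thresholds, and cell fields are assumptions, not JPL's interface.
from dataclasses import dataclass
from enum import Enum

class Hazard(Enum):
    TRAVERSABLE = "traversable"
    BOULDER_FIELD = "boulder_field"
    LOOSE_REGOLITH = "loose_regolith"
    SLIP_PRONE_SLOPE = "slip_prone_slope"

@dataclass
class Cell:
    slope_deg: float      # local slope estimated from orbital terrain data
    rock_density: float   # fraction of the cell covered by rocks, 0..1
    sand_fraction: float  # fraction of the cell that looks like loose regolith, 0..1

def classify(cell: Cell) -> Hazard:
    # Illustrative thresholds only.
    if cell.rock_density > 0.4:
        return Hazard.BOULDER_FIELD
    if cell.slope_deg > 20.0:
        return Hazard.SLIP_PRONE_SLOPE
    if cell.sand_fraction > 0.5:
        return Hazard.LOOSE_REGOLITH
    return Hazard.TRAVERSABLE

hazard_map = {
    (x, y): classify(Cell(slope_deg=4.0 * x, rock_density=0.15 * y, sand_fraction=0.1))
    for x in range(3) for y in range(3)
}
safe_cells = [pos for pos, h in hazard_map.items() if h is Hazard.TRAVERSABLE]
print(f"{len(safe_cells)} of {len(hazard_map)} cells traversable")
```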

SPEAKER_00

Once Claude had labeled the terrain, what happened next? Because identifying hazards is not the same as drafting a multi-leg drive plan that gets the rover from point A to point B safely across a real Martian surface.

SPEAKER_01

This is the part that genuinely sounds agentic. Claude broke the route into roughly 10-meter segments, strung them together into a continuous path, and then iterated on the result by critiquing its own waypoints and proposing meaningful revisions.
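
As a rough sketch of that control flow: split the traverse into roughly 10-meter legs, then loop, having the planner critique its own waypoints against the hazard map and revise until no leg is flagged. The critique and revision steps below are simple stand-ins for the model-level self-review the episode describes, not the actual JPL workflow.

```python
# Sketch of the segment-and-self-critique loop described above. The route
# representation, critique, and revision steps are stand-ins; the real
# workflow's prompts and data structures have not been published.
import math

SEGMENT_M = 10.0  # roughly 10-meter legs, per the episode

def segment_route(start, goal):
    """Split a straight-line traverse into ~10 m waypoints."""
    dx, dy = goal[0] - start[0], goal[1] - start[1]
    n = max(1, math.ceil(math.hypot(dx, dy) / SEGMENT_M))
    return [(start[0] + dx * i / n, start[1] + dy * i / n) for i in range(1, n + 1)]

def critique(waypoints, hazard_map):
    """Return indices of waypoints in hazardous cells (stand-in for model self-review)."""
    return [i for i, (x, y) in enumerate(waypoints)
            if hazard_map.get((round(x), round(y))) == "hazard"]

def plan(start, goal, hazard_map, max_iters=5):
    waypoints = segment_route(start, goal)
    for _ in range(max_iters):
        flagged = critique(waypoints, hazard_map)
        if not flagged:
            return waypoints            # route passes its own review
        for i in flagged:               # naive revision: nudge flagged legs sideways
            x, y = waypoints[i]
            waypoints[i] = (x, y + SEGMENT_M / 2)
    raise RuntimeError("no safe route found; escalate to human planners")
```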

SPEAKER_00

So we have a self-critique loop, which is something we talk about a lot in the context of agentic reasoning on enterprise tasks. Coding agents do this, research agents do this, and now apparently Mars rover planners do this too.

SPEAKER_01

Right, and that is one of the most quietly important findings in the whole story. The agentic patterns that are reshaping enterprise workflows are the same patterns that scale up to what is functionally a safety-critical control problem on another planet.

SPEAKER_00

After Claude finishes its self-critique loop and produces what it thinks is a good route, what does NASA do next? Because I cannot imagine they just took the output and shipped it to Mars without rigorous verification on the ground.

SPEAKER_01

They absolutely did not. The proposed waypoints went through Perseverance's standard pre-drive simulation, which models more than 500,000 variables. Wheel slip, suspension dynamics, terrain contact, projected positions across the entire route, you name it, all of it.

SPEAKER_00

That is a pretty extraordinary safety net. 500,000 variables modeled per drive is the kind of belt and suspenders engineering that gets you mission lifetimes measured in years instead of weeks. So how did Claude's routes actually perform in simulation?

SPEAKER_01

They cleared the simulation and then human controllers reviewed and approved them. The rover then executed the routes autonomously across two driving days. Total distance covered was 455.9 meters, roughly 1,496 feet.
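
The pre-drive simulator itself is proprietary and vastly richer than anything shown here, but the gating pattern the hosts describe, where a route only reaches human reviewers after it clears simulation, can be sketched like this; run_drive_simulation and the slip threshold are hypothetical stand-ins.

```python
# Sketch of the "clear simulation, then human review" gate described above.
# run_drive_simulation and the 0.3 slip threshold are hypothetical stand-ins
# for JPL's actual pre-drive simulation and flight rules.
from dataclasses import dataclass

@dataclass
class SimResult:
    max_wheel_slip: float   # worst predicted slip along the route, 0..1
    completed: bool         # did the simulated drive reach the final waypoint?

def run_drive_simulation(waypoints) -> SimResult:
    # Placeholder for a physics model of wheel slip, suspension dynamics,
    # terrain contact, and projected positions along the route.
    return SimResult(max_wheel_slip=0.12, completed=True)  # canned result for the sketch

def vet_route(waypoints, human_approve) -> bool:
    result = run_drive_simulation(waypoints)
    if not result.completed or result.max_wheel_slip > 0.3:
        return False                         # fails simulation: back to replanning
    return human_approve(waypoints, result)  # the final call stays with people

approved = vet_route([(10.0, 0.0), (20.0, 0.0)], human_approve=lambda wps, res: True)
```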

SPEAKER_00

That is meaningful Martian distance for a slow methodical rover like Perseverance. Help everyone understand the context of how slowly these vehicles move and how much human time normally goes into planning each day's drive sequence on a typical mission.

SPEAKER_01

Perseverance moves at a top speed of around 4 centimeters per second, which is glacial by Earth standards. Each drive day takes hours of human planning. JPL engineers estimate that using Claude in this way could cut that route planning time roughly in half.
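
As a quick back-of-the-envelope check on those numbers, 455.9 meters at roughly 4 centimeters per second works out to a bit over three hours of driving at top speed, before any planning time at all:

```python
# Back-of-the-envelope check on the figures quoted in the episode.
distance_m = 455.9         # combined length of the two drives
top_speed_m_per_s = 0.04   # roughly 4 cm/s

hours_at_top_speed = distance_m / top_speed_m_per_s / 3600
print(f"~{hours_at_top_speed:.1f} hours of driving at top speed")  # ~3.2 hours
```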

SPEAKER_00

Cutting human planning time in half on a Mars rover is not a marginal productivity gain. That is a substantial reallocation of scientist hours from routing towards science. So, what does that unlock for the broader Mars program over the coming years?

SPEAKER_01

It unlocks more drives per week, longer total traverses, faster exploration of geographically diverse sites, and more time for the science team to actually analyze the data instead of stitching together the next sol's command sequence by hand on a whiteboard.

SPEAKER_00

I want to make sure listeners do not misread this as a self-driving rover story. Perseverance already had something called AutoNav for obstacle avoidance. So what specifically is Claude adding here that AutoNav by itself simply cannot do?

SPEAKER_01

AutoNav is a local system. It looks at the immediate scene from the rover's own cameras and avoids what it can see right in front of it. What it cannot do is look at orbital imagery and plan a 400-meter route end-to-end.

SPEAKER_00

So the human used to do that long horizon route planning, and now Claude is doing the first draft of it with a self-critique pass. And then humans and simulation are checking the work. That is a really clean division of labor.

SPEAKER_01

It is, and it maps onto a pattern we keep seeing in well-designed agentic deployments. The model handles the high-context part where the real-time stakes are low. The deterministic systems and humans handle the safety-critical decision boundary at the very end.

SPEAKER_00

Let's pull back and talk about why Mars in particular is such a fascinating proving ground for this kind of AI-assisted operation. Because there are a few features of the Martian environment that make it almost ideal for current generation models.

SPEAKER_01

There are three big ones.

SPEAKER_00

Compare that to, say, a self-driving car on Earth, where you have milliseconds to react, no simulation safety net at runtime, and consequences that play out in real time. Mars is honestly a friendlier integration environment than a busy freeway.

SPEAKER_01

Counterintuitively, yes. Mars is patient, Mars gives you time to think, Mars gives you a chance to run your route through 500,000 variables of physics simulation before you commit. A freeway gives you milliseconds and forgiveness measured in inches.

SPEAKER_00

Which is one of the reasons I find this milestone so interesting strategically. We tend to imagine AI piloting things in fast, dangerous environments. The actual first wins are happening in slow, deliberate, well-instrumented environments. That is a useful lesson for everyone.

SPEAKER_01

It is the lesson in some ways. Wherever you have rich telemetry, batched cadence, expensive expert time, and a strong simulation layer, that is exactly where current frontier models are most ready to deliver real, durable operational value to a serious organization.

SPEAKER_00

Let's talk about the broader Anthropic and JPL collaboration that sits behind this, because this drive was not a one-off promotional stunt. There is a longer arc here about agentic AI and space exploration as a category of long-term opportunity.

SPEAKER_01

That is right. Anthropic and JPL have been collaborating on a portfolio of use cases. Route planning is the first to make it to a public executed milestone, but you can see why this becomes a template for sample return operations and beyond.

SPEAKER_00

Mars Sample Return is the one I keep coming back to because the operational complexity is enormous. Multiple vehicles, multiple stages, sample handoffs, an ascent vehicle, an Earth return orbiter. That is an awful lot of long horizon planning to coordinate.

SPEAKER_01

It is a planning nightmare with a budget that has been under intense pressure. Anything that lets the human teams operate at higher leverage on routine sequences and reserve their time for genuinely novel decisions is enormous mission insurance for that program.

SPEAKER_00

There's also a forward-looking angle around lunar surface operations. NASA, multiple commercial providers, and international partners are all targeting the South Pole of the Moon. And the same long horizon planning problem applies there too at growing scale.

SPEAKER_01

Lunar light delay is short, only a couple of seconds, but the operations cadence problem is similar. Many vehicles, many concurrent activities, many specialists on Earth. The economic case for AI assisted planning becomes even sharper at that operational scale and tempo.

SPEAKER_00

I want to spend a moment on the cultural significance of seeing Claude credited on a JPL press release. This is a frontier model showing up in a very different kind of deployment than the ones we usually cover on this show.

SPEAKER_01

It really is. We have spent the last few months talking about defense procurement debates, agent commerce protocols, healthcare partnerships, and financial agentic platforms. Adding planetary exploration to that list is a strong signal about how broad the surface area has become.

SPEAKER_00

And it is happening at a moment when the public conversation about AI is mostly about jobs and risk and regulation. A clean, optimistic, well-executed milestone like this is a useful counterweight to the tone of a lot of recent coverage.

SPEAKER_01

It is, and I think the optimism is earned. But I also want to be careful not to overstate what this drive represents. This was not autonomous Martian decision making. This was AI-assisted route planning with a heavy human and simulation safety stack.

SPEAKER_00

That distinction matters a lot. When we hear AI drives a rover on Mars, the headline does a lot of work that the underlying engineering does not actually claim. So what is your honest, sober summary of the milestone in plain language?

SPEAKER_01

My honest summary is that a frontier model just generalized its coding skills into a niche, safety-critical, domain-specific language with almost no public corpus, drafted a route that cleared rigorous simulation, and saved JPL meaningful planning time on real drives.

SPEAKER_00

That is a more grounded version of the story than the headlines, and I think it is actually more impressive because the headlines make it sound like magic, and the reality is a much more interesting piece of engineering choreography.

SPEAKER_01

Engineering choreography is the right phrase. The magic is in the integration. It is not Claude alone on Mars. It is Claude inside a workflow with hazard maps and simulation and human review and a pre-existing autonomous obstacle avoidance system underneath.

SPEAKER_00

For our listeners who think about AI deployment in their day jobs, what is the takeaway pattern from how JPL did this? Because a lot of enterprises are still trying to figure out what high-stakes AI integration is supposed to look like.

SPEAKER_01

The takeaway is that you do not let the model own the decision boundary. You let the model own the draft, the synthesis, and the self-critique. You keep your simulation, your verification, and your human review on the actual safety surface itself.

SPEAKER_00

That matches almost exactly what the best enterprise teams are converging on for agentic deployments in finance, in healthcare, and in legal review. The model drafts. The deterministic systems verify. The humans approve. The model never actually pushes the final button.
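
A minimal sketch of that division of labor, with every name below hypothetical, might look like the pipeline here: the model only ever produces the draft, deterministic checks gate it, and a person holds the final approval.

```python
# Minimal sketch of the draft -> verify -> approve pattern discussed above.
# Every function name here is hypothetical; the point is only that the model
# never owns the final decision boundary.
def agentic_pipeline(task, model_draft, deterministic_checks, human_approve):
    draft = model_draft(task)              # the model drafts and self-critiques
    for check in deterministic_checks:     # simulation, schema validation, policy rules
        if not check(draft):
            return None                    # fail closed: send back for replanning
    if not human_approve(draft):           # a person pushes the final button
        return None
    return draft
```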

SPEAKER_01

Exactly. And the fact that this same pattern works on a vehicle 200 million miles away as well as it works in a back office is, in my view, the most generalizable insight the whole story actually offers to practitioners.

SPEAKER_00

Let's wrap with what to watch next, because this is clearly the beginning of something, rather than a one-time stunt. What are the markers you will be looking for over the next six to twelve months in this particular space?

SPEAKER_01

I will be watching for a regular cadence of Claude-planned drives instead of one-off milestones. I will be watching for the methodology to extend to robotic arm planning, and I will be watching for similar patterns showing up at other space agencies.

SPEAKER_00

I will add one more, which is whether the JPL methodology gets written up and shared publicly enough that other operators of remote autonomous systems can adopt it. Subsea robotics, polar science stations, and orbital servicing all have very similar shapes.

SPEAKER_01

That is a great call-out. The same pattern of long horizon planning, slow cadence, rich simulation, and expensive expert time shows up everywhere from undersea research vessels to satellite servicing missions. The market for this approach is much bigger than Mars itself.

SPEAKER_00

One last reflection before we close. We have spent a lot of episodes lately on stories that are uncomfortable. Layoffs, regulation, geopolitics, procurement fights. It is genuinely refreshing to spend an episode on a story that is mostly about people doing extraordinary work together.

SPEAKER_01

It really is, and I will close on that note. A team at JPL, a team at Anthropic, and a decades-old XML dialect teamed up to drive a robot across an alien landscape using a model that was originally trained to write Python.

SPEAKER_00

That's all for today's episode of the DX Today Podcast. Thanks for listening, and we'll see you next time.