LessWrong (Curated & Popular)

"Product Alignment is not Superintelligence Alignment (and we need the latter to survive)" by plex

tl;dr: progress on making Claude friendly[1] is not the same as progress on making it safe to build godlike superintelligence. solving the former does not imply we get a good future.[2] please track the difference.

The term Alignment was coined[3] to point to the technical problem of understanding how to build minds such that, if they were to become strongly and generally superhuman, things would go well.

It has been increasingly adopted by frontier AI labs and much of the rest of the AI safety community to mean a much easier challenge, something like "having AIs that are empirically doing approximately what you ask them to do".[4]

If it's possible to use an intent-aligned product to build a research system which discovers a new paradigm and breaks your guardrails, then it is not Aligned in the original sense.

If you can use your intent-aligned system to write code which jailbreaks other LLMs and enables them to do dangerous ML research, it is also not Aligned in the original sense.

Conflating progress on product alignment with progress on superintelligence alignment seems to be lulling much of the AI safety community into a false sense of security.

Why is Superintelligence [...]

---

Outline:

(01:18) Why is Superintelligence Alignment less prominent?

(02:21) Why do we need Superintelligence Alignment to survive?

The original text contained 10 footnotes which were omitted from this narration.

---

First published:
March 31st, 2026

Source:
https://www.lesswrong.com/posts/mrwYCNocXCP2hrWt8/product-alignment-is-not-superintelligence-alignment-and-we

---

Narrated by TYPE III AUDIO.