Counterarguments to the basic AI x-risk case

LessWrong (Curated & Popular)
Nov 04, 2022

Chapters
0:46  I. If superhuman AI systems are built, any given system is likely to be ‘goal-directed’
1:23  II. If goal-directed superhuman AI systems are built, their desired outcomes will probably be about as bad as an empty universe by human lights
3:07  III. If most goal-directed superhuman AI systems have bad goals, the future will very likely be bad
5:04  Counterarguments
5:11  A. Contra “superhuman AI systems will be ‘goal-directed’”
5:18  Different calls to ‘goal-directedness’ don’t necessarily mean the same concept
17:00  Ambiguously strong forces for goal-directedness need to meet an ambiguously high bar to cause a risk
18:59  B. Contra “goal-directed AI systems’ goals will be bad”
19:08  Small differences in utility functions may not be catastrophic
22:21  Differences between AI and human values may be small
25:35  Maybe value isn’t fragile
28:44  Short-term goals
30:27  C. Contra “superhuman AI would be sufficiently superior to humans to overpower humanity”
30:37  Human success isn’t from individual intelligence
44:33  AI agents may not be radically superior to combinations of humans and non-agentic machines
48:35  Trust
50:12  Headroom
55:51  Intelligence may not be an overwhelming advantage
1:01:09  Unclear that many goals realistically incentivise taking over the universe
1:03:35  Quantity of new cognitive labor is an empirical question, not addressed
1:05:35  Speed of intelligence growth is ambiguous
1:08:07  Key concepts are vague
1:09:17  D. Contra the whole argument
1:09:22  The argument overall proves too much about corporations
1:09:38  I. Any given corporation is likely to be ‘goal-directed’
1:10:17  II. If goal-directed superhuman corporations are built, their desired outcomes will probably be about as bad as an empty universe by human lights
1:11:57  III. If most goal-directed corporations have bad goals, the future will very likely be bad
1:14:14  Conclusion