Joe Carlsmith Audio

Is Power-Seeking AI an Existential Risk?

January 24, 2023 Joe Carlsmith
Chapters
0:13
Abstract
2:31
1 Introduction
6:30
1.1 Preliminaries
10:40
1.2 Backdrop
11:10
1.2.1 Intelligence
13:14
1.2.2 Agency
14:38
1.2.3 Playing with fire
17:04
1.2.4 Power
20:49
2 Timelines
21:13
2.1 Three key properties
21:32
2.1.1 Advanced capabilities
23:28
2.1.2 Agentic planning
30:22
2.1.3 Strategic awareness
31:56
2.2 Likelihood by 2070
34:22
3 Incentives
38:49
3.1 Usefulness
46:11
3.2 Available techniques
47:28
3.3 Byproducts of sophistication
49:39
4 Alignment
50:05
4.1 Definitions and clarifications
57:32
4.2 Power-seeking
1:13:15
4.3 The challenge of practical PS-alignment
1:14:32
4.3.1 Controlling objectives
1:16:41
4.3.1.1 Problems with proxies
1:21:34
4.3.1.2 Problems with search
1:27:08
4.3.1.3 Myopia
1:30:17
4.3.2 Controlling capabilities
1:31:15
4.3.2.1 Specialization
1:36:01
4.3.2.2 Preventing problematic improvements
1:37:43
4.3.2.3 Scaling
1:39:11
4.3.3 Controlling circumstances
1:42:41
4.4 Unusual difficulties
1:44:10
4.4.1 Barriers to understanding
1:47:38
4.4.2 Adversarial dynamics
1:49:40
4.4.3 Stakes of error
1:53:40
5 Deployment
1:57:08
5.1 Timing of problems
2:01:06
5.2 Decisions
2:05:19
Image: assessment of expected value of deployment
2:07:33
5.3 Key risk factors
2:08:02
5.3.1 Externalities and competition
2:12:27
5.3.2 Number of relevant actors
2:15:25
5.3.3 Bottlenecks on usefulness
2:20:40
5.3.4 Deception
2:23:48
5.4 Overall risk of problematic deployment
2:25:38
6 Correction
2:26:32
6.1 Take-off
2:29:10
6.2 Warning shots
2:34:37
6.3 Competition for power
2:48:21
6.4 Corrective feedback loops
2:54:50
6.5 Sharing power
2:56:13
7 Catastrophe
2:58:31
Marker 53
3:01:21
8 Probabilities
3:19:56
Acknowledgments
More Info

Audio version of my report on existential risk from power-seeking AI. Text here: https://arxiv.org/pdf/2206.13353.pdf. Narration by Type III audio. 
