existential risks

All posts tagged existential risks

Book review: Genesis: Artificial Intelligence, Hope, and the Human Spirit, by Henry A. Kissinger, Eric Schmidt, and Craig Mundie.

Genesis lends a bit of authority to concerns about AI.

It is a frustrating book. It took more effort for me to read than it should have. The difficulty stems not from complex subject matter (although the topics are complex), but from a peculiarly alien writing style that transcends mere linguistic differences – though Kissinger’s German intellectual heritage may play a role.

The book’s opening meanders through historical vignettes whose relevance remains opaque, testing my patience before finally addressing AI.

Continue Reading

TL;DR:

  • Corrigibility is a simple and natural enough concept that a prosaic AGI can likely be trained to obey it.
  • AI labs are on track to give superhuman(?) AIs goals which conflict with corrigibility.
  • Corrigibility fails if AIs have goals which conflict with it.
  • AI labs are not on track to find a safe alternative to corrigibility.

This post is mostly an attempt to distill and rewrite Max Harm’s Corrigibility As Singular Target Sequence so that a wider audience understands the key points. I’ll start by mostly explaining Max’s claims, then drift toward adding some opinions of my own.

Continue Reading

Book review: On the Edge: The Art of Risking Everything, by Nate Silver.

Nate Silver’s latest work straddles the line between journalistic inquiry and subject matter expertise.

“On the Edge” offers a valuable lens through which to understand analytical risk-takers.

The River versus The Village

Silver divides the interesting parts of the world into two tribes.

On his side, we have “The River” – a collection of eccentrics typified by Silicon Valley entrepreneurs and professional gamblers, who tend to be analytical, abstract, decoupling, competitive, critical, independent-minded (contrarian), and risk-tolerant.

Continue Reading

Nearly a book review: Situational Awareness, by Leopold Aschenbrenner.

“Situational Awareness” offers an insightful analysis of our proximity to a critical threshold in AI capabilities. Aschenbrenner’s background in machine learning and economics lends credibility to his predictions.

The paper left me with a rather different set of confusions than I started with.

Rapid Progress

His extrapolation of recent trends culminates in the onset of an intelligence explosion:

Continue Reading

[I mostly wrote this to clarify my thoughts. I’m unclear whether this will be valuable for readers.]

I expect that within a decade, AI will be able to do 90% of current human jobs. I don’t mean that 90% of humans will be obsolete. I mean that the average worker could delegate 90% of their tasks to an AGI.

I feel confused about what this implies for the kind of long-term planning and strategizing that would enable a poorly aligned AI to cause large-scale harm.

Is the ability to achieve long-term goals hard for an AI to develop?

Continue Reading

Book review: The Coming Wave: Technology, Power, and the Twenty-first Century’s Greatest Dilemma, by Mustafa Suleyman.

An author with substantial AI expertise has attempted to discuss AI in terms that the average book reader can understand.

The key message: AI is about to become possibly the most important event in human history.

Maybe 2% of readers will change their minds as a result of reading the book.

A large fraction of readers will come in expecting the book to be mostly hype. They won’t look closely enough to see why Suleyman is excited.

Continue Reading

Context: looking for an alternative to a pause on AI development.

There’s some popular desire for software to be explainable when it’s used for decisions such as whether to grant someone a loan. That desire is not sufficient reason for possibly crippling AI progress. But in combination with other concerns about AI, it seems promising.

Much of this popular desire likely comes from people who have been (or expect to be) denied loans, and who want to scapegoat someone or something to avoid admitting that they look unsafe to lend to because they’ve made poor decisions. I normally want to avoid regulations that are supported by such motives.

Yet an explainability requirement shows some promise at reducing the risks from rogue AIs.

Continue Reading