Science and Technology

I recently noticed similarities between how I decide what stock market evidence to look at, and how the legal system decides what lawyers are allowed to tell juries.

This post will elaborate on Eliezer’s Scientific Evidence, Legal Evidence, Rational Evidence. In particular, I’ll try to generalize about why there’s a large class of information that I actively avoid treating as Bayesian evidence.

Continue Reading

AI looks likely to cause major changes to society over the next decade.

Financial markets have mostly not reacted to this forecast yet. I expect it will be at least a few months, maybe even years, before markets have a large reaction to AI. I’d much rather buy too early than too late, so I’m trying to reposition my investments this winter to prepare for AI.

This post will focus on scenarios where AI reaches roughly human levels sometime around 2030 to 2035, and has effects that are at most 10 times as dramatic as the industrial revolution. I’m not confident that such scenarios are realistic. I’m only saying that they’re plausible enough to affect my investment strategies.

Continue Reading

Blog post review: LOVE in a simbox.

Jake Cannell has a very interesting post on LessWrong called LOVE in a simbox is all you need, with potentially important implications for AGI alignment. (LOVE stands for Learning Other’s Values or Empowerment.)

Alas, he organized it so that the most alignment-relevant ideas are near the end of a long-winded discussion of topics whose alignment relevance seems somewhat marginal. I suspect many people gave up before reaching the best sections.

I will summarize and review the post in roughly the opposite order, in hopes of appealing to a different audience. I’ll likely create a different set of misunderstandings from what Jake’s post has created. Hopefully this different perspective will help readers triangulate on some hypotheses that are worth further analysis.

Continue Reading

Book review: What We Owe the Future, by William MacAskill.

WWOTF is a mostly good book that can’t quite decide whether it’s part of an activist movement, or aimed at a small niche of philosophy.

MacAskill wants to move us closer to utilitarianism, particularly in the sense of evaluating the effects of our actions on people who live in the distant future. Future people are real, and we have some sort of obligation to them.

WWOTF describes humanity’s current behavior as reckless, like an imprudent teenager. MacAskill almost killed himself as a teen, by taking a poorly thought out risk. Humanity is taking similar thoughtless risks.

MacAskill carefully avoids endorsing the aspect of utilitarianism that says everyone must be valued equally. That saves him from a number of conclusions that make utilitarianism unpopular. E.g. it allows him to be uncertain about how much to care about animal welfare. It allows him to ignore the difficult arguments about the morally correct discount rate.

Continue Reading

In 1986, Drexler predicted (in Engines of Creation) that we'd have molecular assemblers in 30 years. They would act roughly as fast, atomically precise 3D printers. That was the standard meaning of nanotech for the next decade, until more mainstream authorities co-opted the term.

What went wrong with that forecast?

In my review of Where Is My Flying Car? I wrote:

Josh describes the mainstream reaction to nanotech fairly well, but that’s not the whole story. Why didn’t the military fund nanotech? Nanotech would likely exist today if we had credible fears of Al Qaeda researching it in 2001.

I recently changed my mind about that last sentence, partly because of what I recently read about the Manhattan Project, and partly due to the world’s response to COVID.

Continue Reading

Approximately a book review: Eric Drexler’s QNR paper.

[Epistemic status: very much pushing the limits of my understanding. I’ve likely made several times as many mistakes as in my average blog post. I want to devote more time to understanding these topics, but it’s taken me months to produce this much, and if I delayed this in hopes of producing something better, who knows when I’d be ready.]

This nearly-a-book elaborates on his CAIS paper (mainly chapters 37 through 39), describing a path for AI capability research that enables the CAIS approach to remain competitive as capabilities exceed human levels.

AI research has been split between symbolic and connectionist camps for as long as I can remember. Drexler says it’s time to combine those approaches to produce systems which are more powerful than either approach can be by itself.

He suggests a general framework for how to usefully combine neural networks and symbolic AI. It’s built around structures that combine natural language words with neural representations of what those words mean.

Drexler wrote this mainly for AI researchers. I will attempt to explain it to a slightly broader audience.

Continue Reading

This post is mostly a response to the Foresight Institute’s book Gaming the Future, which is very optimistic about AI’s being cooperative. They expect that creating a variety of different AI’s will enable us to replicate the checks and balances that the US constitution created.

I’m also responding in part to Eliezer’s AGI lethalities, points 34 and 35, which say that we can’t survive the creation of powerful AGI’s simply by ensuring the existence of many co-equal AGI’s with different goals. One of his concerns is that those AGI’s will cooperate with each other enough to function as a unitary AGI. Interactions between AGI’s might fit the ideal of voluntary cooperation with checks and balances, yet when interacting with humans those AGI’s might function as an unchecked government that has little need for humans.

I expect reality to be somewhere in between those two extremes. I can’t tell which of those views is closer to reality. This is a fairly scary uncertainty.

Continue Reading

[Epistemic status: mostly writing to clarify my intuitions, with just a few weak attempts to convince others. It’s no substitute for reading Drexler’s writings.]

I’ve been struggling to write more posts relating to Drexler’s vision for AI (hopefully to be published soon), and in the process got increasingly bothered by the issue of whether AI researchers will see incentives to give AI’s broad goals that turn them into agents.

Drexler’s CAIS paper convinced me that our current trajectory is somewhat close to a scenario where human-level AI’s that are tool-like services are available well before AGI’s with broader goals.

Yet when I read LessWrong, I sympathize with beliefs that developers will want quite agenty AGI’s around the same time that CAIS-like services reach human levels.

I’m fed up with this epistemic learned helplessness, and this post is my attempt to reconcile those competing intuitions.

Continue Reading

I’ve been pondering whether we’ll get any further warnings before AI(s) exceed human levels at general-purpose tasks, and whether that transition would entail enough risk that AI researchers ought to take some precautions. I feel pretty uncertain about this.

I haven’t even been able to make useful progress at clarifying what I mean by that threshold of general intelligence.

As a weak substitute, I’ve brainstormed a bunch of scenarios describing not-obviously-wrong ways in which people might notice, or fail to notice, that AI is transforming the world.

I’ve given probabilities for each scenario, which I’ve pulled out of my ass and don’t plan to defend.

Continue Reading

Book review: Why Everyone (Else) Is a Hypocrite: Evolution and the Modular Mind, by Robert Kurzban.

Minds are Modular

Many people explain minds by positing that they’re composed of parts:

  • the id, ego, and super-ego
  • the left side and the right side of the brain
  • System 1 and System 2
  • the triune brain
  • Marvin Minsky’s Society of Mind

Minsky’s proposal is the only one of these that resembles Kurzban’s notion of modularity enough to earn Kurzban’s respect. The modules Kurzban talks about are much more numerous, and more specialized, than most people are willing to imagine.

Here’s Kurzban’s favorite Minsky quote:

The mind is a community of “agents.” Each has limited powers and can communicate only with certain others. The powers of mind emerge from their interactions for none of the Agents, by itself, has significant intelligence. […] Everyone knows what it feels like to be engaged in a conversation with oneself. In this book, we will develop the idea that these discussions really happen, and that the participants really “exist.” In our picture of the mind we will imagine many “sub-persons”, or “internal agents”, interacting with one another. Solving the simplest problem—seeing a picture—or remembering the experience of seeing it—might involve a dozen or more—perhaps very many more—of these agents playing different roles. Some of them bear useful knowledge, some of them bear strategies for dealing with other agents, some of them carry warnings or encouragements about how the work of others is proceeding. And some of them are concerned with discipline, prohibiting or “censoring” others from thinking forbidden thoughts.

Let’s take the US government as a metaphor. Instead of saying it’s composed of the legislative, executive, and judicial modules, Kurzban would describe it as being made up of modules such as a White House press secretary, Anthony Fauci, a Speaker of the House, more generals than I can name, even more park rangers, etc.

In What Is It Like to Be a Bat?, Nagel says “our own mental activity is the only unquestionable fact of our experience”. In contrast, Kurzban denies that we know more than a tiny fraction of our mental activity. We don’t ask “what is it like to be an edge detector?”, because there was no evolutionary pressure to enable us to answer that question. It could be that most human experience is as mysterious to our conscious minds as bat experiences. Most of our introspection involves examining a mental model that we construct for propaganda purposes.

Is Self-Deception Mysterious?

There’s been a good deal of confusion about self-deception and self-control. Kurzban attributes the confusion to attempts at modeling the mind as a unitary agent. If there’s a single homunculus in charge of all of the mind’s decisions, then it’s genuinely hard to explain phenomena that look like conflicts between agents.

With a sufficiently modular model of minds, the confusion mostly vanishes.

A good deal of what gets called self-deception is better described as being strategically wrong.

For example, when President Trump had COVID, the White House press secretary had a strong incentive not to be aware of any evidence that Trump’s health was worse than expected, in order to reassure voters without being clearly dishonest. Whereas the White House doctor had some reason to err a bit on the side of overestimating Trump’s risk of dying. So it shouldn’t surprise us if they had rather different beliefs. I don’t describe that situation as “the US government is deceiving itself”, but I’d be confused as to whether to describe it that way if I could only imagine the government as a unitary agent.

Minds work much the same way. E.g. the cancer patient who buys space on a cruise that his doctor says he won’t live to enjoy (presumably to persuade allies that he’ll be around long enough to be worth allying with), while still following the doctor’s advice about how to treat the cancer. A modular model of the mind isn’t surprised that his mind holds inconsistent beliefs about how serious the cancer is. The patient’s press-secretary-like modules are pursuing a strategy of getting friends to make long-term plans to support the patient. They want accurate enough knowledge of the patient’s health to sound credible. Why would they want to be more accurate than that?

Self-Control

Kurzban sees less value in the concept of a self than do most Buddhists.

almost any time you come across a theory with the word “self” in it, you should check your wallet.

Self-control has problems that are similar to the problems with the concept of self-deception. It’s best thought of as conflicts between modules.

We should expect context-sensitive influences on which modules exert the most influence on decisions. E.g. we should expect a calorie-acquiring module to have more influence when a marshmallow is in view than if a path to curing cancer is in view. That makes it hard for a mind to have a stable preference about how to value eating a marshmallow or curing cancer.

If I think I see a path to curing cancer that is certain to succeed, my cancer-research modules ought to get more attention than my calorie-acquiring modules. I’m pretty sure that’s what would happen if I had good evidence that I’m about to cure cancer. But a more likely situation is that my press-secretary-like modules say I’ll succeed, and some less eloquent modules say I’ll fail. That will look like a self-control problem to those who want the press secretary to be in charge, and look more like politics to those who take Kurzban’s view.

I could identify some of my brain’s modules as part of my “self”, and say that self-control refers to those modules overcoming the influence of the non-self parts of my brain. But the more I think like Kurzban, the more arbitrary it seems to treat some modules as more privileged than others.

The Rest

Along the way, Kurzban makes fun of the literature on self-esteem, and of models that say self-control is a function of resources.

The book consists mostly of easy-to-read polemics for ideas that ought to be obvious, but which our culture resists.

Warning: you should skip the chapter titled Morality and Contradictions. Kurzban co-authored a great paper called A Solution to the Mysteries of Morality. But in this book, his controversial examples of hypocrisy will distract most readers’ attention from the rather unremarkable wisdom that the examples illustrate.