Blog post review: LOVE in a simbox.
Jake Cannell has a very interesting post on LessWrong called LOVE in a simbox is all you need, with potentially important implications for AGI alignment. (LOVE stands for Learning Other’s Values or Empowerment.)
Alas, he organized it so that the most alignment-relevant ideas are near the end of a long-winded discussion of topics whose alignment relevance seems somewhat marginal. I suspect many people gave up before reaching the best sections.
I will summarize and review the post in roughly the opposite order, in hopes of appealing to a different audience. I’ll likely create a different set of misunderstandings from what Jake’s post has created. Hopefully this different perspective will help readers triangulate on some hypotheses that are worth further analysis.
A conflict is brewing between China and the West.
Beijing is determined to reassert control over Taiwan. The US, and probably most of NATO, seem likely to respond by, among other things, boycotting China.
We should, of course, worry that this will lead to war between China and the US. I don’t have much insight into that risk. I’ll focus in this post on risks about which I have some insight, without meaning to imply that they’re the most important risks.
Such a boycott would be more costly than the current boycott of Russia, and the benefits would likely be smaller.
How can I predict whether the reaction to China’s action against Taiwan will be a rerun of the response to the recent Russian attack on Ukraine?
I’ll start by trying to guess the main forces that led to the boycott of Russia.
I previously sounded vaguely optimistic about the Baze blood test technology. They shut down their blood test service this spring, “for the foreseeable future”. Their web site suggests that they plan to resume it someday. I don’t have much hope that they’ll resume selling it.
Shortly after I posted about Baze, they stopped reporting numbers for magnesium, vitamin D, and vitamin B12. I.e. they only told me results such as "low", "optimal", "normal", etc. This was apparently due to FDA regulations, although I'm unclear why.
I’d like to believe that Baze is working on getting permission to report results the way that companies such as Life Extension report a wide variety of tests that are conducted via LabCorp.
At roughly the same time, Thorne Research announced study results of a device that sounds very similar to the Baze device (maybe a bit more reliable?).
Thorne is partly a supplement company, but also already has enough of a focus on testing that I don’t expect it to use tests primarily for selling vitamins, the way Baze did.
I’m debating whether to invest in Thorne.
Book review: What We Owe the Future, by William MacAskill.
WWOTF is a mostly good book that can't quite decide whether it's part of an activist movement, or aimed at a small niche of philosophers.
MacAskill wants to move us closer to utilitarianism, particularly in the sense of evaluating the effects of our actions on people who live in the distant future. Future people are real, and we have some sort of obligation to them.
WWOTF describes humanity’s current behavior as reckless, like an imprudent teenager. MacAskill almost killed himself as a teen, by taking a poorly thought out risk. Humanity is taking similar thoughtless risks.
MacAskill carefully avoids endorsing the aspect of utilitarianism that says everyone must be valued equally. That saves him from a number of conclusions that make utilitarianism unpopular. E.g. it allows him to be uncertain about how much to care about animal welfare. It allows him to ignore the difficult arguments about the morally correct discount rate.
Book review: True Age: Cutting-Edge Research to Help Turn Back the Clock, by Morgan Levine.
Another year, another book on aging. This one comes close to saying important things about how to slow down aging, then chickens out just before reaching the finish line.
In 1986, Drexler predicted (in Engines of Creation) that we’d have molecular assemblers in 30 years. They would roughly act as fast, atomically precise 3-d printers. That was the standard meaning of nanotech for the next decade, until more mainstream authorities co-opted the term.
What went wrong with that forecast?
In my review of Where Is My Flying Car? I wrote:
Josh describes the mainstream reaction to nanotech fairly well, but that’s not the whole story. Why didn’t the military fund nanotech? Nanotech would likely exist today if we had credible fears of Al Qaeda researching it in 2001.
I recently changed my mind about that last sentence, partly because of what I recently read about the Manhattan Project, and partly due to the world’s response to COVID.
Book review: Now It Can Be Told: The Story Of The Manhattan Project, by Leslie R. Groves.
This is the story of a desperate arms race, against what turned out to be a mostly imaginary opponent. I read it for a perspective on how future arms races and large projects might work.
What Surprised Me
It seemed strange that a large fraction of the book described how to produce purified U-235 and plutonium, and that the process of turning those fuels into bombs seemed anticlimactic.
Approximately a book review: Eric Drexler’s QNR paper.
[Epistemic status: very much pushing the limits of my understanding. I’ve likely made several times as many mistakes as in my average blog post. I want to devote more time to understanding these topics, but it’s taken me months to produce this much, and if I delayed this in hopes of producing something better, who knows when I’d be ready.]
This nearly-a-book elaborates on his CAIS paper (mainly chapters 37 through 39), describing a path for AI capability research that enables the CAIS approach to remain competitive as capabilities exceed human levels.
AI research has been split between symbolic and connectionist camps for as long as I can remember. Drexler says it’s time to combine those approaches to produce systems which are more powerful than either approach can be by itself.
He suggests a general framework for how to usefully combine neural networks and symbolic AI. It’s built around structures that combine natural language words with neural representations of what those words mean.
Drexler wrote this mainly for AI researchers. I will attempt to explain it to a slightly broader audience.
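To make the idea of combining symbols with neural representations a bit more concrete, here is a toy Python sketch of one way such a pairing might look. The names (`QNRToken`, `similarity`) and the tiny 3-dimensional vectors are my own inventions for illustration, not Drexler's notation; real systems would use learned, high-dimensional embeddings.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class QNRToken:
    """A natural language word paired with a neural representation of
    its meaning. (Hypothetical structure, loosely inspired by the
    symbolic-plus-connectionist combination described above.)"""
    word: str                # symbolic, human-readable component
    embedding: np.ndarray    # dense vector capturing learned meaning


def similarity(a: QNRToken, b: QNRToken) -> float:
    """Cosine similarity between the neural components of two tokens:
    the symbolic parts stay legible, while comparisons happen in the
    continuous representation."""
    na = np.linalg.norm(a.embedding)
    nb = np.linalg.norm(b.embedding)
    return float(a.embedding @ b.embedding / (na * nb))


# Toy example: two words with hand-made embeddings.
cat = QNRToken("cat", np.array([1.0, 0.2, 0.0]))
dog = QNRToken("dog", np.array([0.9, 0.3, 0.1]))

print(similarity(cat, dog))  # near 1.0 when meanings are similar
```

The point of the sketch is just the division of labor: the `word` field stays inspectable by humans and usable by symbolic machinery, while the `embedding` field carries the kind of meaning that neural networks learn.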
This post is mostly a response to the Foresight Institute’s book Gaming the Future, which is very optimistic about AI’s being cooperative. They expect that creating a variety of different AI’s will enable us to replicate the checks and balances that the US constitution created.
I’m also responding in part to Eliezer’s list of AGI lethalities, points 34 and 35, which say that we can’t survive the creation of powerful AGI’s simply by ensuring the existence of many co-equal AGI’s with different goals. One of his concerns is that those AGI’s will cooperate with each other enough to function as a unitary AGI. Interactions between AGI’s might fit the ideal of voluntary cooperation with checks and balances, yet when interacting with humans those AGI’s might function as an unchecked government that has little need for humans.
I expect reality to be somewhere in between those two extremes. I can’t tell which of those views is closer to reality. This is a fairly scary uncertainty.
[Epistemic status: mostly writing to clarify my intuitions, with just a few weak attempts to convince others. It’s no substitute for reading Drexler’s writings.]
I’ve been struggling to write more posts relating to Drexler’s vision for AI (hopefully to be published soon), and in the process got increasingly bothered by the issue of whether AI researchers will see incentives to give AI’s broad goals that turn them into agents.
Drexler’s CAIS paper convinced me that our current trajectory is somewhat close to a scenario where human-level AI’s that are tool-like services are available well before AGI’s with broader goals.
Yet when I read LessWrong, I sympathize with beliefs that developers will want quite agenty AGI’s around the same time that CAIS-like services reach human levels.
I’m fed up with this epistemic learned helplessness, and this post is my attempt to reconcile those competing intuitions.