
Book review: Notes on a New Philosophy of Empirical Science (Draft Version), by Daniel Burfoot.

Standard views of science focus on comparing theories by finding examples where they make differing predictions, and rejecting the theory that made worse predictions.

Burfoot describes a better view of science, called the Compression Rate Method (CRM), which replaces the “make prediction” step with “make a compression program”, and compares theories by how much they compress a standard (large) database.

These views of science produce mostly equivalent results(!), but CRM provides a better perspective.

Machine Learning (ML) is potentially science, and this book focuses on how ML will be improved by viewing its problems through the lens of CRM. Burfoot complains about the toolkit mentality of traditional ML research, arguing that the CRM approach will turn ML into an empirical science.

This should generate a Kuhnian paradigm shift in ML, with more objective measures of research quality than any branch of science has achieved so far.

Burfoot focuses on compression as encoding empirical knowledge of specific databases / domains. He rejects the standard goal of a general-purpose compression tool. Instead, he proposes creating compression algorithms that are specialized for each type of database, to reflect what we know about topics (such as images of cars) that are important to us.
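
To make the comparison concrete, here is a minimal sketch of what a CRM-style evaluation might look like. The two "theories" are stand-in probability models (a uniform model and a frequency-based model), and the dataset and scoring are my own illustration, not Burfoot's code; the point is only that the theory yielding the shorter codelength for the database wins.

    import math
    from collections import Counter

    def codelength_bits(data, prob):
        """Ideal code length of `data` in bits under the probability model `prob`."""
        return sum(-math.log2(prob(symbol)) for symbol in data)

    data = "the cat sat on the mat " * 200   # stand-in for a large standard database
    alphabet = sorted(set(data))
    counts = Counter(data)
    total = len(data)

    theories = {
        # Theory A: every symbol equally likely (encodes no knowledge of the domain).
        "uniform model":   lambda s: 1.0 / len(alphabet),
        # Theory B: symbols follow their observed frequencies (a crude empirical "law";
        # a real CRM comparison would also charge for the bits needed to describe the model).
        "frequency model": lambda s: counts[s] / total,
    }

    for name, prob in theories.items():
        print(f"{name}: {codelength_bits(data, prob):.0f} bits for {len(data)} symbols")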

MIRI has produced a potentially important result (called Garrabrant induction) for dealing with uncertainty about logical facts.

The paper is somewhat hard for non-mathematicians to read. This video provides an easier overview, and more context.

It uses prediction markets! “It’s a financial solution to the computer science problem of metamathematics”.

It shows that we can evade disturbing conclusions such as Gödel incompleteness and the liar paradox by expecting to be only very confident about logically deducible facts, as opposed to mathematically certain of them. That's similar to the difference between treating beliefs about empirical facts as probabilities rather than as boolean values.
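
The following toy sketch is not the Garrabrant construction itself (which runs a market of polynomial-time traders); it only illustrates the stance described above: assign probabilities strictly between 0 and 1 to logical sentences, and nudge them toward consistency as deductions arrive. The sentences and update rules are my own invention.

    # Toy illustration only, NOT Garrabrant induction: probabilities over logical
    # sentences, pushed toward consistency rather than set to exactly 0 or 1.
    beliefs = {"A": 0.6, "not A": 0.5, "A or not A": 0.9}

    def enforce_complement(rate=0.5):
        # P(A) + P(not A) should equal 1; move both partway toward that.
        gap = 1.0 - (beliefs["A"] + beliefs["not A"])
        beliefs["A"] += rate * gap / 2
        beliefs["not A"] += rate * gap / 2

    def enforce_tautology(rate=0.5):
        # "A or not A" is deducible, so push its probability toward 1.
        beliefs["A or not A"] += rate * (1.0 - beliefs["A or not A"])

    for _ in range(20):
        enforce_complement()
        enforce_tautology()

    # The tautology ends up very close to 1 but never exactly 1: strong confidence
    # in deducible facts rather than the absolute certainty that runs into
    # Godel/liar-style trouble.
    print(beliefs)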

I’m somewhat skeptical that it will have an important effect on AI safety, but my intuition says it will produce enough benefits somewhere that it will become at least as famous as Pearl’s work on causality.

Book review: The Moral Economy: Why Good Incentives Are No Substitute for Good Citizens, by Samuel Bowles.

This book has a strange mixture of realism and idealism.

It focuses on two competing models: the standard economics model in which people act in purely self-interested ways, and a more complex model in which people are influenced by context to act either altruistically or selfishly.

The stereotypical example comes from the semi-famous Haifa daycare experiment, where daycare centers started fining parents for being late to pick up children, and the parents responded by being later.

The first half of the book is a somewhat tedious description of ideas that seem almost obvious enough to be classified as common sense. He points out that the economist’s model is a simplification that is useful for some purposes, yet it’s not too hard to find cases where it makes the wrong prediction about how people will respond to incentives.

That happens because society provides weak pressures that produce cooperation under some conditions, and because financial incentives send messages that influence whether people want to cooperate. I.e. the parents appear to have previously felt obligated to be somewhat punctual, but then inferred from the fines that it was ok to be late as long as they paid the price. [*]

The book advocates more realism on this specific issue. But it’s pretty jarring to compare that to the idealistic view the author takes on similar topics, such as acquiring evidence of how people react, or modeling politicians. He treats the Legislator (capitalized like that) as a very objective, well informed, and altruistic philosopher. That model may sometimes be useful, but I’ll bet that, on average, it produces worse predictions about legislators’ behavior than does the economist’s model of a self-interested legislator.

The book becomes more interesting around chapter V, when it analyzes the somewhat paradoxical conclusion that markets sometimes make people more selfish, yet cultures that have more experience with markets tend to cooperate more.

He isn’t able to fully explain that, but he makes some interesting progress. One factor that’s important to focus on is the difference between complete and incomplete contracts. Complete contracts describe everything a buyer might need to know about a product or service. An example of an incomplete contract would be an agreement to hire a lawyer to defend me – I don’t expect the lawyer to specify how good a defense to expect.

Complete contracts enable people to trade without needing to trust the seller, which can lead to highly selfish attitudes. Incomplete contracts lead to the creation of trust between participants, because having frequent transactions depends on some implicit cooperation.

The book ends by promoting the “new” idea that policy ought to aim for making people be good. But it’s unclear who disagrees with that idea. Economists sometimes sound like they disagree, because they often say that policy shouldn’t impose one group’s preferences on another group. But economists are quite willing to observe that people generally prefer cooperation over conflict, and that most people prefer institutions that facilitate cooperation. That’s what the book mostly urges.

The book occasionally hints at wanting governments to legislate preferences in ways that go beyond facilitating cooperation, but doesn’t have much of an argument for doing so.

[*] – The book implies that the increased lateness was an obviously bad result. This seems like a plausible guess. But I find it easy to imagine conditions where the reported results were good (i.e. the parents might benefit from being late more than it costs the teachers to accommodate them).

However, that scenario depends on the fines being high enough for the teachers to prefer the money over punctuality. They appear not to have been consulted, so success at that would have depended on luck. It’s unclear whether the teachers were getting overtime pay when parents were late, or whether the fines benefited only the daycare owner.
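
A back-of-the-envelope version of that condition, using numbers I've invented purely for illustration:

    # Entirely made-up numbers, just to state the footnote's condition precisely.
    fine_per_late_pickup = 10      # what a late parent pays
    parent_value_of_lateness = 15  # what the extra flexibility is worth to the parent
    teacher_cost_of_waiting = 8    # what staying late costs the teacher

    # Lateness is a net improvement only if parents value it more than it costs the
    # teachers, AND the fine actually reaches the teachers as compensation.
    net_gain = parent_value_of_lateness - teacher_cost_of_waiting
    teachers_compensated = fine_per_late_pickup >= teacher_cost_of_waiting
    print(net_gain > 0 and teachers_compensated)   # True under these assumptions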

One of the weakest claims in The Age of Em was that AI progress has not been accelerating.

J Storrs Hall (aka Josh) has a hypothesis that AI progress accelerated about a decade ago due to a shift from academia to industry. (I’m puzzled why the title describes it as a coming change, when it appears to have already happened).

I find it quite likely that something important happened then, including an acceleration in the rate at which AI affects people.

I find it less clear whether that indicates a change in how fast AI is approaching human intelligence levels.

Josh points to airplanes as an example of a phase change being important.

I tried to compare AI progress to other industries which might have experienced a similar phase change, driven by hardware progress. But I was deterred by the difficulty of estimating progress in industries when they were driven by academia.

One industry I tried to compare to was photovoltaics, which seemed to be hyped for a long time before becoming commercially important (10-20 years ago?). But I see only weak signs of a phase change around 2007, from looking at Swanson’s Law. It’s unclear whether photovoltaic progress was ever dominated by academia enough for a phase change to be important.

Hypertext is a domain where a clear phase change happened in the early 1990s. It experienced a nearly foom-like rate of adoption when internet availability altered the problem, from one that required a big company to finance the hardware and marketing, to one that could be solved by simply giving away a small amount of code. But this change in adoption was not accompanied by a change in the power of hypertext software (beyond changes due to network effects). So this seems like weak evidence against accelerating progress toward human-level AI.

What other industries should I look at?

I started writing morning pages a few months ago. That means writing three pages, on paper, before doing anything else [1].

I’ve only been doing this on weekends and holidays, because on weekdays I feel a need to do some stock market work close to when the market opens.

It typically takes me one hour to write three pages. At first, it felt like I needed 75 minutes but wanted to finish faster. After a few weeks, it felt like I could finish in about 50 minutes when I was in a hurry, but often preferred to take more than an hour.

That suggests I’m doing much less stream-of-consciousness writing than is typical for morning pages. It’s unclear whether that matters.

It feels like devoting an hour per day to morning pages ought to be costly. Yet I never observed it crowding out anything I valued (except maybe once or twice when I woke up before getting an optimal amount of sleep in order to get to a hike on time – that was due to scheduling problems, not due to morning pages reducing the amount of time available per day).

Why do people knowingly follow bad investment strategies?

I won’t ask (in this post) about why people hold foolish beliefs about investment strategies. I’ll focus on people who intend to follow a decent strategy, and fail. I’ll illustrate this with a stereotype from a behavioral economist (Procrastination in Preparing for Retirement):[1]

For instance, one of the authors has kept an average of over $20,000 in his checking account over the last 10 years, despite earning an average of less than 1% interest on this account and having easy access to very liquid alternative investments earning much more.

A more mundane example is a person who holds most of their wealth in stock of a single company, for reasons of historical accident (they acquired it via employee stock options or inheritance), but admits to preferring a more diversified portfolio.

An example from my life is that, until this year, I often borrowed money from Schwab to buy stock, when I could have borrowed at lower rates in my Interactive Brokers account to do the same thing. (Partly due to habits that I developed while carelessly unaware of the difference in rates; partly due to a number of trivial inconveniences).
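
To put rough numbers on what mistakes like these cost, here is a back-of-the-envelope calculation for the checking-account example (the alternative yield is my assumption, not a figure from the paper):

    # Illustrative only: the 4% alternative yield is an assumption.
    idle_balance = 20_000
    checking_rate = 0.01        # "less than 1%" in the quote; use 1% as an upper bound
    alternative_rate = 0.04
    years = 10

    forgone = idle_balance * ((1 + alternative_rate) ** years - (1 + checking_rate) ** years)
    print(f"Roughly ${forgone:,.0f} of interest forgone over {years} years")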

Behavioral economists are somewhat correct to attribute such mistakes to questionable time discounting. But I see more patterns than such a model can explain (e.g. people procrastinate more over some decisions (whether to make a “boring” trade) than others (whether to read news about investments)).[2]

Instead, I use CFAR-style models that focus on conflicting motives of different agents within our minds.


Book review: Are We Smart Enough to Know How Smart Animals Are?, by Frans de Waal.

This book is primarily about discrediting false claims of human uniqueness, and showing how easy it is to screw up evaluations of a species’ cognitive abilities. It is best summarized by the cognitive ripple rule:

Every cognitive capacity that we discover is going to be older and more widespread than initially thought.

De Waal provides many anecdotes of carefully designed experiments detecting abilities that previously appeared to be absent. E.g. Asian elephants failed mirror tests with small, distant mirrors. When experimenters dared to put large mirrors close enough for the elephants to touch, some of them passed the test.

Likewise, initial observations of behaviorist humans suggested they were rigidly fixated on explaining all behavior via operant conditioning. Yet one experimenter managed to trick a behaviorist into demonstrating more creativity, by harnessing the one motive that behaviorists prefer over their habit of advocating operant conditioning: their desire to accuse people of recklessly inferring complex cognition.

De Waal seems moderately biased toward overstating cognitive abilities of most species (with humans being one clear exception to that pattern).

At one point he gave me the impression that he was claiming elephants could predict where a thunderstorm would hit days in advance. I checked the reference, and what the elephants actually did was predict the arrival of the wet season, and respond with changes such as longer steps (but probably not with indications that they knew where thunderstorms would hit). After rereading de Waal’s wording, I decided it was ambiguous. But his claim that elephants “hear thunder and rainfall hundreds of miles away” exaggerates the original paper’s “detected … at distances greater than 100 km … perhaps as much as 300 km”.

But in the context of language, de Waal switches to downplaying reports of impressive abilities. I wonder how much of that is due to his desire to downplay claims that human minds are better, and how much of that is because his research isn’t well suited to studying language.

I agree with the book’s general claims. The book provides evidence that human brains embody only small, somewhat specialized improvements on the cognitive abilities of other species. But I found the book less convincing on that subject than some other books I’ve read recently. I suspect that’s mainly due to de Waal’s focus on anecdotes that emphasize what’s special about each species or individual. In contrast, The Human Advantage rigorously quantifies important ways in which human brains are just a bigger primate brain (but primate brains are special!). And The Secret of Our Success (which doesn’t use particularly rigorous methods) provides a better perspective, by describing a model in which ape minds evolve into human minds via ordinary, gradual adaptations to mildly new environments.

In sum, this book is good at explaining the problems associated with research into animal cognition. It is merely ok at providing insights about how smart various species are.

Will young ems be created? Why and how will it happen?

Any children that exist as ems will be important as em societies mature, because they will adapt better to em environments than ems who uploaded as adults, making them more productive.

The Age of Em says little about children, presumably in part because no clear outside view predictions seem possible.

This post will use a non-Hansonian analysis style to speculate about which children will become ems. I’m writing this post to clarify my more speculative thoughts about how em worlds will work, without expecting to find much evidence to distinguish the good ideas from the bad ones.

Robin predicts few regulatory obstacles to uploading children, because he expects the world to be dominated by ems. I’m skeptical of that. Ems will be dominant in the sense of having most of the population, but that doesn’t tell us much about em influence on human society – farmers became a large fraction of the world population without meddling much in hunter-gatherer political systems. And it’s unclear whether em political systems would want to alter the relevant regulations – em societies will have much the same conflicting interest groups pushing for and against immigration that human societies have.

How much of Robin’s prediction of low regulation is due to his desire to start by analyzing a relatively simple scenario (low regulation) and add complexity later?


Book review: Made-Up Minds: A Constructivist Approach to Artificial Intelligence, by Gary L. Drescher.

It’s odd to call a book boring when it uses the pun “ontology recapitulates phylogeny” [1] to describe a surprising feature of its model. About 80% of the book is dull enough that I barely forced myself to read it, yet the occasional good idea persuaded me not to give up.

Drescher gives a detailed model of how Piaget-style learning in infants could enable them to learn complex concepts starting with minimal innate knowledge.

One of the most important assumptions in The Age of Em is that non-em AGI will take a long time to develop.

1.

Scott Alexander at SlateStarCodex complains that Robin rejects survey data that uses validated techniques, and instead uses informal surveys whose results better fit Robin’s biases [1]. Robin clearly explains one reason why he does that: to get the outside view of experts.

Whose approach to avoiding bias is better?

  • Minimizing sampling error and carefully documenting one’s sampling technique are two of the most widely used criteria to distinguish science from wishful thinking.
  • Errors due to ignoring the outside view have been documented to be large, yet forecasters are reluctant to use the outside view.

So I rechecked advice from forecasting experts such as Philip Tetlock and Nate Silver, and the clear answer I got was … that was the wrong question.

Tetlock and Silver mostly focus on attitudes that are better captured by the advice to be a fox, not a hedgehog.

The strongest predictor of rising into the ranks of superforecasters is perpetual beta, the degree to which one is committed to belief updating and self-improvement.

Tetlock’s commandment number 3 says “Strike the right balance between inside and outside views”. Neither Tetlock nor Silver offers hope that either more rigorous sampling of experts or dogmatically choosing the outside view over the inside view will help us win a forecasting contest.

So instead of asking who is right, we should be glad to have two approaches to ponder, and should want more. (Robin only uses one approach for quantifying the time to non-em AGI, but is more fox-like when giving qualitative arguments against fast AGI progress).

2.

What Robin downplays is that there’s no consensus among the experts on whom he relies, not even about whether progress is steady, accelerating, or decelerating.

Robin uses the median expert estimate of progress in various AI subfields. This makes sense if AI progress depends on success in many subfields. It makes less sense if success in one subfield can make the other subfields obsolete. If “subfield” means a guess about what strategy best leads to intelligence, then I expect the median subfield to be rendered obsolete by a small number of good subfields [2]. If “subfield” refers to a subset of tasks that AI needs to solve (e.g. vision, or natural language processing), then it seems reasonable to look at the median (and I can imagine that slower subfields matter more). Robin appears to use both meanings of “subfield”, with fairly similar results for each, so it’s somewhat plausible that the median is informative.
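
Here is a toy numerical illustration of why the two readings of “subfield” matter (all the numbers are invented):

    import statistics

    # Invented per-subfield progress estimates (say, fraction of the remaining gap
    # to human-level ability closed per decade). Purely illustrative.
    subfield_progress = [0.02, 0.03, 0.05, 0.08, 0.40]

    # "Subfield" = rival strategy: the best strategy renders the rest obsolete,
    # so the median mostly measures dead ends.
    strategy_reading = max(subfield_progress)

    # "Subfield" = necessary task (vision, language, ...): overall progress is
    # closer to the median, or even the slowest task.
    task_reading = statistics.median(subfield_progress)

    print(strategy_reading, task_reading)   # 0.4 vs 0.05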

3.

Scott also complains that Robin downplays the importance of research spending while citing only a paper dealing with government funding of agricultural research. But Robin also cites another paper (Ulku 2004), which covers total R&D expenditures in 30 countries (versus 16 countries in the paper that Scott cites) [3].

4.

Robin claims that AI progress will slow (relative to economic growth) due to slowing hardware progress and reduced dependence on innovation. Even if I accept Robin’s claims about these factors, I have trouble believing that AI progress will slow.

I expect higher em IQ will be one factor that speeds up AI progress. Garrett Jones suggests that a 40 IQ point increase in intelligence causes a 50% increase in a country’s productivity. I presume that AI researcher productivity is more sensitive to IQ than is, say, truck driver productivity. So it seems fairly plausible to imagine that increased em IQ will cause more than a factor of two increase in the rate of AI progress. (Robin downplays the effects of IQ in contexts where a factor of two wouldn’t much affect his analysis; he appears to ignore them in this context).
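
As a rough sketch of that extrapolation (the sensitivity exponent below is my own guess, not a figure from Jones or the book):

    # Garrett Jones' figure as cited above: +40 IQ points -> roughly 1.5x
    # country-level productivity. The researcher-sensitivity exponent is assumed,
    # purely to show how a factor of two or more could arise.
    country_multiplier_per_40_iq = 1.5
    researcher_sensitivity = 2.0   # assume researcher output is twice as IQ-elastic

    ai_progress_multiplier = country_multiplier_per_40_iq ** researcher_sensitivity
    print(ai_progress_multiplier)  # 2.25, i.e. more than a factor of two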

I expect that other advantages of ems will contribute additional speedups – maybe ems who work on AI will run relatively fast, maybe good training/testing data will be relatively cheap to create, or maybe knowledge from experimenting on ems will better guide AI research.

5.

Robin’s arguments against an intelligence explosion are weaker than they appear. I mostly agree with those arguments, but I want to discourage people from having strong confidence in them.

The most suspicious of those arguments is that gains in software algorithmic efficiency “remain surprisingly close to the rate at which hardware costs have fallen. This suggests that algorithmic gains have been enabled by hardware gains”. He cites only (Grace 2013) in support of this. That paper doesn’t comment on whether hardware changes enable software changes. The evidence seems equally consistent with that or with the hypothesis that both are independently caused by some underlying factor. I’d say there’s less than a 50% chance that Robin is correct about this claim.

Robin lists 14 other reasons for doubting there will be an intelligence explosion: two claims about AI history (no citations), eight claims about human intelligence (one citation), and four about what causes progress in research (with the two citations mentioned earlier). Most of those 14 claims are probably true, but it’s tricky to evaluate their relevance.

Conclusion

I’d say there’s maybe a 15% chance that Robin is basically right about the timing of non-em AI given his assumptions about ems. His book is still pretty valuable if an em-dominated world lasts for even one subjective decade before something stranger happens. And “something stranger happens” doesn’t necessarily mean his analysis becomes obsolete.

Footnotes

[1] – I can’t find any SlateStarCodex complaint about Bostrom doing something in Superintelligence that’s similar to what Scott accuses Robin of: Bostrom’s survey of experts shows an expected time of decades for human-level AI to become superintelligent, yet Bostrom wants to focus on a much faster takeoff scenario, and disagrees with the experts without identifying reasons for thinking his approach reduces biases.

[2] – One example is that genetic algorithms are looking fairly obsolete compared to neural nets, now that they’re being compared on bigger problems than when genetic algorithms were trendy.

Robin wants to avoid biases from recent AI fads by looking at subfields as they were defined 20 years ago. Some recent changes in AI are fads, but some are increased wisdom. I expect many subfields to be dead ends, given how immature AI was 20 years ago (and may still be today).

[3] – Scott quotes from one of three places that Robin mentions this subject (an example of redundancy that is quite rare in the book), and that’s the one place out of three where Robin neglects to cite (Ulku 2004). Age of Em is the kind of book where it’s easy to overlook something important like that if you don’t read it more carefully than you’d read a normal book.

I tried comparing (Ulku 2004) to the OECD paper that Scott cites, and failed to figure out whether they disagree. The OECD paper is probably consistent with Robin’s “less than proportionate increases” claim that Scott quotes. But Scott’s doubts are partly about Robin’s bolder prediction that AI progress will slow down, and academic papers don’t help much in evaluating that prediction.

If you’re tempted to evaluate how well the Ulku paper supports Robin’s views, beware that this quote is one of its easier to understand parts:

In addition, while our analysis lends support for endogenous growth theories in that it confirms a significant relationship between R&D stock and innovation, and between innovation and per capita GDP, it lacks the evidence for constant returns to innovation in terms of R&D stock. This implies that R&D models are not able to explain sustainable economic growth, i.e. they are not fully endogenous.