One of the most important assumptions in The Age of Em is that non-em AGI will take a long time to develop.
Scott Alexander at SlateStarCodex complains that Robin rejects survey data that uses validated techniques, and instead uses informal surveys whose results better fit Robin’s biases. Robin clearly explains one reason why he does that: to get the outside view of experts.
Whose approach to avoiding bias is better?
- Minimizing sampling error and carefully documenting one’s sampling technique are two of the most widely used criteria to distinguish science from wishful thinking.
- Errors due to ignoring the outside view have been documented to be large, yet forecasters are reluctant to use the outside view.
So I rechecked advice from forecasting experts such as Philip Tetlock and Nate Silver, and the clear answer I got was … that was the wrong question.
Tetlock and Silver mostly focus on attitudes that are better captured by the advice to be a fox, not a hedgehog.
The strongest predictor of rising into the ranks of superforecasters is perpetual beta, the degree to which one is committed to belief updating and self-improvement.
Tetlock’s commandment number 3 says “Strike the right balance between inside and outside views”. Neither Tetlock nor Silver offers hope that either more rigorous sampling of experts or dogmatically choosing the outside view over the inside view will help us win a forecasting contest.
So instead of asking who is right, we should be glad to have two approaches to ponder, and should want more. (Robin only uses one approach for quantifying the time to non-em AGI, but is more fox-like when giving qualitative arguments against fast AGI progress).
What Robin downplays is that there’s no consensus of the experts on whom he relies, not even about whether progress is steady, accelerating, or decelerating.
Robin uses the median expert estimate of progress in various AI subfields. This makes sense if AI progress depends on success in many subfields. It makes less sense if success in one subfield can make the other subfields obsolete. If “subfield” means a guess about what strategy best leads to intelligence, then I expect the median subfield to be rendered obsolete by a small number of good subfields. If “subfield” refers to a subset of tasks that AI needs to solve (e.g. vision, or natural language processing), then it seems reasonable to look at the median (and I can imagine that slower subfields matter more). Robin appears to use both meanings of “subfield”, with fairly similar results for each, so it’s somewhat plausible that the median is informative.
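To make the two readings of “subfield” concrete, here is a minimal sketch of how the choice of aggregate changes the answer. The subfield names and year estimates are invented purely for illustration (they are not Robin’s numbers or anyone’s survey data), and the helper function names are hypothetical.

```python
from statistics import median

# Hypothetical "years until human-level ability" estimates, one per subfield.
# These numbers are made up for illustration only.
estimates = {"A": 80, "B": 150, "C": 300, "D": 400, "E": 1000}


def years_if_any_strategy_suffices(est):
    # "Subfield" = competing strategy: the fastest one makes the rest obsolete.
    return min(est.values())


def years_if_every_task_needed(est):
    # "Subfield" = required task: the slowest one is the bottleneck.
    return max(est.values())


def years_by_median(est):
    # The median-style summary, which sits between the two extremes.
    return median(est.values())


print(years_if_any_strategy_suffices(estimates))
print(years_by_median(estimates))
print(years_if_every_task_needed(estimates))
```

Under the competing-strategies reading the median overstates the time needed, while under the required-tasks reading it may understate it, which is the sense in which slower subfields could matter more.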
Scott also complains that Robin downplays the importance of research spending while citing only a paper dealing with government funding of agricultural research. But Robin also cites another paper (Ulku 2004), which covers total R&D expenditures in 30 countries (versus 16 countries in the paper that Scott cites).
Robin claims that AI progress will slow (relative to economic growth) due to slowing hardware progress and reduced dependence on innovation. Even if I accept Robin’s claims about these factors, I have trouble believing that AI progress will slow.
I expect higher em IQ will be one factor that speeds up AI progress. Garrett Jones suggests that a 40 IQ point increase in intelligence causes a 50% increase in a country’s productivity. I presume that AI researcher productivity is more sensitive to IQ than is, say, truck driver productivity. So it seems fairly plausible to imagine that increased em IQ will cause more than a factor of two increase in the rate of AI progress. (Robin downplays the effects of IQ in contexts where a factor of two wouldn’t much affect his analysis; he appears to ignore them in this context).
I expect that other advantages of ems will contribute additional speedups – maybe ems who work on AI will run relatively fast, maybe good training/testing data will be relatively cheap to create, or maybe knowledge from experimenting on ems will better guide AI research.
Robin’s arguments against an intelligence explosion are weaker than they appear. I mostly agree with those arguments, but I want to discourage people from having strong confidence in them.
The most suspicious of those arguments is that gains in software algorithmic efficiency “remain surprisingly close to the rate at which hardware costs have fallen. This suggests that algorithmic gains have been enabled by hardware gains”. He cites only (Grace 2013) in support of this. That paper doesn’t comment on whether hardware changes enable software changes. The evidence seems equally consistent with that or with the hypothesis that both are independently caused by some underlying factor. I’d say there’s less than a 50% chance that Robin is correct about this claim.
Robin lists 14 other reasons for doubting there will be an intelligence explosion: two claims about AI history (no citations), eight claims about human intelligence (one citation), and four about what causes progress in research (with the two citations mentioned earlier). Most of those 14 claims are probably true, but it’s tricky to evaluate their relevance.
I’d say there’s maybe a 15% chance that Robin is basically right about the timing of non-em AI given his assumptions about ems. His book is still pretty valuable if an em-dominated world lasts for even one subjective decade before something stranger happens. And “something stranger happens” doesn’t necessarily mean his analysis becomes obsolete.
 – I can’t find any SlateStarCodex complaint about Bostrom doing something in Superintelligence that’s similar to what Scott accuses Robin of, when Bostrom’s survey of experts shows an expected time of decades for human-level AI to become superintelligent. Bostrom wants to focus on a much faster takeoff scenario, and disagrees with the experts, without identifying reasons for thinking his approach reduces biases.
 – One example is that genetic algorithms are looking fairly obsolete compared to neural nets, now that they’re being compared on bigger problems than when genetic algorithms were trendy.
Robin wants to avoid biases from recent AI fads by looking at subfields as they were defined 20 years ago. Some recent changes in AI are fads, but some are increased wisdom. I expect many subfields to be dead ends, given how immature AI was 20 years ago (and may still be today).
 – Scott quotes from one of three places that Robin mentions this subject (an example of redundancy that is quite rare in the book), and that’s the one place out of three where Robin neglects to cite (Ulku 2004). Age of Em is the kind of book where it’s easy to overlook something important like that if you don’t read it more carefully than you’d read a normal book.
I tried comparing (Ulku 2004) to the OECD paper that Scott cites, and failed to figure out whether they disagree. The OECD paper is probably consistent with Robin’s “less than proportionate increases” claim that Scott quotes. But Scott’s doubts are partly about Robin’s bolder prediction that AI progress will slow down, and academic papers don’t help much in evaluating that prediction.
If you’re tempted to evaluate how well the Ulku paper supports Robin’s views, beware that this quote is one of its easier-to-understand parts:
In addition, while our analysis lends support for endogenous growth theories in that it confirms a significant relationship between R&D stock and innovation, and between innovation and per capita GDP, it lacks the evidence for constant returns to innovation in terms of R&D stock. This implies that R&D models are not able to explain sustainable economic growth, i.e. they are not fully endogenous.