Book review: Superforecasting: The Art and Science of Prediction, by Philip E. Tetlock and Dan Gardner.
This book reports on the Good Judgment Project (GJP).
Much of the book recycles old ideas: 40% of it is a rerun of Thinking, Fast and Slow, 15% repeats The Wisdom of Crowds, and 15% rehashes How to Measure Anything. Those three books were good enough that it's very hard to improve on them. Superforecasting nearly matches their quality, but most people ought to read those three books instead. (Anyone who still wants more after reading them will get decent value out of the last 4 or 5 chapters of Superforecasting.)
The book is very readable, written in an almost Gladwell-like style (a large contrast to Tetlock's previous, more scholarly book), at a moderate cost in substance. It contains memorable phrases, such as "a fox with the bulging eyes of a dragonfly" (to describe looking at the world through many perspectives).
2.
The book overstates how surprising its research results are. For example, when describing how superteams beat prediction markets, the authors say:
Most economists would say it’s no contest. Prediction markets would mop the floor with the superteams.
But I would expect most economists, at least when writing for academic rather than popular audiences, to say merely that prediction markets would do at least as well as competing approaches.
How robust are superteams to conflicts of interest if used to guide important decisions? Superteams would have been subject to very different pressures if their predictions had been used to decide whether Saddam Hussein had WMDs. Those pressures would probably have made them act more like the CIA did, to some unknown degree. We've got lots of evidence about markets that suggests they won't be corrupted much by such pressures [1]. There are probably ways to minimize superteams' susceptibility to such pressures, but that's a difficult topic.
The GJP might have succeeded partly by appealing only to participants who cared about accuracy. High stakes would affect the recruiting for future superteams. It’s easy to imagine a forecasting team attracting more partisan participants if its predictions were intended to affect big decisions.
The book says little about those risks, and mainly speculates that their prediction markets might not have been liquid enough to elicit the full benefits of markets. That seems like a fairly minor concern to me [2].
So I tracked down a paper that analyzes these results more carefully: Atanasov et al., Distilling the Wisdom of Crowds: Prediction Markets versus Prediction Polls. It mentions some more plausible possibilities:
- Participants assigned to prediction markets were less active than participants assigned to teams. The social feedback of a good team probably encourages activity. I’d expect this to matter less if participants had monetary rewards for good performance, as is the case with typical markets.
- Team predictions were weighted by prior performance (see the sketch below). Real-money markets would also tend to work that way, due to poor performers dropping out and good performers accumulating more money. In the GJP, traders in prediction markets started each year with equal amounts of play money, and a year wasn't much time for performance to generate meaningful differences in balances.
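To make that weighting idea concrete, here's a minimal sketch (Python with NumPy) of combining forecasters' probabilities using weights derived from past Brier scores. The inverse-Brier weighting and the sample numbers are my own illustrative choices, not the GJP's actual aggregation algorithm:

```python
import numpy as np

def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes (lower is better)."""
    return np.mean((np.asarray(probs) - np.asarray(outcomes)) ** 2)

def weighted_forecast(new_probs, past_probs, past_outcomes):
    """Combine forecasters' probabilities for a new question, weighting each
    forecaster by past accuracy (inverse Brier score). Purely illustrative --
    not the GJP's actual aggregation scheme."""
    scores = np.array([brier_score(p, past_outcomes) for p in past_probs])
    weights = 1.0 / (scores + 1e-6)   # better track record -> larger weight
    weights /= weights.sum()
    return float(np.dot(weights, new_probs))

# Three forecasters' track records on four resolved yes/no questions:
past_outcomes = [1, 0, 1, 1]
past_probs = [
    [0.9, 0.2, 0.8, 0.7],   # well calibrated
    [0.6, 0.5, 0.5, 0.6],   # hedges everything
    [0.2, 0.9, 0.3, 0.4],   # usually wrong
]
new_probs = [0.8, 0.5, 0.3]  # their probabilities for a new question

print(weighted_forecast(new_probs, past_probs, past_outcomes))  # ~0.72
print(np.mean(new_probs))                                       # ~0.53, the unweighted average
```

The weighted estimate leans toward the forecaster with the best track record, while the unweighted average treats everyone equally. That gap is roughly what the second bullet above is pointing at.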
Still, it seems quite possible that the GJP achieved better sharing of evidence in teams than is feasible in markets, while also matching the good incentives that markets create. Evidence from team performance in contexts outside of the GJP leads me to suspect it will be hard, but not impossible, to replicate the cooperation that the superteams achieved.
3.
Most of the problem that Tetlock is trying to solve is that people are pursuing goals other than accuracy.
Big-name pundits avoid Tetlock-style track records, and often deny that they're forecasting at all, yet the book demonstrates that they end up making claims that sound indistinguishable from forecasts. They act as if they prefer being entertaining to describing the future accurately.
The book singles out Larry Kudlow as a stereotypical example of a pundit who pushes one Big Idea at the expense of good forecasts. He's a better example now of someone whose career wasn't hurt by inaccurate predictions than he was when the book was written: he's rumored to be in line for a job in the Trump administration (the authors deny being partisan about how they choose examples, but I'm unsure whether to believe them).
The book describes many of the top superforecasters as coming from fairly ordinary backgrounds, with their success being mostly due to having the right attitude toward forecasting, rather than expert domain knowledge.
It seemed a bit incongruous that one superforecaster has a TV show. The book repeatedly mentions the conflict between being accurate and sounding interesting enough to be on TV. Why was this person an exception to that pattern?
Chapter 10 (The Leader's Dilemma) is a weird attempt to explain how to resolve conflicts between the desire for accuracy and the demand that forecasters sound confident. It covers material that's well outside Tetlock's area of expertise, so it's harder to judge whether he's right.
4.
The world would be a much better place if everyone who read this book were to follow its advice. But the book provides plenty of reason to doubt that many readers will want to make accurate forecasts when that conflicts with partisan politics or with the desire for confident leadership.
For people trying to learn how to do well on the stock market, I'm almost tempted to say this book belongs in the top 10. But I'm concerned that the book's style will cause people to nod in agreement without letting the important concepts sink in.
Footnotes
[1] Because anyone who wants to maintain inaccurate prices for a significant time needs, in effect, to write blank checks to anyone who notices that the price is inaccurate.
[2] Their prediction markets had other, potentially more serious, limitations. They used automated market makers in an attempt to improve liquidity, but in two of the four years that I participated, those market makers had weaknesses that enabled me to score well, even though I was too lazy to develop much understanding of the relevant issues. That enabled me to qualify as a superforecaster for the final season, by which time they had implemented a good enough market maker that I wasn't able to do well, given the limited time that I was willing to devote to trading.
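For readers who haven't seen an automated market maker before: the footnote doesn't say which design the GJP markets used, but the logarithmic market scoring rule (LMSR) is a common choice for play-money prediction markets, so here is a minimal sketch of one. The class name, the liquidity parameter b, and the example trade are my own illustrative choices:

```python
import math

class LMSRMarketMaker:
    """Minimal logarithmic market scoring rule (LMSR) market maker for a yes/no
    question. A generic textbook design, not necessarily what the GJP used."""

    def __init__(self, b=100.0):
        self.b = b        # liquidity parameter: larger b means prices move less per trade
        self.q_yes = 0.0  # net "yes" shares sold so far
        self.q_no = 0.0   # net "no" shares sold so far

    def _cost(self, q_yes, q_no):
        return self.b * math.log(math.exp(q_yes / self.b) + math.exp(q_no / self.b))

    def price_yes(self):
        """Current implied probability of 'yes'."""
        e_yes = math.exp(self.q_yes / self.b)
        e_no = math.exp(self.q_no / self.b)
        return e_yes / (e_yes + e_no)

    def buy_yes(self, shares):
        """Play-money cost of buying `shares` of 'yes' from the market maker."""
        cost = self._cost(self.q_yes + shares, self.q_no) - self._cost(self.q_yes, self.q_no)
        self.q_yes += shares
        return cost

mm = LMSRMarketMaker(b=100.0)
print(mm.price_yes())   # 0.5 before any trades
print(mm.buy_yes(50))   # ~28.1: cost of pushing the price up
print(mm.price_yes())   # ~0.62 after the trade
```

The b parameter is the knob that governs liquidity: set it too low and a single trader can swing prices wildly, set it too high and prices barely respond to new information. Getting that balance wrong is one way a market maker can end up with inadequate liquidity or exploitable behavior of the sort described above.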