Book review: Expert Political Judgment: How Good Is It? How Can We Know? by Philip E. Tetlock
This book is a rather dry description of good research into the forecasting abilities of people who are regarded as political experts. It is unusually fair and unbiased.
His most important finding about what distinguishes the worst from the not-so-bad is that those on the hedgehog end of Isaiah Berlin’s spectrum (who derive predictions from a single grand vision) are wrong more often than those near the fox end (who use many different ideas). He convinced me that this finding is approximately right, but it leaves me with questions.
Does the correlation persist at the fox end of the spectrum, or do the most fox-like subjects show some diminished accuracy?
How do we reconcile his evidence that humans with more complex thinking do better than simplistic humans with his evidence that simple autoregressive models beat all humans? That seems to suggest there’s something imperfect about the hedgehog-fox spectrum. Maybe a better spectrum would be based on how much data influences their worldviews?
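For concreteness, here’s a minimal sketch of the sort of simple autoregressive baseline the book compares the experts against (the one-lag least-squares fit and the inflation numbers are my own illustration, not Tetlock’s actual procedure):

```python
# Minimal AR(1) baseline: fit x_t ≈ c + phi * x_{t-1} by least squares
# and extrapolate one step ahead. Purely illustrative data and model choice.
import numpy as np

def ar1_forecast(series):
    x = np.asarray(series, dtype=float)
    lagged, current = x[:-1], x[1:]
    phi, c = np.polyfit(lagged, current, 1)   # slope and intercept of current on lagged
    return c + phi * x[-1]                    # one-step-ahead forecast

inflation = [3.1, 2.9, 3.4, 3.0, 3.3]         # hypothetical annual inflation rates (%)
print(round(ar1_forecast(inflation), 2))
```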
Another interesting finding is that optimists tend to be more accurate than pessimists. I’d like to know how broad a set of domains this applies to. It certainly doesn’t apply to predicting software shipment dates. Does it apply mainly to domains where experts depend on media attention?
To what extent can different ways of selecting experts change the results? Tetlock probably chose subjects resembling those whom most people regard as experts, but there must be ways of selecting experts that produce better forecasts. It seems unlikely they could match prediction markets, but there are situations where we probably can’t avoid relying on experts.
He doesn’t document his results as thoroughly as I would like (even though he’s thorough enough to be tedious in places):
I can’t find his definition of extremists. Is it those who predict the most change from the status quo? Or the farthest from the average forecast?
His description of how he measured the hedgehog-fox spectrum has a good deal of quantitative evidence, but not quite enough for me to check where I would be on that spectrum.
How does he produce a numerical time series for his autoregressive models? It’s not hard to guess for inflation, but for the end of apartheid I’m rather uncertain.
Here’s one quote that says a lot about his results:
Beyond a stark minimum, subject matter expertise in world politics translates less into forecasting accuracy than it does into overconfidence
Hal Finney has apparently created a test that implements Tetlock’s Fox versus Hedgehog scale. I took it, trying to correct for the bias introduced by having read Tetlock’s book (though I doubt I corrected for that bias well), and got +2 (weakly on the fox side of neutral).
Re: why do simple autoregressive models outperform humans?
Meehl found that even _randomly_ weighted regressions outperformed humans.
One explanation: even randomly weighted models tend to put nonzero weight on all factors being considered. People have a hard time weighing more than 2 or 3 factors, effectively putting zero weight on the rest.
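To see why that matters, here’s a toy simulation (my own illustration, not Meehl’s data; the cue counts, weights, and noise level are all assumptions): when the outcome genuinely depends on many cues, a linear model with arbitrary nonzero weights on every cue tends to track it better than a rule that attends to only the two strongest cues.

```python
# Toy simulation: random nonzero weights on all cues vs. a "human-like"
# rule that uses only the two strongest cues. All numbers are made up.
import numpy as np

rng = np.random.default_rng(0)
n_cases, n_cues = 1000, 10
cues = rng.normal(size=(n_cases, n_cues))
true_weights = rng.uniform(0.5, 1.5, size=n_cues)     # every cue matters somewhat
outcome = cues @ true_weights + rng.normal(scale=2.0, size=n_cases)

random_weights = rng.uniform(0.1, 1.0, size=n_cues)   # arbitrary, but none zeroed out
all_cue_pred = cues @ random_weights

top_two = np.argsort(true_weights)[-2:]               # attend to only the 2 strongest cues
two_cue_pred = cues[:, top_two].sum(axis=1)

def accuracy(pred):
    # correlation with the actual outcome, as a crude accuracy score
    return round(np.corrcoef(pred, outcome)[0, 1], 2)

print("all cues, random weights:", accuracy(all_cue_pred))
print("two cues only:           ", accuracy(two_cue_pred))
```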