Book review: Expert Political Judgment: How Good Is It? How Can We Know? by Philip E. Tetlock
This book is a rather dry description of good research into the forecasting abilities of people who are regarded as political experts. It is unusually fair and unbiased.
His most important finding about what distinguishes the worst from the not-so-bad is that those on the hedgehog end of Isaiah Berlin’s spectrum (who derive predictions from a single grand vision) are wrong more often than those near the fox end (who use many different ideas). He convinced me that that finding is approximately right, but leaves me with questions.
Does the correlation persist at the fox end of the spectrum, or do the most fox-like subjects show some diminished accuracy?
How do we reconcile his evidence that humans with more complex thinking do better than simplistic humans with his evidence that simple autoregressive models beat all humans? That seems to suggest that the hedgehog-fox spectrum is an imperfect measure. Maybe a better spectrum would measure how much data influences a forecaster’s worldview?
Another interesting finding is that optimists tend to be more accurate than pessimists. I’d like to know how broad a set of domains this applies to. It certainly doesn’t apply to predicting software shipment dates. Does it apply mainly to domains where experts depend on media attention?
To what extent can different ways of selecting experts change the results? Tetlock probably chose subjects who resemble those whom most people regard as experts, but there must be ways of selecting experts that produce better forecasts. It seems unlikely they can match prediction markets, but there are situations where we probably can’t avoid relying on experts.
He doesn’t document his results as thoroughly as I would like (even though he’s thorough enough to be tedious in places):
I can’t find his definition of extremists. Is it those who predict the most change from the status quo? Or the farthest from the average forecast?
His description of how he measured the hedgehog-fox spectrum has a good deal of quantitative evidence, but not quite enough for me to check where I would be on that spectrum.
How does he produce a numerical timeseries for his autoregressive models? It’s not hard to guess for inflation, but for the end of apartheid I’m rather uncertain.
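To make the inflation case concrete, here is a minimal sketch of what a first-order autoregressive forecast could look like once a numerical series exists. This is my guess at the general technique, not Tetlock’s documented procedure; the function names and the inflation figures are made up for illustration.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = a + b * x[t-1]; returns (a, b)."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    xs = series[:-1]   # values at time t-1
    ys = series[1:]    # values at time t
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # A constant history gives no slope information: predict the mean.
        return mean_y, 0.0
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = cov / var_x
    a = mean_y - b * mean_x
    return a, b


def forecast_next(series):
    """One-step-ahead forecast from the fitted AR(1) model."""
    a, b = fit_ar1(series)
    return a + b * series[-1]


# Hypothetical annual inflation rates in percent (not real data).
inflation = [4.0, 5.0, 6.0, 7.0, 8.0]
print(forecast_next(inflation))  # prints 9.0
```

A one-off event such as the end of apartheid would first have to be coded as numbers (say, a 0/1 indicator per year) before anything like this could run, and that coding step is exactly what I can’t find in the book.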
Here’s one quote that says a lot about his results:
Beyond a stark minimum, subject matter expertise in world politics translates less into forecasting accuracy than it does into overconfidence