Book review: The Measure of All Minds: Evaluating Natural and Artificial Intelligence, by José Hernández-Orallo.
Much of this book consists of surveys of the psychometric literature. But the best parts of the book involve original results that bring more rigor and generality to the field. The best parts of the book approach the quality that I saw in Judea Pearl’s Causality, and E.T. Jaynes’ Probability Theory, but Measure of All Minds achieves a smaller fraction of its author’s ambitions, and is sometimes poorly focused.
Hernández-Orallo has an impressive ambition: measure intelligence for any agent. The book mentions a wide variety of agents, such as normal humans, infants, deaf-blind humans, human teams, dogs, bacteria, Q-learning algorithms, etc.
The book is aimed at a narrow and fairly unusual target audience. Much of it reads like it’s directed at psychology researchers, but the more original parts of the book require thinking like a mathematician.
The survey part seems pretty comprehensive, but I wasn’t satisfied with his ability to distinguish the valuable parts (although he did a good job of ignoring the politicized rants that plague many discussions of this subject).
For nearly the first 200 pages of the book, I was mostly wondering whether the book would address anything important enough for me to want to read to the end. Then I reached an impressive part: a description of an objective IQ-like measure. Hernández-Orallo offers a test (called the C-test) which:
- measures a well-defined concept: sequential inductive inference,
- defines the correct responses using an objective rule (based on Kolmogorov complexity),
- with essentially no arbitrary cultural bias (the main feature that looks like an arbitrary cultural bias is the choice of alphabet and its order)[1],
- and gives results in objective units (based on Levin’s Kt).
Yet just when I got my hopes up for a major improvement in real-world IQ testing, he points out that what the C-test measures is too narrow to be called intelligence: there’s a 960 line Perl program that exhibits human-level performance on this kind of test, without resembling a breakthrough in AI.
Continue Reading