Updated: July 2017
To better inform our thinking about long-term philanthropic investment and hits-based giving, I (Luke Muehlhauser) have begun to investigate the historical track record of long-range forecasting and planning. I hope to publish additional findings later, but for now, I’ll share just one example finding from this investigation,1 concerning one of the most famous and respected products of professional futurism: the 1967 book The Year 2000: A Framework for Speculation on the Next Thirty-three Years, co-authored by Herman Kahn and Anthony J. Wiener.
1 Background information on Herman Kahn
Herman Kahn was one of the most prominent futurists and military strategists of the 20th century, and is sometimes cited as a “father of scenario analysis.”2 During the 1950s and 60s he became well-known for his contributions to nuclear war strategy. His early research was conducted at RAND, the original research arm of the U.S. Air Force, which was in charge of the country’s nuclear arsenal.3
In 1961, Kahn co-founded his own “high-class RAND” called the Hudson Institute, which soon employed at least 80 research analysts, research aides, and other staff.4 For a while he continued to publish on nuclear strategy and consult for the Department of Defense, but in the mid-1960s he began to turn his attention from military strategy to long-term “futurology.” In 1967, Kahn and Wiener (hereafter “K&W”) cobbled together five years of future-oriented work by the Hudson Institute into the book The Year 2000. Kahn biographer (and occasional co-author) B. Bruce-Briggs later described it as “the fundamental text” of futurology and “the single most important work in the field,” and wrote that “as a result of the book, [Kahn] became established in a second public career… as the leader of futurology.”5
2 Richard Albright’s assessment of technology forecasts in The Year 2000
Perhaps the easiest-to-evaluate part of The Year 2000 is its list of 100 technology predictions (for the year 2000), which appear on pp. 51-55. Conveniently, Albright (2002) assessed the accuracy of these 100 forecasts using a method reasonable enough that I don’t think an independent assessment of my own would add much value.
Albright assembled a panel of eight experts, who were “experienced in a range of scientific fields with a mix of industrial and academic backgrounds,” to (independently) judge the accuracy of the 100 forecasts on a 5-point scale:
1. Bingo: a truly remarkable prediction that has materialized.
2. Okay: a good prediction of innovation that has materialized.
3. Not Yet: a prediction that might occur but has not happened yet.
4. Oops: just wrong.
5. What?: as in: “What were they thinking?”
The final rating for each forecast was the average of the ratings assigned by all panelists, and a forecast was judged accurate if its average rating was 2 or lower.6
To illustrate some of Albright’s results, here are the four best-rated forecasts (I’ve preserved K&W’s original forecast numbering):
- 71. Inexpensive high-capacity, worldwide, regional, and local (home and business) communication (perhaps using satellites, lasers, and light pipes)
- 74. Pervasive business use of computers
- 82. Direct broadcasts from satellites to home receivers
- 1. Multiple applications for lasers and masers for sensing, measuring, communication, cutting, welding, power transmission, illumination, and destructive (defensive)
And here are the four worst-rated forecasts:
- 35. Human hibernation for relatively extensive periods (months to years)
- 27. The use of nuclear explosives for excavation and mining, generation of power, creation of high-temperature-pressure environments, or as a source of neutrons or other radiation
- 79. Inexpensive and reasonably effective ground-based BMD (ballistic missile defense)
- 19. Human hibernation for short periods (hours or days)
Unsurprisingly, panelists differed greatly on how many forecasts they rated as having occurred (see Albright Table 4), and some forecasts were rated with much more consensus than others (see Albright Table 6). The table of forecasts with greatest and least consensus makes intuitive sense to me: those with greatest consensus seem to have relatively straightforward interpretations (e.g. “Direct broadcasts from satellites to home receivers”), and those with least consensus are phrased ambiguously and thus easily allow many interpretations (e.g. “New techniques for keeping physically fit and/or acquiring physical skills”).
Overall, about 45% of the forecasts were judged as accurate.
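Albright’s scoring rule can be sketched in a few lines of code. The ratings below are hypothetical illustrations, not Albright’s actual panel data:

```python
# A minimal sketch of Albright's scoring rule: each of eight panelists
# rates a forecast from 1 ("Bingo") to 5 ("What?"), and a forecast
# counts as accurate if its average rating is 2 or lower.

def is_accurate(panel_ratings):
    """Return True if the average panel rating is 2 or lower."""
    return sum(panel_ratings) / len(panel_ratings) <= 2

# Hypothetical ratings for two forecasts from the lists above:
satellite_broadcast = [1, 1, 2, 1, 2, 1, 1, 2]  # widely judged to have occurred
human_hibernation = [4, 5, 4, 4, 5, 4, 4, 5]    # widely judged "just wrong"

print(is_accurate(satellite_broadcast))  # True  (average 1.375)
print(is_accurate(human_hibernation))    # False (average 4.375)
```

Applying this rule across all 100 forecasts yields the ~45% accuracy figure above.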
3 How good are these results?
The obvious thing to say is that it’s hard to tell. First, we don’t have any baseline to which we can compare K&W’s performance, such as a contemporaneous poll of experts asked to predict the likelihood of each of these same 100 innovations occurring by 2000. Without such a baseline, it’s difficult for us to understand how surprising or obvious each of these forecasts would have seemed to informed experts at the time. Second, many of the forecasts were stated ambiguously and thus were difficult for the judges to assess for accuracy. Third, K&W’s book is somewhat ambiguous about the degree to which they were trying to forecast the future.7
One could argue that K&W seem to have been hugely overconfident in these forecasts, given their statement (p. 50) that for each of these forecasts, “a responsible opinion can be found to argue a great likelihood that the innovation will be achieved before the year 2000 — usually long before. (We would probably agree with about 90-95 per cent of these estimates.)” If we interpret this to mean that they thought 90-95% of the 100 forecasts would come true, and we now know that ~45% of them came true, then K&W exhibited extreme overconfidence in their forecasts — substantially worse than is typical of, say, untrained subjects asked general knowledge questions, or political pundits asked to make short- and medium-term geopolitical forecasts.8
It seems plausible (but not obvious) to me that K&W’s performance on these forecasts really was this poor: after all, I haven’t seen evidence that K&W engaged in any probability calibration training prior to making these forecasts, nor even that they were aware of the contemporary probability calibration literature. Also, it seems likely to be even more difficult to make well-calibrated long-term forecasts than it is to (e.g.) make well-calibrated estimates about general knowledge questions or about geopolitical events occurring in the short- to medium-term future.
Intuitively, it is somewhat hard to imagine K&W thinking that 90-95% of these predictions would come true, given how radical and specific many of them are. Perhaps when they wrote that they “would probably agree with about 90-95 per cent of these estimates,” what they meant is that they thought 90-95% of these predictions had “a great likelihood” of coming true by 2000, where a “great likelihood” meant something like 65%-90%. In that case, it would still seem that K&W were overconfident, though less grossly so than if they really expected 90-95% of these 100 forecasts to come true by 2000.
However, I am inclined to believe the interpretation that K&W expected 90-95% of these 100 forecasts to come true by the year 2000. This is because, immediately after their list of “one hundred technical innovations very likely in the last third of the twentieth century,” K&W provide a shorter list of 25 “less likely but important possibilities,” clarifying that by “less likely,” they mean that these are “areas in which technological success by the year 2000 seems substantially less likely (even money bets, give or take a factor of five)…” (p. 55). This seems to mean that they’d bet at odds somewhere between 1:5 to 5:1 (1:1 being an “even money bet”), which implies a confidence of 16.67% to 83.33% for each of the 25 “less likely” forecasts. For these forecasts to be strictly “less likely” than the previously-listed 100 forecasts, K&W must have considered each of the previous 100 forecasts to be >83.33% likely to occur, which seems to vindicate the interpretation that they thought 90-95% of those forecasts would come true.
And that, in turn, implies that K&W were hugely overconfident in those 100 forecasts.
4 Sources
| DOCUMENT | SOURCE |
|---|---|
| Albright (2002) | Source (archive) |
| Bruce-Briggs (2000) | Source (archive) |
| Cooke (1991) | Source (archive) |
| Kaplan (1968) | Source |
| Kuosa (2012) | Source (archive) |
| Lichtenstein et al. (1982) | Source (archive) |
| Menand (2005) | Source (archive) |
| Tetlock (2005) | Source (archive) |
| The Year 2000: A Framework for Speculation on the Next Thirty-three Years | Source |
| Wikipedia, Herman Kahn | Source (archive) |