FUTURECASTS JOURNAL
Superforecasting: The Art and Science of
Prediction
by
Philip E. Tetlock and Dan Gardner
April, 2016
Measurement of forecast accuracy:
Reality combines the "clocklike and cloudlike," Philip E. Tetlock and Dan Gardner explain in "Superforecasting: The Art and Science of Prediction."
Real world complexity places limits on predictability that increase exponentially as we look further into the future and as we attempt to narrow ranges of probability. "The laws of physics aside," particular targets vary from those narrow enough to permit scientific "clocklike" certainty to those involving degrees of "cloudlike" complexity and duration that render forecasts totally unreliable.
The iconic metaphor is weather forecasting, the subject of never-ending efforts at improvement. "Forecast, measure, revise. Repeat. It's a never-ending process of incremental improvement that explains why weather forecasts are good and slowly getting better."
The interests of self and tribe often determine forecasts. Forecasting is frequently less concerned with accuracy than with achieving other objectives - ideological, entertainment, propaganda, sensationalism, agenda support, personal ego, or business. Objectivity is all too often viewed as irrelevant; public forecasts are seldom about accuracy and truth, serving instead as mere instruments for tangential interests. "It's a messy business that doesn't seem to be getting any better."
Political polling, for example, demonstrates a clear divide between pollsters who make little effort to examine past successes and failures and those who rigorously examine past results.
Scorekeeping is essential to improvement in forecasting, the authors emphasize, and attaching numbers to forecasts is essential for scorekeeping. "Where wisdom once was, quantification will now be." The authors speculate on the possibility of evolution toward a more testable, results-oriented, evidence-based forecasting profession, much like the transformation of medical practice a century ago.
By measuring - by keeping account of accuracy - considerable improvement in forecasting practice is possible. The authors mention credit scores that, despite obvious weaknesses, are a big improvement on the discretionary, sometimes arbitrary and capricious prior methods.
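The quantification the book relies on is the Brier score, which rewards forecasts in proportion to how close their probabilities land to what actually happened. Here is a minimal Python sketch of the two-sided version the book describes (the function name and sample numbers are ours):

    def brier_score(forecast_prob, outcome):
        # Two-sided Brier score for one yes/no forecast.
        # forecast_prob: probability assigned to the event happening (0.0-1.0).
        # outcome: 1 if the event happened, 0 if it did not.
        # Lower is better: 0.0 is perfect, 0.5 matches blind 50/50 guessing,
        # and a maximally confident forecast that goes wrong scores 2.0.
        p_yes, p_no = forecast_prob, 1.0 - forecast_prob
        return (p_yes - outcome) ** 2 + (p_no - (1 - outcome)) ** 2

    print(brier_score(0.7, 1))  # 0.18 -- a good 70% call
    print(brier_score(0.7, 0))  # 0.98 -- the same call when the event fizzles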
Forecast scoring is Tetlock's life work, and the objective of the "Good Judgment Project" (hereinafter "GJP").
The effort is centered in U. Cal. Berkeley and U. Penn., and is supported by the Intelligence Advanced Research Projects Activity (hereinafter "IARPA"). The IARPA tests posed narrow, short term, less important questions that could be scored. They were often constituent parts of bigger, more important questions that couldn't be scored. However, examining small but pertinent questions together can provide insight into a big question.
The difference is not who the forecasters are but what they do! Intelligence, expert numeracy, news-junkie habits, a willingness to constantly update views - all matter, all are essential, but all are insufficient. "Our analyses have consistently found commitment to self-improvement to be the strongest predictor of performance."
For over half a century, tests have demonstrated the predictive superiority of "well validated algorithms" over human subjective judgment.
Curse of the authoritative word:
The history of medicine is full of futile - often harmful - treatments continued for centuries on the basis of "expert" subjective judgment mulishly maintained and supported by various psychological biases.
The authors describe the curse of the authoritative word as "the God complex." Modern randomized controlled testing didn't arrive until the early 20th century.
One great weakness in expert judgment was pointed out by physicist Richard Feynman. Experts usually quash doubt in their expert judgments, thus enshrining ignorance. "It was the absence of doubt -- and scientific rigor -- that made medicine unscientific and caused it to stagnate for so long."
We rationalize to support our judgments - whether based on "snap" judgments ("common sense" based on experience and immediately available evidence) or on "expert opinion" (based on experience and a lifetime of pertinent study). Confirmation bias then sets in naturally to block objective evaluation of conflicting evidence. "[We] are creative confabulators hardwired to invent stories that impose coherence on the world." The authors cite Daniel Kahneman's dictum: "What you see is all there is."
The scientific response neither accepts nor rejects such conclusions but approaches them with caution as "plausible hypotheses" to be subjected to further examination and, if possible, scientific experiment. Like others, scientists must resist becoming attached to their own hunches and other prejudgments. Even after scientific confirmation, there must remain residual doubt.
Judging forecast accuracy:
Judging forecast accuracy involves questions of time and meaning and scope. The authors emphasize the importance of precision for the evaluation process. The meaning, scope and timeline must be clear for meaningful forecasts capable of being tested for accuracy. Bald probabilities are impossible to evaluate for accuracy.
Unfortunately, forecasts made for public
consumption rarely bother to clarify such details. The authors point
to some notorious examples: the forecasts in the early 1980s of the
prospects for nuclear war sometime in the 1980s; the likelihood that
Federal Reserve quantitative easing policies would result in price
inflation; and the prospect as of 2007 for iPhone success.
Serious evaluation requires "clearly defined terms and timelines." A single probability forecast cannot be evaluated: forecasts "must use numbers," and there must be a continuous series of them before their accuracy can be judged.
Unfortunately, various degrees of doubt are inherent in forecasts for consequential subjects in the non-scientific practical arts. ("Predictions are always uncertain, especially about the future." (Berra, Y.)). It is thus difficult to evaluate the accuracy of such forecasts whether the event occurs or does not, unless the forecast is expressed as a certainty. Is a forecast of a 70% chance of rain revealed to be wrong by a sunny day?
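The standard answer is calibration, judged over a series: a forecaster's 70% calls should come true about 70% of the time, even though no single sunny day settles anything. A small Python sketch of that check (names and data hypothetical):

    from collections import defaultdict

    def calibration_table(forecasts):
        # Group (probability, outcome) pairs by the stated probability and
        # compare each stated probability with the observed frequency.
        buckets = defaultdict(list)
        for prob, occurred in forecasts:
            buckets[round(prob, 1)].append(occurred)
        return {p: sum(hits) / len(hits) for p, hits in sorted(buckets.items())}

    # Ten hypothetical 70% rain forecasts; rain fell on seven of the days.
    history = [(0.7, 1)] * 7 + [(0.7, 0)] * 3
    print(calibration_table(history))  # {0.7: 0.7} -- well calibrated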
Ideologues know one big thing and are confident in applying it to their forecasts. They are thus generally among the worst forecasters. Ideologues, like theologians, cherry-pick the facts to support their prejudgments. They rationalize more than reason.
The authors emphasize Larry Kudlow and his mulishly optimistic forecasts prior to the 2007-2009 Great Recession. Nevertheless, he prospers as an authoritative voice. His clarity and confidence trump his forecasting inaccuracy. (i × i = i²: intellectuality times ideology equals incompetence.) Objectivity is essential for forecasting excellence. Unfortunately, objective analysts don't fare so well in the media.
The authors found "an inverse correlation" between fame and accuracy: The more famous an expert was, the less accurate his forecasts. Those who confront and attempt to resolve doubts generally achieve less fame and fortune.
Best practices:
The importance of aggregating many sources of information and many perspectives on the information is emphasized by the authors. Aggregations of forecasts - polls of polls - often achieve the most consistent results.
Objectivity and hard work in gathering and evaluating facts and perspectives are essential for repetitive forecasting success. However, objectivity and effort are relative qualities, and most people are "hybrids" existing between the extremes.
There must be serious consideration of the possibility of error. The intelligence concerning Iraq's WMD prior to the 2003 Iraq war is used by the authors as a graphic example. The authors reject the view that the WMD finding was unwarranted. Much more damning was the finding that it was expressed without examination of the uncertainties that always exist in intelligence analyses. It "fell prey to hubris." "It wasn't merely wrong. It was wrong when it said it couldn't be wrong." The possibility of error had never seriously been explored. There was no "red team" or "devil's advocate." The possibility that the WMD program had actually been ended was never even considered.
The IARPA challenge:
The Intelligence Advanced Research Projects Activity (IARPA) was formed in 2006 in the wake of the Iraq WMD fiasco to develop and test forecasting methods.
A National Research Council committee had concluded that forecasting methods cannot be trusted until they are tested, so IARPA proposed a wide-ranging test of short-to-medium range forecasts and techniques involving questions similar to intelligence agency concerns.
Studies had already shown that "human cognitive systems will never be able to forecast turning points in the lives of individuals or nations several years into the future."
Teams of forecasters were challenged to beat the consensus forecast by meaningful degrees during four separate trials. In forming a team, the authors' group sought volunteer forecasters. Initial "psychometric" tests produced 3,200 acceptable forecasters. The initial IARPA tests revealed about forty of them as the best forecasters.
Forecasts were supplemented by "wisdom of the crowd" techniques based on the insight that a crowd will have knowledge unavailable to any individual. This was tweaked by giving extra weight to the forecasts of the top forty. Finally, forecasts were "extremized" by, for example, pushing 70% forecasts up to 85% or 30% forecasts down to 15%. The authors explain that if all the information known in bits and pieces among the crowd could be given to each member of the crowd, their forecasts would become more confident, justifying the "extremizing" adjustment.
Such techniques, used by the authors' group, not only repeatedly beat all the other groups and all other forecasting methods by significant margins, they also routinely bested intelligence forecasts that had the advantages of all manner of intelligence information.
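A minimal sketch of this weight-then-extremize pipeline follows. The crowd numbers, the weights, and the power-law extremizing formula are illustrative assumptions - the book reports only the before-and-after probabilities, not the GJP's exact formula:

    def aggregate_and_extremize(probs, weights=None, a=2.0):
        # Weighted average of individual probability forecasts, then an
        # extremizing transform pushing the consensus away from 50%.
        # With a ~= 2, roughly 70% becomes 85% and 30% becomes 15%,
        # matching the book's example. (Assumed functional form, not
        # necessarily the one the Good Judgment Project used.)
        if weights is None:
            weights = [1.0] * len(probs)
        consensus = sum(p * w for p, w in zip(probs, weights)) / sum(weights)
        return consensus ** a / (consensus ** a + (1.0 - consensus) ** a)

    crowd = [0.65, 0.72, 0.70, 0.73]   # hypothetical individual forecasts
    emphasis = [1.0, 1.0, 3.0, 3.0]    # extra weight for the best forecasters
    print(round(aggregate_and_extremize(crowd, emphasis), 2))  # ~0.85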
Forecasting techniques:
Several predominant forecasting techniques are discussed by the authors.
Superforecasters constantly look for other views they can synthesize into their own.
Subjective judgment is the dominant overall factor. Balancing, finding relevant information, and judging relevance and impact are predominant practices. The authors expect computers to become increasingly useful tools in support of expert predictive judgment, but not to displace the expert.
Probability:
The inherent uncertainty of life - and thus of forecasting - is emphasized by the authors. "Nothing is absolutely certain," not even scientifically determined facts. Today's science may be qualified or even overturned by tomorrow's science.
"Nothing is absolutely certain," not even scientifically determined facts. Today's science may be qualified or even overturned by tomorrow's science. |
The result is probability: forecasters must deal in probabilities. "Certainty is illusory."
"To benefit from failure, we must know when we fail." The score must be constantly kept and acknowledged.
Self-assessment should be received skeptically for obvious reasons. Time-lag increases "hindsight bias" and memory inaccuracy. Ambiguous forecasts - the use of ambiguous terms - thwart record keeping.
Accurate and prompt feedback is vital for the adjustment
process. Professional weather forecasters and bridge players benefit from
prompt feedback. "To benefit from failure, we must know when we
fail." The score must be constantly kept and acknowledged.
Forecast efforts must be documented. Good notes on
the factors considered in a forecast are important for the postmortem -
which is essential. Even "correct" forecasts may be bad
forecasts - a matter of luck - that should be acknowledged.
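One way to make such documentation concrete is a structured forecast journal holding exactly what the postmortem needs: the precise question, the numeric forecast, and the reasoning behind it. A hypothetical sketch (all field names are ours, not the book's):

    from dataclasses import dataclass, field
    from datetime import date
    from typing import List, Optional

    @dataclass
    class ForecastRecord:
        question: str                  # precise, resolvable wording
        deadline: date                 # explicit timeline
        probability: float             # numeric, so it can be scored later
        rationale: List[str] = field(default_factory=list)  # factors considered
        outcome: Optional[int] = None  # filled in at resolution: 1 or 0

    entry = ForecastRecord(
        question="Will the measure pass before the deadline?",
        deadline=date(2016, 12, 31),
        probability=0.7,
        rationale=["base rate of similar measures ~60%",
                   "sponsor recently gained support"],
    )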
The superforecasters:
Some "superforecasters" were
identified during the IARPA contests. & |
The superforecasters repeatedly scored substantially above the consensus and intelligence agency forecasts on matters of immediate national interest.
The authors praise the IARPA for running a test that, among other things, demonstrated the limitations of intelligence agency professionals.
"So it seems intelligence and knowledge help but they add little beyond a certain threshold." |
The superforecasters actually increased their lead over all other forecasters in rounds 2 and 3 of the contest. There was no "regression to the mean" that would have been expected if initial success had been predominantly luck. Their success was facilitated by grouping them on teams with other superforecasters for rounds 2 and 3.
They were forecasting events like the price of oil with vast
numbers of variables, or events subject to totally unforeseeable
contingencies, like the likelihood of a violent confrontation between
vessels of two nations in the South China Sea. While of course not
infallible, the correlation of results over several iterations supports
the conclusion that the superforecaster status involved mostly skill
rather than luck.
Superforecasters are "actively open minded," routinely consulting viewpoints from competing ideological, political and personal perspectives. They work at objectivity.
Ultimately it's not intelligence and knowledge that count, but "how you use it." |
Common traits of superforecasters include:
Superforecasters are "actively open minded," routinely consulting viewpoints from competing ideological, political and personal perspectives. They work at objectivity.
All superforecasters are proficient in math; some qualify as math wizards, and math is used on occasion. However, math is seldom dominant in their forecasting activities.
Ultimately it's not intelligence and knowledge that count, but
"how you use it."
While warning that there is no one way to achieve superforecaster capabilities, the authors sum up the general techniques.
Unfortunately, new information has to overcome intellectual inertia - "confirmation bias" and other sources of psychological bias that inhibit the recognition of error. These biases become especially difficult to overcome when reinforced by ideology or professional or personal commitment to an existing forecast.
Superforecasters are news junkies. They use both
traditional and online sources and update forecasts repeatedly. During the
IARPA contest, even the initial forecasts of the superforecasters were
about 50% more accurate than those of the other forecasters, but that is
not enough for them. They persistently work to improve their forecasts.
"Try, fail, analyze, adjust, try again," is the way the authors describe this practice. There is a persistent need to resist biases, especially the tendency to rely on already acquired knowledge and on what is readily available -- on "the tip of your nose."
"What makes them so good is less what they are than what they do -- the hard work of research, the careful thought and self-criticism, the gathering and synthesizing of other perspectives, the granular judgments and relentless updating."
Superforecasters are not recognized "experts" or "professionals," and so are much less personally invested in any particular forecast.
Finding the middle ground between underreacting and overreacting to unexpected news is a skill of superforecasting. Typically, updates involve small changes in probability assessments. Superforecasters make many small adjustments and generally avoid dramatic shifts.
The new information must be accorded its due, but it rarely
negates the old information, so frequent small adjustments are generally
appropriate. Nevertheless, when new information not only modifies but
threatens the validity of old information, the latter may have to be
discarded, leading to a radical change. Even here, there is no certainty.
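This rhythm of frequent small revisions, with the occasional forced overhaul, is the spirit of the Bayesian belief-updating the book discusses: each piece of news multiplies the prior odds by a likelihood ratio. A sketch with illustrative numbers:

    def bayes_update(prior_prob, likelihood_ratio):
        # Bayes' rule in odds form: posterior odds = prior odds * LR.
        # A ratio near 1 (weak evidence) only nudges the forecast;
        # an extreme ratio forces the radical revision described above.
        prior_odds = prior_prob / (1.0 - prior_prob)
        posterior_odds = prior_odds * likelihood_ratio
        return posterior_odds / (1.0 + posterior_odds)

    p = 0.60
    p = bayes_update(p, 1.2)   # mildly supportive news        -> ~0.64
    p = bayes_update(p, 0.9)   # slightly contrary detail      -> ~0.62
    p = bayes_update(p, 0.05)  # evidence gutting the old view -> ~0.07
    print(round(p, 2))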
Forecasting and planning teams:
The use of forecasting and planning teams is covered at some length by the authors.
Various measures designed to benefit from collective wisdom are described by the authors. Skepticism must be encouraged and the retention of independence of judgment must be facilitated.
During the IARPA contest, the authors randomly divided their forecasters into those who would work alone and those who would join teams that would communicate online. Some general guidance was provided, but each team developed its own procedures.
Debate must be robust, but confrontation must always be "respectful." Generalities should be evaluated by examination of their particular parts.
"Bring in outsiders, suspend hierarchy, and keep the leader's views under wraps. There's also the 'premortem,' in which the team is told to assume a course of action has failed and to explain why -- which makes team members feel safe to express doubts they may have about the leader's plan."
The first iteration of the IARPA contest revealed a 23% advantage for teams. For the second iteration, the authors employed only teams. They concentrated their best forecasters into teams of a dozen forecasters each.
Properly managed, teams have huge advantages in knowledge, information gathering, personal commitment, and variety of perspectives. The results in iterations 2 and 3 of the IARPA contest were startling, with accuracy improvements of about 50%. Interestingly, applying the "extremizing" technique brought the results of the separate forecasters and ordinary teams much closer to those of the superteams.
The best teams were "actively open-minded."
"A group of open-minded people who engage one another in pursuit of the truth will be more than the sum of its opinionated parts." |
Prediction markets beat ordinary teams by about 20%, but superteams beat prediction markets by 15% to 30%. Prediction markets have a good record, but they lack the liquidity and intensity of purpose of real financial markets. (Very few actively managed investment funds regularly beat the financial markets.)
Teams are idiosyncratic as are their members, and team formation is a highly nebulous art. The authors caution against accepting some formula for forming a team. They suggest some best practices, not recipes for success for all purposes.
The superforecaster team members were generalists. Professional analysts and futurists are often specialists who develop expertise in particular fields of interest and particular targets. The authors nevertheless point out that their superforecaster teams outperformed the intelligence agency professionals and specialists on many occasions. Their forecasting practices were tested during three iterations of the IARPA challenge and proved their value.
Leadership and forecasting:
The intellectual differences between the arts of the forecaster and the leader are explained by the authors. Both must analyze similarly, but leaders ultimately must act, usually with imperfect information. Leaders must act decisively yet remain flexible as new information is received.
It is war, of course, that most dramatically reveals the strengths and weaknesses of strategic planning and leadership in action. The authors discuss the practices of the German and U.S. armies in WW-II, the Israeli army and the current U.S. army. They emphasize the need to combine decisiveness with flexibility - and to extend that combination all the way down the chain of command.
"Intellectual humility compels the careful reflection necessary for good judgment; confidence in one's ability inspires determined action." |
Corporate leadership involves similar problems and similar approaches for their solution. The authors sum up the combination of self-confidence and intellectual humility required for good leadership.
The dangers of confirmation bias, in all its many guises - ego, ideology, herd instinct, established judgment, overconfidence in the adequacy of current knowledge, etc. - are emphasized by the authors. You never really know enough. Hindsight bias makes the past look more predictable than it ever was and renders knowledge uncertain. (Never underestimate an adversary or overestimate the capabilities of allies or your own team.)
Thus, institutions must plan for surprise and incorporate plans for resiliency and adaptability. Unfortunately, costs put practical limits on such preparations. How much should be spent on earthquake predictions, and where?
Accurate forecasting can make a significant difference in all manner of plans and activities on a human and short-term scale. However, before you can seek an answer, you must first recognize a question. The best forecasters may not be the best at recognizing pertinent questions. Those with questions and those who find answers may need each other, the authors speculate.
A big question - the effectiveness of interest rate
suppression policy since 2009 - is discussed as an example at the end
of the book. How accurate have the objections of those advocating an
austerity alternative been?
Interest rates are of fundamental importance for the proper functioning of a wide array of the market's complex mechanisms. The masters of the financial universe at the central banks toy with interest rates with an astounding combination of hubris and naïveté.
FUTURECASTS remains firmly in the anti-Keynesian policy camp. It is not unusual for strong, naturally resilient capitalist systems to remain prosperous for the better part of a decade in the face of even the most destructive economic policies.