Making that big decision
30 April 2001
A bad decision in the nuclear industry can easily become a very expensive one. Professors Nick Chater and Koen Lamberts of the University of Warwick’s Department of Psychology explain how big decisions often turn into big disasters.
The decision-making process is fundamental to the success of all areas of human activity, especially in the nuclear industry. Yet many projects demonstrate that human decision-making is notably frail. Vast numbers of decisions, often big and sometimes highly public ones, go badly wrong.
So why is human decision-making so fallible? Despite the huge importance of the question, there has been relatively little interest across the business and public sectors in finding out why decision-making goes wrong and in finding ways to improve it. This is particularly peculiar from the perspective of cognitive science, the academic discipline concerned with building and testing information-processing models of human thought. There is a large body of well-attested knowledge about how people make decisions, where they are likely to go wrong, and at least some valuable suggestions about how decision-making can be improved. But knowledge locked up in academic journals is a peculiarly useless thing from the perspective of real decision-makers. The real question is: can basic research in cognitive science be made relevant to the decisions that people face in the real world?
Perhaps the most striking, and shocking, result concerns the quality of expert judgement – how good experts are at using a range of given information to judge how some particular outcome will turn out.
To assess how good people are at these judgements, we need some kind of standard against which performance can be measured. The most straightforward standard to use is a so-called ‘linear’ statistical model. This kind of model can be viewed as giving ‘points’ for each aspect of the information that the decision-maker possesses. Then, after choosing an appropriate weighting using a statistical method (such as linear regression), the points are added together to come to the final judgement.
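As an illustration, this ‘point-count’ model can be sketched in a few lines of Python. The case features, past outcomes, and numbers below are invented for the example; the weighting is chosen by ordinary least squares, the simplest form of linear regression.

```python
import numpy as np

# Hypothetical training data: each row is a past case described by three
# features (say, scores for cost, complexity, and track record), and y is
# the observed outcome of that case. All numbers are illustrative only.
X = np.array([
    [3.0, 1.0, 4.0],
    [2.0, 2.0, 5.0],
    [5.0, 0.0, 3.0],
    [1.0, 3.0, 2.0],
    [4.0, 1.0, 5.0],
])
y = np.array([7.0, 8.0, 6.0, 4.0, 9.0])

# Choose an appropriate weighting by least squares (linear regression).
weights, *_ = np.linalg.lstsq(X, y, rcond=None)

def judge(case):
    """Add up the weighted 'points' for each feature of a new case."""
    return float(np.dot(weights, case))

new_case = [3.0, 2.0, 4.0]
prediction = judge(new_case)
```

Despite its crudeness — fixed weights, no interactions between features — this is the standard of comparison the studies below use.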
One would expect experts to do much better than this simplistic approach: they understand not just the specific features of each case individually, but also how those features interrelate; they have made large numbers of similar judgements in the past; and they usually have vast amounts of potentially relevant background information that the simple ‘point-count’ statistical method blithely ignores.
But the reverse is the case! Across over a hundred scientific studies, the earliest of which date back to the 1950s, it has consistently been found that experts rarely match, let alone surpass, this simple alternative way of making judgements. In fact, they generally do worse. People seem to find it remarkably difficult to pull together all the knowledge that is relevant to making a judgement, despite frequently having large amounts of knowledge and experience.
The frailty of human judgement is, however, masked by a second fundamental feature of the human decision-maker: overconfidence. People who think they know the answer to a mundane general knowledge question are consistently much surer than they should be. So, for example, in a typical study, for questions where people think they have an 80% chance of being correct, they actually give the right answer just 65% of the time. The effect is one of the most ubiquitous in the study of the mind; crucially, people are overconfident in their predictions of what the future will bring, including the outcomes of their own decisions. This makes the frailty of human judgement even more dangerous. Decisions of all kinds, including big decisions, are routinely being made with a powerful, but illusory, sense of control and understanding.
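A minimal sketch of how such calibration studies are scored: judgements are binned by the confidence the person reported, and each bin's actual hit rate is compared with that stated confidence. The records below are invented to mirror the 80%-stated, roughly 65%-actual pattern described above.

```python
from collections import defaultdict

# Hypothetical records of (stated confidence, whether the answer was correct).
# Real calibration studies bin thousands of judgements this way.
judgements = [
    (0.8, True), (0.8, False), (0.8, True), (0.8, False), (0.8, True),
    (0.8, True), (0.8, False), (0.8, True), (0.8, True), (0.8, False),
    (0.6, True), (0.6, False), (0.6, True), (0.6, False), (0.6, True),
]

bins = defaultdict(list)
for confidence, correct in judgements:
    bins[confidence].append(correct)

# Hit rate per stated-confidence level; overconfidence is the gap between
# what people claimed and how often they were actually right.
hit_rate = {c: sum(hits) / len(hits) for c, hits in bins.items()}
overconfidence = {c: c - hit_rate[c] for c in hit_rate}
```

A well-calibrated judge would show a gap near zero in every bin; the robust experimental finding is a positive gap that grows with stated confidence.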
There are some nice experimental studies illustrating how non-evidence can be interpreted as positive evidence. In one study, people are given two urns. One contains, say, 60 red chips and 40 blue chips; the other contains 40 red chips and 60 blue chips. One of the urns is chosen at random, and the participant in the experiment is not told which. Chips are then successively drawn (and replaced) from the urn, and the question is, as ever more chips are drawn, how confident the person is that they know which urn was chosen. Compared to what the laws of probability theory dictate, people make decisions in the right direction (if they have seen more red chips, they guess it is the urn with more red chips). For once, they are under- rather than over-confident: they want to see far more chips than is actually necessary to make a reliable judgement. But this difference from ‘real-world’ situations is not surprising, because here people have no experience or background knowledge on which to rely, and it is this that tends to give rise to overconfidence.

For the present argument, the key point is what happens when people come to an initial view (it’s the ‘mainly red’ urn, with 60% confidence) and then see a sequence containing equal numbers of red and blue chips. Such a sequence is, of course, equally likely whichever urn is being sampled, so it provides no reason whatever to modify the person’s confidence level. But such ‘neutral’ sequences are typically accompanied by a steady increase in confidence. The upshot is that no news is viewed as good news! So it is perhaps not surprising that highly experienced experts develop substantial faith in their judgement, even where the basis for this faith, in terms of practical success, is slight.
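The urn argument can be checked directly with Bayes' rule. The sketch below uses the 60/40 proportions from the example above and shows that a balanced run of reds and blues leaves a 60% prior exactly where it started, whereas real evidence does move it.

```python
def posterior_mostly_red(draws, prior=0.5):
    """Posterior probability, by Bayes' rule, that the chosen urn is the
    mostly-red one (60 red / 40 blue), after a sequence of draws with
    replacement, given as a string of 'R' and 'B' characters."""
    p_red_A, p_red_B = 0.6, 0.4  # chance of drawing red from each urn
    like_A = like_B = 1.0
    for chip in draws:
        if chip == 'R':
            like_A *= p_red_A
            like_B *= p_red_B
        else:
            like_A *= 1.0 - p_red_A
            like_B *= 1.0 - p_red_B
    return prior * like_A / (prior * like_A + (1.0 - prior) * like_B)

# A run of reds is genuine evidence and pushes belief towards
# the mostly-red urn:
after_reds = posterior_mostly_red('RRR')

# But a balanced sequence is equally likely from either urn, so a 60%
# prior should stay at exactly 60% -- no news really is no news:
after_neutral = posterior_mostly_red('RB', prior=0.6)
```

The experimental finding is that people's confidence drifts upwards on exactly the balanced sequences for which the normative answer is "no change at all".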
One might wonder whether some of the frailties of individuals might be cancelled out in the interchange of group discussion. Is group decision-making more accurate and less overconfident than that of individuals? Unfortunately, the weight of evidence points the other way. Groups are typically no more accurate than individuals, but are frequently much more confident. To simplify fairly drastically, the problem is that group members convince each other that a particular view is correct. People are ‘soaking up’ views from other members of the group, which can strengthen their own views, and this is a mutually reinforcing process. Many famously disastrous decisions emerge from insular and close-knit groups. For example, in the much-studied Bay of Pigs debacle in 1961, Kennedy and a small team of advisors planned the operation while deliberately avoiding discussions with outsiders who might disagree with the plan. Group members explicitly held back their own misgivings, or explicitly instructed each other not to rock the boat. Kennedy learned the lesson from this planning disaster, using a much more open team when handling the Cuban missile crisis the next year, and deliberately seeking out those who might hold contrary opinions.
So where does this leave expert judgement and decision-making? Certainly, the upshot of academic research indicates that expert judgements can often be equalled or outperformed by simple statistical methods, and that experts are consistently overconfident. But this does not, of course, mean that experts are not ‘expert’ or that they are unnecessary. Experts do have a great deal of knowledge, and this can be essential in, for example, putting together a plan of action, carrying out a procedure, or communicating and justifying a decision. Moreover, experts are often needed to provide the information that feeds into the decision-making process, and to decide which features apply in some particular case. What experts, like all of us, find very difficult is simultaneously taking account of a multitude of different factors in order to come to a specific judgement or decision.
There are, in general terms, two ways to improve how decisions are made. The first relies on the fact that, in some cases, the crucial phase of integrating disparate knowledge into a single judgement can simply be delegated to a statistical method. There is a tendency, though, to reserve human experts for the really big decisions. So just when the stakes are highest, the frailer system is used. At least in some contexts, this may produce the worst possible outcome. The second way, for cases where the quantity of past data that statistical methods require is simply not available, is to use expert knowledge more effectively. Techniques include replacing group discussion with independent, and sometimes anonymous, expert opinions (to avoid the biases that groups introduce), and using a ‘statistical’ point-count model in which experts, rather than past data, supply the estimate of how many ‘points’ each feature should carry. These and a large range of other techniques can be used to ‘de-bias’ judgement and decision-making, although not, of course, to perfect it. The value to commercial and public sector bodies, and to the nuclear industry in particular, of drawing on some of these methods may be substantial.
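A sketch of the second approach, with invented experts, features, and point values: each expert independently assigns a weight to each feature, the weights are averaged to dampen individual bias and group effects, and a case is then scored mechanically by the point-count rule.

```python
# Hypothetical, independently elicited weights ('points' per feature) from
# three experts. Feature names and numbers are invented for illustration.
expert_weights = {
    "expert_1": {"safety_margin": 5, "vendor_track_record": 3, "schedule_risk": -2},
    "expert_2": {"safety_margin": 4, "vendor_track_record": 2, "schedule_risk": -3},
    "expert_3": {"safety_margin": 6, "vendor_track_record": 4, "schedule_risk": -1},
}

features = ["safety_margin", "vendor_track_record", "schedule_risk"]

# Average the independently given weights rather than letting the experts
# negotiate a consensus, to avoid the group-confidence effects above.
avg_weight = {
    f: sum(w[f] for w in expert_weights.values()) / len(expert_weights)
    for f in features
}

def score(case):
    """Add up the averaged weighted points for a case's feature values."""
    return sum(avg_weight[f] * case[f] for f in features)

case = {"safety_margin": 2.0, "vendor_track_record": 1.0, "schedule_risk": 3.0}
result = score(case)
```

The experts still supply all the knowledge; only the final integration step, the one people demonstrably do badly, is handed to the mechanical rule.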
Perhaps the most important lesson from the study of real-world human decision-making is just how difficult the task of forecasting the future, and attempting to control it, really is. This is a task that is made substantially more difficult if we are unaware of fundamental human limitations.