Math Talk

This Math Talk will be found in Unit 5, which deals with mathematical analysis. It covers technical topics related to this unit. All of the highlighted words are defined in the glossary.

The (un)Truth About Statistics

Have you ever heard the expression, "Four out of five doctors recommend..."? Or "... 42% more relief from heartburn"? Or "... better highway mileage than any other sub-compact hatchback sedan costing under $10,000 made in America"? Perhaps you suspected that these claims were not completely true. It is wise to be suspicious, because statistics (and numbers in general) can be manufactured to make any idea sound convincing. When used properly, statistics is a powerful tool for uncovering truth; when used improperly, it can be manipulated to prove almost anything.

Try, try again

There are lots of ways to misuse statistics. One way is perseverance: if at first you don't succeed (i.e., get the result you wanted), try, try again. Suppose you want to claim in a TV commercial that 4 out of 5 dentists recommend your toothpaste. You ask 5 dentists, but only 1 of them recommends your brand. So, forget you ever asked them! Ask another 5 dentists! This time, 2 of them recommend your brand. Forget them! Ask another 5! Keep trying until, by random fluctuation, you get lucky and 4 out of 5 recommend your brand. Then, show your TV commercial. Whatever you do, do not talk about the 13,925 dentists you had to survey before you got lucky, and don't mention that only 8% of them recommended your brand.

Sometimes this sort of thing happens even to honest people. If the results do not match our theory, it is too easy to think of a "good" reason to believe that data we do not like are not valid, so we have to do the experiment again. This happens far too often in scientific research, even today. Despite people's best intentions to be fair, there is just too much temptation to rationalize away the "bad" data.
However, you rarely see any scientists rationalize away the "good" data, the data which support their theories!

Here we have the first lesson of honest statistics: you cannot ignore the data that do not fit your theory. Sometimes you have good reason to believe some piece of data should be excluded because it is just a mistake. But in your scientific report, you have to say so, and state exactly why it has been omitted. You can exclude data if you have good reason, but you cannot ignore them, or fail to report them.

How many?

A nursing home recently tried new procedures designed to reduce the number of accidental injuries to patients. They were pleased to announce that in the first four months of the year, patient accidents were down a whopping 60% compared to last year. Can't argue with that!

Or can you? How many are we talking about here? If last year there were 50 accidents, and this year only 20, then they are down 60%, and there is no doubt that this result is statistically significant. The chance of the result happening by random fluctuation ("by accident") is less than 1 in 10,000.

But suppose there were 5 accidents last year, and only 2 this year. Yes, they are down 60%. But no, this result is not significant. The chances are better than 1 in 4 that this could happen by random fluctuation.

We have already seen that as we acquire more data, our results become more precise. They also become more reliable. Sometimes, an early result is based on so little data that it has no real significance. Do not put too much faith in statistical results (not even a whopping 60%) until you know how much data went into them.

Survey says!

Suppose two politicians are debating a school funding bill. They both try to show that the public is on their side by conducting a survey. Politician A wants to show that people favor the bill, so his survey asks, "Should we invest more in our children's future by passing the school funding bill?"
Lo and behold, people do want to invest in their children's future, so most people say yes, and politician A announces that the vast majority favor his bill. Politician B wants the bill to fail, so his survey asks, "Should we raise taxes to fund more and bigger government bureaucracy by passing the school funding bill?" Not surprisingly, people do not want higher taxes and more bureaucracy, so they mostly say no, and politician B claims that the vast majority oppose the bill.

This may seem like an exaggerated example, but it is not. This actually happens! Almost every political survey is deliberately designed to produce a specific response. The questions are usually phrased to make the desired response sound good, while making the undesired response sound very bad. By doing so, the questions bias the subject's opinion about the topic of the survey. Not surprisingly, whoever paid for the survey usually gets the response they want.

Politicians are not the only ones who do this. Advertising surveys are carefully designed to make the company product look good while making the competition look bad. Even if you are trying very hard to be fair, it is actually quite difficult to phrase the question in a way that does not influence anyone's response. There are other ways surveys can go wrong, too; designing an accurate survey is a very difficult task, requiring much expertise. There are some organizations that do it well; for example, the Gallup organization specializes in conducting fair, scientifically reliable surveys. Still, it is an unfortunate fact that most surveys just cannot be trusted (especially political and advertising surveys).

What are you trying to prove?

It happens regularly that a government agency or private commission launches a major study of an important social issue. Too often they begin by announcing that they are going to prove some theory, which has important consequences for social policy. You can bet big money that they will find proof.
After all, they have already made up their minds! Any study which begins by assuming the correct answer, then looks for proof, will fail to give serious consideration to the possibility that the assumed "correct answer" is not correct. Any scientist who has already decided before the experiment that one result is "right" and another is "wrong" is no scientist at all.

It is very hard to avoid all bias when taking data. That is why we work very hard to make our experiments double blind: we arrange that neither the scientists taking data, nor their subjects, know how the data will affect the outcome. For example, suppose we want to study the effectiveness of a new headache pill. We give half our subjects the new medication, while the other half are given an inert sugar pill. We have to be sure that the subjects do not know which one they are getting. We also have to be sure that the scientists taking the data do not know (at least until all the data are in). Otherwise, there is far too much temptation to "nudge" the data the way we want them to go.

Accidents happen

We have said that the standard of "unlikeliness" in statistics is 0.05, or 5%, or a 5% false-alarm probability. This means that if we do a scientific experiment, and get a result that's only 5% likely to happen by accident, we have evidence that it is not an accident. We can write our results in a scientific paper, and every statistician will agree that our evidence is significant.

So we have evidence, but we do not yet have proof. After all, there is a 5% chance that it did happen by accident. Accidents do happen! In fact, an accident that is only 5% likely will happen about 5% of the time. After all, with a 5% false-alarm probability, we will get some false alarms.

Suppose a university employs 100 scientists, and each one does a different scientific experiment. From probability theory, we expect 5% of them to get a result that's only 5% likely, by accident!
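As a rough check on that expectation, here is a minimal Python sketch of the arithmetic. It rests on two assumptions that are ours rather than the text's: the 100 experiments are independent, and none of them is studying a real effect, so every "significant" result is a false alarm. The function name false_alarm_distribution is hypothetical, chosen only for this illustration.

```python
from math import comb


def false_alarm_distribution(n_experiments: int, alpha: float) -> list[float]:
    """Return P(exactly k false alarms) for k = 0..n_experiments.

    Assumes independent experiments with no real effect, each with
    false-alarm probability alpha (a binomial model).
    """
    return [
        comb(n_experiments, k) * alpha**k * (1 - alpha) ** (n_experiments - k)
        for k in range(n_experiments + 1)
    ]


n, alpha = 100, 0.05  # 100 scientists, 5% false-alarm probability
dist = false_alarm_distribution(n, alpha)

# Average number of false alarms across many such universities.
expected = sum(k * p for k, p in enumerate(dist))

# Chance that at least one of the 100 scientists gets a false alarm.
p_at_least_one = 1 - dist[0]

print(f"expected false alarms: {expected:.1f}")         # 5.0
print(f"chance of at least one: {p_at_least_one:.3f}")  # 0.994
```

Under these assumptions the average is 5 false alarms, and it is almost certain (about 99.4%) that at least one scientist will have something "significant" to report.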
So just by accident, about 5 of the 100 scientists will get evidence that they can call "statistically significant" and publish in a scientific paper. And they do have evidence, strong enough that their claim deserves further study. But they do not have proof.

That is one of the reasons scientific experiments have to be repeated. If you get a "significant result" once, you have evidence. If two people get the same result, there is very strong evidence. If a dozen people do the same experiment, and they all get a significant result, then we can start to believe it.

Every year, scientists do hundreds of thousands of experiments. If they use a 5% false-alarm probability (and most of them do), we can expect 5% of the results to be false alarms. Five percent of 100,000 experiments is 5,000 false alarms! That means 5,000 results that seem to be significant, but really happened only by accident. Some of them will be published in important scientific journals. And they should be published: they are all possibilities, and deserve further study. But for most of them, we should not be convinced until the results are repeated.

Conclusion

We have seen that if you want to deceive people, statistics makes it easy. In fact, even if you want to be honest, there are so many things that can go wrong in an experiment or a survey that we must carefully guard against bias. Even if we succeed, and get an unbiased result which is "statistically significant," it still might have happened just by accident. So the experiment has to be repeated, many times, and each time requires the same care in guarding against any bias which could affect the results.

That is a lot of work! Still, the payoff makes it well worth it. Not doing so gives us half-baked theories which sound good but really are not, supported by biased data and invalid statistics. This is worse than ignorance!
But if we invest the effort to do science well, we reap the reward of knowledge that we can trust, and often can put to very good use.