May 16, 2015

“Look at the statistics of the situation through an objective lens; use the data ethically, honestly, and scrupulously.” – Me

I love statistics. I’ve loved it ever since Ms. Williams taught it to me in 12th grade. I have taken graduate-level statistics courses, and have also taught AP Statistics. Many people ask me what statistics is all about. It’s really about looking at data, analyzing it, and making a good decision based on sound judgment and knowledge of statistical principles. But statistics is lost on most people not just because it is math; there is also so much misinformation about it. When I first started teaching AP Statistics there were questions in the text that stumped me; even with the answer book, I felt I needed more information. I took to the web to find better explanations than the textbook gave, and I was stunned to find so many conflicting interpretations of the same problems. It wasn’t that they were all completely wrong; it’s just that many of the answers were based in part on missing, outdated, or incorrect information. It seems as if statistics is evolving nearly at the pace of technology, and not everyone can keep up with it.

Whether you dabble in statistics, go into it balls-to-the-wall, are aware of its basics, or want nothing to do with it, there are people in key positions who need to be able to apply statistics to their decisions; one such person is NFL Commissioner Roger Goodell. Recently, Goodell handed down a four-game suspension to Patriots quarterback Tom Brady, and penalized the Patriots organization with monetary fines and the loss of future draft picks. Those of you who know me well know I’m still too butt-hurt from the Tuck Rule game to like the Patriots, but the decision made by Goodell defies even the loosest interpretation of statistical jurisprudence. I think the four-game suspension of Brady was inappropriate because, even granting the finding that it is “more likely than not” that Brady was “at least generally aware” of the low air pressure in the balls he used, that standard leaves far too high a chance that the conclusion is simply wrong.

Let me try to explain statistical inference as basically as I can for those of you who are unfamiliar with it. In statistics, you gather a random sample through a well-designed sampling technique so that you can learn about a population. Two popular ways to report on these findings are confidence intervals and hypothesis tests, two types of “statistical inference.” When doing a hypothesis test, you have a hypothesized mean (for instance, the average IQ of the population is 100) and perhaps you have sample data showing a mean of 103. Is your sample mean of 103 sufficient evidence to conclude that the average IQ is actually more than 100? Is that a statistically significant difference, or is it just chance variation? To answer that question, you obtain a test statistic, which you may convert into a “p-value,” which means “probability value.” Let’s say you get a p-value of .12. What this means is that if the hypothesized mean is true, if the true mean actually is 100, then there is a 12% chance that a properly chosen sample would have a mean of 103 or higher. It does not mean there is an 88% chance that the difference is real; the p-value only tells you how surprising your sample would be if nothing were going on. For most statisticians, a result that turns up 12% of the time by chance alone is not surprising enough to say that there is a statistically significant difference in the IQ of the population. In fact, most statisticians require less than 5%. The general standard is a level of significance of .05, meaning we only declare a real difference when a sample as extreme as ours would occur less than 5% of the time if the hypothesized mean were true. I have seen levels of significance of .10 used, but you’d have to be crazy to use a higher number than that.
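For anyone who wants to see where a p-value like .12 could come from, here is a minimal sketch in Python. The IQ example above gives only the hypothesized mean (100) and the sample mean (103), so the population standard deviation of 15 and the sample size of 35 are my own assumptions, picked because they happen to land near .12; the function names are hypothetical too. It computes a one-sided p-value with a simple z-test and then checks it by brute force, drawing many samples from a population whose true mean really is 100.

```python
import math
import random


def one_sided_p_value(sample_mean, null_mean, sd, n):
    """Chance of a sample mean at least this large if the null mean is true (z-test)."""
    z = (sample_mean - null_mean) / (sd / math.sqrt(n))
    # Upper-tail area of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))


def simulated_p_value(sample_mean, null_mean, sd, n, trials=20000, seed=1):
    """Estimate the same chance by repeatedly sampling from the null population."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        mean = sum(rng.gauss(null_mean, sd) for _ in range(n)) / n
        if mean >= sample_mean:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Assumed values: sd=15 and n=35 are illustrative, not from the example above.
    print("z-test p-value:   ", round(one_sided_p_value(103, 100, 15, 35), 3))
    print("simulated p-value:", round(simulated_p_value(103, 100, 15, 35), 3))
```

Both numbers should come out at roughly .12. Every simulated sample is drawn from a population whose mean truly is 100, so each one that reaches 103 got there by chance alone.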

That brings me to the Deflategate scandal. What does it mean for it to be “more likely than not” that Brady was at least generally aware? 51% is more likely than not. While it would be a clumsy quantification at best, I think most of us would be comfortable estimating that it’s even 60%-40% or 75%-25% that Brady knew about the balls. But to a statistician, a 25% chance that he didn’t know about it is too high to make a comfortable accusation and decide that the evidence actually shows he is guilty of the actions of which he is accused. If you want to see how often something with a 25% probability happens, flip a coin twice. Pick heads or tails; the probability that the coin lands on that side both times is 25%. Do that two-flip trial 20 times. How many times did it land on the chosen side both times? Over many repeated sets of 20 trials, it should happen 5 times out of 20 on average. (I have done this and gotten as few as 0 and as many as 12, however.) That is how often you would make a mistake if you convicted the accused every time you were 75% sure he or she was guilty. If the investigation could show that without a doubt Brady knew about the balls, or that there is strong evidence that Brady was complicit in the ball deflation, then he deserved what he got. But “more likely than not” doesn’t do it for me, and shouldn’t do it for anyone else. Should there be a punishment? Sure, there is sufficient evidence of general malfeasance that some punishment is warranted. You dock the team one or more high draft picks, make them donate a large sum of money to some charity, and move on. You don’t make a witch hunt out of it on the lackluster evidence you’ve got regarding Brady’s personal involvement, when there is still a fairly decent chance that he may not have been “at least generally aware” of the ball pressure. That evidence is not even close to sufficient for an individual punishment.
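If you would rather not flip a coin forty times, the coin experiment described above can be sketched in a few lines of Python. This is only a rough illustration: the function names, the seed, and the choice of 10,000 repeated experiments are my own assumptions. Each trial flips a fair coin twice and counts as a hit only if both flips land on the chosen side; each experiment runs 20 such trials and records the number of hits.

```python
import random


def count_double_hits(rng, trials=20):
    """Run `trials` two-flip trials; count those where both flips land on the chosen side."""
    hits = 0
    for _ in range(trials):
        first = rng.random() < 0.5   # True means the chosen side came up
        second = rng.random() < 0.5
        if first and second:
            hits += 1
    return hits


def run_experiments(n_experiments=10000, trials=20, seed=2015):
    """Repeat the 20-trial experiment many times and return the hit count of each."""
    rng = random.Random(seed)
    return [count_double_hits(rng, trials) for _ in range(n_experiments)]


if __name__ == "__main__":
    counts = run_experiments()
    print("average hits per 20 trials:", sum(counts) / len(counts))
    print("fewest hits:", min(counts), "most hits:", max(counts))
```

The average should settle close to 5 hits per 20 trials, while individual experiments wander well above and below that, which is the same thing I saw doing it by hand.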

There are many people on this earth who can, for the most part, do their jobs and live their lives without completely sucking ass even if they are not aware of the principles of statistics. Roger Goodell is not one of them. Either learn this prick some statistics, or remove him from office, please.