If you have taken a class in statistics, whether in college or as a part of professional training, how much has it helped you solve problems?
Based on my observation, the answer is usually "not much."
The primary reason is that most people are never taught statistics properly. Terms like null hypothesis and p-value just don’t make intuitive sense, and statistical concepts are rarely presented in the context of scientific problem solving.
In the era of Big Data, machine learning, and artificial intelligence, one would expect improved statistical thinking and skills in science and industry. However, the teaching and practice of statistical theory and methods remain poor – probably no better than when W. E. Deming wrote his 1975 article “On Probability As a Basis For Action.”
I have witnessed many incorrect practices in the teaching and application of statistical concepts and tools. Some mistakes are made unknowingly by users who are inadequately trained in statistical methods, for example failing to verify a method's assumptions or ignoring the impact of sample size (i.e., statistical power). This lack of technical knowledge can be remedied by continued study of the theory.
The bigger problem I see is that statistical tools are used for the wrong purpose or the wrong question by people who are supposed to know what they are doing: the professionals. To less sophisticated observers, the statistical procedures those professionals use look proper, even impressive. Most observers assume that if the method, logic, or conclusion doesn't make sense, it must be due to their own lack of understanding.
An example of using statistics for the wrong purpose is p-hacking: the common practice of manipulating the experiment or the analysis until the p-value reaches the desired value and therefore supports the desired conclusion.
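To see why p-hacking works, here is a minimal simulation sketch in Python (using numpy and scipy; the scenario and numbers are invented for illustration). When there is no real effect at all, an analyst who tests ten outcomes and reports only the best-looking p-value will still "find" a significant result far more often than the nominal 5%.

```python
# A sketch of one common form of p-hacking: there is NO real effect in any
# of these simulated studies, but the analyst measures ten outcomes and
# reports whichever one gives the smallest p-value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_studies = 2_000      # simulated studies, all with no true effect
n_outcomes = 10        # outcomes the analyst is willing to test
n_per_group = 30       # observations per group per outcome

false_positives = 0
for _ in range(n_studies):
    p_values = []
    for _ in range(n_outcomes):
        control = rng.normal(0.0, 1.0, n_per_group)
        treated = rng.normal(0.0, 1.0, n_per_group)   # same distribution
        p_values.append(stats.ttest_ind(control, treated).pvalue)
    if min(p_values) < 0.05:        # cherry-pick the "significant" result
        false_positives += 1

print("nominal false-positive rate: 5.0%")
print(f"rate after cherry-picking:   {false_positives / n_studies:.1%}")
# Expect roughly 40%, not 5%: the "finding" is manufactured by the procedure.
```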
Not all bad practices are as easily detectable as p-hacking. Often, statistical concepts and tools are applied to the wrong question altogether. One category of such examples is failing to differentiate enumerative and analytic problems, a distinction Deming wrote about extensively in his work, including the article mentioned above. I also touched on this in my blog post Understanding Process Capability.
In my opinion, the underlying issue behind using statistics to answer the wrong questions is the gap between subject matter experts, who try to solve problems but lack an adequate understanding of probability theory, and statisticians, who understand the theory but lack experience solving real-world scientific or business problems.
Here is an example. A well-known statistical software company provides a "decision making with data" training course. Their example of using a hypothesis test is to evaluate whether a process is on target after some improvement. They set the null hypothesis to be that the process mean equals the desired target.
The instructors explain that "the null hypothesis is the default decision" and "the null is true unless our data tell us otherwise." Why would anyone collect data and perform statistical analysis if they already believe that the process is on target? If you are statistically savvy, you will also recognize that you can reject virtually any point null hypothesis by collecting a large enough sample, because no real process mean is ever exactly equal to the target. In this case, you will eventually conclude that the process is not on target.
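To illustrate, here is a small sketch (Python with numpy and scipy; the target, the 0.05-unit deviation, and the standard deviation are numbers I made up). The process mean is off target by an amount nobody would care about, yet once the sample grows large enough the test rejects the null and declares the process "not on target."

```python
# A sketch of why a large enough sample rejects any point null hypothesis:
# the process mean is off target by a practically meaningless 0.05 units,
# yet the one-sample t-test eventually declares the process "not on target."
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
target = 100.0
true_mean, sigma = 100.05, 2.0      # made-up numbers: trivially off target

for n in (25, 100, 1_000, 10_000, 100_000):
    sample = rng.normal(true_mean, sigma, n)
    p_value = stats.ttest_1samp(sample, popmean=target).pvalue
    print(f"n = {n:>7}:  p-value = {p_value:.4f}")

# Small samples "pass" the test; very large samples reject the null,
# even though the 0.05-unit deviation has no practical importance.
```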
The instructors further explain, "It might seem counterintuitive, but you conduct this analysis to test that the process is not on target. That is, you are testing that the changes are not sufficient to bring the process to target." It is counterintuitive because the decision maker's natural question after the improvement is "does the process hit the target?" not "does the process not hit the target?"
The reason, I suppose, for choosing such a counterintuitive null hypothesis is convenience: by setting the process mean to a known value, it is straightforward to calculate the probability of observing the collected data (i.e., the sample) under this hypothetical process.
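A short sketch of those mechanics (again with made-up measurements) shows the convenience: once the null mean is fixed at the target, the test statistic and p-value fall straight out of the textbook one-sample t-test formulas.

```python
# Why fixing the null mean at a known value (the target) is convenient:
# with mu_0 known, the probability of observing a sample mean at least this
# far from the target follows directly from the one-sample t-test formulas.
import numpy as np
from scipy import stats

target = 100.0                                    # mu_0 under the null
sample = np.array([100.8, 99.6, 101.2, 100.4,
                   99.9, 100.7, 101.1, 100.3])    # made-up measurements

n = len(sample)
xbar = sample.mean()
s = sample.std(ddof=1)

t_stat = (xbar - target) / (s / np.sqrt(n))       # distance from target in SE units
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)   # two-sided tail probability
print(f"by hand: t = {t_stat:.3f}, p = {p_value:.3f}")

# scipy computes the same thing in one call
print("scipy:  ", stats.ttest_1samp(sample, popmean=target))
```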
What is really needed in this problem is not a statistical method but a scientific method of knowledge acquisition. We have to help decision makers understand and ask the right questions.
The right question in this example is not "does the process hit the target?" That is yet another example of setting a process improvement goal based on desirability rather than on a specific opportunity. [See my blog post Achieving Improvement for more discussion.]
The right question should be "do the observations fall where we expect them to be, based on our knowledge of the change made?" This "where" is the range of values estimated from our understanding of the change BEFORE we collect the data. Making that estimate is part of the Plan step of the Plan-Do-Study-Act or Plan-Do-Check-Act (PDSA or PDCA) cycle of scientific knowledge acquisition and continuous improvement.
If we cannot estimate this range and its associated probability density, then we do not know enough about our change and its impact on the process. In other words, we are just messing around without using a scientific method. No application of statistical tools can help; they are just window dressing.
With the right question asked, a hypothesis test is unnecessary, and there is no false hope that the process will hit the desired target. We will improve our knowledge based on how well the observations match our expected or predicted range (i.e. Study/Check). We will continue to improve based on specific opportunities generated with our new knowledge.
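For contrast, here is a rough sketch of what that Study/Check comparison could look like in code (Python; the predicted range and the observations are entirely invented). The point is only that the prediction exists before the data do.

```python
# A rough sketch of the Plan/Do/Study steps described above.  All numbers
# are invented; the essential feature is that the prediction is written
# down BEFORE the data are collected.
import numpy as np

# Plan: from our understanding of the change, we predict the adjusted process
# will run near 102 with a standard deviation of about 1.5, so we expect
# roughly 95% of observations to fall between 99 and 105.
predicted_center = 102.0
predicted_low, predicted_high = 99.0, 105.0

# Do: run the changed process and record the observations (made-up data).
observations = np.array([101.8, 103.2, 100.6, 104.1, 102.9,
                         101.3, 105.8, 102.2, 103.7, 100.9])

# Study/Check: how well do the observations match the prediction?
inside = (observations >= predicted_low) & (observations <= predicted_high)
print(f"{inside.mean():.0%} of observations fell inside the predicted range")
print(f"observed mean {observations.mean():.2f} vs predicted center {predicted_center}")

# Act: a good match strengthens our theory of the change; a poor match tells
# us exactly where our knowledge of the process is lacking.
```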
What is your experience in scientific problem solving?