Probability – biopm, llc – Improving Knowledge Worker Productivity
https://biopmllc.com

On Statistics as a Method of Problem Solving
https://biopmllc.com/strategy/on-statistics-as-a-method-of-problem-solving/
Sun, 01 Nov 2020 03:55:59 +0000

If you have taken a class in statistics, whether in college or as part of professional training, how much has it helped you solve problems?

Based on my observation, the answer is mostly not much. 

The primary reason is that most people are never taught statistics properly.   Terms like null hypothesis and p-value just don’t make intuitive sense, and statistical concepts are rarely presented in the context of scientific problem solving. 

In the era of Big Data, machine learning, and artificial intelligence, one would expect improved statistical thinking and skills in science and industry.  However, the teaching and practice of statistical theory and methods remain poor – probably no better than when W. E. Deming wrote his 1975 article “On Probability As a Basis For Action.” 

I have witnessed many incorrect practices in the teaching and application of statistical concepts and tools.  Some mistakes are made unknowingly by users inadequately trained in statistical methods: for example, failing to check the assumptions of a method or ignoring the impact of sample size (or statistical power).  This lack of technical knowledge can be remedied by continued study of the theory.

The bigger problem I see is that statistical tools are used for the wrong purpose or applied to the wrong question by people who are supposed to know what they are doing — the professionals.  To less sophisticated viewers, the statistical procedures used by those professionals look proper or even impressive.  And when the method, logic, or conclusion doesn’t make sense, most viewers assume it must be due to their own lack of understanding.

An example of using statistics for the wrong purpose is p-hacking – the common practice of manipulating an experiment or analysis until the p-value reaches the desired level and thereby supports the desired conclusion.
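To make the mechanics concrete, here is a minimal simulation sketch (plain stdlib Python; the batch sizes, alpha level, and trial count are illustrative assumptions, not from the post) of one common form of p-hacking: peeking at the data after each batch and stopping as soon as the test rejects.  Even when the null hypothesis is exactly true, this inflates the chance of a “significant” result well above the nominal 5%.

```python
import math
import random

def p_value(xs):
    """Two-sided z-test p-value for H0: mean = 0, with known sigma = 1."""
    n = len(xs)
    z = (sum(xs) / n) * math.sqrt(n)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def peek_until_significant(looks=(20, 40, 60, 80, 100), alpha=0.05):
    """Run one experiment under a true null, testing after each batch of data.
    Returns True if ANY interim test rejects -- the p-hacked outcome."""
    xs = []
    for n in looks:
        xs.extend(random.gauss(0, 1) for _ in range(n - len(xs)))
        if p_value(xs) < alpha:
            return True
    return False

random.seed(1)
trials = 2000
rate = sum(peek_until_significant() for _ in range(trials)) / trials
print(f"Rejection rate under the null with peeking: {rate:.3f} (nominal 0.05)")
```

With these five interim looks, the simulated rejection rate lands roughly in the low teens of percent, well above the nominal 5% that a single, pre-planned test would give.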

Not all bad practices are as easily detectable as p-hacking.  Many of them apply statistical concepts and tools to the wrong question.  One category of such examples is failing to differentiate enumerative and analytic problems, a concept that Deming wrote about extensively in his work, including the article mentioned above.  I also touched on this in my blog Understanding Process Capability.

In my opinion, the underlying issue with using statistics to answer the wrong questions is the gap between subject matter experts, who try to solve problems but lack an adequate understanding of probability theory, and statisticians, who understand the theory but lack experience solving real-world scientific or business problems.

Here is an example.  A well-known statistical software company provides a “decision making with data” training.  Their example of using a hypothesis test is to evaluate whether a process is on target after some improvement.  They set the null hypothesis to be that the process mean equals the desired target.

The instructors explain that “the null hypothesis is the default decision” and “the null is true unless our data tell us otherwise.”  Why would anyone collect data and perform statistical analysis if they already believe that the process is on target?  If you are statistically savvy, you will also recognize that you can reject virtually any point null hypothesis by collecting a large enough sample, because no real process sits exactly on target.  In this case, you will eventually conclude that the process is not on target.
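The large-sample effect is easy to verify.  This sketch (plain Python; the 0.1-sigma offset and sample sizes are illustrative assumptions) computes the two-sided z-test p-value for a process whose true mean is off target by a practically negligible amount: with enough data, the test rejects no matter how small the deviation.

```python
import math

def two_sided_p(delta, sigma, n):
    """p-value of a z-test of H0: mean = target, when the true mean is
    off target by delta and the standard deviation sigma is known."""
    z = abs(delta) * math.sqrt(n) / sigma
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

# The same trivial 0.1-sigma shift at increasing sample sizes:
for n in (10, 100, 1000, 10000):
    print(f"n = {n:5d}: p = {two_sided_p(delta=0.1, sigma=1.0, n=n):.4f}")
```

The identical, practically irrelevant shift goes from nowhere near significant at n = 10 to overwhelmingly “significant” at n = 10,000.  The p-value measures sample size as much as it measures the process.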

The instructors further explain “It might seem counterintuitive, but you conduct this analysis to test that the process is not on target. That is, you are testing that the changes are not sufficient to bring the process to target.” It is counterintuitive because the decision maker’s natural question after the improvement is “does the process hit the target” not “does the process not hit the target?”

I suppose the reason for choosing such a counterintuitive null hypothesis is convenience: setting the process mean to a known value makes it straightforward to calculate the probability of observing the collected data (i.e. the sample) from this hypothetical process.

What’s really needed in this problem is not statistical methods, but scientific methods of knowledge acquisition. We have to help decision makers understand the right questions. 

The right question in this example is not “does the process hit the target?” which is another example of process improvement goal setting based on desirability, not a specific opportunity. [See my blog Achieving Improvement for more discussion.]  

The right question should be “do the observations fall where we expect them to be, based on our knowledge of the change made?”  This “where” is the range of values estimated based on our understanding of the change BEFORE we collect the data, which is part of the Plan of the Plan-Do-Study-Act or Plan-Do-Check-Act (PDSA or PDCA) cycle of scientific knowledge acquisition and continuous improvement.   

If we cannot estimate this range with its associated probability density, then we don’t know enough about our change and its impact on the process.  In other words, we are just messing around without using a scientific method.  No application of statistical tools can help – they are just window dressing.

With the right question asked, a hypothesis test is unnecessary, and there is no false hope that the process will hit the desired target.  We will improve our knowledge based on how well the observations match our expected or predicted range (i.e. Study/Check).   We will continue to improve based on specific opportunities generated with our new knowledge.
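As an entirely hypothetical sketch of this Study/Check step in code (the yield numbers and predicted range below are invented for illustration): the predicted range is written down before the data are collected, and the observations are then compared against it.

```python
import statistics

# Plan (hypothetical): based on our understanding of the change, we predict
# the new process mean will fall between 95.5 and 96.5 (e.g., percent yield).
predicted_low, predicted_high = 95.5, 96.5

# Do: collect observations from the changed process (hypothetical sample).
observed = [96.1, 95.8, 96.3, 96.0, 95.9, 96.2, 96.4, 95.7]

# Study: do the observations fall where we expected them to be?
m = statistics.mean(observed)
if predicted_low <= m <= predicted_high:
    print(f"Observed mean {m:.2f} is inside the predicted range: "
          f"our theory of the change is supported.")
else:
    print(f"Observed mean {m:.2f} is outside the predicted range: "
          f"revise our theory of the change.")
```

Either outcome teaches us something; the point is that the prediction exists before the data do, so the comparison builds knowledge rather than chasing a desired target.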

What is your experience in scientific problem solving?

Setting SMART Goals
https://biopmllc.com/strategy/setting-smart-goals/
Fri, 31 May 2019 01:24:02 +0000

Recently I had conversations with several people on different occasions about effective goal setting.  It is a common practice to use Specific, Measurable, Achievable, Relevant, and Time-bound (SMART) as criteria to create goals.  However, using SMART goals for effective management or decision making is not as simple as it appears.

For example, “improve product ABC yield to 96% or more by September 30” can be a SMART goal.  In a non-manufacturing environment, a similar goal can be “reduce invoices with errors to 4% or less by September 30.”

Suppose it is now September 30, and we have only 4% of the products or invoices classified as bad.  Did we achieve our goal?

Most people would say “Of course, we did.”  But the real answer is “We don’t know without additional information or assumptions.” 

Why?

The reason is that the 4% is calculated from a sample, or limited observations from the system or process we are evaluating.  The true process capability may be higher or lower than 4%.   

We can use a statistical approach to illustrate the phenomenon.  Since the outcome of each item is binary (good/bad or with/without errors), we can model the process as a binomial distribution.   Figure 1 shows the probability of observing 0 to 15 bad items if we examine a sample of 100 items, assuming that any item from the process has a 4% probability of being bad.

Figure 1: Binomial Distribution (n=100, p=0.04)

When the true probability is 4%, we expect to see 4 bad items per 100, on average.  However, each sample of 100 items is different due to randomness, and we can get any number of bad items, 0, 1, 2, etc.  If we add the probability values of the five leftmost bars (corresponding to 0, 1, 2, 3, and 4 bad items), the sum is close to 0.63.  This means that there is only a 63% chance of seeing 4 or fewer bad items in a sample of 100, when we know the process should produce only 4% bad items.  

More than 37% of the time, we will see 5 or more bad items in a sample of 100.  In fact, there is a greater than 10% chance of seeing seven or more bad items, and close to a 5% chance of seeing eight or more, twice as many as expected!

In contrast, a worse-performing process with a true probability of 5% (Figure 2) has a 44% chance of producing 4 or fewer bad items.  This means that we will see it achieving the goal almost half the time.  

Figure 2: Binomial Distribution (n=100, p=0.05)
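Both probabilities are easy to reproduce with the binomial formula.  This short stdlib-Python sketch recomputes the numbers behind Figures 1 and 2.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Chance of 4 or fewer bad items in a sample of 100:
print(f"true p = 0.04: {binom_cdf(4, 100, 0.04):.3f}")  # about 0.63
print(f"true p = 0.05: {binom_cdf(4, 100, 0.05):.3f}")  # about 0.44
```

The on-target 4% process meets the goal only about 63% of the time, while the worse 5% process still “meets” it about 44% of the time, which is exactly the overlap that makes a single-sample pass/fail judgment unreliable.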

Suppose the first process represents your capability and the second that of your colleague.  How do you feel about using the SMART goal above as one criterion for raises or promotions?

The point I am making is not to abandon the SMART goals but to use them judiciously.  In many cases, it calls for statistical thinking – understanding variation in data.  Just because we can measure or quantify something doesn’t mean we are interpreting the data properly to make the right decision.  

It takes “some rudimentary knowledge of science”1 to be smart.


1. Deming, W. Edwards. Out of the Crisis: Quality, Productivity, and Competitive Position. Cambridge, Mass.: Massachusetts Institute of Technology, Center for Advanced Engineering Study, 1986.
