Decision Making – biopm, llc
Improving Knowledge Worker Productivity

The Missing Information in Business Metrics
Mon, 01 Mar 2021

Modern businesses generate and consume increasingly large amounts of data. Information is needed to support operational and strategic decisions. Despite the advent of Big Data tools and technology, most organizations I have worked with aren’t able to take advantage of the data or tools in their daily work. While greater awareness of human visual perception and cognition has improved dashboard designs, effective decision-making is often limited by the type of information monitored.

It is common to see summary statistics (such as sum, average, median, and standard deviation) being used in reports and dashboards. In addition, various metrics are used as Key Performance Indicators (KPIs). For example, in manufacturing, management often uses Overall Equipment Effectiveness (OEE) to gauge efficiency. In quality, process capability indices (e.g. Cpk) are used to evaluate the process’s ability to meet customer requirements. In marketing, the Net Promoter Score (NPS) helps assess customer satisfaction.

All of these are statistics, which are simply functions of data. But what does each of them tell us? What do we want to know from the data? What specific information is needed for the decision?

Unfortunately, these basic questions are not understood by most people who use performance metrics or statistics.  I discussed some specific mistakes in using process capability indices last July.  A more general problem is that statistics can hide the information we need to know.

For example, last year I was coaching a Six Sigma Green Belt (GB) working in Quality.  A manufacturing process had a worsening Cpk.  The project was to increase the Cpk to meet the customer’s demanding requirement. Each time we met, the GB would show me how the Cpk had changed.  But Cpk is a function of both the process center (average) and the process variation (standard deviation), which comes from a number of sources (shifts, parts, measurements, etc.).  The root causes of the Cpk change were not uncovered until we looked deeper into the respective changes in the average and in the different contributors to the standard deviation.  
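
To make this concrete, here is a minimal sketch using the standard textbook definition, Cpk = min(USL - mean, mean - LSL) / (3 * std). The specification limits and process values are made up for illustration; they are not from the project described above. The sketch shows that a shift in the process center and an increase in variation can produce the same Cpk, so the index alone cannot tell you which one happened.

```python
def cpk(mean, std, lsl, usl):
    """Textbook Cpk: distance from the mean to the nearer spec limit, in units of 3 sigma."""
    return min(usl - mean, mean - lsl) / (3 * std)


# Hypothetical spec limits and baseline process (illustrative numbers only).
LSL, USL = 90.0, 110.0
baseline = cpk(mean=100.0, std=2.0, lsl=LSL, usl=USL)

# Scenario A: the process center drifts toward the upper limit; variation is unchanged.
shifted = cpk(mean=102.0, std=2.0, lsl=LSL, usl=USL)

# Scenario B: the center stays on target; variation grows.
wider = cpk(mean=100.0, std=2.5, lsl=LSL, usl=USL)

print(f"baseline:       {baseline:.2f}")
print(f"mean shift:     {shifted:.2f}")
print(f"more variation: {wider:.2f}")
```

Both scenarios lower the Cpk by the same amount, yet they call for different corrective actions: one is a centering problem, the other a variation problem.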

The key takeaway is that when multiple contributors influence a metric, we cannot just monitor the change in the metric alone.  We must go deeper and seek other information needed for our decisions.

Many people may recall from statistics training that teachers always tell them to “plot the data!” It is important to visualize the original data instead of relying on statistics alone because statistics don’t tell you the whole story. The famous example that illustrates this point is Anscombe’s quartet, which includes four sets of data (x, y) with nearly identical descriptive statistics (mean, variance, and correlation) and even the same linear regression fit and R². However, when visualized in a scatter plot, they look drastically different. If we looked at only one or a few statistics, we would miss the differences. Again, statistics can hide useful information we need.
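
The point can be checked directly. The sketch below uses the published Anscombe (1973) data and computes the summary statistics by hand with the standard library only; the function and variable names are my own choices for illustration.

```python
from math import sqrt

# Anscombe's quartet. Sets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Means, sample variances, Pearson correlation, and least-squares line for (x, y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mx,
        "mean_y": my,
        "var_x": sxx / (n - 1),
        "var_y": syy / (n - 1),
        "r": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": my - slope * mx,
    }


for name, (x, y) in quartet.items():
    stats = summarize(x, y)
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

All four sets report a mean of x of 9, a mean of y of about 7.5, a correlation of about 0.82, and a fitted line of roughly y = 3 + 0.5x. Only a scatter plot reveals that one set is curved, one has a single outlier off an otherwise perfect line, and one has all but one point at the same x value.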

Nowadays, there is too much data to digest, and modern tools can conveniently summarize and display them. When we use data to inform our business decisions, it’s easy to fall into the practice of looking only at the attractive summary in a report or on a dashboard.  The challenge of using data for decision making is to know what we want and where to get it.

Guess who wrote the following about monitoring information for decisions?

With the coming of the computer this feedback element will become even more important, for the decision maker will in all likelihood be even further removed from the scene of action. Unless he or she accepts, as a matter of course, that he or she had better go out and look at the scene of action, he or she will be increasingly divorced from reality.

Peter Drucker in 1967.  He further wrote:

All a computer can handle is abstractions. And abstractions can be relied on only if they are constantly checked against concrete results.  Otherwise, they are certain to mislead.

Metrics and statistics are abstractions of reality, not reality itself. We must know how to choose and interpret these abstractions and how to complement them with other types of information. [1]

[1] For more discussion on “go out and look” (aka Go Gemba), see my blog Creating Better Strategies.

The Practical Value of a Statistical Method
Tue, 01 Dec 2020

Shortly after I wrote my last blog, “On Statistics as a Method of Problem Solving,” I received the latest issue of Quality Progress, the official publication of the American Society for Quality. A Statistics article, “Making the Cut – Critical values for Pareto comparisons remove statistical subjectivity,” caught my attention because Pareto analysis is one of my favorite tools in continuous improvement.

It was written by two professors “with more than 70 years of combined experience in the quality arena and the use of Pareto charts in various disciplines” and covers a brief history of Pareto analysis and its use in quality to differentiate the vital few causes from the trivial many.

The authors introduced a statistical method to address the issue of “practitioners who collect data, construct a Pareto chart and subjectively identify the vital few categories on which to focus.”  The main point is that two adjacent categories sorted by occurrence in a descending order may not be statistically different in terms of their underlying frequency (e.g. rate of failure) due to sampling error.  

Based on hypothesis testing, the method includes two simple tools:

  1. Critical values below which the lower occurrence category is deemed significantly different from the higher one
  2. A p-value for each pair of occurrence observations of the adjacent categories to measure the significance in the difference

With a real data set (published by different authors) as an example, they showed that only some adjacent categories are significantly different and therefore, are candidates for making the cut.
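
To show what such a comparison looks like, here is a minimal sketch of one common way to compare two counts: an exact conditional binomial test. This is an assumption made for illustration and is not necessarily the procedure or the critical values published in the article. The counts are made up.

```python
from math import comb


def adjacent_p_value(higher, lower):
    """Two-sided exact p-value for H0: both categories have the same underlying rate.

    Conditional on the combined count n, each occurrence is equally likely to fall
    in either category under H0, so the lower count follows Binomial(n, 0.5).
    The test is two-sided because the ranking itself was decided by the data.
    """
    n = higher + lower
    lower_tail = sum(comb(n, k) for k in range(lower + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)


# Hypothetical adjacent categories from a sorted Pareto chart.
print(f"30 vs 28 occurrences: p = {adjacent_p_value(30, 28):.3f}")
print(f"30 vs 10 occurrences: p = {adjacent_p_value(30, 10):.4f}")
```

The first pair is indistinguishable from sampling noise; the second is clearly different. Neither result requires a test to see, which is relevant to the discussion that follows.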

I see the value in raising the awareness of statistical thinking in decision making (which is desperately needed in science and industry).  However, in practice, the method is far less useful than it appears and can lead to improper applications of statistical methods.

Here are but a few reasons.

  • The purpose of Pareto charts is exploratory analysis, not binary decision-making, i.e. deciding which categories make the cut as the vital few. As a data visualization tool, a Pareto chart shows, overall, whether there is a Pareto effect. An obvious 80/20 distribution in the data not only indicates an opportunity to apply the Pareto principle but also gives insight into the nature of the underlying cause system.
  • Using the hypothesis test to answer an unnecessary question is waste.  Overall, if the Pareto effect is strong, the decision is obvious, and the hypothesis test to distinguish between categories is not needed.  If the overall effect is not strong enough to make the obvious decision, the categorization method used is not effective in prioritization, and therefore, other approaches should be considered.  
  • Prioritization decisions depend on resources and other considerations, not category occurrence ranking alone.  This is true even if the Pareto effect is strong.  People making prioritization decisions based solely on Pareto analysis are making a management mistake that cannot be overcome by statistical methods. 
  • The result of the hypothesis test offers no incremental value – it does not change the decisions made without such tests.  For example, if the fourth ranking category is found not statistically different from the third and there are only enough resources to work on three categories, what should the decision be? How would the hypothesis test improve our decision? Equally unhelpful, a test result of significant difference merely confirms our decision. 
  • The claim of “removing subjectivity” by using the hypothesis test is misleading. The decision in any hypothesis test depends on the risk tolerance of the decision maker: the alpha (significance level) used to decide whether a given p-value is significant is chosen subjectively. The choice of a categorization method also depends on subject matter expertise, another subjective factor. For example, two categories could have been defined as one. In addition, many decisions in a statistical analysis involve some degree of expert judgment and therefore introduce subjectivity. Such decisions may include whether the data is a probability sample, whether the data can be modeled as binomial, whether the process that generated the data was stable, etc.
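
The “Pareto effect” mentioned in the first bullet can be seen with a few lines of code as well as in a chart. The sketch below uses made-up defect counts; the category names and numbers are hypothetical.

```python
def pareto_table(counts):
    """Sort categories by count (descending) and attach the cumulative share of the total."""
    total = sum(counts.values())
    rows, running = [], 0
    for name, count in sorted(counts.items(), key=lambda item: item[1], reverse=True):
        running += count
        rows.append((name, count, running / total))
    return rows


# Hypothetical data: a strong Pareto effect versus a flat distribution.
strong = {"scratch": 120, "dent": 45, "misalign": 12, "stain": 9, "burr": 8, "other": 6}
flat = {"scratch": 38, "dent": 36, "misalign": 34, "stain": 33, "burr": 30, "other": 29}

for label, data in (("strong", strong), ("flat", flat)):
    print(label)
    for name, count, cumulative in pareto_table(data):
        print(f"  {name:<9}{count:>4}  {cumulative:6.1%}")
```

In the first data set the top two of six categories account for more than 80% of occurrences, and the priority is obvious. In the second, no amount of pairwise testing will produce a vital few; a different way of categorizing the problem is needed.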

Without sufficient understanding of statistical theory and practical knowledge in its applications, one can easily be overwhelmed by statistical methods presented by the “experts.”  Before considering a statistical method, ask the question “how much can it practically improve my decision?”  In addition, “One must never forget the importance of subject matter.” (Deming)

On Statistics as a Method of Problem Solving
Sun, 01 Nov 2020

If you have taken a class in statistics, whether in college or as a part of professional training, how much has it helped you solve problems?

Based on my observation, the answer is mostly not much. 

The primary reason is that most people are never taught statistics properly.   Terms like null hypothesis and p-value just don’t make intuitive sense, and statistical concepts are rarely presented in the context of scientific problem solving. 

In the era of Big Data, machine learning, and artificial intelligence, one would expect improved statistical thinking and skills in science and industry.  However, the teaching and practice of statistical theory and methods remain poor – probably no better than when W. E. Deming wrote his 1975 article “On Probability As a Basis For Action.” 

I have witnessed many incorrect practices in teaching and application of statistical concepts and tools.  There are mistakes unknowingly made by users inadequately trained in statistical methods, for example, failing to meet the assumptions of a method or not considering the impact of the sample size (or statistical power).  The lack of technical knowledge can be improved by continued learning of the theory.

The bigger problem I see is that statistical tools are used for the wrong purpose or the wrong question by people who are supposed to know what they are doing — the professionals.  To the less sophisticated viewers, the statistical procedures used by those professionals look proper or even impressive.  To most viewers, if the method, logic, or conclusion doesn’t make sense, it must be due to their lack of understanding.  

An example of using statistics for the wrong purpose is p-hacking, a common practice of manipulating the experiment or analysis until the p-value reaches the desired value and therefore appears to support the preferred conclusion.

Not all bad practices are as easily detectable as p-hacking. Many apply statistical concepts and tools to the wrong question. One category of such examples is failing to differentiate enumerative and analytic problems, a concept that Deming wrote about extensively in his work, including the article mentioned above. I also touched on this in my blog Understanding Process Capability.

In my opinion, the underlying issue in using statistics to answer the wrong questions is the gap between subject matter experts, who try to solve problems but lack adequate understanding of probability theory, and statisticians, who understand the theory but do not have experience solving real-world scientific or business problems.

Here is an example. A well-known statistical software company offers a “decision making with data” training course. Their example of using a hypothesis test is to evaluate whether a process is on target after some improvement. They set the null hypothesis as the process mean being equal to the desired target.

The instructors explain that “the null hypothesis is the default decision” and “the null is true unless our data tell us otherwise.” Why would anyone collect data and perform statistical analysis if they already believe that the process is on target? If you are statistically savvy, you will recognize that you can reject practically any point null hypothesis by collecting a large enough sample, because a real process mean is never exactly equal to the target. In this case, you will eventually conclude that the process is not on target.
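
The effect of sample size is easy to demonstrate. The sketch below assumes a one-sample z-test with known standard deviation and, to keep it deterministic, assumes the sample mean lands exactly on the true process mean. All numbers are hypothetical.

```python
from math import erfc, sqrt


def two_sided_p(sample_mean, target, sigma, n):
    """Two-sided z-test p-value for H0: process mean == target, with known sigma."""
    z = (sample_mean - target) / (sigma / sqrt(n))
    return erfc(abs(z) / sqrt(2))


# Hypothetical process: the true mean is off target by a practically negligible 0.01 sigma.
target, sigma, offset = 50.0, 1.0, 0.01

for n in (100, 10_000, 1_000_000):
    # Simplifying assumption: the sample mean equals the true mean.
    p = two_sided_p(target + offset, target, sigma, n)
    print(f"n = {n:>9,}: p = {p:.3g}")
```

The process is the same in every row. Only the sample size changes, and with it the verdict: the p-value goes from far above any conventional alpha at n = 100 to essentially zero at n = 1,000,000.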

The instructors further explain “It might seem counterintuitive, but you conduct this analysis to test that the process is not on target. That is, you are testing that the changes are not sufficient to bring the process to target.” It is counterintuitive because the decision maker’s natural question after the improvement is “does the process hit the target” not “does the process not hit the target?”

I suppose the reason for choosing such a counterintuitive null hypothesis here is convenience: it is easy to formulate the null hypothesis by setting the process mean to a known value and then calculate the probability of observing the data collected (i.e. the sample) from this hypothetical process.

What’s really needed in this problem is not statistical methods, but scientific methods of knowledge acquisition. We have to help decision makers understand the right questions. 

The right question in this example is not “does the process hit the target?” which is another example of process improvement goal setting based on desirability, not a specific opportunity. [See my blog Achieving Improvement for more discussion.]  

The right question should be “do the observations fall where we expect them to be, based on our knowledge of the change made?”  This “where” is the range of values estimated based on our understanding of the change BEFORE we collect the data, which is part of the Plan of the Plan-Do-Study-Act or Plan-Do-Check-Act (PDSA or PDCA) cycle of scientific knowledge acquisition and continuous improvement.   

If we cannot estimate this range with its associated probability density, then we don’t know enough about our change and its impact on the process. In other words, we are just messing around without using a scientific method. No application of statistical tools can help; they are just window dressing.

With the right question asked, a hypothesis test is unnecessary, and there is no false hope that the process will hit the desired target.  We will improve our knowledge based on how well the observations match our expected or predicted range (i.e. Study/Check).   We will continue to improve based on specific opportunities generated with our new knowledge.
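
One way to make the Plan and Study steps concrete is to write the prediction down before collecting data and then compare the observations against it. The sketch below assumes the prediction is a simple interval with an expected coverage; the function name and all numbers are hypothetical and are meant as an illustration, not a prescribed procedure.

```python
def study(observations, predicted_low, predicted_high):
    """Return the fraction of observations that fall inside the range predicted in the Plan step."""
    inside = [predicted_low <= x <= predicted_high for x in observations]
    return sum(inside) / len(inside)


# Plan: recorded BEFORE data collection (hypothetical numbers).
plan = {"predicted_low": 9.8, "predicted_high": 10.4, "expected_coverage": 0.90}

# Do: observations collected after the change (hypothetical).
observations = [10.1, 9.9, 10.3, 10.0, 10.6, 10.2, 9.7, 10.1, 10.2, 10.0]

# Study: compare what happened with what was predicted.
coverage = study(observations, plan["predicted_low"], plan["predicted_high"])
print(f"predicted {plan['expected_coverage']:.0%} inside the range, observed {coverage:.0%}")
```

A gap between predicted and observed coverage is not a pass/fail verdict. It is information about how well we understand the change, and it points to the next specific opportunity to investigate.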

What is your experience in scientific problem solving?
