*Author's Note: This article is the first in a series on the use of statistically designed experimentation in coatings research and development. The articles will be based on examples and anecdotes from my 30 years of work in polymers, coatings and adhesives. I have not had formal training in statistics, but I have had the good fortune of being tutored by experienced users and counseled by applied statisticians. I am short on theory (don't ask me to interpret a Durbin-Watson statistic or calculate a Box-Cox transformation), but if you want some practical step-by-step procedures, stay tuned.*

Statistical methods encompass a variety of procedures. Some of the buzzwords include numerical analysis, sampling statistics, paired testing, factorial design, Latin squares, robust testing, Taguchi, simplex designs, regression analysis, response surface methodology, sequential analysis and control charting. These techniques can be used to analyze experimental error, define the "true" value, identify bias, identify and quantify effects of significant experimental variables, optimize a process or formulation, predict future performance, select product specifications, control quality, or to correct off-grade product.

I have heard many times that statistics is OK for manufacturing and quality-control chemists, but not for everyday use in research and applications labs where different things are done every day. However, if you are involved with a number, whether it is a continuous variable like temperature, an ordinal variable like performance ranking or a nominal variable like counting categories of objects, you are involved with statistics.

So if you are involved with numbers (even though statistical methods seem to add more complication), using statistics will provide more information with fewer experiments at a faster rate and a lower cost. In addition, statistically derived information, charts and figures can be more easily understood by management, colleagues and customers than tables of data.^{1}

## Beginnings

When I give statistical seminars, I generally start with the Scientific Method, because it is the basis not only for all scientific investigation but also for statistically designed experimentation. In fact, as early as the fifth century B.C., a form of the Scientific Method was used by Socrates.

Almost every day one can read in the newspaper a statement like "scientists today proved that ..." Actually, according to the Scientific Method, nothing can be proven true, only proven false. Statistical analyses are wonderful tools to aid a scientist in research.

The first step of the Scientific Method is *observation* of a fact. The second step is for the scientist to account for this fact by way of a *hypothesis* that explains what was observed.^{2} The next step is *experimentation* that could disprove the hypothesis. If the hypothesis is disproved, the new data is combined with the old data to form an alternative hypothesis, which is again tested. Ultimately, when all of the observations support the hypothesis - it only takes one example for disproof - the hypothesis is accepted as a *theory*.

In the scientific world, theories are continually evaluated with additional tests and with newly generated information - again, only one exception to the theory must be found to disprove it. If the theory is disproved, the new data is combined with all the old data and a new hypothesis is constructed. With additional testing, the new hypothesis may become a theory. The process continues until the theory is shown never to have an exception. The resultant conclusion is termed Natural Law.

The story of gravity illustrates how theories evolve. Anyone can observe the effect of gravity by jumping. After an apple hit him on the head, Sir Isaac Newton developed the hypothesis that the masses of two objects attract each other. The magnitude of the gravitational effect can be predicted from the masses of the bodies and the distance between them. The Newtonian theory of gravitation only holds true for macroscopic bodies, such as planets, people or apples. Newton's theory was seen to fall apart when measuring techniques became good enough to measure forces between atoms.

Albert Einstein developed a new hypothesis, which combined the data on macroscopic and microscopic effects of gravity. His new hypothesis eventually became known as the Special Theory of Relativity. Even this theory did not describe all gravitational observations, like the effect of the mass of a giant star on passing star light. Using this new data, Einstein eventually developed his General Theory of Relativity. Even today, scientists continue to gather data in attempts to demonstrate shortcomings of this theory.

## An Interesting Tool

Because of experimental variation,^{3} differences between experiments are often difficult to judge. Usually the experimenter goes with the highest number and ignores experimental variation. Statistical methods can help identify real differences and point out when the data is inadequate to make a decision.

Statistical methods can be used in all facets of the coatings industry: screening studies, formulation, manufacturing process optimization and quality control. I have used statistically designed experimentation to explore emulsification processes, polymer compositions, coating resin combinations, adhesive formulations, cast elastomer recipes, polymerization reaction conditions, performance tests and analytical tests.

After the statistical analysis was completed, I used the results to predict an optimum set of operating conditions to achieve a desired result, to predict an expected value for a given set of operating conditions, to calculate an optimum adjustment for an out-of-spec blend, to control a manufacturing process to maintain product quality, and to calculate the lowest cost operating conditions of a manufacturing unit.

## Basics

At the start of a project, a researcher will ask: "Is my new paint formulation better than the old one?" The statistically minded researcher will form a hypothesis and then try to prove it wrong. One hypothesis is the Null Hypothesis, which states that two things are equal - no difference. This means that the researcher will make the statement: "My new paint formulation performs the same as my old."

The researcher will then gather data in an attempt to disprove this hypothesis. He uses statistical tests to compare the data from both formulations. If the data does not disprove the hypothesis, the researcher has two choices, accept the hypothesis or gather more data to try to disprove it. The researcher never says he has proved that formulations are equal. Someday he may gather more data to show that they are not. If the data disproves the hypothesis, the researcher then develops an alternative hypothesis, for example, that the performance of Formulation A is better than that of Formulation B.

The researcher then has a new hypothesis to try to disprove. If this hypothesis holds, the statistical researcher does not say "I have proven Formulation A is better than Formulation B." Rather, the researcher says, "I cannot disprove that Formulation A is better than Formulation B."^{4} Usually, when a statistical researcher says the latter, people hear the former.

The statistical researcher never makes absolute statements like the above. He always gives a probability, for example, "I am certain that if I say 'I cannot disprove that Yield A is greater than Yield B,' I will be wrong only one time in 20." Usually, a statistician will conduct additional experiments if he concludes he could be wrong one time in 10; he will believe his conclusion when he would be wrong only one time in 20; and he will buy his wife flowers if he can conclude he would be wrong only one time in 100.
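To make the "one time in 20" idea concrete, here is a minimal sketch of one way to test the null hypothesis that two formulations perform the same. It uses an exact permutation test, chosen only because it needs nothing beyond the Python standard library; it is not necessarily the test a given statistician would pick. The function name and the gloss readings are invented for illustration.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(a, b):
    """One-sided exact permutation test of the null hypothesis 'a and b perform the same'.

    Returns the fraction of all possible regroupings of the pooled data whose
    difference in means is at least as large as the difference actually observed.
    """
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    indices = range(len(pooled))
    at_least_as_extreme = 0
    total = 0
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point round-off in the comparison.
        if mean(group_a) - mean(group_b) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / total

# Hypothetical gloss readings, invented for illustration only.
new_formulation = [82.1, 84.3, 83.7, 85.0, 84.1]
old_formulation = [80.2, 81.5, 79.8, 82.0, 80.9]

p = permutation_p_value(new_formulation, old_formulation)
print(f"Chance of a difference this large if the formulations were equal: {p:.4f}")
```

A value below 0.05 corresponds to being wrong no more than one time in 20 when rejecting the null hypothesis; a value below 0.01 corresponds to one time in 100.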

There are two types of mistakes a statistician can make: he can reject the null hypothesis when it is actually true; or he can accept the null hypothesis when it is actually false. The first is called Type I error and the second is called Type II error. Both errors are disastrous. Statistically designed experimentation evaluates these errors each time a statistical analysis takes place.
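The two error types can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not a procedure from this article: it draws two groups of equal size from normal distributions with a known standard deviation, applies a two-sided z-test at the one-in-20 level, and counts how often the null hypothesis is rejected. All names, sample sizes and the size of the true difference are hypothetical.

```python
import random
from statistics import NormalDist, mean

def rejects_null(a, b, sigma, alpha=0.05):
    """Two-sided z-test for equal means; assumes equal group sizes and a known sigma."""
    z = (mean(a) - mean(b)) / (sigma * (2 / len(a)) ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha

def rejection_rate(true_difference, n=5, sigma=1.0, trials=2000, seed=0):
    """Fraction of simulated experiments in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, sigma) for _ in range(n)]
        b = [rng.gauss(true_difference, sigma) for _ in range(n)]
        if rejects_null(a, b, sigma):
            rejections += 1
    return rejections / trials

# Null hypothesis true: every rejection is a Type I error.
type_1_rate = rejection_rate(true_difference=0.0)

# Null hypothesis false: every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_difference=1.0)

print(f"Type I error rate:  {type_1_rate:.3f}")
print(f"Type II error rate: {type_2_rate:.3f}")
```

The Type I rate should land near the chosen level of 0.05. The Type II rate depends on the size of the real difference, the experimental error and the number of replicates, which is one reason replication matters.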

## Experimental Error

First, experimental errors are not mistakes. Experimental error refers to the normal day-to-day variations in processes, analytical equipment, raw materials or uncontrolled aspects of the experiment. For example, a glass burette has graduations in tenths of milliliters. The experimental error is ±0.03 ml. When a researcher reports the titration took 12.32 ml, the real value could be anywhere between 12.29 and 12.35 ml: not a mistake, just experimental error. If greater precision is needed, automatic titrators using the electric potential of the solution improve the precision to better than ±0.01 ml.

Typically a chemist has several different test methods to choose from. For example, ASTM reports more than a dozen methods for viscosity measurement. The tests may have different accuracies and precisions. Only experience and history can help decide which one to use. It is not unusual for laboratories in different locations to report different results even when the identical test method is used. That is the reason ASTM committees conduct repeated round-robin testing protocols to determine error within a lab and between different labs and then include the data in the standard test methods.
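Within a single lab, the size of the experimental error can be estimated by repeating the same measurement several times. The sketch below summarizes a set of replicate titrations with their average and sample standard deviation; the five volumes are invented for illustration and are not data from this article.

```python
from statistics import mean, stdev

# Hypothetical replicate titration volumes in ml (invented for illustration).
replicates = [12.32, 12.29, 12.35, 12.31, 12.33]

average = mean(replicates)
spread = stdev(replicates)  # sample standard deviation (n - 1 in the denominator)

print(f"average = {average:.2f} ml, standard deviation = {spread:.3f} ml")
```

The standard deviation describes how far a single reading typically falls from the average, which is the practical meaning of experimental error for that test method.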

There are two major assumptions in the collection of test data: that experimental error follows a normal distribution - data is symmetrically distributed above and below the average - and that random order is used for experimentation. Not using random order experimentation can lead to false conclusions.

A researcher ran the four experiments in the order shown in the table. After Experiment 2, her thermometer broke, so she got a new one. After running the experiments the researcher concluded that high levels of Variable B gave improved yields. When she transferred this result to manufacturing, higher yields were not seen. Actually, the replacement thermometer gave a lower reading at the same temperature, so that experiments 3 and 4 were run at a higher temperature than 1 and 2. The higher temperature gave the higher yields, not the high level of Variable B. If she had randomized the experiments, say 1, 4, 3, 2, the bias introduced by the new thermometer would have been spread over the whole experimental design and the erroneous conclusion would have been avoided.

This same situation could occur if one had changed raw materials in the middle of a designed experiment or if one of the uncontrolled process variables had accidentally changed. If the experiments in a design are conducted in random order, the unknown variables become part of the experimental error, which can be separated from the effects of the variables.
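Choosing a random run order takes only a few lines. This is a minimal sketch using Python's standard `random` module; the function name and the optional `seed` argument (useful for recording the order in a lab notebook) are assumptions made for illustration.

```python
import random

def randomized_run_order(n_runs, seed=None):
    """Return the experiment numbers 1..n_runs in a random order."""
    order = list(range(1, n_runs + 1))
    random.Random(seed).shuffle(order)  # shuffle in place with a private generator
    return order

# Four experiments, as in the thermometer example above.
run_order = randomized_run_order(4)
print("Run the experiments in this order:", run_order)
```

Each call without a seed gives a fresh order, so the sequence should be written down before the first experiment is run.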

## The Statistical Method

Statistically designed experimentation requires six steps. First, the researcher uses his knowledge and experience to define the goals of the work - the null hypothesis. Process variables and ingredients are selected for study. The upper and lower limits are set for these variables.

Then, the extreme conditions of the experiment are run. This checks to see if the limits are set too wide or too narrow. If too narrow, the researcher may miss a new opportunity, but if too wide, several experiments may fail and invalidate the design. If all the extreme experiments are successful, the researcher may wish to broaden the limits, but if the extremes fail, he must reduce the limits.

The number of experiments is selected in the third step. A specific pattern of experiments is selected. Certain of the experiments are chosen for replication. Test methods are selected on the basis of applicability, their degree of accuracy and precision.

Next, one or more experimental designs are performed to develop a predictive mathematical model: a screening design or a full factorial design, and ultimately an enhanced design to refine the information.
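As an illustration of what a "specific pattern of experiments" can look like, the sketch below lists every run of a two-level full factorial design, in which each variable is set to its lower or upper limit in every possible combination. The variable names and limits are hypothetical, chosen only to show the pattern.

```python
from itertools import product

def full_factorial(factors):
    """Return one dict of settings per run, covering every combination of levels."""
    names = list(factors)
    level_sets = [factors[name] for name in names]
    return [dict(zip(names, levels)) for levels in product(*level_sets)]

# Hypothetical variables with (lower limit, upper limit); invented for illustration.
factors = {
    "temperature_C": (60, 80),
    "catalyst_pct": (0.5, 1.0),
    "stir_rpm": (200, 400),
}

design = full_factorial(factors)
for run_number, settings in enumerate(design, start=1):
    print(run_number, settings)
```

Three variables at two levels each give 2 x 2 x 2 = 8 runs. The list is printed in systematic order for readability; in practice the runs would be carried out in random order.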

The last step is for the researcher to use the model defined in the fourth step to make and verify predictions about the process or formulation. When verification experiments reproduce the predicted results, the researcher has an operational theory.