How to Bake the Perfect Cake

This article is Part 10 in a series. See PCI, August 2003, for the previous installment.

Where It All Started

When I started my career in 1969, I worked at General Mills. This was long before personal computers and before I had heard of statistical design of experiments. At one of General Mills' internal research symposiums, a group of food chemists demonstrated a computer program that could predict the qualities of cake recipes. This fascinated me. These scientists had baked a series of cakes according to a plan, measured the qualities of each cake, numerically analyzed the results and developed a predictive computer program. The program could either predict the qualities of a cake from its ingredients or predict a recipe from the cake qualities desired. This was my first exposure to experimental design. I wouldn't really discover the use of factorial statistics until the mid-'70s.[1]

A factorial design sets variables at high and low levels. For example, in a three-factor design the experiments would be coded as in Table 1.

If Variable 1 was temperature, the code "-" would stand for the low level in the experiment, say 100 °C, and the code "+" would stand for the high level, say 150 °C. This means that for experiments 1, 3, 5 and 7 the temperature would be set at 100 °C. Similarly, the other variables would be something controlled during the eight experiments, like time or agitation rate. When analyzing data, -1 is used for "-" and +1 is used for "+". The design can be graphed, as in Figure 1. Four of the points have their mathematical coordinates shown.
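
The coding described above can be sketched in a few lines of Python. This is only an illustration: it assumes Table 1 lists the eight runs in the usual "standard order," with Variable 1 alternating fastest, which agrees with the statement that experiments 1, 3, 5 and 7 are run at the low temperature. The function names `full_factorial` and `decode` are made up for this example.

```python
from itertools import product

def full_factorial(n_factors):
    """Return the coded runs (-1/+1) of a two-level full factorial design.

    Assumption: runs are listed in standard order, so the first variable
    alternates fastest and runs 1, 3, 5, 7 have Variable 1 at its low level.
    """
    # product() varies the LAST position fastest, so each tuple is reversed
    # to make the FIRST variable the fastest-changing one.
    return [tuple(reversed(levels))
            for levels in product((-1, 1), repeat=n_factors)]

def decode(coded, low, high):
    """Map a coded level (-1 or +1) to the real setting it stands for."""
    center = (high + low) / 2
    half_range = (high - low) / 2
    return center + coded * half_range

design = full_factorial(3)
for run, (x1, x2, x3) in enumerate(design, start=1):
    # Variable 1 is temperature: 100 degrees C at "-", 150 degrees C at "+".
    print(run, x1, x2, x3, decode(x1, 100, 150))
```

The other two columns would be decoded the same way once their low and high settings, such as two times or two agitation rates, are chosen.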

After the experiments are run and the data gathered, the data can be subjected to regression analysis to develop a predictive equation. The generic form is shown in Equation 1, where the bs represent the regression coefficients and the Xs, the ingredient variables.

$$\text{Property} = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + b_{12} X_1 X_2 + b_{13} X_1 X_3 + b_{23} X_2 X_3 + b_{11} X_1^2 + b_{22} X_2^2 + b_{33} X_3^2 \tag{1}$$

Once I had learned about Design of Experiments, or DOE, I tried using it on all my lab work. I learned the hard way that factorial statistics is best used when the variables are independent of one another, for example, temperature, pressure, time and speed. A factorial design is not well suited to studying formulations, where the ingredient levels depend on one another.

I was given an assignment to evaluate the effect of mischarging monomers in acrylic resin preparation. The standard formulation had monomer A at 20 weight %, monomer B at 30% and monomer C at 50%. I decided to vary each monomer by ±10 relative %, so that A would vary from 18 to 22, B from 27 to 33 and C from 45 to 55. The original design looked like that in Table 2.

When I evaluated the design I realized that not all experiments totaled 100%, so I scaled the experiments back to 100%. This is also shown in Table 2. Now my dilemma was that Experiment 1, the experiment where all the monomers were to be undercharged, was the same as Experiment 8, where all the monomers were to be overcharged, and these two experiments were the same as my starting-point formulation, the center of my design. What to do? I made the resins as described and lucked out in that there were no differences in performance properties in coatings. Several years would pass before I discovered a solution.
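
The dilemma is easy to reproduce. The sketch below builds the eight mischarge experiments from the standard formulation and the ±10 relative % range given above, then rescales each one to 100%. The run order is an assumption (standard order, with all monomers undercharged in Experiment 1 and all overcharged in Experiment 8), and the function name is made up for this example.

```python
from itertools import product

standard = {"A": 20.0, "B": 30.0, "C": 50.0}  # weight %, from the text
relative_change = 0.10                         # +/-10 relative %

def mischarge_design(standard, relative_change):
    """Return (raw charges, raw total, charges rescaled to 100%) per run."""
    names = list(standard)
    runs = []
    for signs in product((-1, 1), repeat=len(names)):
        signs = signs[::-1]  # assumed order: first monomer alternates fastest
        raw = [standard[n] * (1 + s * relative_change)
               for n, s in zip(names, signs)]
        total = sum(raw)
        scaled = [100.0 * r / total for r in raw]
        runs.append((raw, total, scaled))
    return runs

runs = mischarge_design(standard, relative_change)
for number, (raw, total, scaled) in enumerate(runs, start=1):
    print(number,
          [round(r, 1) for r in raw],
          round(total, 1),
          [round(s, 2) for s in scaled])
```

Experiment 1 totals 90 and Experiment 8 totals 110, and after rescaling both become 20/30/50, the starting-point formulation.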

In November 1979, Robert Snee wrote about the use of mixture statistics in formulations ("Experimenting with Mixtures," Chemtech, November 1979, pp. 702-710). This was the answer I needed. By then, mixture statistics had been investigated for about 20 years. For those who would like to read more, John Wiley & Sons has published a book, Experiments with Mixtures by J. Cornell, that goes into greater detail than this article will.

Mixture Constraints and Predictive Equations

Mixture statistics are useful when studying variables that add to 100%, like blends, formulations and polymers. This covers applications like coatings, adhesives, elastomers, detergents, fuels, recipes, etc.

If a mixture has n ingredients that add to 100%, only n-1 ingredient levels need to be specified, because the level of the nth ingredient is then fixed. This is a loss of freedom or, in statistical terms, a loss of one degree of freedom. For three ingredients, this can be stated mathematically as in Equations 2a and 2b.

$$X_1 + X_2 + X_3 = 1 \text{ (or 100\%)} \tag{2a}$$

$$X_3 = 1 - X_1 - X_2 \tag{2b}$$

This constraint has mathematical implications, because the following algebraic identities hold.

$$b_0 = b_0 \cdot 1 = b_0 (X_1 + X_2 + X_3) \tag{3}$$

$$X_1^2 = X_1 \cdot X_1 = X_1 (1 - X_2 - X_3) = X_1 - X_1 X_2 - X_1 X_3 \tag{4a}$$

$$X_2^2 = X_2 \cdot X_2 = X_2 (1 - X_1 - X_3) = X_2 - X_1 X_2 - X_2 X_3 \tag{4b}$$

$$X_3^2 = X_3 \cdot X_3 = X_3 (1 - X_1 - X_2) = X_3 - X_1 X_3 - X_2 X_3 \tag{4c}$$

When equations 3 and 4 are substituted into equation 1, equation 5 results.

$$\text{Property} = c_1 X_1 + c_2 X_2 + c_3 X_3 + c_{12} X_1 X_2 + c_{13} X_1 X_3 + c_{23} X_2 X_3 \tag{5}$$

The c coefficients have a different meaning than the b coefficients. The linear coefficients are given in Equations 6a through 6c, and the cross-product coefficients are given in Equations 7a through 7c.

$$c_1 = b_0 + b_1 + b_{11} \tag{6a}$$

$$c_2 = b_0 + b_2 + b_{22} \tag{6b}$$

$$c_3 = b_0 + b_3 + b_{33} \tag{6c}$$

$$c_{12} = b_{12} - b_{11} - b_{22} \tag{7a}$$

$$c_{13} = b_{13} - b_{11} - b_{33} \tag{7b}$$

$$c_{23} = b_{23} - b_{22} - b_{33} \tag{7c}$$
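
The substitution can be checked numerically. Collecting terms after putting Equations 3 and 4 into Equation 1 gives c1 = b0 + b1 + b11 (and likewise for c2 and c3) and c12 = b12 - b11 - b22 (and likewise for c13 and c23). The sketch below uses arbitrary random b values, not fitted coefficients, and confirms that Equation 1 and Equation 5 agree at any point where the three ingredients sum to 1. All function names are made up for this example.

```python
import random

def full_quadratic(b, x1, x2, x3):
    """Equation 1: intercept, linear, cross-product and squared terms."""
    return (b["0"] + b["1"] * x1 + b["2"] * x2 + b["3"] * x3
            + b["12"] * x1 * x2 + b["13"] * x1 * x3 + b["23"] * x2 * x3
            + b["11"] * x1 ** 2 + b["22"] * x2 ** 2 + b["33"] * x3 ** 2)

def mixture_model(c, x1, x2, x3):
    """Equation 5: no intercept and no squared terms."""
    return (c["1"] * x1 + c["2"] * x2 + c["3"] * x3
            + c["12"] * x1 * x2 + c["13"] * x1 * x3 + c["23"] * x2 * x3)

def to_mixture_coefficients(b):
    """Collect terms after substituting Equations 3 and 4 into Equation 1."""
    return {
        "1": b["0"] + b["1"] + b["11"],
        "2": b["0"] + b["2"] + b["22"],
        "3": b["0"] + b["3"] + b["33"],
        "12": b["12"] - b["11"] - b["22"],
        "13": b["13"] - b["11"] - b["33"],
        "23": b["23"] - b["22"] - b["33"],
    }

rng = random.Random(0)  # fixed seed so the check is repeatable
keys = ["0", "1", "2", "3", "12", "13", "23", "11", "22", "33"]
b = {k: rng.uniform(-5, 5) for k in keys}  # arbitrary illustrative values
c = to_mixture_coefficients(b)

max_diff = 0.0
for _ in range(1000):
    # Two sorted random cuts of [0, 1] give three fractions that sum to 1.
    lo, hi = sorted((rng.random(), rng.random()))
    x1, x2, x3 = lo, hi - lo, 1 - hi
    diff = abs(full_quadratic(b, x1, x2, x3) - mixture_model(c, x1, x2, x3))
    max_diff = max(max_diff, diff)

print("largest disagreement on the mixture plane:", max_diff)
```

The two models differ only by floating-point rounding, which shows that the ten-term factorial model carries no more information than the six-term mixture model once the ingredients are constrained to total 100%.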

Graphical Representation of Mixture Space

The algebraic derivation can be shown graphically. Figure 2a shows a coded design space for two factorial variables, X1 and X2. In a mixture design, however, the sum of X1 and X2 must equal 1. This means that all of the experimental points must fall on the line between the points (0,1) and (1,0). When the response is plotted, Figure 2b results. Either the amount of X1 or X2 can be used as the abscissa. Similarly, Figure 3a shows a coded design space for three factorial variables, X1, X2 and X3. In a mixture design of three ingredients, the sum of the three ingredients must equal one. In this case the allowable experimental points fall on the triangularly shaped plane between the points (0,0,1), (0,1,0) and (1,0,0). In triangular plots, the response is plotted like a contour map, where each contour represents the same response for different formulations.

It is sometimes difficult to understand how to decipher the formulations in a triangle plot. Each vertex represents 100% of one ingredient. The side opposite that vertex represents formulations with none of the ingredient of that vertex. For example, vertex A represents 100% ingredient A, and the base between B and C represents formulations with 0% A, that is, various combinations of ingredients B and C. For example, point Q represents a formulation with 50 parts B and 50 parts C (no A). Similarly, point R represents a formulation with 25 parts B and 75 parts C. Points inside the triangle represent formulations with all three ingredients present. For example, point S represents a formulation with 33 1/3 parts A, 33 1/3 parts B and 33 1/3 parts C, and point T represents a formulation with 25 parts A, 25 parts B and 50 parts C.
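
For readers who prefer arithmetic to geometry, a formulation can be converted to a position in the triangle. The sketch below assumes one particular layout that the article does not specify: an equilateral triangle with unit sides, B at the lower left, C at the lower right and A at the top. The function name `ternary_to_xy` is made up for this example.

```python
import math

def ternary_to_xy(a, b, c):
    """Convert parts of A, B and C to (x, y) inside an equilateral triangle.

    Assumed layout: B at (0, 0), C at (1, 0), A at (0.5, sqrt(3)/2).
    The parts are normalized, so 25/25/50 and 1/1/2 give the same point.
    """
    total = a + b + c
    a, b, c = a / total, b / total, c / total
    x = c + 0.5 * a             # C pulls right; A pulls toward the middle
    y = a * math.sqrt(3) / 2    # height above the B-C base depends only on A
    return x, y

points = {
    "Q": (0, 50, 50),      # on the B-C base, no A
    "R": (0, 25, 75),      # on the B-C base, closer to C
    "S": (1, 1, 1),        # equal parts of all three
    "T": (25, 25, 50),     # inside the triangle, toward C
}
for name, parts in points.items():
    x, y = ternary_to_xy(*parts)
    print(name, round(x, 3), round(y, 3))
```

Q and R have y = 0 because they contain no A, and the height of any point above the B-C base is proportional to its A content.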

The Study of Mixtures

So far this article has described the algebraic and graphic derivation of mixture statistics and how mixture statistics differs from factorial statistics. At this point, I could describe how to formulate and bake the perfect cake. Instead, the examples that follow show how mixture statistics can be used to study a coating, an elastomer and an adhesive. We'll talk about how to bake a cake at the end of the article.

Formulation of an Acrylic Urethane Coating

Three resins were available for formulating an acrylic urethane coating (Table 3). These resins were usually formulated with an oligomeric polyisocyanate based on hexamethylene diisocyanate at a mix ratio of about 1.1 isocyanate groups per 1.0 acrylic resin hydroxyl group. Because these resins had different properties, they were potential blending partners.

Acrylic Resin A with its high Tg and high hydroxyl number yielded coatings, when cured with the above polyisocyanate, with the highest chemical resistance and tensile properties of the family. Acrylic Resin B with a Tg just above room temperature and its high molecular weight yielded coatings with a very fast dry time. Resin C with its Tg below room temperature yielded coatings with high flexibility. All of the resins yielded coatings that retained about 65% of their original gloss, when formulated into white coatings and exposed in south Florida.

A goal of this project was to determine if intermediate properties could be attained by blending the resins. Ten experimental formulations were prepared as shown in Figure 5, which is a typical mixture experimental design for fitting a quadratic algebraic predictive model. White pigmented paints were formulated with all the necessary ingredients; coatings were prepared, and the coating properties determined. Regression analysis was performed on the data and response surfaces plotted as shown in Figures 6a - 6d.
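
Figure 5 is not reproduced here, so the exact ten blends are not known from the text. As an illustration only, the sketch below generates one commonly used ten-point layout for three ingredients, the augmented simplex-centroid design: the three pure resins, the three 50:50 blends, the three-way equal blend and three interior check blends. Whether this matches Figure 5 is an assumption, and the function names are made up for this example.

```python
from fractions import Fraction as F
from itertools import permutations

def unique_permutations(blend):
    """All distinct orderings of a blend, e.g. which resin gets the 2/3."""
    return sorted(set(permutations(blend)), reverse=True)

def augmented_simplex_centroid():
    """Ten blends of three ingredients (assumed layout, not taken from Figure 5)."""
    vertices = unique_permutations((F(1), F(0), F(0)))         # pure resins
    midpoints = unique_permutations((F(1, 2), F(1, 2), F(0)))  # 50:50 blends
    centroid = [(F(1, 3), F(1, 3), F(1, 3))]                   # equal parts
    axial = unique_permutations((F(2, 3), F(1, 6), F(1, 6)))   # interior checks
    return vertices + midpoints + centroid + axial

blends = augmented_simplex_centroid()
for number, blend in enumerate(blends, start=1):
    print(number, [f"{float(x) * 100:.1f}" for x in blend])
```

Ten blends are more than the six coefficients in Equation 5 require, which leaves some experiments over for judging how well the quadratic model fits.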

In general, the response surfaces demonstrated what was already known about coatings based on the individual resins. Figure 6a demonstrates that as the amount of acrylic B, the fast-dry resin, was increased in the formulations, the dry times decreased. Figure 6b demonstrates that as the amount of acrylic C, the flexible resin, was increased, the impact resistance increased. Figure 6c demonstrates that as acrylic A, the chemical resistant resin, was increased, the chemical resistance increased.

The Florida weathering data in Figure 6d verified what was known about coatings using only a single resin: 60° gloss retention was ~65% after two years. However, a surprise was seen in blends of acrylic resins A and C. At a 50:50 mix ratio of A:C, the gloss retention increased to over 85%. This was completely unexpected and illustrates how discoveries are made and patents are granted. As a postscript, a resin development program succeeded in producing Resin D, with a composition intermediate between Resins A and C, which had the improved weathering.

One additional benefit of developing response surfaces needs to be pointed out: response surfaces can be overlaid. Formulations that satisfy several performance criteria at once can then be predicted. For example, if formulations are desired that have at least 70% gloss retention and greater than 60 reverse impact, the grayed area in Figure 7 shows the formulations that would satisfy both criteria. A third response surface, perhaps cost, could be superimposed to narrow the allowed formulations further.
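
Overlaying response surfaces amounts to evaluating each predictive equation at the same formulation and keeping the formulations that pass every criterion. The sketch below shows the mechanics only. The coefficients are HYPOTHETICAL placeholders chosen for illustration; they are not the fitted values behind Figures 6 and 7, which the article does not give. The function names are also made up for this example.

```python
def predict(coef, xa, xb, xc):
    """Evaluate a quadratic mixture model (the form of Equation 5)."""
    return (coef["A"] * xa + coef["B"] * xb + coef["C"] * xc
            + coef["AB"] * xa * xb + coef["AC"] * xa * xc
            + coef["BC"] * xb * xc)

# HYPOTHETICAL placeholder coefficients, NOT results from the study.
gloss_model = {"A": 65, "B": 65, "C": 65, "AB": 0, "AC": 80, "BC": 0}
impact_model = {"A": 20, "B": 40, "C": 120, "AB": 0, "AC": 0, "BC": 0}

def feasible_blends(step=5, min_gloss=70, min_impact=60):
    """List A/B/C blends (in parts per 100) that pass both criteria."""
    hits = []
    for a in range(0, 101, step):
        for b in range(0, 101 - a, step):
            c = 100 - a - b
            xa, xb, xc = a / 100, b / 100, c / 100
            gloss_ok = predict(gloss_model, xa, xb, xc) >= min_gloss
            impact_ok = predict(impact_model, xa, xb, xc) > min_impact
            if gloss_ok and impact_ok:
                hits.append((a, b, c))
    return hits

hits = feasible_blends()
print(len(hits), "candidate blends, for example", hits[:3])
```

A cost model could be added as a third call to `predict` with its own limit, which corresponds to superimposing a third surface.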

Formulation of a Urethane Cast Elastomer

The first example illustrated how ingredients can be varied from 0 to 100% in a formulation. This is not always desired. The second example is a cast elastomer prepared from hydrogenated MDI diisocyanate, H12MDI, and three polyols: a polytetramethylene glycol, PTMEG; butanediol, BDO; and trimethylol propane, TMP. The cast elastomers were prepared with an NCO-OH ratio of 1:1. Since the diisocyanate was common to all the formulations, it did not have to be taken into account in the design. The design space and experimental points are shown in Figure 8a. The PTMEG contributes to the soft segment domains of the cast elastomer, while the BDO and TMP contribute to the hard segment domains. The isograms showing hard segment content are shown in Figure 8b. In this design, the cast elastomer made with 100% PTMEG was not very interesting, so it was excluded from the design. The design was also limited to formulations that had >20% PTMEG, because it was believed that formulations with <20% PTMEG would be too brittle to make successful castings. Formulations with only PTMEG and BDO would be thermoplastic, since both are difunctional, whereas formulations that include TMP would be thermoset, because TMP is trifunctional.

Most cast elastomer performance properties, or at least their trends, can be predicted from the hard segment content and the degree of crosslinking, that is, the trimethylol propane content. This turned out to be true for tensile properties and hardness. A surprise was seen in the case of Bashore Rebound.

For the Bashore Rebound response surface, an area in the middle of the design space is seen to be energy absorbing. The question was why. If you recall this test, it consists of dropping a ball onto the cast elastomer, measuring the height of the rebound and calculating it as a percentage of the height dropped. When the samples were examined, the samples with a high content of PTMEG were rubbery and so gave a "rubber ball" bounce. The samples with a high hard segment content were very hard and gave a "billiard ball" bounce. The formulations in between had neither character and absorbed the impact energy instead.

Formulation of an Epoxy Adhesive

In the last example, the PTMEG content in the formulation varied from 20-90%, and the BDO and TMP each varied from 0-80%. This next example illustrates two things: first, that mixture statistics can include non-reactive species; and second, that constraints do not have to be straightforward.

The epoxy resin used in the adhesive was a liquid epoxy of 190 equivalent weight. The hardener portion was made up of a diamine crosslinker, an acid catalyst and a plasticizer. The epoxy and the hardener formulation were mixed such that there was an oxirane-amine hydrogen ratio of 1:1. Since the epoxy was common to all the formulations, it does not have to be taken into account for the design.

The amine is the crosslinker and was varied from 20-100% in the formulation. The catalyst was varied from 0-35% of the formulation and the plasticizer was varied from 0-70% of the formulation. In addition it was determined that the catalyst-plasticizer ratio should not be greater than 80:20, so this constraint was also included. The resulting design space and experimental points are shown in Figure 10a.
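
The constraints above can be written as a simple test that a candidate hardener formulation either passes or fails. The sketch below is one reading of them, with one assumption stated loudly: the 80:20 limit is written as 20 x catalyst <= 80 x plasticizer, so that a formulation with catalyst but no plasticizer is rejected and the 100% amine formulation is allowed. The function names are made up for this example.

```python
def is_feasible(amine, catalyst, plasticizer, tol=1e-9):
    """Return True if a hardener formulation (in %) lies in the design space."""
    if abs(amine + catalyst + plasticizer - 100) > tol:
        return False                      # ingredients must total 100%
    if not 20 <= amine <= 100:
        return False                      # amine: 20-100%
    if not 0 <= catalyst <= 35:
        return False                      # catalyst: 0-35%
    if not 0 <= plasticizer <= 70:
        return False                      # plasticizer: 0-70%
    # Assumed form of the ratio limit: catalyst:plasticizer <= 80:20.
    return 20 * catalyst <= 80 * plasticizer

def feasible_grid(step=5):
    """Enumerate feasible formulations on a grid of whole percentages."""
    found = []
    for catalyst in range(0, 36, step):
        for plasticizer in range(0, 71, step):
            amine = 100 - catalyst - plasticizer
            if is_feasible(amine, catalyst, plasticizer):
                found.append((amine, catalyst, plasticizer))
    return found

candidates = feasible_grid()
print(len(candidates), "feasible grid points")
print(is_feasible(100, 0, 0), is_feasible(65, 35, 0), is_feasible(60, 32, 8))
```

A design program would place its experimental points at the corners and edges of this region, but the same pass/fail test defines where those points are allowed to fall.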

When the experiments were run, the amine number of the hardener was found to be linearly correlated to the amine content, as expected (Figure 10b). A log transformation of the hardener viscosity was used to fit a quadratic dependence on the three ingredients of the hardener, as illustrated in Figure 10c. The gel time of the adhesive is shown in Figure 10d. It showed that there was an optimum balance of catalyst and plasticizer if the fastest cure was desired.

What's Next?

The above examples illustrated formulations of three ingredients. Others could be shown for formulations with four or more ingredients using the same principles. Beyond four I can't visualize the design space, but computer programs can do that.

Now that I've shown how to conduct an experimental design for an acrylic urethane coating, a urethane cast elastomer and an epoxy adhesive, conducting an experimental design to get the perfect cake should be a snap. All of the cake formulations, uh, recipes, I have seen consist mostly of sugar, flour and some standard combination of additives - eggs, water, shortening, flavoring, etc. A design like one of the above could be run with response variables like cake height, texture and flavor. I'll leave those details to you.

What else can be done with mixture statistics? Properties of oligomers and polymers can be modeled using mixture statistics. This could be a topic for a future article.

References

1. "deSigns of the Times: Or, When F is a Passing Grade," Paint & Coatings Industry Magazine, August 2001.