In a normative SenseMaker triad, story data are clustered at the three vertices plus the centroid of the triangle, with lesser groups near the midpoints of the sides and perhaps “stringers” between the centroid and the other six loci. The first image in Part III: Random Data shows an example.

There are two major reasons to examine the distribution of the data more closely. Firstly, if there are a lot of data points in one location, the true density of the data may be obscured due to overprinting. Secondly, comparison between groups of respondents is universal in SenseMaker projects, which generally leads to questions best answered with quantitative methods. Here are some possible comparative approaches:

- side-by-side visual inspection of cohorts or demographic groups, in the same or different triads, utilizing the client’s knowledge of the subject population(s);
- calculation of the statistical significance of the difference between the (geometric) means of two or more data groups, as illustrated in Part IV: Confidence Regions;
- selective estimation of data distribution by counting points in prominent clusters (see below); and
- histogram-like methods for aggregating data, either with discrete (discontinuous) bins or with smooth (continuous) contour lines. The first of these produces a ternary heat map, which is discussed in Part VIIIb. The second is something else entirely and the subject of this post.

This figure shows a triad, T7 (left-hand frame), from a recent project. Laurie chose the circled clusters by eye and used the “lasso” tool in Tableau to measure the number of points in each cluster. The labels show that 87% of the total falls in those seven clusters, with a slight overall majority in the elongated field at the upper vertex. This semi-quantitative method can be useful for a client’s discussions, for example, about differences between respondents in each cluster. If the number of stories increases substantially, however, then distinct clusters may become less obvious; and overprinting (evident near the top vertex where individual blue dots blur into an unresolvable blob) may become a barrier to seeing both the deconstructed detail and the larger structure.

The other two frames of the T7 figure are designed to show precisely that larger structure. The blue lines (center) and orange-shaded areas (right) mark contour lines that enclose areas of the triad with successively higher likelihood of containing story points. Hence, the most closely-packed blue lines and deepest-orange colors are near the upper vertex where the 54%-cluster of points was demarcated in the left frame. **T7 is typical of the kind of match one would expect between contour patterns and the underlying data.**

Just as on a topographic map, where contours indicate increasing height above sea level, the contours on these two plots can be read as a central ridge, climbing toward a peak at the upper vertex. These contours, however, trace out values of a probability density function (PDF), which were calculated by kernel density estimation (KDE). If you recognize the PDF and KDE acronyms, then you probably don’t need to go to those two Wikipedia links; and if you don’t, well, then you probably don’t *want* to go to them. (But, if you just can’t resist the lure, then go to the KDE link and read the first section on “Definition” and look at the side-by-side images comparing a histogram with its equivalent one-dimensional KDE. Now all you have to do is imagine that the T7 frames (above) are a two-dimensional version of the same idea. That histogram + KDE image is also reproduced in the Appendix at the end of this post.)

Suffice to say that the KDE + PDF combo is a way to aggregate the triad data and then calculate a smooth, continuous guess (the contours) about where any unsampled stories would be likely to plot. **Again, T7 is a good example of the kind of match this technique can yield.** The beauty of it is that the guess is non-parametric, meaning you don’t have to know anything about whether the data obey some particular set of statistical rules, such as a normal, bell-shaped distribution.

So the idea in principle is that the KDE contours will show the larger, overall structure of the data, whether there is overprinting or not. In some situations it would not be unusual to omit the data entirely, for example, where there were so many points that the image at some working scale was a blur of run-together dots. Fortunately, in the study of which T7 was a part, Laurie kept the data points as a base layer for the two versions of the KDE plots. That made it much easier to see the problem in T3:

In a nutshell, the contour pattern is wrong. Most importantly, there should be an indication in the contours that the highest concentration of data is at the lower-left vertex, with equal secondary concentrations at the lower-right vertex and at the centroid. Instead, there is a single central peak, with all contours dropping off to the corners. If you were looking at this pattern on a topographic map while out hiking, you would expect to look up and see a familiar Alpine glaciation feature. Think “Matterhorn.” The actual data, however, describe a more subdued version of the KDE plots for T7, but rotated so that the “ridge” would be climbing up toward the lower-left vertex, with similar lower levels (probabilities) at the lower-right and center.

In fact, this is not an isolated example. In the study from which T3 and T7 were taken, these two are the extremes, and the other five triads can be arranged gradationally between them. Four of the seven, including T3, have contours that overall do not accord with the data. (Laurie has learned subsequently of at least three other SenseMaker practitioners who have seen this issue.)

The most likely explanation gets into some very deep technical weeds, involving the interplay between the effect of the log-ratio transformation (LRT) on near-vertex and other near-zero triad data and the behavior of a fixed-width kernel function when data points are sparse. Oddly enough, it appears that the LRT preferentially stretches out near-vertex/near-zero data so that they are less heavily weighted in the KDE than they should be. There is a long Appendix (below) that discusses these issues with some arithmetic detail, but very little mathematics, for the tiny number of people in the Cogniverse who might be interested.

Right now, we can say three things with a modest degree of certainty:

- Laurie’s data prep and Nicholas Hamilton’s KDE-related R code (see ggtern) were identical for all seven triads, so (simple) human error is unlikely to be the culprit. Said differently, **non-SenseMaker practitioners should continue to have full confidence in ggtern**.
- The ternary heat maps that are now available in ggtern are a reliable way to present the same information in a discrete format, though they admittedly lack the appeal of the continuous PDF contours. See Part VIIIb in this series for examples.
- Any SenseMaker practitioners who decide (or continue) to use KDE/PDF contours should include story data points as a base layer and heed the paraphrased lesson of Part VII: Mapping the Datasaurus Dozen: Don’t just calculate, plot… and then compare!

The simplest way to grasp the likely cause of the mismatch between triad story data and PDF contours, as in the T3 example (above), is to look at the equivalent phenomenon in one dimension (1-D), that is, in a histogram and its corresponding KDE. As a reminder, even though a triad has three components and coordinates — (A,B,C), one at each vertex, and (a,b,c), each running from 0 to 1, respectively — only two are independent due to the closure constraint. Hence, the triad is a 2-D construct.

Here is a superb illustration [1] from the KDE article in Wikipedia:

And here is the accompanying description of this graphical comparison (emphasis and footnote added):

Kernel density estimates are closely related to histograms, but can be endowed with properties such as smoothness or continuity by using a suitable kernel. To see this, we compare the construction of histogram and kernel density estimators, using these 6 data points:

*x*_{1} = −2.1, *x*_{2} = −1.3, *x*_{3} = −0.4, *x*_{4} = 1.9, *x*_{5} = 5.1, *x*_{6} = 6.2. For the histogram, first the horizontal axis is divided into sub-intervals or bins which cover the range of the data. In this case, we have 6 bins each of width 2. Whenever a data point falls inside this interval, we place a box of height 1/12. If more than one data point falls inside the same bin, we stack the boxes on top of each other. For the kernel density estimate, we place a normal [Gaussian] kernel… (indicated by the red dashed lines) on each of the data points *x*_{i}. The kernels are summed to make the kernel density estimate (solid blue curve). [2] The smoothness of the kernel density estimate is evident compared to the discreteness of the histogram, as kernel density estimates converge faster to the true underlying density for continuous random variables.

*With apologies to Woody Allen* [3]….
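The summation the quoted passage describes is easy to sketch in code. Here is a minimal fixed-bandwidth 1-D KDE over the same six data points; the bandwidth h = 1.0 is an assumed value for illustration, not the one used in the Wikipedia figure:

```python
import numpy as np

def gaussian_kde_1d(x_grid, data, h):
    """Fixed-bandwidth 1-D KDE: the average of Gaussian kernels
    of width h centered on each data point."""
    data = np.asarray(data, dtype=float)
    z = (x_grid[None, :] - data[:, None]) / h
    kernels = np.exp(-0.5 * z**2) / (h * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=0)

points = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]   # the six data points above
grid = np.linspace(-8.0, 12.0, 2001)
f = gaussian_kde_1d(grid, points, h=1.0)

# A valid density estimate integrates to ~1 over a wide enough grid.
area = f.sum() * (grid[1] - grid[0])
```

As in the Wikipedia figure, the resulting curve is bimodal, with one mode over the left-hand cluster of three points and another over the right-hand pair.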

In order to appreciate how the log-ratio transformation might “stretch out” the near-vertex, near-zero data in a triad, we’ll look first at a contrived example in the univariate (1-D) KDE plot. In the following schematic figure, there are four stacked number lines showing the same 6 data points as in the comparison plot from Wikipedia (above):

- The blue line shows the data points in the original configuration.
- The orange line expands *only* the interval (−1,+1) by 2X; that is, each point therein (including *x*_{3}) is now twice as far from the origin as in the original. All other points maintain the same relative spacing with respect to the nearer endpoint of the expanded interval and to any other point(s) on the same side of the origin.
- The gray line expands the (original) interval by 4X, otherwise as for 2X.
- The yellow line expands the (original) interval by 8X, otherwise as for 2X.

The four black squares show the migration of *x*_{3} to the left with each expansion step.
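This expansion scheme can be mimicked in a few lines. The sketch below assumes a Gaussian kernel with fixed bandwidth h = 1 (the bandwidth used for the actual figures is not stated):

```python
import numpy as np

def kde(x_grid, data, h=1.0):
    # Fixed-bandwidth Gaussian KDE (average of n kernels); h = 1 is an assumption
    z = (x_grid[None, :] - np.asarray(data, dtype=float)[:, None]) / h
    return (np.exp(-0.5 * z**2) / (h * np.sqrt(2.0 * np.pi))).mean(axis=0)

def expand_center(data, factor):
    """Stretch only the interval (-1, +1) by `factor`; points outside it
    keep their spacing relative to the nearer (moved) endpoint."""
    out = []
    for x in data:
        if -1.0 < x < 1.0:
            out.append(x * factor)          # e.g. x3 = -0.4 -> -3.2 at 8X
        elif x >= 1.0:
            out.append(x + (factor - 1.0))  # rigid shift to the right
        else:
            out.append(x - (factor - 1.0))  # rigid shift to the left
    return out

points = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]
grid = np.linspace(-20.0, 20.0, 2001)
f_orig = kde(grid, points)
f_8x = kde(grid, expand_center(points, 8.0))  # kernel width held constant
```

With the kernel width held constant, the three right-most kernels move rigidly, so their peak is unchanged, while the left-hand peak drops as *x*_{3} is pulled away from its neighbors.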

Here is the effect of these expansions on their respective KDE curves, with the kernel widths held constant; again, the four black squares mark successive locations of *x*_{3}:

There are several things to notice about the three expanded curves relative to the original:

- to the right of the origin, the main peak remains at the same height (0.076) because the three right-most kernels are fixed with respect to each other;
- a shoulder for *x*_{4} begins to emerge on the “expanded” curve as it separates from and loses the additive contribution from the kernel for *x*_{3};
- to the left of the origin, the height of the main peak drops as the central expansion pushes *x*_{1} and *x*_{2} away, effectively removing the kernel for *x*_{3} from the summation;
- a subtle shoulder due to the *x*_{3} kernel is visible on the orange (2X) curve, becoming more noticeable on the gray (4X) curve, and finally appearing as a distinct peak in the yellow (8X) curve, as the effect of the first two kernels on *x*_{3} diminishes; and
- the valley between the two peaks (modes) in the original deepens and bottoms out as **the expansion removes any contribution from the near-zero kernel**, or any of the others for that matter.

This last bullet highlights — ironically by the *absence* of a “signal” where the yellow curve bottoms out — that a univariate or bivariate kernel density estimate is usually a “local” construct. This situation changes dramatically as the number of dimensions goes up, for example, in machine-learning or other “big-data” situations. Here is an example (simplified and paraphrased from James et al., 2017, p. 108-9):

Imagine a data set with 100 points. In a univariate (1-D) analysis, that would be adequate to calculate an accurate KDE. If those same data were spread out over 20 dimensions, however, then any given observation would generally have no nearby neighbors, the so-called curse of dimensionality. A KDE can still be calculated, but now the bulk of the contributions will generally be from distant tails on the kernels, which plays havoc with the bias-variance trade-off (very roughly accuracy vs. precision) of the results.
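A quick simulation makes the point concrete; uniform random data are assumed here, but the qualitative result does not depend on that choice:

```python
import numpy as np

rng = np.random.default_rng(42)

def mean_nn_distance(n, d):
    """Average distance from each of n uniform points in the unit
    d-cube to its nearest neighbor."""
    pts = rng.random((n, d))
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)   # ignore self-distances
    return dist.min(axis=1).mean()

d1 = mean_nn_distance(100, 1)    # 100 points on a line: densely packed
d20 = mean_nn_distance(100, 20)  # the same count in 20-D: everyone is isolated
```

For a kernel width scaled to the 1-D spacing, essentially every 20-D observation sits far out on every other observation’s kernel tail.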

The contrived example (above) illustrates conceptually how the curse of dimensionality, less commonly referred to as the “empty space phenomenon” (Silverman, 1986; Verleysen, 2003), might manifest in 1-D. No artifice is needed to see the equivalent in 2-D, however, because **every time we use a log-ratio transformation on triad data we are unwittingly creating a scale expansion**. Metaphorically, this allows a “leakage” of empty space into the near-vertex, near-zero data, stretching them apart so that the KDE is largely determined by the tails on no-longer-local kernels.

We can get a good feel for the potential magnitude of this effect by a simple coordinate-mapping exercise, going from a ternary to a rectilinear LRT plot. Here is a triad, sub-divided into two parts: the cleverly-named “interior” and a surrounding “annulus,” separated by the 1% coordinate lines for the respective vertices:

In the triad coordinate frame, with a little help from Pythagoras, the base of the full triangle is 100 units; the height is 86.6 (= √3/2 x 100); and the area is 4330 units^{2} (= 1/2 x 100 x 86.6). The interior triangle has its vertices where the 1% lines intersect pairwise, at (98,1,1), (1,98,1), and (1,1,98), so its base is 97 units (= 98 – 1, not 98!); its height is 84.0 (= √3/2 x 97); and its area is 4074. Hence the area of the annulus is 256 (= 4330 – 4074). The annulus/total ratio is 0.0591; and the annulus/interior is 0.0628. We will use the latter below for a comparison with the analogous ratios (notice the *plural*) in the LRT plot.
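The arithmetic is compact enough to check in a few lines; the interior triangle is similar to the full one because its vertices come from intersecting the 1% lines pairwise:

```python
import math

side_full = 100.0
height_full = math.sqrt(3.0) / 2.0 * side_full      # 86.6
area_full = 0.5 * side_full * height_full           # 4330

# The 1% lines intersect pairwise at (98,1,1), (1,98,1), and (1,1,98),
# so the interior is a similar triangle of side 98 - 1 = 97 units.
side_int = 97.0
area_int = (side_int / side_full) ** 2 * area_full  # ~4074

area_annulus = area_full - area_int                 # ~256
ratio_total = area_annulus / area_full              # ~0.059
ratio_interior = area_annulus / area_int            # ~0.063
```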

Additionally, we need some points in the triad for the coordinate-mapping exercise. Here is a representative cluster near the B vertex:

A refresher: If you’ve never given much thought to how the shapes of data patterns can change between ternary and Cartesian plots — in this case between a triad and a log-ratio plot — the previous post in this series, Part VII: Mapping the Datasaurus Dozen, offers a visual primer of sorts. See especially the fourth and fifth figures down that page, the two adjacent multi-image panels. These are followed by a few paragraphs, under the “So what…?” heading, that summarize the workings of the log-ratio transformation.

A reminder: Now that we’re deep in the technical weeds, as I called them up in the main post, it’s worth recalling where we’re headed. What we’re after is a conceptual understanding of the mismatch between the location of story data in the triad T7 and the contours for their kernel density estimate. The hypothesis is that the log-ratio transformation, which is necessary to allow statistical analysis of the closed data from the triad, also introduces a distortion in near-vertex, near-zero data prior to the KDE calculation. We’re about to see how large that distortion, the stretching-out, can be.

Here are the near-vertex coordinate points for component B (above), mapped to a rectilinear grid with the isometric log-ratio (*ilr*) transformation (see Egozcue et al., 2003):

The colored hexagons correspond to the respective locations in the triad — (0,100,0), (1,98,1), and (5,90,5) — and similarly for the 1% A and 1% C coordinate lines. [4] The curve for 95% B is a schematic fit, but close enough to hint that the overall pattern of coordinate lines is more complicated than in a triad. The solid arrows look like the BA and BC legs of the ternary, but they actually end at the midpoints of their respective legs (see below).

There are two important things to notice about the overall nature of this graph. Firstly, the annulus (see the second-previous figure, above) is now evident as the area bounded on the outside by the BA and BC legs emanating from the blue hexagon and on the inside by the two short dashed segments pointing away from the green hexagon to the upper-right and straight down. (The large rhomboid between the green and blue hexagons is part of the dog-leg shape of the annulus in this part of the graph.)

Secondly, the label on the blue hexagon, “~100% B,” indicates that it is only an approximation, as are the BA and BC legs, because logarithmic terms cannot have zero values. Instead, there are a variety of ad hoc methods for handling zeroes in LRTs (Aitchison, 1986, ch. 11; van den Boogaart and Tolosana-Delgado, 2013, ch. 7). The actual coordinates for the blue hexagon are (0.01,99.98,0.01) and similarly for the BA and BC legs. The point at the intersection of the BA leg and the 1% A line, for example, is (0.99,99.00,0.01). The choice of whether to compensate for *c* being 0.01 in the *a* vs. *b* coordinate is part of the ad hoc nature of handling zeroes.

Said differently, in principle, the blue hexagon is only one of an unlimited number of “~100% B” points. The RGB-hexagon altitude could be extended another “decade” to the upper left to (0.001,99.998,0.001); and then another decade after that to (0.0001,99.9998,0.0001); and on and on.
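The decade-by-decade march of the “~100% B” point is easy to quantify. The sketch below uses one common choice of *ilr* basis (other software may use a rotated or reflected basis, but distances between points are unaffected):

```python
import math

def ilr(a, b, c):
    """Isometric log-ratio coordinates for a 3-part composition,
    using one standard orthonormal basis (Egozcue et al., 2003)."""
    z1 = math.log(a / b) / math.sqrt(2.0)
    z2 = math.log((a * b) / c ** 2) / math.sqrt(6.0)
    return z1, z2

# "~100% B" with the zeroes replaced by successive decades
p1 = ilr(0.01,   99.98,   0.01)
p2 = ilr(0.001,  99.998,  0.001)
p3 = ilr(0.0001, 99.9998, 0.0001)

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

step1 = dist(p1, p2)   # each decade of zero replacement moves the point
step2 = dist(p2, p3)   # outward by a nearly constant ilr distance (~1.88)
```

Points that are indistinguishable in the triad thus keep marching outward in the *ilr* frame, one nearly equal step per decade, with no limit.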

We’re finally in a position to look at this *ilr*-transformation plot with the triad in its entirety (up to an arbitrary cut-off of the expansion). Here it is, with minimal labeling in the interest of reducing clutter:

- The color coding is A = red; B = blue; and C = green.
- The origin of the *ilr* plot (0,0) corresponds to the centroid in the ternary (33+,33+,33+).
- With a little study, the points and lines described in the previous figure for B should be identifiable for the other two vertices. The (+) signs mark the points connected by the 95%-B dashed line.
- The four solid line segments approximately locate the 1%-boundary between the interior and annulus in the triad. Recall that the enclosed interior is about 95% of the total area in standard ternary coordinates.
- The dashed line segments approximately locate the outside of the annular bands, with zeroes replaced successively by 0.01%, 0.001%, and 0.0001%. The straight-line approximation on the outer two bands underestimates the area of each annulus by a few percent.

I used the fabulous SketchAndCalc to measure the areas of each of the four enclosed shapes on a calibrated version of the image. The area of each annulus is then calculated by subtracting out the area of the interior teardrop, and the ratio of annulus/interior is calculated for comparison with that ratio in a ternary. Recall that we found above that the latter was 0.0628. As the *ilr* plot clearly shows, each of the three annular shapes is larger than the interior. In fact, the annulus/interior ratios moving outward are 3.021, 5.462, and 8.440. And therefore the respective area “stretching” factors are very large: 48.1 (= 3.021/0.0628), 87.0, and 134.

Here are those “stretching” factors displayed as a function of the zero-replacement value for each annulus:

Using more coordinate points for the outer annular bands in the *ilr* plot would refine my straight-line approximation and perhaps improve the stretching-factor curve. Still, I’m happy with an R of almost three-nines.

Now we can connect the conceptual dots and see that the hypothesis we’ve been pursuing is not only plausible but compatible with both story collection and resulting analysis:

- Each of these decade expansions increases the area of the annulus as the BA and BC legs necessarily migrate out with it.
- Placement of signifier points by touching a screen or clicking a mouse or transcribing a pencil mark is relatively imprecise, but more importantly highly non-linear. Two story points that look “adjacent” to a casual observer may become “widely-separated” in the subsequent log-ratio transformation.
- Overall, the near-vertex, near-zero story points stretch out with the coordinate frame. (Conversely, though it wasn’t explicit above, near-centroid points will be squeezed together.)
- The usual fixed-width kernels get farther and farther apart in the annulus. When they are summed, the kernel density estimate worsens — the equivalent of the yellow (8X) curve bottoming-out in the histogram example (above).
- Thus the contours of the probability density function don’t play nicely with the story points.
- Q.E.D. Not QED Insight, just Q.E.D.

The solution is to use variable kernel density estimation (footnotes added):

Using a fixed filter width may mean that in regions of low density, all samples will fall in the tails of the filter with very low weighting, while regions of high density will find an excessive number of samples in the central region with weighting close to unity. To fix this problem, we vary the width of the kernel in different regions of the sample space. There are two methods of doing this: balloon and pointwise estimation. In a balloon estimator, the kernel width is varied depending on the location of the test point. [5] In a pointwise estimator, the kernel width is varied depending on the location of the sample. [6]

These two estimators are also referred to as “locally-adaptive,” which shows their potential utility for the triad-to-LRT issue. [7] They are discussed in a number of sources, including Jones (1990) and Terrell and Scott (1992). Sain (2002) is particularly clear and has helpful illustrations from both real and simulated datasets. He also discusses the difficulties in actually implementing these estimators, including with respect to the bias-variance trade-off.
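For the curious, here is a minimal 1-D sketch of the pointwise (sample-adaptive) idea, following the pilot-estimate recipe described by Silverman (1986). The bandwidth h0 and sensitivity alpha are assumed illustrative values, not anything drawn from ggtern or MASS:

```python
import numpy as np

def adaptive_kde(x_grid, data, h0=1.0, alpha=0.5):
    """Pointwise adaptive KDE: a fixed-width pilot estimate assigns each
    sample its own bandwidth, wider where the pilot density is low."""
    data = np.asarray(data, dtype=float)

    def fixed_kde(x, h):
        z = (x[None, :] - data[:, None]) / h
        return (np.exp(-0.5 * z ** 2) / (h * np.sqrt(2.0 * np.pi))).mean(axis=0)

    pilot = fixed_kde(data, h0)            # pilot density at each sample
    g = np.exp(np.log(pilot).mean())       # geometric mean of pilot values
    h_i = h0 * (pilot / g) ** (-alpha)     # per-sample local bandwidths

    z = (x_grid[None, :] - data[:, None]) / h_i[:, None]
    k = np.exp(-0.5 * z ** 2) / (h_i[:, None] * np.sqrt(2.0 * np.pi))
    return k.mean(axis=0)

points = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]
grid = np.linspace(-30.0, 40.0, 7001)
f_adapt = adaptive_kde(grid, points)
```

Isolated samples (such as *x*_{4}) receive wider kernels, so sparse regions are smoothed deliberately rather than being left to the tails of distant fixed-width kernels.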

I suspect that these practical difficulties are the basis for two other things I have noticed in learning about KDE and especially variable/locally-adaptive methods. Firstly, it is not easy to find more recent literature on the development and extension of the subject, as opposed to applications. [8] I have not looked at Scott (2015), the second edition of one of the three major monographs, so I may have simply missed the latest sources. Nonetheless, it is a bit odd that Terrell and Scott (1992) from 25 years ago is the most recent methodological citation in the Wikipedia article (link above).

Secondly, in the world of R, it appears that kde2d, the KDE function from the MASS package, does not include any variable-kernel capability. Hence, ggtern, the package of choice for producing triads in SenseMaker projects, *cannot* invoke it (although I am awaiting confirmation of this from Nicholas Hamilton). In fact, a Google-based search of the r-project.org site reveals only a few packages (such as ‘kader’, ‘np’, and ‘sparr’) that include any kind of multivariate adaptive-estimator or variable-bandwidth capability. Based on my reading of the descriptions, it is not obvious that they are a useful resource.

Instead, at least for the near term, **ternary heat maps are now available in ggtern**, brought to you by the aforementioned Nicholas Hamilton (and partly supported by QEDInsight), with your choice of triangular or hexagonal bins. Some SenseMaker examples are shown in Part VIIIb.

Non-SenseMaker users of ternary graphs who routinely calculate KDE-PDF contours, whether with an R package such as ggtern or with home-grown code, could be forgiven for asking why no one else has remarked on the problem discussed in this post. Great question. Why indeed? Here are three possibilities:

Firstly, people are just plotting the contours and assuming they’re right, with little to no comparison back to the original data. While my cynical side might be happy with that explanation, I also know that most people who spend the time and effort required to produce these graphs are very invested in them. More to the point, we now know of at least seven people working with story data who have noticed the mismatch in multiple projects. No offense to these people (or ourselves), but SenseMaker practitioners have not cornered the market on careful inspection of data. Almost everyone does that, so the answer must lie somewhere else.

Secondly, three members of the group of Spanish mathematicians that has extended Aitchison’s original work on compositional analysis over the past 20 years have looked at kernel density estimates in ternary and rectilinear diagrams. Martín-Fernández et al. (2007) used a dataset of basaltic lavas from Aitchison (1986), which shows the type of centered, arcuate pattern that we first encountered here in Part II: Log-Ratio Transformation. Subsequently, they did extensive simulations with a group of 12 test densities to compare several kernel estimators, using standard mean integrated square error measures.

Here are the 12 test densities from Chacón (2009) in rectilinear coordinates:

And here are the same 12 test densities from Chacón et al. (2008) in ternary coordinates (inverse-transformed from the square grid):

These two sets of contours are not explicitly from the same sample set for each density, but Chacón et al. (2008) described them as “closely related.” The methodology is certainly close enough for the purposes of this post.

Note in particular that all 12 densities in the grid plots are (1) centered on the origin, which we saw above corresponds to the centroid of the triad; and (2) fall within, generally well within, the range of (-3,+3) on each of the two axes. Compare this to the range of (-12,+12) in the “stretched” *ilr*-transformed figure above. As constructed, these simulated patterns would fall within the solid-line-bounded “teardrop” in our stretched plot. Thus, it is no surprise that the contours for the 12 test densities almost completely miss the “0-1%” annulus in the ternary plots. Said differently, if you set out to *avoid* having a near-vertex, near-zero stretching effect in ternary data, it’s hard to imagine a better demonstration of how and why it could go unnoticed.

Thirdly, there is the flip side of the previous point. A triad in a SenseMaker project is not only a data display tool, but it is also a data collection tool. Efforts to balance the signifiers at all three vertices and to avoid influencing respondents notwithstanding, every project and triad has finger touches and cursor clicks on edges and corners. By design, storytellers are allowed to choose near-vertex, near-zero placements, in some sense encouraged to do so by their view of the labels. If you think about the larger universe of data presentation in ternary diagrams, this is an unusual situation, one very unlikely to be encountered, I suspect, by a metallurgist or geologist or ecologist or geneticist.

Aitchison, J. (1986, reprinted 2003) *The Statistical Analysis of Compositional Data*. The Blackburn Press, Caldwell NJ. 416 pp. plus additional material.

Chacón, J.E. (2009) Data-driven choice of the smoothing parametrization for kernel density estimators. *The Canadian Journal of Statistics*, v. 37, p. 249-265.

Chacón, J.E., Martín-Fernández, J.A., and Mateu-Figueras, G. (2008) A comparison of the alr and ilr transformation for kernel density estimation of compositional data. CoDaWork’08, 3rd Compositional Data Analysis Workshop, Girona (Spain), May 2008. Retrieved February 2018 from https://dugi-doc.udg.edu/bitstream/handle/10256/724/Chacon.pdf

Duong, T., and Hazelton, M.L. (2003) Plug-in bandwidth matrices for bivariate kernel density estimation. *Nonparametric Statistics*, v. 15, p. 17-30.

Duong, T., and Hazelton, M.L. (2005) Cross-validation bandwidth matrices for multivariate kernel density estimation. *Scandinavian Journal of Statistics*, v. 32, p. 485-506.

Egozcue, J.J., Pawlowsky-Glahn, V., Mateu-Figueras, G., and Barceló-Vidal, C. (2003) Isometric logratio transformation for compositional data analysis. *Mathematical Geology*, v. 35, p. 279-300.

Hastie, T., Tibshirani, R., and Friedman, J. (2016, corrected 12th printing 2017) *The Elements of Statistical Learning*, second edition. Springer, New York. 745 pp.

James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013, corrected 8th printing 2017) *An Introduction to Statistical Learning with Applications in R*. Springer, New York. 440 pp.

Jones, M.C. (1990) Variable kernel density estimates and variable kernel density estimates [sic]. *Australian Journal of Statistics*, v. 32, p. 361-371.

Martín-Fernández, J.A., Chacón-Durán, J.E., and Mateu-Figueras, G. (2007) Updating on the kernel density estimation for compositional data, pp. 713-720 *in* Rizzi, A., and Vichi, M. eds., COMPSTAT 2006 – Proceedings in Computational Statistics, Rome. Springer Physica-Verlag, Heidelberg.

Sain, S.R. (2002) Multivariate locally adaptive density estimation. *Computational Statistics & Data Analysis*, v. 39, p. 165-186.

Scott, D.W. (2015) *Multivariate Density Estimation*, second edition. Wiley, Hoboken NJ. 384 pp.

Shimazaki, H. and Shinomoto, S. (2010) Kernel bandwidth optimization in spike rate estimation. *Journal of Computational Neuroscience*, v. 29, p. 171-182.

Silverman, B.W. (1986, reprinted 1998) *Density Estimation for Statistics and Data Analysis*. Chapman & Hall/CRC, Boca Raton. 175 pp.

Terrell, G.R., and Scott, D.W. (1992) Variable kernel density estimation. *The Annals of Statistics*, v. 20, p. 1236-1265.

van den Boogaart, K.G., and Tolosana-Delgado, R. (2013) *Analyzing Compositional Data with R.* Springer-Verlag, Berlin. 258 pp.

Verleysen, M. (2003) Learning high-dimensional data, pp. 141-162 *in* Ablameyko, S., et al., eds., *Limitations and Future Trends in Neural Computation*, NATO Science Series, III: Computer and Systems Sciences, v. 186. IOS Press, Amsterdam.

- By Drleft at English Wikipedia, CC BY-SA 3.0, Link ^
- This description says only that “the kernels are summed,” but the formal definition is that the estimated density function *f*(*x*_{o}) at target point *x*_{o} is the average of the *n* kernels contributing at that location. It is common practice, however, to plot *nf*(*x*_{o}) in the interest of clarity (for example, Jones, 1990, and Hastie et al., 2009), and I have done that here. ^
- In Annie Hall, Woody Allen’s childhood surrogate, Little Alvy, is suffering from existential angst upon learning that the universe is expanding. He concludes that life is pointless and, perhaps a bit disingenuously, stops doing his homework. To which his mother responds: “What has the universe got to do with it? You’re here in Brooklyn. Brooklyn is not expanding!” ^
- These two lines are correctly labelled. Even though A and C are at the lower-left and lower-right vertices, respectively, of the triad, the LRT reverses them and simultaneously rotates the triad roughly 60° CCW. This is well-known behavior for the LRTs, which is easier to see with the *alr* transformations in Part VII: Mapping the Datasaurus Dozen. ^
- A test point is also known as an estimation point or query point or target point. It is a location on the number line or computational grid where the sum/average of the PDF is calculated. All those colored dots in the 2X/4X/8X expanded curves are calculated at target points. ^
- A sample location is also known as a data point or an observation. The six kernels (red dashed curves) in the univariate histogram-KDE comparative plot are centered on the six sample locations. ^
- Martín-Fernández et al. (2007) illustrate the use of bivariate kernel estimators from Duong and Hazelton (2003, 2005) with compositional data in ternaries. These plug-in and cross-validation methods are full-bandwidth-matrix extensions that can accommodate not only axis-parallel and diagonal datasets, but more general and multi-modal distributions. They are data-driven and may start from pilot bandwidths, but they are not locally-adaptive and thus do not address the triad-to-LRT stretching issue. ^
- But see Shimazaki and Shinomoto (2010), which is both developmental and applied. This paper looks at timeline-based neurological spikes. The conceptual parallel is so strong, however, that it is hard to believe their method for selecting bandwidth “locally in time, assuming a non-stationary rate modulation” can’t be ported over to more conventional KDE analyses. ^

In an earlier post in this series, Part III: Random Data, I showed an example of a SenseMaker triad, with data clustered near vertices, along edges, and in the center; most participants used one of those seven locations to signify their stories, weighted toward one, two, or all three corners, respectively. I also showed a ternary with 500 random points. Here they are side-by-side:

My plan was to write another post about how an analyst or subject-matter expert might deal with such a “spectrum,” ranging between one end-member with well-defined, highly-aggregated data and another with random-looking, highly-scattered data. Surely those two poles would cover the triadic universe, right? That plan was sidetracked, however, when I recognized the possibility of aberrant cases, ones that are unlikely to arise with story data, but which might nonetheless provide some insight. So *this* post is about how to derive those cases and about the esoteric lessons therein.

The trigger for this change of plans was ‘The Datasaurus Dozen’ and its inspiration, ‘Datasaurus’, which I only learned of belatedly. The latter was introduced a year ago by Alberto Cairo (U. of Miami) in this tweet:

Don’t trust summary statistics. Always visualize your data first https://t.co/63RxirsTuY pic.twitter.com/5j94Dw9UAf

— Alberto Cairo (@albertocairo)

The eternal point is clear and succinct: Don’t just calculate, plot!

Clarity and brevity notwithstanding, Justin Matejka and George Fitzmaurice of Autodesk Research decided to reinforce the message. They created an additional twelve (x,y) datasets, The Datasaurus Dozen, with the same summary statistics – arithmetic means, standard deviations, and correlation coefficient, all identical to two decimal places – but visually distinct graphical patterns. There are horizontal, vertical, and diagonal parallel lines; fuzzier horizontal and vertical swaths; a grid and a blob of points; a big “X”; a five-pointed star; and single and double circles. There were no other life forms though, either extant or extinct.

If you’re more of a viewer than reader, you can watch this video and then scroll down to the next heading without missing any essentials. Or you can read on, ± watching.

This project is described in a research news article, Same Stats, Different Graphs. Not surprisingly for the Autodesk site, there are excellent graphics, including several animated gifs. Matejka and Fitzmaurice acknowledged and augmented the ancestor to all such constructs, Anscombe’s quartet, four datasets published in 1973 by Yale statistician F.J. Anscombe, who exhorted his contemporaries with the same message: Don’t just calculate, plot!

There are also links in the news article to the technical paper that Matejka and Fitzmaurice presented at an ACM conference, under the same title, in which they detailed the construction of The Datasaurus Dozen datasets. Here is their description of the novelty in their approach:

The key insight behind our approach is that while generating a dataset from scratch to have particular statistical properties is relatively difficult, it is relatively easy to take an existing dataset, modify it slightly, and maintain (nearly) the same statistical properties. With repetition, this process creates a dataset with a different visual appearance from the original, while maintaining the same statistical properties. Further, if the modifications to the dataset are biased to move the points towards a particular goal, the resulting graph can be directed towards a particular visual appearance.
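To make the quoted idea concrete, here is a minimal, hypothetical sketch of the perturb-and-check loop. It is not Matejka and Fitzmaurice's actual code (they used a simulated-annealing procedure with target shapes), and for brevity it pins only the means and standard deviations, not the correlation coefficient; the function names and tolerances are my own:

```python
import random
import statistics

def stats(pts):
    """Summary statistics to be held (nearly) constant: x/y means and stdevs."""
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (statistics.mean(xs), statistics.mean(ys),
            statistics.pstdev(xs), statistics.pstdev(ys))

def perturb_toward(pts, target, step=0.1, tol=0.01, iters=20000):
    """Repeatedly nudge a random point toward `target`, rejecting any move
    that lets the summary statistics drift more than `tol` from the originals."""
    ref = stats(pts)
    pts = [list(p) for p in pts]
    tx, ty = target
    for _ in range(iters):
        i = random.randrange(len(pts))
        old = pts[i][:]
        pts[i][0] += step if tx > pts[i][0] else -step  # biased move in x
        pts[i][1] += step if ty > pts[i][1] else -step  # biased move in y
        if any(abs(a - b) > tol for a, b in zip(stats(pts), ref)):
            pts[i] = old  # reject: statistics would no longer match
    return [tuple(p) for p in pts]
```

Run long enough, the cloud migrates toward the target while the pinned statistics barely move; the real method adds a shape-distance criterion and an annealing schedule so the points settle onto a chosen outline.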

In the most prominent of their results (below), Datasaurus is the seed dataset from which The Dozen were created:

My immediate question when I saw these patterns was, what do they look like mapped into a triad? Here is the immediate answer for The Dozen (arrayed identically to the preceding figure; clickable for a much larger image):

And here is a side-by-side comparison of the rectilinear and ternary versions of Datasaurus:

The dotted green and yellow lines connect six equivalent points and clarify that the ternary Datasaurus is facing right, that is, the vertical join between the two plots is a mirror plane of sorts. Details of the reason for a “re-scaled” plot and the construction of the triad from it are given in the Appendix.

Mind you, I would never expect to see story dots forming even a star or a circle, let alone a dinosaur. When some other parameter, such as time or reward structure, is an independent (controlled) variable in a story-collection process, however, then unusual data structures may provide guidance in interpretation and, perhaps surprisingly, in prediction. But first a brief recap of the back-and-forth of the relevant transformations.

Part II: Log-Ratio Transformation in this series gave a foretaste of the relevance of looking at SenseMaker data in both ternary and rectilinear coordinates. As that post discussed, the fact that we can present summary statistics for a triad — to say nothing of more advanced metrics like kernel density estimates (see Part VIIIa) — is due to the methodology created by statistician John Aitchison (see References in Part II). In a nutshell, constant-sum ternary coordinates are transformed to open-ended (x,y) coordinates in a log-ratio space where standard statistical calculations can be performed reliably; and the results are then inverse-transformed back to the ternary. Part IV: Confidence Regions is an outcome of just such a procedure.
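As a minimal sketch of that round trip, here is the *alr* transformation and its inverse, using the third component as the divisor. The function names are my own, and this is only the simplest of the three log-ratio transformations discussed below:

```python
import math

def alr(a, b, c):
    """Additive log-ratio: closed composition (a, b, c) -> open (x, y),
    with c as the divisor."""
    return math.log(a / c), math.log(b / c)

def alr_inv(x, y):
    """Inverse alr: open (x, y) -> composition summing to 1."""
    ea, eb = math.exp(x), math.exp(y)
    s = ea + eb + 1.0
    return ea / s, eb / s, 1.0 / s
```

Statistics (means, covariances, confidence regions) are computed on the open (x, y) values, and only the finished results are pushed back through `alr_inv` for display in the triad.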

There are three log-ratio transformations in common usage, the additive (*alr*), centered (*clr*), and isometric (*ilr*). The first two were developed by Aitchison and the third by Vera Pawlowsky-Glahn and her collaborators (see References and Additional Readings in Part II). There is a clear and equation-free, though still highly mathematical, discussion of the pros and cons of each in the introduction of Egozcue et al. (2003), in which they first introduced the *ilr* transformation:

[It] is called isometric because it allows us to associate angles and distances in the [triad] to angles and distances in [the transformed rectilinear plot], where we feel more comfortable from an intuitive point of view. This is of particular interest with respect to concepts of orthogonality.

Here is an example from their paper of parallel (solid) and orthogonal (dashed) lines transformed between the two coordinate systems:

Note that The Datasaurus Dozen pair labelled “x_shape” (3rd row, 3rd column in each panel, above) shows exactly this behavior, but for two intersecting lines [1], rather than two parallel ones.

Egozcue et al. refer to the solid lines as “compositional processes” and cite bacterial growth, radioactive decay, and sedimentary deposition as natural examples of these patterns. In fact, as the names imply, the solid curves/lines are *parametric in time*, which varies independently along them, even though it does not appear as an explicit variable on any axis or vertex. In the research literature of the respective disciplines — biology, physics, geology — these processes are more likely to be shown in a log-ratio plot than a ternary.

On the other hand, here are three examples of parameterization in which a ternary is the more common visualization choice because the emphasis is on multi-component compositional change (rather than increase/decrease of a single “species” as in Egozcue et al.’s examples):

- parametric in time (t): archeology, e.g., changing historical composition of tools or pottery shards as the “technology” of an era evolves or source areas and trade routes come in and out of favor;
- parametric in temperature (T): metallurgy, e.g., changing equilibrium composition of alloys or solid solutions as temperature decreases; and
- parametric in both t & T: geology, e.g., changing lava/magma composition during flow differentiation or fractional crystallization (see the example from Aitchison in Part II of this series, although it used the *alr* transformation, not the *ilr*).

There are at least two categories of SenseMaker projects that *could* show parameterized data. Firstly, punctuated or continuous capture of stories is inherently time-parametric. There would be no guarantee that signified results would actually change over time, but looking for such change is a primary motivation for the approach. It is also a means of testing whether adjustments of extrinsic factors, including safe-to-fail experiments, had detectable effects.

Secondly, the instrument itself might be parameterized as a means of testing some aspect of the methodology. Imagine a large, homogeneous population of respondents, subsets of whom were presented with different versions of a prompting question, yet attached to the same labelled triad. If those versions were designed to fall along some “spectrum,” then it would be interesting to see if there was a corresponding array of “compositions.”

As the first figure in this post reminds us, story data in a real SenseMaker triad are likely to be very blobby. Rarely will there be precise patterns that would warm the heart of a mathematician. The potential for parameterization — in time, reward structure, or some other study variable, or in the instrument itself — indicates, however, that *the data could act as a directional pointer*. Consideration of both ternary and rectilinear patterns, as in the pair of graphs immediately above, could suggest optimal pathways [2] along which respondents might move (or be moved) due to future interventions. Given the nature of complex problems and the inherent messiness of human nature, this would be, at best, a second-order effect. It would imply nonetheless a minor addendum to the continuing lesson of Datasaurus and The Datasaurus Dozen, offered here in the spirit of xkcd [3]:

Imagine that the data in a rectilinear plot, say, Datasaurus, are the result of a log-ratio transformation from some unseen ternary plot. The latter can be recovered by applying an inverse transformation to the (x,y) data, which I did subject to the following qualifications and comments:

- The simplest approach was to use the *alr* transformation because it directly yields an increase from D-1 to D components (2 to 3 in this case). Additionally, my motivation was only to recover the general pattern of points, not to preserve metrical distances.
- The axes on the original plots were scaled from 0 to 100. That range is unrealistic for the log-ratio data that the Aitchison methodology anticipates. For *alr*, the data transform from a closed triad with non-zero, constant-sum (100%) coordinates to two independent, open-ended variables, which in practice both fall in the range -5 to +5. Consequently, to simulate the log-ratio ranges, I re-scaled the original data in Matejka and Fitzmaurice’s CSV file so that the new results satisfied (x′, y′) = (0.1x − 5, 0.1y − 5). Here is a side-by-side comparison of Datasaurus from their file and from my re-scaled data, showing that they are essentially identical; the orange data point is the *arithmetic* mean, which transforms to the *geometric* mean in a triad (see Part I: Geometric Mean):

- As I mentioned in footnote [1], the original graphs had differing metrics on the two axes, resulting in rectangular rather than square gridlines. I matched the two metrics for the original “dino” image (above), which I replicated from the CSV file, so that there could be no doubt of fidelity between original and re-scaled images. I did the same for all of The Datasaurus Dozen as well, prior to the inverse transformation. *The equal-metric originals from the CSV file and my re-scaled images are as identical as the Datasaurus pair* (above). Here is the array for The Dozen, re-scaled (clickable for a much larger image):

- Comparing this array (just above) to the original one (and making allowance for the “flattening” of the latter) shows that there are small but noticeable differences in the patterns. Note especially “slant_down” (Row1,Col2), “slant_up” (R1,C3), “wide_lines” (R2,C1), and “high_lines” (R2,C2). In the first pair, the number of well-defined lines differs between the two sets; and in the second pair, the dispersion of each swath of points differs substantially. These differences could not have resulted from the re-scaling (see previous bullet). My guess is that the data in Figure 2 of their research news article and in the downloadable CSV file are simply from different simulation runs with the same summary statistics. (In fact, Justin Matejka confirmed my guess in an email exchange on August 21.)
- Finally, I converted the re-scaled (x′, y′) data to triad coordinates by the inverse *alr* transformation [4], using the form equivalent to eqs. (4)-(6) in Part II. The image array is shown above at Graphical Results.
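The whole appendix recipe, re-scaling followed by the inverse *alr*, can be sketched in a few lines. The function names are mine; the scaling follows the (x′, y′) = (0.1x − 5, 0.1y − 5) recipe above, and the inverse is the standard one, not the exact code I used:

```python
import math

def rescale(x, y):
    """Map the 0-100 plot range into a plausible log-ratio range (-5, +5)."""
    return 0.1 * x - 5.0, 0.1 * y - 5.0

def alr_inverse(xp, yp):
    """Inverse additive log-ratio: (x', y') -> ternary (A, B, C) in percent."""
    ea, eb = math.exp(xp), math.exp(yp)
    s = ea + eb + 1.0
    return 100.0 * ea / s, 100.0 * eb / s, 100.0 / s

def to_ternary(points):
    """Apply the full pipeline to a list of (x, y) pairs from the CSV data."""
    return [alr_inverse(*rescale(x, y)) for x, y in points]
```

By construction, every output triple sums to 100%, which is the closure constraint that makes the result plottable in a triad.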

Egozcue, J.J., Pawlowsky-Glahn, V., Mateu-Figueras, G., and Barcelo-Vidal, C. (2003) Isometric logratio transformation for compositional data analysis. *Mathematical Geology*, v. 35, p. 279-300.

Martín-Fernández, J.A., Olea-Meneses, R.A., and Pawlowsky-Glahn, V. (2001) Criteria to compare estimation methods of regionalized compositions. *Mathematical Geology*, v. 33, p. 889-909.

Pawlowsky, V. (1989) Cokriging of regionalized compositions. *Mathematical Geology*, v. 21, p. 513-521.

- Datasaurus and The Dozen were presented in graphs with differing (x,y) metrics, resulting in rectangular rather than square gridlines. One result is that the circle and bullseye *appear* elliptical rather than, well, circular. With the benefit of hindsight from Egozcue et al.’s discussion, however, the ternary for “x_shape” is consistent with the two lines being non-perpendicular; this is also confirmed by the re-scaled plots discussed in the Appendix. ^
- “Optimal pathways” is my shorthand for Egozcue et al.’s discussion of Hilbert space, geodesics, and Aitchison distances. Suffice it to say that the graphical patterns of data might point to the “shortest” way to get people to change their stories and move their signifiers. ^
- I am grateful to Randall Munroe for rendering the body and head of Rexthor and his dog as triads (albeit non-equilateral ones). Now if only the AKC would recognize the Directional Pointer. ^
- Applying the *alr* transformation to data in a triad requires that one of the three variables be chosen as the divisor of the other two in calculating the log ratios. Not surprisingly, this well-known asymmetry can lead to different results in the log-ratio graph for each divisor. If you’re more comfortable looking at your data in a rectilinear plot than in a ternary, this is probably pretty unsettling. But if you stick with it, something surprising emerges — any computational results that are inverse-transformed back to the originating triad are the same, regardless of the divisor.
- In math-speak, *alr* inverse-transformed values are “invariant under permutation.” It took Pawlowsky (1989) two-plus pages of moderately dense linear algebra to prove this. Thankfully, she re-stated it in prose a dozen years later (Martín-Fernández et al., 2001): “…one important property of the alr transformation is the independence of results from the selection of the denominator after the [inverse] transformation has been applied.”
- I bring this up in the interest of full disclosure. I have indeed inverse-transformed some data, but from a *supposed log-ratio plot* to a *previously non-existent ternary*. Said differently, I have done at best only the inverse half of what Pawlowsky discussed, and therefore I can’t say that my half by itself is *provably* permutation-invariant.
- What I can say, however, is that some simple algebra with the logarithmic and exponential terms of the two transformations suggests that changing the divisor is simply a matter of arbitrarily choosing how to map the two log-ratio coordinates onto the three triad coordinates. The pattern of data points should remain unchanged, but it will be rotated 120° inside the triangle. In the case of the motivating image for all this, it should still be the same Datasaurus, but now face-down in one corner or lying on his back in the other. Here is the experimental confirmation (original on left, face-down on right): ^

Unit of observation, however, has a somewhat technical meaning, in both the natural and social sciences. Here is an excerpt from the (already) brief entry in Wikipedia (linked above):

In statistics, a unit of observation is the unit described by the data that one analyzes…. A study may have a differing unit of observation and unit of analysis: for example, in community research, the research design may collect data at the individual level of observation but the level of analysis might be at the neighborhood level, drawing conclusions on neighborhood characteristics from data collected from individuals. Together, the unit of observation and the level of analysis define the population of a research enterprise.

How should we view stories, signifiers, data points in triads, and even the study participants themselves through these observation-colored glasses? I’m going to answer *this* question by looking again at the geological precedents that I discussed in the prior Confidence Regions post. Then I’ll come back to that opening question.

The importance of the distinction between the levels of observation and analysis was very evident in the paper by Weltje (2002) from which I took a geological example to illustrate right and wrong ways to estimate confidence regions in a triad. The primary *data* in his example came from a method called “point counting.”

A thin slice cut from a hand specimen (left, above) is mounted on a glass slide (right) and examined under a microscope with a precision stage that allows the thin section to be moved in fixed increments. As the slide is viewed grid-wise in both x- and y-traverses, the mineral grains that successively appear under the cross-hairs in the center of the field of view are identified and recorded. Depending on the size of the individual grains, several hundred such points would be counted.

The resulting data – n₁ grains of mineral 1; n₂ of mineral 2; and so on for typically 6 or 8 abundant minerals (or identifiable rock fragments), plus a catch-all “other” – are *categorical*, that is, they fall into mutually-exclusive descriptive bins without regard to any natural ordering.[1] These numbers can be converted to *compositional* data by dividing each individual count nᵢ by the sum of all counts, and multiplying by 100 to get percentages. In order to show them in a triad, as in Weltje’s figures, any three can be re-normalized to 100%.
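The count-to-composition bookkeeping is simple enough to sketch directly; the mineral names here are hypothetical placeholders:

```python
def to_percent(counts):
    """Convert raw point counts {category: n} to percentages of the total."""
    total = sum(counts.values())
    return {m: 100.0 * n / total for m, n in counts.items()}

def renormalize(percents, keys):
    """Re-close any three (or more) components to 100% for plotting in a triad."""
    subtotal = sum(percents[k] for k in keys)
    return {k: 100.0 * percents[k] / subtotal for k in keys}
```

The second step is where the closure constraint enters: once three components are re-normalized to 100%, they are no longer independent, which is exactly why the log-ratio machinery above is needed for any statistics.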

As in the quote (above) from Wikipedia, there are several levels at which a geologist might analyze such data. Weltje treats two of these levels. His Model A is “the grain as unit of observation.” At this level we can ask how uncertain the composition of the hand specimen is. Given reasonable and testable assumptions of specimen homogeneity and stochastic independence of adjacent grains, point counting can be viewed as a form of Bernoulli sampling. As Weltje discussed in mathematical detail, this allows calculation of a confidence region for the composition of a single hand specimen from a multinomial generalization of the common chi-squared test. (A more accessible discussion is given by Xu et al. (2010), using data from experimental biology.)
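For the flavor of the arithmetic behind Model A, here is a sketch of simultaneous confidence intervals for multinomial point-count proportions. I use a standard textbook form (often attributed to Goodman, 1965) with a Bonferroni-adjusted chi-squared quantile; this is in the spirit of, but not identical to, Weltje's derivation, and the function name is mine:

```python
from statistics import NormalDist

def goodman_intervals(counts, alpha=0.05):
    """Simultaneous (1 - alpha) confidence intervals for the proportions
    underlying multinomial counts, Goodman-style with Bonferroni adjustment."""
    n = sum(counts)
    k = len(counts)
    # chi-squared quantile with 1 df, via the squared normal quantile
    z = NormalDist().inv_cdf(1 - alpha / (2 * k))
    a = z * z
    intervals = []
    for ni in counts:
        half = (a * (a + 4 * ni * (n - ni) / n)) ** 0.5
        denom = 2 * (n + a)
        intervals.append(((a + 2 * ni - half) / denom,
                          (a + 2 * ni + half) / denom))
    return intervals
```

Each interval brackets its observed proportion, and the set of intervals jointly defines a rectangular (pre-closure) analogue of the confidence region that would be drawn around a single composition point in the ternary.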

Hence, multiple hand specimens can be compared in a ternary plot – each with its own confidence ellipse – and homogeneity assessed at various spatial scales, such as serial sawed slices, specimens along a single roadcut, or regional-scale sampling of a mappable unit (“formation”). Thus, Weltje’s grain-as-unit-of-observation Model A displays a confidence region around each data point in the ternary. In other words, *the data from all observed grains in a single thin section are aggregated into a single point*, which is then the focus of statistical analysis.

His Model B is “the composition as unit of observation.” This is the level of data analysis that I illustrated in Part IV of this series, with the two-panel figure comparing the non-rigorous “hexagonal fields of variation” and the formal confidence ellipses for both the total population and the geometric mean of the population. In this model, *the data from all observed compositions for all hand specimens are aggregated into a single point* (the mean), which is then the focus of statistical analysis.

Imagine a criminology grad student who would like to interview witnesses of a particular event and ask them to recount what they saw and to signify their stories. With a prompting question and signifier designed to elicit a *normative* response, that would be a way to test the “homogeneity” — consistency, accuracy, veracity — of the witnesses and the student’s methodology. This is a scenario in which the first model (A) looks superficially useful, one in which the student would like to calculate a confidence ellipse for each witness (data point) in a triad. But it is inconsistent with the kind of undirected discovery for which SenseMaker is designed. More importantly, there would be no categorical data, no “point counts” to use in the chi-squared calculations, because the witnesses’ marks in the triad would be compositional from the outset. So, instead of trying to force Bernoulli sampling and a multinomial distribution on some unsuspecting data, I’m going to proceed by semantic rather than mathematical manipulation.

Here are some commonalities or pairings in the geological vs. sensemaking approaches to displaying results in a triad, arranged from coarsest to finest:

• **an outcrop/layer/formation = a cohort** – both are regionally definable;

• **a hand specimen = a story** – both are collected in the field;

• **a composition = a signifier** – both appear in a triad, by calculation or touchscreen; and

• **a grain = a word** – each is embedded in its specimen/story.

There are also pairings of agents that guide or define each of the above:

• geological processes (for example, sediment deposition or volcanic eruption) = the project participants who wrote the stories; and

• the field geologist = the analyst or practitioner.

In the geology era of my life — the left-hand side of each of these pairings — I have stood at a lot of outcrops; collected a lot of specimens; measured and plotted a lot of chemical compositions; and looked at a lot of mineral grains and probed them with not only visible light but also various charged-particle beams (yet more chemical compositions!). Occasionally, I needed to do some kind of calibration or cross-check of data or a method, and those times looked like Weltje’s model A, the grain as unit of observation. But the bulk of the time I was working with the composition as unit of observation.

That’s why I already knew about triads, when Laurie first asked me about them, and why the mathematical underpinning of her work was familiar. Even with different terminology and unfamiliar subject matter, I could generally follow what she was doing: objectives and cohorts were defined; prompting questions and signifiers designed; stories and responses collected; results analyzed and plotted; and occasionally stories were themed by a client’s subject-matter experts.

Remember that this is yet another post in a series on Statistics in the Triad. That would suggest I should trot out some quasi-formal answer along the lines of the opening quote from Wikipedia. Instead, having watched Laurie work with clients, I suspect that the correct answer is “all of the above,” that the unit of observation is whatever helps those clients listen to and understand their constituents, formulate probes, and take a best guess at next steps.

Looking now at the boldface bullets (above), however, one thing stands out by its absence – the individual word in the participants’ stories. Nowhere are the words in a story examined the way the grains in a rock may be examined.[2] Even the aforementioned story theming, which certainly requires *reading* stories, is more like a geologist *looking* at a hand specimen. An experienced eye can look at the rocks in the two photos above and think “granite” without needing to count anything, just as a reader can scan a printed story and grasp the content without having to deal (consciously) with each individual word.

Of course if you saw the handwritten text in the second photo (above) and anticipated “… stormy night,” you were probably surprised or puzzled or amused, or all three, when you got to “avocado” instead. If so, thank you for illustrating my point: except in aberrant situations, sensemaking doesn’t require attention to words in a way that would make them a unit of observation.

Not *requiring* attention, however, doesn’t mean that we shouldn’t be *giving* it. I’ve said to Laurie for years that I am amazed at how much potential information is being left on the table with unexplored story texts, what I would now express as failing to consider *the word as unit of observation*. On the other hand, I completely take her point that there is only so much the community of practitioners can do and that people with the requisite analytical skills may just not (yet) be part of it.[3]

Full disclosure: I certainly don’t possess those skills. In fact, about all I can do is rattle off names like “latent semantic analysis” (see “natural language processing”) and “support vector machine” (see “machine learning”).[4] They are examples of tools that a knowledgeable person might apply to examining the words, looking at the story “grains” under a sensemaking microscope, if you will. Whatever the tool, the words should be added to the list of units of observation. They surely have much to tell when we ask the right questions.

It is a truism that the big-S websites — searching, shopping, socializing — not only look at the words we type, but they monetize them. This may be for their own benefit, as in “people who bought this also bought…”; or it may be by selling to others what they learn about us from analyzing our words.

Such analyses can give insight across a scale that a SenseMaker project could never encompass. Facebook has how many hundred million active users?… But the result is also exactly what SenseMaker was designed to avoid — someone else’s compilation of a third-person, this-might-be-what-they-meant inference and representation. Instead, practitioners help clients receive a message from their audience that is a first-person, here-is-what-we-think declaration and contextualization.

The logical extension of the foregoing is to combine these two approaches. I mean much more than simply doing textual analysis of stories collected in a project. My fantasy is that an entity that falls outside of the big-S websites, yet appreciates and has access to their data, would recognize the symbiotic value of combining narrative and counting. I’m thinking of a social-media analytics company like Crimson Hexagon. OK, let’s be honest — I’m thinking *precisely* of Crimson Hexagon.

A search of their website on “surveys” gives 10 hits, almost all of which mention the word only once, and all of which are dismissive. That’s OK. It may be a form of mission bias for them, but I’m dismissive of surveys as well. Anyone who understands SenseMaker probably is. What I envision instead is a project in which Crimson Hexagon chooses a sample population representative of social-media participants who commented on one of their clients. That population would become the target for a SenseMaker project, and they would tell stories about their experience with the client and signify them.

The resulting whole — combining narrative and counting — would surely be much greater than the sum of its parts. After all, if you want to be really sure that your broad-scale analytical results are accurate, you should test them, right? And what better way — arguably the only way — than to ask some of the participants to tell you directly, precisely, in-context, with no intervening inference? And on the flip side, if you are confident that a small sample of a population has told you precisely what they think, how much more valuable could it be if you could confidently extend that picture to a vast number whose stories and signifiers you could never hope to gather?

Weltje, G.J. (2002) Quantitative analysis of detrital modes: statistically rigorous confidence regions in ternary diagrams and their use in sedimentary petrology. *Earth-Science Reviews*, v. 57, p. 211-253.

Xu, B., Feng, X., and Burdine, R.D. (2010) Categorical data analysis in experimental biology. *Developmental Biology*, v. 348, p. 3-11.

- One of the most commonplace examples of this scheme is a recalcitrant child partitioning mixed vegetables into separate piles on the plate and counting in particular the number of distasteful items, say lima beans, that must be eaten (or otherwise disposed of). This also offers a learning opportunity, probably seldom realized, of how the closure constraint can produce an alarming increase in the percentage of beans as the more palatable carrots and corn are eaten. ^
- At one level, this is unfair, because the grains *must* be counted in order to calculate the rock compositions that are then plotted in a ternary diagram, whereas the words are irrelevant when the storyteller places a finger on a touchscreen to indicate the “composition” of a signifier in a triad. ^
- Amazingly enough, in the three weeks since I wrote the post on Confidence Regions, Laurie has learned of three practitioners — two clients, one fellow analyst — who have small groups of students working with them on analyzing story texts. Check back here to see if they are willing to be publicly identified at this time. ^
- These names may not be helpful if you don’t already know what they mean. Kind of like the self-referential “cat (see feline), feline (see cat)” that you might encounter if you didn’t already know about the small, furry, carnivorous, mammalian house pet. ^

What you have to understand is that the question is implicitly about famous actresses…. This means that when we ask about the association of acting talent and sexiness amongst the famous, we have censored data where people who are low on both dimensions are censored out. Within the truncated sample there may be a robust negative association, but the causal relationship is very indirect….

To illustrate this for a population of aspiring actors, he ran a simulation of the relationship between “body” and “mind,” both metrics normally-distributed and assumed orthogonal to each other. He further divided the data into “failed aspirants” and “working actors,” with the latter defined by imagining that “casting directors jointly maximize talent and looks so only the aspiring actors with the highest sum for these two traits actually get work in Hollywood.” (Casting directors who don’t know they are using a joint maximization function may be distressed to see their efforts reduced to a mere 2 lines of code in Rossman’s first blog post.)
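Rossman's selection rule is easy to reproduce. Here is a self-contained toy version; the function names, the 10% hiring quota, and the sample size are my own choices, not his exact code:

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate(n=10000, quota=0.1, seed=1):
    """'Body' and 'mind' are independent standard normals; casting directors
    hire the top `quota` fraction ranked by body + mind (the collider).
    Returns (correlation in the whole population, correlation among the hired)."""
    rng = random.Random(seed)
    body = [rng.gauss(0, 1) for _ in range(n)]
    mind = [rng.gauss(0, 1) for _ in range(n)]
    ranked = sorted(range(n), key=lambda i: body[i] + mind[i], reverse=True)
    hired = ranked[: int(n * quota)]
    return (pearson(body, mind),
            pearson([body[i] for i in hired], [mind[i] for i in hired]))
```

In runs of this toy, the full-population correlation hovers near zero while the hired subset shows a strongly negative one: the false correlation induced by conditioning on the collider.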

Here is the graphical result, with the axes scaled in standard deviations:

The light open circles (“Unobserved”) and solid triangles (“Observed”) are the failed aspirants and working actors, respectively. Here is Rossman’s summary in *The Atlantic*:

Among those actors we can readily observe, there then will be a negative correlation between looks and talent, even though there is no such correlation in the grand population. If we see only the working actors without understanding the censorship process we might think that there is some stupefaction of being ridiculously good-looking.

Rossman subsequently learned that this “logical fallacy” is already well-known as “conditioning on a collider.” This less-intuitive name was coined by his UCLA colleague in computer science and statistics, Judea Pearl, who developed structural (graphical) models for causality. A collider can appear in such network-like models as a node that “blocks the association between the variables that influence it.” *In Rossman’s simulation, the variables are body and mind, and the collider is the maximized sum for looks and talent*.

To stay in the jargon, the body-mind graph for the working actors has been confounded (obscured or complicated) by this collider, leading to a non-causal association (false correlation) between the two independent variables. If you are big into statistical control, as expressed by phrases such as “we controlled for age, history of smoking, and diet in study participants,” then an unrecognized collider would be a scary thing indeed.

The short answer is “Probably not.” You should keep in mind, however, that I claim no expertise in this cacophony of causing and censoring, colliding and conditioning, confounding and controlling and correlating. (That’s a lot of commencing with “c.” Coincidence? *Sí*.) Nonetheless, I do get Rossman’s general point well enough to make two comments vis-à-vis SenseMaker analytics.

Firstly, there is a superficial similarity of the body-mind graph (above) to a stones canvas, with its centered “origin” and two orthogonal axes. Neither of those pseudo-dyadic axes, however, is likely to host normally-distributed data. In fact, it would not be unusual to see stones data clustered in, say, the upper-right corner of a canvas. In contrast, that could only happen in the body-mind graph if the aspiring population included, and the casting directors had managed to identify, a bunch of Megan Fox or George Clooney wannabes with +4-sigma IQs of 170 or more.

Secondly, there is the less-obvious similarity of the body-mind graph to a triad. To see this requires changing it from a right triangle in disguise to an equilateral triangle. Here’s how that can happen schematically, showing only a few hypothetical data points (red) for working actors:

Step (1)➔(2) isolates the first (upper-right) quadrant of Rossman’s graph (1). Surprisingly, the remaining data in (2) have “compositional” traits:

- the coordinate values are non-negative, as they are in the original body-mind graph;
- the two variables are still independent, or can be treated as such, as long as at least one more compositional variable can be identified [see (2)➔(3) below];
- the remaining data for working actors are no longer normally-distributed; and
- both axes are effectively capped at the green dashed lines by the empirical absence (or extraordinarily unlikely existence) of any Clooney-esque brainiacs.

Step (2)➔(3) incorporates those four bullets, invoking the closure constraint by capping each axis at 100%; connecting the X and Y endpoints (dashed line); and recasting the origin as Z, the third member of each coordinate triple. In the context of Rossman’s graph, Z is the complement of the sum for looks and talent:

Z = 100 – (X+Y).

In other words, you can think of this third vertex as a “placeholder” for the collider.
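In code, that recasting is nearly a one-liner. Here is a minimal Python sketch (the function name and bounds check are mine, purely for illustration):

```python
def to_ternary(x, y):
    """Recast a first-quadrant point (x, y), with each axis capped at 100,
    as a closed ternary triple (X, Y, Z), where Z = 100 - (X + Y) stands
    in as a "placeholder" for the collider."""
    if not (0 <= x and 0 <= y and x + y <= 100):
        raise ValueError("point lies outside the capped quadrant")
    return (x, y, 100 - (x + y))

# A working actor at looks = 60, talent = 30 maps to (60, 30, 10).
```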

If you still can’t see (3) as a disguised triad, here is an enlarged view showing some equi-percentage lines (yellow) for each component, with mutual intersections of 45º or 90º, as expected in a right triangle, as opposed to the universal 60º in a triad. The transition of Step (3)➔(4) is then just a matter of morphing the right triangle into an equilateral one.

The short answer is “Read on.” Also don’t worry. That lone equation above was it. I’m not going to tell you that there is a bunch of math to learn for your project and story data. The reason is simple: you are already working with the equilateral triangle in (4). Instead, the guidance that this post can offer is heuristic and not especially robust.

Imagine that you have a triad from a project, or more likely for a cohort within the project, where most or all the data points hug one of the legs of the triangle, call it XY. Obviously those respondents did not resonate to the choice presented at the opposing vertex Z when they were signifying their stories, leaving you with a dyad-like line of data along XY. So, you might be asking yourself, did I miss some variable(s)? If I could go back for another round of story collection, what label or property or characteristic would I place at Z in the hope of creating additional discrimination and insight by moving points out into the triad?

Please note, however, that those XY data appear negatively-correlated, exactly the characteristic of conditioning on a collider that Rossman illustrated through his mind-body simulation. This prompts a subtle, but potentially more important question: *Does that negative correlation make sense in context?* In the context of the project, the cohort (if any), and the total population of respondents? This question matters because the apparent choice of a zero-value for Z means only X or Y (but not both) is independent. The resulting negative correlation is an artifact of the closure constraint, though it should still make sense in context.

Alternatively, does that negative correlation suggest something “screwy” (as in the title of Rossman’s article)? Something like the “stupefaction of being ridiculously good-looking”? If that is the case, then the lack of resonance among the respondents to Z could be telling you that there is a collider, unrecognized and hidden, whose *absence* concentrated the responses along the XY leg of the triad.

Again very schematically, if the missing property or characteristic had, in fact, been at Z, the responses might have been in a very different location in the triad, perhaps with little weight given to the XY leg and little reason to think about a negative correlation between X and Y. In its absence, however, it becomes a hidden collider. The resulting projection toward the XY leg might be not only artifactual but also nonsensical.

The answer for groups of data in a ternary plot or triad, however, is not as straightforward. The reason for this is rooted in the fly-in-the-ointment issue that I covered at the beginning of Part II of this series, the constant-sum or closure constraint. Whether the coordinate triplets (x, y, z) for each point are extracted from an overall composition and normalized to 100% before plotting, or whether (as in a SenseMaker project) they are constrained to 100% at the moment of data collection, for example, by a finger touching a tablet screen, two restrictions now apply. Firstly, all the data are non-negative and also capped at 1.0 (100%), so any statistical method that assumes non-bounded data (plausibly -∞ to +∞) cannot be used. Secondly, only two of the three components are independent, which can lead to spurious negative correlations and other aberrations if the data are not handled carefully.

As a geologist, I have mixed feelings about my former profession having pioneered the search for graphical portrayals of confidence limits in ternary plots. The only mitigating circumstance is that the method in most widespread use for several decades was introduced before Chayes (1960) issued the first *modern* warning about the limitations of working with compositional data. The names of this method — most commonly “hexagonal field of variation” or “error polygon” — are accurate descriptions.

Here is an extreme, but not-all-that-atypical, example from Weltje (2002, Fig. 16):

The left-hand image (A) shows two sets of nested hexagons for 90%, 95%, and 99% confidence intervals, the inner set for the population mean (+) and the outer set for the overall population of samples of river sands. (This kind of diagram was used across a wide range of sub-specialties in the earth sciences. Hydrologists and sedimentary petrologists, who might study such samples, are no more or less guilty than many others.) The methodology is both simple and specious:

- calculate the *arithmetic mean* of the data;
- calculate the *standard deviation* (σ) for each of the three individual components (Qt = quartz, Rnc = Rock fragments/”rest” [my shorthand], Rc = Rock fragments/carbonate);
- plot the pair of parallel lines defining the appropriate *variance window* (e.g., ±2σ = 95%) for each component; and
- truncate each line where it intersects those for the other two components at the same *confidence level*.

If you have read the first two posts in this series, the italicized phrase in each bullet highlights the problem with this type of plot: it uses the arithmetic mean, rather than the geometric mean (see Part I), and it uses statistical measures that are inappropriate under the closure constraint (see Part II). If you prefer a less abstract, more visual definition of the problem, look at the hexagonal fields themselves. It should not inspire a lot of confidence (pun intended) when the limits for 90-99% of the data extend beyond what is mathematically allowed (i.e., fall outside the area of the *closed* triad)! And yet this was not only a widely-used method, but one whose shortcomings were openly acknowledged in the academic papers that presented the results, with phrases such as “the uncertainties are not statistically rigorous” (see Pawlowsky-Glahn and Barcelo-Vidal, 1999). Absent any other approach, the rule was clearly that desperate scientists call for desperate methods. Nonetheless, as Weltje (2002) succinctly put it, “hexagonal fields of variation must be regarded as mere graphic constructs.”
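To see just how easily the recipe misbehaves, here is a small Python sketch using invented sample values clustered near the Qt vertex (illustrative only, not Weltje’s river-sand data):

```python
import statistics

# Invented closed data (percentages summing to 100) clustered near the
# Qt vertex -- illustrative values only, not Weltje's river-sand samples.
samples = [
    (97, 2, 1), (99, 1, 0), (90, 6, 4), (98, 1, 1), (96, 3, 1),
    (95, 3, 2), (99, 0, 1), (85, 10, 5), (97, 2, 1), (94, 4, 2),
]

# The specious recipe: arithmetic mean and a +/- 2-sigma "variance
# window" computed independently for each component.
windows = {}
for name, values in zip(("Qt", "Rnc", "Rc"), zip(*samples)):
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    windows[name] = (mean - 2 * sd, mean + 2 * sd)

# The windows spill outside the closed triad: Qt's upper limit exceeds
# 100% and Rc's lower limit is negative.
print(windows["Qt"][1] > 100, windows["Rc"][0] < 0)  # prints: True True
```

The arithmetic runs without complaint; it is the *interpretation* as confidence limits inside a closed triangle that fails.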

Contrast this with Weltje’s right-hand image (B), which shows the same data for river sands, with closed contours for the same three confidence levels. Again the inner set is for the population mean (+), but notice that, compared to the mean in image A, it is now slightly shifted to the left of the two nearest data points. In A, the + was the arithmetic mean; in B, it is the geometric mean. The outer set of contours is again for the overall population of samples.

Weltje (2002) developed this methodology for confidence regions in the triad as a direct extension of Aitchison’s (1986) approach to compositional data. His historical discussion and examples are entirely geological, but his attention to the rationale is lucid, and the math is moderately accessible (facility with linear algebra required). In a subsequent post, The Story as Unit of Observation, I discuss parallels between the geological and sensemaking perspectives. For now, however, let’s look at some examples of confidence regions from SenseMaker projects.

In the 15 years subsequent to Weltje’s paper, the math has been amplified and extended by Pawlowsky-Glahn and colleagues (see the Additional Readings at the end of Part II). More importantly, within the open-source R community, there are now packages that implement various formulations of Aitchison’s log-ratio transformations for compositional data, including confidence regions and the plotting thereof. So, having revived our command-line skills, and with considerable assistance from Ashton Drew, we jumped into R.

Essentially every recent project on which Laurie has worked has included at least one or two triads in which subsets of data — cohorts within the population of respondents — could prompt the opening question: Are the groups significantly different? As a proof-of-concept trial, Laurie reviewed the triads from a study of employees in a bilingual, multi-national corporation and picked two by eye that she thought might show a difference for cohorts based on native language. Here are the raw data for the first one:

And here is the plot for the 95% confidence regions on the (geometric) means, which are clearly significantly different:

Here are the raw data for the second triad:

Again, here is the plot for the 95% confidence regions on the (geometric) means, which are also significantly different:

Arguably the most interesting results are likely to arise when comparisons involve multiple cohorts, offering the possibility of simultaneously identifying similar (overlapping) and different (non-overlapping) responses to the same prompts and lead-ins. Here are two examples from a project by a not-for-profit organization concerning a refugee population that included (among others) unmarried and married girls (ages 13-24), mothers and fathers of the girls, husbands of the girls, and unmarried men (otherwise similar to the husbands).

Since the point here is solely to document the ability of this technique to make distinctions (or not), I have omitted some labelling details, including the lead-in, as a privacy/security consideration. Even with that limitation, it is clear that the perspectives of the husbands and their unmarried counterparts differ significantly from those of the core family members. The distinctions are even more evident in the second triad for this study:

You don’t have to know anything about the society or culture involved to recognize the disparity in value placed on education by the various groups. And to appreciate the guidance that such clear results might provide for the client and supporting agencies, to say nothing of the benefit that might accrue to the unmarried girls in the long run.

Aitchison, J. (1986, reprinted 2003) *The Statistical Analysis of Compositional Data*. The Blackburn Press, Caldwell NJ. 416 pp. plus additional material.

Chayes, F. (1960) On correlation between variables of constant sum. *Journal of Geophysical Research*, v. 65, p. 4185-4193.

Pawlowsky-Glahn, V., and Barceló-Vidal, C. (1999) Confidence regions in ternary diagrams. In “Old Crust – New Problems,” Freiberg ’99. *Geologische Vereinigung*, v. 89, p. 37-47.

Weltje, G.J. (2002) Quantitative analysis of detrital modes: statistically rigorous confidence regions in ternary diagrams and their use in sedimentary petrology. *Earth-Science Reviews*, v. 57, p. 211-253.

As it turned out, 100-million-plus people, in 50 million US homes, watched the unprecedented overtime victory by the New England Patriots over the Atlanta Falcons. It was a dramatic two-games-in-one contest — Atlanta dominated the scoring 28-3 until barely two minutes remained in the third quarter; and then New England made an orthogonal pivot, to tie 28-all with 57 seconds remaining in the fourth, winning 34-28 in slightly less than four minutes of sudden-death overtime.

The come-from-behind drama notwithstanding, the biggest surprise for me was that later in the week the game would evoke the fundamental difference between a SenseMaker study and a traditional marketing survey. Presumably this was an audience-of-one reaction, not shared by other viewers.

Back at the stadium, the on-field leadership of New England quarterback Tom Brady earned him the Most Valuable Player award and unanimous accolades from his teammates. If you had to pick a single hero, it was he.

A few days later, the background mental nagging from that word “hero” got me to re-locate something I had read about four years ago. The author is Malcolm Ryan, Senior Lecturer in the Department of Computing at Macquarie University, who works in artificial intelligence and game design. Here is some of what he had to say in a post on ‘Narrative-driven Design’ in his blog *Words on Play* [hyperlink added]:

I’ve taken an approach to design that I haven’t really seen discussed before. I’m calling it “narrative-driven” design. The idea is that you choose a particular set of narratives that you want to see emerge from your game and then you design systems to enable and encourage (but not enforce) those narratives.

I’m talking here about what I have previously called “intrinsic” narratives — the stories that emerge from the gameplay — rather than “extrinsic” narratives — the stories imposed by the author. A good game should have a good intrinsic narrative, even if it has little or no extrinsic narrative. Consider sports, for example. A good game of cricket or football or whatever has an exciting narrative: Our team did this, but then their team did that. They had the edge for a while but then our [star] player did something amazing! It was neck and neck to the end but finally we won! There is no externally-written fantasy going on here. The story is based on the drama of the game itself.

Just to be clear, his “gameplay” refers to a designed game meant to be played on-screen or with cards or by some other method; my mapping is “a good game of… football” = Super Bowl LI, and the “star player” = Tom Brady.

I originally found Dr. Ryan’s blog post because I was writing about the difference between results a client would see in a SenseMaker story-collection *project* vs. a traditional 1-to-10 bubbled *survey*. In a nutshell, a *project* delivers a message from the client’s audience that is a first-person, here-is-what-we-think declaration and contextualization. A *survey*, at best, can only deliver someone else’s compilation of a third-person, this-might-be-what-they-meant inference and representation; at worst, it can be a meaningless collection of questions and answers that have been “gamed.” (Is it a sign that we use that word?)

I remember the aha moment as I was writing: Oh, a *project* can deliver an intrinsic or inherent story, governed by the thoughts of the respondents (assuming well-designed prompting questions and signifiers), but a *survey* is much more likely to be governed by extrinsic or tangential factors, which means that “gaming” by the respondents could equally well be “guiding” by the survey designer. This intrinsic vs. extrinsic categorization of narrative couldn’t be a wholly original insight, I thought, so time for a Google search on “intrinsic narrative” AND “extrinsic narrative”.

Dr. Ryan’s post on ‘Narrative-driven Design’ was (and still is) the top hit. The surprise is that there are only 78 hits total (as of March 6th). Actually that is an overstatement, since by the time you get a few pages down, Google offers up a common warning: “In order to show you the most relevant results, we have omitted some entries very similar to the 31 already displayed.”

If you add “games” to that Google search, the number of hits reduces to 51, with 19 displayed. Skimming those 19 leaves me with the suspicion that this has been a decade-long, if somewhat “unstructured,” topic of conversation in the game-developer community. But one that seemingly peaked in 2011 and 2012. (I’ll return to the topic of more structured conversation, meaning “academic” discourse, in a future post.)

I looked at a few of the other results, including one from the blog of Failbetter Games, makers of Sunless Sea. They self-describe as “an independent games studio” and “Purveyors of only the finest examples of interactive narrative.” I particularity liked the example of chess offered by “Tony” in his comment at the end of the post ‘Late to the party: games and stories’:

Extrinsic and Intrinsic are useful words. Thank you. I find ‘narrative’ and ‘theme’ are… useful words too. For example, chess has no extrinsic narrative (designer driven), a strong intrinsic narrative (player driven), and a solid theme (war of kings).

Succinct and memorable, especially for those of us who don’t inhabit any gamer universe, either as developer or player. And illustrative of my point — like chess, the best SenseMaker project has no extrinsic narrative, because the storytelling is not driven by the instrument designer, but it does have a strong intrinsic narrative, driven by the storyteller(s).

Although in some contexts, even chess can be scripted….

Both the presence and density of story points depend, of course, on each respondent’s reaction vis-à-vis the lead-in (“The events in the story happened…”), and not unimportantly on the precision with which the person places a dot on paper or a fingertip on a touchscreen. Assuming that the prompting question and signifiers are well-designed, the cumulative pattern across all users should enable discernment of the influences and modulators of their experiences.

What happens, however, if the data cluster elsewhere… or *nowhere*? In the extreme, what if the data are random? Despite decades of using ternary plots in my prior life as a lab scientist, I had never considered this question. (I’d like to think this says more about the inherent regularity of the chemistry of volcanic rocks than it does about my powers of imagination, but never mind.)

That changed in late October, during a workshop, when Laurie presented initial results for a client’s project that had yielded more than 1400 stories from participants spread across seven principal cohorts. By far the smallest cohort, a group of “community leaders,” had told only 50 stories. Even in the data for this small subset, most of the triads looked typical, but there were a few where, if you didn’t already know the pattern, you might not have been quick to define a norm.

One of the latter triads (see final image pair, below) prompted a rhetorical question from a member of the client team, asking about the scatter and seeming “lack of definition” (my words, not hers) in the distribution of points. During the next break, I asked her if she could imagine a scenario in which these community leaders might have placed their points in that *particular* triad with such a degree of generality or perhaps casualness that they could appear to be random. That led, in turn, to the question of what randomness in a triad would actually look like.

The answer is not surprising — randomness looks, well, random (see immediately below). But the question has to be addressed, just to be sure, because the closure constraint on ternary data (summation to 100%) produces counter-intuitive effects. Chief among these are spurious negative correlations and the need to represent an “average” by the geometric mean, rather than the more familiar arithmetic mean. (These are discussed in Part I and Part II of this series.)

Here are plots of 50 and 500 points generated in Excel with its standard RANDBETWEEN function:

These data were calculated in a “cycle” of three steps, starting with these steps for the first point (first row in Excel):

1. find the A-coordinate (lower-left vertex) with RANDBETWEEN(0,100), which gives an integer value between 0 and 100 inclusive;

2. find the B-coordinate (top) with RANDBETWEEN(0,100-A); and

3. find the C-coordinate (lower-right) by subtraction: C = 100−A−B.

Then repeat for the second point (row), but now in order 1:B, 2:C, 3:A; repeat for the third point (row) in order 1:C, 2:A, 3:B; then back to the first cycle for the fourth row, etc. This “supercycle” minimizes the clustering of values near the vertices that seems to arise in some recalculations of the data when all rows initiate on the same vertex.
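For anyone who prefers code to spreadsheet formulas, here is how the same cycle-of-three recipe might look in Python (my translation of the Excel procedure, not the original workbook):

```python
import random

def random_ternary_points(n, seed=None):
    """Generate n integer triples (A, B, C) summing to 100, cycling the
    starting vertex row by row (A, then B, then C, ...) to mimic the
    Excel 'supercycle' described above."""
    rng = random.Random(seed)
    points = []
    for i in range(n):
        first = rng.randint(0, 100)           # RANDBETWEEN(0,100)
        second = rng.randint(0, 100 - first)  # RANDBETWEEN(0,100-first)
        third = 100 - first - second          # closure by subtraction
        triple = [first, second, third]
        shift = i % 3                         # rotate the starting vertex
        points.append(tuple(triple[-shift:] + triple[:-shift]) if shift
                      else tuple(triple))
    return points

pts = random_ternary_points(500, seed=1)
assert all(sum(p) == 100 for p in pts)
```

Python’s Mersenne Twister generator is considerably better than Excel’s, but as noted below, either is good enough for this purpose.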

The shortcomings in Excel’s random number generator are widely acknowledged, but the end result is surely good enough for purposes of this post. And definitely superior to what the Trolls in Accounting provided to Dilbert some years ago.

Now we can compare that most-scattered triad for the community leaders with a plot of 50 random numbers. The unlabelled plot of actual data from the workshop was prepared by Laurie in Tableau, and the concentric circles and half-altitudes are part of her standard template. Similarly, the random-number plot has vertical gridlines and *x-y* axes that are artifacts of transforming the 3-component data to plot in Cartesian coordinates, since Excel cannot create “native” ternary diagrams. If you can ignore all these technical add-ons, the two distributions of points are fairly similar. In fact, if you can picture the random-number triangle rotated 120° CCW, there are some surprisingly good correspondences.

Full disclosure: I chose this particular random pattern from the several tens of trials that I ran precisely because of the degree of visual similarity. The point is not to show, however, that the community leaders were *thinking randomly*. Actually, I don’t know what that would mean. There is nothing “random” about respondents’ stories and signifiers — they presumably knew exactly what they meant! Instead, I think of it as a minor cautionary tale: especially with small-sample projects or sub-cohorts, there could be a gradation from well-defined response patterns to ones that were *visually random*. This is just what showed up in the data at the workshop for the small cohort.

The inevitable question then is how does an analyst or a subject-matter expert deal with the *decreased utility* across a range from well-defined, highly-aggregated data to random-looking, highly-scattered data. If there isn’t an app for that, is there at least a comparative metric? These are much more general questions that apply to a project of any size and ones that I will come back to in the next post.

It’s less important to know exactly what Laurie means by each of these labels (see below) than to recognize that “experience and expertise” in this context are a matter of personal choice, of the who and how of client and data engagement.

This point was made forcefully by Iwan Jenkins in an August 22 post, Avoid shaved legs and let your customers love you. It is a tale of intense frustration over his difficulty in uploading podcasts to an iTunes server, compounded by the inability of Apple Support to offer “help” at a level appropriate to *his* chosen facility:

In my twelfth email I was commanded to change 3 lines of html code at the front end of my RSS feed. After this, my podcast would be accepted.

If I could insert 3 seconds of stunned silence here I would.

I am not a techie….

I am trying to do interesting work which makes a difference in people[‘]s lives, and Apple’s products promise to help me.

But they are making it difficult—and this makes it hard for me to love them.

I don’t code. I don’t want to code. I don’t want to waste time learning how to backslash br hyphen colon br double back slash.

I just want to upload my podcast.

In case you don’t know Iwan, he has a Ph.D. in organic chemistry, so learning HTML would not be an intellectual challenge. Nor does his failure to do so indicate *akrasia* or some other character flaw from a Greek tragedy. Rather, as he says, he doesn’t code because “I don’t want to code.” So, in the spectrum above, he chose the center as the best place to serve his clients. (He also chose to shave his legs as the most expedient way to get the podcast uploaded, but you’ll have to read the rest of his post for the details.)

In a follow-up post, Laurie will discuss what she means by the three labels. She will also include an instrument that she and Zhen used in the Houston workshop. They asked participants to amplify the spectrum by indicating specific skills and perspectives that they would associate with the respective labels, based on their own self-characterization. In the interim, here is a *graphical* portrayal that shows several “calibration points,” users of SenseMaker that we know, labelled by single letters (but otherwise not identified here).

Roughly speaking, someone nearer the horizontal axis — Client-centric — would *choose* to spend more time working directly with clients in design, collection, and textual interpretation; someone nearer the vertical axis — Tool-centric — would spend a larger proportion of time on analysis, visualization, and numerical interpretation. In many cases, the optimal result would come from a pairing of people in distinct areas of the arcuate band. In any case, the unstated assumption is that *all* of the people understand the principles of SenseMaker, its application, and the importance of blending qualitative and quantitative data to provide maximum value to their clients.

*This post is a continuation of Statistics in the Triad, Part I: Geometric Mean. The two are meant to be read sequentially, since the mathematical elements of the first are an important and inescapable prerequisite for the second. If you already have a working knowledge of the geometric mean, however, and how its use differs from that of the ubiquitous arithmetic mean, then you can just read on.*

The fundamental property of a triad or ternary plot that requires special consideration when applying statistics is its constant-sum or closure constraint. In a SenseMaker project, the data automatically sum to 100% when a respondent clicks or places a marker inside the triad on a collector screen; or the results are normalized (see Part I) if the respondent otherwise enters numerical values for each of the three components (vertices), which are then divided by the sum of the three values to yield percentages.

The constraint arises because, once the normalization step is completed, only two of the three variables are independent. If you think of the original data as being plotted in three-dimensional Cartesian coordinates, then the triad is essentially a projection from three to two dimensions:

This results in a loss of one degree of freedom, which means that if one of the variables is changed, the combination of the other two must necessarily change as well, but in the opposite direction. Thus, spurious negative correlations can arise among the data that fall completely outside any interpretive, subject-matter context, such as a SenseMaker project. This problem has been recognized for more than a century (Pearson, 1897), though it did not gain widespread notice until geologists began looking for workarounds in the 1960s and 1970s, prompted by a paper by Chayes (1960). In addition, because the bounded compositional data (in the range 0 to 1) of a triad cannot have a normal distribution, any statistical method that assumes non-bounded data (typically -∞ to +∞) cannot be used, for example, factor and principal component analysis. Even the lowly arithmetic mean is suspect.

A “practical tool of analysis” came in the 1980s in a series of papers by the statistician John Aitchison, fully articulated in his 1986 book (p. 112, reprinted in 2003 with some supplementary materials): *The Statistical Analysis of Compositional Data*. In the subsequent three decades, there has been considerable work on clarifying and extending the theoretical underpinnings of Aitchison’s work, in particular by Vera Pawlowsky-Glahn, Juan-José Egozcue, and their collaborators. The most succinct summary of these results can be found in their 2006 paper (see References); and a more extended treatment is given in their 2015 book, *Modeling and Analysis of Compositional Data*. Several others among the Additional Readings are less accessible and not for the mathematically-faint-of-heart.

Aitchison pulled that old rabbit, the coordinate transformation, out of the mathematician’s hat. Think Cartesian-to-polar coordinates or one of the other transformations used in the applied world when a difficult problem in one geometric setting magically becomes straightforward in another. (If you dig into the Additional Readings, you will find that each of the three log-ratio transformations now in use has some drawback in one or more coordinate systems. We will focus on only one of the three, however, and will ignore those issues for the sake of clarity, or reduced obscurity if you prefer.)

Here’s the setup for data point ${x}_{i}$ in a triad displaying components $A$, $B$, $C$:

\begin{equation}

{ x }_{ i }:\left( { a }_{ i },{ b }_{ i },{ c }_{ i } \right) ,\quad where\quad { a }_{ i }+{ b }_{ i }+{ c }_{ i }=1

\end{equation}

which applies for each of the ${n}$ data points that may be available for that triad.

Aitchison then introduced the (additive) log-ratio transformation, which changes (1) to this:

\begin{equation}

{ x }_{ i }^{ \prime }:\left[ y=\ln { \frac { { a }_{ i } }{ { c }_{ i } } } ,z=\ln { \frac { { b }_{ i } }{ { c }_{ i } } } \right]

\end{equation}

Schematically, the transformation looks like this:

At a stroke, this creates two independent variables and moves the data from the closed ternary space (the “simplex”) to the domain of real numbers (-∞ to +∞), thereby eliminating spurious negative correlations and other covariance problems and opening up the transformed data to a variety of standard statistical methods. Equally important, there is an inverse transformation that allows results to be brought back to the triad for display with the original data.
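For the computationally inclined, transformation (2) and its inverse amount to a few lines of code. This Python sketch (function names are mine) round-trips a single composition:

```python
import math

def alr(a, b, c):
    """Additive log-ratio transformation (2): (a, b, c) -> (y, z)."""
    return math.log(a / c), math.log(b / c)

def alr_inverse(y, z):
    """Inverse transformation: (y, z) -> closed (a, b, c) summing to 1."""
    denom = math.exp(y) + math.exp(z) + 1
    return math.exp(y) / denom, math.exp(z) / denom, 1 / denom

# Round trip: the inverse recovers the original composition.
y, z = alr(0.5, 0.3, 0.2)
a, b, c = alr_inverse(y, z)   # -> (0.5, 0.3, 0.2) to machine precision
```

Note the one practical restriction: no component may be exactly zero, or the logarithm is undefined — one of the drawbacks alluded to above.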

Here is a concrete example, reconstructed and modified slightly from Aitchison (1989, Fig. 1; 1986, Table 1.2), to illustrate the procedure:

The triad shows data points for 25 samples of the rock type “hongite.” (This is one of Aitchison’s droll inventions, along with “kongite,” “coxite,” and “boxite,” presumably based on his time at the University of Hong Kong and the statisticians David Cox and George Box, respectively.) Each data point satisfies (1); the $\left( y,z \right)$ plot shows those same samples after transformation by (2). Because ${y}$ and ${z}$ are independent, the *arithmetic mean* of each can be calculated from these relations:

\begin{equation}

{ y }_{ AM }=\frac { 1 }{ n } \overset { n }{ \underset { i=1 }{ \Sigma } } \ln { \frac { { a }_{ i } }{ { c }_{ i } } } \quad and\quad { z }_{ AM }=\frac { 1 }{ n } \overset { n }{ \underset { i=1 }{ \Sigma } } \ln { \frac { { b }_{ i } }{ { c }_{ i } } }

\end{equation}

Equation (3) should be compared to the left-hand equation for ${ x }_{ AM }$ in (5) of Part I; the terms being summed are now of the form $\ln { \frac { { x }_{ i } }{ { c }_{ i } } }$.

The arithmetic mean calculated from (3) for the transformed data is shown as the red dot in the right-hand graph above. Not surprisingly, it falls “midway” along the distribution. The red dot is also shown in the triad. We’ll look now at how that return step, the inverse transformation, was done and the significance of the resulting point.

The $\left( {a,b,c} \right)$ coordinates in the triad for the arithmetic mean $\left( { y }_{ AM },{ z }_{ AM } \right)$ in the log-ratio plot can be calculated from the inverse transformation, which exponentiates the independent terms *and* normalizes them back to a constant sum of 1 in ${ABC}$:

\begin{equation}

{ \overset { \_ }{ a } }_{ LR }=\frac { exp\left( { y }_{ AM } \right) }{ exp\left( { y }_{ AM } \right) +exp\left( { z }_{ AM } \right) +1 }

\end{equation}

\begin{equation}

{ \overset { \_ }{ b } }_{ LR }=\frac { exp\left( { z }_{ AM } \right) }{ exp\left( { y }_{ AM } \right) +exp\left( { z }_{ AM } \right) +1 }

\end{equation}

\begin{equation}

{ \overset { \_ }{ c } }_{ LR }=\frac { 1 }{ exp\left( { y }_{ AM } \right) +exp\left( { z }_{ AM } \right) +1 }

\end{equation}

The appearance of ${1}$ as the third term in the denominator of each equation, and as the numerator in (6), literally creates the closure (constant sum) required for the ternary plot. (The reason for the change in notation on these coordinates — the overset bar to denote mean and the subscript to connect back to the log-ratio plot — will become clear as we proceed.)
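Here is a sketch of that return step (the function name is my own): equations (4)-(6) exponentiate the two coordinates and close them back to a constant sum of 1:

```python
import math

def alr_inverse(y, z):
    """Inverse of the additive log-ratio transform, per (4)-(6):
    exponentiate both coordinates and normalize to a sum of 1."""
    denom = math.exp(y) + math.exp(z) + 1.0
    return math.exp(y) / denom, math.exp(z) / denom, 1.0 / denom

# The origin of the (y, z) plane maps back to the centroid of the triad:
a, b, c = alr_inverse(0.0, 0.0)
```

The `+ 1.0` in the denominator and the `1.0` numerator in the third component are exactly the ${1}$s discussed above.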

Each of (4)-(6) can be expanded by substituting (3), but we will only do this for (4) in the interest of brevity. The other two will look similar to this:

\begin{equation}

{ \overset { \_ }{ a } }_{ LR }=\frac { exp\left( \frac { 1 }{ n } \overset { n }{ \underset { i=1 }{ \Sigma } } \ln { \frac { { a }_{ i } }{ { c }_{ i } } } \right) }{ exp\left( \frac { 1 }{ n } \overset { n }{ \underset { i=1 }{ \Sigma } } \ln { \frac { { a }_{ i } }{ { c }_{ i } } } \right) +exp\left( \frac { 1 }{ n } \overset { n }{ \underset { i=1 }{ \Sigma } } \ln { \frac { { b }_{ i } }{ { c }_{ i } } } \right) +1 }

\end{equation}

Now we can begin to simplify and make a surprising discovery. Using standard logarithmic identities, including $\log { a } +\log { b } =\log { \left( ab \right) }$, we can collapse each of the summations to a product:

\begin{equation}

exp\left( \frac { 1 }{ n } \overset { n }{ \underset { i=1 }{ \Sigma } } \ln { \frac { { a }_{ i } }{ { c }_{ i } } } \right) =exp{ \left( \frac { 1 }{ n } \ln { \prod _{ i=1 }^{ n }{ \frac { { a }_{ i } }{ { c }_{ i } } } } \right) }={ \left( \prod _{ i=1 }^{ n }{ \frac { { a }_{ i } }{ { c }_{ i } } } \right) }^{ \frac { 1 }{ n } }

\end{equation}

This right-hand expression in (8) should look familiar (see (12) in Part I) — it is the geometric mean of the ratio ${ \frac { { a }_{ i } }{ { c }_{ i } } }$ for the ${n}$ data points.

We can now substitute (8) into (7) for each of the summation terms, and by implication do the same for each of equations (4)-(6):

\begin{equation}

{ { \overset { \_ }{ a } }_{ LR }=\frac { { \left( \prod _{ i=1 }^{ n }{ \frac { { a }_{ i } }{ { c }_{ i } } } \right) }^{ \frac { 1 }{ n } } }{ { \left( \prod _{ i=1 }^{ n }{ \frac { { a }_{ i } }{ { c }_{ i } } } \right) }^{ \frac { 1 }{ n } }+{ \left( \prod _{ i=1 }^{ n }{ \frac { { b }_{ i } }{ { c }_{ i } } } \right) }^{ \frac { 1 }{ n } }+1 } }

\end{equation}

If we make the trivial substitution $1={ \left( \prod _{ i=1 }^{ n }{ \frac { { c }_{ i } }{ { c }_{ i } } } \right) }^{ \frac { 1 }{ n } }$ in the denominator of (9), then the common factor, ${ \left( \prod _{ i=1 }^{ n }{ \frac { 1 }{ { c }_{ i } } } \right) }^{ \frac { 1 }{ n } }$, will cancel in all four terms, top and bottom, leaving this:

\begin{equation}

{ { \overset { \_ }{ a } }_{ LR }=\frac { { \left( \prod _{ i=1 }^{ n }{ { a }_{ i } } \right) }^{ \frac { 1 }{ n } } }{ { \left( \prod _{ i=1 }^{ n }{ { a }_{ i } } \right) }^{ \frac { 1 }{ n } }+{ \left( \prod _{ i=1 }^{ n }{ { b }_{ i } } \right) }^{ \frac { 1 }{ n } }+{ \left( \prod _{ i=1 }^{ n }{ { c }_{ i } } \right) }^{ \frac { 1 }{ n } } } }

\end{equation}

The numerator in (10) is by definition the *geometric mean* of the ${ { a }_{ i } }$ component for the data points in the triad; and the denominator normalizes it to the sum of the geometric means for all three components. Similarly for the other two components, ${ { b }_{ i } }$ and ${ { c }_{ i } }$. Thus, the arithmetic mean as calculated with (3) in the log-ratio plot — the red dot — inverse-transforms to the $\left( { \overset { \_ }{ a } }_{ LR },{ \overset { \_ }{ b } }_{ LR },{ \overset { \_ }{ c } }_{ LR } \right)$ coordinates in the triad — also the red dot. Surprising, confusing, unsettling… but true! As Aitchison (1989, p. 789) says, this is precisely “the composition formed from [the] geometric means by the process of closure.” In other words, **the arithmetic mean in the log-ratio plot is identical to the geometric mean in a triad.**

As a practical matter, if we only wanted to know the geometric mean and had no interest in other statistical calculations with the log-ratio data, then we could skip the transformation entirely and simply calculate the geometric mean directly (see (5) or (12) in Part I). Red dots all around, and far fewer of those big $\prod { }$s and $\sum { }$s.
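That shortcut is easy to verify numerically. The following sketch (with illustrative numbers of my own, not Aitchison's hongite data) computes the red dot both ways and confirms they agree:

```python
import math

def geometric_mean(values):
    # n-th root of the product, computed as exp of the mean of the logs
    return math.exp(sum(math.log(v) for v in values) / len(values))

# A small, made-up compositional data set (each row sums to 1):
data = [(0.50, 0.30, 0.20), (0.40, 0.40, 0.20),
        (0.60, 0.20, 0.20), (0.45, 0.35, 0.20)]
n = len(data)

# Route 1: arithmetic mean in the log-ratio plot, per (3), then the
# inverse transformation, per (4)-(6)
y_am = sum(math.log(a / c) for a, b, c in data) / n
z_am = sum(math.log(b / c) for a, b, c in data) / n
denom = math.exp(y_am) + math.exp(z_am) + 1.0
route1 = (math.exp(y_am) / denom, math.exp(z_am) / denom, 1.0 / denom)

# Route 2: close the per-component geometric means, as in (10)
gms = [geometric_mean(col) for col in zip(*data)]
route2 = tuple(g / sum(gms) for g in gms)
```

Both routes land on the same point in the triad, to floating-point precision.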

We can both visualize this relationship and understand its significance by re-visiting the ternary plot for Aitchison’s “hongite” compositions:

To emphasize, the red dot is the inverse transformation of the arithmetic mean in the log-ratio plot, but as we have just seen it is really the geometric mean in the triad. By contrast, the blue dot is Aitchison’s (1989) composition of the arithmetic mean of each of the three *individual* ${a}$, ${b}$, and ${c}$ components in the triad. He succinctly sums up the problem with the blue dot, the arithmetic mean (emphasis added): “It is clearly useless as a *measure of location* because it falls outside the array of compositions and is indeed very atypical of the data set.”

Although the distinction between the arithmetic and geometric means is enhanced visually when the data are in a well-defined, curved array like this, the general principle remains: the geometric mean is the appropriate “measure of location” for constant-sum data. What if the data are just an amorphous blob? It doesn’t matter — use the geometric mean! OK, but what if the data look more like a typical SenseMaker study, with respondents’ dots located near the center, near the three vertices, and along the three bisectors? It doesn’t matter — use the geometric mean, whether for the entire data set or for cohort- or signifier-defined subsets.

There is one situation, however, where careful inspection of and experimentation with data patterns might be particularly helpful. Some of the curved arrays encountered in geochemical studies of volcanic rocks have a temperature-dependence, displaying changing composition as lavas and magmas cool and crystallize. Under the broad rubric of thermodynamics, “cooling” means “passing time.” So moving “down” such a data array would, in general, be tracking time. That thought evokes SenseMaker projects employing either punctuated or continuous capture of data, where a fixed group of prompts and signifiers is used over an extended period of time. In that situation, especially if various interventions occurred in the hope of eliciting a change in behavior/response, then tracking time would be both necessary and valuable. A study that was well-designed in this regard might be the perfect way to persuade a skeptic — for example, a client who didn’t fully appreciate the wonders of the math — that the geometric mean is the right “measure of location” to use. And once demonstrated to the skeptic’s satisfaction, the point would carry over to any distinctions among data, whether temporal or demographic or cultural.

Aitchison, J. (1986) *The Statistical Analysis of Compositional Data*. The Blackburn Press, Caldwell NJ. 416 pp. plus additional material.

Aitchison, J. (1989) Measures of Location of Compositional Data Sets. *Mathematical Geology*, v. 21, p. 787-790.

Chayes, F. (1960) On correlation between variables of constant sum. *Journal of Geophysical Research*, v. 65, p. 4185-4193.

Pawlowsky-Glahn, V., and Egozcue, J.J. (2006) Compositional data and their analysis: an introduction. *in* Buccianti, A., Mateu-Figueras, G., and Pawlowsky-Glahn, V., editors, *Compositional Data Analysis in the Geosciences: From Theory to Practice*, Geological Society of London, Special Publications 264, p. 1-10.

Pearson, K. (1897) Mathematical contributions to the theory of evolution. On a form of spurious correlation which may arise when indices are used in the measurement of organs. *Proceedings of the Royal Society of London*, v. 60, p. 489-502.

Egozcue, J.J., and Pawlowsky-Glahn, V. (2005) Groups of parts and their balances in compositional data analysis. *Mathematical Geology*, v. 37, p. 795-828.

Egozcue, J.J., and Pawlowsky-Glahn, V. (2006) Simplicial geometry for compositional data. *in* Buccianti, A., Mateu-Figueras, G., and Pawlowsky-Glahn, V., editors, *Compositional Data Analysis in the Geosciences: From Theory to Practice*, Geological Society of London, Special Publications 264, p. 145-159.

Egozcue, J.J., Pawlowsky-Glahn, V., Mateu-Figueras, G., and Barcelo-Vidal, C. (2003) Isometric logratio transformation for compositional data analysis. *Mathematical Geology*, v. 35, p. 279-300.

Pawlowsky-Glahn, V., Egozcue, J.J., and Tolosana-Delgado, R. (2015) *Modeling and Analysis of Compositional Data*. John Wiley & Sons, New York. 272 pp.

van den Boogaart, K.G., and Tolosana-Delgado, R. (2013) *Analyzing Compositional Data with R.* Springer-Verlag, Berlin. 258 pp.

von Eynatten, H., Pawlowsky-Glahn, V., and Egozcue, J.J. (2002) Understanding perturbation on the simplex: a simple method to better visualise and interpret compositional data in ternary diagrams. *Mathematical Geology*, v. 34, p. 249-257.

The ternary plot, better known in the Cogniverse as a triad, is familiar to users of SenseMaker as a tool for both data collection and data display. Its three vertices usually denote potential attributes, among which respondents can choose in any proportions, to amplify or augment their responses or reactions to some prompt about which they have told stories.

One question that invariably arises in both analysis and interpretation is “What is the average response?”, whether of the entire sample population or some particular demographic cohort(s). Unless you have a Ph.D. in statistics, or a tame statistician assisting you, the answer to that question and how to calculate “the right average” for data in a triad may not be obvious.

For most of us, the immediate answer for any collection of numbers is the arithmetic mean: add up all the individual values and divide by the number of them. If you have responses on some 1-10 scale that total 6110 from 837 people, then the average (the arithmetic mean) is 6110 ÷ 837 = 7.3. But what does that really mean? One answer would be, well, we could replace all the individual responses with 7.3, and the net result would be the same. That’s a numerically correct answer, but for CE practitioners a socially oblivious one. It’s understanding the diversity of people and their responses that is at the heart of any SenseMaker project!

Instead, let’s ask a somewhat deeper question: What is it that this average, or any average, is trying to find? The statistician’s answer would be that it’s a measure of central tendency. It’s a search for one number that is minimally-distant from all those respondents’ values, so we’re looking for *a single number that is as close as possible to all those responses* (even if that number itself, say 7.3, doesn’t exactly match any of the respondents’ values, which might be only the surrounding integers …5, 6, 7, 8…).

In the two previous paragraphs, an unstated assumption crept in: we can work with the responses and the result we’re looking for on the number line, the visual representation of all real numbers on a straight line, extending in principle from -∞ through 0 to +∞. This creates an obvious meaning for distance, as in “minimally-distant,” so finding the arithmetic mean is then just a matter of finding the one point on the number line whose cumulative distance from all the responses to its left and right is smallest.

One possibility is brute force — try lots of points on the line and see which gives the smallest result. Another is to open iTunes and see if there’s an app for that. Or we can be a little more thoughtful and do what a mathematician would do: derive something. Those of you who have forgotten everything from first-year calculus, and those who never took it, can relax. We’re just going to state a few expressions/equations and the end result, with some sentences to connect them. (Those who actually remember the details can picture what the omitted steps would be anyway.) Despite that disclaimer, to appreciate the argument fully does require some modest comfort and facility with the algebraic manipulations. It is a mathematical topic, after all.

We might begin by saying let’s just add up — indicated by the summation sign ${ \Sigma }$ — all those differences between the ${n}$ individual points, ${x}_{i}$, on the number line, and the value we want to know, $\overset { \_ }{ x }$, the arithmetic mean. That would look like this:

\begin{equation}

\overset { n }{ \underset { i=1 }{ \Sigma } } \left( { x }_{ i }-\overset { \_ }{ x } \right)

\end{equation}

This form would be inappropriate, however, because we don’t care about the sign (+/-) of each individual difference. But it’s fixable by using the absolute value of each (in effect, treating each difference as positive):

\begin{equation}

\overset { n }{ \underset { i=1 }{ \Sigma } } \left| { x }_{ i }-\overset { \_ }{ x } \right|

\end{equation}

Unfortunately, if we don’t already know the value of $\overset { \_ }{ x }$ — which of course we don’t, since it’s the very thing we’re trying to determine! — we can’t do the next step in the usual derivation. (For the knowledgeable reader, the expression’s derivative is discontinuous at the minimum, so we can’t find a zero derivative.) There’s another standard workaround though — square each distance:

\begin{equation}

\overset { n }{ \underset { i=1 }{ \Sigma } } { \left( { x }_{ i }-\overset { \_ }{ x } \right) }^{ 2 }

\end{equation}

This expression keeps the proper relative position of each point compared to the mean; generates only positive values; and solves the problem we just encountered (because it has a continuous first derivative). This also illustrates the kind of expression that arises in generating a “least-squares line” or “least-squares fit,” one of the simplest and most common tools in data analysis.

These three expressions are the crucial part of what we need as a comparative basis for discussing the geometric mean. So we can skip the remaining steps in the derivation — expanding (3), differentiating, setting the result equal to zero, and then some algebra — and go to the final result:

\begin{equation}

\overset { \_ }{ x } =\frac { 1 }{ n } \overset { n }{ \underset { i=1 }{ \Sigma } } { x }_{ i }

\end{equation}

In words, the arithmetic mean is the sum of all the individual values divided by the number of values. Just what we already knew, but arrived at in a way — looking for the minimally-distant point on the number line — that most of us never consider. We now turn to applying that same approach to the geometric mean.
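For readers who like to see the claim checked rather than derived, here is a brute-force sketch (sample values and names are my own) confirming that the arithmetic mean is indeed the minimally-distant point on the number line:

```python
# Brute-force check that the arithmetic mean minimizes the summed
# squared distance to the data points.
xs = [2.0, 3.0, 5.0, 7.0, 8.0]   # arbitrary sample values

def sum_sq(m):
    # summed squared distance from candidate point m to all data points
    return sum((x - m) ** 2 for x in xs)

mean = sum(xs) / len(xs)   # per (4): 5.0 for this sample

# Scan candidate points at 0.01 spacing across the data's range:
candidates = [i / 100 for i in range(0, 1001)]
best = min(candidates, key=sum_sq)
```

The winning candidate coincides with the arithmetic mean, as the derivation promises.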

Here is the simplest, if not necessarily most useful, way to distinguish between the two means:

• If we’re *adding* a group of numbers, the *arithmetic mean* is the single number that could replace all the individual terms and the *sum* would be unchanged; whereas,

• If we’re *multiplying* a group of numbers, the *geometric mean* is the single number that could replace all the individual terms and the *product* would be unchanged.

Keep in mind that subtraction is just addition of a negative number; division is just multiplication with the reciprocal of a number; and a fractional power is just a root.

Here are four-term examples of each mean (labelled by the subscript and dropping the overset bar for simplicity):

\[

{ x }_{ AM }={ \left( \frac { { x }_{ 1 }+{ x }_{ 2 }+{ x }_{ 3 }+{ x }_{ 4 } }{ 4 } \right) }\quad and\quad { x }_{ GM }={ \sqrt [ 4 ]{ { x }_{ 1 }{ \ast x }_{ 2 }{ \ast x }_{ 3 }{ \ast x }_{ 4 } } }

\]

Or in slightly more compact form:

\[

{ x }_{ AM }=\frac { 1 }{ 4 } { \left( { x }_{ 1 }+{ x }_{ 2 }+{ x }_{ 3 }+{ x }_{ 4 } \right) }\quad and\quad { x }_{ GM }={ \left( { x }_{ 1 }{ \ast x }_{ 2 }{ \ast x }_{ 3 }{ \ast x }_{ 4 } \right) }^{ \frac { 1 }{ 4 } }

\]

The fully-generalized forms for ${n}$ terms look like this:

\begin{equation}

{ x }_{ AM }=\frac { 1 }{ n } \overset { n }{ \underset { i=1 }{ \Sigma } } { x }_{ i }\quad and\quad { x }_{ GM }={ \left( \prod _{ i=1 }^{ n }{ { x }_{ i } } \right) }^{ \frac { 1 }{ n } }

\end{equation}

where the giant ${\Pi}$ indicates pi-for-product, analogous to the ${ \Sigma }$ as sigma-for-sum.
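A quick numerical illustration of (5), with an arbitrary four-term data set of my own choosing:

```python
import math

xs = [2.0, 4.0, 8.0, 16.0]   # arbitrary four-term example
n = len(xs)

am = sum(xs) / n                  # arithmetic mean: (2+4+8+16)/4 = 7.5
gm = math.prod(xs) ** (1 / n)     # geometric mean: (2*4*8*16)**(1/4)

# The defining property: gm can replace every factor without changing
# the product, just as am can replace every addend without changing
# the sum.
```

Note that `math.prod` requires Python 3.8 or later.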

Presumably everyone reading this post has experience with, and therefore a good intuitive sense of, the arithmetic mean. We don’t have to think about it, we just use it. By contrast, applications of the geometric mean arise in domains of specialized knowledge: compound growth rates, including investing; changes in social statistics; metrics for aspect ratios in print and visual media; signal processing; and others.

What most of these applications have in common is “normalization” of the data, for example, dividing by some reference value. This is exactly the situation in a triad: the coordinates for each data point in the equilateral triangle are calculated as ratios to the sum of the three input values — that’s the normalization step — and then expressed as percents. So, no question about it, the geometric mean is the way to go (see Endnote).
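For concreteness, that normalization (closure) step can be sketched as follows (the function name and the sample slider values are hypothetical):

```python
def close(raw):
    """Normalize raw inputs to ternary coordinates summing to 1 --
    the closure step described above."""
    total = sum(raw)
    return tuple(v / total for v in raw)

# Three raw slider values become ternary coordinates:
a, b, c = close((30.0, 50.0, 20.0))   # (0.3, 0.5, 0.2)
```

Multiply by 100 to express the coordinates as percents.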

We’ll explore how this happens in practice — how do we actually calculate the geometric mean for a population or cohort in a triad? — in Part II of this post. But for now we have one small loose end to resolve. Notice that the left-hand equation in (5) is the same as (4). In turn, we got to (4) by talking our way down through expressions (1)-(3). What is the equivalent path for the right-hand equation? Asked differently, for the geometric mean, what is the equivalent to the standard number line, the frame of reference along which the cumulative, minimally-distant differences will be measured?

To answer this, we can use an identity about the logarithm (log) of numbers — the log of a ratio of two numbers is equal to the difference of their individual logs:

\begin{equation}

\log { \left( \frac { a }{ b } \right) } =\log { a-\log { b } }

\end{equation}

This is true for a logarithm to any base (2, ${e}$, and 10 being the most common). We’re going to use natural logarithms (base ${e}$, written ${ln}$), which makes (6) look like this:

\begin{equation}

\ln { \left( \frac { a }{ b } \right) } =\ln { a } -\ln { b }

\end{equation}

This identity gives us a way to deal with *ratios of values*, including *normalized coordinates in a triad*. So we can go back to expression (3) and use logs of normalized values, including the mean:

\begin{equation}

\sum _{ i=1 }^{ n }{ \left( \ln { \left( \frac { { x }_{ i } }{ { x }_{ 0 } } \right) } -\ln { \left( \frac { { \overset { \_ }{ x } } }{ { x }_{ 0 } } \right) } \right) ^{ 2 } }

\end{equation}

where ${{ x }_{ 0 }}$ is the normalizing factor.

Now we can expand expression (8) using equation (7) to replace each of the two terms:

\begin{equation}

\sum _{ i=1 }^{ n }{ \left( \ln { { x }_{ i } } -\ln { { x }_{ 0 } } -\left( \ln { { \overset { \_ }{ x } }-\ln { { x }_{ 0 } } } \right) \right) ^{ 2 } }

\end{equation}

The two terms with the normalizing factor, ${{ x }_{ 0 }}$, cancel each other, which leaves an expression like (3), but now a least-squares fit based on natural logarithms:

\begin{equation}

\sum _{ i=1 }^{ n }{ \left( \ln { { x }_{ i } } -\ln { { \overset { \_ }{ x } } } \right) ^{ 2 } }

\end{equation}

As above for the arithmetic mean, we’ll skip the intervening steps in the derivation — expanding (10), differentiating, setting the result equal to zero (to find the minimum), and then more algebra — to reach the result analogous to (4):

\begin{equation}

\ln { \overset { \_ }{ x } } =\frac { 1 }{ n } \sum _{ i=1 }^{ n }{ \ln { { x }_{ i } } }

\end{equation}

Finally, recalling that $exp\left( \ln { a } \right) =a$ and using the identity $\ln { \left( { a }^{ b } \right) } =b\ln { a }$, we get the product form for the geometric mean (same as the right-hand side of (5)):

\begin{equation}

{ x }_{ GM }=\overset { \_ }{ x } ={ \left( \prod _{ i=1 }^{ n }{ { x }_{ i } } \right) }^{ \frac { 1 }{ n } }

\end{equation}

So as Joseph II said in *Amadeus*, “Well, there it is.” We have derived in (12) the standard form of the geometric mean as the ${n}$th root of the product of ${n}$ numbers. In the prior step (11), we saw a way to view it as a minimized, least-squares sum of distances on the *logarithmic number line*. Part II will discuss how this abstraction translates into practice.
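As a closing check, here is a short sketch (arbitrary sample values of my own) confirming that the log-number-line route of (11) and the product form of (12) agree:

```python
import math

xs = [1.5, 2.0, 6.0, 9.0]   # arbitrary positive values

# Equation (11): the arithmetic mean on the logarithmic number line...
log_mean = sum(math.log(x) for x in xs) / len(xs)
gm_via_logs = math.exp(log_mean)

# ...agrees with equation (12), the n-th root of the product:
gm_direct = math.prod(xs) ** (1 / len(xs))
```

The log route is also how the geometric mean is usually computed in practice, since the running product in (12) can overflow or underflow for large ${n}$.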

If you want to pursue this further, there is a classic computer science paper that offers a good worked example: Fleming, P.J., and Wallace, J.J., 1986, How not to lie with statistics: the correct way to summarize benchmark results, *Communications of the ACM*, v. 29. no. 3, p. 218-221. (The article is firewalled, but a Google search on the title will turn up copies of the PDF as part of CS course syllabi at several universities around the world.) The paper provides an empirical and theoretical explanation of the necessity of using the geometric mean (rather than the arithmetic mean) in processor benchmarks. They compare three processors for several performance criteria, and, though they didn’t use the display, their data are ready-made for ternary plots. As a result, it is relatively easy to follow their discussion and “think triad,” with straightforward application of their results to another perspective on what is presented here.