Steve

Statistics in the Triad, Part VIIIb: Binning, or Where the Data Are actually Concentrated

The first section, ‘Data distribution in a triad’, of Part VIIIa listed some of the quantitative methods for comparison of story data in triads, including confidence regions, point-counting of clusters, and smooth contouring. The last of these uses kernel density estimation (KDE) to calculate a probability density function (PDF) for the data. This statistical alphabet …

Statistics in the Triad, Part VIIIa: Smoothing, or Where the Data Are not Concentrated

This post is a cautionary tale (a warning buoy, if you will) about a widely used method to aggregate and smooth data. The caution applies only to SenseMaker projects, which use ternary plots (triads) for both data collection and display, in particular those projects which yield high concentrations of data points near vertices. For …

Statistics in the Triad, Part VII: Mapping The Datasaurus Dozen

Note: If you landed here by searching on “datasaurus” (± “dozen”) and have no idea what SenseMaker is, you can jump to the graphical results. In an earlier post in this series, Part III: Random Data, I showed an example of a SenseMaker triad, with data clustered near vertices, along edges, and in the center; …

Statistics in the Triad, Part VI: The Story as Unit of Observation

If you had asked me a year ago to identify the primary unit of observation in a SenseMaker project, I would have said, without much hesitation, it’s the story, of course. When I started writing Part IV in this series on Confidence Regions, however, I had to revisit that question. I knew what was typically …

Statistics in the Triad, Part V: Closure and Causal Structure

Here’s one of those articles that I carry around, bound to me by neural Velcro, stored in Instapaper, and gestating in background mode: When Correlation Is Not Causation, But Something Much More Screwy. It’s a 2012 guest piece in The Atlantic by UCLA sociology professor Gabriel Rossman, merging two 2010 posts from his blog, Code …

Statistics in the Triad, Part IV: Confidence Regions

When we compare two (or more) groups of data on almost any kind of graphical presentation (histogram, box plot, x-y grid, time series, rose diagram, whatever), a near-universal question arises: Are the groups significantly different? The familiar answer is given by error bars or confidence intervals in x-y plots or bar charts, assuming an appropriate statistical model, for example …

Tom Brady and the Intrinsic Narrative

A few Sunday evenings ago, I watched Super Bowl LI, the climactic game of the 2016-17 season of the National Football League. With the loss and exit by the Dallas Cowboys earlier in the playoffs, I had no emotional stake in the outcome of the game, but at least some of the TV commercials might be …

Statistics in the Triad, Part III: Random Data

Story data in a SenseMaker triad tend to cluster in one of seven locations: the three vertices, the midpoints of the three edges, and the center. It’s also common to find “stringers” between the center and one or more of the other six loci (i.e., along the altitudes of the triangle); there are generally …

The Sensemaking-for-Clients Spectrum, Part I

Several weeks ago, as Laurie prepared for the Cognitive Edge workshop that she and Zhen Goh taught in Houston on September 8-9, Analysis to Action with SenseMaker®, we talked about a question of pedagogy, of “pitch”: How do you teach material when the level of experience and expertise of the potential attendees is diverse? More …

Statistics in the Triad, Part II: Log-Ratio Transformation

This post is a continuation of Statistics in the Triad, Part I: Geometric Mean. The two are meant to be read sequentially, since the mathematical elements of the first are an important and inescapable prerequisite for the second. If you already have a working knowledge of the geometric mean, however, and how its use …