

Fundamental Statistical Concepts in Presenting Data: Principles for Constructing Better Graphics

Via Andrew Gelman I came across this long paper (updated version) on statistical visualization by Rafe Donahue. I haven’t read it through carefully yet, but I enjoyed the examples of visualizations from his children’s schoolwork.

He criticizes boxplots, which prompted a discussion in the comments on Andrew’s post. I read Tukey’s EDA recently and was surprised to see how much of Tukey’s work was focused on visualization by hand. The boxplot was a sensible visualization when you had to compute and plot manually: using only five numbers, it portrayed much of what was important about the data. However, now that plotting is cheap, it makes a lot more sense to just plot all the data.

In general, summaries, visual or otherwise, that assume a single mode or, worse, normality should be treated with a great deal of caution.
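To make this concrete, here is a minimal sketch in Python (standard library only). The two samples are made up for illustration, and the function names are my own, not anything from Donahue's paper or Tukey's book. Both samples share the same five numbers a boxplot would draw, yet one has a single peak and the other has two clumps, which only shows up when every point is plotted.

```python
import statistics
from collections import Counter


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max), the numbers a boxplot draws."""
    xs = sorted(data)
    # "inclusive" treats the data as the whole population; other quartile
    # conventions exist and can give slightly different Q1/Q3 values.
    q1, med, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return (xs[0], q1, med, q3, xs[-1])


def text_dot_plot(data):
    """Crude plot of all the data: one row per integer value, one '*' per point."""
    counts = Counter(data)
    return "\n".join(
        f"{v:>2} | {'*' * counts[v]}" for v in range(min(data), max(data) + 1)
    )


# Hypothetical samples, invented for this example.
one_peak = [1, 2, 2, 3, 3, 4, 5, 5, 5, 5, 5, 6, 7, 7, 8, 8, 9]
two_clumps = [1, 3, 3, 3, 3, 3, 3, 3, 5, 7, 7, 7, 7, 7, 7, 7, 9]

if __name__ == "__main__":
    for name, sample in [("one_peak", one_peak), ("two_clumps", two_clumps)]:
        print(name, five_number_summary(sample), "modes:", statistics.multimode(sample))
        print(text_dot_plot(sample))
        print()
```

Running it prints an identical summary for both samples, while the dot plots look nothing alike. A real analysis would use a proper plotting library rather than text rows, but the point is the same.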

Posted in visualization.


One Response


  1. MySchizoBuddy says

    The link points to an outdated version of the paper. The new version is mentioned on the page you linked to.


