## Fundamental Statistical Concepts in Presenting Data: Principles for Constructing Better Graphics

Via Andrew Gelman I came across this long paper (updated version) on statistical visualization by Rafe Donahue. I haven’t read it through carefully yet, but I enjoyed the examples of visualizations from his children’s schoolwork.

He criticizes boxplots, which prompted a discussion in the comments on Andrew’s post. I read Tukey’s Exploratory Data Analysis (EDA) recently and was surprised to see how much of Tukey’s work was focused on visualization by hand. The boxplot was a sensible visualization when you had to compute and plot manually: using only five numbers (minimum, lower quartile, median, upper quartile, maximum), it portrayed much of what was important about the data. However, now that plotting is cheap, it makes a lot more sense to just plot all the data.

In general, summaries, visual or otherwise, that assume a single mode, or, worse, normality, should be treated with a great deal of caution.
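To make the point concrete, here is a minimal Python sketch (my own illustration, not taken from Donahue’s paper or Tukey’s book). It draws a made-up bimodal sample, computes the five numbers a boxplot would show, and prints a crude text histogram of the same data. The sample, the seed, the bin count, and the helper names `five_number_summary` and `bin_counts` are all assumptions chosen for illustration.

```python
import random
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the five numbers a boxplot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return (min(values), q1, median, q3, max(values))


def bin_counts(values, low, high, bins):
    """Count how many values fall into each of `bins` equal-width bins."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    return counts


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical bimodal sample: two well-separated clusters of 500 points each.
bimodal = ([rng.gauss(-3, 0.5) for _ in range(500)]
           + [rng.gauss(3, 0.5) for _ in range(500)])

summary = five_number_summary(bimodal)
counts = bin_counts(bimodal, min(bimodal), max(bimodal), bins=12)

print("five-number summary:", [round(x, 2) for x in summary])
for count in counts:
    print("#" * (count // 10))  # one '#' per 10 observations
```

With data like this, the quartiles sit near the two cluster centers, so the box of a boxplot would cover the empty stretch between the clusters and the median would fall where there is almost no data. The histogram, which shows every observation, makes the two modes obvious.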

Category: visualization

One comment

December 8th, 2010 at 8:09 am

The link points to an outdated version of the paper. The new version is mentioned on the page you linked to.