## Four Experiments on the Perception of Bar Charts

November 13th, 2014 — 10:30am

At this year’s InfoVis, I published a paper with two of my Tableau Research colleagues, Vidya Setlur and Anushka Anand. This paper explores how people make perceptual comparisons in bar charts, building on a previous study by Cleveland and McGill.

The paper itself is available: Four Experiments on the Perception of Bar Charts. And here are all the stimuli, data, and analyses.

## Riposte Update

November 13th, 2014 — 10:23am

Some of our continued work on Riposte, a fast runtime for the R language, was published at ARRAY 2014, a workshop co-located with PLDI. This paper focuses on our efforts to increase the performance of short vector code. The paper is here: Just-in-time Length Specialization of Dynamic Vector Code.

## An Empirical Model of Slope Ratio Comparisons (Corrected)

February 4th, 2013 — 9:27pm

I’ve posted a corrected version* of our InfoVis paper from last year: An Empirical Model of Slope Ratio Comparisons. In preparing the published version of the paper, we made a change in the parameterization of our space of slope comparisons to simplify the explanation of what we did. In doing this, I made a simple math error that resulted in us using the wrong mid-angles in our analysis. To see the difference, compare Figure 2 in the original and in the updated versions. The impact of the error is minor and doesn’t change our arguments or conclusions, but it required regenerating our plots and it slightly changed our model parameter estimates.

I’ve also posted R code which will reproduce our (corrected) analysis and figures. Along with the stimuli we released earlier, this should allow anyone to reproduce our analysis.

(*The irony of having to correct our paper which itself attempts to correct Cleveland’s earlier paper was not lost on me.)

## CS148

September 7th, 2011 — 5:26am

This summer I taught CS 148, the introduction to computer graphics course at Stanford. This was the first time I taught the class and, despite being incredibly time consuming, I really enjoyed it. I created all my own slides and new projects for the course.

The final project was to combine a path traced object with a real photograph—creating a “special effects” style image. I thought this was quite successful. The winning image was created by Zhan Fam Quek:

This was rendered with a path tracer written by the students, using an HDR environment map and an HDR background image that they captured themselves.

More images from the final project can be found in the course gallery. Slides and assignments can be found on the course webpage.

## LaTeX Hyphenation Mist-akes*

September 2nd, 2011 — 10:14pm

One of the great things about TeX is that it will automatically hyphenate words when doing so leads to better overall line breaks in a paragraph. This is a somewhat difficult task because, when hyphenating words, it is not acceptable to insert the hyphen between just any pair of letters. Some hyphenations, such as “new-spaper”, can lead the reader “down a garden path.” That is, when reading the end of the line (“new-”), the reader guesses incorrectly that “new” is a complete stem within a compound word and is then completely confused when confronted with the unlikely terminating word “spaper”. A similar problem occurs when the hyphenation causes the reader to pronounce the head portion incorrectly, causing them to read nonsensical words. (I’ll give an example in a second.)

To address this problem, some automatic text layout systems rely on dictionaries in which acceptable hyphenation points have been marked by a human. While generally correct, such dictionary-based approaches require a very large data file to store the dictionary and fail when given new words. An alternate approach, taken in TeX, is to summarize hyphenation points into a small set of patterns which can then be applied to any word. The TeX method was described in Franklin Liang’s thesis Word Hy-phen-a-tion by Com-put-er. Liang’s thesis claims that his pattern-based method finds about 90% of the human-marked hyphenation points and finds essentially no incorrect ones.
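To make the pattern idea concrete, here is a minimal Python sketch of Liang-style pattern matching. It is only an illustration, not TeX's implementation: the function names are my own, and the nine patterns are a tiny illustrative set chosen to cover one word, not TeX's full pattern file. Digits in a pattern score the gap between letters, the highest score at each gap wins, and an odd score permits a hyphen.

```python
import re


def parse_pattern(pattern):
    """Split a pattern like 'hy3ph' into its letters and per-gap scores."""
    letters = re.sub(r"\d", "", pattern)
    scores = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            scores[pos] = int(ch)  # score for the gap before letter `pos`
        else:
            pos += 1
    return letters, scores


def hyphenate(word, patterns, left_min=2, right_min=3):
    """Insert hyphens where the winning (maximum) gap score is odd."""
    table = dict(parse_pattern(p) for p in patterns)
    wrapped = "." + word.lower() + "."  # '.' marks the word boundaries
    points = [0] * (len(wrapped) + 1)
    # Try every substring; wherever a pattern matches, keep the max score.
    for start in range(len(wrapped)):
        for end in range(start + 1, len(wrapped) + 1):
            scores = table.get(wrapped[start:end])
            if scores:
                for offset, score in enumerate(scores):
                    idx = start + offset
                    points[idx] = max(points[idx], score)
    pieces, last = [], 0
    # Keep at least left_min letters before and right_min after any break.
    for i in range(left_min, len(word) - right_min + 1):
        # points[i + 1] is the gap before word[i] (shifted by the leading '.')
        if points[i + 1] % 2 == 1:
            pieces.append(word[last:i])
            last = i
    pieces.append(word[last:])
    return "-".join(pieces)


# Illustrative patterns only (an assumption for this sketch).
PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]

print(hyphenate("hyphenation", PATTERNS))  # hy-phen-ation
```

The even scores are what let one pattern veto a break that another pattern suggests, which is how a small pattern set can encode both rules and their exceptions.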

Unfortunately, essentially no incorrect ones is not the same as no incorrect ones. In the last two papers I’ve written, I’ve come across words that TeX’s method failed rather dramatically on. Fortunately, LaTeX provides an easy way to override the automatic hyphen selection through the `\hyphenation{}` command.

`\hyphenation{white-space}` fixes TeX’s “whites-pace”, a somewhat racist rendering. Note that this incorrect hyphenation is the opposite of the “new-spaper” example, which came from Liang’s thesis, highlighting the problems facing a purely pattern-based approach.

`\hyphenation{analy-sis}` fixes TeX’s “anal-ysis” which leads to a rather infelicitous mispronunciation of the first part of the word.

Despite doing a good job most of the time, TeX’s automatic hyphenation can and does go awry. Keep your eyes open when proofreading!

*I’m pretty sure TeX gets the hyphenation of “mistakes” correct.

## SIGGRAPH course on Importance Sampling

September 1st, 2011 — 9:14pm

I just noticed that my Master’s work on Resampled Importance Sampling was included in a SIGGRAPH course last year.

In standard importance sampling you have to draw samples from a distribution which can be sampled and which has a known scaling factor (so that the area under the pdf = 1). This is very limiting in some MC situations, such as sampling the incoming light direction in a path tracer. The function we want to sample from is the product of the BRDF and the incident light field. However, we can typically only sample from one component or the other, not both. In RIS, we create a sampled, discrete approximation to the desired distribution and then draw our samples from that. Since the approximation is discrete we can easily normalize it and draw samples from it. In my thesis, I showed that when weighted correctly, this results in an unbiased estimate.
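The procedure described above can be sketched in a few lines of Python. This is a toy one-dimensional illustration, not the code from the thesis: the function names, the uniform source distribution, and the stand-in "cheap" and "visibility" terms are all assumptions made for the example.

```python
import random


def ris_estimate(f, g, sample_p, pdf_p, m, rng):
    """Return one RIS estimate of the integral of f.

    f        : the expensive integrand
    g        : a cheap, unnormalized approximation of f (positive where f != 0)
    sample_p : draws one sample from the source distribution
    pdf_p    : normalized density of the source distribution
    m        : number of candidates in the discrete approximation
    """
    candidates = [sample_p(rng) for _ in range(m)]
    weights = [g(x) / pdf_p(x) for x in candidates]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    # The discrete approximation is trivial to normalize and sample from.
    y = rng.choices(candidates, weights=weights, k=1)[0]
    # Weighting by the average candidate weight keeps the estimate unbiased.
    return f(y) / g(y) * (total / m)


# Toy stand-ins (assumptions): a cheap term playing the role of BRDF times
# lighting, and a binary term playing the role of the visibility test.
def cheap(x):
    return x + 0.5


def visible(x):
    return 1.0 if x > 0.3 else 0.0


def integrand(x):
    return cheap(x) * visible(x)


def mean_estimate(runs=20000, m=8, seed=1):
    """Average many independent RIS estimates over [0, 1]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        total += ris_estimate(
            integrand, cheap, lambda r: r.random(), lambda x: 1.0, m, rng
        )
    return total / runs
```

For this toy integrand the exact integral over [0, 1] is 0.805, and `mean_estimate()` should land close to that value. Note that each estimate evaluates the expensive term only once, while the cheap term is evaluated `m` times.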

RIS will beat IS when it is substantially cheaper to create the approximate distribution than it is to evaluate the true function. In the path tracing context, I did this by evaluating the BRDF and environment map lighting for the approximate distribution, but not the visibility (the ray tracing step). With this approach, the RIS image (on the right) has substantially lower variance than the IS image (left).

As long as the visibility test accounts for a substantial fraction of the execution time, RIS will beat IS. However, this wasn’t particularly true when I wrote the thesis and is probably less true now. In my thesis, I concluded that, for current rendering tasks, RIS wasn’t a big win since the difference in evaluation cost between the approximate samples and the actual samples wasn’t very high. However, I think it could be quite useful in some MC applications where a relatively good and cheap discrete approximation can be found.

## Arc Length-based Aspect Ratio Selection

August 31st, 2011 — 8:15pm

Here’s a preprint of our paper on aspect ratio selection which will appear in InfoVis 2011. In it we propose a new criterion for banking data plots, building on previous ideas from Bill Cleveland and from Jeff Heer and Maneesh Agrawala.

We frame the aspect ratio selection problem as one of minimizing the length of the data curve while keeping the area of the plot constant. This leads to a method that is substantially more robust than previous approaches. We’re also able to demonstrate empirically that the resulting aspect ratios are a compromise between those suggested by previous methods. As shown below, the arc length method can also effectively bank both standard line charts (in this case a loess regression line) as well as contour charts.

Perhaps the most surprising result is that good aspect ratios can be selected without explicit reference to the slopes or orientations of the line segments within the plot.
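As a rough illustration of that framing (a sketch under my own assumptions, not the implementation from the paper), the following Python code scales a polyline into a plot of fixed area and searches for the width-to-height ratio that minimizes its drawn length. The function names, the search bounds, and the use of ternary search are choices made for this example. It assumes the data is not constant in either x or y.

```python
import math


def arc_length(xs, ys, aspect, area=1.0):
    """Length of the polyline when drawn in a plot with the given
    width/height ratio and a fixed area."""
    width = math.sqrt(area * aspect)
    height = math.sqrt(area / aspect)
    x_range = max(xs) - min(xs)  # assumed non-zero
    y_range = max(ys) - min(ys)  # assumed non-zero
    total = 0.0
    for i in range(1, len(xs)):
        dx = (xs[i] - xs[i - 1]) / x_range * width
        dy = (ys[i] - ys[i - 1]) / y_range * height
        total += math.hypot(dx, dy)
    return total


def select_aspect_ratio(xs, ys, lo=1e-3, hi=1e3, iterations=100):
    """Ternary search over the log aspect ratio for the shortest curve.

    Each segment's length is convex in the log of the aspect ratio,
    so the total has a single minimum and ternary search applies.
    """
    a, b = math.log(lo), math.log(hi)
    for _ in range(iterations):
        m1 = a + (b - a) / 3
        m2 = b - (b - a) / 3
        if arc_length(xs, ys, math.exp(m1)) < arc_length(xs, ys, math.exp(m2)):
            b = m2
        else:
            a = m1
    return math.exp((a + b) / 2)
```

For example, a zigzag that alternates between 0 and 1 over eleven evenly spaced x values, `select_aspect_ratio(list(range(11)), [i % 2 for i in range(11)])`, comes out at a ratio of about 10: the plot is stretched wide until the steep segments sit near 45 degrees. The objective only ever measures curve length; no slope or orientation is computed explicitly.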

## C# code for labeling paper

July 23rd, 2011 — 3:57am

I’ve finally had time to pull the labeling algorithm out of my much larger visualization package. It’s now up on GitHub: https://github.com/jtalbot/Labeling. This implements all parts of the labeling paper, including the formatting variations.

Let me know if you run into any problems with it or have any suggestions for improvement.

## R labeling package released on CRAN

December 8th, 2010 — 6:23pm

Version 0.1 of the labeling package has been released on CRAN.

## R code for our axis labeling algorithm

September 3rd, 2010 — 3:52am

The R version of our labeling code is now hosted at R-forge. You can get it here or install it from within R using `install.packages("labeling", repos="http://R-Forge.R-project.org")`.

A few small bugs in the implementation of our algorithm have been fixed thanks to feedback from Ahmet Karahan, who is working on a Java version. I have also added a number of other labeling algorithms that have been proposed or used in the past, including those by Sparks, Thayer, and Nelder (from about 40 years ago), and adaptations of the labeling functions in matplotlib, gnuplot, and R’s pretty.