8/27/2014
Methods for Creating Jitter in Tableau

Jitter is a great technique for dot plots, box plots with dots, and scatter plots. Jitter is a random (or, for our purposes, pseudo-random) value that is assigned to the dots to separate them so that they aren't plotted directly on top of each other. Tableau doesn't offer a check box, a built-in function, or a parameter to apply jitter, and there is no documented Tableau function to generate a random number (see the update below on the hidden Random() function). This is a common question from students in my data visualization class ("how to do jitter in Tableau"), and I was reminded of it recently when looking at a viz by Jon Boeckenstedt. His original viz uses a box plot with a dot plot (aka strip plot).

The problem here is that so many dots are in a similar range that it's really hard to distinguish between dots or to get an idea of how many dots are in a given range. I recommended the use of jitter to Jon and pointed him to a great post by Steve Wexler, here, for instructions on how to do jitter in Tableau.

I want to outline a few techniques that can be used to create jitter in Tableau. For these examples, I will use the sample data set in Tableau called "World Bank Indicators". Here are the steps to begin to set up the basic visualizations for each outlined method.

Step 1: Open Tableau and click the "Sample - World Bank Indicators (Excel)" data set.

Step 2: Build the chart

Under Military and Government related - Move Women parliament (%) to Rows
Move Country / Region to Detail
On the Marks Card change Automatic to Circles
Click Show Me and select the Box Plot

You should now have a box plot with dot plot that looks like this:

Notice how the dots overlap, which makes it very difficult to know how many dots are in any given range. Applying jitter will spread these points out, making them easier to see.
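To make the overlap problem concrete outside of Tableau, here is a minimal Python sketch. The values are made up for illustration (they are not from the World Bank data set), and jitter_x is a hypothetical helper name. It counts how many dots land on exactly the same spot with and without jitter.

```python
import random
from collections import Counter

def jitter_x(values, width=1.0, seed=1):
    """Give each value a pseudo-random x position in [0, width)."""
    rng = random.Random(seed)  # fixed seed keeps the layout repeatable
    return [(rng.random() * width, v) for v in values]

# Hypothetical percentages; several countries share the same value.
values = [10.0, 10.0, 10.0, 22.5, 22.5, 40.0]

# Without jitter every dot sits at x = 0, so equal values stack exactly.
stacked = Counter((0.0, v) for v in values)
print("most dots on one spot, no jitter:", max(stacked.values()))

# With jitter each dot gets its own x, so the stacks come apart.
spread = Counter(jitter_x(values))
print("most dots on one spot, jittered:", max(spread.values()))
```

The y values are untouched; only the horizontal position changes, which is why jitter doesn't distort what the dot plot is measuring.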

Update: There is a hidden random function in Tableau, Random(). This function only works with the Tableau Data Engine, which means it works with an Extract or with a Live Connection where Tableau creates a temporary extract, for example an Excel or text file.

Step 1: Create a calculated field

Calculated Field Name: jitter
Formula: random()

Step 2: Apply Jitter

Move the new field jitter to Columns

Method 1 - Using the Index() function

NOTE - The First() or Last() function can be used in place of the Index() function

Step 1: Create a calculated field

Calculated Field Name: jitter
Formula: index()

Step 2: Apply Jitter

Move the new field jitter to Columns
Click the down arrow on jitter (where the triangle shows)
Click Compute using and select Country / Region

You now have a jittered dot plot overlaid on the box plot.
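It helps to see that this method isn't random at all. As a rough Python sketch (with made-up rows, a hypothetical index_jitter helper, and the assumption that the marks are addressed in alphabetical order of country), computing Index() along Country / Region simply numbers the marks 1, 2, 3, and so on, so every dot gets its own column position.

```python
def index_jitter(rows):
    """Number each mark 1..N, roughly what INDEX() does along Country / Region."""
    # Assumption: marks are addressed in alphabetical order of country.
    ordered = sorted(rows, key=lambda row: row["country"])
    return [dict(row, jitter=position) for position, row in enumerate(ordered, start=1)]

# Hypothetical rows for illustration only.
rows = [
    {"country": "Norway", "women_pct": 39.6},
    {"country": "Chile", "women_pct": 14.2},
    {"country": "Japan", "women_pct": 8.1},
]

for row in index_jitter(rows):
    print(row["jitter"], row["country"], row["women_pct"])
```

Because the positions are just a running count, the spread is even and repeatable, which is usually all a jittered dot plot needs.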

Manually resize the chart area width making it narrow
Right click on the X-axis and uncheck Show Header
Right click on the chart area and select Format
Under the format options click Line
Set Grid Lines and Zero Lines to None

The final result should look like this. Apply further formatting as needed. One formatting option to consider if using box plots is to set the fill to 50% transparency (or lower). This will allow the dots to show through better on the box plot.

Method 2 - Create a pseudo-random number in Tableau

Step 1: Create a calculated field

Calculated Field Name: Random Number
Formula: ((PREVIOUS_VALUE(MIN(327680)) * 1140671485 + 12820163) % (2^24))

NOTE - 327680 is a seed value used by VB to generate pseudo-random numbers. This can be replaced with a parameter if desired.

Calculated Field Name: Random Int
Formula: INT([Random Number] / (2^24) * 100) + 1

NOTE - the value 100 can be any number for the upper limit. In this case, it will return pseudo-random numbers between 1 and 100.
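For anyone who wants to check the arithmetic outside of Tableau, here is a small Python sketch of the same recurrence. The constants come from the formulas above; the function names are my own, and I am assuming that PREVIOUS_VALUE returns the seed for the first mark and the prior result for every mark after that.

```python
MULTIPLIER = 1140671485
INCREMENT = 12820163
MODULUS = 2 ** 24
SEED = 327680  # same seed as the Random Number formula

def random_numbers(count, seed=SEED):
    """Return `count` successive values of the Random Number recurrence."""
    value = seed
    results = []
    for _ in range(count):
        # Each mark feeds the previous mark's result back into the formula.
        value = (value * MULTIPLIER + INCREMENT) % MODULUS
        results.append(value)
    return results

def random_int(value, upper=100):
    """Scale a Random Number into an integer from 1 to `upper`."""
    return int(value / MODULUS * upper) + 1

for number in random_numbers(5):
    print(number, random_int(number))
```

Since the seed is fixed, the sequence is the same every time, so the dots land in the same place on every refresh. Swapping the seed for a parameter, as noted above, changes the layout.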

Step 2: Apply Jitter

Move Random Int to Columns
Click the down arrow on Random Int (where the triangle shows)
Click Compute using and select Country / Region

You now have a jittered dot plot overlaid on the box plot.

Manually resize the chart area width making it narrow
Right click on the X-axis and uncheck Show Header
Right click on the chart area and select Format
Under the format options click Line
Set Grid Lines and Zero Lines to None

The final result should look like this. Apply further formatting as needed.

Joshua Milligan at VizPainter.com has additional instructions and notes outlined for this method here

Method 3 - Create a pseudo-random number in the data source

One of the easiest methods is adding a single field of pseudo-random numbers in the data source before loading into Tableau. This is very easy in Excel and SQL.

In Microsoft Excel use the function RANDBETWEEN(1,100) to return a number between 1 and 100.
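In SQL use RAND() to return a pseudo-random float value between 0 and 1. Be aware that in SQL Server a bare RAND() is evaluated once per query, so every row gets the same value unless you seed it per row, for example RAND(CHECKSUM(NEWID())).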

Once this is loaded into Tableau, simply use this field on Columns to apply jitter in the same manner as the instructions above.
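If the data lives in a CSV file rather than Excel or a database, the same idea can be sketched in a few lines of Python. This is only an illustration: the column name "jitter", the helper add_jitter_column, and the sample rows are all made up, and the 1 to 100 range simply mirrors the RANDBETWEEN example above.

```python
import csv
import io
import random

def add_jitter_column(csv_text, column="jitter", low=1, high=100, seed=None):
    """Return the CSV text with one extra column of pseudo-random integers."""
    rng = random.Random(seed)
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + [column]
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # randint is inclusive on both ends, like RANDBETWEEN(1,100).
        row[column] = rng.randint(low, high)
        writer.writerow(row)
    return output.getvalue()

# Hypothetical sample data for illustration only.
source = "Country,Women in parliament\nChile,14.2\nJapan,8.1\nNorway,39.6\n"
print(add_jitter_column(source, seed=2014))
```

Writing the numbers into the file means they are fixed once the data is loaded, so the dots won't move when the viz is refreshed.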

Method 4 - Use SQL pass-through or Integration with R to generate a pseudo-random number

Directions are outlined here on creating a SQL pass-through.

There are loads of options in R to generate pseudo-random numbers.

sample(1:100, 5, replace=T) will return 5 integers between 1 and 100, sampled with replacement.

Both of these options are limiting, since they require a connection and won't work with Tableau Public.

Method 5 - Get creative with the Data

This technique won't work in every case, but in some cases there may be a field that can be used to jitter the points. In fact, in this particular case it may give you additional insight into the data. In this dataset, Country / Region has a geographic role that matches Tableau's Country/Region. Because of this, Tableau automatically generates a latitude and longitude for each country.

Move Longitude (generated) to Columns

Notice that the placement of the jitter matches the countries moving from west to east. This could be useful in this case to see patterns in the data. Again, this won't work in every case. For example, creating small multiples by adding Region to Columns will produce very different results using Methods 1-3 vs. this method, which will group the points together.

This example shows the use of jitter on a dot plot with performance bands.

This example uses Method 3 creating a field for jitter in the data source and plotting a dot plot to show exam grades across semesters (click the image below for Tableau Public viz).

This is Jon's updated visualization after applying jitter.

Final Note - two items on my "Tableau Wish List".

1.) A function for Random Numbers (See note above - the Random() function does work with certain data connections)
2.) A checkbox for dot plots, scatter plots and shapes that would automatically apply jitter. In R using ggplot it is as simple as adding "position=position_jitter(width=1,height=.5)". It would be great if it were simply a checkbox and a slider to control the amount of jitter.
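
To show what a width and height control would amount to, here is a rough Python sketch. It assumes the offsets are drawn uniformly from plus or minus the width and plus or minus the height, which is my understanding of how ggplot's position_jitter behaves; the function below only borrows the name and is not the ggplot implementation.

```python
import random

def position_jitter(points, width=1.0, height=0.5, seed=None):
    """Shift each (x, y) point by a uniform offset within +/-width and +/-height."""
    rng = random.Random(seed)
    return [
        (x + rng.uniform(-width, width), y + rng.uniform(-height, height))
        for x, y in points
    ]

# Hypothetical points that would otherwise sit on top of each other.
points = [(1, 10.0), (1, 10.0), (1, 10.0)]
for x, y in position_jitter(points, width=1, height=0.5, seed=7):
    print(round(x, 3), round(y, 3))
```

A slider would simply change the width and height values, with zero meaning no jitter at all.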