I need to start this post by stating that I am a big fan of Cole Nussbaumer's work at Storytellingwithdata.com. If you are not familiar with her data visualization blog then I would highly recommend that you follow her. Cole offers practical advice for designing great data visualizations and typically has wonderful redesign examples on her website. With that said, I would like to examine one of her redesigns that I came across today and offer an alternative approach.
The original data visualization and her redesign notes are here. This is the original data visualization:
Here is Cole's redesign:
She was gracious enough to post the Excel file that she used to create her redesign. It's available here if you are interested in exploring further.
Point #1: There is a data error in her redesign.
If you examine the original visualization closely you will see that Competitor 5 leads in Category 1 and is in last place in Category 2. In Cole's redesign, however, this is not the case. It appears that Cole made an innocent mistake when copying and pasting values from row to row in Microsoft Excel when she rescaled the data. Categories 1 and 6 are actually the only ones that are correctly assigned.
In her redesign:
Category 2 is actually Category 3
Category 3 is actually Category 5
Category 4 is actually Category 2
Category 5 is actually Category 4
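One way to catch this kind of mix-up is to compare the redesign's values back against the original, category by category. The sketch below is a hypothetical check in Python. The numbers are made up for illustration (they are not the values from Cole's Excel file), and the function names are my own. The only things taken from this post are the 1.1 shift and the category mapping listed above.

```python
import math

SHIFT = 1.1  # the constant added to every value in the redesign

# Hypothetical index values for one competitor, keyed by category number.
# These are illustrative only, not the values from the Excel file.
original = {1: 0.62, 2: -0.85, 3: 0.10, 4: -0.30, 5: 0.45, 6: 0.05}

# Simulate the copy-and-paste mix-up described above:
# redesign category -> original category whose data it actually holds.
mixup = {1: 1, 2: 3, 3: 5, 4: 2, 5: 4, 6: 6}
redesign = {cat: original[src] + SHIFT for cat, src in mixup.items()}


def find_source_category(redesign_value, original_values, shift=SHIFT):
    """Return the original category whose shifted value matches, or None."""
    for category, value in original_values.items():
        if math.isclose(value + shift, redesign_value, abs_tol=1e-9):
            return category
    return None


def mislabeled_categories(original_values, redesign_values):
    """Map each redesign category to its true source where the two differ."""
    problems = {}
    for category, value in redesign_values.items():
        source = find_source_category(value, original_values)
        if source != category:
            problems[category] = source
    return problems


if __name__ == "__main__":
    for shown, actual in mislabeled_categories(original, redesign).items():
        print(f"Category {shown} is actually Category {actual}")
```

A check like this only works when the values within a competitor are distinct, which is an assumption of the sketch. With real data you would run it for every competitor and look for a consistent pattern.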
We all make mistakes. I've certainly made many of them over the years. The only advice I can offer you in this area is 1.) always check your work and 2.) if possible, get additional eyes to examine your final product. It's not a foolproof solution, but it will help you catch errors like this. This error is hidden fairly well. It is hard to recognize immediately because of the way color is being used with the bar chart. This leads to the next issue that I have with this redesign.
Point #2: It is very hard to determine one competitor from another.
Cole uses a very simple color palette, a blue-gray color palette that she uses often. It's a very pleasing palette and is non-distracting. However, because all of the competitors are gray she uses the order of the bars to distinguish one competitor from another. This requires the eye to go back and forth to the legend to determine the order and then count down the bars to determine which competitor is which. As a result it is very difficult to quickly determine which competitor has the lead, or any given position, in each category.
Point #3: The scale has been arbitrarily rescaled.
Transforming variables is a standard practice in the field of data analytics, and while I like her concept, this rescaling could cause tremendous confusion in certain circumstances. The original visualization has the data points labelled as "weighted performance index". This seems very specific and is likely calculated by the business in a certain way. Cole has rescaled this to a positive scale by adding 1.1 to every number. However, there is no indication in her visualization that it was rescaled, there is no axis to provide a zero baseline or the range of the values, and she uses the same "weighted performance index" label as the original visualization. This has the potential to cause great confusion in a real-world application. As an example, what would happen if readers of the report were used to seeing negative values for the "weighted performance index"? There would be lots of questions. In addition, any comparisons to previous reports would be lost. I realize that this may not be an issue if this were a new report or if the scale were not a standard business metric, but I would prefer a redesign solution that does not rescale the data.
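To make the effect of the shift concrete, here is a small Python sketch. It uses the two "our business" values that come up in Point #4 of this post (0.19 and 0.51) plus one negative value that is purely hypothetical. Adding 1.1 keeps the order and the gaps between values intact, but the numbers shown no longer match the index the business reports, and the relative lengths of the bars change.

```python
SHIFT = 1.1  # constant added to every value in the redesign

# 0.19 and 0.51 appear in this post; -0.40 is a hypothetical negative value.
index_values = [-0.40, 0.19, 0.51]
shifted = [round(v + SHIFT, 2) for v in index_values]


def differences(values):
    """Gaps between neighboring values, rounded to avoid float noise."""
    return [round(b - a, 2) for a, b in zip(values, values[1:])]


# Ratio of the two positive values before and after the shift.
ratio_before = index_values[2] / index_values[1]
ratio_after = shifted[2] / shifted[1]

if __name__ == "__main__":
    print("original:", index_values)
    print("shifted: ", shifted)
    print("same gaps:", differences(index_values) == differences(shifted))
    print(f"0.51 vs 0.19 bar-length ratio: {ratio_before:.2f} -> {ratio_after:.2f}")
```

The gaps survive the shift, so the ranking is unaffected. What changes is everything a reader would compare against a previous report: the sign of the values and how long one bar looks relative to another.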
Point #4: The ranking doesn't tell the whole story.
I really like the ranking idea to easily compare "our business" to the competitors. However, it doesn't tell the whole story. For example, consider Category 1. In this category Competitor 1 and "our business" have the same value. This is a common problem with forced rankings. There is a tie for 5th place and a decision has to be made on how to handle that order. In this case Cole made the choice to position "our company" in last place.
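The tie problem is easy to see in a few lines of Python. The values below are hypothetical (only the tie between Competitor 1 and "our business" and Competitor 5's lead come from the chart), and the two functions are my own illustration, not how Cole built her ranking. A forced ranking hands out every position exactly once, so one of the tied entries has to land lower. A standard competition ranking gives tied values the same position.

```python
# Hypothetical Category 1 values; only the tie is taken from the chart.
category_1 = {
    "Competitor 5": 0.90,
    "Competitor 4": 0.70,
    "Competitor 3": 0.40,
    "Competitor 2": 0.10,
    "Competitor 1": -0.20,
    "Our business": -0.20,
}


def forced_ranking(scores):
    """Give every entry a unique position; ties fall back to input order."""
    ordered = sorted(scores, key=scores.get, reverse=True)  # stable sort
    return {name: position for position, name in enumerate(ordered, start=1)}


def competition_ranking(scores):
    """Give tied values the same position (1, 2, 2, 4 style)."""
    ranks = {}
    for name, value in scores.items():
        higher = sum(1 for other in scores.values() if other > value)
        ranks[name] = higher + 1
    return ranks


if __name__ == "__main__":
    forced = forced_ranking(category_1)
    shared = competition_ranking(category_1)
    for name in category_1:
        print(f"{name}: forced={forced[name]}, competition={shared[name]}")
```

In the forced version the order of the input decides who ends up in last place, which is exactly the kind of arbitrary choice a reader cannot see in the finished chart.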
There are other nuances that are also lost with the forced ranking system. For example, in Category 5 (Category 4 in Cole's redesign) "our business" ranks 3rd, but the performance in this category is much higher than the performance in Category 3 (Category 5 in Cole's redesign), where it ranks 2nd. In other words, the fact that 0.51 is higher than 0.19 is overshadowed because ranking becomes the focal point of the comparison. The relative performance in each category is lost. Ideally we could see both: the ranking within a single category and the relative comparison across categories.
Consider the other important questions that will likely be asked:
Who is the best-in-class of all the competitors (i.e. the biggest competition), and who is the worst?
Is this easy to determine from the visualization?
Point #5: Is this a bar chart or a rotated histogram?
Bar charts are used for categorical comparison and histograms are used to show distributions. The general guidance on bar charts is to have some space between the bars. There is a fine line between too much space and too little space. Histograms usually have little or no space between the bars, to indicate that the bins belong to one continuous distribution.
Unfortunately there is not much research in this area, but there are some general guidelines that experts suggest. For example, Stephen Few addresses this in his book Show Me the Numbers as well as a post on his blog Perceptual Edge here. Even using a minimal space in this case would be better than no space at all. Without any space between them, the bars come together in a large block, which makes the chart very difficult to read.
Some of these points could be resolved by making a few minor changes to Cole's visualization. For example, in this redesign I kept the scale the same as the original and added a little space between the bars to make them easier to read.
This doesn't solve point 2, i.e. the difficulty of determining the competitor quickly. Using color is another option, for example using a categorical or sequential color scheme to double encode the competitor with the bar order. This example uses a sequential gray to help differentiate the competitor.
This isn't ideal and I can see why Cole avoided this in her redesign. Color becomes distracting and using a categorical color scheme is even worse. The other problem introduced with this approach is that the reader is forced to compare the lengths of the bars in two different directions when trying to make exact comparisons.
My final redesign uses the original concept of a dot plot. In the original design different shapes and colors were used to differentiate the competitors. Using shapes other than circles or dots is not ideal, but I will save that detailed discussion for another post. The use of color in the original is also not good because they are mixing bright alerting colors of yellow and orange with more subtle blues and purple.
Below is my redesign done in Tableau. Click here for the Tableau Public workbook.
A simple dashboard highlighting action in Tableau makes it even easier to examine each competitor in detail. If creating a static report, small multiples could be used to make quick comparisons. It's now very easy to see that Competitor #4 is the best overall and Competitors #2 and #3 are the two worst.
I hope you found this redesign discussion interesting, and make sure to follow Cole's blog! As always, if you have any questions feel free to email me at Jeff@DataPlusScience.com