Real-Time Traffic in Tableau Using Integration with R
8/17/2014





Back in June, I started playing with embedding traffic maps in Tableau. Many of the websites simply don't work. For example, embedding Google or Yahoo maps with a traffic layer as a URL in a Tableau Web Page doesn't work in the same manner as a regular map, Bird's Eye view or Street View.

For example, from the webpage maps.yahoo.com enter "Cincinnati, OH". The resulting URL looks like this:

https://maps.yahoo.com/place/?lat=39.1071529333471&lon=-84.50477600097656&q=cincinnati%2C%20oh&bb=39.285951221980014%2C-84.9184799194336%2C38.92843409820932%2C-84.09107208251953&addr=cincinnati%2C%20oh

Once this displays, there is an option to click traffic and that works fine, but we need to be able to pass a parameter via a URL to automatically turn on traffic, and we need to be able to embed this in Tableau.

The URL to embed is slightly different; it requires an "/e/" in the URL. The string "&trf=1" at the end of the URL turns on the traffic layer.
https://maps.yahoo.com/e/?lat=34.052234&lon=-118.243685&bb=34.135234%2C-118.410685%2C33.969234%2C-118.076685&trf=1

Some web sites use a "zoom=" format to set the zoom level, but other web sites allow for pinpoint accuracy of the viewing window. This is done with the "bb=" which sets the bounding box. A bounding box is an area set by passing two points using latitude and longitude, the bottom left point and the top right point of the "box". Click here for more information on bounding boxes.

Step 1: Build the Starting Data

The starting data file was built from a list here of the top 100 cities in the United States. The next step was geocoding these cities using the batch geocoding interface at findlatitudeandlongitude.com.
Download the finished data file here.

After importing this into Tableau, we can set the bounding box for the viewing area needed with a few simple calculations.

Step 2: Create Some Calculated Fields

   Calculated Field Name: Lat 2
   Formula: [Latitude] +.083

   Calculated Field Name: Long 2
   Formula: [Longitude] - .167

   Calculated Field Name: Lat 3
   Formula: [Latitude] -.083

   Calculated Field Name: Long 3
   Formula: [Longitude] + .167

   Move all Latitude and Longitude fields from Measures to Dimensions

Step 3: Build a Dashboard

   Plot the cities on a map
        Move Longitude to Columns
        Move Latitude to Rows
        Select the points on Alaska and Hawaii and exclude (optional)
        Move Long 2,Lat 2,Long 3, Lat 3, State and Location to Details

   Create a Dashboard and insert the map on a dashboard
   Add a Web Page using this URL with a latitude and longitude for the starting point
         https://maps.yahoo.com/e/?lat=34.052234&lon=-118.243685&bb=34.135234%2C-118.410685%2C33.969234%2C-118.076685&trf=1
  Click Dashboard then Action then Add Actions and select URL
  For "Run action on:" Click Select and enter the following URL
         https://maps.yahoo.com/e/?lat=<Latitude>&lon=<Longitude>&bb=<Lat 2>%2C<Long 2>%2C<Lat 3>%2C<Long 3>&trf=1

You should now have an interactive map that brings up a real-time traffic map in Yahoo when a city is selected.


This was relatively easy to work out, but I wanted to take this to the next level. I wanted the dots on the map to show the current condition of traffic across the country. There were a number of methods I explored to do this, but each one had its own challenges and I was unsuccessful.

After coming back to the problem a number of times over the last 2 months, I finally worked out a solution. Yahoo maps actually utilize the "Here Map Platform". The Here Platform has an API solution that returns traffic information in XML format for each URL that is passed. The URL utilizes the same format as Yahoo using the bounding box. However, the URL also requires an app_id and app_code. This is free and easy to obtain. You can sign up for a "Base" account here. You will need to obtain the app_id and app_code before continuing on to the next step.

Step 4: Build URL to Pass to R

  Create a calculated field

   Calculated Field Name: TrafficURL
   Formula: "http://traffic.cit.api.here.com/traffic/6.1/flow.xml?app_id=INSERT YOUR APP_ID&app_code=INSERT YOUR APP_CODE&bbox=" + str([Lat 2]) + "," + str([Long 2]) + ";" + str([Lat 3]) + "," + str([Long 3])

The XML that will be returned looks something like this:


The example above shows what the nodes in the XML file looks like when querying Los Angeles, California. All of the roads that have traffic information within the bounding box are returned in the XML file. Pay particular attention to the "JF=0.0" and "JF=0.83954". The "JF" is the "Jam Factor" which is the measurement for how bad the traffic is on each road that is listed. It is a number from 0 to 10, with 0 indicating the road completely clear and 10 indicating the road completely closed. The goal then is to get the average "Jam Factor" for each area, i.e. the viewing area, and return it to Tableau.

Summary of steps for each city, i.e. viewing area:
URL for each city goes to R -> Get XML file from Here.com -> Parse all the Jam Factors -> Average all of the Jam Factors -> Return the average to Tableau -> Map


Step 5: Setup Rserve Connection

To parse out and return the average Jam Factor for each city we will use R and a package called XML. This technique also requires using the package Rserve with Tableau as described in Tableau 8.1 and R by Bora Beran here.

   Install and load the packages Rserve and XML in RStudio

      install.packages("Rserve") # only needs to be done once
      install.packages("XML") # only needs to be done once
      library("Rserve")
      Rserve()
      # The XML package will load from Tableau when necessary

   NOTE - Make sure the R connection is set up in Tableau.

Step 6: Enter the R Code

   Create a Calculated Field
   Calculated Field Name: Jam Factor
   Formula:
      SCRIPT_REAL("
      library(XML);
      doc = xmlParse(.arg1);
      a <- xmlToList(doc);
      a1 <- unlist(a, recursive=TRUE);
      b <- as.data.frame(a1);
      b$name <- substr(names(a1), nchar(names(a1))-1, nchar(names(a1)));
      final <- subset(b, name == 'JF');
      final$JamFactor <- as.numeric(levels(final$a1))[final$a1];
      mean(final$JamFactor)
      ", ATTR([TrafficURL]))

Description of R Code:
      # Load "XML" package in R
      # Download and Parse the XML for the URL (.arg1) into doc
      # Put XML to List
      # Unlist the doc for data frame
      # Put in a data frame
      # Parse the last 2 characters of the field
      # Select the records with "JF" to get all of the Jam Factors
      # Create a numeric value for the factors
      # Average all of the Jam Factors in the area and return the average to Tableau

   NOTE - Jam Factor should be Continuous.

Step 7: Format the Map

   On the first map we created:
      Move Jam Factor to Color
      Edit Color to "Red-Blue Diverging" and "Reversed"
      Click Advanced
      Check Start and set to 0
      Check End and set to 2.5
      Check Center and set to 1.25


You should now have a map with real-time traffic for the top 100 cities. I added a list of cities with the Jam Factor in a list. The entire visualization takes about 2 minutes to load, but considering it's downloading 100 XML files, parsing tens of thousands of Jam Factors and returning it to Tableau that seems pretty reasonable. Download my Tableau workbook here.


A few notes:

1.) The "Jam Factor" is a number between 0 and 10. However, the scale for heavy traffic on the visualization shows heavier traffic above 1.25 with dark red at 2.5. This is because the overall traffic of the area is averaging the Jam Factor for every street in the bounding box that has traffic information. Unless some major event occurs, e.g. the ice storm in Atlanta this past January, then it would be very unlikely for every street to have a high Jam Factor. Looking at traffic during rush hour for the major cities, I determined that using 2.5 as the upper bound seemed to be the right range for indicating heavy traffic in a city.

2.) Most traffic maps use red and green in the color palette. Being a professor in data visualization, I can't bring myself to use red-green because of the issues with color-blindness. I chose a red-blue palette instead so that the red would be consistent with the map.


I hope you find this helpful. If you have any questions feel free to email me at Jeff@DataPlusScience.com

Jeffrey A. Shaffer
Follow on Twitter @HighVizAbility

Edited by Breanne LaCamera 1/26/2015 and posted 2/17/2015