Real-Time Traffic in Tableau Using Integration with R
Back in June, I started playing with embedding traffic maps in Tableau. Many of the websites simply don't work. For example, embedding Google or Yahoo maps with a traffic layer as a URL in a Tableau Web Page doesn't work in the same manner as a regular map, Bird's Eye view or Street View.
For example, from the webpage maps.yahoo.com enter "Cincinnati, OH". The resulting URL looks like this:
Once this displays, there is an option to click traffic and that works fine, but we need to be able to pass a parameter via a URL to automatically turn on traffic, and we need to be able to embed this in Tableau.
The URL to embed is slightly different; it requires an "/e/" in the URL. The string "&trf=1" at the end of the URL turns on the traffic layer.
Some web sites use a "zoom=" format to set the zoom level, but other web sites allow for pinpoint accuracy of the viewing window. This is done with the "bb=" which sets the bounding box. A bounding box is an area set by passing two points using latitude and longitude, the bottom left point and the top right point of the "box". Click here for more information on bounding boxes.
Step 1: Build the Starting Data
The starting data file was built from a list here of the top 100 cities in the United States. The next step was geocoding these cities using the batch geocoding interface at findlatitudeandlongitude.com. Download the finished data file here.
After importing this into Tableau, we can set the bounding box for the viewing area needed with a few simple calculations.
Step 2: Create Some Calculated Fields
Calculated Field Name: Lat 2
Formula: [Latitude] +.083
Calculated Field Name: Long 2
Formula: [Longitude] - .167
Calculated Field Name: Lat 3
Formula: [Latitude] -.083
Calculated Field Name: Long 3
Formula: [Longitude] + .167
Move all Latitude and Longitude fields from Measures to Dimensions
Step 3: Build a Dashboard
Plot the cities on a map
Move Longitude to Columns
Move Latitude to Rows
Select the points on Alaska and Hawaii and exclude (optional)
Move Long 2,Lat 2,Long 3, Lat 3, State and Location to Details
Create a Dashboard and insert the map on a dashboard
Add a Web Page using this URL with a latitude and longitude for the starting point
Click Dashboard then Action then Add Actions and select URL
For "Run action on:" Click Select and enter the following URL
https://maps.yahoo.com/e/?lat=<Latitude>&lon=<Longitude>&bb=<Lat 2>%2C<Long 2>%2C<Lat 3>%2C<Long 3>&trf=1
You should now have an interactive map that brings up a real-time traffic map in Yahoo when a city is selected.
This was relatively easy to work out, but I wanted to take this to the next level. I wanted the dots on the map to show the current condition of traffic across the country. There were a number of methods I explored to do this, but each one had its own challenges and I was unsuccessful.
After coming back to the problem a number of times over the last 2 months, I finally worked out a solution. Yahoo maps actually utilize the "Here Map Platform". The Here Platform has an API solution that returns traffic information in XML format for each URL that is passed. The URL utilizes the same format as Yahoo using the bounding box. However, the URL also requires an app_id and app_code. This is free and easy to obtain. You can sign up for a "Base" account here. You will need to obtain the app_id and app_code before continuing on to the next step.
Step 4: Build URL to Pass to R
Create a calculated field
Calculated Field Name: TrafficURL
"http://traffic.cit.api.here.com/traffic/6.1/flow.xml?app_id=INSERT YOUR APP_ID&app_code=INSERT YOUR APP_CODE&bbox=" + str([Lat 2]) + "," + str([Long 2]) + ";" + str([Lat 3]) + "," + str([Long 3])
The XML that will be returned looks something like this:
The example above shows what the nodes in the XML file looks like when querying Los Angeles, California. All of the roads that have traffic information within the bounding box are returned in the XML file. Pay particular attention to the "JF=0.0" and "JF=0.83954". The "JF" is the "Jam Factor" which is the measurement for how bad the traffic is on each road that is listed. It is a number from 0 to 10, with 0 indicating the road completely clear and 10 indicating the road completely closed. The goal then is to get the average "Jam Factor" for each area, i.e. the viewing area, and return it to Tableau.
Summary of steps for each city, i.e. viewing area:
URL for each city goes to R -> Get XML file from Here.com -> Parse all the Jam Factors -> Average all of the Jam Factors -> Return the average to Tableau -> Map
Step 5: Setup Rserve Connection
To parse out and return the average Jam Factor for each city we will use R and a package called XML. This technique also requires using the package Rserve with Tableau as described in Tableau 8.1 and R by Bora Beran here.
Install and load the packages Rserve and XML in RStudio
install.packages("Rserve") # only needs to be done once
install.packages("XML") # only needs to be done once
# The XML package will load from Tableau when necessary
NOTE - Make sure the R connection is set up in Tableau.
Step 6: Enter the R Code
Create a Calculated Field
Calculated Field Name: Jam Factor
doc = xmlParse(.arg1);
a <- xmlToList(doc);
a1 <- unlist(a, recursive=TRUE);
b <- as.data.frame(a1);
b$name <- substr(names(a1), nchar(names(a1))-1, nchar(names(a1)));
final <- subset(b, name == 'JF');
final$JamFactor <- as.numeric(levels(final$a1))[final$a1];
Description of R Code:
# Load "XML" package in R
# Download and Parse the XML for the URL (.arg1) into doc
# Put XML to List
# Unlist the doc for data frame
# Put in a data frame
# Parse the last 2 characters of the field
# Select the records with "JF" to get all of the Jam Factors
# Create a numeric value for the factors
# Average all of the Jam Factors in the area and return the average to Tableau
NOTE - Jam Factor should be Continuous.
Step 7: Format the Map
On the first map we created:
Move Jam Factor to Color
Edit Color to "Red-Blue Diverging" and "Reversed"
Check Start and set to 0
Check End and set to 2.5
Check Center and set to 1.25
You should now have a map with real-time traffic for the top 100 cities. I added a list of cities with the Jam Factor in a list. The entire visualization takes about 2 minutes to load, but considering it's downloading 100 XML files, parsing tens of thousands of Jam Factors and returning it to Tableau that seems pretty reasonable. Download my Tableau workbook
A few notes:
1.) The "Jam Factor" is a number between 0 and 10. However, the scale for heavy traffic on the visualization shows heavier traffic above 1.25 with dark red at 2.5. This is because the overall traffic of the area is averaging the Jam Factor for every street in the bounding box that has traffic information. Unless some major event occurs, e.g. the ice storm in Atlanta this past January, then it would be very unlikely for every street to have a high Jam Factor. Looking at traffic during rush hour for the major cities, I determined that using 2.5 as the upper bound seemed to be the right range for indicating heavy traffic in a city.
2.) Most traffic maps use red and green in the color palette. Being a professor in data visualization, I can't bring myself to use red-green because of the issues with color-blindness. I chose a red-blue palette instead so that the red would be consistent with the map.