The results are in from the 2015 War of Tableau User Groups (TUG), here, and the Cincinnati TUG won! If you haven't seen the collection of final visualizations, you can see all of them here.
I thought it might be interesting to walk through our process and some of the challenges.
When the War of TUGs was first announced, I presented the idea at one of our Cincinnati Tableau User Group meetings. Attendance at that meeting was very strong, with more than 70 people in the room. When I asked who would be interested in participating in the War of TUGs, more than half of them showed interest. So Russell Spangler, one of our group leaders, registered the group for the event.
As a group, we needed to get together for our July meeting, whatever the day, download the data, and complete the viz during that meeting. We tried to narrow down a date through our group page and some emails. Unfortunately, we didn't get many responses, and the ones we did get conflicted. Being July, in the middle of summer, people were out of town or had other commitments. We settled on one of the dates, but as we got closer to it we lost a few more people.
Go or No-Go
Now came a choice: do we move forward as planned, or do we simply back out because of the low response and the conflicting dates? We even offered to move the meeting date, but time was running out and we didn't have many options. We noticed a video the Seattle TUG had posted online and saw that the data was on Ohio schools. So we agreed that 1) we had already committed to doing this and 2) the data was in our backyard, so we should participate.
Day of the competition
We set up a shared Dropbox folder and Russell downloaded the data early in the morning and had it all ready to go. Hamed and I planned out a strategy and shortly after noon we were off to the races. I started working with the data, which was performance data for the Ohio schools. I pointed Hamed to Tableau Mapping BI, here, where Tableau Zen Master Allan Walker had posted a file of the US School Districts. We had school district as a field in the file, but it needed to be matched up to connect the two data sources. Hamed began the process of matching, which also required some data cleaning because of many bad polygon points. I purchased and downloaded some images of a chalkboard and we all discussed the theme and agreed on the concept. I created some starter images for the front screen and the dashboard pages and found some cool chalk fonts. Russell worked on some design elements, making title images from the custom fonts and creating some social media icons, etc. Meanwhile, I started exploring the data.
Hitting a brick wall
We all run into challenges with the data. Data is never perfect; in fact, I tell my students, "Never trust the data". (Michael Wu prefers me to say "Never trust the assumption of the data", which may be more on point.) About 30 minutes into exploring the data and creating a bunch of charts to use as the basis for the visualization, I hit a brick wall. When I started mapping the schools, I discovered that the data was not correct. The school addresses, city, state and latitude/longitude were not correctly lined up with the school information. Each school had numerous columns of performance data, but nothing else was lining up. Schools showing an address in Cincinnati were showing up in a school district in Cleveland. I was able to locate a data source from the State of Ohio, so I was able to verify that the performance data was correct. What appeared to have happened was bad geocoding. It looked like the rows were displaced from one another in the geocoding process.
I called Russell and Hamed and we talked about it. We had three options: 1) bail, and just notify Shawn Wallwork that the data was bad and we couldn't proceed; 2) work with the bad data as best we could, making visualizations without school location or just allowing bad data; or 3) try to fix the data. I went back to the file from the State of Ohio and was able to append a new address (i.e. the correct address) for each school. Then I began geocoding. I typically use FindLatitudeandLongitude.com, but it was having some trouble and kept timing out on me, so I used another tool that I really like, Geocodio, to geocode the file. The file correction and geocoding really set us back, since it took time to build a new file and geocode it. We decided we would get together at my house around 6pm and start over with the new data. That gave us 4 hours.
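For anyone who wants to catch this kind of displacement without eyeballing a map, here is a minimal Python sketch of one possible check. It is not what we ran that day. It assumes each row has a city name plus latitude/longitude, and it flags rows whose coordinates sit far from an approximate center for that city. The field names, the sample rows, the 50 km threshold and the `flag_misplaced` helper are all made up for illustration.

```python
import math

# Approximate city-center coordinates, used only as rough reference points.
CITY_CENTERS = {
    "Cincinnati": (39.1031, -84.5120),
    "Cleveland": (41.4993, -81.6944),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def flag_misplaced(rows, max_km=50.0):
    """Return (school, distance_km) for rows far from their stated city."""
    flagged = []
    for row in rows:
        center = CITY_CENTERS.get(row["city"])
        if center is None:
            continue  # no reference point for this city, so skip it
        distance = haversine_km(row["lat"], row["lon"], *center)
        if distance > max_km:
            flagged.append((row["school"], round(distance)))
    return flagged


# Hypothetical rows: School B claims Cincinnati but is geocoded near Cleveland.
rows = [
    {"school": "School A", "city": "Cincinnati", "lat": 39.14, "lon": -84.50},
    {"school": "School B", "city": "Cincinnati", "lat": 41.48, "lon": -81.70},
]
suspects = flag_misplaced(rows)
```

A threshold like this is crude, but when rows have been shifted against each other, the flagged list gets long very quickly, which is the signal you are looking for.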
Hamed arrived around 6:30pm and we got started with the fresh, corrected file. He began matching again with the School District file. Meanwhile, I started putting together the visualization with the correct data. We knew we wanted to use school district and plot schools, so it really was important to have that corrected data.
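The matching step can be sketched in a few lines of Python. This is only an illustration of the general idea (normalize the district names in both sources, then join on the normalized key), not the process Hamed actually followed. The suffix list, the sample name spellings and the `match_districts` helper are assumptions for the example.

```python
import re

# Suffixes to strip before comparing names; longer ones are listed first.
SUFFIXES = (
    "city school district",
    "local school district",
    "school district",
)


def normalize(name):
    """Lowercase, drop punctuation and trim a trailing district suffix."""
    key = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    key = re.sub(r"\s+", " ", key).strip()
    for suffix in SUFFIXES:
        if key.endswith(suffix):
            key = key[: -len(suffix)].strip()
            break
    return key


def match_districts(performance_names, shape_names):
    """Map performance-file names to shape-file names; list the leftovers."""
    shape_index = {normalize(name): name for name in shape_names}
    matched, unmatched = {}, []
    for name in performance_names:
        key = normalize(name)
        if key in shape_index:
            matched[name] = shape_index[key]
        else:
            unmatched.append(name)
    return matched, unmatched


# Hypothetical spellings of district names in the two sources.
performance_names = [
    "Cincinnati City School District",
    "Cleveland Municipal School District",
    "Lakota Local School District",
]
shape_names = ["CINCINNATI", "Cleveland Municipal"]

matched, unmatched = match_districts(performance_names, shape_names)
```

The unmatched list is the useful part: it is the set of names that still need a manual look before the two sources can be connected.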
We worked right down to the wire, submitting the visualization to Shawn at 11:57pm. We weren't done, but we were out of time, having lost a good bit of time dealing with the data problem and having to redo some things. All in all, we were pretty close to where we wanted to be. We did go back and add the finishing touches to the visualization. We also wanted to build some additional search functionality for School District and School, maybe text box searching, but we just didn't have the time to get those in.
Never trust the data. When looking at a new data set, it's always important to look around and check the data. Look at min, max, range and nulls, and examine the dimensions. In general, just poke around. In this particular case, as soon as I mapped the schools I could see there was a problem with the data. When I filtered on the "Cincinnati" school district and saw schools showing up all over the state of Ohio, that was a big red flag. My immediate thought was that I did something wrong, a bad filter or the wrong field, but when I went back to examine the data I could quickly see there was an issue with the underlying data.
Don't give up. We had problem after problem. Conflicting schedules, people dropping out, bad data issues that we needed to resolve, geocoding problems and a ticking clock. We kept pressing forward and at the end of the day we were really pleased with what we had accomplished in such a short period of time. It was a great opportunity to compete with some great Tableau User Groups across the world and we had fun doing it.
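As a rough illustration of that "poke around" step, here is a small Python sketch that summarizes nulls, min, max and range for a few columns. It is a generic example, not something from our workbook. The column names, sample rows and the `profile` helper are hypothetical.

```python
def profile(rows, columns):
    """Summarize nulls, min, max, range and distinct count per column."""
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None and v != ""]
        numeric = bool(present) and all(
            isinstance(v, (int, float)) for v in present
        )
        summary[col] = {
            "nulls": len(values) - len(present),
            "min": min(present) if present else None,
            "max": max(present) if present else None,
            # A range only makes sense for numeric columns.
            "range": (max(present) - min(present)) if numeric else None,
            "distinct": len(set(present)),
        }
    return summary


# Hypothetical rows with a couple of missing values.
rows = [
    {"school": "School A", "lat": 39.14, "score": 82.5},
    {"school": "School B", "lat": 41.48, "score": None},
    {"school": "School C", "lat": None, "score": 91.0},
]
report = profile(rows, ["school", "lat", "score"])
```

Even a summary this small would show, for example, latitude values spanning the whole state inside what is supposed to be a single district.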