Visualising HK Transport Department traffic accident data in Google Fusion Tables

[Screenshot: Transport Department, Year 2008 page]
Step One: Download the data from the Transport Department website at http://www.td.gov.hk/en/road_safety/road_traffic_accident_statistics/2008/index.html. Scroll down and you will find a link to Road Traffic Accident Database 2008.

[Screenshot: Google Fusion Tables]
Step Two: Import to Google Fusion Tables. You have to save each sheet of the XLS file as an individual CSV, since Fusion Tables takes only one table at a time, and the row limit is lower for XLS files.

[Screenshot: Google Fusion Tables, Vehicles involved in Road Traffic Accidents in 2008 (Hong Kong)]
Step Three: Visualise. Here, we see that an overwhelming proportion of casualties on the road in 2008 involved men (coded as 1 in the data), but it might just be because of demographics.

Because the data has little unique per-record information to plot (such as the date and time of each accident), my suggestion is to aggregate on your column of interest (say, driver sex), plot that column as the entity, and use the count as your value. It could be nice to mix and match two criteria (are young men more frequently involved in accidents?).
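If you would rather do the same aggregation outside Fusion Tables, here is a minimal sketch in Python using only the standard library. The column names (`casualty_sex`, `casualty_age`), the sample rows and the helper names are all made up for illustration; the real CSVs will have their own headers. Sex coded as 1 for men follows the data, while 2 for women is my assumption here.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the shape of a casualties CSV export.
# Column names and rows are invented for illustration only;
# 1 = male as in the data, 2 = female is assumed.
SAMPLE_CSV = """casualty_sex,casualty_age
1,23
1,31
2,27
1,23
2,45
1,58
"""


def count_by(rows, *columns):
    """Count rows for each distinct combination of values in `columns`."""
    return Counter(tuple(row[c] for c in columns) for row in rows)


def age_band(age, width=10):
    """Map an age such as "23" to a band label such as "20-29"."""
    low = (int(age) // width) * width
    return f"{low}-{low + width - 1}"


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# One criterion: the count per sex becomes the value to plot.
by_sex = count_by(rows, "casualty_sex")

# Two criteria: add a derived age band, then count per (sex, band) pair.
for row in rows:
    row["age_band"] = age_band(row["casualty_age"])
by_sex_and_age = count_by(rows, "casualty_sex", "age_band")

for key, n in sorted(by_sex_and_age.items()):
    print(key, n)
```

Grouping ages into bands keeps the two-criteria table small enough to read; swap `io.StringIO(SAMPLE_CSV)` for an opened CSV file to try it on the real export.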

If you want to play with the data yourself, here are the links to the tables, as imported in Google Fusion Tables:

1. Road Traffic Accident Stats in 2008: http://tables.googlelabs.com/DataSource?dsrcid=224727

2. Vehicles involved in Road Traffic Accidents in 2008: http://tables.googlelabs.com/DataSource?dsrcid=225310

3. Casualties in Road Traffic Accidents in 2008: http://tables.googlelabs.com/DataSource?dsrcid=225311

Here is how driver casualties compare by age, depending on whether the casualty was male or female (note that the scales differ, the counts being much lower for women).


Male driver casualties in 2008 (plotted by age on the x-axis)


Female driver casualties in 2008 (plotted by age on the x-axis)


Overall driver casualties in 2008 (plotted by age on the x-axis)

The current problem with Google Fusion Tables (which is still a Labs product) is that it won’t let you compare more than two criteria at the same time in a practical format. For instance, I can’t superimpose graphs for deaths per sex and per age in a single view. It sounds like a pretty basic feature, so I wouldn’t be terribly surprised if it appeared in a couple of months, if not weeks.

The quality of the data is also questionable, since maybe 60-70 people listed as “drivers” are aged 16 or less… Did they mean these people were in the driver’s seat, or actually driving, when the accident occurred?!
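A quick way to pull out such rows for a closer look is a simple filter. This is only a sketch: the field names (`role`, `age`), the helper name and the records are hypothetical, and the real tables code these columns differently.

```python
# Hypothetical records; field names and values are invented for illustration.
records = [
    {"role": "driver", "age": 34},
    {"role": "driver", "age": 12},
    {"role": "passenger", "age": 9},
    {"role": "driver", "age": 16},
    {"role": "driver", "age": 17},
]


def suspicious_drivers(records, max_age=16):
    """Return records listed as drivers whose age is at or below `max_age`."""
    return [r for r in records if r["role"] == "driver" and r["age"] <= max_age]


flagged = suspicious_drivers(records)
print(len(flagged), "driver records aged 16 or less")
```

Listing the flagged rows next to their vehicle type would help tell a coding quirk from a genuine under-age driver.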

***

On another note, I also imported the news agencies database from China’s General Administration of Press and Publication, the state agency regulating news and print publication in the PRC. The data was retrieved around March 2010 from www.gapp.gov.cn using custom scripts that systematically read the GAPP’s web pages. After parsing it into a database-friendly format, I used it to build the China Media Map, which might soon start to include our annotations.

But frankly, there isn’t much to visualise with this data, aside from location, since it has no contextual data attached to it (it’s just an address/phone book, basically). If you can think of something to do with it, drop me a line.
