Over the last year or so, I've been working with the NYPD Motor Collision Data Reports. I thought they would be very useful for mapping, provided they could be formatted properly. The problem is that the reports are released as PDF documents, which makes the data hard to extract and edit. The data is also aggregated by month to the closest intersection, which limits what can be done with it because one does not know where each incident actually occurred.
For this analysis I looked at the "Bicycle" count in the Vehicle Type column. This way, I was looking at total bicycles involved in collisions (as reported) instead of just cyclist injuries.
The NYPD Crash Band-Aid project has been working hard on developing Python scripts that extract the PDF documents into a usable and open format: .csv. I was able to contribute to the project by providing the initial seeding of latitude and longitude coordinates for most of the intersections. The table of intersections was created using the Department of Planning LION street file. Additional intersections were geocoded using the Dept. of Planning GeoSupport desktop geocoder. Currently, just over 99% of the intersections have coordinates.
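As a rough illustration of what seeding coordinates amounts to, here is a minimal Python sketch that joins collision rows to a lookup table of intersection coordinates and reports how many rows matched. It is not the project's actual code: the column names, the `normalize` and `attach_coordinates` helpers, and the sample coordinates are all made up for the example.

```python
import csv
import io

def normalize(name):
    """Uppercase and collapse whitespace so spelling variants share one key."""
    return " ".join(name.upper().split())

def attach_coordinates(collision_rows, coords_by_intersection):
    """Split rows into those with a known intersection and those without."""
    matched, unmatched = [], []
    for row in collision_rows:
        coords = coords_by_intersection.get(normalize(row["intersection"]))
        if coords is None:
            unmatched.append(row)
        else:
            lat, lon = coords
            matched.append({**row, "lat": lat, "lon": lon})
    return matched, unmatched

# Hypothetical sample data; the real .csv layout may differ.
SAMPLE_CSV = """intersection,vehicle_type,count
Broadway and Canal St,Bicycle,2
 1 ave and  e 14 st,Bicycle,1
Unknown Rd and Nowhere Ave,Bicycle,1
"""

# Illustrative coordinates only, not taken from the LION file.
coords = {
    normalize("Broadway and Canal St"): (40.7190, -74.0020),
    normalize("1 Ave and E 14 St"): (40.7310, -73.9830),
}

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
matched, unmatched = attach_coordinates(rows, coords)
match_rate = len(matched) / len(rows)
print(f"{len(matched)} of {len(rows)} rows have coordinates ({match_rate:.0%})")
```

Rows that land in `unmatched` are the ones that would need a second pass through a geocoder.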
Now that the data was in a 'mappable' format, I went to work making a number of maps of bike collisions, as well as other vehicle types. In the absence of actual traffic counts by vehicle types, this helps create a picture of traffic patterns by various vehicles. This assumes that there is a strong correlation between collisions and miles driven in those areas, which may not always be the case.
Livery cab collision density map. August 2011 - September 2013
Comparison of bike collision density with total vehicle collision density.
I was curious enough to go see where the areas with the highest density of collisions were, so I hopped on a CitiBike. Three hours and five different CitiBikes later, I had collected photo documentation of those locations.
And finally, I created a 3D heat map of the bike collision density. Warning: you will need the Firefox, Chrome, or Safari browser, and the file will first have to download and unpack locally. Look for a future blog post on how this was created.
OK, OK... on to the CitiBike analysis.
One of the first things I tried was to compare the density of bike collisions for June - Sept. 2013 to the same months in 2012. I wanted to see if there were any visible patterns when comparing the months in 2013 when CitiBike was available with the previous year. It really didn't show much in terms of patterns. The data is also sparse for this type of use, and the validity of any conclusion drawn from it would be questionable. Still, I spent a lot of time creating these maps, so I've left them in here.
Lastly, I created a polygon area that outlines the CitiBike docking stations (shown in blue below). I used this boundary to compare total bike collisions inside of this area to the total outside.
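For readers curious how an inside/outside split like this can be computed, here is a minimal sketch using a ray-casting point-in-polygon test in plain Python. It is only an illustration under stated assumptions: the rectangle standing in for the CitiBike boundary, the collision points, and the function names are all hypothetical, and a GIS tool would do the same job with a spatial join.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray casting: count how many polygon edges a ray going east crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def split_counts(collisions, polygon):
    """Sum bicycle counts inside and outside the polygon."""
    inside = outside = 0
    for lon, lat, bikes in collisions:
        if point_in_polygon(lon, lat, polygon):
            inside += bikes
        else:
            outside += bikes
    return inside, outside

# Hypothetical rectangle standing in for the docking-station boundary,
# given as (longitude, latitude) vertices.
service_area = [(-74.02, 40.68), (-73.95, 40.68), (-73.95, 40.77), (-74.02, 40.77)]

# Made-up intersections: (longitude, latitude, bicycles involved).
collisions = [
    (-73.99, 40.73, 3),
    (-73.98, 40.75, 1),
    (-73.90, 40.85, 2),
    (-73.99, 40.60, 4),
]

inside, outside = split_counts(collisions, service_area)
print(f"inside: {inside}, outside: {outside}")
```

Because the collisions are reported at the nearest intersection, each point here carries a count rather than representing a single incident.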
Below are two charts that graph the NYC bike collision data by month.
The first chart shows two bands of data. The top dark blue columns are the total number of bicycles involved in collisions outside of the CitiBike Area, and the lower light blue columns are the number of bicycles involved in collisions within the CitiBike Area. I colored the bottom columns orange for the summer months so that 2013 can be easily compared to 2012.
This second chart shows the percentage of bike collisions that occurred in just the CitiBike Area.
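The percentage is simply the inside count divided by the monthly total. The small sketch below shows the arithmetic with made-up numbers (not the NYPD figures); the `share_inside` name and the month keys are assumptions for the example. It also shows why the share is worth charting: raw counts can rise while the share stays flat.

```python
def share_inside(monthly_counts):
    """Percent of each month's bike collisions that fall inside the area."""
    shares = {}
    for month, (inside, outside) in monthly_counts.items():
        total = inside + outside
        # Leave months with no reported collisions undefined.
        shares[month] = 100.0 * inside / total if total else None
    return shares

# Made-up (inside, outside) counts for illustration only.
monthly_counts = {
    "2012-06": (120, 280),
    "2013-06": (150, 350),
}

shares = share_inside(monthly_counts)
for month, pct in shares.items():
    print(month, f"{pct:.1f}%")
```

In this invented example the inside count grows from 120 to 150, yet the share is the same in both months because the citywide total grew too.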
There has been an increase in total bike collisions in the CitiBike area since the bike share program started at the end of May. However, the increase is small, and total bike collisions citywide have also increased. More importantly, the percentage of bike collisions occurring in the CitiBike area in those four months (June - Sept. 2013) has stayed consistent with the historical percentages.
Based on the collision data published by the NYPD, and more importantly on the four months of data since the launch of the CitiBike bike share program, my analysis shows no significant increase in bike collisions in the areas with CitiBike docking stations relative to all bike collisions in the city.
Link to spreadsheet data by month.
Please stay tuned, as I will update the results as new data becomes available. I will also be posting more details on how some of the maps in this blog were created.