Monday, January 20, 2014

Mapping NYC Parking Tickets

Over the last 2-3 months I have been working with the NYC Parking Violations Issued data released on NYC Open Data. Basically, I wanted to geocode the records, enabling spatial visualization and analysis. I'm still thinking about and working on ways to map this data, but in the meantime I thought I would share the data.

There were a number of challenges working with this data.

The first challenge was to see if the data was really valid. A histogram of ticket counts by issue date reveals that most dates have what appears to be only a small sample of records, or else the records are encoded with the wrong date (a number of records are dated in the future). I was able to settle on a date range (07/29/2013 - 10/28/2013) that seemed consistent and reasonable in terms of having complete data.
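To make the date check concrete, here is a minimal sketch of one way to do it. It is not the code I actually ran. It assumes the issue dates come in as MM/DD/YYYY strings, and the function names and the `min_count` threshold are made up for illustration. It counts records per day and then looks for the longest run of consecutive days that each clear the threshold.

```python
from collections import Counter
from datetime import datetime, timedelta


def daily_counts(issue_dates):
    """Count records per calendar day from MM/DD/YYYY strings (assumed format)."""
    return Counter(datetime.strptime(d, "%m/%d/%Y").date() for d in issue_dates)


def longest_dense_run(counts, min_count):
    """Return (start, end) of the longest run of consecutive days that each
    have at least min_count records, or None if no day qualifies."""
    dense_days = sorted(day for day, n in counts.items() if n >= min_count)
    best = None
    run_start = previous = None
    for day in dense_days:
        # A gap of more than one day starts a new run.
        if previous is None or day - previous != timedelta(days=1):
            run_start = day
        previous = day
        if best is None or (day - run_start) > (best[1] - best[0]):
            best = (run_start, day)
    return best


# Tiny made-up sample: three well-populated days, one stray future date,
# and one populated day after a gap.
sample = (
    ["07/29/2013"] * 3
    + ["07/30/2013"] * 3
    + ["07/31/2013"] * 3
    + ["01/01/2030"]
    + ["08/02/2013"] * 3
)
run = longest_dense_run(daily_counts(sample), min_count=2)
```

On the real data the threshold would have to be picked by eye from the histogram, since what counts as a "complete" day is a judgment call.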

Second, I needed to geocode the records before I could map them. For this I decided to use the recently released NYC GeoClient API. Here is the code I used: https://github.com/tswanson/NYCParkingGeocode. It ran at 1,500 records per minute on an Amazon EC2 server. The code is quite sloppy; I just kept adding more code as I found ways to geocode more addresses. Some records are intersections, others are street addresses. I also had to determine borough codes from the precincts.
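For anyone who doesn't want to dig through the repo, here is a rough sketch of the idea, not the actual code. The precinct ranges follow the usual NYPD numbering (Manhattan 1-34, Bronx 40-52, Brooklyn 60-94, Queens 100-115, Staten Island 120-123). The base URL, endpoint names, and query parameter names are my assumptions about the GeoClient v1 API and should be checked against its documentation. The record field names (`precinct`, `house_number`, `street`, `cross_street`) are hypothetical.

```python
from urllib.parse import urlencode

# Assumed GeoClient v1 base URL; verify against the official docs.
BASE_URL = "https://api.cityofnewyork.us/geoclient/v1"

# NYPD precinct number ranges by borough.
PRECINCT_RANGES = [
    (1, 34, "Manhattan"),
    (40, 52, "Bronx"),
    (60, 94, "Brooklyn"),
    (100, 115, "Queens"),
    (120, 123, "Staten Island"),
]


def borough_for_precinct(precinct):
    """Map a precinct number to a borough name, or None if out of range."""
    for low, high, borough in PRECINCT_RANGES:
        if low <= precinct <= high:
            return borough
    return None


def geoclient_url(record, app_id, app_key, base_url=BASE_URL):
    """Build a request URL for a street address or an intersection.

    Returns None when the record cannot be geocoded this way.
    Parameter names (houseNumber, crossStreetOne, ...) are assumptions.
    """
    borough = borough_for_precinct(record["precinct"])
    if borough is None:
        return None
    params = {"borough": borough, "app_id": app_id, "app_key": app_key}
    if record.get("house_number"):
        endpoint = "address.json"
        params["houseNumber"] = record["house_number"]
        params["street"] = record["street"]
    elif record.get("cross_street"):
        endpoint = "intersection.json"
        params["crossStreetOne"] = record["street"]
        params["crossStreetTwo"] = record["cross_street"]
    else:
        return None
    return f"{base_url}/{endpoint}?{urlencode(params)}"


# Made-up example records.
address_url = geoclient_url(
    {"precinct": 24, "house_number": "314", "street": "West 100 St"},
    app_id="my_id",
    app_key="my_key",
)
intersection_url = geoclient_url(
    {"precinct": 75, "street": "Atlantic Ave", "cross_street": "Logan St"},
    app_id="my_id",
    app_key="my_key",
)
```

The sketch only builds the URLs. Actually fetching them and pulling the coordinates out of the response is left out, since that depends on the response format.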

Here is a simple heat map of all the tickets.



Full map