When Pratt Institute professor Ben Wellington went searching for NYPD crash stats, he did not go to the source.
Instead, he went to the blog of a guy who had crunched all the numbers for him. Wellington used data compiled by John Krauss, a computer and data nerd who lives in Crown Heights. Krauss had built a computer script that allowed him to mine the numbers from the police department’s monthly crash statistics PDFs and place them into an editable spreadsheet.
Wellington then organized Krauss’ figures using the city’s official neighborhood boundaries. Of course, neighborhoods are different sizes with different populations, densities, and traffic volumes, so ranking areas’ dangerousness is not an apples-to-apples endeavor.
Wellington stands by his work, but acknowledges that there could be problems with the data because the city posts the monthly information in a difficult-to-study format and removes the previous month’s data whenever a new set is posted.
“Because the city refuses to release this, the methods will be more error-prone than they could be,” said Wellington. “This data is based on somebody’s code, which is based on the city’s PDFs. Mistakes will come out of that process.”