Digital Frameworks

City of Baltimore Crime Analysis


By Karys Belger

For this data query with Postman, I chose to stick with a city and data set I was already somewhat familiar with. For the last project, I used data from the city of Baltimore, and I did the same for this one. The difference is that this time I focused on crime, more specifically, the crimes that have taken place in the city of Baltimore within a specific stretch of time. Much like the city salaries data, the database of crime for the city of Baltimore is extremely large. In order to sort through the data effectively, I had to filter the information before I could even export it from the original data set online. Since I wanted information about the number of homicides in the city, that is what I focused on. It’s important to note that the large amount of data was not a problem for Postman, only for Excel and Google Sheets when I was trying to interview the data like we had done in previous assignments. For those tools, I filtered one of the columns in the data set before exporting the information, removing all of the crimes that weren’t homicides. This gave me every homicide in the city of Baltimore, along with every other column that was originally included in the data set.

In Postman, I was able to sort through the entire database of crimes without first having to filter down to one crime in particular. I used the following criteria to find the number of homicides in the city of Baltimore. For my query, the $select was “crimedate,” meaning the day that the incident took place. I chose to have the dates displayed as months. This way, instead of a long list of individual dates with one or two crimes each, the results show each month with its total number of crimes, which gives a better idea of trends in crime rates over time. The good part about this data set is that a specific date is assigned to each incident, and the data is easy to sort because each crime has a code that allows it to be identified and sorted.

It’s also important to note that, unlike in Excel, I was not able to group the data by month in Google Sheets. In order to properly perform the functions with the data grouped by month, I had to create a separate Excel file for everything. This way, I was able to create a data query that resembled past work. In Google Sheets, the graph looks much more skewed and doesn’t contain all of the data.
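The month grouping that was difficult in Google Sheets can also be done outside of a spreadsheet. Below is a minimal Python sketch of that idea, not the method used for this project. It assumes each date is a string in YYYY-MM-DD form, which may not match the exported file, and the sample dates are made up for illustration.

```python
from collections import Counter

def count_by_month(dates):
    """Count incidents per month from date strings such as '2012-01-15'."""
    # The first seven characters of a YYYY-MM-DD date are the year and month.
    return Counter(d[:7] for d in dates)

# Made-up sample dates, not real records from the Baltimore data set.
sample_dates = ["2012-01-03", "2012-01-19", "2012-02-07", "2012-02-07", "2012-02-28"]
monthly = count_by_month(sample_dates)

# Print one line per month, in chronological order.
for month in sorted(monthly):
    print(month, monthly[month])
```

Because the year comes first in each key, sorting the keys as plain text also puts the months in chronological order.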

Even with the data separated in either graph, there is a trend of increased homicides during the spring and summer months. There tends to be an average of 11 to 12 homicides in months like January. However, the monthly homicide counts during the summer range from the teens to the twenties.
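One way to check a seasonal pattern like this is to average the monthly totals for each calendar month across years. The sketch below is only an illustration of that arithmetic. The function name is hypothetical, and the totals are placeholder values, not figures taken from the data set.

```python
from collections import defaultdict
from statistics import mean

def average_by_calendar_month(monthly_totals):
    """Average totals per calendar month (1 to 12) across all years.

    monthly_totals maps 'YYYY-MM' strings to counts.
    """
    buckets = defaultdict(list)
    for key, total in monthly_totals.items():
        # Characters 5 and 6 of 'YYYY-MM' hold the month number.
        buckets[int(key[5:7])].append(total)
    return {month: mean(values) for month, values in sorted(buckets.items())}

# Placeholder totals for illustration only, not real results.
sample_totals = {"2012-01": 11, "2013-01": 12, "2012-07": 18, "2013-07": 22}
averages = average_by_calendar_month(sample_totals)

for month, avg in averages.items():
    print(month, avg)
```

With real monthly totals in place of the placeholders, comparing the January average with the July average would show whether the summer increase holds up across years.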

The $where for this query wound up being the specific crime that I wanted to track, in this case homicides. The value “description=HOMICIDE” gives us every homicide that occurs in this data set. Since the database records go all the way back to 2012, the Postman query reflects data going back that far as well. For “$group” and “$order,” month was the value, so that all of the homicides in this set are grouped together by month and also put in chronological order. So when we’re presented with the data, we see that there were 11 homicides in Baltimore in January of 2012 and also 11 homicides in February of that year. The data continues in this way for the remainder of the query. To get a similar separation of the data in a spreadsheet, I had to use Excel to group the homicides by month, because it’s slightly more difficult to do so in Google Sheets without distorting the rest of the data in the set.
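The query parameters described above can be assembled into a single request URL. This is a rough Python sketch under several assumptions: the endpoint is a placeholder rather than the real address of the data set, and the exact $select expression (the date_trunc_ym and count(*) functions from SoQL, with the aliases month and total) is a guess at how the dates were turned into months, since the original query text is not shown here.

```python
from urllib.parse import urlencode

# Placeholder endpoint: substitute the real resource URL of the data set.
BASE_URL = "https://example.com/resource/crime.json"

def build_homicide_query(base_url=BASE_URL):
    """Build a URL that asks for homicide counts grouped by month."""
    params = {
        # Assumed expression: truncate each date to its month and count rows.
        "$select": "date_trunc_ym(crimedate) AS month, count(*) AS total",
        # Keep only the rows described as homicides.
        "$where": "description = 'HOMICIDE'",
        # Group and order by the month alias defined in $select.
        "$group": "month",
        "$order": "month",
    }
    # urlencode percent-encodes special characters such as "$" and spaces.
    return base_url + "?" + urlencode(params)

url = build_homicide_query()
print(url)
```

The printed URL could be pasted into Postman or a browser once the placeholder endpoint is replaced, which makes the query easier to rerun or share than re-entering each parameter by hand.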

Copyright © 2017, Karys Belger. All rights reserved.

Created by David Eads and the students of Medill Digital Frameworks. Copyright varies by page and author.