Chicago's Gun Violence

This visualisation compares the locations of gun shootings within Chicago's communities with socio-economic variables. Mapping the precise location of each gun incident alongside these factors shows significant spatial variation in where incidents occur. Click on a variable to obtain a statistical analysis of how gun shootings compare across Chicago's community areas.

Select a variable of interest, then explore the map and the statistical analysis on the right-hand side.

Results & Description

This variable measures the level of occupation in the built environment, defined as the number of vacant housing units over the total number of housing units. A high vacancy rate may be associated either with an undesirable neighbourhood where it is difficult to find occupiers for housing, or with high-turnover areas where units happen to be vacant at the time of the survey. The map shows that gun incidents cluster where the vacancy rate is high, except in downtown. Vacancy rates tend to be high in the South Side and around Austin, where many incidents occur. The R-value of 0.761 indicates that this is the strongest predictor of incidents among our variables. We cannot say whether higher violence causes higher vacancy or vice versa, only that the two are related; there is literature that supports vacancy being a measure of social decline.

Line Slope Intercept P-Value Standard Error R-Value
224.374 -13.803 9.845e-16 22.098 0.761
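The five columns in the table above are the standard outputs of a simple linear regression. The source does not show the analysis code, so the following is only a minimal sketch in plain Python of how such values can be computed. The function names and the sample data are hypothetical, and the p-value routine is a simple numerical approximation. In practice a library routine such as SciPy's `scipy.stats.linregress`, which returns the same five quantities, would normally be used.

```python
import math


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value for Student's t with `df` degrees of freedom.

    Integrates the t density over [0, |t|] with Simpson's rule. This is an
    illustrative approximation and loses precision for very small p-values.
    """
    if math.isinf(t_stat):
        return 0.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2.0 * total * h / 3)


def simple_linear_regression(x, y):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_value = sxy / math.sqrt(sxx * syy)
    # Standard error of the slope, from the residual variance with n - 2 df.
    residual_ss = max(syy - slope * sxy, 0.0)
    std_err = math.sqrt(residual_ss / (n - 2) / sxx)
    t_stat = slope / std_err if std_err > 0 else math.inf
    return {
        "slope": slope,
        "intercept": intercept,
        "p_value": t_two_sided_p(t_stat, n - 2),  # H0: slope is zero
        "std_err": std_err,
        "r_value": r_value,
    }


# Hypothetical sample values, not project data.
sample_x = [1, 2, 3, 4, 5]
sample_y = [2, 4, 5, 4, 5]
fit = simple_linear_regression(sample_x, sample_y)
print(fit)
```

The p-value tests the null hypothesis that the slope is zero, which is the test referred to in the variable descriptions.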

Results & Description

Lack of a high school diploma is associated with a lack of opportunities. For the untransformed variable the p-value (0.078) is above the 5% significance level in a hypothesis test against the slope being zero, although there seems to be an upward trend. The distribution of high school dropout rates is skewed towards the low end, so we log-transform the variable for an improved regression fit. The R-value between the log dropout rate and gun incidents is 0.299, a weak correlation that leaves a lot of the variation in the incident rate unexplained. The smaller p-value for the log-transformed variable indicates that the slope is significant and that gun incidents generally increase where the high school dropout population is larger, but this is not explained well by simple linear regression.

Line Slope Intercept P-Value Standard Error R-Value
40.9 9.94 0.078 22.93 0.202

Results & Description

This variable was chosen because it indicates the average level of prosperity in each area; it is taken from the American Community Survey. Per capita income varies massively within the city, from $7,000 to $88,000. The R-value is 0.419, so income explains about 17.6% of the variation in incident rate (R² ≈ 0.176), with most incidents occurring in the lower-income communities.

Line Slope Intercept P-Value Standard Error R-Value
-0.000552 31.577 0.000149 0.000138 0.419
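In a simple linear regression the share of variance explained is the square of the R-value. As a worked example, the snippet below squares the R-values reported in the tables of this section. The dictionary labels are shorthand chosen here for illustration, not names from the project.

```python
# R-values as reported in the regression tables of this section.
r_values = {
    "vacancy rate": 0.761,
    "married families with children": 0.723,
    "poor mental health": 0.648,
    "per capita income": 0.419,
    "renter occupancy": 0.368,
    "median year built": 0.246,
    "no high school diploma": 0.202,
}


def explained_variance(r_value):
    """Coefficient of determination (R squared) for a simple regression."""
    return r_value * r_value


for name, r in r_values.items():
    print(f"{name}: R = {r:.3f}, R^2 = {explained_variance(r):.1%}")
```

For per capita income this gives roughly 17.6%, and for the vacancy rate roughly 57.9%.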

Results & Description

Family structure is measured as the share of households with children under 18 in which the parents are married, out of all households with children under 18; the remainder are single-parent households or households with unmarried partners. This represents the extent to which families exist as a traditional ‘nuclear family’, which conservatives consider necessary for a healthy society. The R-value here is 0.723, making this a strong predictor of incident rate: as the rate of married households increases, the gun incident rate falls. This may be seen as an indicator of ‘social decline’; it is also correlated, to a lesser extent (0.563), with income.

Line Slope Intercept P-Value Standard Error R-Value
-66.66 53.11 1.05e-13 7.3416 0.723

Results & Description

Older homes may indicate a long-established neighbourhood but, depending on maintenance, may also lead to a decline in the quality of the built environment. The slope of the regression line is greatly influenced by the few communities where properties are on average newest and gun crime is relatively low. When the communities where the median year built is later than 1970 are removed, the slope is much steeper but the correlation coefficient is lower (0.236).

Line Slope Intercept P-Value Standard Error R-Value
-0.391 779.490 0.0310 0.178 0.246

Results & Description

Given the level of attention paid to the link between mental health problems and gun violence in the US, this variable is of particular interest: the map shows that mental health tends to be poorer where gun incidents occur more densely. The variable is defined as the share of the population who self-report poor mental health (widely defined) on at least 14 of the last 30 days. The slope is small but significant, and with an R-value of 0.648 a higher rate of poor mental health is strongly associated with an increase in gun violence.

Line Slope Intercept P-Value Standard Error R-Value
4.349 -38.77 1.903e-10 0.590 0.648

Results & Description

The occupation type of a neighbourhood tells us about the terms of residency. An owner-occupier can be thought of as invested in the area, with the intention to stay long term, so higher owner occupancy is likely to produce a stable community with minimal turnover. Rented accommodation, meanwhile, is associated with short-term tenancies and higher turnover, since the cost of moving is lower. Renting is often a result of lacking access to the capital needed to purchase one's own house, so we see some correlation between this variable and per capita income. Alternatively, renting may reflect the prevalence of young single householders: in some areas, such as downtown, renter occupation is high while income is also high and the gun incident rate is low. The R-value is 0.368, which indicates that this variable has limited explanatory power.

Line Slope Intercept P-Value Standard Error R-Value
40.699 -4.415 0.000996 11.879 0.368

Results & Description

The histogram shows the frequency of incident rates across Chicago's 77 communities. More than half of the communities saw fewer than 5 gun incidents per 100,000 people, while the remaining communities had relatively higher rates. The distribution of gun-related crime across communities is therefore not normal but right-skewed, with high incident rates concentrated in a small number of communities.

Results & Description

Our cluster analysis assigns every community to one of 4 clusters, identified by the similarity of the variables above, excluding incident rate. We find 262 of the 409 incidents within a single cluster, which means that 64.06% of incidents occur among 25.32% of the population. Even though incident rate is not included in the cluster analysis, the clusters experience very different proportions of the incidents, so similarity of characteristics alone identifies the areas where there is much more gun violence. The interquartile range of the standardised incident rate is firmly above zero in cluster 1, whereas in every other cluster it is below zero.

Cluster ID | Communities in Cluster | Characterisation | Population | % of Total Population | Incidents in 2015 | % of Incidents | Incident Rate (per 100k)
0 | 6 | Downtown | 344522 | 12.78% | 16 | 3.91% | 4.64
1 | 25 | Social Disorder? | 682548 | 25.32% | 262 | 64.06% | 38.39
2 | 25 | Comfortable | 875937 | 32.50% | 49 | 11.98% | 5.59
3 | 21 | Mixed | 792591 | 29.40% | 82 | 20.05% | 10.34
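The share and rate columns in the cluster table follow directly from the Population and Incidents columns. The short sketch below recomputes them from the figures in the table; the function and key names are chosen here for illustration only.

```python
# (cluster id, characterisation, population, incidents in 2015), from the table.
clusters = [
    (0, "Downtown", 344522, 16),
    (1, "Social Disorder?", 682548, 262),
    (2, "Comfortable", 875937, 49),
    (3, "Mixed", 792591, 82),
]


def summarise(rows):
    """Return population share, incident share and rate per 100k per cluster."""
    total_population = sum(row[2] for row in rows)
    total_incidents = sum(row[3] for row in rows)
    summary = []
    for cluster_id, label, population, incidents in rows:
        summary.append({
            "cluster": cluster_id,
            "label": label,
            "population_share": population / total_population,
            "incident_share": incidents / total_incidents,
            "rate_per_100k": incidents / population * 100_000,
        })
    return summary


summary = summarise(clusters)
for row in summary:
    print(f"{row['cluster']} {row['label']}: "
          f"{row['population_share']:.2%} of population, "
          f"{row['incident_share']:.2%} of incidents, "
          f"{row['rate_per_100k']:.2f} per 100k")
```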

Gun violence is one of the main problems in the US, a country that sits at 11th place in the international rank of gun-related deaths. This violence occurs disproportionately in urban regions: in 2015 just 20 cities – covering nearly 9% of the total US population – accounted for almost 25% of gun-related deaths. Chicago, one of the most violent cities in the US, registered more shootings than New York City and Los Angeles combined.

What are the determinant factors in the concentration of gun violence?

So what are the main theories that relate to gun crime according to the literature?

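On the basis of the literature we have chosen the following variables for the analysis: housing vacancy rate, educational attainment, per capita income, family structure, median year structures were built, poor mental health and housing tenure.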

The aim of this project is to analyse the relationship between gun crime and other factors and to visualise our findings on the website. We use SQL database servers, analysis in Python, and JavaScript within the HTML. Our data relate to gun crime incidents in Chicago in 2015. The following paragraphs briefly explain the methodology used at each step of this research.

Gathering Of Data

The primary attribute data are the records of all gun incidents in Chicago in 2015, which are collected and maintained by the Gun Violence Archive. The geographical coordinates of these incidents are provided, allowing them to be represented as vector points. Three criteria were applied when gathering the other data needed for this research. Firstly, the data should have been released by the relevant official organisations, to ensure the sources are reliable. Secondly, they must relate to years comparable to that of the gun crime incidents we are analysing. Finally, their scale should be equivalent and relatable to the boundary files.

Data Cleaning & The Database

The database is built with MySQL, which allows various tables to be imported into the available server. MySQL scripts are also used to clean, amend and join the tables per Census Tract and to group them into the larger Community Areas. After this cleaning, the data are ready for the analysis process.
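As an illustration of grouping tract-level tables into community areas, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for the MySQL server so that it runs anywhere; the table name, column names and rows are all hypothetical, since the project's actual schema is not shown.

```python
import sqlite3

# In-memory SQLite database standing in for the project's MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tract_housing (          -- hypothetical tract-level table
    tract_id       TEXT PRIMARY KEY,
    community_area TEXT NOT NULL,
    total_units    INTEGER NOT NULL,
    vacant_units   INTEGER NOT NULL
);
INSERT INTO tract_housing VALUES
    ('T1', 'Area A', 1000, 100),
    ('T2', 'Area A',  500, 200),
    ('T3', 'Area B',  800,  40);
""")

# Sum the counts per community area first, then divide, so that each rate
# is weighted by the number of housing units in every tract.
QUERY = """
SELECT community_area,
       SUM(vacant_units) AS vacant_units,
       SUM(total_units)  AS total_units,
       1.0 * SUM(vacant_units) / SUM(total_units) AS vacancy_rate
FROM tract_housing
GROUP BY community_area
ORDER BY community_area;
"""
rows = conn.execute(QUERY).fetchall()
for area, vacant, total, rate in rows:
    print(f"{area}: {vacant}/{total} vacant ({rate:.1%})")
```

Summing before dividing avoids giving small tracts the same weight as large ones, which would happen if tract-level rates were simply averaged.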


Two methods of analysis were conducted on the data on gun shootings in Chicago: regression and cluster analysis. Firstly, we look for relationships between the gun incident rate in Chicago's communities and the demographic and socio-economic factors by performing simple linear regression on each factor, judging the strength and significance of each regression from its R-value and p-value. Secondly, to test whether gun incidents occur at a higher rate in places with common characteristics, we use cluster analysis. Incident rate is not used in the clustering algorithm but is analysed against its output. The K-Means clustering algorithm is used because it ensures that all areas are assigned to a cluster, allowing comparison against gun incidents across the whole city.
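The clustering step can be sketched as follows. This is a minimal plain-Python version of K-Means (Lloyd's algorithm) on toy data, not the project's code: the feature values and incident rates are hypothetical, the z-score standardisation of the features is an assumption, and the evenly spaced initial centroids are a simplification chosen for reproducibility. Library implementations such as scikit-learn's `KMeans` use more robust initialisation.

```python
import math


def standardise(rows):
    """Convert each column to z-scores so no variable dominates the distance."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    sds = [math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
           for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def k_means(points, k, iterations=100):
    """Lloyd's algorithm; returns one cluster label per point."""
    # Simplified deterministic start: k points evenly spaced through the input.
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


# Hypothetical communities: [vacancy rate, per capita income].
features = [
    [0.05, 60000], [0.06, 55000], [0.04, 65000],
    [0.20, 15000], [0.22, 12000], [0.18, 18000],
]
# Hypothetical incident rates, deliberately kept out of the clustering.
incident_rate = [3.1, 4.0, 2.5, 35.2, 41.0, 28.7]

labels = k_means(standardise(features), k=2)

# Compare the incident rate against the clusters after the fact.
by_cluster = {}
for label, rate in zip(labels, incident_rate):
    by_cluster.setdefault(label, []).append(rate)
mean_rate = {label: sum(r) / len(r) for label, r in by_cluster.items()}
print(labels, mean_rate)
```

Because every point is assigned to its nearest centroid, each area always receives a cluster label, which is the property the analysis relies on.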

Server Side Components/ API

Once the analysis is complete, we set up a node.js program that serves JSON responses containing our desired data, retrieved through MySQL queries, for use in our website. Endpoints are created for boundary data as GeoJSON, for incident locations and characteristics, and for our explanatory variables at varying scales.
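To illustrate the kind of payload such an endpoint might return, here is a minimal sketch of turning incident rows into a GeoJSON FeatureCollection. The project's server is written in node.js; Python is used here only to show the shape of the data. The function name, field names and the sample row are hypothetical.

```python
import json


def incidents_to_geojson(rows):
    """Build a GeoJSON FeatureCollection of point features from incident rows."""
    features = []
    for row in rows:
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {
                "type": "Point",
                "coordinates": [row["lon"], row["lat"]],
            },
            "properties": {"incident_id": row["id"], "date": row["date"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical row, as it might come back from a database query.
sample_rows = [
    {"id": 1, "date": "2015-01-01", "lat": 41.8781, "lon": -87.6298},
]
payload = json.dumps(incidents_to_geojson(sample_rows))
print(payload)
```

Keeping the attributes under `properties` lets the map code read both the marker position and the incident characteristics from a single response.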


The findings of this research are presented on a website that conveys information in several formats: a map, graphs and charts, and text.

The Map   

Our map tool is built on the Google Maps API, one of the most popular interactive mapping tools for the web. It can be customised through JavaScript and is therefore compatible with our JSON files. Our visualisation presents all incidents as markers and colours areas according to observed values, for visual inference.

Graphs & Charts  

Graphs and charts for the data analysis and descriptive statistics are displayed alongside the map. They are produced with the Highcharts library, which provides interactive charts and graphs for websites. By changing the relevant options in the JavaScript API provided by Highcharts, a variety of appropriate data visualisations can be obtained.


The text for this website, covering the research objectives, analysis, explanations, methodology and so on, is held in separate HTML files, styled with supporting CSS files.


After testing simple linear regressions between incident rate and different demographic and socio-economic factors in Chicago's communities, we find positive correlations of varying strength between gun incidents and the housing vacancy rate, the rate of people without a high school diploma, the rate of renter occupancy and the rate of poor mental health. There are negative correlations between gun incidents and per capita income, the rate of married families with children and the median year buildings were built. Our cluster analysis assigns every community to one of 4 clusters, identified by the similarity of variables. We find 262 of the 409 incidents within a single cluster, which means that 64.06% of incidents occur among 25.32% of the population. Even though incident rate is not used in the cluster analysis, the clusters experience very different proportions of the incidents, so similarity of characteristics alone identifies the areas where there is much more gun violence.

Data Sources


Geographic level: Census Tract (by State--County)
Dataset: 2015 American Community Survey: 5-Year Data [2011-2015, Block Groups & Larger Areas]
1. Family Type by Presence and Age of Own Children Under 18 Years
2. Educational Attainment for the Population 25 Years and Over
3. Per Capita Income in the Past 12 Months (in 2015 Inflation-Adjusted Dollars)
4. Occupancy Status
5. Tenure
6. Median Year Structure Built

2) Centers for Disease Control and Prevention

Data resource: Behavioral Risk Factor Surveillance System (BRFSS): 2014.
Numerator: Respondents aged >18 years who report 14 or more days during the past 30 days during which their mental health was not good.
Denominator: Respondents aged >18 years who report or do not report the number of days during the past 30 days during which their mental health was not good (excluding those who refused to answer, had a missing answer, or answered 'don't know/not sure').

3) Chicago Data Portal

Census Tracts (2010)
Chicago Community Areas

4) Gun Incidents

Individual and census-tract-aggregated gun incident data for 2015, available from the Guardian and acquired from the Gun Violence Archive.

5) Map provided by Google Maps API

The "Chicago's Gun Violence" project was developed by the members above for the Spatial Data Capture, Storage & Analysis module as part of the MSc in Smart Cities & Urban Analytics at UCL, London.