Police Shootings Dataset

Author

Ashmita, Diya, Krithika, Arnav

Published

October 20, 2025

Police Shootings

This is a real-world dataset that compiles incidents of fatal police shootings in the US, originally sourced from The Washington Post’s “Fatal Force” database.

Viewing the Dataset

library(DataSetsVerse)
Loading required package: timeSeriesDataSets
Loading required package: educationR
Loading required package: crimedatasets
Loading required package: MedDataSets
Loading required package: OncoDataSets
═══════════════════════════ Welcome to DataSetsVerse ═══════════════════════════
A metapackage for thematic and domain-specific datasets in R.
✔ timeSeriesDataSets v0.1.0
✔ educationR         v0.1.0
✔ crimedatasets      v0.1.0
✔ MedDataSets        v0.1.0
✔ OncoDataSets       v0.1.0
DataSetsVerse()
═══════════════════════════ Welcome to DataSetsVerse ═══════════════════════════ 
A metapackage for thematic and domain-specific datasets in R.

✔ timeSeriesDataSets v0.1.0
✔ educationR         v0.1.0
✔ crimedatasets      v0.1.0
✔ MedDataSets        v0.1.0
✔ OncoDataSets       v0.1.0 
library(crimedatasets)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   4.0.0     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.1.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
glimpse(police_shootings_tbl_df)
Rows: 6,421
Columns: 12
$ date                    <date> 2015-01-02, 2015-01-02, 2015-01-03, 2015-01-0…
$ manner_of_death         <chr> "shot", "shot", "shot and Tasered", "shot", "s…
$ armed                   <chr> "gun", "gun", "unarmed", "toy weapon", "nail g…
$ age                     <dbl> 53, 47, 23, 32, 39, 18, 22, 35, 34, 47, 25, 31…
$ gender                  <chr> "M", "M", "M", "M", "M", "M", "M", "M", "F", "…
$ race                    <chr> "A", "W", "H", "W", "H", "W", "H", "W", "W", "…
$ city                    <chr> "Shelton", "Aloha", "Wichita", "San Francisco"…
$ state                   <chr> "WA", "OR", "KS", "CA", "CO", "OK", "AZ", "KS"…
$ signs_of_mental_illness <lgl> TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE,…
$ threat_level            <chr> "attack", "attack", "other", "attack", "attack…
$ flee                    <chr> "Not fleeing", "Not fleeing", "Not fleeing", "…
$ body_camera             <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…

Data Dictionary

date: Date when the shooting occurred

manner_of_death: How the person died

armed: What the victim was reportedly armed with

age: Age of the victim

gender: Gender of the victim (M = Male, F = Female)

race: Race / ethnicity of the victim (W = White, B = Black, H = Hispanic, A = Asian, N = Native, O = Other)

city: City where the shooting took place

state: U.S. state abbreviation

threat_level: The perceived threat level

flee: Whether the person was fleeing

body_camera: Whether a body camera was in use during the shooting

signs_of_mental_illness: Whether there were reported signs of mental illness

Setting up packages

Registered S3 method overwritten by 'mosaic':
  method                           from   
  fortify.SpatialPolygonsDataFrame ggplot2

The 'mosaic' package masks several functions from core packages in order to add 
additional features.  The original behavior of these functions should not be affected by this.

Attaching package: 'mosaic'
The following object is masked from 'package:Matrix':

    mean
The following objects are masked from 'package:dplyr':

    count, do, tally
The following object is masked from 'package:purrr':

    cross
The following object is masked from 'package:ggplot2':

    stat
The following objects are masked from 'package:stats':

    binom.test, cor, cor.test, cov, fivenum, IQR, median, prop.test,
    quantile, sd, t.test, var
The following objects are masked from 'package:base':

    max, mean, min, prod, range, sample, sum

Attaching package: 'skimr'
The following object is masked from 'package:mosaic':

    n_missing

Attaching package: 'janitor'
The following objects are masked from 'package:stats':

    chisq.test, fisher.test

Attaching package: 'naniar'
The following object is masked from 'package:skimr':

    n_complete

Attaching package: 'tinytable'
The following object is masked from 'package:ggplot2':

    theme_void

Attaching package: 'crosstable'
The following object is masked from 'package:purrr':

    compact
Loading required package: grid

Attaching package: 'vcd'
The following object is masked from 'package:mosaic':

    mplot
Loading required package: gnm

Attaching package: 'gnm'
The following object is masked from 'package:lattice':

    barley

Attaching package: 'vcdExtra'
The following object is masked from 'package:dplyr':

    summarise

Attaching package: 'resampledata'
The following object is masked from 'package:vcdExtra':

    TV
The following object is masked from 'package:datasets':

    Titanic

Attaching package: 'ggmosaic'
The following objects are masked from 'package:vcd':

    mosaic, spine

Our Goal

To analyze patterns in police shootings in the US, including demographic, situational and geographic factors.

Our analyses might explore:

  • Variation by race, gender, age
  • Whether victims were armed
  • The role of mental health indicators
  • The effect of fleeing behavior and perceived threat levels
  • Geographic & state‐level patterns
  • Yearly and monthly trends
police_shootings <- police_shootings_tbl_df
glimpse(police_shootings)
Rows: 6,421
Columns: 12
$ date                    <date> 2015-01-02, 2015-01-02, 2015-01-03, 2015-01-0…
$ manner_of_death         <chr> "shot", "shot", "shot and Tasered", "shot", "s…
$ armed                   <chr> "gun", "gun", "unarmed", "toy weapon", "nail g…
$ age                     <dbl> 53, 47, 23, 32, 39, 18, 22, 35, 34, 47, 25, 31…
$ gender                  <chr> "M", "M", "M", "M", "M", "M", "M", "M", "F", "…
$ race                    <chr> "A", "W", "H", "W", "H", "W", "H", "W", "W", "…
$ city                    <chr> "Shelton", "Aloha", "Wichita", "San Francisco"…
$ state                   <chr> "WA", "OR", "KS", "CA", "CO", "OK", "AZ", "KS"…
$ signs_of_mental_illness <lgl> TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE,…
$ threat_level            <chr> "attack", "attack", "other", "attack", "attack…
$ flee                    <chr> "Not fleeing", "Not fleeing", "Not fleeing", "…
$ body_camera             <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…
janitor::clean_names(police_shootings)
# A tibble: 6,421 × 12
   date       manner_of_death  armed        age gender race  city          state
   <date>     <chr>            <chr>      <dbl> <chr>  <chr> <chr>         <chr>
 1 2015-01-02 shot             gun           53 M      A     Shelton       WA   
 2 2015-01-02 shot             gun           47 M      W     Aloha         OR   
 3 2015-01-03 shot and Tasered unarmed       23 M      H     Wichita       KS   
 4 2015-01-04 shot             toy weapon    32 M      W     San Francisco CA   
 5 2015-01-04 shot             nail gun      39 M      H     Evans         CO   
 6 2015-01-04 shot             gun           18 M      W     Guthrie       OK   
 7 2015-01-05 shot             gun           22 M      H     Chandler      AZ   
 8 2015-01-06 shot             gun           35 M      W     Assaria       KS   
 9 2015-01-06 shot             unarmed       34 F      W     Burlington    IA   
10 2015-01-06 shot             toy weapon    47 M      B     Knoxville     PA   
# ℹ 6,411 more rows
# ℹ 4 more variables: signs_of_mental_illness <lgl>, threat_level <chr>,
#   flee <chr>, body_camera <lgl>

Checking for Missing Values

visdat::vis_dat(police_shootings)

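Several columns contain missing entries. Common placeholder strings and numbers are first converted to NA, and then every row with at least one NA is dropped, which reduces the data from 6,421 to 5,106 rows.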

police_shootings_mod1 <- police_shootings %>%
  naniar::replace_with_na_all(condition = ~ .x %in% common_na_strings) %>%
  naniar::replace_with_na_all(condition = ~ .x %in% common_na_numbers) %>%
  drop_na()
police_shootings_mod1
# A tibble: 5,106 × 12
   date       manner_of_death  armed        age gender race  city          state
   <date>     <chr>            <chr>      <dbl> <chr>  <chr> <chr>         <chr>
 1 2015-01-02 shot             gun           53 M      A     Shelton       WA   
 2 2015-01-02 shot             gun           47 M      W     Aloha         OR   
 3 2015-01-03 shot and Tasered unarmed       23 M      H     Wichita       KS   
 4 2015-01-04 shot             toy weapon    32 M      W     San Francisco CA   
 5 2015-01-04 shot             nail gun      39 M      H     Evans         CO   
 6 2015-01-04 shot             gun           18 M      W     Guthrie       OK   
 7 2015-01-05 shot             gun           22 M      H     Chandler      AZ   
 8 2015-01-06 shot             gun           35 M      W     Assaria       KS   
 9 2015-01-06 shot             unarmed       34 F      W     Burlington    IA   
10 2015-01-06 shot             toy weapon    47 M      B     Knoxville     PA   
# ℹ 5,096 more rows
# ℹ 4 more variables: signs_of_mental_illness <lgl>, threat_level <chr>,
#   flee <chr>, body_camera <lgl>
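
Because drop_na() discards a row as soon as any one column is missing, it can help to look at missingness per column before dropping. The following is a minimal base R sketch on a toy data frame; the values are hypothetical and are not rows from police_shootings.

```r
# Toy data frame (hypothetical values, not taken from police_shootings)
toy <- data.frame(
  age  = c(53, NA, 23, 32),
  race = c("A", "W", NA, "W"),
  flee = c("Not fleeing", "Car", "Foot", NA),
  stringsAsFactors = FALSE
)

# Number of missing values in each column
na_per_column <- colSums(is.na(toy))

# Rows with no NA in any column, i.e. the rows drop_na() would keep
complete_rows <- toy[complete.cases(toy), ]

print(na_per_column)
print(nrow(complete_rows))
```

In this toy example each column has only one missing value, yet three of the four rows are lost, which illustrates how quickly row-wise dropping can shrink a dataset.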

Data Munging

# Convert the categorical columns to factors.
# Note: the result is printed here but not assigned, so
# police_shootings_mod1 itself keeps its original column types.
police_shootings_mod1 %>%
  mutate(
    manner_of_death = as_factor(manner_of_death),
    armed = as_factor(armed),
    gender = as_factor(gender),
    race = as_factor(race),
    city = as_factor(city),
    state = as_factor(state),
    signs_of_mental_illness = as_factor(signs_of_mental_illness),
    threat_level = as_factor(threat_level),
    flee = as_factor(flee),
    body_camera = as_factor(body_camera)
  )
# A tibble: 5,106 × 12
   date       manner_of_death  armed        age gender race  city          state
   <date>     <fct>            <fct>      <dbl> <fct>  <fct> <fct>         <fct>
 1 2015-01-02 shot             gun           53 M      A     Shelton       WA   
 2 2015-01-02 shot             gun           47 M      W     Aloha         OR   
 3 2015-01-03 shot and Tasered unarmed       23 M      H     Wichita       KS   
 4 2015-01-04 shot             toy weapon    32 M      W     San Francisco CA   
 5 2015-01-04 shot             nail gun      39 M      H     Evans         CO   
 6 2015-01-04 shot             gun           18 M      W     Guthrie       OK   
 7 2015-01-05 shot             gun           22 M      H     Chandler      AZ   
 8 2015-01-06 shot             gun           35 M      W     Assaria       KS   
 9 2015-01-06 shot             unarmed       34 F      W     Burlington    IA   
10 2015-01-06 shot             toy weapon    47 M      B     Knoxville     PA   
# ℹ 5,096 more rows
# ℹ 4 more variables: signs_of_mental_illness <fct>, threat_level <fct>,
#   flee <fct>, body_camera <fct>

Examining the Data

Counts by Race, Gender and Age

# Race
police_shootings_mod1 %>%
  dplyr::count(race) %>%
  arrange(desc(n)) %>% 
  tt()
race n
W 2599
B 1363
H 931
A 93
N 78
O 42
# Gender
police_shootings_mod1 %>%
  dplyr::count(gender) %>%
  tt()
gender n
F 246
M 4860
# Age
police_shootings_mod1 %>%
  dplyr::count(age) %>%
  tt()
age n
6 2
12 1
13 2
14 3
15 12
16 29
17 51
18 100
19 84
20 81
21 105
22 120
23 126
24 153
25 182
26 145
27 183
28 164
29 168
30 162
31 178
32 169
33 168
34 177
35 159
36 156
37 152
38 127
39 135
40 114
41 117
42 95
43 89
44 83
45 112
46 88
47 97
48 83
49 79
50 79
51 72
52 65
53 60
54 50
55 51
56 60
57 47
58 50
59 52
60 35
61 35
62 32
63 24
64 18
65 21
67 17
68 13
69 14
70 9
71 7
72 5
73 4
74 5
75 4
76 8
78 1
79 4
80 2
81 2
82 1
83 2
84 4
91 2

Visualizing Race distribution

police_shootings_mod1 %>%
  gf_bar(~ fct_infreq(race), fill = "skyblue") %>%
  gf_labs(
    x = "Race",
    y = "Number of victims",
    title = "Race distribution of victims"
  )

Observation

  • White individuals form the largest group, about half (51%) of police shooting victims

  • Black individuals are the second largest group (about 27%), at roughly half the frequency of White victims, followed by Hispanic individuals (about 18%)

  • Asian, Native, and Other racial groups collectively represent only about 4% of total incidents

Inference

  • The data suggests potential racial disparities in police shooting incidents

  • The 2:1 ratio between White and Black victims may indicate either population distribution patterns or systemic factors

  • When adjusted for population demographics, Black individuals may experience higher per-capita rates of police shootings
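
The shares and the roughly 2:1 ratio quoted above can be reproduced directly from the race counts. This is a minimal sketch that hard-codes the counts printed in the table earlier; a per-capita comparison would additionally need population figures, which are not part of this dataset.

```r
# Race counts copied from the count table above (complete rows only)
race_counts <- c(W = 2599, B = 1363, H = 931, A = 93, N = 78, O = 42)

# Percentage share of each group among all complete records
race_share <- round(100 * race_counts / sum(race_counts), 1)

# Ratio of White to Black victim counts
white_to_black <- round(race_counts[["W"]] / race_counts[["B"]], 2)

print(race_share)
print(white_to_black)
```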

Visualizing Gender distribution

police_shootings_mod1 %>%
  gf_bar(~ gender, fill = "steelblue") %>%
  gf_labs(
    x = "Gender",
    y = "Number of victims",
    title = "Gender distribution of victims"
  )

Observation

  • The gender disparity is extreme: male victims outnumber female victims by about 20 to 1 (4,860 vs. 246)

Inference

Males are disproportionately involved in critical situations that end in the use of deadly force, potentially due to:

  • Higher rates of violent crime commission

  • Different patterns of resistance or confrontation with law enforcement

  • Occupational exposure (certain professions with police interaction)

Visualizing Age distribution

police_shootings_mod1 %>%
  dplyr::count(age) %>%
  gf_line(n ~ age, colour = "navyblue") %>%
  gf_labs(
    x = "Age",
    y = "Number of victims",
    title = "Age distribution of victims"
  ) %>%
  gf_refine(
    scale_x_continuous(breaks = seq(0, 100, 5))
  )

Observation

  • There is a strong concentration of ages between 20 and 40 years, with peak incidence at ages 25-35

  • There is a sharp increase from the teenage years, peaking in the mid-to-late 20s, then a gradual decline after age 35

  • Elderly victims are rare but present, with incidents documented up to age 91

Inference

  • The peak in the late 20s and 30s may align with the ages at which criminal offense rates and police encounters are highest

  • This suggests that many incidents may occur during police responses to criminal activity
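
To put a number on the concentration between ages 20 and 40 noted above, the counts for those ages can be summed and compared with the total. This is a minimal sketch that hard-codes the values from the age table; the variable names are illustrative only.

```r
# Victim counts for ages 20 through 40, copied from the age table above
age_20_40 <- c(81, 105, 120, 126, 153, 182, 145, 183, 164, 168, 162,
               178, 169, 168, 177, 159, 156, 152, 127, 135, 114)
names(age_20_40) <- 20:40

# Number of rows in police_shootings_mod1
total_victims <- 5106

# Share of all victims who were aged 20 to 40
share_20_40 <- round(100 * sum(age_20_40) / total_victims, 1)

# Single age with the highest count in this range
peak_age <- names(age_20_40)[which.max(age_20_40)]

print(share_20_40)
print(peak_age)
```

By this count, a little over 61% of victims were between 20 and 40 years old, and the single most frequent age is 27.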

Examining the Situational & Psychological Factors

# Manner of Death
police_shootings_mod1 %>%
  dplyr::count(manner_of_death) %>%
  tt()
manner_of_death n
shot 4829
shot and Tasered 277
# Armed status
police_shootings_mod1 %>%
  dplyr::count(armed) %>%
  arrange(desc(n)) %>% 
  tt()
armed n
gun 3008
knife 780
unarmed 390
toy weapon 193
vehicle 168
undetermined 95
unknown weapon 55
machete 44
Taser 27
sword 22
baseball bat 17
ax 16
hammer 16
gun and vehicle 15
gun and knife 14
metal pipe 14
screwdriver 13
box cutter 12
sharp object 12
hatchet 11
BB gun 9
gun and car 9
scissors 8
piece of wood 7
pipe 6
rock 6
shovel 6
blunt object 5
crossbow 5
meat cleaver 5
straight edge razor 5
vehicle and gun 5
baton 4
chair 4
crowbar 4
metal pole 4
pick-axe 4
samurai sword 4
chain 3
guns and explosives 3
metal object 3
metal stick 3
pellet gun 3
pole 3
Airsoft pistol 2
beer bottle 2
brick 2
flashlight 2
glass shard 2
gun and machete 2
hatchet and gun 2
lawn mower blade 2
metal hand tool 2
pitchfork 2
pole and knife 2
spear 2
tire iron 2
BB gun and vehicle 1
air conditioner 1
air pistol 1
barstool 1
baseball bat and bottle 1
baseball bat and fireplace poker 1
baseball bat and knife 1
bean-bag gun 1
binoculars 1
bottle 1
bow and arrow 1
car, knife and mace 1
carjack 1
chain saw 1
chainsaw 1
contractor's level 1
cordless drill 1
fireworks 1
flagpole 1
garden tool 1
grenade 1
gun and sword 1
hand torch 1
ice pick 1
incendiary device 1
knife and vehicle 1
machete and gun 1
metal rake 1
microphone 1
motorcycle 1
nail gun 1
oar 1
pen 1
pepper spray 1
railroad spikes 1
stapler 1
vehicle and machete 1
walking stick 1
wasp spray 1
wrench 1
# Flee Status
police_shootings_mod1 %>%
  dplyr::count(flee) %>%
  arrange(desc(n)) %>% 
  tt()
flee n
Not fleeing 3382
Car 775
Foot 754
Other 195
# Threat Level
police_shootings_mod1 %>%
  dplyr::count(threat_level) %>%
  tt()
threat_level n
attack 3371
other 1622
undetermined 113
# Mental Health
police_shootings_mod1 %>%
  dplyr::count(signs_of_mental_illness) %>%
  tt()
signs_of_mental_illness n
FALSE 3872
TRUE 1234
# Body Camera
police_shootings_mod1 %>%
  dplyr::count(body_camera) %>%
  tt()
body_camera n
FALSE 4390
TRUE 716

Observation

  • Manner of death: Most victims were shot without a Taser being used (4,829 of 5,106); 277 were shot and Tasered. Is a specific factor (such as weapon type, threat level or mental illness) associated with that outcome?

  • Armed: Most victims (about 59%) were reportedly armed with guns, which is consistent with these encounters being treated as critical. Knives are the next most common category, followed by unarmed victims

  • Flee: About two thirds of victims were not fleeing when shot

  • Threat level: About 2/3 of victims were perceived as attacking when shot

  • Signs of mental illness: About 1/4 of the victims showed signs of mental illness

  • Body camera: In about 86% of incidents, no body camera was recorded as being in use.
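
The proportions quoted in these observations follow from the count tables above. The sketch below hard-codes those counts; the vector names are illustrative and not part of the dataset.

```r
# Number of complete records in police_shootings_mod1
total <- 5106

# Selected counts copied from the tables above
counts <- c(
  armed_with_gun   = 3008,
  not_fleeing      = 3382,
  threat_attack    = 3371,
  mental_illness   = 1234,
  no_body_camera   = 4390,
  shot_and_tasered = 277
)

# Each count as a percentage of all complete records
pct <- round(100 * counts / total, 1)
print(pct)
```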

Summarizing Full Table

summary(police_shootings_mod1)
      date            manner_of_death       armed                age       
 Min.   :2015-01-02   Length:5106        Length:5106        Min.   : 6.00  
 1st Qu.:2016-05-31   Class :character   Class :character   1st Qu.:27.00  
 Median :2018-01-11   Mode  :character   Mode  :character   Median :34.00  
 Mean   :2018-01-13                                         Mean   :36.58  
 3rd Qu.:2019-08-14                                         3rd Qu.:45.00  
 Max.   :2021-06-25                                         Max.   :91.00  
    gender              race               city              state          
 Length:5106        Length:5106        Length:5106        Length:5106       
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
 signs_of_mental_illness threat_level           flee           body_camera    
 Mode :logical           Length:5106        Length:5106        Mode :logical  
 FALSE:3872              Class :character   Class :character   FALSE:4390     
 TRUE :1234              Mode  :character   Mode  :character   TRUE :716      
                                                                              
                                                                              
                                                                              

Observation

  • The data covers incidents from January 2015 to June 2021

  • Victim ages range from 6 to 91. The median age is 34 years and the mean is about 36.6 years

Building Hypothesis

Is there any association between Race and Gender?

police_shootings_mod1 %>%
  dplyr::count(gender, race) %>%
  tt()
gender race n
F A 4
F B 48
F H 29
F N 5
F O 3
F W 157
M A 89
M B 1315
M H 902
M N 73
M O 39
M W 2442
police_shootings_mod1 %>%
  gf_bar(~ fct_infreq(race),
         fill = ~gender,
         position = "dodge") %>%
  gf_labs(
    x = "Race",
    y = "Number of victims",
    title = "Race distribution of victims by Gender"
  )

Observation

  • Females make up a small share of victims in every racial group (roughly 3% to 7%), so the gender split looks broadly similar across races and there is little visible association between the two

  • The data is consistent with sociological patterns in which males are overrepresented in situations likely to escalate to lethal force
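
One way to check the association between gender and race more formally is a chi-squared test of independence on the counts above. The sketch below is an illustration under stated assumptions rather than part of the original analysis: it rebuilds the table by hand, calls stats::chisq.test() explicitly (janitor masks chisq.test when attached), and uses a simulated p-value because some expected counts in the female row are small.

```r
# Gender x race counts copied from the table above
gender_race <- matrix(
  c( 4,   48,  29,  5,  3,  157,
    89, 1315, 902, 73, 39, 2442),
  nrow = 2, byrow = TRUE,
  dimnames = list(gender = c("F", "M"),
                  race   = c("A", "B", "H", "N", "O", "W"))
)

# Percentage of female victims within each racial group
female_share <- round(100 * gender_race["F", ] / colSums(gender_race), 1)

# Chi-squared test of independence with a Monte Carlo p-value
set.seed(1)
test_result <- stats::chisq.test(gender_race, simulate.p.value = TRUE, B = 2000)

print(female_share)
print(test_result)
```

A small p-value would indicate that the female share differs across racial groups by more than chance alone would explain; it would not say anything about why.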

What is the Age Distribution across different Races?

police_shootings_mod1 %>% 
  gf_density(~age, color = ~race, fill = ~race, alpha = 0.2) %>%
  gf_labs(
    title = "Age Distribution by Race",
    x = "Age",
    y = "Density"
  ) %>%
  gf_refine(
    scale_x_continuous(breaks = seq(0, 100, 5))
  )

police_shootings_mod1 %>%
  dplyr::count(age, race) %>%
  tt()
age race n
6 W 2
12 W 1
13 B 1
13 H 1
14 H 2
14 N 1
15 A 1
15 B 5
15 H 3
15 W 3
16 A 2
16 B 11
16 H 6
16 W 10
17 B 23
17 H 12
17 W 16
18 A 3
18 B 48
18 H 23
18 N 1
18 O 3
18 W 22
19 A 2
19 B 39
19 H 14
19 N 2
19 W 27
20 A 2
20 B 32
20 H 14
20 N 2
20 O 2
20 W 29
21 A 2
21 B 48
21 H 26
21 N 2
21 W 27
22 A 2
22 B 43
22 H 37
22 N 1
22 O 2
22 W 35
23 A 1
23 B 50
23 H 24
23 N 2
23 O 1
23 W 48
24 B 61
24 H 30
24 N 4
24 O 1
24 W 57
25 A 2
25 B 66
25 H 31
25 N 5
25 O 3
25 W 75
26 A 4
26 B 36
26 H 33
26 N 1
26 O 1
26 W 70
27 A 3
27 B 65
27 H 41
27 N 7
27 O 3
27 W 64
28 A 4
28 B 58
28 H 41
28 N 3
28 O 1
28 W 57
29 A 1
29 B 50
29 H 38
29 N 2
29 O 3
29 W 74
30 A 1
30 B 47
30 H 28
30 N 2
30 O 3
30 W 81
31 A 2
31 B 60
31 H 26
31 N 2
31 O 1
31 W 87
32 A 4
32 B 49
32 H 28
32 N 6
32 O 1
32 W 81
33 A 2
33 B 49
33 H 35
33 N 3
33 O 1
33 W 78
34 A 3
34 B 39
34 H 35
34 N 3
34 O 2
34 W 95
35 A 6
35 B 39
35 H 39
35 N 3
35 W 72
36 A 3
36 B 29
36 H 31
36 N 4
36 O 2
36 W 87
37 A 2
37 B 41
37 H 41
37 N 2
37 W 66
38 A 3
38 B 28
38 H 34
38 N 1
38 O 1
38 W 60
39 A 2
39 B 43
39 H 26
39 N 2
39 W 62
40 A 1
40 B 21
40 H 21
40 N 1
40 O 1
40 W 69
41 A 2
41 B 27
41 H 19
41 N 1
41 O 1
41 W 67
42 A 2
42 B 16
42 H 18
42 N 1
42 O 1
42 W 57
43 A 1
43 B 19
43 H 9
43 N 4
43 W 56
44 A 3
44 B 13
44 H 17
44 N 2
44 W 48
45 A 2
45 B 18
45 H 21
45 N 1
45 O 1
45 W 69
46 A 1
46 B 18
46 H 17
46 N 1
46 O 1
46 W 50
47 A 2
47 B 22
47 H 13
47 W 60
48 A 2
48 B 18
48 H 8
48 O 2
48 W 53
49 A 2
49 B 13
49 H 9
49 N 1
49 W 54
50 A 2
50 B 11
50 H 12
50 N 1
50 W 53
51 A 1
51 B 9
51 H 9
51 N 1
51 O 1
51 W 51
52 A 2
52 B 11
52 H 11
52 W 41
53 A 3
53 B 6
53 H 2
53 N 1
53 W 48
54 A 1
54 B 6
54 H 3
54 N 1
54 O 1
54 W 38
55 A 2
55 B 11
55 H 9
55 W 29
56 A 1
56 B 5
56 H 3
56 O 1
56 W 50
57 B 10
57 H 6
57 W 31
58 B 3
58 H 4
58 N 1
58 W 42
59 A 1
59 B 2
59 H 4
59 O 1
59 W 44
60 A 3
60 B 6
60 H 1
60 W 25
61 A 1
61 B 9
61 H 1
61 W 24
62 A 1
62 B 4
62 H 3
62 W 24
63 B 6
63 H 1
63 W 17
64 B 2
64 H 1
64 W 15
65 B 2
65 H 3
65 W 16
67 B 5
67 W 12
68 B 5
68 W 8
69 B 1
69 H 3
69 W 10
70 B 1
70 H 1
70 W 7
71 H 1
71 W 6
72 B 1
72 W 4
73 H 1
73 W 3
74 B 2
74 W 3
75 W 4
76 W 8
78 W 1
79 W 4
80 H 1
80 W 1
81 W 2
82 W 1
83 W 2
84 W 4
91 W 2
police_shootings_mod1 %>% 
gf_boxplot(age ~ race, fill = ~race, orientation = "x") %>%
  gf_labs(
    x = "Race",
    y = "Age",
    title = "Age distribution across Races"
  )

Observation

  • Black victims and those recorded as Other tend to be younger

  • White and Asian victims are more spread out in age, and White victims have a higher median age

  • Hispanic victims show two peaks in age, around 25 and around 40

  • Black and Native victims show a distinct peak at around 25 and 30 respectively

Inference

  • Black & Native victims: Young adulthood is the highest-risk period, possibly linked to street policing and exposure to urban violence

  • White & Asian victims: Risk is spread across adult ages, which may reflect more varied incidents such as mental health crises, domestic disputes, and situations involving the elderly

  • Hispanic victims: Two distinct risk groups appear, one younger and one older

What is the association between different situational factors?

# Race vs. Manner of Death
# (rows are manner of death, columns are race codes)
vcd::structable(race ~ manner_of_death, data = police_shootings_mod1) %>%
  as.matrix() %>%
  addmargins() %>%
  as_tibble(rownames = "manner_of_death")
# A tibble: 3 × 8
  manner_of_death      A     B     H     N     O     W   Sum
  <chr>            <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 shot                85  1283   878    74    38  2471  4829
2 shot and Tasered     8    80    53     4     4   128   277
3 Sum                 93  1363   931    78    42  2599  5106
vcd::structable(race ~ manner_of_death, data = police_shootings_mod1) %>%
  vcd::mosaic(shade = TRUE, legend = TRUE,
              main = "Race vs. Manner of Death",
              gp = shading_max)

# Gender vs. Manner of Death
# (rows are manner of death, columns are gender codes)
vcd::structable(gender ~ manner_of_death, data = police_shootings_mod1) %>%
  as.matrix() %>%
  addmargins() %>%
  as_tibble(rownames = "manner_of_death")
# A tibble: 3 × 4
  manner_of_death      F     M   Sum
  <chr>            <dbl> <dbl> <dbl>
1 shot               237  4592  4829
2 shot and Tasered     9   268   277
3 Sum                246  4860  5106
vcd::structable(gender ~ manner_of_death, data = police_shootings_mod1) %>%
  vcd::mosaic(shade = TRUE, legend = TRUE,
              main = "Gender vs. Manner of Death",
              gp = shading_max)

# Manner of Death vs. Weapon type
# (rows are the top 5 weapon types plus "Other", columns are manner of death)
police_shootings_mod1 %>%
  mutate(armed_top5 = fct_lump_n(armed, 5)) %>%
  vcd::structable(manner_of_death ~ armed_top5, data = .) %>%
  as.matrix() %>%
  addmargins() %>%
  as_tibble(rownames = "armed_top5")
# A tibble: 7 × 4
  armed_top5  shot `shot and Tasered`   Sum
  <chr>      <dbl>              <dbl> <dbl>
1 gun         2962                 46  3008
2 knife        665                115   780
3 toy weapon   190                  3   193
4 unarmed      346                 44   390
5 vehicle      167                  1   168
6 Other        499                 68   567
7 Sum         4829                277  5106
police_shootings_mod1 %>%
  mutate(armed_top5 = fct_lump_n(armed, 5)) %>%
  vcd::structable(manner_of_death ~ armed_top5, data = .) %>%
  vcd::mosaic(shade = TRUE, legend = TRUE,
              main = "Manner of Death vs. Top 5 Weapon Types",
              gp = shading_max)

# Manner of Death vs. Threat level
vcd::structable(manner_of_death ~ threat_level, data = police_shootings_mod1) %>%
  vcd::mosaic(shade = TRUE, legend = TRUE,
              main = "Manner of Death vs. Threat Level",
              gp = shading_max)

# Manner of Death vs. Mental Illness
vcd::structable(manner_of_death ~ signs_of_mental_illness, data = police_shootings_mod1) %>%
  vcd::mosaic(shade = TRUE, legend = TRUE,
              main = "Manner of Death vs. Mental Illness Signs",
              gp = shading_max)

Observation

  • Race and Gender don’t have any distinct association with the manner of death

  • Victims armed with a gun, and those perceived as attacking, show a distinct negative association with being shot and Tasered (about 1.5% of gun cases, against 5.4% overall), which may mean that officers had to keep some distance

  • Victims armed with a knife show a positive association with being shot and Tasered (about 14.7% of knife cases)

  • Victims showing signs of mental illness also show a positive association with being shot and Tasered

Inference

  • Weapon type and mental state appear to be associated with police tactics. Guns tend to be met with an immediate lethal response, while other weapons and conditions more often allow an intermediate option to be attempted first.
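
The shading in the mosaic plots is driven by Pearson residuals from an independence model, so the direction of each association can also be read from those residuals directly. The following is a minimal sketch that rebuilds the weapon type table by hand from the output above and uses base R's stats::chisq.test(); it is an illustration, not the authors' original code.

```r
# Weapon type x manner of death counts copied from the table above
weapon_manner <- matrix(
  c(2962,  46,
     665, 115,
     190,   3,
     346,  44,
     167,   1,
     499,  68),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    armed  = c("gun", "knife", "toy weapon", "unarmed", "vehicle", "Other"),
    manner = c("shot", "shot and Tasered")
  )
)

# Percentage of cases in each weapon category that were shot and Tasered
tasered_pct <- round(
  100 * weapon_manner[, "shot and Tasered"] / rowSums(weapon_manner), 1
)

# Pearson residuals: positive means more cases than independence predicts
pearson_resid <- stats::chisq.test(weapon_manner)$residuals

print(tasered_pct)
print(round(pearson_resid, 2))
```

A negative residual in the "shot and Tasered" column for guns and a positive one for knives would correspond to the blue and red cells that a shaded mosaic plot highlights.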

How does Age influence other factors?

Association of age with fleeing and weapon type

police_shootings_mod1 %>%
  mutate(
    flee = fct_infreq(flee),      
    armed = fct_lump_n(armed, 5)
  ) %>%
  gf_boxplot(age ~ flee,
             fill = ~armed,
             orientation = "x" ) %>%
  gf_labs(
    title = "Age Distribution by Flee type and Weapon status",
    x = "Flee Type",
    y = "Age of Victim",
    fill = "Weapon Type"
  )

Age distribution by Flee status

police_shootings_mod1 %>%
  gf_density(~age, color = ~flee, fill = ~flee, alpha = 0.3) %>%
  gf_labs(title = "Age Distribution by Flee Status") %>%
  gf_refine(
    scale_color_brewer(palette = "Set1"),
    scale_x_continuous(breaks = seq(0, 100, 5))
  )

Observation

  • Those “Not fleeing” or fleeing by car appear slightly older on average than those fleeing on foot.

  • Median ages for “Not fleeing” and “Car” are higher, while “Foot” and “Other” categories have lower medians.

  • Guns and knives dominate across all flee types.

  • Unarmed individuals appear in every flee category, often with similar or slightly lower median ages compared to armed groups.

  • The variation is generally higher for armed individuals, indicating broader age diversity among armed victims.

  • A few older individuals (60+) appear across all flee types, particularly in the “Not fleeing” group.

Inference

  • There is no strong age-flee interaction, but the slightly higher median ages among “Not fleeing” cases could indicate that older victims are less likely to attempt escape.

  • The consistent presence of unarmed victims across flee types highlights that weapon possession is not always tied to fleeing behavior.

  • The broad age spread, especially in “Not fleeing” cases, suggests police encounters affect a wide demographic, not limited to younger populations.

Age distribution over the years for different Races

police_shootings_mod1 %>%
  mutate(year = as.numeric(format(date, "%Y"))) %>%
  gf_point(age ~ year, alpha = 0.5) %>%
  gf_lm(age ~ year) %>%
  gf_facet_wrap(~ race) %>%
  gf_labs(
    title = "Age Distribution Over Years by Race",
    x = "Year",
    y = "Age"
  )

Observation

  • Across all racial groups, most victims fall in the 25–45 age range, with only slight variations.

  • Typical ages remain relatively stable over time (2015–2021), with no marked increase or decrease in any group.

  • White and Black individuals have the highest representation and wider spread in age distribution.

  • The remaining groups (Hispanic, Asian, Native and Other) have smaller sample sizes, so their patterns are more variable, especially for the Asian, Native and Other groups, each with fewer than 100 victims.

police_shootings_mod1 %>%
  count(city, state) %>%
  arrange(desc(n)) %>%
  head(15) %>%  # Top 15 locations
  unite(location, city, state, sep = ", ") %>%
  select(location) %>%
  inner_join(police_shootings_mod1 %>% 
              unite(location, city, state, sep = ", "), 
            by = "location") %>%
  gf_boxplot(age ~ fct_reorder(location, age), fill = "forestgreen", orientation = "x") %>%
  gf_labs(
    title = "Age Distribution by Location (Top 15 Cities)",
    x = "City, State",
    y = "Age"
  ) %>%
  gf_theme(axis.text.x = element_text(angle = 90, hjust = 1))

Observation

  • Some cities, like Chicago, Columbus, and St. Louis, have younger victims, while Miami and Las Vegas show slightly higher medians.

  • The spread varies and is larger in cities like Los Angeles and Jacksonville, suggesting more diverse victim age groups there.

How does location (state and city) relate to the number of shootings?

police_shootings_mod1 %>%
  count(city, state) %>%
  arrange(desc(n)) %>%
  head(30) %>%  # Taking top 30 locations
  unite(location, city, state, sep = ", ") %>%
  gf_point(n ~ fct_reorder(location, n), size = ~n, color = "red") %>%
  gf_labs(
    title = "Police Shootings by Location",
    x = "City, State",
    y = "Number of Shootings"
  ) %>%
  gf_theme(axis.text.x = element_text(angle = 90, hjust = 1))

Observation

  • Los Angeles, CA reports the highest number of police shootings, followed by Phoenix, AZ, Houston, TX, and Las Vegas, NV.

  • The distribution is right-skewed where a few major cities account for a disproportionately large share of total shootings.

  • Many smaller or mid-size cities (e.g., Colorado Springs, Charlotte) have fewer than 20 recorded incidents.

Inference

  • Police shootings predominantly involve young to middle-aged individuals (25–40 years old) across most races and cities, with no major change in the age pattern over time. Incidents are, however, geographically concentrated: a small number of large urban centers (like Los Angeles and Phoenix) record far more cases than other cities. These are raw counts, so the regional differences may be tied to population size and density, policing intensity, or systemic factors.

How does perceived threat level influence the shootings?

police_shootings_mod1 %>%
  count(city, state, threat_level) %>%
  group_by(city, state) %>%
  mutate(total = sum(n)) %>%
  filter(total >= 10) %>%  # Cities with 10+ incidents
  ungroup() %>%
  arrange(desc(total)) %>%
  head(50) %>%
  unite(location, city, state, sep = ", ") %>%
  gf_col(n ~ location, fill = ~threat_level, position = "fill") %>%
  gf_theme(axis.text.x = element_text(angle = 90, hjust = 1)) %>%
  gf_labs(title = "Threat Level Distribution by City")

Observation

  • Across most cities, the attack threat level dominates, with smaller proportions of other and undetermined cases.

  • Cities like Los Angeles, Phoenix, and San Antonio show particularly high proportions of attack-classified incidents, while places like Columbus and Jacksonville show slightly more variation across categories.

Inference

  • This suggests that police encounters in many urban centers are predominantly categorized as threats requiring lethal or aggressive response levels.

  • The consistency across locations might indicate uniform policy interpretation rather than contextual variations in local threat assessment.
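
The claim of consistency across locations can be checked by computing the attack share per city and looking at how widely it varies. This is a minimal sketch on hypothetical data: toy_threat and its city names are invented, and only the threat_level labels mirror those discussed above.

```r
library(dplyr)

# Hypothetical toy data; city names A, B, C are placeholders
toy_threat <- data.frame(
  city = c(rep("A", 4), rep("B", 4), rep("C", 4)),
  threat_level = c(
    "attack", "attack", "attack", "other",         # A: 3 of 4 are attack
    "attack", "attack", "other", "undetermined",   # B: 2 of 4 are attack
    "attack", "attack", "attack", "attack"         # C: 4 of 4 are attack
  )
)

attack_by_city <- toy_threat %>%
  group_by(city) %>%
  summarise(
    incidents   = n(),
    attack_prop = mean(threat_level == "attack"),  # share of attack-labelled incidents
    .groups = "drop"
  )

# Range of the attack share across cities
attack_spread <- attack_by_city %>%
  summarise(min_prop = min(attack_prop), max_prop = max(attack_prop))

attack_spread
```

A narrow range on the real data would be consistent with the uniformity described above, while a wide range would point to local differences. Cities with few incidents give noisy proportions, which is one reason to keep a minimum-incident filter like the one in the plot.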

Influence of Weapons on different factors

police_shootings_mod1 %>%
  mutate(weapon_cat = fct_lump_n(armed, 6)) %>%
  count(race, weapon_cat) %>%
  group_by(race) %>%
  mutate(prop = n / sum(n)) %>%
  gf_tile(weapon_cat ~ race, fill = ~prop) %>%  # y ~ x; tile colour encodes the proportion
  gf_labs(
    title = "Weapon Type Distribution by Race",
    x = "Race",
    y = "Weapon Type",
    fill = "Proportion"
  )

Observation

  • The heatmap shows that guns are the most common weapon type across all racial groups, particularly pronounced among Black and White victims.

  • Non-firearm weapons (like knives and toy weapons) appear far less frequently, with only minor variation among racial groups.

Inference

  • This pattern could indicate both a systemic emphasis on firearm-related encounters and possible disparities in perceived weapon possession across racial groups.

  • The high prevalence of gun-related cases may also reflect broader accessibility and cultural factors related to weapon ownership.
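
Because the heatmap plots proportions computed within each race, its colours compare the weapon mix inside a group, not the number of incidents between groups. The sketch below shows the two steps involved, lumping rare weapon types with fct_lump_n() and normalising within race, on hypothetical data (toy_weapons and the race labels X and Y are invented).

```r
library(dplyr)
library(forcats)

# Hypothetical toy data; race labels X and Y are placeholders
toy_weapons <- data.frame(
  race  = c(rep("X", 6), rep("Y", 4)),
  armed = c("gun", "gun", "gun", "knife", "knife", "hammer",
            "gun", "gun", "knife", "toy weapon")
)

weapon_props <- toy_weapons %>%
  mutate(weapon_cat = fct_lump_n(armed, 2)) %>%  # keep the 2 most common types; the rest become "Other"
  count(race, weapon_cat) %>%
  group_by(race) %>%
  mutate(prop = n / sum(n)) %>%                  # proportions sum to 1 within each race
  ungroup()

weapon_props
```

In this toy example both groups have the same gun share (0.5) even though group X has more incidents, which is the kind of comparison the within-race proportions are meant to support.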

police_shootings_mod1 %>%
  mutate(weapon_cat = fct_lump_n(armed, 5)) %>%
  count(weapon_cat, threat_level) %>%
  gf_col(n ~ weapon_cat, fill = ~threat_level, position = "dodge") %>%
  gf_labs(
    title = "Weapon Type vs Threat Level",
    x = "Weapon Type",
    y = "Count",
    fill = "Threat Level"
  ) %>%
  gf_theme(axis.text.x = element_text(angle = 45, hjust = 1))

Observation

  • Gun-related incidents overwhelmingly fall under the attack threat level, followed by knives and unarmed cases at much lower frequencies.

  • Toy weapons and vehicles have very few attack classifications, and undetermined cases remain minimal overall.

Inference

  • Weapons perceived as more dangerous (like firearms) are strongly associated with escalated threat responses. This pattern could imply that the presence of a gun, regardless of situation, significantly biases threat assessment, leading to higher rates of lethal force justification.

police_shootings_mod1 %>%
  mutate(weapon_cat = fct_lump_n(armed, 8)) %>%
  count(weapon_cat, flee) %>%
  group_by(weapon_cat) %>%
  mutate(prop = n / sum(n)) %>%
  gf_col(prop ~ weapon_cat, fill = ~flee, position = "fill") %>%
  gf_labs(
    title = "Flee Behavior by Weapon Type",
    subtitle = "Proportion of flee status for each weapon category",
    x = "Weapon Type",
    y = "Proportion",
    fill = "Flee Status"
  ) %>%
  gf_theme(axis.text.x = element_text(angle = 45, hjust = 1))

Observation

  • Individuals categorized as not fleeing form the largest proportion across all weapon types, especially for guns, knives, and unarmed cases.

  • Vehicle-related cases show relatively more instances of fleeing by car, while fleeing on foot is modestly present across some categories.

Inference

  • This could suggest that most encounters leading to recorded incidents occur when suspects are not attempting to flee rather than actively escaping. However, the relatively higher share of fleeing by car in vehicle-related incidents might point to situational factors influencing pursuit and response dynamics.
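
The weapon-by-threat-level chart above is built from raw counts, so the tall gun bars partly reflect that guns are the most common weapon overall. A within-weapon proportion separates the two effects. The sketch below uses hypothetical data (toy_wt is invented) to show how a category can have the highest attack count without having a higher attack share.

```r
library(dplyr)

# Hypothetical toy data; counts are invented for illustration
toy_wt <- data.frame(
  weapon_cat   = c(rep("gun", 6), rep("knife", 3), "unarmed"),
  threat_level = c("attack", "attack", "attack", "attack", "other", "other",
                   "attack", "attack", "other",
                   "attack")
)

threat_within_weapon <- toy_wt %>%
  count(weapon_cat, threat_level) %>%
  group_by(weapon_cat) %>%
  mutate(prop = n / sum(n)) %>%   # share of each threat level within a weapon type
  ungroup()

threat_within_weapon
```

Here gun has twice as many attack rows as knife, yet both have the same attack share (2/3). Running the same summary on the real data would show whether the association between guns and the attack label holds once the overall frequency of guns is accounted for.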

Conclusion

  • Demographic Trends: The majority of victims are male, typically 25–40 years old; Black individuals face disproportionately higher risk relative to population size.

  • Situational Insights: Most victims were armed, often with guns, but a notable share were unarmed or mentally ill, raising questions about threat assessment and escalation.

  • Behavioral Patterns: Many were not fleeing, showing that fatal shootings often occur in static encounters, not active chases.

  • Threat Perception: “Attack” is the most frequent threat label, especially when firearms are involved, suggesting a strong bias toward perceiving high threat.

  • Geographic Trends: Incidents cluster in urban hubs like Los Angeles, Houston, and Phoenix, showing concentration in large metro areas.

    The data suggests that systemic patterns, not isolated incidents, underlie police shootings in the U.S. While armed confrontations explain part of the trend, biases in threat perception, inadequate handling of mental health crises, and regional concentration of incidents all point to the need for deeper structural reform.