Examining Racial Bias in Police Stops in Phoenix

Racial disparities in traffic stops are a critical area of inquiry as traffic stops are the most frequent form of police-initiated contact. However, traditional benchmarking methods used to assess racial bias—such as comparing stop rates to census demographics—fail to account for variations in driving exposure, police deployment, and temporal patterns.

To address these methodological shortcomings, this analysis uses the veil of darkness (VOD) hypothesis originally developed by Grogger and Ridgeway (2006). The VOD hypothesis leverages a natural experiment that capitalizes on changes in daylight across the year to evaluate whether officers are more likely to stop drivers of color during times when their race is visible (daylight) versus not (darkness).

Following best practices articulated by Knode et al. (2024), I apply this method to traffic citation data from the Phoenix Open Data Portal. This dataset contains traffic citation information (criminal and civil) from January 2018 forward, including demographic information for officers as well as individuals. I test for racial bias, focusing on stops during the “intertwilight period” where both daylight and darkness occur at the same clock times throughout the year. This approach allows for a more rigorous assessment of racial bias in traffic stop decisions, free from the confounding factors that hinder traditional analyses.

Setup and Loading the Data

First we need to clear the workspace and load the libraries we are going to use.

Load and Prepare the Data

loc <- "data/citations_traffic-citations-details_citationdetail.csv"
phoenix_dat <- read.csv( here( loc ), as.is = TRUE, header = TRUE )

There are 323336 traffic incidents from 2018 to 2025.

Extract and Process Time Variables

phoenix_dat <- phoenix_dat  |> 
  mutate(
    stop_date = mdy( TICK_DATE ),
    stop_time = as_hms( sprintf( "%02d:00:00", HOUR_OF_DAY ) )
  )

Get Sunset and Dusk Times for Phoenix

sun_dates <- tibble(
  date = unique( phoenix_dat$stop_date ),
  lat = 33.4484,
  lon = -112.0740
)

sun_times <- getSunlightTimes(
  data = sun_dates,
  keep = c( "sunset", "dusk" ),
  tz = "America/Phoenix"
)

sun_times <- sun_times |> 
  select( date, sunset, dusk )

phoenix_dat <- phoenix_dat  |> 
  left_join( sun_times, by = c( "stop_date" = "date" ) )

Code Lighting Condition

phoenix_dat <- phoenix_dat  |> 
  mutate(
    sunset = as_hms( sunset ),
    dusk = as_hms( dusk ),
    light_condition = case_when(
      stop_time < sunset ~ "daylight",
      stop_time > dusk ~ "darkness",
      TRUE ~ "ambiguous"
    )
  ) |> 
  filter( light_condition != "ambiguous" )

Clean a Few Variables

There are two variables that are of interest for our analysis:

SIMPLE_SUBJ_RE_GRP is a simple recoding of the SUBJ_RACE and SUBJ_ETHNICITY variables provided in the data. Specifically, this is the race/ethnicity of individuals grouped into four categories: Black or African American, Hispanic, White, Other. To create the simplified race/ethnicity grouping, individuals with an ethnicity entered as “Hispanic” were included in the “Hispanic” category, regardless of the race entered. Individuals with Black or African American entered even if a blend such as Black/ White or American Indian/Black are included in the “Black or African American” category.
SIMPLE_EMPL_RE_GRP is similar to SIMPE_SUBJ_RE_GRP, but pertains to the officer, not the citizen.

For these variables there are 573 in which the citizen’s race/ethnicity is listed as Unknown and 10486 listed as Other. These are removed from the analysis. Unknown and Other represent 0 and 3, percent of all cases, respectively. Also, there are 11893 in which the officer’s race/ethnicity is listed as Other.

These incidents are coded as missing (i.e. NA) and removed from the analysis.

# recode "unknown" and "other" to missing to make it simpler
phoenix_dat <- phoenix_dat |> 
  mutate( SIMPLE_SUBJ_RE_GRP = if_else( SIMPLE_SUBJ_RE_GRP == "Unknown", NA, SIMPLE_SUBJ_RE_GRP ) ) |> 
  mutate( SIMPLE_SUBJ_RE_GRP = if_else( SIMPLE_SUBJ_RE_GRP == "Other"  , NA, SIMPLE_SUBJ_RE_GRP ) ) |> 
  mutate( SIMPLE_EMPL_RE_GRP = if_else( SIMPLE_EMPL_RE_GRP == "Other"  , NA, SIMPLE_EMPL_RE_GRP ) )

# drop missing cases
phoenix_dat <- phoenix_dat  |> 
  filter( !is.na( SIMPLE_SUBJ_RE_GRP ) )  |> 
  mutate( driver_black = if_else( SIMPLE_SUBJ_RE_GRP == "Black", 1, 0 ) ) |> 
  mutate( driver_hisp  = if_else( SIMPLE_SUBJ_RE_GRP == "Hispanic", 1, 0 ) ) |>
  mutate( driver_white = if_else( SIMPLE_SUBJ_RE_GRP == "White", 1, 0 ) )

Create Daylight Dummy Variable

phoenix_dat <- phoenix_dat  |> 
  mutate( daylight = if_else( light_condition == "daylight", 1, 0 ) )

Tables for Stops

Now that the data are prepared, let’s first look at a bivariate table for daytime for night and driver race/ethnicity.

# Summarize race proportions by light condition
race_pct_table <- phoenix_dat  |> 
  group_by(daylight, SIMPLE_SUBJ_RE_GRP) |> 
  summarize(n = n(), .groups = "drop")  |> 
  group_by(daylight)  |> 
  mutate(pct = round(n / sum(n) * 100, 2))  |> 
  select(-n) |> 
  pivot_wider(names_from = SIMPLE_SUBJ_RE_GRP, values_from = pct) |> 
  rename(Category = daylight) |> 
  mutate(Category = case_when(
    Category == 1 ~ "Daylight",
    Category == 0 ~ "Darkness",
    TRUE ~ as.character(Category)
  )) %>%
  select(Category, Black, Hispanic, White)

# Create total row
total_row <- phoenix_dat %>%
  filter(SIMPLE_SUBJ_RE_GRP %in% c("Black", "Hispanic", "White")) %>%
  count(SIMPLE_SUBJ_RE_GRP) %>%
  mutate(pct = round(n / sum(n) * 100, 2)) %>%
  select(-n) %>%
  pivot_wider(names_from = SIMPLE_SUBJ_RE_GRP, values_from = pct) %>%
  mutate(Category = "All Stops") %>%
  select(Category, Black, Hispanic, White)

# Combine and order rows
race_pct_table <- bind_rows(total_row, race_pct_table) %>%
  mutate(Category = factor(Category, levels = c("All Stops", "Daylight", "Darkness"))) %>%
  arrange(Category)

# Display using pander
pander( race_pct_table, caption = "Percentage of Stops by Race and Light Condition" )

Percentage of Stops by Race and Light Condition
Category	Black	Hispanic	White
All Stops	13.82	44.28	41.9
Daylight	13.56	43.05	43.39
Darkness	15.22	50.94	33.84

The table shows the percentage of stops by race/ethnicity.

The All Stops row represents the percentage of all stops that the specific racial/ethnic group represents.
- Citizens who were coded as Black made up 13.82 percent of all traffic stops, whereas those coded as Hispanic made up 44.28 percent.

The Darkness row represents the percentage of stops that were coded as dark and the Daylight row represents the percentage of stops that were coded as light. Comparing these within racial/ethnic groups gives some indication of whether there are differences in citations by day versus night.
- Citizens who were coded as Black represent 13.56 percent of traffic stops during the day whereas this same group represents 15.22 percent of traffic stops at night. In other words, citizens who are Black are less likely to be pulled over during the day, than at night.
- Citizens who were coded as Hispanic represent 43.05 percent of traffic stops during the day whereas this same group represents 50.94 percent of traffic stops at night. In other words, citizens who are Hispanic are less likely to be pulled over during the day, than at night.

Note that these differences do not adjust for any situational characteristics. In the next steps, we want to weight the days by the amount of sun as suggested by Knode et al. (2024).

Weighting for Seasonality

weights_by_day <- phoenix_dat %>%
  group_by( stop_date ) %>%
  summarize( weight = 1 - abs( mean( daylight ) - 0.5 ) )

phoenix_dat <- phoenix_dat %>%
  left_join( weights_by_day, by = "stop_date" )

phoenix_dat <- phoenix_dat %>%
  mutate(
    day_of_week = wday( stop_date, label = TRUE ),
    stop_hour = hour( stop_time ),
    officer_white = if_else( SIMPLE_EMPL_RE_GRP  == "White",    1, 0 ),
    officer_black = if_else( SIMPLE_EMPL_RE_GRP  == "Black",    1, 0 ),
    officer_hisp  = if_else( SIMPLE_EMPL_RE_GRP  == "Hispanic", 1, 0 ),
    officer_male  = if_else( ISSUING_OFFICER_SEX == "Male",    1, 0 ),
    subj_male     = if_else( SUBJ_SEX            == "Male",    1, 0 )
  )

Predicting Driver Race

Now, we can estimate whether daylight (versus night) predicts a stop. Specifically, if Black or Hispanic drivers are more likely to be cited during the day (i.e. a positive coefficient for daylight) then there is evidence of racial bias in citations.

Simple Model for Daylight

# model for black
model_black <- glm( driver_black 
              ~ daylight ,
             data = phoenix_dat,
             weights = weight,
             family = binomial() )


# model for hispanic
model_hisp <- glm( driver_hisp 
              ~ daylight ,
             data = phoenix_dat,
             weights = weight,
             family = binomial() )


# model for white
model_white <- glm( driver_white 
              ~ daylight ,
             data = phoenix_dat,
             weights = weight,
             family = binomial() )

# Updated function: extract estimate, SE, stars
exclude_fx <- function(model, label) {
  tidy(model) %>%
    mutate(
      formatted = sprintf("%.3f (%.3f)%s",
                          estimate,
                          std.error,
                          case_when(
                            p.value < 0.001 ~ "***",
                            p.value < 0.01  ~ "**",
                            p.value < 0.05  ~ "*",
                            TRUE ~ ""
                          ))
    ) %>%
    select(term, !!label := formatted)
}

# Get formatted results
black_terms <- exclude_fx(model_black, "Black Driver")
hisp_terms  <- exclude_fx(model_hisp,  "Hispanic Driver")
white_terms <- exclude_fx(model_white, "White Driver")

# Combine into single comparison table
combined <- full_join(black_terms, hisp_terms, by = "term") %>%
  full_join(white_terms, by = "term") %>%
  arrange(term)

# Reorder so daylight comes right after the intercept
combined <- combined %>%
  mutate(term = factor(term, levels = c("(Intercept)", "daylight", setdiff(term, c("(Intercept)", "daylight"))))) %>%
  arrange(term)

# Display with pander
pander( combined, caption = "Models Predicting Driver Race/Ethnicity among Traffic Stops in Phoenix, Coefficients with Standard Errors")

Models Predicting Driver Race/Ethnicity among Traffic Stops in Phoenix, Coefficients with Standard Errors
term	Black Driver	Hispanic Driver	White Driver
(Intercept)	-1.735 (0.015)***	0.059 (0.011)***	-0.684 (0.011)***
daylight	-0.117 (0.017)***	-0.327 (0.012)***	0.407 (0.012)***

The table above shows three models: one predicting whether the driver is Black, one predicting whether the driver is Hispanic, and one predicting whether the driver is White.

The only predictor in the model is daylight, which is a dummy variable for whether it is daytime (=1) or night (=0). The negative coefficients predicting whether the driver is Black or Hispanic indicates that Black and Hispanic drivers are less likely to be pulled over during the day than at night. In contrast, we see that White drivers, relative to Black and Hispanic drivers are more likely to be pulled over during the day.

Including Additional Predictors

Let’s take a look at the models. Each model includes controls for day of the week, the hour of the stop, officer race/ethnicity, officer sex, driver sex, driver age, year, and beat (much of these are excluded from the table below to help with visualization).

# model for black
model_black <- glm( driver_black 
              ~ daylight 
              + officer_black + officer_hisp + officer_male 
              + subj_male + SUBJ_AGE
              + factor( DAY_OF_WEEK )
              + stop_hour
              + factor( YEAR ) + factor( BEAT_NUM ),
             data = phoenix_dat,
             weights = weight,
             family = binomial() )


# model for hispanic
model_hisp <- glm( driver_hisp 
              ~ daylight 
              + officer_black + officer_hisp + officer_male 
              + subj_male + SUBJ_AGE
              + factor( DAY_OF_WEEK )
              + stop_hour
              + factor( YEAR ) + factor( BEAT_NUM ),
             data = phoenix_dat,
             weights = weight,
             family = binomial() )


# model for white
model_white <- glm( driver_white 
              ~ daylight 
              + officer_black + officer_hisp + officer_male 
              + subj_male + SUBJ_AGE
              + factor( DAY_OF_WEEK )
              + stop_hour
              + factor( YEAR ) + factor( BEAT_NUM ),
             data = phoenix_dat,
             weights = weight,
             family = binomial() )

# Updated function: extract estimate, SE, stars
exclude_fx <- function(model, label) {
  tidy(model) %>%
    filter(!grepl("^factor\\(YEAR\\)|^factor\\(BEAT_NUM\\)|^factor\\(DAY_OF_WEEK\\)", term)) %>%
    mutate(
      formatted = sprintf("%.3f (%.3f)%s",
                          estimate,
                          std.error,
                          case_when(
                            p.value < 0.001 ~ "***",
                            p.value < 0.01  ~ "**",
                            p.value < 0.05  ~ "*",
                            TRUE ~ ""
                          ))
    ) %>%
    select(term, !!label := formatted)
}

# Get formatted results
black_terms <- exclude_fx(model_black, "Black Driver")
hisp_terms  <- exclude_fx(model_hisp,  "Hispanic Driver")
white_terms <- exclude_fx(model_white, "White Driver")

# Combine into single comparison table
combined <- full_join(black_terms, hisp_terms, by = "term") %>%
  full_join(white_terms, by = "term") %>%
  arrange(term)

# Reorder so daylight comes right after the intercept
combined <- combined %>%
  mutate(term = factor(term, levels = c("(Intercept)", "daylight", setdiff(term, c("(Intercept)", "daylight"))))) %>%
  arrange(term)

# Display with pander
pander( combined, caption = "Models Predicting Driver Race/Ethnicity among Traffic Stops in Phoenix, Coefficients with Standard Errors")

Models Predicting Driver Race/Ethnicity among Traffic Stops in Phoenix, Coefficients with Standard Errors
term	Black Driver	Hispanic Driver	White Driver
(Intercept)	-2.368 (0.097)***	-0.216 (0.066)***	-0.379 (0.063)***
daylight	-0.161 (0.024)***	-0.144 (0.018)***	0.267 (0.019)***
SUBJ_AGE	-0.003 (0.000)***	-0.025 (0.000)***	0.027 (0.000)***
officer_black	-0.030 (0.028)	0.061 (0.021)**	-0.038 (0.023)
officer_hisp	-0.026 (0.018)	0.044 (0.013)***	-0.030 (0.014)*
officer_male	0.052 (0.023)*	-0.076 (0.017)***	0.048 (0.018)**
stop_hour	-0.009 (0.002)***	-0.007 (0.001)***	0.014 (0.001)***
subj_male	-0.039 (0.014)**	0.156 (0.011)***	-0.146 (0.011)***

This table shows that the findings from the bivariate table are robust to the inclusion of a variety of covariates. That is, the negative coefficients predicting whether the driver is Black or Hispanic indicates that Black and Hispanic drivers are less likely to be pulled over during the day than at night. In contrast, we see that White drivers, relative to Black and Hispanic drivers are more likely to be pulled over during the day.

Thoughts…

Recall that the VOD hypothesis leverages a natural experiment that capitalizes on changes in daylight across the year to evaluate whether officers are more likely to stop drivers of color during times when their race is visible (daylight) versus not (darkness). This analysis showed that there is not evidence in the Phoenix traffic citation data to indicate that Black or Hispanic drivers are more likely to be pulled over during the day.

Back to Open Criminology Phoenix page

Please report any needed corrections to the Issues page. Thanks!