Introduction

The veil of darkness (VOD), first designed by Grogger and Ridgeway in 2006, is a methodology for analyzing police traffic stops to discern whether racially disparate traffic stop behavior is present. The VOD leverages a natural experiment in seasonal variation of daylight over the course of the year. Daylight serves as the treatment condition, darkness as the control, and driver race acts as the outcome. In this way, if drivers of a racial group are overrepresented during daylight, net of controls, it may be because racial profiling is occurring.

This walkthrough is based on the roadmap outlined in our 2024 article and is meant to serve as a reproducible demonstration for researchers and practitioners to learn how to conduct analyses of their own. Some of the details of our decision making in this analysis are more briefly outlined than within this article, and we encourage those looking for more elaboration to review the article and its dedicated supplementary materials. The dataset we use in this walkthrough is accessible [insert where folks can access this] and consists of de-identified and blurred traffic stop data.

Load in necessary packages

You will need to install several packages to run this code and conduct these analyses, including: - readr to import our dataset.
- lubridate for mathematical manipulation of time and date variables.
- dplyr for basic data management.
- suncalc for generating sun times (e.g., sunset, dusk)
- Epi to create our time spline variable, used to account for time of day in analyses

library(readr)
library(lubridate)
library(dplyr)
library(suncalc)
library(Epi)

Once these packages are loaded, set your working directory to the folder you have the sample dataset in and run the code below to import it.

data <- readr::read_csv("SampleData.csv")

A Note on Data Cleaning…

All datasets are different. The type of cleaning necessary for your data will heavily depend on the fields within the administrative dataset you’re working with. The sample data for this demonstration has already largely been cleaned, as this walkthrough is meant to demonstrate how to conduct a VOD analysis rather than how to clean administrative data. When working with your own data, consider the following questions to make sure your data is prepared for analyses:
- Do the data include pedestrian stops, or only traffic stops? We want to ensure we’re only using traffic stops when conducting analyses.
- Are there leading or trailing white spaces with any of the data?
- Are there any misspellings with data entries?
- Are there any duplicated stops in the dataset?
- Do stops only have one vehicle or multiple vehicles?
- Do any stops have multiple officers on-site, potentially leading to duplicated entries?
- Do stops log multiple individuals’ race, not just the driver’s?
- Should certain variables be coded the same even though they are labeled differently?
- Do we have latitude and longitude for each stop? If not, can we generate latitudes and longitudes that approximate the stop? (e.g., for the city/town, for the county)

Exclusion criteria

The first thing we’ll need to do is remove stops that should not be included in a VOD analysis. For example, if a stop does not involve Discretion (e.g., those resulting from a warrant), it should be excluded.

data <- data[data$Discretion != "None",] 

After removing those 85 stops, we should likewise remove the 610 stops where Race was marked as Unknown, as race is our key outcome variable.

data <- data[data$Race != "Unknown",] 

These are the exclusion criteria for this sample dataset, but keep the questions on data cleaning in mind when excluding stops. For example, are any of the stops included in your dataset conducted for pedestrians or aquatic vehicles? If so, it may be vital for your analyses that these are excluded.

Generating our intertwilight period (ITP)

The VOD operates with the base assumption that it is more difficult to discern the race of a driver during darkness than during daylight. Note that this does not mean that it is impossible to identify a driver’s race during darkness, nor that an officer will always see a driver’s race during daylight - only that there is a difference in visibility. As such, the VOD leverages the fact that sunlight naturally varies over the course of the year to determine whether racial profiling might be occurring.

As such, the next step of our analyses is coding the evening intertwilight period (ITP) for each stop. The ITP is the inclusion criteria for our analyses - we must analyze stops with valid comparison groups, so we only want to look at stops at times that could feasibly be either daylight or dark depending on the time of the year. In other words, a stop at 7 PM might occur during daylight in the summer or during darkness in the winter, but we wouldn’t include stops at 12 PM because those will only occur during daylight for our study region. This means we should only look at stops that take place between the earliest dusk and the latest sunset over the course of the year for the stop’s location.

To begin, our dataset includes dates and times as one column. To manipulate them, we first need to create two new columns: one for Date, one for Time.

data$Date <- format(lubridate::mdy_hm(data$DateTime)) 
data$Date <- as.Date(data$Date)            
data$Time <- format(lubridate::mdy_hm(data$DateTime), "%H:%M:%S") 

The code generating sun times (e.g., sunset and dusk) can take a while to run, especially for those with larger datasets. We have provided two example benchmarks demonstrating the time it took for our code generating ITP start and end times to run below, one with a lower end processor and the other with a more powerful device:
- AMD Ryzen 5 5500U, 2.10 GHz, 8 GB RAM: 23m02s - AMD Ryzen 5 5600G, 4.20 GHz, 32 GB RAM: 8m22s

As such, we should generate our ITPs before coding stops as occurring during daylight or darkness so that the latter step operates with fewer observations. To do so, we first need to find what days the earliest dusk and latest sunset occur. Start by creating a vector of all dates during the year(s) of your data - for us, that’s just 2023 (Dates_2023).

Dates_2023 <- seq(lubridate::ymd("20230101"), lubridate::ymd("20231231"), by = "days") 

Next, we find the earliest dusk and latest sunset over the course of the year for each observation within our dataset. To do so, we use dplyr to group data by Stop_ID, a unique identifier for each traffic stop. We also use the suncalc package to generate the latest sunset and earliest dusk times over the course of the year for the coordinates of each stop, such that we are left with the beginning (ITP_Start) and end (ITP_End) of our ITP.

data <- data %>% 
  dplyr::group_by(Stop_ID) %>%   
  dplyr::mutate(ITP_Start = min(format(lubridate::ymd_hms(suncalc::getSunlightTimes(Dates_2023, 
                                       Lat, Lon, tz = "America/New_York")$dusk), 
                                "%H:%M:%S")), 
         ITP_End = max(format(lubridate::ymd_hms(suncalc::getSunlightTimes(Dates_2023,
                                       Lat, Lon, tz = "America/New_York")$sunset), 
                                "%H:%M:%S")))

Next, code stops according to whether they occur during the ITP. To do so, we create a binary indicator variable, with 1 representing stops that are within the ITP and 0 representing stops that are outside of it.

data$InITP <- ifelse(data$Time < data$ITP_Start, 0,             
                        ifelse(data$Time > data$ITP_End, 0, 1))  

Finally, we limit our dataset to only include stops within the ITP (InITP).

data <- data[data$InITP == 1,]

As you can see, this drastically reduces the number of stops available for analyses to only 3,418. These represent stops within our dataset that can have counterfactuals: they can all theoretically take place during either daylight or darkness. And that’s the next step - coding the daylight variable.

Coding for daylight and excluding ambiguous stops

For evening stops, which are the only ones we are examining in this analysis, daylight is defined as stops that occur before Sunset, whereas darkness is defined as stops that occur after Dusk. Between sunset and dusk, the amount of daylight present is rapidly shifting, making this distinction ambiguous. Because of that, we must exclude these stops. To do this, we’ll first need to code Sunset and Dusk for each stop with the following code. Since we already restricted the data to the ITP, this will take 70-80% less time than it would have otherwise.

data <- data %>%
  dplyr::group_by(Stop_ID) %>%
  dplyr::mutate(Sunset = suncalc::getSunlightTimes(Date, Lat, Lon, tz = "America/New_York")$sunset, 
         Dusk = suncalc::getSunlightTimes(Date, Lat, Lon, tz = "America/New_York")$dusk)
data$Sunset <- format(lubridate::ymd_hms(data$Sunset), "%H:%M:%S") 
data$Dusk <- format(lubridate::ymd_hms(data$Dusk), "%H:%M:%S")

Now, we code these stops. Again, stops occurring before a given day’s sunset time are during Daylight (1) whereas stops occurring after a given day’s dusk time are during darkness (0), and stops between sunset and dusk are ambiguous (2). These ambiguous stops will be excluded from our analyses.

data$Daylight <- ifelse(data$Time < data$Sunset, 1,
                        ifelse(data$Time > data$Dusk, 0, 2))
data <- data[data$Daylight != 2,] 

After the 321 stops during the ambiguous period are excluded, we have 3,097 observations.

Weighting stops to account for seasonality

We’re almost done with preparing the daylight-related variables for analyses - now, we just need to account for seasonal driving variation. If the driving population varies over the course of the year (e.g., summer tourism), not accounting for this could bias our results. Because of this, we’ll be creating weights for each stop based on the proportion of daylight and darkness present in each day. In this way, stops that are almost exclusively coded as daylight (summer) and stops that are almost exclusively coded as darkness (winter) receive lower weights because they might otherwise inflate our estimates if the racial composition of the driving population is considerably different than during other times of year.

Accordingly, we first calculate the number of seconds for each day that take place during daylight (Daylightsec), as well as the number of seconds taking place during darkness (Darksec).

data$Daylightsec <- lubridate::period_to_seconds(lubridate::hms(data$Sunset)) - 
  lubridate::period_to_seconds(lubridate::hms(data$ITP_Start))
data$Darksec <- lubridate::period_to_seconds(lubridate::hms(data$ITP_End)) - 
  lubridate::period_to_seconds(lubridate::hms(data$Dusk))

Some of these values will be negative (when the sunset is before the ITP starts, or when the dusk is after the ITP ends). That will cause mathematical issues later, so we need to set these values to zero.

data$Daylightsec <- pmax(data$Daylightsec, 0)
data$Darksec <- pmax(data$Darksec, 0)

Now, we can calculate the probability of daylight (Pr_Day) relative to darkness for each day. A value of 1 means all stops on the day will be coded as daylight, and a value of 0 means all stops on the day will be coded as darkness.

data$Pr_Day <- (data$Daylightsec/(data$Darksec+data$Daylightsec))

Now, we create a quadratic weight (QWUnfixed) based on this probability. Stops at the farthest end of the spectrum (closer to 0 or 1) receive the lowest weights, while stops near the middle receive the highest.

data$QWUnfixed <- (data$Pr_Day*(1-data$Pr_Day))

We could finish here and use these weights, but there’s a pretty serious bias-variance tradeoff. As stops during summer and winter take place almost exclusively during daylight OR darkness, these will receive weights of 0 and be essentially excluded from analyses. These results will be a little less biased, but we’ll seriously inflate our variance and functionally halve our sample size. This is a big deal, especially for smaller departments. Because of this, we can add a floor to our weights based on the mean weight (MeanQW), such that these results are still downweighted relative to other stops but not completely excluded. We’ll add this floor to our previously generated weights to create the weights we use in our analyses (FinalQW).

Mean_QW <- mean(data$QWUnfixed)
data$FinalQW <- data$QWUnfixed + Mean_QW

Coding and including other covariates for the VOD analysis

We’re done with all of the coding related to sun times, but we still need to code our other important variables for the VOD analysis. To start with, we should make a new binary indicator of Race depending on the racial group of drivers we want to examine in our VOD analysis. For this walkthrough, we will examine how Daylight relates to whether a stopped driver is Black.

data$Black <- as.factor(ifelse(data$Race == "Black", 1, 0))

Next, we need to control for temporal variation in driving population. For example, the drivers on the road will likely differ depending on the day of the week (DOW), so we should create a categorical variable for that.

data$DOW <- as.factor(weekdays(data$Date))

Similarly, the drivers on the road will also likely differ depending on the time of day. To account for this, we need to create a time splines variable with 6 degrees of freedom (TimeSpline1, TimeSpline2TimeSplineX). This essentially says that time can continuously vary, but also accounts for certain time periods being “grouped” together.

TimeSplines <- Epi::Ns(period_to_seconds(hms(data$Sunset)), df = 6)
data <- cbind(data, TimeSplines)
data <- data %>%
  dplyr::rename(TimeSpline1 = `1`,TimeSpline2 = `2`,TimeSpline3 = `3`,
                TimeSpline4 = `4`,TimeSpline5 = `5`,TimeSpline6 = `6`)

Next, because our VOD estimate is meant to help us detect whether racial profiling may be occurring, we need to account for Discretion. Lower discretion stops are more likely to be made regardless of driver race, so failing to account for this might bias our estimates otherwise. The sample dataset here already includes our coding for the level of discretion in a stop, but if using your own dataset, this will likely vary depending on the administrative data in question and departmental reporting practices. For this demonstration, simply set the Discretion variable to be a “factor” variable.

data$Discretion <- as.factor(data$Discretion)
data$Discretion <- relevel(factor(data$Discretion), ref = "Low")

Likewise, it’s important to control for officer Assignment, as the typical hours each assignment works (e.g., regular “business” hours 8-6, vs midnight shifts) and their goal (e.g., crime reduction) may otherwise influence our estimator.

data$Assignment <- as.factor(data$Assignment)
data$Assignment <- relevel(factor(data$Assignment), ref = "General")

Lastly, we should account for our data structure. Sometimes, this might mean including variables if you have specific officer IDs, squads, precincts, and so on. For us, we just need to include Department in analyses.

data$Department <- as.factor(data$Department)

Conducting the VOD analysis

Finally, we’re ready to conduct a VOD analysis. Let’s run through what we need to include: - Black as our outcome variable
- Daylight as our predictor variable
- FinalQW as our seasonality weights
- Department, Assignment, Discretion, DOW, and TimeSplineX as covariates

We use a quasibinomial link function to account for the error distribution of our dichotomous outcome, Black. Let’s run the regression and see our results.

VOD.model <- glm(Black ~ Daylight + Department + Assignment + Discretion + DOW + TimeSpline1 +
                   TimeSpline2 + TimeSpline3 + TimeSpline4 + TimeSpline5 + TimeSpline6, 
                 data = data, weights = FinalQW, family = quasibinomial())
summary(VOD.model)
## 
## Call:
## glm(formula = Black ~ Daylight + Department + Assignment + Discretion + 
##     DOW + TimeSpline1 + TimeSpline2 + TimeSpline3 + TimeSpline4 + 
##     TimeSpline5 + TimeSpline6, family = quasibinomial(), data = data, 
##     weights = FinalQW)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.9693  -0.3255  -0.1781   0.3142   1.2554  
## 
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               -1.953545   0.264217  -7.394 1.83e-13 ***
## Daylight                   0.212601   0.118485   1.794 0.072860 .  
## DepartmentCherry PD       -1.152253   0.152657  -7.548 5.79e-14 ***
## DepartmentMaple PD         0.939931   0.110858   8.479  < 2e-16 ***
## DepartmentOak PD           0.152409   0.140972   1.081 0.279724    
## AssignmentCrime Reduction  1.208104   0.102258  11.814  < 2e-16 ***
## AssignmentOther            1.964003   0.617930   3.178 0.001496 ** 
## AssignmentSpecial Focus    0.856367   0.173305   4.941 8.17e-07 ***
## AssignmentTraining         0.095235   0.200622   0.475 0.635038    
## DiscretionHigh             0.445686   0.116945   3.811 0.000141 ***
## DOWMonday                 -0.011741   0.156929  -0.075 0.940363    
## DOWSaturday               -0.002991   0.156773  -0.019 0.984782    
## DOWSunday                  0.064646   0.157706   0.410 0.681895    
## DOWThursday               -0.077547   0.156119  -0.497 0.619427    
## DOWTuesday                -0.088497   0.161560  -0.548 0.583893    
## DOWWednesday              -0.356796   0.158855  -2.246 0.024771 *  
## TimeSpline1                0.580896   0.320425   1.813 0.069946 .  
## TimeSpline2                0.217210   0.347685   0.625 0.532193    
## TimeSpline3                0.549808   0.309344   1.777 0.075612 .  
## TimeSpline4                0.059536   0.376111   0.158 0.874237    
## TimeSpline5                0.637890   0.598502   1.066 0.286592    
## TimeSpline6                0.227723   0.311821   0.730 0.465261    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for quasibinomial family taken to be 0.1627714)
## 
##     Null deviance: 639.86  on 3096  degrees of freedom
## Residual deviance: 527.96  on 3075  degrees of freedom
## AIC: NA
## 
## Number of Fisher Scoring iterations: 5

Net of controls, it appears that the Daylight variable is marginally significant as a predictor for driver race being Black. This is more easily interpreted if we generate an Odds Ratio:

exp(coef(VOD.model)[2])
## Daylight 
## 1.236891

Accordingly, this can be interpreted as our model saying that Daylight traffic stops are 23.7% more likely to involve Black drivers than stops during darkness. However, as this result is only marginally significant (p = .073), we cannot be confident that these results are not due to chance.