Using Bayesian Analysis of Police Killings

Much has already been written about the draft paper by Roland Fryer, An Empirical Analysis of Racial Differences in Police Use of Force. Fryer makes several findings in his paper, but says the “startling” finding is that “Blacks are 23.8 percent less likely to be shot by police, relative to whites” (p24, though Fryer observes that non-lethal force is much more likely to be used against blacks than whites). He does not mean that more white people than black people in his study were shot at. Indeed, in his data (Table 1C), 46% of total police shootings were of black people (as opposed to 24% non-black, non-Hispanic), and in Houston, where he did much of his detailed analysis, 52% were of black people (compared to 14% non-black, non-Hispanic). What his analysis says is that, for any given encounter between cops and citizens, whites are more likely to be shot. But controlling for police encounters removes much of the disparity of interest, as others have said (see Michelle Phelps’ take here, and another take here), and as Fryer himself acknowledged in a Times Q&A.

I’m going to add a slight methodological wrinkle based on my sabbatical project, learning Bayesian data analysis. Know, though, that the definitive Bayesian analysis has already been written by Cody Ross: A Multi-Level Bayesian Analysis of Racial Bias in Police Shootings at the County-Level in the United States, 2011–2014. This is just the Cliff’s Notes version, but I figure that’s suitable for a blog. So what could a Bayesian approach help us with?

Bayesianism, in my view, helps surface some common problems about how to analyze results. Bayesian analysis is particularly good for conditional probabilities: the probability that, given X, Y is true (written, reading right to left, as p(Y|X), leading some to translate it to plain English as the probability of Y given X). The initial thing to note is that p(Y|X) is not the same as p(X|Y). Given that I can see the sun, the probability that it is daytime is high (unless I’m in outer space). But it’s less likely that, given that it’s daytime, I can see the sun (I might be indoors, it might be cloudy or foggy, the sun might be behind a building, etc.). Bayesianism is also particularly good at expressly taking into account how we should read evidence like Fryer’s, and at showing why “controlling” for the number of stops obscures more than it reveals.
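
To make the asymmetry between p(Y|X) and p(X|Y) concrete, here is a minimal Python sketch. The hourly tallies are invented purely for illustration (they come from no data source), and the helper name `p_given` is hypothetical.

```python
# Hypothetical tallies of 1,000 hours, invented for illustration only.
# Each key pairs the time of day with whether the sun is visible.
hours = {
    ("day", "sun"): 300,
    ("day", "no_sun"): 400,    # indoors, cloudy, behind a building, etc.
    ("night", "sun"): 0,
    ("night", "no_sun"): 300,
}


def p_given(counts, event, condition):
    """Estimate p(event | condition) from a table of joint counts."""
    # Tuple membership is an exact match, so "sun" does not match "no_sun".
    both = sum(n for key, n in counts.items() if event in key and condition in key)
    given = sum(n for key, n in counts.items() if condition in key)
    return both / given


p_day_given_sun = p_given(hours, "day", "sun")   # 300 / 300
p_sun_given_day = p_given(hours, "sun", "day")   # 300 / 700

print(f"p(day | sun) = {p_day_given_sun:.2f}")
print(f"p(sun | day) = {p_sun_given_day:.2f}")
```

With these made-up counts, seeing the sun guarantees daytime, but daytime leaves the sun visible well under half the time. Same two events, very different conditional probabilities.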

Let me give an example. Let’s say there’s a fatal, but rare, disease, where the probability of any individual having the disease is 1 percent (in equation form, p(D)=.01). We have a test for it, but our tests aren’t that great. If you have the disease, your likelihood of testing positive for it is 80 percent (p(Pos|D)=.80). If you don’t have it, you will still get a false positive about 60 percent of the time (p(Pos|not D)=.60). Here’s the pre-Bayesian, “blacks and whites are stopped roughly equally” question: if you get a positive test, is it more likely than not that you have the disease? The answer is no. It’s more likely that you have the disease than it was before you took the test, but, remember, the chance that you had the disease before you took the test was only one percent. Your chances increase, but only so far. This is what Bayesianism is useful for. The prior probability of having the disease was low (the “Bayesian prior” or base-rate probability, among other terms), and the analysis bakes that prior into how you should revise your beliefs.

Bayes’ theorem puts this in equation form:

$$p(D \mid \text{Pos}) = \frac{p(\text{Pos} \mid D)\,p(D)}{p(\text{Pos} \mid D)\,p(D) + p(\text{Pos} \mid \text{not } D)\,p(\text{not } D)}$$

The top line of the equation gives you the share of true positives: the likelihood that you have the disease, multiplied by the likelihood that, if you have the disease, you will test positive. The bottom line is all the positive results, both true and false positives, and the top is divided by it. Since the disease population is so small, the numerator is also small relative to the large denominator.

Plugging in the data, we get:

$$p(D \mid \text{Pos}) = \frac{.80 \times .01}{(.80 \times .01) + (.60 \times .99)} = \frac{.008}{.602} \approx .013$$

For a total of 1.3 percent. So it’s slightly more likely that you have the disease, but not by much, since the disease is still really rare (and our tests aren’t that accurate). The false positives swamp the true positives.
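
The same arithmetic can be checked in a few lines of Python. This is a minimal sketch using only the three numbers from the disease example; the function name `posterior` is a hypothetical label, not anything from Fryer or Ross.

```python
def posterior(prior, p_pos_given_d, p_pos_given_not_d):
    """Bayes' theorem for a binary condition and a positive test result."""
    true_positives = p_pos_given_d * prior
    false_positives = p_pos_given_not_d * (1 - prior)
    return true_positives / (true_positives + false_positives)


# Numbers from the example: p(D) = .01, p(Pos|D) = .80, p(Pos|not D) = .60
p_disease_given_pos = posterior(0.01, 0.80, 0.60)

print(f"p(D | Pos) = {p_disease_given_pos:.3f}")  # prints p(D | Pos) = 0.013
```

Raising the prior in the call (say, to 0.30) shows how much the answer depends on the base rate rather than on the test itself.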

Fryer’s work is extremely helpful for its data collection, and there is much to admire in it. But when we get to the “startling” finding, we can map it onto the above equations like this: Fryer looked only at the probability of lethality given contact with a suspect of a given race, that is, p(lethal|black encounter) and p(lethal|white encounter). But without understanding the likelihood of those encounters, that’s like looking at the test’s hit rate alone and ignoring how rare the disease is. Lethal police encounters are rare: about one percent, according to this study by Goff et al. (see Table 1). But stops are not. What I think we really want to know is not necessarily whether a given encounter is more likely to be lethal. What I think we are interested in is p(black stops|lethal): that is, given that someone has been shot, what is the likelihood that it came from a black person being stopped? (Heather Mac Donald, whose work I have previously criticized, asks a still different, and irrelevant, question: p(cops|black homicides), the percentage of killings of black people attributable to police.) What we ultimately care about, Fryer in his article included, is racial disparity.

What do we know? Based on the New York City stop and frisk data, much of the stopping in New York, of all races, wasn’t fruitful: about 6 percent of all stops (across races) resulted in an arrest, and about 3 percent turned up weapons or contraband. But black people were stopped much more often: they made up about 58 percent of all stops, more than twice their percentage of the population. Whites made up about 10 percent of all stops. Fryer uses this data, but, unfortunately, it doesn’t include lethal events. For those, he uses the figures I quoted earlier, with an emphasis on Houston. I happen to agree with Fryer (p4) that there is too little good data. Here is what I think a good equation would look like, assuming that there are only two races for simplicity:

$$p(\text{black stop} \mid \text{lethal}) = \frac{p(\text{lethal} \mid \text{black stop})\,p(\text{black stop})}{p(\text{lethal} \mid \text{black stop})\,p(\text{black stop}) + p(\text{lethal} \mid \text{white stop})\,p(\text{white stop})}$$

So let’s take Fryer’s estimates. The probability of a black stop turning lethal is about 75 percent of the probability of a white stop turning lethal. That, in fact, is the same ratio I used in my disease example (.60 to .80). But lethal events are still much more likely to involve black people because black people have many more encounters with the police. The huge disparity in p(black stop) relative to p(white stop) overwhelms any difference in the relative lethality of any given encounter. This explains the results Cody Ross and many, many others have found. Ross, in particular, found (among many other interesting things) that the probability of being black, unarmed, and shot by police was about 3.49 times the probability of being white, unarmed, and shot by police. Base rates explain the difference.
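
As a rough illustration of how the base rates dominate, here is a small Python sketch. It rests on loud assumptions: it borrows the New York stop shares mentioned above (about 58 percent black, about 10 percent white), pairs them with the roughly 75 percent relative lethality, and pretends there are only two races. Those figures come from different places and different data, so the output is a toy number, not an estimate. The function name `share_of_lethal` is hypothetical.

```python
def share_of_lethal(stop_share_a, stop_share_b, lethality_ratio_a_to_b):
    """p(group A stop | lethal) in a two-group world.

    Only the ratio p(lethal | A stop) / p(lethal | B stop) is needed,
    because the absolute lethality rate cancels out of Bayes' theorem.
    The stop shares need not sum to one for the same reason.
    """
    weighted_a = lethality_ratio_a_to_b * stop_share_a
    weighted_b = 1.0 * stop_share_b
    return weighted_a / (weighted_a + weighted_b)


# ASSUMPTION: NYC stop shares combined with the 0.75 lethality ratio,
# purely for illustration; these numbers do not come from one dataset.
black_share = share_of_lethal(0.58, 0.10, 0.75)

print(f"p(black stop | lethal) = {black_share:.2f}")
```

Under these toy inputs, the lower per-stop lethality barely dents the result: the group stopped nearly six times as often still accounts for the large majority of lethal stops.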

Again, there is nothing wrong with Fryer’s answer about the percentage of stops that are lethal—it’s just not a very interesting question, and it lends itself to misunderstanding. Saying blacks are less likely to be shot really depends on the question you’re asking, and I don’t think Fryer’s is the right one. Ignoring base rates of stopping, choosing to start the measurement at the point of police contact, distorts what those who are concerned about racial disparities—including me—care about the most: why so much police activity, including lethality, is directed at black people.

There can be some legitimate debate about what baseline to use. This analysis looks at the rate of lethal encounters per arrest. In those terms, white lethality is more likely than black. But this is where Bayesianism has an advantage: it wears its assumptions on its sleeve. I think that arrests are the wrong baseline to measure, and that’s where the argument needs to take place: not just hand-waving at difficult math (much as I do it myself), but at the model’s assumptions. Many stops don’t result in arrests—in New York, black people were stopped more often and police still didn’t uncover wrongdoing more often. Using arrests, again, starts the clock at the wrong place.

In his paper, Fryer has suggested that the higher incidence of physical (but non-lethal) force is also a cause for concern, given that he found that on a given stop, police were more likely to use force on blacks than whites. When you couple that with the higher likelihood that black people are going to be stopped in the first place, that means the problem is perhaps even greater than he realized.

None of this is to say that any of this is particularly easy. There are legitimate arguments about what goes into the police stopping of black people in the first place. Fryer has, I think, also been unjustly criticized on methodological grounds, even though he acknowledges these deficiencies in the paper. He noted the problems with relying on police department participation (“It is possible that these departments only supplied the data because they are either enlightened or were not concerned about what the analysis would reveal,” p7), potential misrepresentation by police departments, and how representative the cities he chose were. He even acknowledges, without incorporating, “the possibility that there are important racial differences in whether or not these police-civilian interactions occur at all” (p25). He does not deal with the critiques that certain crimes are endogenous, the result of prior run-ins with the law (bench warrants for failure to pay fines, felon-in-possession laws).

My analysis is a simple model; Ross’s article is much more sophisticated. But the insights that this very basic analysis gives (from a very basic analyst just getting started) will, I hope, suggest why I think Bayesian analysis is so interesting in the first place.

[Cross-posted at The Reality-Based Community]


David Ball