Thursday, July 14, 2016

On Arrest Filters and Empirical Inferences

I've been thinking a bit more about Roland Fryer's working paper on police use of force, prompted by this thread by Europile and excellent posts by Michelle Phelps and Ezekiel Kweku.

The Europile thread contains a quick, precise, and insightful summary of the empirical exercise conducted by Fryer to look for racial bias in police shootings. There are two distinct pools of observations: an arrest pool and a shooting pool. The arrest pool is composed of "a random sample of police-civilian interactions from the Houston police department from arrests codes in which lethal force is more likely to be justified: attempted capital murder of a public safety officer, aggravated assault on a public safety officer, resisting arrest, evading arrest, and interfering in arrest." The shooting pool is a sample of interactions that resulted in the discharge of a firearm by an officer, also in Houston. 

Importantly, the latter pool is not a subset of the former, or even a subset of the set of arrests from which the former pool is drawn. Put another way, had the interactions in the shooting pool been resolved without incident, many of them would never have made it into the arrest pool. Think of the Castile traffic stop: had this resulted in a traffic violation or a warning or nothing at all, it would not have been recorded in arrest data of this kind.

The analysis in the paper is based on a comparison between the two pools. The arrest pool is 58% black while the shooting pool is 52% black, which is the basis for Fryer's claim that blacks are less likely than whites to be shot by police in the raw data. He understands, of course, that there may be differences in behavioral and contextual factors that make the black subset of the arrest pool different from the white subset, and attempts to correct for this using regression analysis. He reports that doing so "does not significantly alter the raw racial differences."
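
To see how the two pool shares translate into a "less likely" statement, here is a rough back-of-the-envelope sketch. It is not Fryer's calculation: it simply treats each pool as black versus everyone else and compares the odds of being black across the two pools, which is an assumption of this illustration. The function name `raw_odds_ratio` is made up for the example.

```python
def raw_odds_ratio(share_black_shootings: float, share_black_arrests: float) -> float:
    """Odds of being black in the shooting pool relative to the arrest pool.

    A value below 1 means blacks are under-represented among shootings
    compared with their share of the arrest pool.
    """
    odds_shootings = share_black_shootings / (1 - share_black_shootings)
    odds_arrests = share_black_arrests / (1 - share_black_arrests)
    return odds_shootings / odds_arrests


# Pool shares as reported in the post: 52% of shootings, 58% of arrests.
raw = raw_odds_ratio(0.52, 0.58)
print(round(raw, 3))  # 0.784
```

On these shares the raw odds are roughly a fifth lower for blacks. The comparison is only as meaningful as the arrest pool is as a benchmark, which is the issue taken up in the rest of the post.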

This analysis is useful, as far as it goes. But does this really imply that the video evidence that has animated the Black Lives Matter movement is highly selective and deeply misleading, as initial reports on the paper suggested?

Not at all. The protests are about the killing of innocents, not about the treatment of those whose actions would legitimately plant them in the serious arrest pool. What Fryer's paper suggests (if one takes the incident categorization by police at face value) is that at least in Houston, those who would assault or attempt to kill a public safety officer are treated in much the same way, regardless of race. 

But think of the cases that animate the protest movement, for instance the list of eleven compiled here. Families of six of the eleven have already received large settlements (without admission of fault). Six led to civil rights investigations by the Justice Department. With one or two possible exceptions, it doesn't appear to me that these interactions would have made it past Fryer's arrest filter had they been handled more professionally.

The point is this: if there is little or no racial bias in the way police handle genuinely dangerous suspects, but there is bias that leads some mundane interactions to turn potentially deadly, then the kind of analysis conducted by Fryer would not be helpful in detecting it. Which in turn means that the breathless manner in which the paper was initially reported was really quite irresponsible. 

For this the author bears some responsibility, having inserted the following into his discussion of the Houston findings:
Given the stream of video "evidence", which many take to be indicative of structural racism in police departments across America, the ensuing and understandable outrage in black communities across America, and the results from our previous analysis of non-lethal uses of force, the results displayed in Table 5 are startling... Blacks are 23.8 percent less likely to be shot by police, relative to whites.
His claim that this was "the most surprising result of my career" was an invitation to misunderstand and misreport the findings, which are important but clearly limited in relevance and scope.


Update. If you follow the links at the start of this post, you'll see a case made that Fryer's own findings of bias in the use of non-lethal force suggest that the composition of the arrest pool will be altered by bias in the charging of innocents for resisting or evading arrest.

It occurred to me that the same data used to examine use of non-lethal force (from the citizen's perspective) could also be used to get an estimate of this effect. This is the Bureau of Justice Statistics Police-Public Contact Survey. If anyone has already done this, please let me know; I'd be interested to see the findings.
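
The mechanism described in this update can be made concrete with a small sketch. Every number below is hypothetical and chosen only for illustration; none comes from Fryer's data or from the Police-Public Contact Survey. The sketch assumes two kinds of encounters (serious and mundane), identical treatment of serious suspects across groups, and bias only in how mundane encounters are handled. The names `Group`, `pools`, and `pooled_odds_ratio` are invented for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Group:
    serious: float            # encounters that genuinely fit the serious arrest codes
    mundane: float            # routine encounters, such as traffic stops
    p_shot_serious: float     # chance a serious encounter ends in a shooting
    p_shot_mundane: float     # chance a mundane encounter ends in a shooting
    p_charged_mundane: float  # chance a mundane encounter is charged as resisting/evading


def pools(g: Group) -> tuple[float, float]:
    """Expected sizes of the arrest pool and the shooting pool for one group."""
    arrests = g.serious + g.mundane * g.p_charged_mundane
    shootings = g.serious * g.p_shot_serious + g.mundane * g.p_shot_mundane
    return arrests, shootings


def pooled_odds_ratio(black: Group, white: Group) -> float:
    """Black-to-white ratio among shootings divided by the same ratio among arrests."""
    arrests_b, shootings_b = pools(black)
    arrests_w, shootings_w = pools(white)
    return (shootings_b / shootings_w) / (arrests_b / arrests_w)


# Hypothetical inputs: serious suspects are treated identically, but mundane
# encounters are twice as likely to end in a shooting and three times as
# likely to end in a resisting/evading charge for the black group.
white = Group(1000, 100_000, 0.05, 0.0001, 0.005)
black = Group(1000, 100_000, 0.05, 0.0002, 0.015)

biased_charging = pooled_odds_ratio(black, white)

# Same shooting bias, but with no bias in charging.
no_charging_bias = pooled_odds_ratio(replace(black, p_charged_mundane=0.005), white)

print(round(biased_charging, 3), round(no_charging_bias, 3))
```

With these made-up inputs the pooled ratio falls below 1 when charging is biased and rises above 1 when it is not, even though the shooting bias in mundane encounters is the same in both cases. That is only a demonstration that the comparison can mislead under these assumptions, not an estimate of what actually happens.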


  1. Well, again, if there's no bias in violent confrontations, but there is bias in innocent interactions, that would imply that on an aggregated basis you would see an upward bias in killings-to-arrests ratios. What you actually see is a significant downward bias.

  2. This would be true if there was bias in shooting innocents but no bias in wrongfully charging innocents with resisting/evading arrest. See the links above for more on this, and the argument that Fryer's own findings of low-level harassment imply precisely this kind of inflation of the arrest pool.

    But the point is that you can't make confident inferences from aggregate data or summary statistics one way or another, whether about race or gender or anything else. So Fryer is right to be looking at individual specific interaction data. The disagreement is about the interpretation of his findings, and this can't be settled by pointing to summary stats.

    1. As a layman with some small understanding of the issue, I find the paper to be illogical in its conclusions. He finds evidence of racial bias in all the interactions that do not result in shootings or killings. Generally one of the racially biased interactions would precede the shooting or killing, which suggests to me that the shooting or killing would be an extension of the biased interaction. There is probably some way to control for this, but that is beyond my capabilities. That this is not considered or mentioned means, to me, that his study is fundamentally flawed, which would render his conclusions fatally flawed.

  3. The takeaway from Fryer shouldn't be that there is no bias at all. These estimates are always noisy and confounded in various ways, and it would be surprising if there wasn't at least a little bias of some kind. And we will never be able to measure it very precisely.

    But the Black Lives Matter movement seems to be based on the premise that the bias is *huge*, that it is the major factor driving shootings, that even innocent black people literally need to fear for their lives from police every day. The Fryer results make this idea much harder to swallow. As do simpler analyses that show that most of the differences in shootings are pretty well explained by underlying differences in violent crime rates.

    The news media is driven by anecdotes, and you can find a lot of crazy, tragic stuff in a nation of 320 million people. Of course these events are still terrible, and there are probably lots of things we could do to make them less common. BTW, many have the impression that this kind of thing never happens to white people, but that is wrong, as documented here:

    Overall, I am much more persuaded that there is a problem of racial bias in low-level harassment, at least in some jurisdictions, than I am that there is a widespread problem of racially biased use of lethal force.

    Peter Moskos is very good on this topic:

    1. I think the jury is still out on the shooting issue. Fryer's study is suggestive but far from decisive. And I agree that Moskos is an important voice on this.

      If Fryer is right, it could be for two very different reasons: lethal force is used appropriately on the whole regardless of suspect race, or lethal force is used way too often for both blacks and whites. Moskos and (recently) John McWhorter have been arguing the latter, I think quite convincingly.

  4. Is it not true that the Europile argument would only hold if one assumes the police to be racist by default and to have tampered with the ACTUAL dataset? Seems tautological.

    1. I don't think that's right... the Europile argument is based on bias in charges of evading/resisting arrest, not in tampering with the data set.

    2. Am I correct to understand that Europile is accusing the study of failing to control for variables/biases that 1.) cannot be controlled for since the data relies on police reports, and 2.) would only matter if one assumes the police are biased by default?

      Since the study aims to find out if the police are biased, isn't it rather spurious to dismiss the results by accusing it of not assuming the police are biased?

      I admit that the biases brought up by Europile could be relevant, but they are beyond the scope of the study, and a lot of work is required to validate them.

    3. Well, the study itself finds significant bias in the use of non-lethal force, so it's not a stretch to imagine that it spills over to false arrest. I think the point is that the study is nowhere near as decisive as early reporting seemed to indicate.

    4. This comment has been removed by the author.

    5. I can't edit your comment (I can only delete the whole thing). But I think you can also delete it if you like.

  5. No problem, reposting:

    I agree with your point, but that's on the media, and not the study itself.

    Most importantly, the results are surprising specifically because the bias in non-lethal force does not carry over to lethal force. It would've been self-defeating to assume the opposite before conducting the study.

    Police reports do lie, as in the case of Walter Scott. But until we have data, it's disingenuous to infer from scattered cases how often they do.

  6. This all relies on the very strong assumption that police follow the law and tell the truth.