Wednesday, December 14, 2016

Thomas Schelling, Methodological Subversive

Thomas Schelling died at the age of 95 yesterday.

At a time when economic theory was becoming virtually synonymous with applied mathematics, he managed to generate deep insights into a broad range of phenomena using only close observation, precise reasoning, and simple models that were easily described but had complex and surprising properties.

This much, I think, is widely appreciated. But what also characterized his work was a lack of concern with professional methodological norms. This allowed him to generate new knowledge with great freedom, and to make innovations in method that may end up being even more significant than his specific insights into economic and social life. 

Consider, for instance, his famous "checkerboard" model of self-forming neighborhoods, first introduced in a memorandum in 1969, with versions published in a 1971 article and in his 1978 book Micromotives and Macrobehavior. This model is simple enough to be described verbally in a couple of paragraphs, but has properties that are extremely difficult to deduce analytically. It is also among the very earliest agent-based computational models, reveals some limitations of the equilibrium approach in economic theory, and continues to guide empirical research on residential segregation.

Here's the model. There is a set of individuals partitioned into two groups; let's call them pennies and dimes. Each individual occupies a square on a checkerboard, and has preferences over the group composition of its neighborhood. The neighborhood here is composed of the (at most) eight adjacent squares. Each person is content to be in a minority in their neighborhood, as long as minority status is not too extreme. Specifically, each wants strictly more than one-third of their neighbors to belong to their own group. 

Initially suppose that there are 60 individuals, arrayed in a perfectly integrated pattern on the board, with the four corners unoccupied. Then each individual in a central location has exactly half their neighbors belonging to their own group, and is therefore satisfied. Those on the edges are in a slightly different situation, but even here each individual has a neighborhood in which at least two-fifths of residents are of their own type. So they too are satisfied.

Now suppose that we remove twenty individuals at random, and replace five of these, placing them in unoccupied locations, also at random. This perturbation will leave some individuals dissatisfied. Now choose any one of these unhappy folks, and move them to a location at which they would be content. Notice that this affects two types of other individuals: those who were previously neighbors of the party that moved, and those who now become neighbors. Some will be unaffected by the move, others may become happy as a result, and still others may become unhappy. 

As long as there are any unhappy people on the board, repeat the process just described: pick one at random, and move them to a spot where they are content. What does the board look like when nobody wants to move?
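For readers who want to experiment, here is a minimal simulation sketch of the process just described. The board size, the integrated starting pattern, the perturbation, and the more-than-one-third rule follow the verbal description above; the rule for choosing among satisfactory destination squares, the iteration cap, and the summary statistic at the end (the average share of like-type neighbors) are simplifications of my own, not Schelling's.

```python
import random

SIZE = 8          # 8x8 checkerboard
CORNERS = {(0, 0), (0, SIZE - 1), (SIZE - 1, 0), (SIZE - 1, SIZE - 1)}

def neighbors(cell):
    r, c = cell
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0) and 0 <= r + dr < SIZE and 0 <= c + dc < SIZE:
                yield (r + dr, c + dc)

def content(board, cell, agent_type):
    """True if strictly more than one-third of occupied neighboring squares match the agent's type."""
    occupied = [board[n] for n in neighbors(cell) if n in board]
    if not occupied:
        return True
    return sum(t == agent_type for t in occupied) / len(occupied) > 1 / 3

def run(seed):
    rng = random.Random(seed)
    all_cells = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    # Perfectly integrated start: types alternate, four corners empty (60 agents).
    board = {cell: sum(cell) % 2 for cell in all_cells if cell not in CORNERS}
    # Perturb: remove twenty agents at random, add five at random empty squares.
    for cell in rng.sample(sorted(board), 20):
        del board[cell]
    empty = [cell for cell in all_cells if cell not in board]
    for cell in rng.sample(empty, 5):
        board[cell] = rng.randint(0, 1)
    # Repeatedly move a randomly chosen discontented agent to a square where it would be content.
    for _ in range(10_000):   # cap on iterations; in practice the process settles long before this
        unhappy = [cell for cell in board if not content(board, cell, board[cell])]
        if not unhappy:
            break
        cell = rng.choice(unhappy)
        agent = board.pop(cell)
        empty = [c for c in all_cells if c not in board]
        targets = [e for e in empty if content(board, e, agent)]
        board[rng.choice(targets) if targets else cell] = agent
    # Summary: average share of like-type neighbors (0.5 under perfect integration).
    shares = []
    for cell, t in board.items():
        occupied = [board[n] for n in neighbors(cell) if n in board]
        if occupied:
            shares.append(sum(x == t for x in occupied) / len(occupied))
    return sum(shares) / len(shares)

print([round(run(seed), 2) for seed in range(5)])
```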

Schelling found that no matter how often this experiment was repeated, the result was a highly segregated residential pattern. Even though perfect integration is clearly a potential terminal state of the dynamic process just described, it appeared to be unreachable once the system had been perturbed. The assumed preferences are tolerant enough to be consistent with integration, but decentralized, uncoordinated choices by individuals appear to make integration fragile, and segregation extremely stable. Here's how Schelling summarized the insight:
People who have to choose between polarized extremes... will often choose in a way that reinforces the polarization. Doing so is no evidence that they prefer segregation, only that, if segregation exists and they have to choose between exclusive association, people elect like rather than unlike environments.
One can tune the parameters of the model (the population size and density, or the preferences over neighborhood composition) and see that this key insight is robust. And for reasons discussed in this essay, equilibrium reasoning alone cannot be used to uncover it.

A very different kind of contribution, but also one with important methodological implications, may be found in Schelling's 1960 classic The Strategy of Conflict. Here he considers the adaptive value of pretending to be irrational, in order to make threats or promises credible (emphasis added):
How can one commit himself in advance to an act that he would in fact prefer not to carry out in the event, in order that his commitment may deter the other party? One can of course bluff, to persuade the other falsely that the costs or damages to the threatener would be minor or negative. More interesting, the one making the threat may pretend that he himself erroneously believes his own costs to be small, and therefore would mistakenly go ahead and fulfill the threat. Or perhaps he can pretend a revenge motivation so strong as to overcome the prospect of self-damage; but this option is probably most readily available to the truly revengeful
Similarly, in bargaining situations, "the sophisticated negotiator may find it difficult to seem as obstinate as a truly obstinate man." And when faced with a threat, it may be profitable to be known to possess "genuine ignorance, obstinacy or simple disbelief, since it may be more convincing to the prospective threatener."

Starting with three classic papers in the same 1982 issue of the Journal of Economic Theory, a large literature in economics has dealt with the implications for rational behavior of interacting with parties who, with small likelihood, may not be rational. While this work has focused on characterizing rational responses to irrationality, Schelling's point speaks also to payoffs, and raises the possibility that departures from rationality may have adaptive value.

The methodological implications of this are profound, because the idea calls into question the normal justification for assuming that economic agents are in fact fully rational. Jack Hirshleifer explored the implications of this in a wonderful paper on the adaptive value of emotions, and Robert Frank wrote an entire book about the topic. But the idea is right there, hidden in plain sight, in Schelling's parenthetical comments.  

Finally, consider Schelling's burglar paradox, also described in The Strategy of Conflict:
If I go downstairs to investigate a noise at night, with a gun in my hand, and find myself face to face with a burglar who has a gun in his hand, there is a danger of an outcome that neither of us desires. Even if he prefers to just leave quietly, and I wish him to, there is danger that he may think I want to shoot, and shoot first. Worse, there is danger that he may think that I think he wants to shoot. Or he may think that I think he thinks I want to shoot. And so on. "Self-Defense" is ambiguous, when one is only trying to preclude being shot in self-defense.
Sandeep Baliga and Tomas Sjöström have shown exactly how such reciprocal fear can lead to a fatal unraveling, and explored the enormous consequences of allowing for pre-play communication in the form of cheap talk. And I have previously discussed the importance of this reasoning in accounting for variations in homicide rates across time and space, as well as the effects of Stand-your-Ground laws.

There are a handful of social scientists whose impact on my own work is so profound that I can't imagine what I'd be writing if I hadn't come across their work. Among them are Glenn Loury, Elinor Ostrom, and Thomas Schelling. I can think of at least five papers (on segregation, on variations in homicide across regions and communities, on reputation in bargaining, and on social norms) that flow directly from Schelling's thought.

It may surprise some to know that Glenn Loury's Du Bois lectures are dedicated to Schelling, but it makes perfect sense to me. Here's how Glenn explains his choice in the preface:
Shortly after arriving at Harvard in 1982 as a newly appointed Professor of Economics and of Afro-American Studies, I began to despair of the possibility that I could successfully integrate my love of economic science with my passion for thinking broadly and writing usefully about the issue of race in contemporary America. How, I wondered, could one do rigorous theoretical work in economics while remaining relevant to an issue that seems so fraught with political, cultural and psychological dimensions? Tom Schelling not only convinced me that this was possible; he took me by the hand and showed the way. The intellectual style reflected in this book developed under his tutelage. My first insights into the problem of "racial classification" emerged in lecture halls at Harvard's Kennedy School of Government, where, for several years in the 1980s, Tom and I co-taught a course we called "Public Policies in Divided Societies." Tom Schelling's creative and playful mind, his incredible breadth of interests, and his unparalleled mastery of strategic analysis opened up a new world of intellectual possibilities for me. I will always be grateful to him.
As, indeed, will I.

Wednesday, November 02, 2016

The Prediction Market Paradox

There’s a reason why campaigns are eager to publicize polls that show them ahead, while downplaying those in which they happen to be trailing. The perception that a candidate is losing can depress donations and volunteer effort, and lower morale and turnout among supporters. Hence polls that show tightening of a race are often advertised as indicators of momentum by the trailing party, and as outliers by the leader. The actual likelihood of victory is not independent of beliefs about this likelihood.

This gives rise to what might be called a prediction market paradox. If prices are widely believed to accurately reflect underlying probabilities, then there is an incentive for deep-pocketed partisans to try and manipulate these prices at the margin. But if the possibility of manipulation is salient and prices are treated with skepticism, then incentives to manipulate are weakened and prices will in fact be quite accurate reflections of underlying beliefs.

An interesting illustration of this phenomenon is the recent decision by PredictIt to post an electoral college map, updated by the minute, that aggregates probabilities derived from all its state-level markets. Here's what the map looks like at the moment:


There are seven categories: the safe, likely, and leaning states for each candidate and one toss-up category. States shift across categories as prediction market prices cross the relevant thresholds. This way, a broad range of probability assessments is mapped onto a much coarser set that is easy to visualize and process.

But this creates the possibility that small changes in price, of the order of one cent, can lead to reassignments across categories that generate a very different picture. The incentive to manipulate prices is amplified whenever such categorical switches are feasible.
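To make the mechanics concrete, here is a toy version of such a coarsening. Only the 75% boundary between lean and likely states is mentioned in this post; the remaining cutoffs and labels are hypothetical placeholders, not PredictIt's actual scheme.

```python
def category(p_clinton):
    """Map a market-implied probability of a Clinton win to a coarse category."""
    cutoffs = [(0.95, "Safe Clinton"), (0.75, "Likely Clinton"), (0.55, "Lean Clinton"),
               (0.45, "Toss-up"), (0.25, "Lean Trump"), (0.05, "Likely Trump")]
    for cutoff, label in cutoffs:
        if p_clinton >= cutoff:
            return label
    return "Safe Trump"

# A one-cent move across a boundary repaints the map:
print(category(0.74), "->", category(0.75))   # Lean Clinton -> Likely Clinton
```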

Of course these incentives apply to both sides of the market, with some traders wishing to shift states to the left while others are pushing to the right. As a result, an unusually large number of states may be expected to bounce back and forth across boundaries, and to remain within a narrow band of prices close to those selected (somewhat arbitrarily) by the exchange as thresholds.

This seems to be what we are seeing. The boundary between the lean and likely Clinton states is determined by a 75% threshold, and we see four states (Wisconsin, Michigan, Colorado, and Pennsylvania) all within a point or two of this. Here are those above the threshold:


And those below:


New Hampshire is not far from the boundary either. 

All this could be just coincidence, but if one looks at probabilistic forecasts from other sources, there is no such pattern. The New York Times conveniently collects six probabilistic forecasts including its own, with the current picture looking like this:


These forecasts (from the Times, FiveThirtyEight, Huffington Post, Predictwise, Princeton Election Consortium and Daily Kos respectively) don't appear to be clustered around the PredictIt thresholds at all.

Still, the evidence is anecdotal at best, and a proper analysis would have to look for a discontinuity in prices around the time that the map was created, with a clustering of prices around boundary points that could not be accounted for by random chance alone. 

Meanwhile, some caution is probably warranted in interpreting prediction market data. This is a case in which the ease of visualization, aggregation and dissemination of data can have an impact on the underlying measurements themselves, and indeed on the objective probabilities that the measures are intended to reflect.

Friday, September 23, 2016

Thine Every Flaw

There’s a verse in America the Beautiful that I absolutely adore; it represents for me the very best traditions of my adopted country:
America! America!
God mend thine ev’ry flaw,
Confirm thy soul in self-control,
Thy liberty in law.
I’ve been thinking about these words a lot over the past year or so, as the election season has revealed just how divided and how lacking in common purpose we are as a nation.

It's glaringly obvious that international trade, migration, and technological progress have brought enormous benefits to many of us. Our handheld devices are more powerful than the computers that launched our first satellites into orbit. Our system of higher education remains a magnet for eager students from every corner of the world, in part because we have attracted and retained the finest research talent. We are on the verge of a revolution in transportation and urban form as driverless cars make their presence felt. Our cultural products—movies and music among them—continue to attract strong global demand. And our Olympic medal winners encompass many different identities, religions, and countries of origin.

But globalization and technological progress have also left in their wake economic devastation and social disintegration across large swathes of the country that were previously prosperous and stable. The kind of deprivation once confined to inner cities—and tolerated for decades by the rest of society—is now pervasive in once-thriving industrial areas. In his recent and acclaimed memoir, JD Vance laments the decline of Middletown, Ohio from a proud and bustling steel town to "a relic of American industrial glory," with abandoned shops and broken windows, derelict homes, druggies and dealers, and places to be avoided after dark.

Anne Case and Angus Deaton have reported a startling increase in midlife mortality among white Americans without a college degree, "largely accounted for by increasing death rates from drug and alcohol poisonings, suicide, and chronic liver diseases and cirrhosis." Stratification by sex reveals that this phenomenon has hit white working class women especially hard. Trends in criminal justice tell a similar story: the incarceration rate for white women has risen by a staggering 50% since 2000, while that for black women has fallen by more than 30%. Similar, but much less striking, trends are in evidence for males.

All this has led to what Dani Rodrik calls the politics of anger. In its American incarnation, this anger has lifted to the helm of a major political party a man who has apparent contempt for the greatest of our traditions: due process even for those accused of the most heinous crimes, the prohibition of cruel and unusual punishment, and freedom from discrimination on the basis of religion or race. He lacks the self-control for which the verse above pleads, and his appeal to liberty and law is opportunistic and entirely self-serving.

This has been too much for some in his own party to stomach. Meg Whitman, a Republican candidate for Governor of California as recently as 2010, has been actively campaigning for Hillary Clinton. And if unconfirmed reports are to be believed, former president George H.W. Bush intends to vote for her too.

But even if we manage to dodge this bullet in November, the conditions that have fueled Trump's rise will remain in place, and the anger will intensify rather than abate. Something has got to be done to prevent our social fabric from fraying further. But what?

Perhaps protectionist and exclusionary policies can provide some measure of short term relief, but much of the dislocation that results from globalization is also a consequence of technological progress, and giving up on the latter is a recipe for economic suicide. Targeted interventions that support retraining and transition to growing sectors of the economy have to be part of the solution, but these are piecemeal efforts with varying effectiveness and the potential for bureaucratic mismanagement.

An alternative approach is to target inequality and poverty directly, through cash transfer schemes such as a universal basic income or a negative income tax. But payments such as these are not contingent on the performance of the economy as a whole, and therefore provide no incentives for people to support policies that are beneficial in the aggregate but impose costs on them as individuals.

What we need is a distributive mechanism that allows all to benefit when the country benefits. Debraj Ray has recently proposed something along these lines, a universal basic share. This is simply a share of nominal GDP, the value of which will ebb and flow with the nation's aggregate income. Aside from some obvious advantages relative to a basic income, such as the absence of any need for indexation, this would give all citizens a stake in the prosperity of the country as a whole.
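To get a rough sense of the magnitudes involved, here is a back-of-the-envelope calculation; the share parameter is a hypothetical choice of my own, and the population and GDP figures are illustrative round numbers, not part of Ray's proposal.

```python
gdp = 18e12          # illustrative: US nominal GDP is roughly $18 trillion
population = 320e6   # illustrative: roughly the US population
share_of_gdp = 0.10  # hypothetical aggregate share devoted to the program

payment_per_person = share_of_gdp * gdp / population
print(round(payment_per_person))   # about $5,600 per person per year, rising and falling with nominal GDP
```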

How might such a scheme be implemented? I have previously proposed the creation of individual accounts at the Federal Reserve for every citizen, including minors, which could be credited with the profits of open market operations. These profits are currently transferred to the Treasury. Any shortfall relative to the basic income share would then have to be made up by transfers from the Treasury to the Fed. One considerable benefit of such accounts is that they would do away with the need for deposit insurance, and would remove at a stroke the implicit subsidy that such insurance provides for proprietary trading at commercial banks. 

Policies of this kind already exist. For instance, the Alaska Permanent Fund collects and invests a portion of the revenue from mineral leases, and periodically distributes dividends to all qualified residents of the state.

The hope is that an initiative such as this can distribute more evenly the benefits from policies that raise aggregate incomes, whether through trade, migration, or technological progress. This ought to mitigate the political obstacles to the implementation of such policies. And perhaps the sense of common ownership will help bridge some of the deep divisions that have become so salient during this electoral season.

Through his rhetoric, Donald Trump has emboldened and empowered some of the most virulently racist and anti-Semitic elements in our society. Just take a look, for instance, at the messages received on Twitter by the political theorist Danielle Allen, in response to her concerns about a Trump nomination. They are disheartening in the extreme.

But Trump has the support of about 40% of registered voters, which in my estimation is about 88 million people or 36% of the adult population. While many of them may hold views on some matters that are immensely distasteful and deeply hurtful to others, I think that JD Vance is right to point out that it is "difficult in the abstract to appreciate that those with morally objectionable viewpoints can still be good people." 

I have been an American for just six years, and it is far too soon for me to write off so substantial a fraction of my fellow citizens. Call it the naive optimism of the newly naturalized if you like, but I really do think that we can get past this. With or without divine intervention, we can mend our individual and collective flaws.

Thursday, July 21, 2016

A Fallacy of Composition

Peter Moskos is a sociologist by training, a professor at John Jay College of Criminal Justice, and a former Baltimore City police officer. In responding to the shooting of Philando Castile, he had this to say:
Honestly, in this shooting, with this cop, in this locale, I don't think there's a chance in hell Castile would have been shot had he been white. 
Nor did he think this was an entirely isolated incident; it reminded him of the (non-fatal) shooting of Levar Jones by Sean Groubert at a traffic stop in South Carolina. I had exactly the same reaction when I saw the Castile video, as did others. Even the Governor of Minnesota conceded that the shooting "probably would not have happened if he were white."

And yet, Moskos was unsurprised by Roland Fryer's recent claims of an absence of racial bias in police shootings:
I was not surprised by Fryer's conclusions... if one wishes to reduce police-involved shootings... there are good liberal reasons to de-emphasize the significance of race in policing.

Jonathan Ayers, Andrew Thomas, Diaz Zerifino, James Boyd, Bobby Canipe, Dylan Noble, Dillon Taylor, Michael Parker, Loren Simpson, Dion Damen, James Scott, Brandon Stanley, Daniel Shaver, and Gil Collar were all killed by police in questionable to bad circumstances... What they have in common is none were black and very few people seemed to know or care when they were killed. 
Moskos is not arguing here that the police can do no wrong; he is arguing instead that in the aggregate, whites and blacks are about equally likely to be victims of bad shootings. 

How can these two views be reconciled? If there is bias in individual incidents, ought it not to show up in aggregate data? Doesn't the congruence between the racial composition of arrestees nationwide and the racial composition of victims of police killings indicate an absence of bias, as Sendhil Mullainathan claimed a few months ago?

I have argued previously that it does not, because of systematic differences in the qualitative nature of encounters. If police initiate more encounters with blacks that are not objectively threatening (but may in some cases be subjectively perceived to be threatening), then parity in killings per encounter can indicate the presence rather than absence of bias. As Andrew Gelman put it at the time, it's all about the denominator.

But Moskos offers another, quite different reason why bias in individual incidents might not be detected in aggregate data: large regional variations in the use of lethal force. 

To see the argument, consider a simple example of two cities that I'll call Eastville and Westchester. In each of the cities there are 500 police-citizen encounters annually, but the racial composition differs: 40% of Eastville encounters and 20% of Westchester encounters involve blacks. There are also large regional differences in the use of lethal force: in Eastville 1% of encounters result in a police killing while the corresponding percentage in Westchester is 5%. That's a total of 30 killings, 5 in one city and 25 in the other.

Now suppose that there is racial bias in police use of lethal force in both cities. In Eastville, 60% of those killed are black (instead of the 40% we would see in the absence of bias). And in Westchester the corresponding proportion is 24% (instead of the no-bias benchmark of 20%). Then we would see 3 blacks killed in one city and 6 in the other. That's a total of 9 black victims out of 30. The black share of those killed is 30%, which is precisely the black share of total encounters. Looking at the aggregate data, we see no bias. And yet, by construction, the rate of killing per encounter reflects bias in both cities. 
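The arithmetic can be checked directly; here is a short sketch using the numbers from the example above.

```python
cities = {
    # encounters, black share of encounters, killings per encounter, black share of those killed
    "Eastville":   dict(encounters=500, black_enc=0.40, kill_rate=0.01, black_killed=0.60),
    "Westchester": dict(encounters=500, black_enc=0.20, kill_rate=0.05, black_killed=0.24),
}

total_killed = sum(c["encounters"] * c["kill_rate"] for c in cities.values())                      # 5 + 25 = 30
black_killed = sum(c["encounters"] * c["kill_rate"] * c["black_killed"] for c in cities.values())  # 3 + 6 = 9
total_enc = sum(c["encounters"] for c in cities.values())                                          # 1000
black_enc = sum(c["encounters"] * c["black_enc"] for c in cities.values())                         # 300

print(black_killed / total_killed)  # 0.30: black share of those killed
print(black_enc / total_enc)        # 0.30: black share of encounters, so no bias is visible in the aggregate
```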

This is just a simple example to make a logical point. Does it have empirical relevance? Are regional variations in killings large enough to have such an effect? Here is Moskos again:
Last year in California, police shot and killed 188 people. That's a rate of 4.8 per million. New York, Michigan, and Pennsylvania collectively have 3.4 million more people than California (and 3.85 million more African Americans). In these three states, police shot and killed... 53 people. That's a rate of 1.2 per million. That's a big difference.

Were police in California able to lower their rate of lethal force to the level of New York, Michigan, and Pennsylvania... 139 fewer people would be killed by police. And this is just in California... If we could bring the national rate of people shot and killed by police (3 per million) down to the level found in, say, New York City... we'd reduce the total number of people killed by police 77 percent, from 990 to 231!
This is a staggeringly large effect. 

Additional evidence for large regional variations comes from a recent report by the Center for Policing Equity. The analysis there is based on data provided voluntarily by a dozen (unnamed) departments. Take a close look at Table 6 in that document, which reports use of force rates per thousand arrests. The medians for lethal force are 0.29 and 0.18 for blacks and whites respectively, but the largest recorded rates are much higher: 1.35 for blacks and 3.91 for whites. There is at least one law enforcement agency that is killing whites at a rate more than 20 times greater than that of the median agency.

On the reasons for these disparities, one can only speculate:
I really don't know what some departments and states are doing right and others wrong. But it's hard for me to believe that the residents of California are so much more violent and threatening to cops than the good people of New York or Pennsylvania. I suspect lower rates of lethal force has a lot to do with recruitment, training, verbal skills, deescalation techniques, not policing alone, and more restrictive gun laws. 
Moskos expands on these points in a recent conversation with Glenn Loury.

All of this must be interpreted with caution, since the information we have available is so patchy and deficient. As I wrote in a recent opinion piece with Willemien Kets, there is a desperate need for better data, collected and distributed in a comprehensive and uniform manner. Without this we are just groping in the dark.

Thursday, July 14, 2016

On Arrest Filters and Empirical Inferences

I've been thinking a bit more about Roland Fryer's working paper on police use of force, prompted by this thread by Europile and excellent posts by Michelle Phelps and Ezekiel Kweku.

The Europile thread contains a quick, precise, and insightful summary of the empirical exercise conducted by Fryer to look for racial bias in police shootings. There are two distinct pools of observations: an arrest pool and a shooting pool. The arrest pool is composed of "a random sample of police-civilian interactions from the Houston police department from arrests codes in which lethal force is more likely to be justified: attempted capital murder of a public safety officer, aggravated assault on a public safety officer, resisting arrest, evading arrest, and interfering in arrest." The shooting pool is a sample of interactions that resulted in the discharge of a firearm by an officer, also in Houston. 

Importantly, the latter pool is not a subset of the former, or even a subset of the set of arrests from which the former pool is drawn. Put another way, had the interactions in the shooting pool been resolved without incident, many of them would never have made it into the arrest pool. Think of the Castile traffic stop: had this resulted in a traffic violation or a warning or nothing at all, it would not have been recorded in arrest data of this kind.

The analysis in the paper is based on a comparison between the two pools. The arrest pool is 58% black while the shooting pool is 52% black, which is the basis for Fryer's claim that blacks are less likely than whites to be shot in the raw data. He understands, of course, that there may be differences in behavioral and contextual factors that make the black subset of the arrest pool different from the white, and attempts to correct for this using regression analysis. He reports that doing so "does not significantly alter the raw racial differences."

This analysis is useful, as far as it goes. But does this really imply that the video evidence that has animated the Black Lives Matter movement is highly selective and deeply misleading, as initial reports on the paper suggested?

Not at all. The protests are about the killing of innocents, not about the treatment of those whose actions would legitimately plant them in the serious arrest pool. What Fryer's paper suggests (if one takes the incident categorization by police at face value) is that at least in Houston, those who would assault or attempt to kill a public safety officer are treated in much the same way, regardless of race. 

But think of the cases that animate the protest movement, for instance the list of eleven compiled here. Families of six of the eleven have already received large settlements (without admission of fault). Six led to civil rights investigations by the Justice Department. With one or two possible exceptions, it doesn't appear to me that these interactions would have made it past Fryer's arrest filter had they been handled more professionally.

The point is this: if there is little or no racial bias in the way police handle genuinely dangerous suspects, but there is bias that leads some mundane interactions to turn potentially deadly, then the kind of analysis conducted by Fryer would not be helpful in detecting it. Which in turn means that the breathless manner in which the paper was initially reported was really quite irresponsible. 

For this the author bears some responsibility, having inserted the following into his discussion of the Houston findings:
Given the stream of video "evidence", which many take to be indicative of structural racism in police departments across America, the ensuing and understandable outrage in black communities across America, and the results from our previous analysis of non-lethal uses of force, the results displayed in Table 5 are startling... Blacks are 23.8 percent less likely to be shot by police, relative to whites.
His claim that this was "the most surprising result of my career" was an invitation to misunderstand and misreport the findings, which are important but clearly limited in relevance and scope.

---

Update. If you follow the links at the start of this post, you'll see a case made that Fryer's own findings of bias in the use of non-lethal force suggest that the composition of the arrest pool will be altered by bias in the charging of innocents for resisting or evading arrest.

It occurred to me that the same data used to examine use of non-lethal force (from the citizen's perspective) could also be used to get an estimate of this effect. This is the Bureau of Justice Statistics Police-Public Contact Survey. If anyone has already done this, please let me know; I'd be interested to see the findings.

Monday, July 11, 2016

Police Use of Force: Notes on a Study

A new empirical analysis of police use of force by Harvard economist Roland Fryer is attracting national attention. The paper deals with both lethal and non-lethal force, using a variety of different data sets, some public and some painstakingly assembled by the author and his team. Given the harrowing events of the past week, it's likely that his results on shootings will attract the most attention, but it's worth carefully considering both sets of findings.

Fryer provides evidence of significant racial disparities in the experience of non-lethal force at the hands of police, even in data that relies on self-reports by officers. Using official statistics from New York City’s Stop, Question and Frisk program, he finds that blacks and Latinos are more likely to be held, pushed, cuffed, sprayed or struck than whites who are stopped. This remains the case even after controlling for a broad range of demographic, behavioral, and environmental characteristics. And using data from a nationally representative sample of civilians, which does not rely on officer accounts, he finds evidence of even larger disparities in treatment.

But Fryer also reports an absence of racial bias in police shootings for a select group of jurisdictions. He recognizes that a proper analysis of police bias in the use of lethal force requires data not only on those incidents in which shootings occurred, but also those in which suspects were successfully pacified and disarmed. Data of this kind is extremely hard to come by, but he has managed to obtain incident reports on arrests in Houston that can be used for this purpose. 

The focus is on arrest categories that are more likely to involve incidents resulting in justified use of lethal force. It turns out that in this arrest data 58% of the population is black, while in the shooting data the corresponding share is 52%. This immediately implies that in the absence of controls for other features of the interaction, blacks in the arrest population are less likely to be shot than whites. He finds that controlling for other features of the interaction "does not significantly alter the raw racial differences." Here is how Fryer characterizes these findings:
Given the stream of video "evidence", which many take to be indicative of structural racism in police departments across America, the ensuing and understandable outrage in black communities across America, and the results from our previous analysis of non-lethal uses of force, the results displayed in Table 5 are startling... Blacks are 23.8 percent less likely to be shot by police, relative to whites.
He describes this as "the most surprising result of my career."

While it is entirely possible that the Houston Police Department doesn't exhibit systematic racial bias in the use of lethal force, I'm not sure such an emphatic conclusion is warranted. A close look at the arrest data (Table 1D) alongside the shooting data (Table 1C, column 2) reveals a number of puzzles that should be a cause for concern. In the arrest data only 5% of suspects were armed, and yet 56% of suspects "attacked or drew weapon." This would suggest that over half of suspects attacked without a weapon (firearms, knives and vehicles are all classified as weapons). Moreover, there are large differences across groups in behavior: two-thirds of whites and one-half of blacks attacked, a difference that is statistically significant (the reported p-value is 0.006).  

What this means is that the pool of black arrestees and the pool of white arrestees are systematically different, at least as far as behavior is concerned. So the raw data comparison described as startling in the quote above is not really valid. (I made a similar point in response to a piece by Sendhil Mullainathan a few months ago). Still, Fryer controls for these differences in behavioral and contextual characteristics and finds that the basic picture doesn't change. This has to be taken seriously. The key question, to my mind, is whether these controls are adequate. 

I personally would be more convinced if the arrestee pool looked more like the shooting victim pool. For instance, 18% of arrestees, but only 4% of shooting victims are female. I suspect that many of the interactions in the arrestee pool are not threatening, even from the subjective perspective of the officers involved. And others are so obviously threatening---for instance those involving suicide-by-cop---that no discretion or judgement is really necessary. Pruning these from the data might give us a clearer picture of bias in the use of discretionary lethal force. 

Despite these concerns, I think that there is a case to be made that there is no systematic bias against blacks in the lethal use of force within the Houston Police Department. What one ought not to conclude, however, is that this applies nationally. The analysis of other jurisdictions considered in the paper is restricted to encounters in which shootings actually occurred, and cannot therefore be used to answer the same kinds of questions that the Houston data allows. 

One last point about shootings: I'm not sure why there are quotation marks around the word "evidence" in the above quote. Video evidence, for all its flaws, is still very powerful evidence. It was video evidence that led to the indictment of Michael Slager on murder charges, and the conviction of Sean Groubert for assault and battery. It is selective and cannot establish the presence of racial bias in individual cases, but surely it can't be dismissed out of hand.

Finally, consider Fryer's analysis of non-lethal force, which is consistent with earlier findings. Aside from being fundamentally unjust, disparities in the use of non-lethal force have some really important implications for crime rates. The harassment of entire groups based on racial or ethnic identity is a major obstacle to witness cooperation in serious cases, including homicide. In fact, given the importance of corroboration, a belief that other witnesses will not step forward can be self-fulfilling.

With witnesses routinely unwilling to come forward in some neighborhoods, people can be killed with near impunity. And this significantly increases the incentives to kill preemptively, in a climate of reciprocal fear. Low clearance rates for homicide are directly responsible for high rates of killing, and both of these are held in place by distrust of the criminal justice system by potential witnesses. The excessive and discriminatory use of non-lethal force by police thus ends up having indirect lethal effects.

Thursday, July 07, 2016

Deadly Stereotypes

This video is hard to watch but important to think about and learn from:


Here's what appears to have happened. At around 9pm on July 6, Philando Castile was stopped for a broken taillight while driving in Falcon Heights, Minnesota. He was accompanied by his girlfriend, Lavisha Reynolds, and her young daughter. On being asked for his license and registration, Castile informed the officer that he had a firearm in the vehicle, and a concealed carry permit. He then reached for his wallet and was fatally shot. The video above captures the aftermath of the shooting, and was streamed live to a Facebook account by Reynolds.

The incident immediately brought to mind the shooting of Levar Jones by Sean Groubert in September 2014, which was captured on the officer's dashcam video. Again, there was a traffic stop, a request for documents, and multiple shots fired as Jones reached for his wallet:


Jones was hit but survived the shooting, and Groubert would later plead guilty to assault and battery.

What ties these incidents together is that they seem to have been motivated primarily by fear rather than anger or malice. Moreover, this fear turned out to have been unwarranted: neither Jones nor Castile posed an objective threat to the respective officers. The same was true of Amadou Diallo back in 1999, and in the more recent cases of Tamir Rice and John Crawford.

Whether or not the fear was reasonable under the individual circumstances of each case is harder to ascertain, and there is usually enough doubt to preclude criminal prosecution. Nevertheless, there are rare instances in which the unreasonableness of the fear is recognized: Groubert's employment with the South Carolina Department of Public Safety was terminated on the explicit grounds that he "reacted to a perceived threat where there was none."

A question of great moral and social importance is whether or not such fear is driven, in part, by exaggerated stereotypes of black male violence held by some subset of officers. The anecdotal evidence certainly suggests that such stereotypes matter on average, even if they are not implicated in every case. There is also some evidence of implicit bias from video game simulations.

Further evidence can be found in a dataset assembled by The Guardian. According to this source, there were a total of 1,145 police killings in 2015 alone, about half of which involved suspects armed with a gun. A further 13% of those killed were armed with a knife. There is no question, therefore, that police officers often face armed and dangerous suspects. However, 18% of whites killed by police in 2015 were unarmed while 52% had a gun; the corresponding figures for blacks were 25% and 46%. This suggests that within the set of encounters that result in police killings, those involving black suspects are less objectively threatening to the officers involved. One possible explanation is that any given encounter is more likely to be perceived by the officer as threatening when the suspect happens to be black.

In the Guardian data, slightly more than half of those killed by police were white, 27% were black, and 17% Latino. The proportion of those killed who were black is roughly the same as the proportion of total arrestees who are black, which has led some to argue that "removing police racial bias will have little effect on the killing rate." But this claim depends on the questionable assumption that encounters involving black citizens are as likely to be objectively threatening to officers as encounters with white citizens. As I have argued previously, there are reasons to believe that they are not.

The health of our society depends on an effective and trusted criminal justice system. In fact, the system cannot be effective if it isn't trusted. Distrust makes witnesses to crimes unwilling to come forward and depresses clearance rates. This allows serious crimes, including homicide, to be committed with impunity. Fear of homicide victimization raises incentives for preemptive killing, resulting in epidemics of violence. At the heart of it all are stereotypes, affecting interactions between victims and offenders, parties to disputes, prosecutors and witnesses, and officers and suspects. And the very same stereotypes also affect the urgency and concern with which the general public views mass incarceration.

What can be done? The screening and training of officers has got to take into account the possibility that stereotypes can be deadly. Psychologists have found that exposure to counterstereotypical exemplars can reduce implicit bias, and residency requirements can serve as a screening device. Finally, the construction of a complete and consistent national database of incidents remains imperative. Public action requires broad engagement with the issue and some agreement on the nature of the problem, and this will not be possible while arguments continue to rely on anecdotal and indirect evidence. Such evidence is too quickly dismissed by skeptics and too easily filtered by stereotypes, no matter how shocking and heartbreaking and deeply persuasive a sympathetic observer finds it to be.

Sunday, April 10, 2016

Fee-Structure Distortions in Prediction Markets

Since the launch of the pioneering Iowa Electronic Markets almost thirty years ago, prediction markets have grown to become a familiar fixture in the forecasting landscape. Among the most recent entrants is PredictIt, which has been operating for about a year under a no-action letter from the CFTC.

Both IEM and PredictIt offer contracts structured as binary options: if the referenced event occurs, the buyer of the contract gets a fixed payment at the expense of the seller, and otherwise gets nothing. The price of the contract (relative to the winning payment) may then be interpreted as a probability: an assessment by the "market" of the likelihood that the event will occur. These probabilities can be calibrated against actual outcomes over multiple events, and compared with survey- and model-based forecasts. Comparisons of this kind have generally found the forecasting performance of markets to be superior on average to those based on more traditional methods.

But interpreting prices as probabilities requires, at a minimum, that the set of prices referencing mutually exclusive outcomes sum to at most one. This condition is routinely violated on PredictIt. For instance, in the market for the presidential election winner by party, we currently have:


Based on the prices at last trade, there is an absurd 108% likelihood that someone or other will be elected president. Furthermore, the price of betting against all three listed outcomes (by buying the corresponding no contracts) is $1.96, even though the payout from this bundle is sure to be $2.00. Since these contracts are margin-linked (the exchange only requires a trader to post his or her worst-case loss) the cost of buying this bundle would be precisely zero in the absence of fees, and this would be as pure an opportunity for arbitrage as one is likely to find.

On IEM, or the now defunct Intrade, such a pattern of pricing would never be observed except perhaps for an instant. The discrepancy would be spotted by an algorithm and trades executed until the opportunity had been fully exploited. Profits would be small on any given trade, but would add up quickly: the most active account on Intrade during the last presidential election cycle traded close to four million contracts for a profit of $62,000 with minimal risk and effort. This trader had a median holding period of zero milliseconds. That is, the trader typically sold multiple candidate contracts simultaneously (with the trades having identical timestamps) in a manner that could not possibly have been done manually.

Why don't we see this in PredictIt? The simple answer is the fee structure. Whenever a position is closed at a profit the exchange takes 10% of the gains; losing trades don't incur fees. Taking account of this fee structure, the worst-case outcome for a trader betting against all three outcomes in the example above would be a win by someone other than a major party nominee. In this case the trader would lose $0.95 and gain $0.99, incurring fees on the latter of around ten cents. The result would be a net loss rather than a gain, and hence no opportunity for arbitrage. Prices could remain at these levels indefinitely.

Still, algorithmic arbitrage can prevent prices from getting too far out of line with meaningful probabilities. The extent to which this happens depends on whether the events in question include some that are considered highly unlikely. In a market with only two possibilities (such as that referencing confirmation of Merrick Garland) price distortion will be lowest if both outcomes are considered equally likely. For instance, if the prices of the two contracts were each 53, betting against both would cost 94, and fees would be a shade above 5 no matter what happens. These prices could not be sustained, so the distortion would be at most 5%.   

But in the same market, prices of 99 and 10 for the two outcomes could be sustained, for a distortion of 9%. The cost of betting against both would be 91 but if the less likely outcome occurs, the fee would wipe out all gains. Hence no opportunity for arbitrage, and no pressure on prices to change. 
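The fee arithmetic in these examples can be packaged into a small check: buy the no contract on every outcome and compute the worst-case net profit under the 10% fee on per-contract gains. This is a sketch of my own, and it assumes (as the examples above do) that each no contract is priced at a dollar minus the corresponding yes price.

```python
FEE = 0.10   # PredictIt takes 10% of the gain on each winning position; losing positions pay no fee

def worst_case_profit(yes_prices):
    """Net profit, in the worst-case outcome, from buying every 'no' contract at (1 - yes price)."""
    no_prices = [1.0 - p for p in yes_prices]
    results = []
    for winner in range(len(yes_prices)):              # suppose outcome `winner` occurs
        profit = 0.0
        for i, cost in enumerate(no_prices):
            if i == winner:
                profit -= cost                          # the 'no' on the realized outcome expires worthless
            else:
                profit += (1.0 - cost) * (1.0 - FEE)    # a winning 'no' pays $1; the fee is charged on the gain
        results.append(profit)
    return min(results)

print(worst_case_profit([0.53, 0.53]))   # ~ +0.007: guaranteed profit, so prices of 53 and 53 cannot persist
print(worst_case_profit([0.99, 0.10]))   # ~ -0.009: the fee wipes out the gain, so a 9% distortion can persist
```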

Given that PredictIt is operating as an experimental research facility with the purpose of generating useful data for academic research, this situation is unfortunate. It would be easy for the exchange to apply fees only to net profits in a given market, after taking account of all losses and gains, as suggested here. This does not require any change in the manner in which margin is calculated at contract purchase, only a refund once the market closes. If this is done, prices should snap into line and begin to represent meaningful probabilities. The decline in revenue would be partially offset by increased participation. And the transition itself would generate interesting data for researchers, consistent with the stated mission of the enterprise.

Saturday, March 19, 2016

Does Market Microstructure Matter?

The Securities and Exchange Commission has decided to delay for a second time ruling on the application by IEX to register as a national securities exchange. This time they did so without seeking or receiving permission from the applicant, on the grounds that a decision requires clarification of their own order protection rule. Accordingly, they have posted a notice of proposed interpretation and invited the general public to submit comment letters.
 
The key passage in the notice is the following:
Specifically, the Commission preliminarily believes that, in the current market, delays of less than a millisecond in quotation response times may be at a de minimis level that would not impair a market participant’s ability to access a quote, consistent with the goals of Rule 611 and because such delays are within the geographic and technological latencies experienced by market participants today... permitting the quotations of trading centers with very small response time delays, such as those proposed by IEX, to be treated as automated quotations, and thereby benefit from trade-through protection under Rule 611, could encourage innovative ways to address market structure issues.

Accordingly, the Commission today is proposing to interpret “immediate” when determining whether a trading center maintains an “automated quotation” for purposes of Rule 611 of Regulation NMS to include response time delays at trading centers that are de minimis, whether intentional or not.
If this proposed interpretation is sustained, it seems to me that the application would have to be approved. But perhaps I'm not being cynical enough. There will certainly be a flurry of comment letters from those whose current business models are threatened by the entry of IEX, and it's possible that the delay is intended to provide cover for a change in interpretation on the basis of which the application will eventually be rejected.

But one thing I find encouraging about the notice is that it seems to find persuasive two excellent comment letters by RT Leuchtkafer (I flagged the second when it was submitted but had missed the first). Among the many points made in these letters is the following: if an intentional delay in allowing traders access to quotes is a violation of Regulation NMS, then the entire system of co-location services, differential access speeds, and proprietary data feeds would need to end. Here's the logic of the argument:
If deliberately slower access is an "intentional device that would delay the action taken with respect to a quotation," as the IEX Critics' reasoning certainly implies, the problem isn't just that all the major exchange groups use "delay coils" to equalize access within their data centers. The problem is that you have to pay to get into their data centers in the first place, and if you don't it sure looks like you are intentionally delayed compared to those who can and do pay.

It gets worse! Even within exchange data centers, exchanges charge fees depending on the speed of your connections. A 10gb connection is certainly delayed with an "intentional device" (you know, routers, switches and the like) relative to a much more expensive 40gb connection, especially when the faster connection is priced out of all proportion to its actual cost, especially when the public SIP feeds have average delays of 500 microseconds to one millisecond and the SEC's own statistics show that billions of quotes are stale before they are ever broadcast by the SIPs.

Where does this logic take us? I naturally started to wonder that if the IEX Critics are right, by their own reasoning the exchanges will have to dismantle their co-location facilities and stop offering tiered high-speed network facilities. They are selling faster access to their markets, and if you don't pay, aren't you slower than you could be, aren't you intentionally delayed?
The critics might want to be careful what they wish for. 

I am on the record in support of the IEX application, and hope that the interpretation proposed in the notice is indeed sustained. The IEX design prevents trading based on information from an order that has been partially filled but not fully processed. It therefore moves us closer to a true national market system, in the sense that orders are processed in full in the sequence in which they make first contact with the market.

But does all this really matter, except to those whose interests are directly at stake? I believe it does, because the rules governing transactions in asset markets affect the relative profitability of different trading strategies, and this in turn has consequences for share price accuracy and volatility, the allocation of capital across competing uses, the costs of financial intermediation, and the returns to ordinary investors.

There is an extremely diverse set of participants in the secondary market for stocks, with significant differences in goals, investment horizons, and trading strategies. It is useful to group these into three broad categories: (a) long-term investors, who save during peak earning years and liquidate assets to finance consumption during retirement; (b) information traders, who seek to profit from deviations between prices and their private estimates of fundamental values; and (c) high-frequency traders, who combine a market-making function with arbitrage and short-term speculation based on rapid responses to incoming market data.

There is clearly a lot of overlap between these categories. For instance, actively managed mutual funds and some hedge funds belong to the second category but often manage money for long-term investors, pension funds, or university endowments.

The traditional market making function involves the placement of passive orders that provide liquidity to the rest of the market. Such passive order placement is subject to adverse selection: if a posted offer to buy or sell is met by an information trader the market maker will suffer losses on average. In order for a market making strategy to be profitable, these losses have to be matched by gains elsewhere. Where do these gains come from?

In standard models of market-making, the bid-ask spread is determined by a balance between losses from transactions with information traders and gains from transactions against those with price-insensitive demands. But this is not the balance that exists in markets today. Instead, high-frequency traders combine passive liquidity provision with aggressive liquidity-taking strategies based on the near instantaneous receipt, processing, and reaction to market data. The posting of bids and offers is motivated less by profiting from the spread than by fishing for information, which can then be used to take and quickly reverse directional positions. The relative weights on passive liquidity provision and aggressive short-term speculation vary considerably across firms, but there is evidence that the most aggressive and profitable among these are able to effectively forecast price movements over very short horizons.
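The "standard models" referred to above can be made concrete with a small calculation in the spirit of Glosten and Milgrom (1985); this is textbook logic rather than anything spelled out in this post, and the parameter values are arbitrary. The asset is worth either v_high or v_low with equal probability; a fraction pi of incoming orders come from informed traders who buy only when the value is high and sell only when it is low, while the rest buy or sell at random.

```python
def competitive_quotes(v_low, v_high, pi):
    """Bid and ask at which a market maker breaks even against a mix of informed and noise traders."""
    p_buy = 0.5 * (pi + 0.5 * (1 - pi)) + 0.5 * (0.5 * (1 - pi))      # unconditional probability of a buy (= 0.5)
    p_high_given_buy = 0.5 * (pi + 0.5 * (1 - pi)) / p_buy            # Bayes: a buy order signals a high value
    ask = p_high_given_buy * v_high + (1 - p_high_given_buy) * v_low
    bid = (1 - p_high_given_buy) * v_high + p_high_given_buy * v_low  # symmetric reasoning for a sell order
    return bid, ask

bid, ask = competitive_quotes(v_low=99.0, v_high=101.0, pi=0.2)
print(bid, ask, ask - bid)   # spread = pi * (v_high - v_low): losses to informed flow recouped from the rest
```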

A transition to a truly national market system will affect the competitive balance between information traders and high-frequency traders. It is in the interests of the former to prevent information leakage so that they can build large positions with limited immediate price impact. It is in the interest of the latter to extract this information from market data and trade on it before it has been fully incorporated into prices. Other things equal, the ability to extract information from a partially filled order and trade ahead of it at other exchanges benefits high-frequency traders at the expense of information traders. A truly national market system would mitigate this advantage.

This means, of course, that high-frequency traders would be more vulnerable to adverse selection and would place a lower volume of passive orders to begin with. But the orders would be genuinely available, and not subject to widespread cancellation or poaching if one of them were to trade. Visible bid-ask spreads may widen but there would be no illusion of liquidity.

The shift in competitive balance between these trading strategies would have broader economic implications. The returns to investment in fundamental information would rise relative to the returns to investment in speed, which should result in greater share price accuracy.  Furthermore, there is a real possibility that the aggregate costs of financial intermediation would decline, as expenditures on co-location, rapid data processing and transmission, equipment, energy, and programming talent are scaled back. This would be a desirable outcome from the perspective of long-term investors. After all:
It is the iron law of the markets, the undefiable rules of arithmetic: Gross return in the market, less the costs of financial intermediation, equals the net return actually delivered to market participants.
Finally, extreme volatility events should arise less often. Algorithms making short-term price forecasts may predict well on average, but they will sometimes mistake a random fluctuation for a large order imbalance. Such false positives can give rise to a hot potato effect, of the kind that is believed to have been in play during the flash crash. Of course, such events can occur even in the absence of market fragmentation, and cannot be prevented entirely, but a transition to a true national market system should reduce their amplitude and frequency.
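As a rough illustration of the false-positive problem, the toy simulation below generates order flow that is pure noise and counts how often a simple imbalance threshold would nonetheless flag a large order imbalance over a short window. The window length, threshold, and coin-flip order flow are arbitrary assumptions chosen for illustration; no actual trading algorithm is being modeled.

# Toy illustration: how often does pure noise trip a short-horizon imbalance signal?
import random

random.seed(1)

WINDOW = 100       # orders per evaluation window
THRESHOLD = 0.2    # signal fires if |buys - sells| / WINDOW exceeds this
TRIALS = 10_000    # number of windows simulated

false_positives = 0
for _ in range(TRIALS):
    # each order is an independent coin flip: +1 for a buy, -1 for a sell
    imbalance = sum(random.choice((1, -1)) for _ in range(WINDOW))
    if abs(imbalance) / WINDOW > THRESHOLD:
        false_positives += 1

print(f"pure-noise windows flagged as order imbalances: {false_positives / TRIALS:.1%}")

Even a false-positive rate of a few percent, multiplied across thousands of windows and many algorithms reacting to one another, leaves room for occasional cascades.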

For these reasons and more, approval of the IEX application would be a modest but meaningful step in the right direction.

Monday, March 07, 2016

Systematic Biases in Prediction Market Forecasts

On Super Tuesday, and then again on March 5, there were systematic biases in prediction market forecasts. Specifically, Donald Trump lost four contests that he was predicted to win (Oklahoma, Alaska, Minnesota and Maine) and won no contests that he was predicted to lose (Texas and Kansas).

Another way to express this is as follows: if someone had bet that Trump would lose all eleven states on Super Tuesday at the prevailing prices, they would have secured a substantial positive return, approximately doubling their money, even though he actually won seven states. Each bundle of such bets, one for each state, would have cost about $2 and paid out $4 for a 100% return. On the other hand, if they had bet that Trump would win all eleven states they would have lost money, paying about $9 per bundle to get back $7. And there was enough liquidity available to scale up these bets quite substantially.
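For readers who want to check the arithmetic, here is a small Python sketch using only the aggregate figures quoted above; the individual state-level contract prices are not reproduced here, and each contract is treated as paying $1 if the outcome bet on occurs.

# Return on a bundle of state-by-state binary contracts, each paying $1 if the
# outcome bet on occurs; figures are the aggregate numbers quoted in the text.

def bundle_return(total_cost, contracts_that_paid):
    payout = float(contracts_that_paid)
    return (payout - total_cost) / total_cost

# Betting that Trump loses each of the 11 states: cost about $2, 4 contracts pay off.
print(f"bet against Trump everywhere: {bundle_return(2.0, 4):+.0%}")   # +100%

# Betting that Trump wins each of the 11 states: cost about $9, 7 contracts pay off.
print(f"bet on Trump everywhere: {bundle_return(9.0, 7):+.0%}")        # -22%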

The pattern repeated itself on March 5: betting on Trump losses across the board would have made quite a bit of money, while betting on him to win everything would have been a money-losing proposition. This was true even though he won two of four states, since betting on him to win was much costlier in the aggregate than betting on him to lose.

Yet another way to say this is that the markets were not terribly well-calibrated. But this ought to be a temporary phenomenon as wealth is transferred across accounts and traders update their beliefs after each outcome realization.

It's election day again tomorrow, which gives us an opportunity to see if such a correction has in fact occurred. Four states are in play on the Republican side, with Trump heavily favored to win Michigan and Mississippi:


If he fails to win either one of these states, it would suggest to me that the biases previously evident have not yet been eliminated.

As far as the other two contests are concerned, Cruz is favored in Idaho, with Rubio (marginally) favored in Hawaii:


How do these predictions compare with more traditional poll and model based forecasts? On the Republican side, all we have is a Michigan forecast from FiveThirtyEight:


This is in substantial agreement with the prediction markets, so the outcome will not help adjudicate between the two approaches. Similarly, on the Democratic side, there is negligible difference in forecasts: the prediction markets and FiveThirtyEight both agree that Clinton is heavily favored to win Michigan and Mississippi. 

One of the most striking facts that David Rothschild and I uncovered in our analysis of Intrade data from the 2012 election was that the overwhelming majority of traders bet in only one direction. They could be partitioned into Obama enthusiasts and Romney enthusiasts, changing their exposure over time in response to news, but never switching entirely from one side to the other. (These categories are based only on beliefs about the eventual outcome, as represented in bets placed, and need not correspond to political preferences.)

If this pattern holds for the contests currently underway, then poorly calibrated forecasts could be a consequence of over-representation in the market of Trump enthusiasts, or a willingness of such individuals to place larger bets. Even so, the bias should be self-correcting over time, as the wealth of those making consistently losing bets is gradually depleted.
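Here is a minimal sketch of that self-correction mechanism, under entirely hypothetical assumptions: two groups with fixed and divergent beliefs bet against each other round after round, the market price is the wealth-weighted average belief, and wealth flows toward the group whose beliefs better match realized outcomes, dragging the price in their direction. The true win probability, the beliefs, and the wealth-update rule below are stylized choices for illustration only.

# Stylized sketch of wealth-driven correction in a binary prediction market.
import random

random.seed(0)

TRUE_PROB = 0.55                                   # hypothetical true win frequency
belief = {"enthusiasts": 0.80, "skeptics": 0.45}   # fixed, divergent beliefs
wealth = {"enthusiasts": 1.0, "skeptics": 1.0}     # equal starting wealth

def market_price():
    total = sum(wealth.values())
    return sum(wealth[g] * belief[g] for g in wealth) / total

for round_number in range(1, 201):
    price = market_price()
    won = random.random() < TRUE_PROB
    # each group's wealth grows in proportion to how well its belief
    # anticipated the realized outcome, relative to the market price
    for g in wealth:
        wealth[g] *= (belief[g] / price) if won else ((1 - belief[g]) / (1 - price))
    if round_number in (1, 50, 200):
        share = wealth["enthusiasts"] / sum(wealth.values())
        print(f"after round {round_number}: price = {market_price():.3f}, enthusiast wealth share = {share:.2f}")

In this stylized dynamic the price drifts toward the beliefs of the better-calibrated group rather than all the way to the truth, which is one reason the entry of new, better-informed traders also matters.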

---

Update (March 9). The results are in and it's safe to say that there was no hint of bias in favor of Trump this time: he won Michigan and Mississippi handily, and Hawaii became the first state that he won while being predicted to lose. The big surprise, of course, was on the Democratic side, where Sanders prevailed over Clinton in Michigan. This outcome was given a likelihood of less than one in ten by both FiveThirtyEight and the prediction markets, and the race was even mistakenly called for Clinton at 9pm ET last night:


So we still have little to choose from when comparing markets to poll-based predictions, with Kansas being the only state called differently (correctly by markets and incorrectly by FiveThirtyEight). 

The other big news, of course, was the failure of the Rubio campaign to secure even a single delegate (as of this writing), while Kasich managed to get 17. Even before polls opened on March 5, I wrote:
I suspect that there is a non-negligible probability that Rubio may exit the race before Florida to avoid humiliation there, while Cruz and Kasich survive to the convention.
There's been a lot of chatter about this possibility over the past couple of days, and it seems increasingly likely to me. A Rubio exit before Florida could tip Ohio to Kasich and set up an interesting and unpredictable three-person race going forward. Ohio will also be critical for the continued viability of Sanders. It's a fascinating election cycle.

Saturday, March 05, 2016

Forecasting Elections

This wild and crazy election cycle is generating an enormous amount of data that social scientists will be pondering for years to come. We are learning about the beliefs, preferences, and loyalties of the American electorate, and possibly witnessing a political realignment of historic proportions. Several prominent Republicans have vowed not to support their nominee if it happens to be Trump, while a recent candidate for the Democratic nomination has declared a preference for Trump over his own party's likely nominee. Crossover voting will be rampant come November, but the flows will be in both directions and the outcome remains quite uncertain.

Among the issues that the emerging data will be called upon to address is the accuracy of prediction markets relative to more conventional poll and model based forecasts. Historically such markets have performed well, but they have also been subject to attempted manipulation, and this particular election cycle hasn't really followed historical norms in any case.

On Super Tuesday, the markets predicted that Trump would prevail in ten of the eleven states in play, with the only exception being a Cruz victory in his home state of Texas. This turned out to be quite poorly calibrated, in the sense that all errors were in a single direction: the misses were Oklahoma and Alaska (which went to Cruz) and Minnesota (where Rubio secured his first victory). But the forecasters at FiveThirtyEight also missed Oklahoma and were silent on the other two, so no easy comparison is possible.

Today we have primaries in a few more states, and another opportunity for a comparison. I'll focus on the Republican side, where voting will occur in Kansas, Kentucky, Louisiana and Maine. Markets are currently predicting a Cruz victory in Kansas (though the odds are not overwhelming):


In contrast, FiveThirtyEight gives the edge to Trump, though again it's a close call:


The only other state for which we have predictions from both sources is Louisiana, but here there is negligible disagreement, with Trump heavily favored to win. Trump is also favored by markets to take Kentucky and Maine, for which we have limited polling data and no predictions from FiveThirtyEight. 

So one thing to keep an eye out for is whether Trump wins fewer than three of the four states. If so, the pattern of inflated odds on Super Tuesday will have repeated itself, and one might be witnessing a systematically biased market that has not yet been corrected by new entrants attracted by the profit opportunity. 

But if the market turns out to be well-calibrated, then it's hard to see how Rubio could possibly secure the nomination. Here's the Florida forecast as of now:


The odds of a Trump victory in Michigan are even higher, while Kasich is slightly favored in Ohio. Plenty of things can change over the next couple of weeks, but based on the current snapshot I suspect that there is a non-negligible probability that Rubio may exit the race before Florida to avoid humiliation there, while Cruz and Kasich survive to the convention. This is obviously not the conventional wisdom in the media, where Rubio continues to be perceived as the establishment favorite. But unless things change in a hurry, I just don't see how this narrative can be sustained.

---

Update (March 5). The results are in, with Cruz taking Kansas and Maine and Trump holding on to Kentucky and Louisiana. The only missed call by the prediction markets was therefore Maine. Still, the significant margins of victory for Cruz in Kansas and Maine suggest to me that traders in the aggregate continue to have somewhat inflated expectations regarding Trump's prospects. And I'm even more confident than I was early this morning that Rubio faces a humbling and humiliating loss in his home state of Florida, though he may have no option now but to soldier on.