Tuesday, June 29, 2010

On Blogs and Economic Discourse

I was making my way back from a conference yesterday and completely missed the uproar over Kartik Athreya's provocative essay on economics blogs. Athreya argued, in effect, that most such blogging is done by ill-informed hacks who ought to be ignored while properly trained experts (such as himself) are left in peace to do the difficult work of making progress in the field. The original post has been taken down but (as a telling reminder that no public statement can subsequently be made private in this day and age) a copy may be viewed here.

The response from the accused was swift and brutal (see Thoma, DeLong, Sumner, Rowe, Cowen, Kling, Avent, Yglesias and Wilkinson for a sample). I don't want to pile on, and there's little I can add to what others have already said. But I'd like to take this opportunity to reiterate and expand upon a couple of points that I have made in previous posts about the rapidly changing role of blogs in economic discourse.

My view of the matter is almost diametrically opposed to that of Athreya: I consider these changes to be both irreversible and potentially very healthy. In a post commemorating the birthdays of two excellent economics blogs, I made this point as follows (see also Andrew Gelman's follow-up):
The community of academic economists is increasingly coming to be judged not simply by peer reviewers at journals or by carefully screened and selected cohorts of students, but by a global audience of curious individuals spanning multiple disciplines and specializations. Voices that have long been silenced in mainstream journals now insist on being heard on an equal footing. Arguments on blogs seem to be judged largely on their merits, independently of the professional stature of those making them. This has allowed economists in far-flung places with heavy teaching loads, or those who pursued non-academic career paths, to join debates. Even anonymous writers and autodidacts can wield considerable influence in this environment, and a number of genuinely interdisciplinary blogs have emerged...
This has got to be a healthy development. One might persuade a referee or seminar audience that a particular assumption is justified simply because there is a large literature that builds on it, or that tractability concerns preclude reasonable alternatives. But this broader audience is not so easy to convince. Persuading a multitude of informed, thoughtful, intelligent readers of the relevance and validity of one's arguments using words rather than formal models is a far more challenging task than persuading one's own students or peers. If one can separate the wheat from the chaff, the reasoned argument from the noise, this process should result in a more dynamic and robust discipline in the long run.
In fact, the refereeing process for blog posts is in some respects more rigorous than that for journal articles. Reports are numerous, non-anonymous, public, rapidly and efficiently produced, and collaboratively constructed. It is not obvious to me that this process of evaluation is any less legitimate than that for journal submissions, which rely on feedback from two or three anonymous referees who are themselves invested in the same techniques and research agenda as the author.

I suspect that within a decade, blogs will be a cornerstone of research in economics. Many original and creative contributions to the discipline will first be communicated to the profession (and the world at large) in the form of blog posts, since the medium allows for material of arbitrary length, depth and complexity. Ideas first expressed in this form will make their way (with suitable attribution) into reading lists, doctoral dissertations and more conventionally refereed academic publications. And blogs will come to play a central role in the process of recruitment, promotion and reward at major research universities. This genie is not going back into its bottle.

---

Update (6/30). Andrew Gelman follows up with a long and thoughtful post on the role of blogs in academic research across different fields:
Sethi points out that, compared to journal articles, blog entries can be subject to more effective criticism. Beyond his point (about a more diverse range of reviewers), blogging also has the benefit that the discussion can go back and forth. In contrast, the journal reviewing process is very slow, and once an article is published, it typically just sits there...

Can/should the blogosphere replace the journal-sphere in statistics? I dunno. At times I've been able to publish effective statistical reactions in blog form... or to use the blog as a sort of mini-journal to collect different viewpoints... And when it comes to pure ridicule... maybe blogging is actually more appropriate than formally writing a letter to the editor of a journal.

But I don't know if blogs are the best place for technical discussions. This is true in economics as much as in statistics, but the difference is that many people have argued (perhaps correctly) that econ is already too technical, hence the prominence of blog-based arguments is maybe a move in the right direction...

Statistics, though, is different... even the applied stuff that I do is pretty technical--algebra, calculus, differential equations, infinite series, and the like... Can this sort of highly-technical material be blogged? Maybe so. Igor Carron does it, and so does Cosma Shalizi--and both of them, in their technical discussions, clearly link the statistical material to larger conceptual questions in scientific inference and applied questions about the world. But this sort of blogging is really hard--much harder, I think, than whatever it takes for an economics professor with time on his or her hands to regularly churn out readable and informative blogs at varying lengths commenting on current events, economic policy, the theories of micro- and macro-economics, and all the rest...

On the other hand, the current system of scientific journals is, in many ways, a complete joke. The demand for referee reports of submitted articles is out of control, and I don't see Arxiv as a solution, as it has its own cultural biases. I agree with Sethi that some sort of online system has to be better, but I'm guessing that blogs will play more of a role in facilitating informal discussions rather than replacing the repositories of formal research. I could well be wrong here, though: all I have are my own experiences, I don't have any good general way of thinking about this sort of sociology-of-science issue.
One minor point of clarification: I did not say (or mean to imply) that blogs would replace journals as the primary repositories of academic research. My point was simply that blogs are fast becoming an integral part of the research infrastructure and that, looking ahead, many innovative ideas will find initial expression in this format before being subject to further development along more traditional lines.

Tuesday, June 22, 2010

Gamesmanship and Collective Reputation

I've often wondered why diving is so prevalent in football. Even if one manages to fool a referee occasionally, the act is captured on video for all to see and inevitably hurts the reputation of the player and his team. Quite apart from the resulting ridicule, there are also long term costs on the field. Referees are more likely to be suspicious when they see players with tarnished reputations tumbling like bowling pins with little apparent contact. Some legitimate fouls may not be called as a result, and there's always the possibility that a player may be cautioned or sent off for unsportsmanlike conduct. So the whole culture of diving, and the fact that it has been embraced so thoroughly by certain teams while being avoided and frowned upon by others, has always been a bit of a puzzle to me.
In a fascinating article, Andrea Tallarita provides some rationalization for this behavior. He explains that diving is a part of a broad range of calculated tactics that are used to get into an opponent's head, inducing frustration, loss of concentration and overreaction. Zidane's costly headbutt of Materazzi in the 2006 World Cup final is the most famous of many examples. Here's how Tallarita explains the approach: 
Perhaps nothing has been more influential in determining the popular perception of the Italian game than furbizia, the art of guile... The word ‘furbizia’ itself means guile, cunning or astuteness. It refers to a method which is often (and admittedly) rather sly, a not particularly by-the-book approach to the performative, tactical and psychological part of the game. Core to furbizia is that it is executed by means of stratagems which are available to all players on the pitch, not only to one team. What are these stratagems? Here are a few: tactical fouls, taking free kicks before the goalkeeper has finished positioning himself, time-wasting, physical or verbal provocation and all related psychological games, arguably even diving... Anyone can provoke an adversary, but it takes real guile (real furbizia) to find the weakest links in the other team’s psychology, then wear them out and bite them until something or someone gives in - all without ever breaking a single rule in the book of football. 
Viewed in this light, the prevalence of diving starts to make a bit more sense. Even if one doesn't win the immediate foul or penalty, the practice can unsettle an opponent and induce errors. And a reputation for diving can cause an opponent to avoid even minimal, routine contact. This is gamesmanship, pure and simple.
But if gamesmanship is so rewarding, why are some teams reluctant to embrace it? Why do the Spanish play such a clean version of the game and consider these tactics to be beneath them, while their closest neighbors, the Italians and Portuguese, have no such qualms? Here is Tallarita's explanation:
Ultimately, these differences come from two irreconcilable visions of the game. The Spanish style understands football as something like a fencing match, a rapid and meticulous art of noble origins where honour is the brand of valour. To the Italians, football is more like an ancient battle, a primal and inclement bronze-age scenario where survival rules over honour.
But this just raises the question: why are the visions of the game so different in nations that are geographically and culturally so close? I think that the answer (or at least part of it) lies in the fact that once a collective reputation has been established, it becomes individually rational for new entrants to the group to act in ways that preserve it. This mechanism was explored in a very interesting 1996 paper by Jean Tirole in which he explains why "new members of an organization may suffer from an original sin of their elders long after the latter are gone."
The reason why the past behavior of the group affects the incentives of current and future members is that past behavior is not perfectly observable at the level of the individual. Groups consist of overlapping cohorts, with older members mixed in with newer ones. Those older members who have behaved "badly" in the past and thus ruined their reputations have no incentive to behave "well" currently. But suspicion also falls on the newer members, who cannot be perfectly distinguished from the older ones. This suspicion alters incentives in such a manner as to make it self-fulfilling. Even if the entire group would benefit from a change in reputation, this may be impossible to accomplish. Lifting the reputation of the group would require several cohorts to behave well despite being presumed to behave badly, and this is a sacrifice that does not serve their individual interests.
While I have used Tirole's model here to account for variations across teams in their levels of gamesmanship, his own motivation is much broader: he is interested in understanding variations across societies in levels of corruption and differences among firms in their reputation for product quality. And one can think of numerous other examples in which history has saddled a group with a reputation that is hard to shake because doing so requires significant and sustained collective sacrifices from current and future members.

---

Update (6/25). An excellent comment (as usual) by Andrew Oh-Willeke:
The notion that cultural founder effects have great institutional legacies also has strong implications for bankruptcy policy and for policy related to government bureaucracies.

It suggests that completely shutting down one organization, even if it will be replaced by a new organization doing the same thing with the same technology should often be preferred to trying to reorganize existing organizations, because the failure of the troubled firm or bureaucratic unit may be a problem with organizational culture that would otherwise persist, rather than more "objective" factors.

This might also suggest that seemingly absurd economic development strategies, like Atatürk's law mandating that all men wear bowler hats, may have more merit to them than they seem to at an obvious level. The example Malcolm Gladwell used of this phenomenon was the increased safety record that was observed at Korean Airlines when flight crews started to use English rather than Korean.
I hope to say more about this in a subsequent post.

An alternative (and perhaps complementary) perspective on heterogeneity in behavior across teams comes from Cyril Hedoin at Rationalité Limitée, who argues that there are major differences across national leagues in gamesmanship norms, sustained by the sanctioning of those who fail to conform to local expectations.

I'm in Istanbul for a conference at the moment and will be slow to respond to emails and comments for a few days.

Sunday, June 20, 2010

The Diving Champions of the (Football) World

Aside from early losses by Germany and Spain, the biggest surprise of the World Cup so far is probably the inability of Italy (the reigning champions) to win either of their first two games. First they drew with Paraguay, ranked 31st in the world, and then again today against 78th ranked New Zealand.
In both cases the Italians came back from a goal behind, and in the latter game did so on the basis of a dubious penalty. De Rossi's spectacular dive after getting his shirt gently tugged by Smith was a wonder to behold, revealing yet again that the Italians are undisputed masters of the simulated foul. Even the Wikipedia entry on the art of diving acknowledges this:
Diving (or simulation - the term used by FIFA) in the context of association football is an attempt by a player to gain an unfair advantage by diving to the ground and possibly feigning an injury, to appear as if a foul has been committed. Dives are often used to exaggerate the amount of contact present in a challenge. Deciding on whether a player has dived is very subjective, and one of the most controversial aspects of football discussion. Players do this so they can receive free kicks or penalty kicks, which can provide scoring opportunities, or so the opposing player receives a yellow or red card, giving their own team an advantage. The Italian national football team have been well known to use this tactic... In fact, their victory at the 2006 FIFA World Cup has been overshadowed by the sheer volume of controversial dives.
While the anecdotal (and video) evidence against Italy is strong, it would be useful to have a statistical measure of diving on the basis of which international comparisons could be made. One possibility is to use data on fouls suffered. For instance, in the latest game, Italy was fouled 23 times while New Zealand suffered just 10 fouls. Either New Zealand is an unusually aggressive (or clumsy) team, or a number of the "fouls" suffered by Italy were simulated.
Since data on fouls committed and suffered is readily available for all World Cup games, it should be possible to sort all this out statistically. Suppose that in any game, the total number of fouls suffered by a team depends on three factors: its propensity to dive (without detection), the opponent's propensity to foul, and idiosyncratic factors independent of the identity of the teams. Then, with a rich enough data set, it should be possible to identify the diving propensity of each team. There are subtleties that could confound the analysis, but a good forensic statistician should be able to handle these. Perhaps Nate Silver will take up the challenge?
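The identification argument above can be sketched in code. This is only a minimal illustration under strong assumptions: fouls suffered are modeled as an additive sum of a baseline, the victim's diving propensity, the opponent's fouling propensity, and independent noise; the data are simulated for a hypothetical round-robin rather than taken from actual World Cup games; and the function names `simulate_games` and `estimate_diving` are invented for this example. Diving propensities are recovered only relative to the average team, since adding a constant to every diving effect and subtracting it from every fouling effect leaves the data unchanged.

```python
import random


def simulate_games(dive, foul, n_rounds, noise_sd, seed=0):
    """Simulate fouls suffered by team i against team j (hypothetical data).

    Assumed model: y = baseline + dive[i] + foul[j] + noise.
    """
    rng = random.Random(seed)
    n = len(dive)
    games = []
    for _ in range(n_rounds):
        for i in range(n):
            for j in range(n):
                if i != j:
                    y = 10.0 + dive[i] + foul[j] + rng.gauss(0.0, noise_sd)
                    games.append((i, j, y))
    return games


def estimate_diving(games, n_teams, n_iter=30):
    """Fit the additive model by alternating least squares (backfitting).

    Returns each team's diving effect relative to the average team.
    """
    d = [0.0] * n_teams  # effect of the team suffering the fouls
    f = [0.0] * n_teams  # effect of the opponent committing them
    for _ in range(n_iter):
        for i in range(n_teams):
            resid = [y - f[j] for (a, j, y) in games if a == i]
            d[i] = sum(resid) / len(resid)
        for j in range(n_teams):
            resid = [y - d[i] for (i, b, y) in games if b == j]
            f[j] = sum(resid) / len(resid)
    mean_d = sum(d) / n_teams
    return [x - mean_d for x in d]


# Hypothetical "true" propensities for eight teams; team 0 dives the most.
true_dive = [6.0, 0.0, 1.0, -1.0, 2.0, 0.5, -2.0, 1.5]
true_foul = [0.0, 3.0, -1.0, 1.0, 0.0, 2.0, -2.0, 1.0]
games = simulate_games(true_dive, true_foul, n_rounds=30, noise_sd=2.0)
estimates = estimate_diving(games, n_teams=8)
```

Note that the sketch cannot tell a team that dives from one that is genuinely fouled more often (say, because it dribbles more), which is one of the subtleties a real analysis would have to confront.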
In the meantime, for a lesson on how not to dive, enjoy this legendary "posthumous" effort by Gilardino in a 2007 game between AC Milan and Celtic:

Saturday, June 19, 2010

On Tail Risk and the Winner's Curse

Richard Thaler used to write a wonderful column on anomalies in the Journal of Economic Perspectives. Here's an extract from a 1988 entry on the winner's curse:
The winner's curse is a concept that was first discussed in the literature by three Atlantic Richfield engineers, Capen, Clapp, and Campbell (1971). The idea is simple. Suppose many oil companies are interested in purchasing the drilling rights to a particular parcel of land. Let's assume that the rights are worth the same amount to all bidders, that is, the auction is what is called a common value auction. Further, suppose that each bidding firm obtains an estimate of the value of the rights from its experts. Assume that the estimates are unbiased, so the mean of the estimates is equal to the common value of the tract. What is likely to happen in the auction? Given the difficulty of estimating the amount of oil in a given location, the estimates of the experts will vary substantially, some far too high and some too low. Even if companies bid somewhat less than the estimate their expert provided, the firms whose experts provided high estimates will tend to bid more than the firms whose experts guessed lower... If this happens, the winner of the auction is likely to be a loser.
Thaler goes on to point out that the winner's curse would not arise if all bidders were rational, for they would take into account when bidding that conditional on winning the auction, the valuation of their experts is likely to have been inflated. But he also presents evidence (from laboratory experiments as well as field data on offshore oil and gas leases and corporate takeovers) that bidders are not rational to this degree, and that the winner's curse is therefore an empirically relevant phenomenon. Many observers of the free agent market in baseball would agree.
In Thaler's description, the winner's curse arises despite the fact that bidder estimates are unbiased: their valuations are correct on average, even though the winning bid happens to come from someone with excessively optimistic expectations. Someone familiar with this phenomenon would therefore never conclude that all bidders are excessively optimistic simply by observing the fact that winning bidders tend to wish that they had lost.
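A small simulation makes the distinction concrete. It is a sketch under assumptions that are not in Thaler's column: estimates are normally distributed around a common value of 100, every bidder naively bids a fixed fraction (`shade`) of its own estimate, and the highest bid wins and is paid in full. The function name and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_auctions(n_bidders=10, n_auctions=20000, true_value=100.0,
                      noise_sd=20.0, shade=0.9, seed=0):
    """Simulate naive bidding in a common value auction.

    Returns (mean of all estimates, mean profit of the winning bidder).
    """
    rng = random.Random(seed)
    all_estimates = []
    winner_profits = []
    for _ in range(n_auctions):
        # Each bidder's expert produces an unbiased but noisy estimate.
        estimates = [rng.gauss(true_value, noise_sd) for _ in range(n_bidders)]
        all_estimates.extend(estimates)
        # The most optimistic bidder wins and pays its shaded estimate.
        winning_bid = shade * max(estimates)
        winner_profits.append(true_value - winning_bid)
    return statistics.mean(all_estimates), statistics.mean(winner_profits)


mean_estimate, mean_winner_profit = simulate_auctions()
```

Under these assumptions the average estimate across all bidders sits close to the true value, while the average profit of winners is negative: the loss comes from selection on the most optimistic estimate, not from any bias in the population of bidders.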
By the same token, when firms like BP and AIG are revealed to have underestimated the extent to which their actions exposed them (and numerous others) to tail risk, one ought not to presume that they were acting under the influence of a psychological propensity to which we are all vulnerable. Those who had more realistic (or excessively pessimistic) expectations regarding such risks simply avoided them, and by doing so also avoided coming to our attention.
And yet, here is the very same Richard Thaler arguing that a behavioral propensity to accept "risks that are erroneously thought to be vanishingly small" was responsible for both the financial crisis and the oil spill:
The story of the oil crisis is still being written, but it seems clear that BP underestimated the risk of an accident. Tony Hayward, its C.E.O., called this kind of event a “one-in-a-million chance.” And while there is no way to know for sure, of course, whether BP was just extraordinarily unlucky, there is much evidence that people in general are not good at estimating the true chances of rare events, especially when human error may be involved.
There is certainly a grain of truth in this characterization, but I feel that it misses the real story. As the analysis underlying the winner's curse teaches us, those with the most optimistic expectations will take the greatest risks and suffer the most severe losses when the low probability events that they have disregarded eventually come to pass. But tail risks are unlike auctions in one important respect: there can be a significant time lag between the acceptance of the risk and the realization of a catastrophic event. In the interim, those who embrace the risk will generate unusually high profits and place their less sanguine competitors in the difficult position of either following their lead or accepting a progressively diminishing market share. The result is herd behavior with entire industries acting as if they share the expectations of the most optimistic among them. It is competitive pressure rather than human psychology that causes firms to act in this way, and their actions are often taken against their own better judgment. 
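The arithmetic behind the time lag can be illustrated with a toy calculation. The numbers are purely hypothetical: a firm that accepts the tail risk is assumed to grow 5 percentage points faster per year than a prudent competitor, and to face an independent 2 percent annual chance of catastrophe. Neither figure comes from any data on BP or AIG, and the function names are invented for this sketch.

```python
def survival_probability(p_catastrophe, n_periods):
    """Chance that no catastrophe occurs in n_periods independent periods."""
    return (1.0 - p_catastrophe) ** n_periods


def relative_size(extra_growth, n_periods):
    """Size of the risk-taker relative to a prudent rival, absent catastrophe."""
    return (1.0 + extra_growth) ** n_periods


# Hypothetical inputs: 2% annual tail risk, 5% extra annual growth, 10 years.
p_survive = survival_probability(0.02, 10)
size_ratio = relative_size(0.05, 10)
```

With these inputs the risk-taker has roughly a four-in-five chance of reaching year ten unscathed, by which point it is more than half again as large as its cautious rival. That is the interim period in which competitors must choose between following its lead and losing market share.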
This ecological perspective lies at the heart of Hyman Minsky's analysis of financial instability, and it can be applied more generally to tail risks of all kinds. As an account of the (environmental and financial) catastrophes with which we continue to grapple, I find it more compelling and complete than the psychological story. And it has the virtue of not depending for its validity on systematic, persistent, and largely unexplained cognitive biases among professionals in high stakes situations.
Both James Kwak and Maxine Udall have also taken issue with Thaler's characterization (though on somewhat different grounds). James also had this to say about behavioral economics more generally:
Don’t get me wrong: I like behavioral economics as much as the next guy. It’s quite clear that people are irrational in ways that the neoclassical model assumes away... But I don’t think cognitive fallacies are the answer to everything, and I don’t think you can explain away the myriad crises of our time as the result of them.
I agree completely. As I said in an earlier post, I can't help thinking that too much is being asked of behavioral economics at this time, much more than it has the capacity to deliver.

---

Update (6/20). In a response to this post, Brad DeLong makes two points. First, he observes that those who underestimate tail risk can make unusually high profits not just in the interim period before a catastrophic event occurs, but also if one averages across good and bad realizations:
To the extent that the optimism of noise traders leads them to hold larger average positions in assets that possess systemic risk, their average returns will be higher in a risk-averse world--not just in those states of the world in which the catastrophe has not happened yet, but quite possibly averaged over all states of the world including catastrophic states.
This is logically correct, for reasons that were discussed at length in Brad's 1990 JPE paper with Shleifer, Summers and Waldmann. But (as I noted in my comment on his post) I don't think the argument applies to the risks taken by BP and AIG, which could easily have proved fatal to the firms. One could try to make the case that even with bankruptcy, the cumulative dividend payouts would have resulted in higher returns than less exposed competitors, but the claim seems empirically dubious to me.

Brad's second point is that my distinction between the ecological and psychological approaches is unwarranted, and that the two are in fact complementary. Here he quotes Charles Kindleberger:
Overestimation of profits comes from euphoria, affects firms engaged in the production and distributive processes, and requires no explanation. Excessive gearing arises from cash requirements that are low relative both to the prevailing price of a good or asset and to possible changes in its price. It means buying on margin, or by installments, under circumstances in which one can sell the asset and transfer with it the obligation to make future payments. As firms or households see others making profits from speculative purchases and resales, they tend to follow: "Monkey see, monkey do." In my talks about financial crisis over the last decades, I have polished one line that always gets a nervous laugh: "There is nothing so disturbing to one’s well-being and judgment as to see a friend get rich."
The Kindleberger quote is wonderful, but the claim is about interdependent preferences, not cognitive limitations. I don't doubt that cognitive limitations matter (I started my post with the winner's curse after all) but I was trying to shift the focus to interactions and away from psychology. In general I think that the Minsky story can be told with very modest departures from rationality, which to me is one of the strengths of the approach.

Tuesday, June 15, 2010

An Extreme Version of a Routine Event

The flash crash of May 6 has generally been viewed as a pathological event, unprecedented in history and unlikely to be repeated in the foreseeable future. The initial response was to lay blame on an external source for the instability: fat fingers, computer glitches, market manipulation, and even sabotage were all contemplated. But once it became apparent that this was a fully endogenous event, arising from interactions among trading strategies, it was time to drag out the perennial metaphor of the perfect storm. Consider, for instance, the response from Barclays Capital:
Thursday’s market action, in our opinion, did not begin and end with trading errors and/or exchange technology failures. Nor, as some commentators are suggesting, were quantitative trading strategies primarily responsible for the events that unfolded. All of these forces may have contributed to the voracious sell-off, but our analysis suggests that last Thursday’s events were more a function of a “perfect storm,” to borrow a cliché phrase.
Resorting to this tired analogy is both intellectually lazy and dangerously misleading. It lulls one into a false complacency and suggests that there is little one can (or needs to) do to prevent a recurrence. And since the correction was quick, and trades at the most extreme prices were canceled, it could even be argued that little damage was done. Might this not reflect the resilience of markets rather than their vulnerability?
But consider, for a moment, the possibility that far from being a pathological event, the flash crash was simply a very extreme version of a relatively routine occurrence. It was extreme with respect to the scale of departures of prices from fundamentals, and the speed with which they arose and were corrected. But it was routine in the sense that such departures do arise from time to time, building cumulatively rather than suddenly, and lasting for months or years rather than minutes, with corrections that can be rapid or prolonged but almost impossible to time.
Viewed in this manner, the flash crash can provide us with insights into the more general dynamics of prices in speculative asset markets, in much the same manner as high speed photography can reveal intricate details about the flight of an insect. The crash revealed with incredible clarity how (as James Tobin observed a long time ago) markets can satisfy information arbitrage efficiency while failing to satisfy fundamental valuation efficiency. The collapse and recovery of prices could not have been predicted based on an analysis of any publicly available market data, at least not with respect to timing and scale. And yet prices reached levels (both high and low) that were staggering departures from fundamental values.
So what can we learn from the crash? The SEC report on the event contains two pieces of information that are revealing: the vast majority of trades against stub quotes of five cents or less were short sales, and there were major departures of prices from fundamentals in both directions, with a number of trades executed at ten million dollars per round lot. It is very unlikely that these orders came from retail investors; they were almost certainly generated by algorithms implementing strategies that involve directional bets for short holding periods in response to incoming market data.
While the algorithmic implementation of such strategies is a relatively recent development, the strategies themselves have been around for as long as securities markets have existed. They can be very effective when sufficiently rare, but become increasingly vulnerable to major losses as they become more widespread. Their success in stable markets leads to their proliferation, which in turn causes the information in market data to become progressively more garbled. These strategies can then become mutually amplifying, resulting in major departures of prices from fundamentals. When the inevitable correction arrives, some of them are wiped out, and market stability is restored for a while. This process of endogenous regime switching finds empirical expression in the clustering of volatility.
The reason why departures of prices from fundamentals were so quickly corrected during the flash crash was that the discrepancies were so obvious. It was common knowledge among market participants that a penny per share for Accenture or a hundred thousand for Sotheby's were not real prices (to use Jim Cramer's memorable expression) and therefore presented significant and immediate profit opportunities. Traders pounced and sanity was restored.
But when departures of prices from fundamentals arise on a more modest scale, a coordinated response is more difficult to accomplish. This is especially the case when securities become overvalued. Bubbles can continue to expand even as awareness of overvaluation spreads because short selling carries enormous downside risk and maintaining short positions in a rising market requires increasing amounts of capital to meet margin requirements. Many very sophisticated fund managers suffered heavy losses while attempting to time the collapse in technology stocks a decade ago. And many of those who recently used credit derivatives to bet on a collapse in housing prices might well have met the same fate were it not for the taxpayer funded rescue of a major counterparty.
Aside from scale and speed, one major difference between the flash crash and its more routine predecessors was the unprecedented cancellation of trades. As I have argued before, this was a mistake: losses from trading provide the only mechanism that currently keeps the proliferation of destabilizing strategies in check. The decision is not one that can now be reversed, but the SEC should at least make public the list of beneficiaries and the amounts by which their accounts were credited. Dissemination of these simple facts would help to identify the kinds of trading strategies that were implicated. And it is information to which the public is surely entitled.