Moneyball for Judges

Why do judges do what they do? It is easy to identify two different answers. The first emphasizes the law. The second emphasizes politics.

Some people say that because they are merely interpreting the law, judges are altogether different from political actors. In his confirmation hearing, Chief Justice Roberts famously described judges as neutral umpires, calling balls and strikes. In 2001, Justice Sotomayor declared that “I would hope that a wise Latina woman with the richness of her experiences would more often than not reach a better conclusion than a white male who hasn’t lived that life,” but at her confirmation hearing in 2009, she said that “judges must apply the law and not make the law. Whether I’ve agreed with a party or not, found them sympathetic or not, in every case I have decided, I have done what the law requires.” Many people are drawn to this general account—if not for judges in general, at least for the judges with whom they agree. Some of Justice Scalia’s admirers lament that judges do not always do “what the law requires,” but they think that Justice Scalia does exactly that. Some people describe this perspective as “legalist.”

The opposing view is that judges are inevitably political actors, and hence their decisions are ultimately based on their ideological convictions. Sure, judges hide behind the law, and they purport to be speaking for it, but we shouldn’t be fooled. If Justice Scalia disagrees with Justice Sotomayor in an affirmative action case or a campaign finance case, the explanation is not that one of them follows the law while the other ignores it, but that conservatives and liberals disagree about affirmative action and campaign finance regulation. In honor of the legal realist movement of the 1930s, which emphasized the role of the judge’s predispositions, we might describe this view as “realist.”

Having written articles with titles such as “A Political Court,” Richard Posner, himself a judge, has long tended to favor the realist view. But as a dedicated empiricist, he knows that the underlying debate is an empirical one, which should be answered by reference to data. In baseball, political campaigns, and many other areas, people are now going beyond intuitions, anecdotes, impressions, and dogmas to incorporate careful statistical analysis. So can we play moneyball with judges? Can the judiciary find its Nate Silver?

A lot of people think so. In recent decades, researchers have been counting and cataloging many thousands of judicial votes. Armed with statistical techniques, they have tested a variety of hypotheses about what judges do and why they do it. Posner likes this enterprise, and he is now a part of it. He has teamed up with Lee Epstein (a political scientist) and William Landes (an economist) to provide a comprehensive, numbers-filled assessment of judicial voting patterns.

The resulting book counts as the most detailed and elaborate quantitative analysis of the federal judiciary to date. Its starting point, and its central creed, is its epigraph from Lord Kelvin: “I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind: it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the state of science, whatever the matter may be.”

The authors’ (scientific) conclusion is that the legalists and the realists are both wrong. The truth lies somewhere in between. That is usually a pretty boring conclusion, but it is where the truth turns out to be. Epstein, Landes, and Posner offer some interesting answers to the question, “Exactly where in between?” They also show that the role of judicial ideology gets a lot bigger as we move up the judicial hierarchy. On the Supreme Court, voting is pretty political; on the district courts, politics turns out to be just about irrelevant. This finding has a concrete implication. In the confirmation process, the Senate sometimes fusses a lot over the political views of district court nominees, but usually it shouldn’t: for the district courts, the legalist view is essentially correct.

But Posner and his colleagues want to do a lot more than to analyze a mountain of numbers. The subtitle of their book says that theirs is a “study of rational choice.” They want readers to take those words extremely seriously. They seek to develop a distinctive theory of judicial behavior, one that treats judges as workers, who are rationally seeking to maximize defined goals. In short, data is not all they have. They also have a theory, which they call “the labor-market theory of judicial behavior.”

Here is the basic idea. Workers in general care about money, but they care about a lot of other things, too, including the intrinsic value of work, prestige, social relationships, and leisure time. They may display “leisure preference” and “effort aversion.” A rational worker “will want to have as great a surplus of benefits from work over costs as he can obtain.” Workers might seek a lot of money even if they don’t get a lot of leisure (if they are mostly concerned about money) or vice versa (if they have strong leisure preferences). Epstein, Landes, and Posner believe that leisure preference is likely to play an especially big role in judicial employment, because salaries cannot go up with good performance. It follows that judges may be averse to effort, which will lead them not to dissent as often as they otherwise would (dissent takes effort, after all), and also to try to maintain the goodwill of their colleagues (since ill will eventually leads to more work).

Epstein, Landes, and Posner think that judges care about a lot of different things, or in economic jargon, have a lot of diverse goods in their utility function, which includes internal satisfaction (from doing the job well), the prospect of promotion (which may motivate lower court judges), friendly social relationships, income (which may lead to remunerative extra-judicial activity, such as speechmaking), and leisure. Being reversed by a higher court is bad; hence the authors refer (a bit clumsily) to “reversal aversion.” A lot of additional judicial work can have a negative effect on judicial utility; the same is true of having to write dissenting opinions.

Epstein, Landes, and Posner believe that a theory of “judicial behavior ... that is rational in the economic sense” is a clear competitor to the legalist approach, which rests content with “orthodox norms of judicial decision-making.” They are certainly correct on that point. Imagine a theater-of-the-absurd confirmation hearing in which Supreme Court nominees spoke not of following the law but of balancing the diverse goods in the judicial utility function. We can be confident that such nominees would not exactly have an easy time with the United States Senate or the American public. Still, Epstein, Landes, and Posner believe that the theory of rational choice is the best account of judicial behavior, and it organizes their more specific arguments.

With respect to the Supreme Court, the authors think that the judicial utility function is much simplified, because career advancement is not an issue, and because the light caseload and large staff mean that the justices will have plenty of leisure time. By contrast, the Supreme Court justices’ “ideological leanings” will particularly matter, because promoting them will increase “the internal satisfactions of the judgeship” and also likely help the justices to “reap prestige, exert power and influence, and achieve celebrity, from attempting to align the law with [their] ideological commitments.” For Supreme Court justices, the authors’ rational choice theory and the realist view lead to precisely the same prediction: that ideology should matter a lot.

The evidence strongly supports this prediction. Investigating individual voting patterns, the authors confirm the conventional wisdom: some judges are a lot more conservative than others, and Republican appointees tend to be far more conservative than Democratic ones. Perhaps their most striking findings are captured in a two-page table that offers an ideological ranking of all forty-four Supreme Court justices from 1937 to 2009. As it turns out, William Rehnquist wins the prize for most conservative and Thurgood Marshall counts as the most liberal. Remarkably, three of the current justices (Thomas, Scalia, Alito) rank among the six most conservative on the entire list, and another (Roberts) is not far behind at ten. By contrast, none of the current justices ranks among the six most liberal, and only one ranks among the top ten (Ginsburg, at eight).

The authors also find—and some people will be surprised by this—that presidents, at least since Nixon, have rarely had reason to be disappointed by the voting patterns of their appointees. Some court-watchers like to say that presidential disappointment is routine and that you really cannot predict how justices will turn out to vote. Not so. With just a few exceptions (most prominently, John Paul Stevens, appointed by President Ford, and David Souter, appointed by President George H. W. Bush), the votes of the justices ought not to have been surprising to the White House that was responsible for putting them on the bench. And indeed, the authors find that ideological voting has increased on the Court, in a finding that “may reflect the growing homogeneity of the political parties.”

Emphasizing this point, Epstein, Landes, and Posner offer two important wrinkles. First, Republican appointees do show a pretty wide spectrum of views, and the same is true of Democratic appointees. Justice Scalia is significantly more conservative than Chief Justice Roberts, and Justice Ginsburg is significantly more liberal than Justice Sotomayor. Second, the authors find that a large number of justices change over time. Of the twenty-three justices who served for a minimum of fifteen terms, four drifted to the right, and no fewer than eight drifted to the left. In general, those shifts were not massive—not a wholesale conversion experience, but an unmistakable movement toward a greater degree of moderation.

While the Supreme Court attracts the lion’s share of public attention, the courts of appeals are exceptionally important. The justices hear only about eighty cases a year, and thousands of rulings by the appellate courts turn out to be essentially final. On those courts, Posner and his co-authors explore several data sets and emphasize and confirm some well-known findings. (Full disclosure: I helped to compile one of their data sets.) For a start, there is a significant difference between Republican and Democratic appointees, especially in the most controversial areas (abortion, gay rights, campaign finance regulation, disability discrimination, and environmental protection). In these and related areas, Democratic appointees are definitely more likely than Republican appointees to cast stereotypically liberal votes, which is a point in favor of the realist view.

But the difference is not exactly huge. Across all subject-matter areas in one large data set, Democratic appointees cast liberal votes 52 percent of the time, whereas Republican appointees did so 40 percent of the time. In a data set limited to ideologically divisive areas, Democratic appointees cast liberal votes 51 percent of the time, whereas Republican appointees did so 38 percent of the time. This means that even in such areas, Democratic appointees show close to a 50-50 split between liberal and conservative votes, and that Republicans are casting stereotypically liberal votes in more than one case in three. It follows that Democratic and Republican appointees are going to agree far more often than they disagree, even in the most contentious cases that courts of appeals judges ever see. The legalists can claim some comfort in this fact.

The authors also emphasize that in the context of the courts of appeals, the political party of the appointing president matters, but with some important anomalies. In the modern era, Republican presidents have appointed conservative judges about two-thirds of the time, and Democratic presidents appointed liberal judges at about the same rate—suggesting that Republican presidents are appointing a lot of liberal-voting judges and that Democratic presidents are appointing a lot of conservative-voting ones. The authors explain this puzzling finding in part by emphasizing the constraints of the law, for “conventional legal reasoning,” at least on a lower court, will limit the judges’ room to maneuver.

Epstein, Landes, and Posner also find that judges are greatly influenced by the votes of their colleagues. When Republican appointees sit with two Democratic appointees, their voting patterns look a lot like those of Democratic appointees, and Democratic appointees vote a lot like Republican appointees when they sit with two Republican appointees. It would be natural to describe this somewhat puzzling pattern as the product of a conformity effect. For decades, social scientists have found that human beings tend to move in the direction marked out by those with whom they associate; and apparently judges behave like the rest of us, even when the ideological stakes are high. The authors offer a compatible but more specific explanation: “effort aversion,” which in turn produces “dissent aversion.” People don’t want to increase their workload, and hence they decline to dissent even when they disagree.

What about the district courts? Here the authors’ theory suggests that ideological ambitions are likely to be dampened by caseload pressures and the threat of being reversed, along with “desire for promotion, a different case mix, and lower visibility.” And indeed, the authors find that on district courts, judicial ideology does not much matter. In this domain at least, the legalist view finds strong support. Even in the ideologically disputed cases—affirmative action, disability discrimination, and the like—the difference between Republican and Democratic appointees is somewhere between small and zero. In part because they are low in the judicial hierarchy, the district courts are dissuaded from allowing ideology to determine their decisions. In the area of criminal sentencing, we would expect to find that Republican appointees to the district courts impose stiffer sentences than do Democratic appointees—and there is indeed a difference. But it is modest.

Epstein, Landes, and Posner offer a detailed and separate treatment of the topic of dissent. For every judge, producing a dissent has both benefits and costs. On the one hand, a dissent may influence the development of the law, and it may enhance the reputation of its author. Some judicial dissenters become history’s heroes. On the other hand, a dissent may be hard to produce, and it may engender some ill will; judges in the majority do not like to have to deal with a dissent. This calculus is likely to lead to more dissents on the Supreme Court than on the courts of appeals, as reflected in the 57.4 percent dissent rate on the former and a dissent rate of only 2.7 percent on the latter. Notably, however, the high dissent rate on the Supreme Court is a relatively recent phenomenon, having risen very steeply in the 1940s. The authors conjecture that the rise was a product of the success of dissents by Oliver Wendell Holmes, Louis Brandeis, and others, whose minority positions eventually became law.

In line with their theory of judicial behavior, the authors expect to find an increase in dissents with a decrease in judicial workload and with an increase in the number of judges (which decreases workload). And they do find high correlations. A 10 percent increase in caseload per judge decreases the dissent rate by about 7 percent, and a 10 percent increase in the number of judges in a circuit increases the dissent rate by about 6.5 percent. Moreover, an increase in judicial heterogeneity, leading to more frequent disagreement, increases dissents as well.
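To make the size of these effects concrete, here is a rough back-of-the-envelope sketch. It assumes (this is my reading, not something the authors spell out here) that the 7 percent and 6.5 percent figures are relative changes in the dissent rate, so that they act like elasticities of about -0.7 and 0.65, and it applies them to the 2.7 percent courts of appeals dissent rate reported above. The function and variable names are purely illustrative.

```python
def scaled_dissent_rate(base_rate, pct_change_in_driver, elasticity):
    """Linear approximation: the relative change in the dissent rate equals
    the elasticity times the relative change in the driver (caseload, judges)."""
    return base_rate * (1 + elasticity * pct_change_in_driver / 100)


BASE_RATE = 2.7              # courts of appeals dissent rate, in percent
CASELOAD_ELASTICITY = -0.7   # assumed: 10% more caseload -> about 7% fewer dissents
JUDGES_ELASTICITY = 0.65     # assumed: 10% more judges -> about 6.5% more dissents

after_caseload = scaled_dissent_rate(BASE_RATE, 10, CASELOAD_ELASTICITY)
after_judges = scaled_dissent_rate(BASE_RATE, 10, JUDGES_ELASTICITY)

print(f"10% more caseload per judge: {after_caseload:.2f} percent")
print(f"10% more judges in a circuit: {after_judges:.2f} percent")
```

On this reading, either change moves the dissent rate by only a fraction of a percentage point, which is worth keeping in mind when weighing how much the workload story explains.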

Epstein, Landes, and Posner have produced the best and most comprehensive study of judicial behavior by reference to quantitative measures and statistical analysis. Their book will almost certainly define the field for many years to come. I do have some doubts about the general theory, but the particular findings, and the supporting explanations, are almost entirely convincing.

At the outset, we should note some problems with what may well be the most striking pages in the entire book, which offer the ideological ranking of all forty-four Supreme Court justices from 1937 to 2009. To obtain that ranking, the authors use the percentage of conservative votes in a set of non-unanimous decisions in which ideology ought to matter (including civil liberties cases). The initial difficulty with such an approach is this: to rank judges along an ideological spectrum, we want not merely to count votes (liberal or conservative?), but also to weight them, by examining the particular cases and also the relative extremism of the judges’ preferred outcomes. It is one thing to vote to strike down a particular affirmative action program; it is quite another to say that all affirmative action programs should be struck down. It is one thing to vote to uphold a particular restriction on abortion; it is quite another to vote to overrule Roe v. Wade. The “percentage of conservative votes” metric will not pick up such differences.

There is an additional problem. Even if a justice shows a modest percentage of conservative votes, that justice should nonetheless be counted as very conservative if she votes for extremely conservative outcomes in the most important cases. Suppose that Justice Jones votes in a stereotypically liberal direction in a fair number of cases, but that she is reliably conservative in cases that involve affirmative action, gay rights, abortion, and campaign finance. Is she really more liberal than a judge who votes in a liberal direction less often, but is a staunch defender of affirmative action programs, abortion rights, gay rights, and campaign finance legislation? Epstein, Landes, and Posner use a proxy—the percentage of conservative votes—that makes for some easy counting, but the proxy is a bit crude.
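A toy calculation shows how the proxy can mislead. Everything in the sketch below is hypothetical: the two justices, their votes, and especially the importance weights are invented for illustration, and choosing such weights in practice would itself be contestable. The point is only that a raw share of conservative votes and an importance-weighted share can rank the same two judges in opposite order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    conservative: bool  # True if the vote is coded as conservative
    weight: float       # hypothetical importance of the case


def conservative_share(votes):
    """Unweighted proxy: fraction of votes coded as conservative."""
    return sum(v.conservative for v in votes) / len(votes)


def weighted_conservative_share(votes):
    """Same idea, but each vote counts in proportion to its case's weight."""
    total = sum(v.weight for v in votes)
    return sum(v.weight for v in votes if v.conservative) / total


# Justice Jones: liberal in six routine cases, conservative in four landmark cases.
jones = [Vote(False, 1.0)] * 6 + [Vote(True, 5.0)] * 4
# A hypothetical colleague, Smith: conservative in the routine cases,
# liberal in the landmark ones.
smith = [Vote(True, 1.0)] * 6 + [Vote(False, 5.0)] * 4

for name, votes in (("Jones", jones), ("Smith", smith)):
    print(name,
          f"unweighted={conservative_share(votes):.2f}",
          f"weighted={weighted_conservative_share(votes):.2f}")
```

By the unweighted count, Jones looks more liberal than Smith; once the landmark cases are given more weight, the order reverses.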

There is a more fundamental objection to the ranking exercise, which is that the mix of cases changes dramatically over time. Justice Marshall did not sit on the Roberts Court, and Chief Justice Roberts did not sit on the Warren Court, and neither of them sat on the Court of the 1940s. To know who is more liberal than whom, we would need to see how they would vote in the same set of cases. Since the mix of cases was quite different in the 1970s from what it was in the 1950s, and since the mix was different in the 2000s as well, we cannot really say (as the authors’ ranking suggests) that Justice Alito is more conservative than Justice Lewis Powell, or that Chief Justice Earl Warren was more liberal than Justice Sotomayor, or that Justice Kennedy is more conservative than Justice Potter Stewart. It is imaginable that even if Justice Hugo Black was quite liberal in his time (and he was), he would count as very conservative in ours—just as it is imaginable that even if Justice Alito counts as very conservative in his votes in the current mix of cases (as he does), he would emerge as pretty liberal in the 1940s.

The authors’ ranking exercise is not so different from an effort to rank baseball players by using some analogously crude measure, such as on-base percentage. If we use that measure, Ted Williams ranks first, but John McGraw (now known mostly as a manager) ranks third, the long-forgotten Billy Hamilton fourth, and the obscure Dan Brouthers thirteenth, while all-time great Mickey Mantle ranks sixteenth and the sensational Stan Musial is merely twenty-first. Needless to say, Mantle and Musial were better hitters than McGraw, Hamilton, and Brouthers. The problem is not only or even mostly the use of on-base percentage. It is that baseball is no longer the same game. When the mix of cases changes significantly, the same is true for judging.

We should not take this objection for more than it is worth. On the basis of their data, the authors can indeed say, with a pretty high degree of confidence, that Justice Thomas is more conservative than Chief Justice Roberts, who is more conservative than Justice Kennedy, who is more conservative than Justice Breyer. These justices have sat together for a significant number of years, hearing exactly the same cases. The problem is that we cannot really credit the ranking of the forty-four justices in the authors’ sample.

The much broader issue is that Epstein, Landes, and Posner have written two somewhat different books. The first, impressive if narrowly empirical, aims to count a large number of votes and to test a set of mid-level hypotheses in order to allow them to explore (among other things) the role of politics in judicial decision-making and the relationship between legalism and realism. Among the most important questions here are whether Republican appointees differ from Democratic appointees, and exactly when, and by exactly how much. As we have seen, there are plenty of other hypotheses to test, and the authors offer a number of illuminating findings.

The second book has a grander ambition, which is to unite everything under the general idea that judges are “rational.” The authors’ largest goal is to establish that a kind of rational choice account explains their data and should organize the whole field. What they say on this count gives the book an unmistakable 1970s flavor: it was then that economists (especially but not only at the University of Chicago) were eager to show that any behavior under the sun could be counted as “rational,” once we specify what people care about and then probe their behavior deeply enough. Most of that work was, and continues to be, convincing and highly illuminating. But in some cases, it runs into an obvious problem, which is that the rationality assumption may turn out to be compatible with a wide assortment of results, such that it is a form of hand-waving, or flag-showing, that doesn’t really allow us to test anything.

To give content to the idea of rationality, we need to specify what people care about, or to identify their “utility function,” not merely in the abstract but in a way that is concrete enough to allow predictions to be falsified. If the utility function includes an assortment of diverse goods, it doesn’t rule out any particular set of outcomes. Suppose that we think that smokers are rational, and that they care a lot about social relationships, popularity, smoking, health, and money. Now suppose that government increases the cigarette tax by sixty cents per pack. Would we predict a small reduction in smoking, a big one, or none at all? If all we know is that the smokers like an assortment of good things, we won’t have a prediction—and just about any imaginable outcome would be compatible with rationality.

Many of the authors’ mid-level hypotheses do not require any particular claims about rationality, and they work just fine. How much is added by offering a labor-market theory of judicial behavior? The most obvious answer, and arguably the only one, involves dissenting. The authors contend that a low level of dissent on the courts of appeals—2.7 percent—is a product of “dissent aversion,” which stems from an effort to reduce effort. But suppose that they found a dissent rate that was half as high, or twice as high, or ten times as high? Would their theory be undermined? Not at all. A dissent rate of 1 percent would be perfectly consistent with their theory. And if a much higher dissent rate—say, 27 percent—were observed, their theory would work, too. A high dissent rate would merely show that other components of the judicial utility function (say, self-esteem or a desire to be vindicated by history) overcame the costs of dissenting.

To be sure, the authors find that dissents increase with a decrease in judicial workload and with increases in judicial heterogeneity. But is this surprising, or in any interesting sense a product of rational choice theory? If people have plenty of time on their hands, they are likely to be willing to expend some effort that they might avoid if time were scarce. No surprise there. And you would certainly expect an increase in dissent with an increase in disagreement: how could it possibly be otherwise?

Notwithstanding these points, Epstein, Landes, and Posner have performed an important service in establishing the truths, and the limits, of both legalism and realism. Divergences among Republicans and Democrats do map onto differences between Republican and Democratic judicial appointees, especially on the Supreme Court. At the same time, those divergences are significantly reduced on the courts of appeals, and eliminated, or nearly so, on the district courts. True, there is a lot more to learn about judicial behavior, perhaps especially through investigating not merely votes, but also opinions. For judges, moneyball has important limitations, at least in its current form. The good news is that statistical analysis and quantitative measures are enabling us to go far beyond the intuitions and anecdotes that have long dominated academic and public discussions of government’s third branch.

Cass R. Sunstein is the Felix Frankfurter Professor of Law at Harvard University and the author, with Richard H. Thaler, of Nudge: Improving Decisions about Health, Wealth, and Happiness (Yale). He is a contributing editor at The New Republic.