

Do Judges Vary in Their Treatment of Race?
David S. Abrams, Marianne Bertrand, and Sendhil Mullainathan1


Are minorities treated differently by the legal system? Systematic racial differences in case characteristics, many unobservable, make this a difficult question to answer directly. In this paper, we estimate whether judges differ from each other in how they sentence minorities, avoiding potential bias from unobservable case characteristics by exploiting the random assignment of cases to judges. We measure the between-judge variation in the difference in incarceration rates and sentence lengths between African-American and White defendants. We perform a Monte Carlo simulation in order to explicitly construct the appropriate counterfactual, where race does not influence judicial sentencing. In our data set, which includes felony cases from Cook County, Illinois, we find statistically significant between-judge variation in incarceration rates, although not in sentence lengths.

I. Introduction
In 2008, 38% of sentenced inmates in the U.S. were African-American, with African-American males incarcerated at six and a half times the rate of White males.2 Do these differences in incarceration rates merely reflect racial differences in criminal behavior, or are they also partly an outcome of differential prosecution or sentencing practices? A long-standing principle embedded in our system of justice is that defendants should not be treated differently because of their race. This principle is codified in the “Equal Protection” clause of the 14th amendment to the Constitution.3 Differential sentencing or conviction rates by race are presumably a violation of this clause, making this an important question to answer on legal grounds. Establishing whether or not courts treat minority defendants differently also has important social implications: such practices might further exacerbate social inequalities and might even lead to a self-confirming equilibrium where expectations of racial discrimination affect criminal behavior.

Numerous studies examine this question, and most encounter empirical hurdles, particularly small sample size and omitted variables bias. Although almost all proceedings in U.S. courts are public record, as a practical matter it is quite challenging to obtain a sample large enough for reliable statistical inference. The studies using small samples of archival data have produced mixed results.4 Of equal concern is the fact that cross-sectional studies suffer from a potentially severe omitted variables bias. Apparently significant effects of defendant race may actually be due to omitted case characteristics that are correlated with race, such as criminal history or attorney quality.5 Thus there are two potential reasons for finding a significant coefficient on race in a cross-sectional regression: discriminatory sentencing on the part of judges or juries, or unobservable characteristics that drive the sentencing gap. The central difficulty with the cross-sectional methodology is that race is not randomly assigned. Therefore, any such regression, and any interpretation of it, is likely to suffer from omitted variables bias.

In this paper, we take a new approach to studying the impact of race in judicial sentencing, one that avoids some of the methodological pitfalls just discussed, and helps shed light on the central issue.6 We attempt to determine whether there are systematic differences across judges in the racial gap in sentencing. At the heart of our research strategy is the ability to exploit the random assignment of cases to judges. This random assignment ensures that unobservable case and defendant characteristics are the same across judges. It allows us to distinguish between unobservable case and defendant variables on the one hand and judicial behavior on the other as explanations for a racial gap in sentencing.

Under the unobserved variables explanation, where no judge is discriminatory, we may see an overall difference in sentencing by race, but we do not expect systematic variation in that difference across judges, as random assignment ensures that each judge receives the same case and defendant mix. Under the discriminatory sentencing explanation, as long as there is some between-judge heterogeneity in the level of differential treatment, we have the opposite prediction; that is, some judges will systematically sentence African-Americans at a higher rate and some will sentence them at a lower rate. This logic underlies the examination in this paper of whether there is significant inter-judge disparity in the racial gap in sentencing.7

To proceed, we use data from felony cases to compute the racial gap in sentence length and incarceration rate for each judge. The main empirical challenge is to identify the correct counterfactual, in which inter-judge variation is due solely to sampling variability. The asymptotic F distribution is inappropriate for this data set because of the small number of observations at the level at which random assignment occurs. This is a problem that occurs frequently in datasets involving randomization procedures where data is collected over a long period of time.8 We address this problem by employing a Monte Carlo methodology to explicitly construct the counterfactual where race has the same impact on sentencing for all judges. Beyond its application to the current study, this technique could benefit a wide array of empirical studies facing similar constraints, at little learning cost.9

We find evidence of significant inter-judge disparity in the racial gap in incarceration rates, providing support for the model where at least some judges treat defendants differently based on their race. The magnitude of this effect is substantial. The gap in incarceration rates between White and African-American defendants increases by 18 percentage points (compared to a mean incarceration rate of 51% for African-Americans and 38% for Whites) when moving from the 10th to 90th percentile judge in the racial gap distribution. The corresponding sentence length gap increases by 10 months, but this cannot statistically be distinguished from a situation where race played no role in sentence length.

Although judges differ in the degree to which race influences their sentencing, we do not find evidence that observable characteristics such as judges’ gender or age group significantly predict this differential treatment by race. Similarly, no systematic pattern emerges with respect to work history (such as whether the judge ever worked as a Public Defender). However, there is somewhat stronger evidence that the racial gap in sentencing is smaller among African-American judges. Further, judges who are harsher overall (as measured by incarceration rate) are more likely to sentence African-Americans to jail than they are Whites. We also explore an important potential confound: that the heterogeneity we observe in the racial sentencing gap may actually be due to heterogeneity in the treatment of different types of crime. The results of this analysis indicate that there may be a difference in treatment of drug and non-drug crimes, but that there is still heterogeneous treatment of race within non-drug crimes.

One limitation to our approach is that, while we can statistically establish that race matters in the courtroom, we cannot formally detect whether this is due to some judges discriminating against African-Americans, or some judges discriminating against Whites, or a mixture of both. In itself, though, the evidence we uncover on the importance of race in judicial decision-making should be of direct relevance to legal policy.

The rest of the paper proceeds as follows. Section II provides a brief overview of prior work on the role of race in judicial decisions. In Section III we describe the data from the courts of Cook County, Illinois. We discuss our econometric methodology, including the simulation procedure, in Section IV. In Section V we report our basic results, and we discuss the influence of the crime category in Section VI. Section VII concludes.

II. Literature Review

There has been a great deal of scholarship investigating the role of race in the courtroom. Here we briefly summarize some of the previous research most relevant to this study. Many early studies were cross-sectional, and frequently used data sets that were not rich enough to include controls for important case and individual characteristics, such as criminal history, crime severity, and income. Thus it is unsurprising that an early review of the literature found a lack of consensus among these studies. Daly and Tonry (1997) note some of the shortcomings in some of the work between the 1960’s and 1980’s. Kleck (1981) finds that half of the 40 studies on non-capital cases that he reviews either support a finding of discrimination in sentencing or have mixed results, while the other half do not find evidence of judicial discrimination.

Written nearly two decades later, Spohn (2000) also reviews 40 recent studies on the role of race in sentencing, but splits outcomes into incarceration and sentence length. In her survey of the literature, a majority of studies find that race impacts the incarceration decision, but fewer than one-quarter report evidence that race affects sentence length. In one of the most sophisticated critiques of work on discrimination in the criminal justice system, Klepper, Nagin and Tierney (1983) point out numerous methodological problems, including sample selection and omitted variables. Many of their insights are still often neglected in this field of research, almost three decades later.

Some of the earlier papers such as those by Thomson and Zingraff (1981) and Humphrey and Fogerty (1987) rely on relatively small data sets and are unable to distinguish a race effect from the impact of unobservables. Klein, Petersilia, and Turner (1990) use a dataset from California state courts with a large number of covariates to try to minimize the concern about unobservables. They find no impact of race on either the incarceration or sentencing decision, and little explanatory power. Albonetti (1997) uses federal data from the U.S. Sentencing Commission (USSC) on drug offenders. She finds that African-American and Hispanic defendants are more likely to be incarcerated and for longer duration. Steffensmeier and Demuth (2000) also use federal data collected by the USSC, and thus have a detailed and large data set with which to work. Their cross-sectional OLS and probit regressions indicate that African-Americans and Hispanics are jailed more frequently and receive longer sentences than White defendants. The same authors find similar results using state court data from Pennsylvania in their 2001 paper. This differs to some extent from the findings of Kramer and Steffensmeier (1993), which also used Pennsylvania state court data. This study found a small impact of race on the incarceration decision, but not on the length of imprisonment.

A more recent paper by Mustard (2001) improves on previous work by including additional controls in the regression analysis. Using federal data provided by the USSC, he examines the impact of race on the incarceration and sentencing decisions, as well as on departures from the sentencing guidelines. His cross-sectional regressions include controls for income, as well as interaction terms between race and income, race and education, and race and criminal history. He finds that African-Americans are more likely to be incarcerated and receive longer sentences, although some of this appears to be due to more extensive criminal histories and more severe offenses.

Using state data from Maryland, Bushway and Piehl (2001) estimate a tobit model to isolate the impact of judicial discretion on sentence length. They find a greater impact of race than most prior work. A major strength of this paper is the use of guideline recommendations to instrument for potential unobservable case characteristics. Rachlinski and coauthors (2009) approach the question from an experimental psychological perspective. In a laboratory study of judges they find results on the implicit association test similar to those of the general population, which has been interpreted by some as evidence of bias. In studies with explicit racial identification, however, Rachlinski and coauthors do not find race effects.

A recent contribution to the literature is from Schanzenbach (2005). This study focuses on understanding the impact of judicial characteristics on case outcomes, using variation in judicial characteristics at the federal district level.10 While he finds that female judges reduce sex disparity in sentencing, results on racial disparity are mixed. He also finds no main effect of judges’ race on average sentence length. Zussman and Shayo (2010) take a novel approach to understanding the impact of the ethnicity of various parties in the legal process. They exploit the random timing and location of terrorist attacks in Israel and show that there is a short-lived local difference in case outcomes that is a function of defendant, plaintiff and judge ethnicity. Price and Wolfers (2010) also find evidence for race effects in a quasi-judicial context, that of NBA referees. In this paper, we focus primarily on defendant race effects in one large jurisdiction.

III. Data Description

Our data comes from the cases adjudicated in the Cook County Circuit of the Illinois state courts. Cook County is the largest unified court system in the country, with over 2.4 million cases processed per year in both civil and criminal courts.11 It is also a racially mixed urban area, with a population that is 48% White, 26% African-American, and 20% Hispanic (see Table 1). The racial breakdown in our data is 12% White, 72% African-American, and 16% Hispanic, reflecting the substantially different rates of representation by race in the criminal justice system.

Illinois state courts are governed by sentencing guidelines, which provide suggested sentencing ranges by category of offense.12 Previous studies, such as Anderson, et al. (1999), have found that guidelines mitigate interjudge sentencing variation, but not substantially. Judges in Cook County courts are initially appointed or elected, and subsequently subject to retention elections every six years.

While the original data set includes over 600,000 felony cases tried between 1985 and 2004, we use only a subset of the data. We discuss the primary restrictions used to obtain this subset here; further detail can be found in Appendix A. First, individual cases may have multiple defendants and multiple charges. In the data the number of charges per case ranges from 1 to 266 (see Table 2), but the median is 1. We retain one defendant and only the most severe charge for each case, since sentencing across charges for a given case will be highly correlated. Second, for the primary analysis, we restrict the data to defendants who are African-American or White (excluding the 16% of defendants classified as Hispanic).13 Third, we retain only cases that were initiated between 1995 and 2001. The start date is used because it was impossible to verify random assignment of cases prior to 1995. The end date is used to allow sufficient time for completion of cases initiated towards the end of the time range (since some cases can take several years to adjudicate). Fourth, murder cases were excluded from the analysis because assignment of these cases often excluded certain judges.

We further limit the data to those cases adjudicated by a subset of the judges in the Cook County Criminal Courts Building, which handles the bulk of the criminal cases in Cook County. We included judges who met all of the following criteria: they adjudicated at least 10 total cases throughout the time period of study; they adjudicated cases only at the central courthouse location (in order to ensure that all case randomization was performed on the same set of cases); they did not preside over a special type of court (like drug court); and they did not have any unusual circumstances (such as lengthy capital trials) that would have resulted in non-random assignment of cases.

A full summary of the dataset we construct following the above criteria is provided in Tables 2A and 2B. Nearly all cases (92%) result in a guilty finding. The vast majority of defendants in the sample are African-American (86%), male (83%), and young (mean age is 29 and median age is 27). The mean length of incarceration is 20 months across all cases, and 42 months conditional on incarceration. Note that sentence length is top-coded at 60 years in our data. While the median case has only one charge associated with it in the original data, the average number of charges per case is 2.4. As Table 2B shows, sentencing varies substantially by type of crime, with violent crimes receiving the most severe sentences. African-American defendants receive longer sentences on average and are over 30% more likely to be incarcerated than White defendants, not controlling for any case characteristics.14

Table 3 reports judicial characteristics collected from Sullivan’s Judicial Profiles, A Directory of State and Federal Judges in Chicago, The Directory of Minority Judges of the United States, and several other sources listed in the references. The judiciary included in this study is largely White and male, with an average age of 49. Approximately half of the judges have some prior experience in private practice. Prior experience as a prosecutor is also a very common characteristic of these judges; over 70% have past experience as prosecutors, while 27% had previously served as public defenders or defense attorneys.

A crucial requirement for this analysis is that the court use random assignment of cases to judges. In the following section, we describe an econometric test for random assignment. But to establish even facial plausibility, one of the authors spent several days at the central Cook County Courthouse in Chicago, on a visit arranged by Presiding Judge Paul Biebel. Every morning in the courthouse, the clerks receive files for new cases and first remove those that have charges of murder or sex crimes. The remaining case numbers are typed individually into a monochromatic green-screen computer (almost certainly in use since the 1980s) which then randomly chooses one of the judges currently hearing cases. The clerks verified that this procedure has been generally followed at least since the mid-1990s.

IV. Econometric Methodology

The focus of this paper is determining whether the impact of defendant race on sentencing varies across judges. There are two steps to testing this hypothesis. The first is to establish the random assignment of cases to judges, ensuring that sentencing outcomes can be fairly compared across judges. The second is to employ an appropriate method to evaluate whether there is excess heterogeneity in the racial gap in judicial sentencing beyond what would be expected due to sampling variability.

In theory, both steps may be accomplished using an ordinary least squares regression followed by an F-test. Under this approach, the random assignment of cases would be established by regressing a case characteristic, such as defendant age, on various controls and judge fixed effects, such as in Equation 1:

$$\text{age}_{ijt} = \alpha + \beta X_{ijt} + \sum_j \delta_j D_j + \text{mo}_t + \varepsilon_{ijt} \qquad (1)$$

where age is defendant age in years, X is an array of control variables, D are judge fixed effects, mo are month-year dummies, i is a defendant index, j is a judge index, and t a time index. An F-test on the equality of the judge fixed effects tests the hypothesis that cases are randomly assigned (with respect to defendant age). Similarly, in order to test the equality of the racial sentencing gap across judges, one would regress sentence length on a vector of control variables, defendant race, judge fixed effects, and interactions between the judge fixed effects and defendant race, such as in Equation 2:

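$$\text{sentence}_{ijt} = \alpha + \beta X_{ijt} + \text{race}_{ijt} + \sum_j \delta_j D_j + \sum_j \gamma_j D_j \times \text{race}_{ijt} + \text{mo}_t + \varepsilon_{ijt} \qquad (2)$$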

An F-test on the equality of the judge-race interaction coefficients γ_j would be a test of the equality of the racial gap in sentencing across judges.
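To make the logic of this F-test concrete, the following is a minimal sketch of a stripped-down version of Equation 2. It is an illustration under stated assumptions rather than the specification estimated in the paper: the controls X and the month-year dummies are dropped, race is coded 0/1, every judge is assumed to hear cases of both races, and the function name `interaction_f_stat` and the `(judge, race, y)` tuple layout are hypothetical. With only judge fixed effects and judge-race interactions, the unrestricted model fits judge-by-race cell means, and the restricted model imposes one common racial gap.

```python
from collections import defaultdict


def interaction_f_stat(rows):
    """F statistic for H0: the race gap is the same for every judge.

    rows: iterable of (judge, race, y) with race coded 0 or 1.
    Simplified relative to Equation 2: no controls X, no month-year dummies.
    Assumes every judge has at least one case of each race.
    Returns (F, df_numerator, df_denominator).
    """
    by_judge = defaultdict(list)
    for judge, race, y in rows:
        by_judge[judge].append((race, y))

    # Restricted model: judge fixed effects plus ONE common race gap.
    # Demeaning within judge removes the fixed effects; the pooled slope
    # on the demeaned race dummy is the common gap.
    centered = []
    sxy = sxx = 0.0
    for obs in by_judge.values():
        r_bar = sum(r for r, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for r, y in obs:
            centered.append((r - r_bar, y - y_bar))
            sxy += (r - r_bar) * (y - y_bar)
            sxx += (r - r_bar) ** 2
    common_gap = sxy / sxx
    rss_restricted = sum((dy - common_gap * dr) ** 2 for dr, dy in centered)

    # Unrestricted model: a separate gap per judge, i.e. judge-by-race cell means.
    rss_unrestricted = 0.0
    for obs in by_judge.values():
        for race_value in (0, 1):
            ys = [y for r, y in obs if r == race_value]
            cell_mean = sum(ys) / len(ys)
            rss_unrestricted += sum((y - cell_mean) ** 2 for y in ys)

    n_obs = sum(len(obs) for obs in by_judge.values())
    n_judges = len(by_judge)
    df_num = n_judges - 1          # number of equality restrictions on the gaps
    df_den = n_obs - 2 * n_judges  # residual degrees of freedom, unrestricted model
    f_stat = ((rss_restricted - rss_unrestricted) / df_num) / (rss_unrestricted / df_den)
    return f_stat, df_num, df_den
```

Under the conventional approach, the resulting statistic would be compared with the asymptotic F distribution with the returned degrees of freedom; the sketch deliberately stops at the statistic itself, since the reference distribution is exactly what the remainder of this section calls into question.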

In practice, rather than the asymptotic F-distribution, we rely instead on a Monte Carlo simulation to generate a correct finite-sample distribution. This methodology is analogous in spirit to that described above, but it addresses important shortcomings of using the standard F-test in this context. Specifically, the methodology described above is likely to result in over-rejection of the null hypothesis (of random assignment, or no excess heterogeneity) for two reasons. First, although the overall sample is large, our regressions will suffer from finite sample bias because the sample cells are small within the short time periods that are of relevance. Indeed, it is necessary for the analysis to condition on short time periods because the random assignment of cases to judges occurs within these short periods, and there is substantial temporal variation in the judges available and the mix of case and defendant attributes. Our data structure will therefore not satisfy the large N assumption that the distribution of the F-statistic relies on. A second reason for not using the conventional F-statistic is that it will over-reject the null hypothesis when the errors are not normally distributed, as is the case where the dependent variable is Bernoulli with a mean substantially different from 0.5. This applies to several of the variables of interest here, such as race (test of random assignment) or incarceration (test of excess heterogeneity).15

The aforementioned reasons for empirically computing the finite-sample F-distribution are not unique to this paper; rather, they arise relatively frequently. In the law and economics literature, any study that compares judge effects without very high caseloads, like Cheng (2008) or Fischman (2010), is likely to suffer from the same problem. But this phenomenon is certainly not confined to judges; it applies to studies of teachers, CEOs, and leaders (see the discussion in Jones and Olken 2005), and to numerous other contexts. Fortunately, the availability of cheap computing power makes both the identification of the problem and its solution straightforward.

One way to test whether the small sample is a concern in this context is to simulate the F-distribution under the null for the given data set. Figure 1 illustrates the need for the simulation methodology in this context. In order to generate it, we ran 1000 tests similar to those we describe below, where by construction the null should not be rejected. Theoretically this should yield a uniform distribution of p-values. The dark bars are produced using the simulation methodology, and are nearly uniform. The light bars are produced using the standard F-test methodology. There is clearly an excess of p-values less than 0.05, which would lead to an over-rejection of the null.
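The calibration check behind a figure of this kind can be sketched as follows. This is an illustration only, not the code used to produce Figure 1: it generates Bernoulli outcomes that are independent of the judge by construction, computes a permutation p-value for between-judge dispersion in each simulated data set, and bins the p-values into deciles. All function names, the dispersion statistic (sum of squared deviations of judge-level rates), and the parameter values are assumptions made for the example.

```python
import random


def permutation_p_value(outcomes, judges, n_perm, rng):
    """Share of judge-label shuffles with dispersion at least as large as observed."""
    def dispersion(labels):
        totals, counts = {}, {}
        for y, j in zip(outcomes, labels):
            totals[j] = totals.get(j, 0) + y
            counts[j] = counts.get(j, 0) + 1
        rates = [totals[j] / counts[j] for j in counts]
        mean = sum(rates) / len(rates)
        return sum((r - mean) ** 2 for r in rates)

    observed = dispersion(judges)
    shuffled = list(judges)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so that ties are not lost to floating-point noise.
        if dispersion(shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def null_p_values(n_tests, n_judges, cases_per_judge, base_rate, n_perm, seed=0):
    """p-values from data sets in which the null holds by construction."""
    rng = random.Random(seed)
    judges = [j for j in range(n_judges) for _ in range(cases_per_judge)]
    p_values = []
    for _ in range(n_tests):
        # Outcomes are drawn without reference to the judge, so any
        # between-judge dispersion is pure sampling variability.
        outcomes = [1 if rng.random() < base_rate else 0 for _ in judges]
        p_values.append(permutation_p_value(outcomes, judges, n_perm, rng))
    return p_values


def decile_histogram(p_values):
    """Counts of p-values in [0, 0.1), [0.1, 0.2), ..., [0.9, 1.0]."""
    bins = [0] * 10
    for p in p_values:
        bins[min(int(p * 10), 9)] += 1
    return bins
```

For a well-calibrated test, a call such as `decile_histogram(null_p_values(1000, 10, 8, 0.3, 199))` should give roughly equal counts in each decile; a pile-up in the lowest bin is the over-rejection pattern described above.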

For these reasons we instead use a Monte Carlo simulation methodology to both verify random assignment of cases to judges and to determine whether there is excess heterogeneity in the inter-judge racial gap in sentencing. Random assignment is verified by comparing the heterogeneity of the empirical distribution of case characteristics to that found in simulated data. The heterogeneity of the inter-judge racial gap is tested similarly. In both cases, statistical significance is determined by the dispersion of the empirical data relative to the distribution generated by the simulations. We now describe the implementation of the simulation method, first for the random assignment test, and then for the test of excess heterogeneity across judges.
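As a rough sketch of how such a simulation-based test of excess heterogeneity might look, the code below compares the observed dispersion of judge-level racial gaps with the dispersion obtained when outcomes are reshuffled among cases that share a time period and defendant race. This is one plausible implementation under stated assumptions, not necessarily the procedure used in this paper: the reshuffling scheme, the use of the variance of gaps as the dispersion statistic, the race codes "B" and "W", and all function names are hypothetical, and every judge is assumed to hear cases of both races.

```python
import random
from collections import defaultdict


def judge_gaps(cases):
    """cases: list of (period, judge, race, outcome), race "B" or "W".

    Returns {judge: incarceration rate for "B" minus rate for "W"}.
    Assumes each judge has at least one case of each race.
    """
    tallies = defaultdict(lambda: {"B": [0, 0], "W": [0, 0]})
    for _, judge, race, outcome in cases:
        tallies[judge][race][0] += outcome
        tallies[judge][race][1] += 1
    return {
        judge: t["B"][0] / t["B"][1] - t["W"][0] / t["W"][1]
        for judge, t in tallies.items()
    }


def gap_dispersion(cases):
    """Variance of the racial gap across judges."""
    gaps = list(judge_gaps(cases).values())
    mean = sum(gaps) / len(gaps)
    return sum((g - mean) ** 2 for g in gaps) / len(gaps)


def simulate_once(cases, rng):
    """Reshuffle outcomes among cases with the same period and race.

    Each judge keeps the same caseload and racial mix, but outcomes no
    longer depend on the judge, so all judges share the same expected gap.
    """
    cells = defaultdict(list)
    for idx, (period, _, race, _) in enumerate(cases):
        cells[(period, race)].append(idx)
    simulated = list(cases)
    for indices in cells.values():
        outcomes = [cases[i][3] for i in indices]
        rng.shuffle(outcomes)
        for i, y in zip(indices, outcomes):
            period, judge, race, _ = cases[i]
            simulated[i] = (period, judge, race, y)
    return simulated


def excess_heterogeneity_p_value(cases, n_sim=1000, seed=0):
    """Share of simulated data sets at least as dispersed as the real one."""
    rng = random.Random(seed)
    observed = gap_dispersion(cases)
    hits = sum(
        gap_dispersion(simulate_once(cases, rng)) >= observed - 1e-12
        for _ in range(n_sim)
    )
    return (hits + 1) / (n_sim + 1)
```

Note that this reshuffling imposes a somewhat stronger null than equal racial gaps alone, because it also removes any differences in overall harshness across judges. The same machinery could be pointed at a case characteristic such as defendant age, with judge-level means in place of judge-level gaps, to sketch the random assignment check.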
