Simpson's paradox (Yule-Simpson effect) occurs when a trend or pattern apparent in a series of data groups reverses once the groups are aggregated (Simpson 1951; Yule 1903). The paradox results from the combination of a lurking explanatory variable and a data set composed of unequally sized groups (Goltz and Smith 2010).
Well known to statisticians, Simpson's paradox is less familiar to criminal justice policy makers, police executives, and crime analysts. We found 210 instances of Simpson's paradox in 15 years of clearance rate data from the 50 largest Canadian police jurisdictions, including four cases in which a reversal occurred simultaneously in all crime categories and subcategories. Caution is therefore necessary when using clearance rates as a comparative measure of police performance, particularly between jurisdictions or time periods with different crime mixes.
Simpson's paradox is an important statistical anomaly because the reversal typically cannot be anticipated and may therefore lead to erroneous conclusions, as the whole does not accurately reflect the sum of its parts. Structurally, Simpson's paradox is compatible with a relatively wide range of scenarios and causal relationships (Pearl 2014). For simple statistical problems that can be summarized in a 2 x 2 x 2 contingency table (a 2 x 2 table for each of two subgroups), Simpson's paradox can be expected to appear by chance in 1 case out of 60 (Pavlides and Perlman 2009). The probability of the paradox decreases exponentially with the number of subgroups, so the paradox is rare in situations with more than two subgroups.
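The chance-occurrence figure can be explored with a short simulation. The sketch below is illustrative only: it assumes two subgroups, each summarized as a 2 x 2 table, with the eight cell probabilities drawn uniformly at random (a modelling assumption made here, not one stated in the text). The function names, trial count, and seed are hypothetical choices. A reversal is counted when both subgroups show the same direction of association and the aggregated table shows the opposite direction.

```python
import random


def is_reversal(t1, t2):
    """True if both 2 x 2 subgroup tables share one direction of association
    and the aggregated table shows the opposite direction.

    Each table is (a, b, c, d): row 1 = (a, b), row 2 = (c, d).
    """
    def sign(a, b, c, d):
        diff = a * d - b * c  # cross-product difference; its sign is the direction
        return (diff > 0) - (diff < 0)

    s1, s2 = sign(*t1), sign(*t2)
    s_all = sign(*(x + y for x, y in zip(t1, t2)))
    return s1 == s2 != 0 and s_all == -s1


def estimate(trials=200_000, seed=0):
    """Monte Carlo estimate of the share of random tables showing a reversal."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Eight independent exponential draws are proportional to a uniform
        # (Dirichlet) draw over the eight cells; the sign tests ignore scale,
        # so no normalization is needed.
        cells = [rng.expovariate(1.0) for _ in range(8)]
        hits += is_reversal(cells[:4], cells[4:])
    return hits / trials


if __name__ == "__main__":
    print(f"estimated share of reversals: {estimate():.4f}")
```

The estimate can be compared with the 1-in-60 figure cited above; any difference may reflect sampling noise or a mismatch between this sketch's assumptions and those of the cited study.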
Empirical instances of Simpson's paradox have been previously documented in medical treatment data (Charig, Webb, Payne, and Wickham 1986; Julious and Mullee 1994), baseball data (Friedlander 1992), jury selection data (Westbrooke 1998), hospital data (Reintjes, de Boer, van Pelt, and Mintjes-de Groot 2000), and graduate student university admission data (Bickel, Hammel, and O'Connell 1975).
One of the most famous examples of Simpson's paradox in the criminal justice system is found in the Baldus study for the United States Supreme Court. In 1978, Warren McCleskey, a black man, was convicted of murdering a white police officer during a store robbery in Georgia and sentenced to death. McCleskey filed a petition in the Federal District Court for the Northern District of Georgia with 18 claims, one of which was that the capital sentencing process in Georgia was racially discriminatory in violation of the US Constitution (McCleskey v Kemp 1987). In support of his claim, McCleskey proffered a statistical study by Baldus, Pulaski, and Woodworth (1983), commissioned by the National Association for the Advancement of Colored People (NAACP) Legal Defense Fund. The Baldus study examined 2,484 murder cases in the State of Georgia between 1973 and 1979. In aggregate, white defendants were more likely to be sentenced to death: 7.4% (60/808) of white defendants, compared to 4.1% (68/1,676) of black defendants. When the murder involved a white victim, however, a black defendant was more likely to be sentenced to death: 21.5% (50/233) of black defendants, compared to 7.8% (58/748) of white defendants. When the murder involved a black victim, a black defendant was not more likely to be sentenced to death: 1.2% (18/1,443) of black defendants, compared to 3.3% (2/60) of white defendants.
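The arithmetic behind these figures can be checked directly. The following sketch uses only the counts quoted above; the dictionary layout and the helper name `rate` are illustrative choices, not part of the Baldus study. It recomputes the death-sentence rate for each defendant group, first over all cases and then within each victim group.

```python
from fractions import Fraction

# Counts quoted in the text, keyed by (defendant race, victim race):
# (death sentences, murder cases).
counts = {
    ("black", "white"): (50, 233),
    ("black", "black"): (18, 1443),
    ("white", "white"): (58, 748),
    ("white", "black"): (2, 60),
}


def rate(defendant, victim=None):
    """Death-sentence rate for a defendant group, optionally within one victim group."""
    cells = [
        cell
        for (d, v), cell in counts.items()
        if d == defendant and (victim is None or v == victim)
    ]
    sentenced = sum(s for s, _ in cells)
    total = sum(n for _, n in cells)
    return Fraction(sentenced, total)


for victim in (None, "white", "black"):
    label = "all victims" if victim is None else f"{victim} victims"
    black, white = rate("black", victim), rate("white", victim)
    print(f"{label:>13}: black {float(black):.1%}, white {float(white):.1%}")
```

The counts show why the aggregate comparison differs from the white-victim comparison: 1,443 of the 1,676 cases with black defendants involved black victims, the group in which death sentences were least common, whereas 748 of the 808 cases with white defendants involved white victims.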
Illusory correlations are not uncommon in criminal justice research (Kaye and Freedman 2011). They are a consequence of the highly aggregated nature of most administrative criminal justice data sets, including the police-reported clearance rates compiled by Statistics Canada for the annual Uniform Crime Reporting (UCR) survey. Even though clearance rates are commonly used to inform inter-agency comparisons, (1) crime analysis, resource planning, policy development, and legislative changes (Mahony and Turner 2012), they are sensitive to the type and mix of crime (Ouimet and Pare 2003; Pare, Felson, and Ouimet 2007). Moreover, the prevalence of Simpson's paradox in the context of police-reported clearance rate data is not well documented and consequently may not always be given proper consideration by policy makers, the media, or researchers who are not experts in advanced statistical analysis.
Through its annual UCR survey, the Canadian...