NYPD stop-and-frisks from 2003 to 2013 [OC]

The issue isn't whether blacks and Hispanics commit more crimes, but that a biased sampling process produces biased outcomes and statistics.

Stop-and-frisk is a sampling technique. If it were random, universal (everybody), or done in exact proportion to each racial group's share of the population, the outcomes would be statistically representative. If you bias the sampling toward a single racial group, then that group will make up a larger share of the people caught violating some law than it otherwise would, and the resulting outcome statistics then create a feedback loop that seems to justify even heavier sampling of that group.
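To put rough numbers on the paragraph above, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the group labels, the stop counts, and the 10% offense rate are made up and are not NYPD data. It computes each group's expected share of the people caught when both groups offend at the same rate but are stopped at different rates.

```python
def caught_share(stops, offense_rate):
    """Return each group's expected share of all people caught.

    stops: dict mapping group label -> number of stops (hypothetical counts)
    offense_rate: dict mapping group label -> chance a stop finds a violation
    """
    # Expected number caught per group = stops * chance of a violation per stop.
    caught = {g: stops[g] * offense_rate[g] for g in stops}
    total = sum(caught.values())
    if total == 0:
        return {g: 0.0 for g in stops}
    return {g: caught[g] / total for g in stops}


# Hypothetical numbers, not NYPD data: both groups offend at the same rate,
# but group_a receives 80% of the stops.
stops = {"group_a": 800, "group_b": 200}
same_rate = {"group_a": 0.10, "group_b": 0.10}

shares = caught_share(stops, same_rate)
for group, share in shares.items():
    print(f"{group}: {share:.0%} of those caught")
```

With identical offense rates, the share of people caught simply mirrors the share of stops (80/20 here), so the "who gets caught" breakdown reflects who was searched, not who offends more.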

If you don't understand the problem, you should at least be able to understand the extreme case. Imagine you believe that blacks commit a higher percentage of crimes than whites, so you aim to maximize your crime-fighting by searching only blacks. Then 100% of the people caught will be black and 0% will be white. You can then use this statistic to validate your approach: 100% of those caught are black and 0% are white, so there's no need to search whites. The problem is that the statistics are a product of the sampling method, so they can't be used to justify it.
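The extreme case above can be sketched as a loop. This is only an illustration under assumptions of my own: the reallocation rule (next round's stops are split in proportion to the previous round's share of people caught), the group labels, and the offense rates are all hypothetical. Note that the unsearched group is deliberately given the higher true rate.

```python
def next_stop_share(stop_share, offense_rate):
    """Split the next round of stops in proportion to who was caught this round.

    This reallocation rule is an assumption made for illustration.
    """
    caught = {g: stop_share[g] * offense_rate[g] for g in stop_share}
    total = sum(caught.values())
    if total == 0:
        # Nobody was caught, so there is nothing to update the allocation with.
        return dict(stop_share)
    return {g: caught[g] / total for g in stop_share}


def run_rounds(stop_share, offense_rate, rounds):
    """Return the stop allocation at the start and after each round."""
    history = [dict(stop_share)]
    for _ in range(rounds):
        stop_share = next_stop_share(stop_share, offense_rate)
        history.append(stop_share)
    return history


# Hypothetical setup: only one group is ever searched, and the group that is
# never searched actually has the higher (made-up) offense rate.
start = {"searched": 1.0, "unsearched": 0.0}
true_rate = {"searched": 0.10, "unsearched": 0.50}

history = run_rounds(start, true_rate, rounds=5)
for round_number, allocation in enumerate(history):
    print(round_number, allocation)
```

Every round comes out as 100% searched and 0% unsearched, regardless of the true rates, because a group that is never sampled can never show up in the results.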

Furthermore, if crime itself is a feedback loop (i.e., being caught for a minor crime in a stop-and-frisk ultimately leads you to larger crimes because it's now harder to get a job, go legit, etc., and jail/prison just hardens you more), then the sampling method also causally increases the true crime rates by race.

Note that this isn't a claim that blacks and Hispanics do not tend to commit more crimes in either absolute or proportional terms, nor an attempt to explain why that may be. Rather, it is a purely mathematical argument: the racial distribution of crimes is irrelevant to the problem of biased sampling methods like stop-and-frisk.
