Discussion questions for Hanna Wallach’s article.
But when it comes to addressing bias, fairness, and inclusion, perhaps we need to focus our attention on the granular nature of big data, or the fact that there may be many interesting data sets, nested within these larger collections, for which average-case statistical patterns may not hold.
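To make this concrete, consider Simpson's paradox: an association that holds in the pooled data can reverse within every nested subgroup. The sketch below uses small synthetic admissions-style numbers (hypothetical, chosen purely for illustration) to show how an aggregate statistic can misrepresent each granular data set it summarizes.

```python
import pandas as pd

# Hypothetical synthetic data: within each department, group B is admitted
# at a *higher* rate than group A, but group A applies mostly to the
# department that admits most applicants, so the pooled rates reverse.
data = pd.DataFrame({
    "group":    ["A", "A", "B", "B"],
    "dept":     ["X", "Y", "X", "Y"],
    "applied":  [800, 200, 200, 800],
    "admitted": [480,  20, 130, 100],
})

data["rate"] = data["admitted"] / data["applied"]
print(data)  # per-department: B beats A in both X (0.65 vs 0.60) and Y (0.125 vs 0.10)

pooled = data.groupby("group")[["admitted", "applied"]].sum()
pooled["rate"] = pooled["admitted"] / pooled["applied"]
print(pooled)  # pooled: A (0.50) appears favored over B (0.23)
```

Any average-case pattern computed over the pooled collection describes neither department accurately, which is exactly why the nested data sets deserve separate attention.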
…it’s possible to obtain all kinds of local government data via public records requests, including data on bias, fairness, and inclusion. Of course, in order to do this, you have to know that these laws exist, how to issue a public records request, and so on, all of which is arguably more difficult than pulling in data from the Twitter firehose, but may ultimately help address bigger societal issues.
…if we want to achieve fairness, we need to perform rigorous error analysis and model validation.
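One concrete form this can take is disaggregated evaluation: reporting error rates per subgroup on a held-out validation set, rather than a single aggregate number. Below is a minimal sketch with synthetic data (the group labels, rates, and simulated model are all hypothetical, included only to illustrate the analysis pattern).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic validation set: binary labels plus a group attribute, with a
# simulated model that errs ~5% of the time on the majority group "a"
# and ~30% of the time on the minority group "b".
n = 1000
group = rng.choice(["a", "b"], size=n, p=[0.8, 0.2])
y_true = rng.integers(0, 2, size=n)
flip = np.where(group == "a", rng.random(n) < 0.05, rng.random(n) < 0.30)
y_pred = np.where(flip, 1 - y_true, y_true)

print(f"overall error rate: {np.mean(y_pred != y_true):.3f}")
for g in ("a", "b"):
    mask = group == g
    print(f"group {g}: n={mask.sum():4d}, error rate: {np.mean(y_pred[mask] != y_true[mask]):.3f}")
```

The aggregate error looks acceptable (around 10%) even though one subgroup's error rate is roughly six times the other's; surfacing that gap is the point of rigorous, disaggregated error analysis.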
Being aware that these “implicit biases” exist, and that everyone, scientists included, possesses them, is an important step toward drawing fair and unbiased conclusions.
If we want people to draw responsible conclusions using our models and tools, then we need people to understand how they work, rather than treating them as infallible “black boxes.” This means not only publishing academic papers and making research code available, but also explaining our models and tools to general audiences, focusing on their implicit assumptions, best practices for selecting and deploying them, and the types of conclusions they can and can’t be used to draw.