I like this framework from @harini824 on how different sources of bias have different causes.
This is important, because gathering a more diverse dataset will help in some cases (representation bias), but not in others
paper: https://arxiv.org/abs/1901.10002
blog: https://medium.com/@harinisuresh/the-problem-with-biased-data-5700005e514c
The COMPAS recidivism algorithm:
- it is no more accurate than predictions from randomly selected people (recruited via Amazon Mechanical Turk)
- it is a black box with 137 inputs, yet no more accurate than a linear classifier on 2 variables (see the sketch below)
- the Wisconsin Supreme Court upheld its use (and it is still used in other states as well)
Link to the study cited, "The accuracy, fairness, and limits of predicting recidivism":
https://advances.sciencemag.org/content/4/1/eaao5580
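The two-variable finding is easy to check in spirit: fit a plain logistic regression on just age and number of prior convictions, and compare its accuracy to the COMPAS score. A minimal sketch in Python, assuming a CSV with the column names used in ProPublica's released COMPAS data (`age`, `priors_count`, `two_year_recid`); the file path is a placeholder:

```python
# Sketch of a Dressel & Farid style baseline: 2 features vs. a 137-input black box.
# Column names assume ProPublica's released COMPAS data; "compas.csv" is a placeholder path.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("compas.csv")
X = df[["age", "priors_count"]]   # just two variables: age and number of prior convictions
y = df["two_year_recid"]          # did the person reoffend within two years?

clf = LogisticRegression()
acc = cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
print(f"2-variable logistic regression accuracy: {acc:.2f}")
# Dressel & Farid find that a baseline like this matches the accuracy of COMPAS.
```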
Evergreen recommendation for @random_walker's 21 Definitions of Fairness Tutorial https://twitter.com/math_rachel/status/976591520575897600?s=20
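To make the "many definitions" point concrete, here is a toy sketch (entirely synthetic numbers) evaluating two common criteria, demographic parity and equal false positive rates, on the same predictions; a classifier can satisfy one while clearly failing the other:

```python
# Toy illustration: two fairness definitions can disagree on the same predictions.
# All numbers are synthetic and chosen only to make the conflict visible.
import numpy as np

y_true = np.array([1, 1, 0, 0,  1, 0, 0, 0])
y_pred = np.array([1, 1, 0, 0,  0, 1, 1, 0])
group  = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

for g in ("A", "B"):
    m = group == g
    selection_rate = y_pred[m].mean()          # what demographic parity compares
    fpr = y_pred[m][y_true[m] == 0].mean()     # what equalized odds compares (along with FNR)
    print(f"group {g}: selection rate={selection_rate:.2f}, false positive rate={fpr:.2f}")

# Both groups are selected at the same rate (demographic parity holds),
# but group B's false positive rate is far higher (equalized odds fails).
```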
Even if race & gender are not inputs to your algorithm, it can still be biased with respect to these factors. Machine learning excels at finding latent variables.
I regularly hear people wrongly claim that not using race as an input will prevent racial bias.
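A toy illustration of the proxy problem, with entirely synthetic data and a hypothetical `zip_code` feature standing in for any correlated proxy: the model never sees the race column, yet its predictions still differ sharply by race.

```python
# Synthetic sketch: a protected attribute leaks through a correlated proxy feature.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000
race = rng.integers(0, 2, n)                   # protected attribute, never given to the model
zip_code = (race + (rng.random(n) < 0.2)) % 2  # proxy feature, ~80% correlated with race
label = (race + (rng.random(n) < 0.3)) % 2     # historically biased outcome, correlated with race

model = LogisticRegression().fit(zip_code.reshape(-1, 1), label)  # trained only on the proxy
preds = model.predict(zip_code.reshape(-1, 1))

for g in (0, 1):
    print(f"race group {g}: positive prediction rate = {preds[race == g].mean():.2f}")
# Prints roughly 0.20 vs 0.80: a large racial disparity from a model that never saw race.
```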
Runaway feedback loops are a big issue for machine learning (including for predictive policing and recommendation systems).
Feedback loops can occur whenever your model is controlling the next round of data. The data quickly becomes contaminated by the model.
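A minimal simulation of this dynamic, loosely modeled on the predictive-policing case with made-up numbers: the model chooses where to patrol, incidents are only recorded where patrols go, and the next round of "data" then confirms the model's initial skew.

```python
# Toy runaway feedback loop: the model's decisions generate the data that retrains it.
import numpy as np

rng = np.random.default_rng(0)
true_rate = [0.1, 0.1]             # two districts with IDENTICAL true incident rates
counts = np.array([2.0, 1.0])      # district 0 happens to start with one extra recorded incident

for day in range(1000):
    target = np.argmax(counts)               # "model": patrol wherever past data looks worst
    if rng.random() < true_rate[target]:     # incidents are only observed where we patrol
        counts[target] += 1

print("recorded incidents per district:", counts)
# District 0 ends up with ~100 recorded incidents while district 1 stays at 1,
# even though their true rates are identical: the data is contaminated by the model.
```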
Many examples of bias won't be fixed by gathering different data or features: “Historical bias is a fundamental, structural issue with the first step of the data generation process and can exist even given perfect sampling and feature selection.” @harini824
The concept of "biased data" is often too generic to be useful: https://twitter.com/math_rachel/status/1113203073051033600?s=20
Many AI ethics concerns are about civil rights and human rights.
One way to regulate AI is to consider what rights we want to protect regarding: housing, education, employment, criminal justice, voting, & medical care.
Algorithmic fairness is not the same as justice. https://twitter.com/math_rachel/status/1188889407329193985?s=20
It is important to understand how pervasive unjust bias is. These are just a few of the many, many, many studies on the topic. Racial bias is present in all sorts of data: medical, ads, sales, housing, political, criminal justice, etc.
https://www.nytimes.com/2015/01/04/upshot/the-measuring-sticks-of-racial-bias-.html
Given that humans are biased, why does algorithmic bias matter?
Algorithmic bias matters because:
- Algorithms & humans are used differently
- Machine learning can amplify bias
- Machine learning can create feedback loops
- Technology is power. And with that comes responsibility
Algorithms are used differently than human decision makers:
- people assume algorithms are objective or error-free
- algorithms are more likely to be implemented with no process for recourse
- algorithms are used at scale
- algorithms are cheap
read more: https://www.fast.ai/2018/08/07/hbr-bias-algorithms/
A terrifying example of a city official assuming that ML is always 99% accurate (in reference to the use of IBM Watson for predictive policing): https://twitter.com/math_rachel/status/1121505971140907008?s=20
Some steps towards doing better:
1. Analyze a project at your workplace/school.
2. Work closely with domain experts & those impacted.
3. Increase diversity in your workplace.
4. Advocate for good policy.
5. Be on the ongoing lookout for bias.
Some questions to ask when analyzing an algorithmic system:
Should we even be doing this?
What bias is in the data? (all data is biased; you need to know how)
What are the error rates on different sub-groups? (e.g. Gender Shades; see the sketch after these questions)
Is there an appeals process?
How diverse is the team that built it?
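For the sub-group question, the concrete check is a disaggregated evaluation: compute the same error metrics separately for each group instead of reporting one overall number. A small sketch in the spirit of Gender Shades, with a hypothetical dataframe of predictions and group labels:

```python
# Disaggregated evaluation sketch: per-group error rates instead of one overall accuracy.
# The dataframe and its column names are hypothetical stand-ins for your own results.
import pandas as pd

df = pd.DataFrame({
    "y_true": [1, 0, 1, 1, 0, 0, 1, 0],
    "y_pred": [1, 0, 0, 1, 0, 1, 0, 0],
    "group":  ["A", "A", "A", "A", "B", "B", "B", "B"],
})

for group, sub in df.groupby("group"):
    fp = ((sub.y_pred == 1) & (sub.y_true == 0)).sum()
    fn = ((sub.y_pred == 0) & (sub.y_true == 1)).sum()
    print(f"{group}: false positive rate={fp / (sub.y_true == 0).sum():.2f}, "
          f"false negative rate={fn / (sub.y_true == 1).sum():.2f}")
# Overall accuracy can look fine while one group's error rate is several times
# higher than another's (the core Gender Shades finding).
```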
To improve diversity, start at the opposite end of the pipeline: your workplace.
Improve the experience of the women of color who are already there so they don't leave due to discrimination & mistreatment.
read more: http://bit.ly/not-pipeline and http://bit.ly/women-quit-tech
Even though Eric Schmidt wants us to stop "yelling" about bias, I continue because real people are being harmed, and: https://twitter.com/math_rachel/status/1188946996029083648?s=20