@timnitGebru identified a huge problem w/artificial intelligence today: English text generators (language models) are built using online text from sources like Twitter. But there's so much abuse towards women, people of color & queer people on Twitter. So we participate less. https://twitter.com/timnitgebru/status/1381366153109508097
Posters who write sexist, racist, or homophobic tweets are enabled to tweet more. These tweets get fed into English text generators (like OpenAI's GPT-3) & produce sexist, racist, and homophobic text that is difficult to moderate.
This is a big AI problem in the tech industry because some companies want to pretend that this problem doesn't exist, and that English text generators like OpenAI's GPT-3 are incapable of showing bias. (This article investigates GPT-3: https://towardsdatascience.com/is-gpt-3-islamophobic-be13c2c6954f)
Gebru and her co-authors show that English text generators need human curation that is culturally & historically sensitive: http://faculty.washington.edu/ebender/papers/Stochastic_Parrots.pdf
I personally believe that the tech industry pretends this problem doesn't exist because it would cost money & shift priorities.
No matter what the intent of tech companies is, the result is the same: English text generators like GPT-3 are not getting less racist, sexist, or homophobic from being exposed to more data from online interactions like tweets, because the interactions themselves encode societal biases.
Gebru and her co-authors push AI research to focus on data curation & responsibility instead of vacuuming up all the data on the internet. It's being selective about what you choose, like a librarian. This has cultural ramifications that tech companies don't want responsibility for.
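To make "curation instead of vacuuming" concrete, here's a minimal toy sketch (mine, not from the paper) of screening a text corpus before training. The `BLOCKLIST` terms and `curate` function are hypothetical placeholders; real curation is far harder than keyword matching, which is exactly why the paper argues for culturally & historically informed human judgment rather than simple filters.

```python
# Toy illustration of dataset curation: drop documents that contain
# blocklisted terms before they ever reach model training.
# "badword" stands in for actual slurs; this is a placeholder.
BLOCKLIST = {"badword"}

def curate(corpus):
    """Keep only documents that pass a (very crude) screening step."""
    kept = []
    for doc in corpus:
        words = set(doc.lower().split())
        if words & BLOCKLIST:
            continue  # drop documents containing blocklisted terms
        kept.append(doc)
    return kept

corpus = ["a perfectly fine sentence", "this one contains badword here"]
print(curate(corpus))  # → ['a perfectly fine sentence']
```

A keyword filter like this is the easy part; the hard, expensive part is the human review the paper calls for, which is one reason companies resist it.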
There were dire consequences for Gebru & her fellow researchers at Google for publishing "On the Dangers of Stochastic Parrots". Gebru & Margaret Mitchell were fired from Google. Google launched a campaign against them after they spoke up: https://www.theverge.com/22309962/timnit-gebru-google-harassment-campaign-jeff-dean
Google's harsh overreaction to this paper has rocked the tech industry, artificial intelligence research, academia, & people of color in tech. It feels like the era of AI exuberance that we were in during the 2010s is over & now we're looking to rebuild from the wreckage.
Thanks everyone for the likes and retweets! I like to try to explain current tech industry & computer science problems in an accessible way, and it makes me happy to see that this thread is accomplishing that!