1/ I wrote in @Wired that the rush to capitalize on #BigData to solve #COVID19 may lead to deeply unjust outcomes for people living in poverty and other marginalized populations. This thread discusses additional research and latest developments: https://www.wired.com/story/big-data-could-undermine-the-covid-19-response/
2/ ICYMI “data for good” projects seek public health insights from changes in ppl’s movements and behaviors over time. At least 2 problematic assumptions here: a) location data is a reliable proxy of people’s movements; and b) people’s movements/behaviors ≡ disease spread
3/ #COVID19 is not the first public health crisis that has created what @seanmmcdonald calls a “disaster market” for big data. The Ebola outbreak triggered numerous requests for call detail records to help build predictive models of disease spread: https://github.com/cis-india/papers/raw/master/CIS_Papers_2016.01_Sean-McDonald.pdf
6/ The assumption that online / in-app behaviors are a reliable predictor of disease spread is also questionable. The now-defunct Google Flu Trends, modeled on search query patterns, was widely panned for “persistently overestimating flu prevalence”: https://science.sciencemag.org/content/343/6176/1203
7/ “Data for good” projects do not stop at location data. @facebook's disease prevention maps include a “social connectedness index” that shows “friendship” patterns and where those hardest hit by #COVID19 “might seek support.” https://about.fb.com/news/2020/04/data-for-good/
8/ Boggles the mind that @Facebook would try to quantify “friendship” with data. Also suggests that they use non-location data - the carefully worded release references “aggregated data” but not location data. But what other data are they using, and how?
9/ Certain maps are only shared with FB partners, so we will never know. Many “data for good” initiatives are secretive and profoundly undemocratic. When they involve agreements with govs, hard to FOI or challenge because "trade secrecy." https://ainowinstitute.org/aiareport2018.pdf
13/ Data/AI for Good is not unique to the public health space. @latonero has scrutinized Big Tech’s growing interest in solving complex global problems with big data, from humanitarian aid delivery to wildlife poaching: https://www.wired.com/story/opinion-ai-for-good-is-often-bad/
14/ Good intentions do not erase social biases from large datasets, which can elude even the most scrupulous programmers. @s010n & @aselbst's 2016 paper on Big Data’s Disparate Impact provides an excellent rundown of the ways that big data discriminates: http://www.californialawreview.org/wp-content/uploads/2016/06/2Barocas-Selbst.pdf
15/ Good intentions also do not erase privacy concerns. @CDCgov's partnership with the “mobile advertising industry” relies on anonymized data, but it is well known that anonymization merely delays, rather than prevents, unmasking. @atsaraharrison @themarkup has more: https://themarkup.org/ask-the-markup/2020/03/24/when-is-anonymous-not-really-anonymous
17/ How will these initiatives play out in the Global South? Big data hubris during the Ebola outbreak had shades of digital colonialism. @chinmayiarun has examined how the broader #AI / #BigData discourse entrenches North-South and intra-South inequalities: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3403010
18/ In particular, “The elite that govern the countries from which the [data] extraction takes place are often complicit in this extraction. The burden of the extraction is borne by the disenfranchised.” https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3403010
You can follow @AmosToh.