Introducing today to the world: “The Observatory of Anonymity”, spanning 89 countries and allowing you to test your degree of anonymity when sharing data online.
https://cpg.doc.ic.ac.uk/observatory/ 

All statistical models run entirely in the browser; we don't collect any personal data. (1/n)
Two years ago, we published our findings in Nature Comms showing that even heavily sampled, anonymous datasets can be re-identified. The code—a mixture of Fortran, Julia, and Python—is not easy to train and run from your phone. (2/n) https://twitter.com/cynddl/status/1153711987878223873
Have you ever thought “That's a lot of information in this form for it to be truly anonymous”? The Observatory estimates the probability that your profile would be correctly re-identified in ‘anonymised’ data. The more information you enter, the more likely you are to be unique. (3/n)
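To make "more information, more unique" concrete, here is a minimal sketch that simply counts unique records in a made-up table. It is not the Observatory's statistical model; the `PEOPLE` records and the `unique_fraction` helper are hypothetical and only for illustration.

```python
from collections import Counter

# Hypothetical toy records: (birth_year, postcode, gender)
PEOPLE = [
    (1985, "SW7", "F"),
    (1985, "SW7", "M"),
    (1985, "N1", "F"),
    (1990, "SW7", "F"),
    (1990, "N1", "M"),
    (1990, "N1", "M"),
]

def unique_fraction(records, n_attrs):
    """Share of records whose first n_attrs attributes match nobody else."""
    keys = [r[:n_attrs] for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)

# Reveal one more attribute at a time and watch the unique share grow.
for n in (1, 2, 3):
    print(n, "attribute(s):", round(unique_fraction(PEOPLE, n), 2))
```

In this toy table nobody is unique on birth year alone, a third of the records are unique once the postcode is added, and two thirds are unique with all three attributes.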
Companies and governments both routinely collect and use our personal data. Anonymised data falls outside of the scope of modern data protection regulations, such as CCPA or GDPR (Recital 26). But what makes data anonymous? (4/n)
Many have downplayed the risk of re-identification by arguing that the data they collect are always incomplete. They argue that identifying the correct person in anonymous data is difficult, since one might have re-identified someone else with the same characteristics. (5/n)
The Observatory demonstrates that combining a few pieces of basic information can be enough to identify people correctly in the complete population. Not much information is needed to go from anonymous back to personal data. (6/n) https://twitter.com/doctorow/status/1384884730575917057
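The "incomplete data" argument can be sketched with plain counting: a record that looks unique in a released sample is a correct match only if it is also unique in the full population. The snippet below is a toy illustration under that assumption, not the model the Observatory runs; `POPULATION`, `SAMPLE`, and `correctness` are hypothetical names.

```python
from collections import Counter

# Hypothetical full population: (birth_year, postcode)
POPULATION = [
    ("1985", "SW7"),
    ("1985", "SW7"),  # shares every attribute with the person above
    ("1990", "N1"),
    ("1972", "E2"),
]

# Hypothetical incomplete release: one of the two 1985/SW7 people is missing.
SAMPLE = POPULATION[1:]

def correctness(population, sample):
    """Among sample-unique records, the share that are also population-unique."""
    pop_counts = Counter(population)
    sample_counts = Counter(sample)
    sample_unique = [r for r in sample if sample_counts[r] == 1]
    if not sample_unique:
        return None  # nothing in the sample looks unique
    correct = sum(1 for r in sample_unique if pop_counts[r] == 1)
    return correct / len(sample_unique)

print(correctness(POPULATION, SAMPLE))
```

Here all three sampled records look unique, but only two of them are unique in the population, so a match on the 1985/SW7 record could point at the wrong person.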
What can you do on the Observatory? First, there's a quiz, which I'm gonna let @doctorow present: (7/n) https://twitter.com/doctorow/status/1384884734048755712
Then, you can explore the correctness of re-identification in other countries (89!) using various combinations of demographic data.

We trained our models using census data from @ipumsi and @Statbel_en. If you want to add other countries, pm me. (8/n) https://cpg.doc.ic.ac.uk/observatory/explore
Finally, the gem: you can upload your own datasets (as CSV files) and your phone or computer will silently crunch numbers and train our model in real time using Fortran code ported to JavaScript. 🤯 (9/n)
This project has been made possible thanks to the good work of my colleagues Sundar (who's behind all the JavaScript madness) and @yvesalexandre at @imperialcollege and Julien Hendrickx at @UCLouvain_be, with support from @ICOnews and kind data access from @Statbel_en. (10/n)
The Observatory will appear in the proceedings of @TheWebConf soon! https://rocher.lc/observatory-www21.pdf

We accept kind citations of any sort; my grant applications love them. (11/n)
You can follow @cynddl.