Two new developments in #differentialprivacy for the #2020census
(thread)
NEW report from the JASON brain trust evaluates DP
1. Is re-identification a realistic threat?
2. Has Census properly assessed re-id vulnerability?
3. What are trade-offs between privacy-loss and accuracy & how can those be managed?
https://www2.census.gov/programs-surveys/decennial/2020/program-management/planning-docs/privacy-methods-2020-census.pdf
Some of the findings are as expected
Re-ID is a risk, and DP could be helpful
BUT
"At some proposed levels of confidentiality protection, and especially for small populations, census block-level data become noisy and lose statistical utility"
Some of the findings are... quite blunt about challenges encountered in the process
One of the recommendations is specific. Report LESS but with more ACCURACY:
"Evaluate the trade-offs between re-identification risk and data utility arising from publishing fewer tables (e.g. none at the block-level) but at larger values of the privacy-loss parameter"
There's more, but it's 142 pages. If you're following #differentialprivacy for #2020census I'd recommend reading it. The report does a nice job of summarizing the competing needs of preventing re-ID *and* providing data to the public that is fit-for-use.
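For anyone who wants to see why small populations get hit hardest, here is a rough sketch of the privacy-loss vs. accuracy trade-off. It assumes the textbook Laplace mechanism on a single count with sensitivity 1, which is NOT the Census Bureau's actual algorithm; the function names, epsilon values and population sizes are made up for illustration.

```python
# Illustrative only: textbook Laplace mechanism on one count query.
# This is an assumption for the sake of the example, not the Census
# Bureau's production method.

def expected_abs_noise(epsilon, sensitivity=1.0):
    """Expected |noise| for Laplace noise with scale sensitivity / epsilon.

    For a Laplace(0, b) variable, E|X| = b.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def expected_relative_error(true_count, epsilon):
    """Expected absolute noise as a fraction of the true count."""
    if true_count <= 0:
        raise ValueError("true_count must be positive")
    return expected_abs_noise(epsilon) / true_count


if __name__ == "__main__":
    # Hypothetical populations: a tiny block vs. a large county.
    for epsilon in (0.1, 1.0, 4.0):
        for label, count in (("block", 10), ("county", 100_000)):
            rel = expected_relative_error(count, epsilon)
            print(f"epsilon={epsilon:<4} {label:<7} n={count:<7} "
                  f"expected relative error={rel:.4%}")
```

The same amount of noise that is negligible for a large county can be as big as the count itself for a small block. Raising the privacy-loss parameter (epsilon) shrinks the noise, which is the lever behind "fewer tables, larger epsilon."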
Also, if you didn't already know, in addition to the JASON report, the Census Bureau has engaged an expert group* through @theNASEM to help wade through these thorny issues.
(*It's quite a cast of characters. They even let me in.)
The OTHER news is that #2020census has made an appearance in the latest House coronavirus relief bill.
The first bit extends reporting deadlines.
https://docs.house.gov/billsthisweek/20200511/BILLS-116hr6800ih.pdf (starts page 752)
But the #differentialprivacy news here is that this version of the bill also includes a comment about data QUALITY for #2020census
https://docs.house.gov/billsthisweek/20200511/BILLS-116hr6800ih.pdf (page 753)
What does this mean for #2020census?
Unclear.
The bill may not move forward in its current form.
The language may only pertain to elements (like state population) that were already expected to be unadjusted by #differentialprivacy
BUT the bill language (if enacted) may also affect population data for redistricting, which would have dramatic implications for #differentialprivacy
So... stay tuned.
You can follow @DataGeekB.