Since the earliest days of digital legal records, redaction failures have been a source of perpetual mirth and chaos. The most common failure is simply adding black boxes over text in PDFs; the underlying text is still in the file and can be recovered by selecting and copying it.

1/
This 2011 study by @binarybits for @recapthelaw reveals how widespread the problem was a decade ago:

https://freedom-to-tinker.com/2011/05/25/studying-frequency-redaction-failures-pacer/

3/
It's only gotten worse since. Better redaction systems - blurring and pixelation - turn out to be vulnerable to machine learning attacks that unblur these elements:

http://arxiv.org/pdf/1609.00408v2.pdf

4/
Within a few hours, journalists at @Slate had reversed many of these redactions! Their secret weapon was the deposition's index, which was also redacted, but which nevertheless served as a key for uncovering the masked-out names.

7/
For example: the journalists saw that a redacted word that fell alphabetically between "client" and "clock" appeared on several pages. That told them it was a name that starts with "Cl." But only SOME instances of that name had been redacted.

8/
On page 135, line 7, that name appears in the clear: "President Clinton." Now we know that everywhere that name is redacted, it can be unmasked as "President Clinton."
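
A minimal sketch of that alphabetical-bracketing step, assuming the unredacted portions of the document are available as plain text. The function name `candidates_between` and the sample string are made up for illustration; this is not the journalists' actual tooling.

```python
import re

def candidates_between(lower, upper, cleartext):
    """Return unredacted words that sort strictly between two index neighbors.

    `lower` and `upper` are the visible index entries on either side of the
    redacted entry; `cleartext` is whatever text was left unredacted.
    """
    lo, hi = lower.lower(), upper.lower()
    words = {w.lower() for w in re.findall(r"[A-Za-z]+", cleartext)}
    return sorted(w for w in words if lo < w < hi)

# Invented word sequence, for illustration only (not text from the deposition).
sample = "client close President Clinton clock"
print(candidates_between("client", "clock", sample))
```

Any word that survives the filter is only a candidate; it still has to be checked against the page and line numbers the index gives for the redacted entry.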

9/
A similar method revealed the places where Alan Dershowitz's name had been blacked out: a word that comes between "Airport" and "Alcohol" appears before a word that comes between "Depth" and "Describe" on several pages.

10/
The inference that the A-word is "Alan" and the D-word is "Dershowitz" is validated through context.

A related technique reveals the blacked-out instances of Prince Andrew's name.
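
The two-word version of the attack can be sketched the same way: look for adjacent words in the unredacted text where the first falls inside one index gap and the second falls inside another. The helper name `bracketed_pairs` and the sample string are hypothetical, and this is only a sketch of the idea, not the method Slate actually used.

```python
import re

def bracketed_pairs(cleartext, first_gap, second_gap):
    """Find adjacent word pairs (a, b) where a sorts strictly inside
    first_gap and b sorts strictly inside second_gap.

    Each gap is a (lower, upper) tuple of visible index neighbors.
    """
    words = re.findall(r"[A-Za-z]+", cleartext)
    hits = set()
    for a, b in zip(words, words[1:]):
        in_first = first_gap[0].lower() < a.lower() < first_gap[1].lower()
        in_second = second_gap[0].lower() < b.lower() < second_gap[1].lower()
        if in_first and in_second:
            hits.add((a, b))
    return sorted(hits)

# Invented word sequence, for illustration only (not text from the deposition).
sample = "alcohol Alan Dershowitz airport depth Alan described"
print(bracketed_pairs(sample, ("Airport", "Alcohol"), ("Depth", "Describe")))
```

As the thread says, a match like this is an inference that still needs to be validated through context.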

11/
All in all, the journalists de-redacted mentions of 15 people, from Chelsea Clinton to Marvin Minsky to Kevin Spacey to Al Gore. Note that their presence in this record is not proof of their direct complicity in sex-crimes.

12/
Epstein's method involved mixing legitimate business (particularly scientific research) with child rape in ways that blended people who suspected his crimes, knew of his crimes, and participated in his crimes, all together in a jumble of varying complicity and knowledge.

13/
I don't know if we'll ever know the full truth of the crimes committed (and abetted) by wealthy, powerful people.

14/
But this de-redaction attack is noteworthy irrespective of the Epstein case. In some ways, it militates for a heavier hand in redaction, blocking all instances of a term (even those that don't reveal sensitive info) and/or redacting indexes.
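
A rough sketch of what "blocking all instances of a term" could look like on plain text, assuming the redactor already has a list of sensitive terms. The name `redact_all` and the fixed `[REDACTED]` mask are illustrative choices, not a description of how court redactions are actually produced; the same pass would have to cover the index too, or its alphabetical position still leaks.

```python
import re

def redact_all(text, terms, mask="[REDACTED]"):
    """Replace every whole-word occurrence of every term, ignoring case.

    A fixed-width mask is used so the length of the hidden word is not leaked.
    """
    if not terms:
        return text
    # Longest terms first, so a multi-word term wins over a shorter overlap.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        "|".join(r"\b" + re.escape(t) + r"\b" for t in ordered),
        re.IGNORECASE,
    )
    return pattern.sub(mask, text)

# Invented example text with a placeholder name.
print(redact_all("Ask Smith; smith agreed.", ["Smith"]))
```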

15/
As to the Maxwell deposition, the Slate journalists are seeking help in reversing the remaining redactions in the document.

eof/
You can follow @doctorow.