New preprint from @andyguess and me!

We attempt to unify disparate literatures and establish "digital literacy" as a key concept for online social science.

Further, we find that MTurk is uniquely bad in its lack of low-digital-literacy respondents.

1/n https://osf.io/3ncmk 
Theoretically, we argue that "audience capacity" is relatively unimportant for the broadcast technologies (television and radio) that dominate our intuitions about media

Those are "easy" to consume; the internet and social media are much harder

(yes there are graphs coming)

2/n
We define "digital literacy" as "online information discernment combined with the basic digital skills necessary to attain it"

And we think this is the key measure of audience capacity for social media.

3/n
We use three survey batteries designed to measure some elements of digital literacy:

@eszter's 21-question digital literacy battery (Hargittai 2009)

The "Power User" scale from Sundar and Marathe (2010)

A novel "Low End" scale to capture variation in low-DL users

4/n
We conduct a "horse race" between these scales (plus metadata on browser/OS/version) in their ability to differentiate between two "purposive samples" (a sketch of the idea follows below):

High DL: tech company employees
Low DL: students in intro computer classes at the Brooklyn or Princeton Public Libraries

5/n
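(A minimal sketch of the horse-race idea, with made-up file and column names rather than the paper's actual pipeline: score each battery by how well it alone classifies respondents into the high- vs low-DL purposive sample.)

```python
# Horse race sketch: which scale best separates the two purposive samples?
# "purposive_samples.csv" and all column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("purposive_samples.csv")
y = (df["sample"] == "high_dl").astype(int)   # 1 = tech employees, 0 = library class

scales = {
    "hargittai_2009": ["hargittai_score"],
    "power_user":     ["power_user_score"],
    "low_end":        ["low_end_score"],
}

for name, cols in scales.items():
    auc = cross_val_score(
        LogisticRegression(max_iter=1000),
        df[cols], y, cv=5, scoring="roc_auc",
    ).mean()
    print(f"{name}: cross-validated AUC = {auc:.2f}")
```

AUC works as the yardstick here because it measures separation between the two groups without committing to a classification threshold.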
For comparison, we also recruited larger samples from three standard sources of online subjects:

MTurk

Facebook ads

Lucid (nationally representative on age, gender, ethnicity, and region)

The distribution of all three measures is quite different across samples!

6/n
But most striking is the age distribution.

This project was inspired by the evidence of massive differences in fake news exposure/sharing by older Americans.

MTurk is easily the worst in terms of age representativeness. Next we show that this is actually a problem.

7/n
The Facebook and Lucid samples provide evidence of a strong negative correlation between age and digital literacy.

In the MTurk sample, however, there is *NO* such relationship. The average 20-year-old and 70-year-old MTurker have the same digital literacy.

8/n
We asked people to perform an "information verification" task during the survey: to "cheat" and look up the answers to questions like "Who is the Prime Minister of Croatia?"

Older FB users were worse at it...but older MTurkers were *better*

9/n
In table form, with a control for measured digital literacy, we see that the estimated effect of age on correct information retrieval is *exactly the opposite* in the two samples.

The MTurk sample produces a statistically significant result in the wrong direction

10/n
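(For the curious, the table's specification looks something like the sketch below; the file, column names, and the logit link are assumptions on my part, not necessarily the paper's exact model.)

```python
# One regression per recruitment source; outcome = answered the
# verification question correctly, controlling for measured DL.
# "survey_responses.csv" and the column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("survey_responses.csv")
for source, frame in df.groupby("source"):    # e.g. "mturk", "facebook", "lucid"
    fit = smf.logit("correct ~ age + dl_score", data=frame).fit(disp=0)
    print(f"{source}: age coefficient = {fit.params['age']:+.3f} "
          f"(p = {fit.pvalues['age']:.3f})")
```

The headline is just the sign of the age term flipping between the MTurk and Facebook subsets once digital literacy is in the model.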
Here's a DAG of why

It's conditioning on a collider!
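(Here's the collider logic in a toy simulation; every number is invented and only illustrates the mechanism, not the paper's estimates. In the simulated population, DL declines with age, but getting onto MTurk takes enough digital skill that the effective bar rises with age, so age and DL both feed the selection node.)

```python
# Collider simulation: conditioning on "is an MTurker" erases a real
# negative age-DL relationship. All parameters are made up.
import numpy as np

rng = np.random.default_rng(0)
n = 500_000

age = rng.uniform(20, 70, n)
dl = 85 - 0.4 * age + rng.normal(0, 12, n)    # DL truly falls with age

# Invented selection rule: you land on MTurk only if your DL clears a
# bar that rises with age (older users must be unusually tech-savvy).
on_mturk = dl + rng.normal(0, 6, n) > 45 + 0.55 * age

pop = np.corrcoef(age, dl)[0, 1]
sel = np.corrcoef(age[on_mturk], dl[on_mturk])[0, 1]
print(f"corr(age, DL), population: {pop:+.2f}")   # clearly negative
print(f"corr(age, DL), MTurkers:   {sel:+.2f}")   # much weaker, near zero
```

Selection alone takes the correlation from clearly negative in the population to roughly zero among the simulated MTurkers: the pattern from tweet 8 above.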
To conclude, we look at the individual components of the three measures of digital literacy and see which best predict membership in one of the two purposive samples (sketch below).

We thus propose a hybrid measure, one that we have reason to think will be high in temporal validity

11/n
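(A sketch of that item-level comparison, again with hypothetical file and column names: rank every battery item by how well it alone separates the two purposive samples, then keep the most discriminating items as a candidate hybrid scale.)

```python
# Rank individual battery items by how well each separates the
# high-DL and low-DL purposive samples. All names are hypothetical.
import pandas as pd
from sklearn.metrics import roc_auc_score

df = pd.read_csv("purposive_samples.csv")
y = (df["sample"] == "high_dl").astype(int)

item_cols = [c for c in df.columns if c.startswith("item_")]
aucs = {c: roc_auc_score(y, df[c]) for c in item_cols}

# Distance from 0.5, so reverse-coded items still count as informative.
ranked = sorted(item_cols, key=lambda c: abs(aucs[c] - 0.5), reverse=True)
print("candidate hybrid battery:", ranked[:10])
```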
Thanks for reading! Any comments much appreciated.

As more of our lives take place online, we need to appreciate the extent of the variance in digital literacy in the US today. Academics/journalists are all way off on the high end, and we need to get outside our bubbles

12/12