**#Tweetorial**

Today, I presented at @SAA2020 on the use of compliance thresholds in intensive longitudinal data. This thread is for those who couldn't make it.

⚠️Warning⚠️
This may upset ambulatory assessment researchers.

(1)
Ecological momentary assessment studies ask ppl to complete the same questions over & over again in their daily lives. Often 40-100+ times

Answering these questions isn’t for the faint of ❤️ & often leaves a large % of missing data for some participants

🤫Even me, when trialing my own studies

(2)
There are many different flavors of missingness:

1-Missing completely due to chance (missing completely at random)
2-Missing due to a variable observed in the study (missing at random)
3-Missing due to something not observed in the study, including the missing value itself (missing not at random)
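A rough Python sketch (not from the talk) of what these three mechanisms look like in simulated EMA data; the `mood` & `stress` variables are made up purely for illustration:

```python
# Hypothetical illustration: simulate MCAR, MAR, & MNAR missingness
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 1000  # prompts pooled across participants

mood = rng.normal(size=n)     # the item we care about
stress = rng.normal(size=n)   # another item observed in the study
df = pd.DataFrame({"mood": mood, "stress": stress})

# MCAR: every prompt has the same 30% chance of being skipped
mcar = rng.random(n) < 0.30

# MAR: skipping depends on an *observed* variable (stress)
mar = rng.random(n) < 1 / (1 + np.exp(-(stress - 0.5)))

# MNAR: skipping depends on the unobserved value itself (low mood -> skipped)
mnar = rng.random(n) < 1 / (1 + np.exp(mood))

df["mood_mcar"] = df["mood"].mask(mcar)
df["mood_mar"] = df["mood"].mask(mar)
df["mood_mnar"] = df["mood"].mask(mnar)

# MNAR (and sometimes MAR) shifts the observed mean; MCAR does not on average
print(df.filter(like="mood").mean())
```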

(3)
Many of you are probably already aware that it's problematic to throw out an entire record b/c a single entry is missing (i.e. listwise deletion).

Listwise deletion:
⛔️Lowers power
⛔️Biases parameter estimates
⛔️Affects standard error estimation
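Another rough sketch (not the talk's analysis) of what listwise deletion does under MAR-style missingness; all numbers here are made up:

```python
# Hypothetical illustration: listwise deletion under MAR-style missingness
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n = 5000
stress = rng.normal(size=n)
mood = 0.5 * stress + rng.normal(size=n)
df = pd.DataFrame({"stress": stress, "mood": mood})

# mood is skipped more often when stress is high (depends on observed stress)
df.loc[rng.random(n) < 1 / (1 + np.exp(-stress)), "mood"] = np.nan

complete = df.dropna()  # listwise deletion keeps only fully observed rows
print(f"rows kept: {len(complete)} / {n}")                    # power drops
print("full-sample mean stress:", round(df["stress"].mean(), 3))
print("complete-case mean stress:", round(complete["stress"].mean(), 3))  # shifted
```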

(4)
However, in 2020 the "recommended" practice in ambulatory assessment is to use and justify compliance thresholds.

🤔Hmm, so we "should":
Not just throw out incomplete observations, but also throw out many completed rows of data

Data that ppl painstakingly responded to?
😵
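To make that concrete, here's an illustrative sketch with a hypothetical 80% cutoff applied to simulated 50-day data; the compliance rates are made up:

```python
# Hypothetical illustration: how many *completed* rows an 80% cutoff discards
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_people, n_prompts = 176, 50
compliance = rng.beta(4, 2, size=n_people)  # person-level response rates (made up)

long_df = pd.DataFrame({
    "id": np.repeat(np.arange(n_people), n_prompts),
    "answered": rng.random(n_people * n_prompts) < np.repeat(compliance, n_prompts),
})

threshold = 0.80  # a common, arbitrary compliance cutoff
rate = long_df.groupby("id")["answered"].mean()
kept_ids = rate[rate >= threshold].index

completed = long_df[long_df["answered"]]
discarded = completed[~completed["id"].isin(kept_ids)]
print(f"completed responses discarded by the cutoff: {len(discarded)} of {len(completed)}")
```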

(5)
🚩Systemic problem🚩:
Excluding those struggling the most

E.G. I study depression.

Those with ⬆️depression are probably less likely to be able to answer all of these prompts.

Using these arbitrary thresholds systematically eliminates many of the people I want to study

(6)
In a 50-day daily diary study in a sample of 176 people:

I studied whether I could predict who would have high compliance based on individual differences in personality, affect, and psychopathology traits using #MachineLearning

(7)
I could predict those with high compliance w/ good accuracy (AUC = 0.78)

Those in the top half of predicted scores were 15x more likely to be “compliant” than those in the bottom half.
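For illustration only (this is *not* the paper's actual pipeline), here's a generic cross-validated classifier run on simulated baseline traits; the AUC/15x figures above come from the real data, the numbers this sketch prints do not:

```python
# Hypothetical illustration: predicting "high compliance" from baseline traits
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 176
X = rng.normal(size=(n, 10))  # stand-ins for personality/affect/psychopathology
# fake "high compliance" label loosely tied to the first two features
y = (X[:, 0] - X[:, 1] + rng.normal(scale=1.5, size=n)) > 0

clf = RandomForestClassifier(n_estimators=500, random_state=0)
prob = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]
print("cross-validated AUC:", round(roc_auc_score(y, prob), 2))

# "top half vs. bottom half of predictions" comparison, as in the thread
top = prob >= np.median(prob)
print("compliance rate, top half:", round(y[top].mean(), 2),
      "| bottom half:", round(y[~top].mean(), 2))
```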

(8)
This means that data are *NOT* missing due to chance.

When we throw out ppl’s responses by using compliance thresholds, we are introducing bias as to who is included in our studies

(9)
Appropriate methods can handle a ⬆️% of missing data, even in complicated multivariate models with high lags (70%+ here):
https://link.springer.com/article/10.3758%2Fs13428-018-1101-0

In my simulations, even w/ 90%+ missing data you can achieve:
✅Good point estimates
✅Good SEs

More important than % missing:
# of complete observations
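A minimal sketch of the "use every answered prompt" alternative, via a multilevel model in statsmodels; this is not the simulation from the talk, & all data below are simulated:

```python
# Hypothetical illustration: fit a multilevel model to every answered prompt
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_people, n_prompts = 176, 50
person = np.repeat(np.arange(n_people), n_prompts)
intercepts = np.repeat(rng.normal(scale=0.5, size=n_people), n_prompts)
stress = rng.normal(size=n_people * n_prompts)
mood = 1.0 + 0.4 * stress + intercepts + rng.normal(size=n_people * n_prompts)
df = pd.DataFrame({"id": person, "stress": stress, "mood": mood})

# heavy, person-varying nonresponse (up to ~90% missing for some people)
p_miss = np.repeat(rng.uniform(0.2, 0.9, size=n_people), n_prompts)
df.loc[rng.random(len(df)) < p_miss, "mood"] = np.nan

observed = df.dropna(subset=["mood"])  # keep every answered prompt, every person
model = smf.mixedlm("mood ~ stress", observed, groups=observed["id"]).fit()
print(model.params)  # the stress slope should land near the true 0.4
```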

(10)
I am calling for researchers to 🛑STOP🛑 this practice!

Do: Use all the observations in the study from all participants!
Do: Use appropriate model-based missingness strategies
Do: Use passive sensing to give you more information about participants

(11)
Reviewers/Editors, you 🛑 too!

🚫 Don't: Penalize studies that find ⬆️missingness

High missingness can occur naturally based on the population being studied. Evaluating study quality by % missingness WILL bias the literature.

👍 Do: Ask for all persons to be included in analyses

(12)
My slides are all available here:
http://www.nicholasjacobson.com/talk/saa2020_compliance_thresholds/

Paper & more data will be coming soon to a journal near you.

Thanks to all who have already suffered through my Twitter rants, including @FallonRGoodman, @aidangcw, @aaronjfisher, among others

(/end rant)
#SAA2020aus