💡Econometrics Thread💡 Identification, Inference and Sensitivity Analysis for Causal Mediation Effects by Imai, Keele and Yamamoto (IKY, 2010, https://bit.ly/3emZ4Ag )

Mediation analysis decomposes a treatment effect into different causal mechanisms.
It is a hard problem in social sciences and health sciences. IKY offer a simple nonparametric solution under a sequential exogeneity assumption. But before we discuss all the details, let's think about an example to fix ideas. (This is not their empirical application.)
Imagine a new educational program for high schoolers, e.g., extra Math classes. Does it impact labor earnings 10 years after graduation? A better understanding of Math may have an impact on productivity and increase labor earnings directly.
It may also increase the probability of college attendance, increasing labor earnings indirectly. It is important to separately identify both types of effects. For example, if the first type is small, we may want to modify the content of the classes.
We denote attending the extra Math classes as T, attending college as M, and labor earnings as Y. Instead of the usual 2 potential outcomes, we now have 6 potential variables, which is a challenge in itself: M(0), M(1), Y(t=0, m=0), Y(t=0, m=1), Y(t=1, m=0), Y(t=1, m=1).
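To fix ideas even further, here is a minimal Python sketch of those six potential variables in a made-up population. Every number in it is an illustrative assumption of mine, not something from IKY. It uses the standard definitions: the indirect effect at t is E[Y(t, M(1)) - Y(t, M(0))] and the direct effect at t is E[Y(1, M(t)) - Y(0, M(t))].

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # hypothetical population size

# Potential college attendance without and with the extra Math classes.
m0 = rng.binomial(1, 0.4, size=n)                   # M(0)
m1 = np.maximum(m0, rng.binomial(1, 0.3, size=n))   # M(1) >= M(0), an illustrative choice

# Potential earnings Y(t, m); the coefficients are made up for illustration.
noise = rng.normal(0.0, 1.0, size=n)

def y(t, m):
    return 20.0 + 2.0 * t + 5.0 * m + 1.0 * t * m + noise

def y_at(t, m_potential):
    """Y(t, M(t')), where m_potential holds each person's M(t')."""
    return np.where(m_potential == 1, y(t, 1), y(t, 0))

m = {0: m0, 1: m1}
# Indirect effect: hold treatment at t, move the mediator from M(0) to M(1).
indirect = {t: np.mean(y_at(t, m1) - y_at(t, m0)) for t in (0, 1)}
# Direct effect: hold the mediator at M(t), switch treatment from 0 to 1.
direct = {t: np.mean(y_at(1, m[t]) - y_at(0, m[t])) for t in (0, 1)}
# Total effect of the program.
total = np.mean(y_at(1, m1) - y_at(0, m0))

print("indirect:", indirect)
print("direct:", direct)
print("total:", total)
```

In this simulated world we see all six variables, so the total effect splits exactly as indirect(1) + direct(0), or equivalently indirect(0) + direct(1). With real data we only observe one M and one Y per person, which is why identifying assumptions are needed.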
IKY's identifying assumption constrains all those variables. Its first part (equation 4 in the paper, which I will call Assumption 4) says that attending extra classes is independent of potential college attendance and potential earnings. If we have an RCT where we randomize who can enroll in this course, we can ensure Assumption 4.
Assumption 5 is the tricky one, even in an RCT. It says that, conditional on attending extra Math classes, attending college is independent of potential labor earnings. This assumption seems concerning: for example, unobserved motivation could plausibly drive both college attendance and earnings.
IKY are fully aware of this problem and propose a sensitivity analysis in Section 5. Although I am a fan of sensitivity analyses, I will not discuss it here because it would require heavier notation. So, for now, let's assume that Assumption 5 holds.
Under Assumptions 4 and 5, IKY can nonparametrically identify the direct and the indirect effects. The formulas look somewhat scary, but IKY provide an R package ( https://bit.ly/36uLuZ4 ) that implements everything easily.
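To make the formulas a bit less scary, here is a minimal Python sketch of the plug-in version in the simplest case: binary T, binary M and no covariates. This is my own simplification of the identification result, not the R package, and the function name `plug_in_effects` and the simulated data are hypothetical. In this case the indirect effect at t is the sum over m of E[Y | T=t, M=m] times [P(M=m | T=1) - P(M=m | T=0)], and the direct effect at t is the sum over m of (E[Y | T=1, M=m] - E[Y | T=0, M=m]) times P(M=m | T=t).

```python
import numpy as np

def plug_in_effects(t, m, y):
    """Plug-in indirect and direct effects for binary T and M, no covariates."""
    t, m, y = np.asarray(t), np.asarray(m), np.asarray(y, dtype=float)
    arms = (0, 1)
    # Cell means E[Y | T=a, M=b] and mediator shares P(M=b | T=a).
    mean_y = {(a, b): y[(t == a) & (m == b)].mean() for a in arms for b in arms}
    p_m = {(a, b): np.mean(m[t == a] == b) for a in arms for b in arms}
    # Indirect: hold treatment at a, shift the mediator distribution from T=0 to T=1.
    indirect = {a: sum(mean_y[a, b] * (p_m[1, b] - p_m[0, b]) for b in arms)
                for a in arms}
    # Direct: hold the mediator distribution at T=a, switch treatment from 0 to 1.
    direct = {a: sum((mean_y[1, b] - mean_y[0, b]) * p_m[a, b] for b in arms)
              for a in arms}
    return indirect, direct

# Simulated data in which both assumptions hold by construction (made-up numbers).
rng = np.random.default_rng(1)
n = 200_000
t = rng.integers(0, 2, size=n)        # randomized treatment
m = rng.binomial(1, 0.3 + 0.3 * t)    # mediator depends on T only
y = 20.0 + 2.0 * t + 5.0 * m + 1.0 * t * m + rng.normal(size=n)

indirect, direct = plug_in_effects(t, m, y)
print("indirect:", indirect)  # population values in this design: 1.5 (t=0), 1.8 (t=1)
print("direct:", direct)      # population values in this design: 2.3 (t=0), 2.6 (t=1)
```

The estimates only recover those population values because the mediator was drawn independently of the earnings noise. If Assumption 5 fails, the same code still runs but estimates something else.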
But nonparametric estimation can be problematic if there are many covariates. So you may prefer to use a simple linear model. As Brazilians say, you may prefer rice and beans. IKY propose a small extension of the classic model proposed by Baron and Kenny (1986).
Their equations are
M = a + bT + U
Y = c + d T + f M + g T M + V
Under Assumptions 4 and 5 and linearity, the direct effect is given by d + g(a + bt) while the indirect effect is given by b(f + gt), where t (0 or 1) is the treatment status at which the effect is evaluated. The coefficients can be estimated using OLS and standard errors can be computed by bootstrap.
The really important point of this linear result is that the direct effect is not simply "d" (or even d + g) and the indirect effect is not simply "f" (or even f + g). In other words, mediation analysis is more complicated than simply interaction effects.
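The linear formulas above can be sketched in a few lines of Python. This is a minimal illustration under my own assumptions, not IKY's code: the data are simulated with made-up coefficients, M is treated as continuous for simplicity, there are no covariates, and `linear_effects` is a hypothetical helper name.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def linear_effects(t, m, y):
    """Direct and indirect effects at t = 0, 1 from the two linear equations."""
    ones = np.ones(len(y))
    a, b = ols(np.column_stack([ones, t]), m)                  # M = a + bT + U
    c, d, f, g = ols(np.column_stack([ones, t, m, t * m]), y)  # Y = c + dT + fM + gTM + V
    direct = np.array([d + g * (a + b * s) for s in (0, 1)])
    indirect = np.array([b * (f + g * s) for s in (0, 1)])
    return direct, indirect

# Simulated data with made-up coefficients: a=1, b=0.5, c=2, d=1, f=3, g=0.4.
rng = np.random.default_rng(2)
n = 20_000
t = rng.integers(0, 2, size=n).astype(float)
m = 1.0 + 0.5 * t + rng.normal(size=n)
y = 2.0 + 1.0 * t + 3.0 * m + 0.4 * t * m + rng.normal(size=n)

direct, indirect = linear_effects(t, m, y)

# Nonparametric bootstrap: resample individuals with replacement and re-estimate.
n_boot = 200
draws = []
for _ in range(n_boot):
    idx = rng.integers(0, n, size=n)
    draws.append(np.concatenate(linear_effects(t[idx], m[idx], y[idx])))
se = np.std(draws, axis=0, ddof=1)  # order: direct(0), direct(1), indirect(0), indirect(1)

print("direct:", direct)      # population values in this design: 1.4 (t=0), 1.6 (t=1)
print("indirect:", indirect)  # population values in this design: 1.5 (t=0), 1.7 (t=1)
print("bootstrap SE:", se)
```

Note that in this design d = 1 and f = 3, so reading "d" as the direct effect or "f" as the indirect effect would miss both sets of population values.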
You can follow @PossebomVitor.