Today, I will not talk about one particular paper. I will explain the basics of MTE and will mention the papers that I believe to be important. As my wife says, I like to preach and spread the Gospel of MTE.
                        
                        
                        First, a disclaimer: I definitely missed some important and interesting papers in this very long thread. And I apologize if I forgot your favorite paper. In this case, feel free to include it in this thread!
                        
                        
                                                    
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
Before defining the MTE, let's explain the underlying selection-into-treatment model. We have the usual potential outcomes, Y1 and Y0. Treatment (D) selection follows an index model, where Z is the instrument and V is some sort of unobserved heterogeneity.
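In the notation commonly used in this literature, the model described above can be sketched as follows. The uniform normalization of V and the direction of the inequality are conventional assumptions and are not stated in the text; the direction is chosen so that a small V goes with a high propensity to take the treatment, which matches the discussion of the empirical results later in the thread.

```latex
% Sketch under assumed conventions: V normalized to Uniform[0,1],
% and Z independent of (Y_0, Y_1, V).
\begin{align*}
Y &= D\,Y_1 + (1 - D)\,Y_0, \\
D &= \mathbf{1}\{P(Z) \geq V\}, \qquad V \sim \mathrm{Uniform}[0,1].
\end{align*}
```

Under this normalization, P(Z) is the propensity score, i.e. the probability of taking the treatment given Z.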
                        
                        
                        
                        
                                                
                        
                                                
                    
                    
                                    
                    
                        
                        
Let's put some names on those variables to make interpretation easier. D can be attending college, Y can be labor earnings, V can be the individual's benefit from attending college and P(Z) is the cost of attending college, which is a function of the cost Z (tuition).
                        
                        
                                                    
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
This example comes from Carneiro, Heckman and Vytlacil (2011, https://bit.ly/3eu4ppD), who study the college wage premium in the US. Another example is Bhuller, Dahl, Løken and Mogstad (2020, https://bit.ly/3eFnJAv), who study incarceration and recidivism.
                        
                                                
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
When I first saw this model during my master's course, I thought that it was more restrictive than the monotonicity assumption used to identify the LATE. I was wrong. Vytlacil (2002, https://bit.ly/3da3MRW) shows that the index model and the monotonicity assumption are equivalent.
                        
                                                
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
In other words, the index model implies the monotonicity assumption and, if the monotonicity assumption holds, we can find an index model that rationalizes the data. I find this result mind-blowing and Heckman and Pinto (2018, https://bit.ly/3dbZ5Hd) offer a very elegant proof.
                        
                        
                                                    
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
                        But what do we gain by being explicit about our selection model? We can define the marginal treatment effect, which, at this point, can be interpreted as the average treatment effect for an individual whose unobserved heterogeneity is equal to v.
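Written out, this definition is usually stated as the conditional expectation below. The symbols follow the thread (Y1, Y0, V); the exact typesetting is an assumption, since the original equation was posted as an image.

```latex
% MTE at the point v of the unobserved heterogeneity V
\[
\mathrm{MTE}(v) = E[\,Y_1 - Y_0 \mid V = v\,]
\]
```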
                        
                        
                        
                        
                                                
                        
                                                
                    
                    
                                    
                    
                        
                        
In our example, it is the college premium for someone whose utility to attend college is equal to v.

This parameter is beautiful because many famous treatment parameters can be written as weighted integrals of the MTE. For example, ATE, ATT and LATE are functions of the MTE.
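To make the "weighted integral" idea concrete, here is a minimal Python sketch for two of these parameters. It assumes V is normalized to be uniform on [0, 1], in which case the ATE is the unweighted integral of the MTE and the LATE for an instrument that moves the propensity score from p_low to p_high is the average of the MTE between those two points. The function names and the toy curve MTE(v) = 1 - 2v are hypothetical and chosen only for illustration.

```python
def integrate(f, lo, hi, steps=1000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(f(lo + (k + 0.5) * width) for k in range(steps))


def ate(mte):
    """ATE: integral of the MTE over [0, 1], assuming V ~ Uniform[0, 1]."""
    return integrate(mte, 0.0, 1.0)


def late(mte, p_low, p_high):
    """LATE for compliers whose V lies between two propensity-score values."""
    return integrate(mte, p_low, p_high) / (p_high - p_low)


def toy_mte(v):
    """Hypothetical MTE curve: positive for small v, negative for large v."""
    return 1.0 - 2.0 * v


if __name__ == "__main__":
    print("ATE :", ate(toy_mte))
    print("LATE:", late(toy_mte, 0.2, 0.6))
```

The ATT uses weights that depend on the distribution of the propensity score, so it is left out of this sketch.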
                        
                        
The previous result was detailed by Heckman and Vytlacil (2005, https://bit.ly/2ZNkL8v). They also discuss the Policy Relevant Treatment Effect, which allows us to think about counterfactual policies that are not associated with one particular instrument.
                        
                                                
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
                        This possibility relies on the policy invariance of the MTE. Intuitively, this property means that the MTE definition does not depend on the instrument. Note that the LATE, whose definition depends on the compliers of a specific instrument, is not policy invariant.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
Now, how can we identify the MTE? If the instrument is continuous, we can use the local instrumental variable (LIV) estimator in the left-hand side of this equation to identify the MTE evaluated at V = p. This result is the reason behind the MTE's name.
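The identification result is usually written as the derivative below. It is a sketch that assumes V is normalized to be uniform on [0, 1] and independent of Z, so that E[Y | P(Z) = p] equals E[Y0] plus the integral of the MTE from 0 to p; differentiating with respect to p then gives the MTE at V = p.

```latex
% LIV estimand (left-hand side) and the MTE it identifies
\[
\frac{\partial E[\,Y \mid P(Z) = p\,]}{\partial p}
  = E[\,Y_1 - Y_0 \mid V = p\,]
  = \mathrm{MTE}(p)
\]
```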
                        
                        
                        
                        
                                                
                        
                                                
                    
                    
                                    
                    
                        
                        
It identifies the average treatment effect for someone at the margin of indifference between taking the treatment or not. In our example, that's the college premium for someone whose cost to attend college is P(Z)=p and whose benefit of attending college is V = p.
                        
                        
                                                    
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
Before we discuss estimation, let's discuss some empirical results. Carneiro, Heckman and Vytlacil (2011) estimate the marginal college premium. When V (U_S in the figure) is small, it means that the idiosyncratic benefit of attending college is large. (This bit is confusing.)
                        
                        
                        
                        
                                                
                        
                                                
                    
                    
                                    
                    
                        
                        
They find that the individuals who are more likely to attend college due to their large idiosyncratic benefits are also the individuals with the largest college premiums. Individuals who are unlikely to attend college (large V) have negative returns to college.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
To estimate the LIV, you can use your favorite non/semi-parametric estimator. If you don't have covariates, Calonico, Cattaneo and Farrell (2019) offer R and Stata packages that easily implement a fully nonparametric (local polynomial) estimator for the LIV.
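As a rough illustration of the idea (and not of the packages mentioned above), here is a self-contained Python sketch that simulates a toy model and estimates the LIV by differentiating a fitted regression of Y on the propensity score. Several things are simplifying assumptions: the propensity score is treated as known rather than estimated, a global quadratic fit is swapped in for the local polynomial estimator, and the data-generating process, with MTE(v) = 1 - 2v, is hypothetical.

```python
import random


def simulate(n, seed=0):
    """Draw (y, d, p) from a toy selection model with MTE(v) = 1 - 2v."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        v = rng.random()                # unobserved heterogeneity, V ~ U[0, 1]
        p = rng.uniform(0.05, 0.95)     # propensity score P(Z), treated as known
        d = 1 if v <= p else 0          # index model: D = 1{V <= P(Z)}
        y0 = rng.gauss(0.0, 0.5)
        y1 = y0 + (1.0 - 2.0 * v)       # treatment effect falls with v
        data.append((d * y1 + (1 - d) * y0, d, p))
    return data


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def liv_quadratic(data):
    """Fit E[Y | P = p] = a + b*p + c*p^2 by OLS; return its derivative in p."""
    xtx = [[0.0] * 3 for _ in range(3)]
    xty = [0.0] * 3
    for y, _, p in data:
        basis = (1.0, p, p * p)
        for i in range(3):
            xty[i] += basis[i] * y
            for j in range(3):
                xtx[i][j] += basis[i] * basis[j]
    _, b, c = solve(xtx, xty)
    return lambda p: b + 2.0 * c * p    # estimated MTE at V = p


if __name__ == "__main__":
    mte_hat = liv_quadratic(simulate(100_000))
    for p in (0.2, 0.5, 0.8):
        print(f"p = {p:.1f}  estimated MTE = {mte_hat(p):+.3f}  true = {1 - 2 * p:+.3f}")
```

In this toy model E[Y | P = p] is exactly p - p^2, so the quadratic fit is correctly specified; with real data a flexible local estimator is the safer choice.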
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
If you have covariates, Carneiro, Lokshin and Umapathi (2016, https://bit.ly/3caF6HD) explain how to estimate the LIV semiparametrically. They also provide code.

All those methods require a continuous instrument. What should we do if our instrument is discrete?
                        
                        
There are two solutions. Brinch, Mogstad and Wiswall (2017, https://bit.ly/2zGbtR5) impose a parametric model and provide point estimates for the MTE. Mogstad, Santos and Torgovitsky (2018, https://bit.ly/2ZNCiO2) provide partial identification results in a nonparametric model.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
The latter is implemented in an R package by Shea and Torgovitsky (https://bit.ly/36PrnFf).

Another easy way to estimate the MTE is to use Andresen's Stata package (https://bit.ly/2M6HHYD).
                        
                        
What are some cool things that can be done using the MTE?

Empirically, a recent and interesting example was given by Cornelissen, Dustmann, Raute and Schönberg (2018, https://bit.ly/2TQvEmt).
Theoretically, I have discussed the work by Zhou and Xie (https://twitter.com/PossebomVitor/status/1246818315273752577).
                        
                        
Eisenhauer, Heckman and Vytlacil (2015, https://bit.ly/3gx1lLm) add a little more structure to the selection model, which lets them discuss marginal costs and marginal benefits. Their framework connects beautifully to standard economic theory models.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
@OBartalotti, Kedagni and I (https://bit.ly/3de8SfP) have a working paper that partially identifies the MTE when there is endogenous sample selection on top of selection-into-treatment. We simultaneously address two identification problems using easy-to-interpret assumptions.
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
@autoregress, Arnold and Dobbie (https://twitter.com/autoregress/status/1244649172474740744) also discuss the MTE without the IV monotonicity assumption. Peter already wrote a super interesting thread on the topic.
                        
                            
                            
                            
                        
                        
                        
                        
                                                
                    
                    
                                    
                    
                        
                        
There is a lot of interesting work being done in this area and some papers are not yet available. So, let's anxiously wait for some updates! If you are working on the topic, feel free to include your paper in this thread!
                        
                        
                                                    
                        
                        
                                                
                    
                    
                