So proud of everyone involved in this study, 2nd of the @USAID #cashbenchmarking initiative. @AndrewZeitlin and Craig McIntosh, @EDCtweets, @GiveDirectly, @USAIDRwanda, @poverty_action, @CEGA_UC , @LizBrow29385570 and the incomparable @DanielHandel17

BIG REFLECTIONS THREAD https://twitter.com/poverty_action/status/1301542920940589059
Setting up & executing high-quality RCTs is really HARD. @deankarlan has a book about the things that can go wrong. If N = number of things that can go wrong in a standard RCT, evaluating 2 orgs with distinct interventions is not 2N, it's N^2. H/T to PIs & IPs managing complexity!
@USAID relies on both hierarchical AND consensus-driven decision-making. Table stakes for any new program: bring all of the relevant stakeholders into the fold, BUT ALSO get buy-in from leaders. These studies couldn't have happened w/o @DanielHandel17 and an ARMY of other advocates.
USAID also relies heavily on precedent. Costs of innovation are high to innovators, but once you've done something new like #cashbenchmarking or #impactbond with @village_ent & @DFID, it's easier for the rest of the agency to replicate the innovation. https://twitter.com/village_ent/status/917811982509551616?s=20
But the key questions are: do the innovations scale? Does the agency change its approach in response to evidence?

Well.... these are the questions we should be asking after the 2 benchmarking studies in Rwanda!

So what have we learned?
Newest study has 2 main findings:
1⃣ Neither cash nor the job training program (Huguka Dukore or HD) ⬆️ employment *main outcome of interest*
2⃣ Cash *outperforms* HD on all other primary and secondary outcomes except business knowledge

So HD does not clear the cash benchmark
Put this evaluation in context: a recent performance evaluation by @DexisConsulting concluded "the majority of youth find new or improved employment after the HDAK project and therefore have higher incomes now than before training"

Is the perf eval wrong?
https://pdf.usaid.gov/pdf_docs/PA00WGVG.pdf
NO, but it does not evaluate HD against a COUNTERFACTUAL (what would have happened in the absence of the program). If you look at Andy/Craig's study, you'll see the control group's employment rate at baseline = 33%, endline = 48%.

People tend to go from unemployed to employed over time... yeah!
The timeline of the performance eval, btw, partially overlapped with the study period. So roughly same point in time.

Therefore a majority of HD participants can find employment and increase income relative to pre-program... AND HD CAN STILL HAVE NO IMPACT on income/employment.
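
To make the arithmetic concrete, here is a minimal sketch of the two comparisons. The control-group rates (33% and 48%) are the ones quoted above; the participant rates are HYPOTHETICAL, picked only to illustrate the point, and the simple endline difference is an assumption standing in for whatever estimator the study actually uses.

```python
# Employment rates in percentage points.
# Control-group figures are the ones quoted in the thread.
control_baseline = 33
control_endline = 48

# HYPOTHETICAL participant figures (not from the study): assume HD
# participants start and end at the same rates as the control group.
hd_baseline = 33
hd_endline = 48


def pre_post_change(baseline, endline):
    """What a before/after comparison sees: change within one group."""
    return endline - baseline


def impact_vs_control(treated_endline, comparison_endline):
    """Endline gap between participants and the control group.

    With random assignment, this gap is a simple estimate of impact.
    """
    return treated_endline - comparison_endline


pre_post = pre_post_change(hd_baseline, hd_endline)
impact = impact_vs_control(hd_endline, control_endline)

print(f"Before/after change for participants: {pre_post} pp")
print(f"Difference vs. control at endline:    {impact} pp")
```

Under these illustrative numbers, participants look 15 points better off than before the program, yet the gap against the control group is zero.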
Zooming out a bit: that Dexis performance eval - historically that's the main type of eval the agency conducts on its activities. ~90+% of all evaluations ever conducted at USAID are performance evals! There's useful info in there... but it can't tell us if we're really ⬆️outcomes
Where does this leave us? Are we just peering into the abyss?

I am reminded of this great piece by @80000Hours. Two big takeaways:
1⃣ Most (70-90%) social programs don't hold up under rigorous evaluation
2⃣ There are still interventions we know WORK WELL https://80000hours.org/articles/effective-social-program/
So let's do more of the stuff that works and continue rigorously evaluating so we can LEARN. And huzzah to USAID for this bold research agenda - onward!

And ask people like @AnneHHealy @MichaelEddy @NormaAltshuler @sasha_gallant @KareninKenya @DanielHandel17 about what works!
You can follow @jcarbiv.