there seems to be an interesting asymmetry between MCMC and variational inference, in the sense that

- there are some cases in which MCMC can be used (≈ naturally), to improve VI, but
- there are relatively fewer in which VI methods can be used to improve MCMC.

[ 1 / 4 ]
i think it's a local / global thing; it's relatively plausible that any local algorithm (e.g. MCMC, gradient descent, ...) could be embedded in some other method. on the other hand, VI is basically trying to solve a global problem, and often (?) not that well.

[ 2 / 4 ]
most of the more convincing 'VI-within-MCMC' schemes seem to boil down to either preconditioning or reparametrisation. it's a sensible enough goal - to be worthwhile, you only need to improve on the original parametrisation, which should be possible fairly often.
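as a minimal sketch of the reparametrisation idea: suppose a mean-field Gaussian VI fit has already produced a mean and per-coordinate scale for a badly-scaled target (here both the target and the "VI fit" are made up by hand for illustration); running random-walk Metropolis in the whitened space z = (x - mean) / scale then behaves like sampling a roughly isotropic target, which a single step size can handle.

```python
import numpy as np

# hypothetical anisotropic target: zero-mean Gaussian with very different scales
TRUE_SCALES = np.array([1.0, 100.0])

def log_target(x):
    return -0.5 * np.sum((x / TRUE_SCALES) ** 2)

# pretend these came from a mean-field VI fit (chosen by hand here)
vi_mean = np.array([0.0, 0.0])
vi_scale = np.array([1.0, 100.0])

def log_target_whitened(z):
    # reparametrise: x = vi_mean + vi_scale * z.
    # the Jacobian of this affine map is constant, so it cancels in MH ratios.
    return log_target(vi_mean + vi_scale * z)

def rwm(logp, z0, n_steps, step=0.5, seed=0):
    """plain random-walk Metropolis in the whitened z-space,
    returning samples mapped back to the original x-space."""
    rng = np.random.default_rng(seed)
    z, lp = z0.copy(), logp(z0)
    samples, accepts = [], 0
    for _ in range(n_steps):
        prop = z + step * rng.standard_normal(z.shape)
        lp_prop = logp(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            z, lp = prop, lp_prop
            accepts += 1
        samples.append(vi_mean + vi_scale * z)  # map back to x-space
    return np.array(samples), accepts / n_steps

samples, acc_rate = rwm(log_target_whitened, np.zeros(2), 5000)
```

the same single step size in the original x-space would either crawl along the wide coordinate or reject almost everything along the narrow one; the VI fit only has to beat that baseline.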

[ 3 / 4 ]
i'm curious to know whether there are other useful ways of nesting VI in MCMC, though it's not clear to me what the right approach is. it's tricky to pin down what a variational fit buys you in this context that e.g. a { MAP / Laplace approx. / ... } doesn't.

[ 4 / 4 ]
You can follow @sam_power_825.