Thread by @StatsForBios, For the past 4 years I've mostly worked on disaggregation regression. As [...]

For the past 4 years I've mostly worked on disaggregation regression. As this work is mostly published or in preprint, and because I doubt I'll be working much more on it, I thought I'd do a thread covering the work.

Disaggregation regression is regression where the response data is at coarse resolution and the covariates or random effects are at a higher resolution.

Here resolution typically refers to space, but it could be time, phylogeny, or anything else. All of my work has been spatial. I'd love to see someone apply disaggregation regression to phylogeny though!

The basic model, with high res variables indexed by i and low res variables by j, looks like this:
y_j ~ Pois(inc_j)
inc_j = sum_i( inc_i x pop_i )
inc_i = exp(b X_i + spatial_field_i).

The summation is over all pixels i, in low res area j.
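
To make the three lines of the model concrete, here is a minimal sketch of the likelihood in plain Python. It is not how the disaggregation package or TMB implements it. The function names, the toy data, and the choice to switch the spatial field off are all assumptions made for illustration.

```python
import math

def pixel_incidence(beta, X, field):
    """inc_i = exp(b . X_i + spatial_field_i) for every high res pixel i."""
    return [math.exp(sum(b * x for b, x in zip(beta, row)) + f)
            for row, f in zip(X, field)]

def area_expected_counts(inc, pop, area_of_pixel, n_areas):
    """inc_j = sum of inc_i * pop_i over the pixels i that fall in area j."""
    totals = [0.0] * n_areas
    for inc_i, pop_i, j in zip(inc, pop, area_of_pixel):
        totals[j] += inc_i * pop_i
    return totals

def poisson_log_lik(y, mu):
    """Sum over areas of log Pois(y_j | mu_j)."""
    return sum(yj * math.log(mj) - mj - math.lgamma(yj + 1)
               for yj, mj in zip(y, mu))

def disagg_log_lik(beta, field, X, pop, area_of_pixel, y):
    """Log-likelihood of the low res counts given the high res parameters."""
    inc = pixel_incidence(beta, X, field)
    mu = area_expected_counts(inc, pop, area_of_pixel, len(y))
    return poisson_log_lik(y, mu)

# Toy data (made up): 6 pixels, 2 areas, intercept plus one covariate.
X = [[1, 0.1], [1, 0.4], [1, -0.2], [1, 1.0], [1, 0.8], [1, 1.2]]
pop = [100, 200, 150, 50, 80, 120]
area_of_pixel = [0, 0, 0, 1, 1, 1]
y = [2, 5]          # observed case counts per low res area
field = [0.0] * 6   # spatial field switched off for the toy example

print(disagg_log_lik([-5.0, 1.0], field, X, pop, area_of_pixel, y))
```

The point to notice is that the data only enter at the area level, while every parameter acts at the pixel level. The exponential sits inside the sum, so the model cannot be rewritten as an ordinary regression on area-averaged covariates.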

Generally, standard software can't fit these types of models. We have been using TMB, and @aknandi has written an R package to make the models more accessible.
https://github.com/aknandi/disaggregation
https://arxiv.org/abs/2001.04847

If you want a linear link function and low resolution covariates, you can fit these models in INLA. @Paula_Moraga_ https://www.sciencedirect.com/science/article/pii/S2211675317301318?casa_token=WQ-_QbEl-HYAAAAA:QcS6frINyNsoEDzbcJA5TtQsQSfU47thRXvjcb2F7QnrlxgmT1rcWG_8baVgU2TA1KTnzT8GRL0

@RohanArambepola did a big simulation study to explore under what circumstances the models work well. He also found that low res cross-validation is an OK predictor of high res performance.
https://arxiv.org/abs/2005.03604
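
As a rough illustration of what low res cross-validation means here: hold out whole areas, predict at pixel level, sum the pixel predictions back up to the held-out areas, and compare with the observed counts. The sketch below is an assumption-laden toy, not the procedure from the simulation study. The constant-rate model is a hypothetical stand-in for a fitted disaggregation model, and the fold assignment and error metric are arbitrary choices.

```python
def constant_rate_model(y, pop, area_of_pixel, train_areas):
    """Stand-in model (assumption): one shared per-person rate from training areas."""
    train = set(train_areas)
    cases = sum(y[j] for j in train)
    people = sum(p for p, j in zip(pop, area_of_pixel) if j in train)
    return [cases / people] * len(pop)  # predicted rate for every pixel

def low_res_cv_mae(y, pop, area_of_pixel, k, fit_predict=constant_rate_model):
    """K-fold CV over areas; returns mean absolute error of held-out area counts."""
    n_areas = len(y)
    abs_errors = []
    for fold in range(k):
        test = [j for j in range(n_areas) if j % k == fold]
        train = [j for j in range(n_areas) if j % k != fold]
        pixel_rate = fit_predict(y, pop, area_of_pixel, train)
        # Aggregate pixel-level predictions back to area-level expected counts.
        pred = [0.0] * n_areas
        for r, p, j in zip(pixel_rate, pop, area_of_pixel):
            pred[j] += r * p
        abs_errors += [abs(pred[j] - y[j]) for j in test]
    return sum(abs_errors) / len(abs_errors)

# Toy data (made up): 4 areas with 2 pixels each.
pop = [100, 100, 150, 50, 300, 100, 200, 200]
area_of_pixel = [0, 0, 1, 1, 2, 2, 3, 3]
y = [3, 1, 9, 4]
print(low_res_cv_mae(y, pop, area_of_pixel, k=2))
```

Any model that returns a per-pixel rate can be passed as `fit_predict`, so the same scoring loop works for a real disaggregation fit.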

We've looked at a number of ways to combine low res incidence data with point level prevalence data. We fitted #MachineLearning models to prevalence data and used disaggregation regression on incidence to ensemble the predictions. Modest benefits. https://www.biorxiv.org/content/10.1101/548719v1.abstract

We also looked at full joint models of prevalence and incidence data on different spatial scales. This was definitely more finicky. There are benefits (more statistical power, spatial information) and disbenefits (the model learns biases in the prevalence data).

This preprint might be about to undergo quite large changes, so take that into account. https://www.medrxiv.org/content/10.1101/2020.02.14.20023069v1

We have applied these models at scale. Here we apply them to predict vivax malaria globally, 2000-2019.
https://www.sciencedirect.com/science/article/pii/S0140673619310967

Here we predict falciparum malaria incidence outside of Africa using disaggregation regression. In Africa @DrSamirBhatt primarily used prevalence data.
https://www.sciencedirect.com/science/article/pii/S0140673619310979

Finally, Leon Law used similar models, but with variational Bayes, to fit a full Gaussian process on covariates and space. The maths is fairly beyond me... I just helped interpret the malaria case study.
http://papers.nips.cc/paper/7847-variational-learning-on-aggregate-outputs-with-gaussian-processes
