Hi everyone! Delighted to be the first tweeter for #WallingfordECRTweets @UK_CEH. I’ll be tweeting about the methods of my current work: How to integrate data for large-scale #SpeciesDistributionModels using INLA. (1/11)
Species distribution models (#SDMs) predict species distributions by relating information on environmental conditions to known locations of a species. They are commonly built around one particular type of data. But what if we have several different datasets? (2/11)
Integrated SDMs allow us to make use of several datasets without ignoring their differences. They let us retain the strengths of each dataset and, to a degree, correct for its weaknesses by accounting for the observation process behind each dataset. (3/11)
We want to model species distribution as a function of some covariates. The actual distribution is a latent state, i.e. it can’t be directly observed. We use observation models to describe how the data were generated from the latent state. Together, the two models form a state-space model. (4/11)
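To make the state-space idea concrete, here is a minimal simulation sketch in Python. It is not the model from this thread: the coefficients, detection and reporting probabilities, and all function names are hypothetical, chosen only to show one latent state feeding two different observation models.

```python
import math
import random

random.seed(1)

def latent_intensity(broadleaf, beta0=-1.0, beta1=2.0):
    """Latent state: expected number of individuals at a site,
    log-linear in one covariate (coefficients are made up)."""
    return math.exp(beta0 + beta1 * broadleaf)

def poisson_draw(mean):
    """Draw from a Poisson distribution (Knuth's method, fine for small means)."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

def observe_presence_absence(true_count, detect_prob=0.8):
    """Structured survey: presence is recorded if at least one individual is detected."""
    return any(random.random() < detect_prob for _ in range(true_count))

def observe_presence_only(true_count, report_prob=0.2):
    """Opportunistic records: only a thinned subset of individuals is reported."""
    return sum(random.random() < report_prob for _ in range(true_count))

# Same latent state, two observation processes (hypothetical broadleaf cover values).
for cover in [0.1, 0.5, 0.9]:
    n = poisson_draw(latent_intensity(cover))
    print(cover, n, observe_presence_absence(n), observe_presence_only(n))
```

Both observation functions read from the same simulated `n`, which is the point: the datasets differ in how they were collected, not in the underlying distribution.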
How do we infer the latent state if the observation models describe different ‘currencies’ (e.g. occurrence and abundance)? We use a point process model, which describes how points are distributed in space through an intensity surface that represents density of points. (5/11)
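A small sketch of why an intensity surface works as a common currency, assuming a Poisson point process with constant intensity inside a survey cell (function names are hypothetical). The expected abundance in the cell is intensity times area, and the probability of occurrence is the probability of at least one point.

```python
import math

def expected_count(intensity, area):
    """Expected number of points in a cell: intensity integrated over its area."""
    return intensity * area

def presence_probability(intensity, area):
    """Probability of at least one point in the cell under a Poisson process."""
    return 1.0 - math.exp(-intensity * area)

# Hypothetical cell: 2 individuals per km^2 on average, cell of 0.5 km^2.
print(expected_count(2.0, 0.5))
print(presence_probability(2.0, 0.5))
```

Abundance data inform the first quantity and occurrence data the second, yet both are functions of the same intensity.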
Now let’s integrate some data! I am using @BobOHara’s R package PointedSDMs (not yet on CRAN, https://github.com/oharar/PointedSDMs) to integrate two different datasets. The general steps are: Clean data. Make a mesh. Create stacks. Run the model. Admire output. (7/11)
The data I work with have restricted access, so I won’t be able to share a map with the data points. Data A are presence-absence, data B are presence-only, and the covariates are broadleaf and arable land cover. For this demo I'm using only a small area (Somerset + Dorset). (8/11)
Arguably the most important part is the mesh. Space is continuous and needs to be approximated in our model. The mesh does that by dividing space into triangular tiles. The mesh can be customised - this is what mine looks like. The blue line outlines the study area. (9/11)
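For intuition only, here is a toy mesh in Python: a rectangle split into regular triangles. This is a simplification. The mesh described above is built in R and is irregular, following the study area. The function names and the rectangle dimensions are hypothetical.

```python
def triangulate_rectangle(width, height, nx, ny):
    """Split a width x height rectangle into 2 * nx * ny triangles.
    Returns vertex coordinates and triangles as triples of vertex indices."""
    dx, dy = width / nx, height / ny
    vertices = [(i * dx, j * dy) for j in range(ny + 1) for i in range(nx + 1)]
    triangles = []
    for j in range(ny):
        for i in range(nx):
            a = j * (nx + 1) + i   # lower-left corner of the grid cell
            b = a + 1              # lower-right
            c = a + (nx + 1)       # upper-left
            d = c + 1              # upper-right
            triangles.append((a, b, d))
            triangles.append((a, d, c))
    return vertices, triangles

def triangle_area(p, q, r):
    """Area of a triangle from its three corner coordinates."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2.0

vertices, triangles = triangulate_rectangle(4.0, 2.0, nx=4, ny=2)
total_area = sum(triangle_area(*(vertices[k] for k in tri)) for tri in triangles)
print(len(vertices), len(triangles), total_area)
```

The tiles cover the rectangle exactly once, so their areas sum to the area of the region. Finer tiles approximate continuous space more closely, at a higher computational cost.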
Now we make the so-called stacks. Stacks are a way to organise data, effects, and projection matrices in INLA. We can then use all of these stacks to fit our model. And behold, we have a fitted distribution for our mystery species! (10/11)
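A projection matrix links values at the mesh vertices to the locations of the data points. As a rough sketch of the idea (not INLA's own code, and with hypothetical names), one row of such a matrix for a point inside a triangle can be written as that point's barycentric weights on the three corners:

```python
def barycentric_weights(point, tri):
    """Weights of the three triangle corners for a point inside the triangle.
    The weights sum to 1 and form one row of a piecewise-linear projection matrix."""
    x, y = point
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return (w1, w2, 1.0 - w1 - w2)

def project(point, tri, vertex_values):
    """Interpolate a field known at the corners to the point's location."""
    weights = barycentric_weights(point, tri)
    return sum(w * v for w, v in zip(weights, vertex_values))

# Hypothetical triangle and field values at its corners.
tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(barycentric_weights((0.25, 0.25), tri))
print(project((0.25, 0.25), tri, [2.0, 4.0, 8.0]))
```

A stack then bundles the observations, the effects, and the matrix of such weights for each dataset.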
Feel free to ask me any questions! And don’t forget to add the paper that this work is based on to your lockdown reading list: http://doi.org/10.1016/j.tree.2019.08.006. A big thank you to my supervisors @drnickisaac @ProfKateJones & Katherine Boughey. And to @BobOHara for all the help! (11/11)
You can follow @LeaDambly.