R code and packages
MRFcov: Markov Random Fields with additional covariates in R
I co-developed and maintain the MRFcov R package (described by Clark et al. 2018, published in Ecology’s Statistical Reports section), which provides statistical tools for detecting how network interactions change over time or across environmental gradients. You can download the article PDF here. In short, the package’s primary functions approximate interaction parameters between nodes in undirected Markov Random Field (MRF) graphical networks. Models can also incorporate covariates (a class of models known as Conditional Random Fields, or CRFs, following methods developed by Cheng et al. 2014), allowing users to estimate how interactions between nodes are predicted to change across covariate gradients.
In principle, MRFcov models that use species' occurrences as outcome variables are similar to joint species distribution models (see here for a nice review of these models, and here for an example that my coauthors and I published) in that variance in occurrences can be partitioned among abiotic and biotic drivers. However, key differences are that MRFcov models can:
(1) Produce directly interpretable coefficients that allow users to determine the relative importance (i.e. effect sizes) of species' interactions and environmental covariates in driving occurrence probabilities
(2) Identify interaction strengths, rather than simply determining whether they are "significantly different from zero"
(3) Estimate how interactions are predicted to change across environmental gradients
MRF and CRF interaction parameters are approximated using separate regressions for individual species within a joint modelling framework. Because all combinations of covariates and additional species are included as predictor variables in node-specific regressions, variable selection is required to reduce overfitting and induce sparsity. This is accomplished through LASSO penalization using functions in the glmnet R package. Methods such as these could become increasingly important as habitat modification and climate change continue to disrupt natural communities.
You can install the package from GitHub using the devtools package (make sure that devtools is installed first).
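A typical installation call, assuming the package is hosted in the `nicholasjclark/MRFcov` GitHub repository (an assumption here; check the package's GitHub page for the current location), might look like this:

```r
# Install devtools first if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Repository path is an assumption; adjust it if the package has moved
devtools::install_github("nicholasjclark/MRFcov")
```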
Then, follow the vignettes to get started:
vignette("CRF_data_prep") and vignette("Bird_Parasite_CRF")
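As a rough idea of what a first model fit might look like, here is a minimal sketch. It assumes the `Bird.parasites` example dataset that ships with the package (four binary parasite occurrence columns followed by one scaled covariate) and the `MRFcov()` arguments `data`, `n_nodes` and `family`; the vignettes remain the authoritative guide to data preparation and model options.

```r
library(MRFcov)

# Example dataset assumed to ship with the package:
# four binary occurrence columns (nodes) followed by one covariate
data("Bird.parasites")

# Fit a CRF: the first n_nodes columns are treated as nodes,
# and any remaining columns are treated as covariates
crf_fit <- MRFcov(data = Bird.parasites, n_nodes = 4, family = "binomial")

# Estimated pairwise interaction coefficients between nodes
crf_fit$graph
```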
Find out more about how MRFs and related methods outperform more traditional co-occurrence methods in this blog post (R code included). Please give the package a run and start gleaning additional insights from your multi-species community datasets!
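For readers who want to see the node-wise regression idea in isolation, the sketch below fits one LASSO-penalized logistic regression per species with glmnet and then averages the two estimates for each species pair. It is an illustration on simulated data, not the MRFcov implementation: the species names, the simulated associations and the choice to symmetrise by averaging are all assumptions made for the example.

```r
library(glmnet)

set.seed(1)
n <- 300

# Simulated presence/absence data for four hypothetical species
sp1 <- rbinom(n, 1, 0.5)
sp2 <- rbinom(n, 1, plogis(-1 + 2 * sp1))   # positively associated with sp1
sp3 <- rbinom(n, 1, 0.4)
sp4 <- rbinom(n, 1, plogis(0.5 - 2 * sp3))  # negatively associated with sp3
occ <- cbind(sp1, sp2, sp3, sp4)

n_nodes <- ncol(occ)
coefs <- matrix(0, n_nodes, n_nodes,
                dimnames = list(colnames(occ), colnames(occ)))

# One penalized regression per node, using all other nodes as predictors
for (i in seq_len(n_nodes)) {
  fit <- cv.glmnet(x = occ[, -i], y = occ[, i],
                   family = "binomial", alpha = 1)  # alpha = 1 is the LASSO
  b <- as.matrix(coef(fit, s = "lambda.min"))[-1, 1]  # drop the intercept
  coefs[i, names(b)] <- b
}

# Each pair is estimated twice (i on j, and j on i); average the two
interactions <- (coefs + t(coefs)) / 2
round(interactions, 2)
```

Coefficients shrunk exactly to zero by the penalty indicate pairs for which the data give no support for a conditional association, which is where the sparsity of the resulting network comes from.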
MalAvi Global Biogeography R Code
Here at my figshare account, I have provided all R code (including a range of useful functions) that I developed for my recent paper in Global Ecology and Biogeography (get the PDF here). These functions build on code written by Dr Vincenzo Ellis for the malaviR package to download avian blood parasite occurrence and cytochrome-b sequence data from the MalAvi database. I then demonstrate how to estimate phylogenetic diversity for parasite communities and use hierarchical linear models to carry out global-scale analyses of community phylogenetic turnover.
In my view, this packaging of all R code, including the functions needed to download data, represents a much-needed step towards more reproducible research. Please let me know if you have comments or plan to use the functions in your own research; I’m very happy to chat about ideas for projects.
Other Research Papers with Open-Source R Code and Datasets
Please see the following papers for other examples of open-access R code and datasets:
Clark, Nicholas J. and Soares Magalhães, R.J. (2018). Airborne geographical dispersal of Q Fever from livestock holdings to human communities: a systematic review and critical appraisal of evidence. BMC Infectious Diseases doi: 10.1186/s12879-018-3135-4. PDF | Altmetric
Clark, Nicholas J., Wells, K., and Lindberg, O. (2018). Unravelling changing interspecific interactions across environmental gradients using Markov random fields. Ecology doi: 10.1002/ecy.2221. PDF | Blog summary | Altmetric
Clark, Nicholas J., Seddon, J.M., Kyaw‐Tanner, M., Al-Alawneh, J., Harper, G., McDonagh, P., and Meers, J. (2018). Emergence of canine parvovirus subtype 2b (CPV-2b) infections in Australian dogs. Infection, Genetics and Evolution doi: 10.1016/j.meegid.2017.12.013. PDF | Blog summary | Altmetric
Clark, Nicholas J., Seddon, J.M., Šlapeta, J., and Wells, K. (2018). Parasite spread at the domestic animal - wildlife interface: anthropogenic habitat use, phylogeny and body mass drive risk of cat and dog flea (Ctenocephalides spp.) infestation in wild mammals. Parasites & Vectors doi: 10.1186/s13071-017-2564-z. PDF | Blog summary | Altmetric | The Conversation
Clark, Nicholas J., Clegg, S.M., Sam, K., Goulding, W., Koane, B., and Wells, K. (2017). Climate, host phylogeny and the connectivity of host communities govern regional parasite assembly. Diversity and Distributions doi: 10.1111/ddi.12661. PDF | Blog summary | Altmetric
Clark, Nicholas J. and Clegg, S.M. (2017). Integrating phylogenetic and ecological distances reveals new insights into parasite host specificity. Molecular Ecology 26(11), 3074-3086. PDF | Blog summary | Altmetric