Generate null ecological niche models (ENMs) and compare null with empirical performance metrics
ENMnulls() iteratively builds null ENMs for a single set
of user-specified model settings based on an input ENMevaluation object,
from which all other analysis settings are extracted. Summary statistics
(means and standard deviations) of the performance metrics are calculated
across the null ENMs, and effect sizes and p-values are then calculated by
comparing these summary statistics to the empirical values of the
performance metrics (i.e., from the model built with the empirical data).
See the references below for more details on this method.
Usage
ENMnulls(
e,
mod.settings,
no.iter,
eval.stats = c("auc.val", "auc.diff", "cbi.val", "or.mtp", "or.10p"),
user.enm = NULL,
user.eval = NULL,
user.eval.type = NULL,
userStats.signs = NULL,
removeMxTemp = TRUE,
parallel = FALSE,
numCores = NULL,
quiet = FALSE
)
Arguments
- e
ENMevaluation object
- mod.settings
named list: one set of model settings with which to build null ENMs.
- no.iter
numeric: number of null model iterations.
- eval.stats
character vector: the performance metrics that will be used to calculate null model statistics.
- user.enm
ENMdetails object: if implementing a user-specified model.
- user.eval
function: custom function for specifying performance metrics not included in ENMeval. The function must first be defined and then input as the argument user.eval.
- user.eval.type
character: if implementing a user-specified model, specify here which evaluation type to use – either "knonspatial", "kspatial", "testing", or "none".
- userStats.signs
named list: user-defined evaluation statistics attributed with either 1 or -1 to designate whether the expected difference between empirical and null models is positive or negative; this is used to calculate the p-value of the z-score. For example, for AUC, the difference should be positive (the empirical model should have a higher score), whereas for omission rate it should be negative (the empirical model should have a lower score).
- removeMxTemp
boolean: if TRUE, delete all temporary data generated when using maxent.jar for modeling.
- parallel
boolean: if TRUE, use parallel processing.
- numCores
numeric: number of cores to use for parallel processing; if NULL, all available cores will be used.
- quiet
boolean: if TRUE, silence all function messages (but not errors).
Value
An ENMnull object with slots containing evaluation summary
statistics for the null models and their cross-validation results, as well
as differences in results between the empirical and null models.
This comparison table includes z-scores of these differences and their
associated p-values (under a normal distribution). See ?ENMnull for more
details.
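As a rough illustration of the comparison described above, the following base-R sketch computes a z-score and a one-sided p-value under a normal distribution for a single metric. The helper name null_z_test, its arguments, and the toy numbers are hypothetical and not part of ENMeval; the exact computation inside ENMnulls() may differ.

```r
# Hypothetical helper, not part of ENMeval: compare one empirical metric
# value against the values of the same metric from the null models.
null_z_test <- function(emp, null_vals, sign = 1) {
  null_mean <- mean(null_vals)
  null_sd <- stats::sd(null_vals)
  # effect size: distance of the empirical value from the null mean,
  # in units of the null standard deviation
  z <- (emp - null_mean) / null_sd
  # sign = 1: empirical value expected to be higher (e.g., AUC), upper tail
  # sign = -1: empirical value expected to be lower (e.g., omission rate),
  # lower tail
  p <- stats::pnorm(z, lower.tail = (sign == -1))
  data.frame(emp = emp, null.mean = null_mean, null.sd = null_sd,
             z.score = z, p.value = p)
}

# Toy numbers for illustration only
null_auc <- c(0.52, 0.55, 0.49, 0.51, 0.53)
null_z_test(emp = 0.80, null_vals = null_auc, sign = 1)
```

The sign argument plays the same role as the 1 / -1 values described for userStats.signs: it selects which tail of the normal distribution is used for the p-value.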
Details
This null ENM technique is based on the implementation in Bohl et al. (2019), which follows the original methodology of Raes & ter Steege (2007) but makes an important modification: instead of evaluating each null model on random validation data, the null models are evaluated on the same withheld validation data used to evaluate the empirical model. Bohl et al. (2019) demonstrate this approach using a single defined withheld partition group, and Kass et al. (2020) extended it to spatial partitions by drawing null occurrences from the area of the predictor raster data defining each partition. Please see the vignette for a brief example.
To speed up each iteration, this function does not use raster data and instead samples null occurrences from the partitioned background records. You should therefore avoid running it when your background records do not sample the study extent well, because null occurrences can only be drawn from locations that have background records.
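The sampling idea can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the ENMeval source: the helper sample_null_occs, the assumption that each partition receives as many null occurrences as it has empirical occurrences, and the toy data are all hypothetical.

```r
# Hypothetical sketch, not the ENMeval source: draw null occurrences from
# background records, matching the empirical sample size in each partition.
sample_null_occs <- function(bg, bg.grp, occs.grp) {
  n_per_grp <- table(occs.grp)
  rows <- unlist(lapply(names(n_per_grp), function(g) {
    candidates <- which(bg.grp == g)
    n <- n_per_grp[[g]]
    if (length(candidates) < n) {
      stop("partition ", g, " has fewer background records than occurrences")
    }
    # index into candidates so a single candidate is not expanded by sample()
    candidates[sample.int(length(candidates), n)]
  }))
  data.frame(bg[rows, , drop = FALSE], grp = bg.grp[rows])
}

set.seed(1)
# Toy data for illustration only: 200 background points in 4 partitions
bg <- data.frame(longitude = runif(200, -80, -40),
                 latitude = runif(200, -30, 10))
bg.grp <- rep(1:4, each = 50)
occs.grp <- c(rep(1, 10), rep(2, 5), rep(3, 8), rep(4, 7))
null.occs <- sample_null_occs(bg, bg.grp, occs.grp)
table(null.occs$grp)
```

Because null occurrences are only ever drawn from existing background rows, a partition with sparse or spatially clustered background records restricts where null occurrences can fall.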
References
Bohl, C. L., Kass, J. M., & Anderson, R. P. (2019). A new null model approach to quantify performance and significance for ecological niche models of species distributions. Journal of Biogeography, 46: 1101-1111. doi:10.1111/jbi.13573
Kass, J. M., Anderson, R. P., Espinosa-Lucas, A., Juárez-Jaimes, V., Martínez-Salas, E., Botello, F., Tavera, G., Flores-Martínez, J. J., & Sánchez-Cordero, V. (2020). Biotic predictors with phenological information improve range estimates for migrating monarch butterflies in Mexico. Ecography, 43: 341-352. doi:10.1111/ecog.04886
Raes, N., & ter Steege, H. (2007). A null-model for significance testing of presence-only species distribution models. Ecography, 30: 727-736. doi:10.1111/j.2007.0906-7590.05041.x
Examples
if (FALSE) { # \dontrun{
library(ENMeval)
# first, let's tune some models
occs <- read.csv(system.file("ex/bradypus.csv", package = "predicts"))[, 2:3]
# rast() comes from the terra package, so call it with its namespace
envs <- terra::rast(list.files(path = system.file("ex", package = "predicts"),
                               pattern = "tif$", full.names = TRUE))
bg <- as.data.frame(predicts::backgroundSample(envs, n = 10000))
names(bg) <- names(occs)
ps <- list(orientation = "lat_lat")
# as an example, let's use two user-specified evaluation metrics
conf.and.cons <- function(vars) {
  observations <- c(
    rep(x = 1, times = length(vars$occs.train.pred)),
    rep(x = 0, times = length(vars$bg.train.pred)),
    rep(x = 1, times = length(vars$occs.val.pred)),
    rep(x = 0, times = length(vars$bg.val.pred))
  )
  predictions <- c(vars$occs.train.pred, vars$bg.train.pred,
                   vars$occs.val.pred, vars$bg.val.pred)
  evaluation_mask <- c(
    rep(x = FALSE, times = length(vars$occs.train.pred) +
          length(vars$bg.train.pred)),
    rep(x = TRUE, times = length(vars$occs.val.pred) +
          length(vars$bg.val.pred))
  )
  measures <- confcons::measures(observations = observations,
                                 predictions = predictions,
                                 evaluation_mask = evaluation_mask, df = TRUE)
  measures.metrics <- measures[, c("CPP_eval", "DCPP")]
  colnames(measures.metrics) <- c("confidence", "consistency")
  return(measures.metrics)
}
e <- ENMevaluate(occs, envs, bg,
                 tune.args = list(fc = c("L","LQ","LQH"), rm = 2:4),
                 partitions = "block", partition.settings = ps,
                 algorithm = "maxnet", categoricals = "biome",
                 user.eval = conf.and.cons, parallel = TRUE)
d <- eval.results(e)
# here, we will choose an optimal model based on validation CBI, but you can
# choose yourself what evaluation statistics to use
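# call dplyr::filter explicitly; an unqualified filter() resolves to
# stats::filter unless dplyr is attached
opt <- d |> dplyr::filter(cbi.val.avg == max(cbi.val.avg))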
# now we can run our null models, and we can specify to include estimates for
# our user-specified variables too, but we need to make sure we note what
# sign we expect these statistics to be
# NOTE: you should use at least 100 iterations in practice -- this is just an
# example
nulls <- ENMnulls(e,
                  mod.settings = list(fc = opt$fc, rm = opt$rm),
                  no.iter = 10,
                  user.eval = conf.and.cons,
                  eval.stats = c("cbi.val", "confidence", "consistency"),
                  userStats.signs = c("confidence" = 1, "consistency" = 1))
# here are the results of all the null iterations
null.results(nulls)
# and here are the comparisons between the null and empirical values for
# the evaluation statistics, including the z-score and p-value
# for more details, see Bohl et al. 2019
null.emp.results(nulls)
} # }