
Internal functions to tune and summarize results for ecological niche models (ENMs) iteratively across a range of user-specified tuning settings. See ENMevaluate for descriptions of shared arguments. Function tune() tunes ENMs across all tuning settings, with optional parallelization. Function cv.enm() calculates training and validation evaluation statistics for one set of specified tuning parameters.

Validation CBI (continuous Boyce index) is calculated here with background values rather than raster data. This standardizes the methodology for training and validation data under spatial partitions: ENMeval does not mask rasters to partition areas and therefore has no partitioned raster data. Further, predictions for occurrence and background localities are combined as input for the argument fit of ecospat::ecospat.boyce(), because the interval is determined from fit alone; if the test occurrences all had higher predictions than the background, an interval based on background predictions only would be cut short.
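
The effect of combining predictions can be sketched as follows. This is a minimal illustration, not the ENMeval implementation: the prediction vectors are made-up values, the names pred.bg.val, pred.occs.val and fit.combined are hypothetical, and the ecospat call is only attempted if that package happens to be installed.

```r
# Hypothetical suitability predictions (made-up values for illustration)
set.seed(1)
pred.bg.val   <- runif(200, min = 0.0, max = 0.6)  # validation background
pred.occs.val <- runif(20,  min = 0.7, max = 1.0)  # validation occurrences

# The interval is taken from the range of "fit" alone. With background
# predictions only, it stops below every occurrence prediction.
range(pred.bg.val)

# Combining both sets makes the interval cover the occurrence predictions too
fit.combined <- c(pred.bg.val, pred.occs.val)
range(fit.combined)

# Optional: compute the CBI when the ecospat package is available
if (requireNamespace("ecospat", quietly = TRUE)) {
  cbi <- ecospat::ecospat.boyce(fit = fit.combined, obs = pred.occs.val,
                                PEplot = FALSE)
}
```

In this toy case every occurrence prediction exceeds every background prediction, which is the situation described above where a background-only interval would be cut short.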

Usage

tune.train(
  enm,
  occs.z,
  bg.z,
  mod.full,
  tune.tbl.i,
  other.settings,
  partitions,
  quiet
)

tune.validate(
  enm,
  occs.train.z,
  occs.val.z,
  bg.train.z,
  bg.val.z,
  mod.k,
  nk,
  tune.tbl.i,
  other.settings,
  partitions,
  user.eval,
  quiet
)

tune(
  d,
  enm,
  partitions,
  tune.tbl,
  doClamp,
  other.settings,
  partition.settings,
  user.val.grps,
  occs.testing.z,
  numCores,
  parallel,
  user.eval,
  algorithm,
  updateProgress,
  quiet
)

cv.enm(
  d,
  enm,
  partitions,
  tune.tbl.i,
  doClamp,
  other.settings,
  partition.settings,
  user.val.grps,
  occs.testing.z,
  user.eval,
  algorithm,
  quiet
)

Arguments

enm

ENMdetails object

occs.z

data.frame: the envs (predictor variable) values at the coordinates of the occurrence records in the full dataset

bg.z

data.frame: the envs (predictor variable) values at the coordinates of the background records in the full dataset

mod.full

model object: the model trained on the full dataset

tune.tbl.i

vector: single set of tuning parameters

other.settings

named list: used to specify extra settings for the analysis. All of these settings have internal defaults, so if they are not specified the analysis will be run with default settings. See Details for descriptions of these settings, including how to specify arguments for maxent.jar.

partitions

character: name of partitioning technique (see ?partitions)

quiet

boolean: if TRUE, silence all function messages (but not errors).

occs.train.z

data.frame: the envs (predictor variable) values at the coordinates of the training occurrence records

occs.val.z

data.frame: the envs (predictor variable) values at the coordinates of the validation occurrence records

bg.train.z

data.frame: the envs (predictor variable) values at the coordinates of the training background records

bg.val.z

data.frame: the envs (predictor variable) values at the coordinates of the validation background records

mod.k

model object: the model trained on the training dataset, which is then evaluated on the validation data

nk

numeric: the number of folds (i.e., partitions) – will be equal to kfolds for random partitions
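
For random partitions, the relationship between fold assignment and nk can be sketched in base R. This is an illustration only and does not use ENMeval's own partitioning functions; n.occs, occs.grp and the fold-assignment approach are assumptions made for the example.

```r
set.seed(42)
n.occs <- 23   # hypothetical number of occurrence records
kfolds <- 5    # requested number of random folds

# Assign each record to one of kfolds groups, as evenly as possible
occs.grp <- sample(rep(seq_len(kfolds), length.out = n.occs))

# nk is the number of distinct partition groups
nk <- length(unique(occs.grp))

# For fold k, records in group k are validation data and the rest are training
k <- 1
val.idx   <- which(occs.grp == k)
train.idx <- which(occs.grp != k)
```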

user.eval

function: custom function for specifying performance metrics not included in ENMeval. The function must first be defined and then input as the argument user.eval. This function should have a single argument called vars, which is a list that includes different data that can be used to calculate the metric. See Details below and the vignette for a worked example.
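
A custom metric function might look like the sketch below. The metric itself is made up for illustration, and the element names occs.val.pred and bg.val.pred are assumptions about what vars contains; check the vignette for the actual contents before relying on them. The function is exercised here on a mock list rather than through ENMevaluate().

```r
# Hypothetical metric: mean validation-occurrence prediction minus
# mean validation-background prediction
mean.pred.diff <- function(vars) {
  # Assumed elements of vars: occs.val.pred and bg.val.pred
  data.frame(mean.pred.diff = mean(vars$occs.val.pred) - mean(vars$bg.val.pred))
}

# Mock input standing in for the list that would be passed internally
mock.vars <- list(occs.val.pred = c(0.8, 0.9, 0.7),
                  bg.val.pred   = c(0.2, 0.4, 0.3))
res <- mean.pred.diff(mock.vars)
res
```

Returning a one-row data frame with named columns keeps the result easy to bind alongside other per-partition statistics.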

d

data frame: data frame from ENMevaluate() with occurrence and background coordinates (or coordinates plus predictor variable values) and partition group values

tune.tbl

data frame: all combinations of tuning parameters
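
As a rough sketch of what such a table can look like, the example below builds every combination of two tuning parameters with base R. The parameter names fc and rm and their values are assumptions chosen for illustration (they resemble Maxent-style settings), and the construction shown is not necessarily how ENMeval builds the table internally.

```r
# Hypothetical tuning settings
tune.args <- list(fc = c("L", "LQ"), rm = 1:3)

# One row per combination of settings
tune.tbl <- expand.grid(tune.args, stringsAsFactors = FALSE)
tune.tbl

# A single set of tuning parameters, comparable in role to tune.tbl.i
tune.tbl.i <- tune.tbl[1, ]
```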

doClamp

boolean: if TRUE (default), model prediction extrapolations will be restricted to the upper and lower bounds of the predictor variables. Clamping avoids extreme predictions for environment values outside the range of the training data. If free extrapolation is a study aim, this should be set to FALSE, but for most applications leaving this at the default of TRUE is advisable to avoid unrealistic predictions. When predictor variables are input, they are clamped internally before making model predictions when clamping is on. When no predictor variables are input and data frames of coordinates and variable values are used instead (SWD format), validation data is clamped before making model predictions when clamping is on.
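
The idea behind clamping can be illustrated for a single predictor in base R. This is a simplified sketch with made-up values, not the ENMeval clamping code; train.vals and val.vals are hypothetical names.

```r
# Hypothetical values of one predictor variable
train.vals <- c(5, 8, 12, 15, 20)  # training data
val.vals   <- c(2, 10, 25)         # validation data, partly out of range

# Bounds are taken from the training data only
lower <- min(train.vals)
upper <- max(train.vals)

# Values below the lower bound or above the upper bound are set to that bound
val.clamped <- pmin(pmax(val.vals, lower), upper)
val.clamped
```

Values inside the training range pass through unchanged, so only the extrapolated part of the prediction is affected.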

partition.settings

named list: used to specify certain settings for partitioning schema. See Details and ?partitions for descriptions of these settings.

user.val.grps

matrix / data frame: user-defined validation record coordinates and predictor variable values. This is used internally by ENMnulls() to force each null model to evaluate with empirical validation data, and does not have any current use when running ENMevaluate() independently.

occs.testing.z

data.frame: when fully withheld testing data is provided, the envs values for the coordinates at the testing occurrence records

numCores

numeric: number of cores to use for parallel processing. If NULL, all available cores will be used.

parallel

boolean: if TRUE, run with parallel processing.
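
One way the NULL default for numCores could be resolved is sketched below using the parallel package that ships with R. The helper resolve.cores() is hypothetical and not part of ENMeval; the fallback to a single core is an added assumption, since parallel::detectCores() can return NA on some platforms.

```r
# Hypothetical helper, not an ENMeval function
resolve.cores <- function(numCores = NULL) {
  if (is.null(numCores)) {
    # Use all cores the system reports
    numCores <- parallel::detectCores()
    # Assumed fallback when the core count cannot be determined
    if (is.na(numCores)) numCores <- 1
  }
  numCores
}

resolve.cores(2)   # an explicit value is returned unchanged
resolve.cores()    # NULL resolves to the detected core count
```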

algorithm

character: name of the algorithm used to build models. Currently one of "maxnet", "maxent.jar", or "bioclim", else the name from a custom ENMdetails implementation.

updateProgress

boolean: if TRUE, use shiny progress bar. This is only for use in shiny apps.