# Finding the ‘best’ habitat suitability model

*Submitted by editor on 5 March 2019.*

By Edward Gregr and Kai Chan

How is a habitat model’s complexity related to its forecasting skill? Our recent Ecography paper addressed this question using multiple data sets and models of increasing complexity.

Testing how well model predictions correspond with observations is a fundamental aspect of habitat suitability modelling. The first observations a model tries to predict are the ones used to build it. The resulting fit is often misrepresented as the model’s “predictive power”, but it is in fact principally a measure of how much of the variance in the observations is explained by the predictor data. As we show in our paper, this variance explained is only a suitable proxy for predictive power under two rather tenuous assumptions: that processes are consistent across space or time (stationarity), and that the data (both observations and predictors) are representative of the intended predictive space (representativity).

These assumptions are tenuous because all data have a particular context (i.e., specific sampling methods, shared biases, spatial and temporal extents), and stationarity in ecology is mostly a myth. If the same data are also used to test the model (e.g., through simple cross-validation), then measures of model fit will generally overestimate a model’s predictive power because the model and data share a context that may not fully represent the predictive space. The recommended approach, which we followed, is to evaluate model performance using ‘independent’ data (i.e., data from a different context). Using four independent data sets, we explored how models of differing complexity performed, to see whether we could identify an optimal model complexity. Along the way, we examined the benefits of cross-validation and came to understand much more fully what data independence means in the context of habitat modelling (see an earlier blog post).
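To make the difference between simple and blocked cross-validation concrete, here is a minimal sketch in plain Python. It is not the code used in the paper: the function names, the `years` list, and the one-fold-per-year rule are assumptions chosen purely for illustration.

```python
import random
from collections import defaultdict


def random_folds(n_records, k=3, seed=0):
    """Shuffle record indices and deal them into k folds of near-equal size."""
    rng = random.Random(seed)
    indices = list(range(n_records))
    rng.shuffle(indices)
    folds = [[] for _ in range(k)]
    for position, idx in enumerate(indices):
        folds[position % k].append(idx)
    return folds


def blocked_folds(groups):
    """Build one fold per distinct group label (here, a survey year).

    Records from the same group always stay together, so a model tested
    on one fold has never seen data from that group during fitting.
    """
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)
    return [by_group[g] for g in sorted(by_group)]


# Hypothetical survey years for six records (illustrative values only).
years = [2010, 2010, 2011, 2011, 2012, 2012]

print(random_folds(len(years), k=3))  # folds may mix survey years
print(blocked_folds(years))           # each fold holds a single survey year
```

With random folds, records from the same survey year usually land in both the training and the test portions, so the two share a context. Blocking on year removes that particular overlap, although the blocked data still come from the same survey programme and so are not independent in the sense used above.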

Our study focused on habitat suitability models for canopy kelp in the eastern North Pacific. The models ranged from a process-based model grounded in ecological understanding to complex generalised additive models built using purpose-collected survey data (Fig. 1). We compared model complexity and forecast skill using both cross-validation and independent data evaluation, and found that cross-validation can lead to over-fitting, while independent data evaluation clearly identified the appropriate model complexity for generating habitat forecasts in our model domain (Fig. 2). We also show why predictions from simpler models can sometimes out-perform those from more complex models.

Exploring different model structures and evaluating their predictions using independent data led to a reasonable model of potential kelp habitat for our study area. Our investigation of model performance shows the importance of using independent data to evaluate predictive models. We also show that, at least for small sample sizes, complex models tend to over-fit the data when accepted stopping conditions for model complexity are used. These are important considerations for practitioners, and we encourage the use of independent data for model evaluation where possible.

Figure 1. A series of heat maps (black = low suitability, red = high suitability) showing how with increasing model complexity (panels A through C) predictions become increasingly concentrated in shallow, high relief areas while reducing the amount of intermediate quality habitat predicted. Histograms above each map show how the predictions become increasingly bimodal with model complexity, an indication of how more complex models fit the data more closely.

Figure 2. With increasing model complexity (left to right), model fit rises for cross-validation (grey and black) but crashes for independent data (other colours) beyond a certain complexity. ‘AUC’ signifies area under the curve, a common measure of model fit. Cross-validation includes both random 3-fold and 3-fold blocked on survey year. Independent data include three years of classified remote sensing data (2005, 2006, and 2009) and a compilation of long-term kelp observations (LOS).
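For readers unfamiliar with AUC, the sketch below shows one way to compute it in plain Python. It assumes AUC here means the area under the receiver operating characteristic curve, the usual meaning in habitat modelling. The function name and the example values are illustrative only and are not taken from the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen presence (label 1)
    receives a higher suitability score than a randomly chosen absence
    (label 0), with ties counted as half a win.
    """
    presences = [s for y, s in zip(labels, scores) if y == 1]
    absences = [s for y, s in zip(labels, scores) if y == 0]
    if not presences or not absences:
        raise ValueError("AUC needs at least one presence and one absence")
    wins = 0.0
    for p in presences:
        for a in absences:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presences) * len(absences))


# Illustrative observations (1 = kelp present) and predicted suitabilities.
observed = [1, 1, 0, 0]
predicted = [0.9, 0.3, 0.4, 0.1]
print(auc(observed, predicted))  # 3 of the 4 presence/absence pairs are ranked correctly
```

A value of 0.5 means the predictions rank presences and absences no better than chance, and 1.0 means every presence outranks every absence. The same function can score both held-out folds and independent observations, which is the comparison the figure makes.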