Disentangling good from bad practices in the selection of spatial or phylogenetic eigenvectors

18 December 2017

Bauman, David; Drouet, Thomas; Dray, Stephane; Vleminckx, Jason

Eigenvector mapping techniques are widely used by ecologists and evolutionary biologists to describe and control for spatial and/or phylogenetic patterns in their data. Selecting an appropriate subset of eigenvectors is a critical step, because misspecification can lead to highly biased results and interpretations, yet there is no consensus on how to proceed. We conducted a ten-year review of eigenvector selection practices and identified three main procedures: selecting the subset of descriptors that minimises the Akaike information criterion (AIC); forward selection with a double stopping criterion, applied after testing the significance of the global model (FWD); and selecting the subset that minimises the autocorrelation in the model residuals (MIR). We compared the type I error rates, statistical power, and R² estimation accuracy of these methods using simulated data. Finally, we analysed a real dataset with variation partitioning to illustrate how far the choice of selection approach affects the ecological interpretation of the results. We show that, while the FWD and MIR approaches had correct type I error rates and were accurate, the AIC approach displayed extreme type I error rates (100%) and strongly overestimated the R². Moreover, the AIC approach led to incorrect ecological interpretations, as it overestimated the pure spatial fraction (and, to a lesser extent, the joint spatial-environmental fraction) of the variation partitioning. Both the FWD and MIR methods performed well at broad and medium scales but had very low power to detect fine-scale patterns. The FWD approach selected more eigenvectors than the MIR approach but also returned more accurate R² estimates. Hence, we discourage any future use of the AIC approach and advocate choosing between the MIR and FWD approaches depending on the objective of the study: controlling for spatial or phylogenetic autocorrelation (MIR) or describing the patterns as accurately as possible (FWD).
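
To make the FWD procedure concrete, the following is a minimal Python sketch of a forward selection with a double stopping criterion that runs only after a significant global test. It is an illustration under stated assumptions, not the authors' implementation. It assumes a single (univariate) response, a matrix `E` with one candidate eigenvector per column and fewer columns than observations minus one, a pseudo-F statistic tested by permuting the residuals of the reduced model, and the convention that the candidate triggering a stop is not retained. The function names (`adj_r2`, `perm_pvalue`, `forward_select`) are hypothetical.

```python
import numpy as np


def _fit_rss(y, X):
    """Residual sum of squares and fitted values of an OLS fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ beta
    resid = y - fitted
    return float(resid @ resid), fitted


def adj_r2(y, X):
    """Adjusted R² of y regressed on the columns of X."""
    n, p = X.shape
    rss, _ = _fit_rss(y, X)
    tss = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - (rss / tss) * (n - 1) / (n - p - 1)


def perm_pvalue(y, X_base, X_new, n_perm, rng):
    """Permutation p-value for adding X_new to a model that already holds X_base.

    Assumption of this sketch: residuals of the reduced model are permuted
    and a pseudo-F statistic is recomputed on each permuted response.
    """
    X_full = np.column_stack([X_base, X_new])

    def pseudo_f(resp):
        rss_base, _ = _fit_rss(resp, X_base)
        rss_full, _ = _fit_rss(resp, X_full)
        return (rss_base - rss_full) / rss_full

    observed = pseudo_f(y)
    _, fitted = _fit_rss(y, X_base)
    resid = y - fitted
    hits = sum(
        pseudo_f(fitted + rng.permutation(resid)) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)


def forward_select(y, E, alpha=0.05, n_perm=199, seed=0):
    """Return the indices of the selected columns of E, in order of entry."""
    rng = np.random.default_rng(seed)
    n, k = E.shape

    # Step 1: global test on all candidates; select nothing if it fails.
    if perm_pvalue(y, E[:, :0], E, n_perm, rng) > alpha:
        return []
    r2_global = adj_r2(y, E)

    selected, remaining = [], list(range(k))
    while remaining:
        base = E[:, np.array(selected, dtype=int)]
        # Candidate that reduces the residual sum of squares the most.
        best = min(remaining, key=lambda j: _fit_rss(y, E[:, selected + [j]])[0])
        # Stopping criterion 1: the candidate is not significant at alpha.
        if perm_pvalue(y, base, E[:, [best]], n_perm, rng) > alpha:
            break
        # Stopping criterion 2: adjusted R² would exceed that of the global model.
        if adj_r2(y, E[:, selected + [best]]) > r2_global:
            break
        selected.append(best)
        remaining.remove(best)
    return selected
```

Calling `forward_select(y, E)` returns a list of column indices of `E`. The list is empty when the global test is not significant, which is the step that protects the procedure against inflated type I error rates.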