Data stochasticity and model parametrisation impact the performance of species distribution models: insights from a simulation study

10.24072/pcjournal.263 - Peer Community Journal, Volume 3 (2023), article no. e34.


Species distribution models (SDM) are widely used to describe and explain how species relate to their environment and to predict their spatial distributions. As such, they are the cornerstone of most spatial planning efforts worldwide. SDM can be implemented with a wide array of data types (presence-only, presence-absence, count...), which can be either point- or areal-based, and can use a wide array of environmental conditions as predictor variables. The choice of the sampling type and of the resolution of the environmental conditions is recognized as crucially important, yet we lack any quantification of the effects these decisions may have on SDM reliability. In the present work, we fill this gap with an unprecedented simulation procedure. We simulated 100 possible distributions of two different virtual species in two different regions. Species distributions were modelled using either segment- or areal-based sampling and five different spatial resolutions of environmental conditions. SDM performance was inspected through statistical metrics, model composition, shapes of relationships and prediction quality. We provide clear evidence of stochasticity in the modelling process (particularly in the shapes of relationships): two datasets from the same survey, species and region could yield different results. Sampling type had stronger effects than spatial resolution on the final model relevance. The effect of coarsening the resolution was directly related to the resistance of the spatial features to changes of scale: SDM failed to adequately identify spatial distributions when the spatial features targeted by the species were diluted by resolution coarsening. These results have important implications for the SDM community, backing up some commonly accepted choices but also highlighting some previously unexpected features of SDM (stochasticity). As a whole, this work calls for carefully weighed decisions in implementing models, and for caution in interpreting results.
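
The dilution of spatial features by resolution coarsening can be illustrated with a toy example. The sketch below is not the authors' simulation procedure: the 16 x 16 grid, the 2 x 2 "narrow" patch, the "broad" east-west gradient, the block-averaging scheme and the helper names `coarsen` and `contrast` are all assumptions made purely for illustration. It coarsens both hypothetical environmental layers by averaging non-overlapping blocks and reports how much of the original contrast (maximum minus minimum) survives at each aggregation factor.

```python
def coarsen(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2D grid (list of lists)."""
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    coarse = []
    for r in range(0, n_rows, factor):
        row = []
        for c in range(0, n_cols, factor):
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


def contrast(grid):
    """Range of values in the grid: a crude measure of how visible a feature is."""
    values = [v for row in grid for v in row]
    return max(values) - min(values)


SIZE = 16  # hypothetical grid size, chosen only for this illustration

# Narrow feature: a single 2 x 2 patch of value 1 on a background of 0.
narrow = [[0.0] * SIZE for _ in range(SIZE)]
for i in range(6, 8):
    for j in range(6, 8):
        narrow[i][j] = 1.0

# Broad feature: a smooth gradient from 0 to 1 across the columns.
broad = [[j / (SIZE - 1) for j in range(SIZE)] for _ in range(SIZE)]

# Contrast remaining in each layer after coarsening by each factor.
results = {
    factor: (contrast(coarsen(narrow, factor)), contrast(coarsen(broad, factor)))
    for factor in (1, 2, 4, 8)
}

for factor, (narrow_contrast, broad_contrast) in results.items():
    print(f"factor {factor}: narrow={narrow_contrast:.4f}, broad={broad_contrast:.4f}")
```

In this toy setting the narrow patch loses most of its contrast once the blocks become larger than the patch itself, whereas the broad gradient keeps a larger share of its contrast at the same aggregation factor. This only mirrors the qualitative statement above about features that resist changes of scale and makes no claim about the magnitudes reported in the study.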

Published online:
DOI: 10.24072/pcjournal.263
Keywords: change of support, grain size, spatial resolution, GAM, grid-based model, segment-based sampling, point-based sampling, Modifiable Areal Unit problem
Lambert, Charlotte 1; Virgili, Auriane 2

1 Littoral ENvironnement et Sociétés UMR 7266 CNRS-LRUniv, 2 Rue Olympe de Gouges, 17000 La Rochelle, France
2 Observatoire Pelagis UAR 3462 CNRS-LRUniv, 5 allée de l’Océan, 17000 La Rochelle, France
License: CC-BY 4.0
Copyrights: The authors retain unrestricted copyrights and publishing rights
     author = {Lambert, Charlotte and Virgili, Auriane},
     title = {Data stochasticity and model parametrisation impact the performance of species distribution models: insights from a simulation study},
     journal = {Peer Community Journal},
     eid = {e34},
     publisher = {Peer Community In},
     volume = {3},
     year = {2023},
     doi = {10.24072/pcjournal.263},
     url = {}
Lambert, Charlotte; Virgili, Auriane. Data stochasticity and model parametrisation impact the performance of species distribution models: insights from a simulation study. Peer Community Journal, Volume 3 (2023), article no. e34. doi: 10.24072/pcjournal.263.

Peer reviewed and recommended by PCI: 10.24072/pci.ecology.100523

Conflict of interest of the recommender and peer reviewers:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.
