Static species distribution models in the marine realm: the case of baleen whales in the Southern Ocean
This website presents the Supporting Information for El-Gabbas et al. (2021), accepted by Diversity and Distributions. Please cite this article as:
El-Gabbas et al. (2021) Static species distribution models in the marine realm: the case of baleen whales in the Southern Ocean. Diversity and Distributions. https://doi.org/10.1111/ddi.13300
For Supporting Information for Antarctic minke whale (Balaenoptera bonaerensis), click here.
For Supporting Information for Antarctic blue whale (Balaenoptera musculus intermedia), click here.
For Supporting Information for fin whale (Balaenoptera physalus), click here.
For Supporting Information for humpback whale (Megaptera novaeangliae), click here.
For more info, see: https://elgabbas.netlify.app/.
Abstract
Aim: Information on the spatiotemporal distribution of marine species is essential for developing proactive management strategies. However, sufficient information is seldom available at large spatial scales, particularly in polar areas. The Southern Ocean (SO) represents a critical habitat for various species, particularly migratory baleen whales. Still, the SO’s remoteness and sea ice coverage disallow obtaining sufficient information on baleen whale distribution and niche preference. Here, we used presence-only species distribution models to predict the circumantarctic habitat suitability of baleen whales and identify important predictors affecting their distribution.
Location: The Southern Ocean (SO)
Methods: We used Maxent to model habitat suitability for Antarctic minke, Antarctic blue, fin, and humpback whales. Our models employ extensive circumantarctic data and carefully prepared predictors describing the SO’s environment and two spatial sampling bias correction options. Species-specific spatial-block cross-validation was used to optimise model complexity and for spatially-independent model evaluation.
Results: Model performance was high on cross-validation, with generally little predicted uncertainty. The most important predictors were derived from sea ice, particularly seasonal mean and variability of sea ice concentration and distance to the sea ice edge.
Main conclusions: Our models support the usefulness of presence-only models as a cost-effective tool in the marine realm, particularly for studying the migratory whales’ distribution. However, we found discrepancies between our results and (within) results of similar studies, mainly due to using different species data quality and quantity, different study area extent, and methodological reasons. We further highlight the limitations of implementing static distribution models in the highly dynamic marine realm. Dynamic models, which relate species information to environmental conditions contemporaneous to species occurrences, can predict near-real-time habitat suitability, necessary for dynamic management. Nevertheless, obtaining sufficient species and environmental predictors at high spatiotemporal resolution, necessary for dynamic models, can be challenging from polar regions.
Table S1
Environmental predictors preparation
A) List of all initial variables and their derived predictors, unit, original temporal and spatial resolution, and data source.
B) List of 32 initially selected predictors based on data visualization and personal experience before excluding highly correlated predictors. The final list of predictors used to run the models is shown in Table 1 of the main text.
Table S2
The results of the cross-validated Maxent models.
ModelAll represents models run using all occurrences, while ModelUnique represents models run after removing duplicated occurrences within each 10×10 km cell. Model parameters and testing AUC columns show the best combination of feature classes (where ‘L’ linear, ‘Q’ quadratic, ‘H’ hinge, and ‘P’ product transformation) and regularization multiplier as well as the mean ± standard deviation of the testing AUC on spatial-block cross-validation. Block size represents the width of species- and model-specific spatial blocks. The number of occupied pixels represents the number of pixels (10×10 km) in the Southern Ocean with at least one sighting of each species.
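The duplicate removal behind ModelUnique and the "number of occupied pixels" count can be illustrated with a minimal sketch. It assumes sightings are already in a projected coordinate system with units of metres, that the grid origin is at (0, 0), and that the first sighting in each cell is the one kept; the function names and sample coordinates are hypothetical and are not taken from the authors' code.

```python
CELL_SIZE_M = 10_000  # 10 km cell width, assuming projected coordinates in metres


def cell_index(x, y, cell_size=CELL_SIZE_M):
    """Map a projected coordinate to the integer (column, row) of its grid cell."""
    return (int(x // cell_size), int(y // cell_size))


def unique_per_cell(points, cell_size=CELL_SIZE_M):
    """Keep only the first sighting that falls in each grid cell."""
    seen = set()
    kept = []
    for x, y in points:
        key = cell_index(x, y, cell_size)
        if key not in seen:
            seen.add(key)
            kept.append((x, y))
    return kept


# Hypothetical sightings: the first two share one cell, the others do not.
sightings = [(1200.0, 3400.0), (8700.0, 9100.0), (10500.0, 200.0), (-300.0, 50.0)]
unique_sightings = unique_per_cell(sightings)
occupied_pixels = len({cell_index(x, y) for x, y in sightings})
```

Under these assumptions, the number of retained sightings equals the number of occupied pixels, because each occupied cell contributes exactly one record.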
Table S3
Summary of the comparison between this study’s results and similar studies on Antarctic blue, fin, and humpback whales in the SO. Important predictors, as identified by this study, are shaded dark grey in the column header (> 5% permutation importance for the full models, in descending order). Cell colours represent the agreement between our study results and other studies: green, high agreement; orange, some disagreement; red, high disagreement. Empty cells indicate that the given predictor was not tested. Results for Antarctic minke whale are shown in Table 2.
Abbreviations used: SIC = sea ice concentration; SIE = sea ice edge; SSH = sea surface height; Chl-a = chlorophyll-a concentration; ✓ = similar results; ⊕ = positive relationship; ⊖ = negative relationship; imp. = importance; Dist. = distance; SD = standard deviation; Temp. = water temperature.
Figure S21: Maps of the 15 environmental predictors used to run the models. For more information on predictor abbreviations, see Table 1.
Figure S22: Pearson correlation coefficient between each pair of environmental predictors used to run the models. Highly correlated predictors were excluded in advance using variance inflation factor (see main text). The maximum value of the correlation coefficient is 0.71. Colours range from red (high negative correlation) to blue (high positive correlation). For more information on predictor abbreviations, see Table 1.
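As a rough illustration of the pairwise check summarised in this figure, the sketch below computes Pearson correlation coefficients between predictors and reports the strongest pair. The predictor names and values are hypothetical, and the variance inflation factor screening mentioned in the caption is not reproduced here.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def strongest_pair(predictors):
    """Return (|r|, name_1, name_2) for the most strongly correlated pair."""
    return max(
        (abs(pearson(predictors[p], predictors[q])), p, q)
        for p, q in combinations(sorted(predictors), 2)
    )


# Hypothetical predictor values sampled at the same five grid cells.
predictors = {
    "predictor_a": [0.0, 10.0, 20.0, 30.0, 40.0],
    "predictor_b": [400.0, 300.0, 200.0, 100.0, 0.0],
    "predictor_c": [3.0, 1.0, 4.0, 1.0, 5.0],
}
max_abs_r, name_1, name_2 = strongest_pair(predictors)
```

In this toy input, predictor_a and predictor_b are perfectly negatively correlated, so they would be flagged first; a set of predictors like the one in the figure would instead give a maximum absolute coefficient no higher than the reported 0.71.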
Figure S23: Boxplots comparing values of each environmental predictor south of the Polar Front with corresponding values at species-specific sightings. For more information on predictor abbreviations, see Table 1.
Figure S24: The spatial allocation of blocks into species- and model-specific five-fold cross-validation. Block colour indicates how blocks were distributed into cross-validation folds. ModelAll represents models run using all occurrences, while ModelUnique represents models calibrated after removing duplicated occurrences within each 10×10 km cell. The number below each map represents the size of each block in kilometres. Points represent species presence-only sightings used in this study.
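The idea of spatial-block cross-validation can be sketched as follows: sightings are grouped into square blocks, and whole blocks (not individual sightings) are assigned to folds, so that test data are spatially separated from training data. This is a simplified illustration, not the allocation procedure used in the study; it assumes projected coordinates in metres and assigns blocks to folds in simple round-robin order, whereas dedicated tools may balance the number of records per fold. All names and coordinates are hypothetical.

```python
def block_of(x, y, block_size_m):
    """Integer (column, row) index of the square block containing a point."""
    return (int(x // block_size_m), int(y // block_size_m))


def assign_folds(points, block_size_m, k=5):
    """Assign each point the fold of its block; blocks are dealt out round-robin."""
    blocks = sorted({block_of(x, y, block_size_m) for x, y in points})
    block_to_fold = {block: i % k for i, block in enumerate(blocks)}
    return [block_to_fold[block_of(x, y, block_size_m)] for x, y in points]


# Hypothetical sightings (metres) and a hypothetical 200 km block width.
points = [
    (50_000, 20_000),
    (60_000, 80_000),      # same block as the first point
    (250_000, 10_000),
    (-120_000, 30_000),
    (410_000, 390_000),
    (820_000, 100_000),
    (1_300_000, 40_000),
]
folds = assign_folds(points, block_size_m=200_000, k=5)
```

Because the first two points fall in the same block, they always receive the same fold and are never split between training and testing.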
Figure S25: Spatiotemporal biases in species observation data. The map to the left shows the number of sightings used in this study (log-scale) per 100×100 km grid cell. Note the high sampling bias towards the Antarctic Peninsula area and the absence of sightings from the majority of the Weddell Sea (dashed polygon). The plot to the right shows the number of sightings used in this study on each calendar day. There is an inevitable temporal bias in the visual observation data towards the summer months, particularly from the end of December to mid-April.
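The two summaries in this figure amount to simple tallies, which the sketch below illustrates: sightings counted per 100×100 km cell (with a log10 transform for display) and per calendar day. The records are hypothetical, coordinates are assumed to be projected metres, and "calendar day" is taken here as day of year, which shifts by one after February in leap years.

```python
from collections import Counter
from datetime import date
from math import log10


def grid_counts(records, cell_size_m=100_000):
    """Count sightings per square grid cell (cells indexed by column, row)."""
    return Counter(
        (int(x // cell_size_m), int(y // cell_size_m)) for x, y, _ in records
    )


def day_of_year_counts(records):
    """Count sightings per day of year (1 = 1 January)."""
    return Counter(d.timetuple().tm_yday for _, _, d in records)


# Hypothetical records: (x in metres, y in metres, observation date).
records = [
    (120_000, 40_000, date(2010, 1, 15)),
    (150_000, 90_000, date(2012, 1, 15)),
    (-20_000, 10_000, date(2011, 3, 1)),
]
per_cell = grid_counts(records)
per_cell_log = {cell: log10(n) for cell, n in per_cell.items()}
per_day = day_of_year_counts(records)
```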
Figure S26: The seasonal distribution of daily sea ice edge from 2002 to 2019. Here, seasons were determined as three-month intervals from January.
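The season definition stated in the caption (three-month intervals starting in January) can be written as a one-line mapping. The numeric season labels 1 to 4 are an assumption made for illustration; the caption does not name the seasons.

```python
from datetime import date


def season_index(day):
    """Return 1 for Jan-Mar, 2 for Apr-Jun, 3 for Jul-Sep, 4 for Oct-Dec."""
    return (day.month - 1) // 3 + 1


first_season = season_index(date(2002, 1, 1))
last_season = season_index(date(2019, 12, 31))
```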
Figure S27: Monthly distribution of daily sea ice edge from 2002 to 2019.
Figure S28: Comparison between distance to sea ice edge (SIE, left plots) or sea ice concentration (SIC, right plots) at species sightings, either spatiotemporally matched with their respective daily distance to SIE or SIC (x-axis) or spatially matched with the mean distance to summer SIE or SIC (y-axis, predictors used in this study). Horizontal and vertical grey lines in the left plots represent the location of SIE. The dashed grey line represents the identity (y=x relationship). Summarising highly dynamic environmental conditions (mean summer SIC or distance to summer SIE) clearly led to strong under- or overestimation of the SIC and distance to SIE actually experienced at the time of the sightings. This can greatly impact the performance of static models and their inferences in the highly dynamic environment of the SO.
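One way to quantify the departure from the identity line in such a comparison is to summarise the differences between the static (seasonal mean) value and the daily matched value at each sighting. The sketch below is an illustrative summary only, not an analysis from the study; the function name and the SIC values are hypothetical.

```python
def mismatch_summary(daily_values, static_values):
    """Summarise static-minus-daily differences at the same sightings."""
    diffs = [s - d for d, s in zip(daily_values, static_values)]
    n = len(diffs)
    return {
        "mean_error": sum(diffs) / n,                    # signed bias
        "mean_abs_error": sum(abs(v) for v in diffs) / n,
        "overestimated": sum(v > 0 for v in diffs),      # static above daily
        "underestimated": sum(v < 0 for v in diffs),     # static below daily
    }


# Hypothetical SIC values (%) at four sightings.
daily_sic = [0.0, 15.0, 80.0, 40.0]    # matched to the sighting date
static_sic = [20.0, 20.0, 35.0, 40.0]  # seasonal mean at the same location
summary = mismatch_summary(daily_sic, static_sic)
```

A mean error near zero can hide large individual mismatches, which is why the mean absolute error and the counts of over- and underestimated sightings are reported alongside it.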