Research Article
Detailed multivariate comparisons of environments with mobility oriented parity
Marlon E. Cobos, Hannah L. Owens§|, Jorge Soberón, A. Townsend Peterson
‡ University of Kansas, Lawrence, United States of America
§ University of Copenhagen, Copenhagen, Denmark
| University of Florida, Gainesville, United States of America
¶ University of Kansas, Lawrence, Kansas, United States of America
Open Access

Abstract

Multivariate model projections onto conditions that differ from those under which the models were calibrated can result in uncertainty owing to response extrapolation. Therefore, interpretations of models transferred to conditions not analogous to those of model calibration must be done carefully to avoid misleading conclusions. A good practice when producing model transfers is to assess the degree of dissimilarity between calibration and transfer conditions. The mobility oriented parity (MOP) metric is a tool commonly used to quantify such dissimilarities. Our study elucidated the details of MOP, and expanded the interpretability of results when transfer conditions are not analogous to those of model calibration. We implemented a more detailed characterization of non-analogous conditions previously only identified as outside calibration ranges. In addition, we quantified the sensitivity of MOP to the density and position of reference conditions in environmental space, and explored the effect of sample size on MOP results. We show that detailed characterizations of non-analogous conditions can help to determine which variables are different between the two areas and how. This information can be used to understand causes of environmental dissimilarity and to identify variables that could be excluded from ecological niche modeling owing to extrapolation risks associated with them. Our results also show that dissimilarity values calculated in comparisons offer good indicators of how reliable interpretations can be in areas with conditions not analogous to those of reference. The tools developed and implemented in this study provide a useful, quantitatively efficient means of characterizing extrapolation uncertainty in model transfer studies, a best practice when using multivariate ecological niche models to make predictions about species’ distributions.

Highlights

  • An extended version of the mobility oriented parity metric helps identify variables for which risks of model extrapolations can be expected.

  • Distinct MOP metrics used to characterize dissimilarity show comparable general patterns in transfer areas, despite differences in values.

  • As MOP results are influenced by background sampling size, the definition of this sample should be aimed to appropriately characterize calibration areas.

Keywords

Calibration area, ecological niche modeling, extrapolation risks, model extrapolation, model transfer, MOP

Introduction

The ability to transfer predictions of ecological models to scenarios distinct from those under which they were calibrated is a crucial feature that allows researchers to make inferences about a phenomenon under different circumstances (Klepper 1997). Ecological niche models (ENMs) are not an exception: ENM projections facilitate prediction of suitability under distinct environmental scenarios (in space and/or time; hereafter referred to as “model transfers”; Yates et al. 2018). This is a key characteristic for ENM applications in which understanding patterns and dynamics of species’ distribution ranges is required (Colwell et al. 1994; Jiménez-Valverde et al. 2011; Lemes and Loyola 2013; Pecl et al. 2017). For instance, model transfers are critical in conservation planning, allowing researchers to answer questions such as how the distributional potential of species will shift in the face of climate change (e.g., Simões et al. 2021), or how likely a biological invasion is to occur given the introduction of a species (e.g., Nuñez-Penichet et al. 2021). However, model transfers are not simple to interpret, as the robustness of projections depends on multiple factors (Yates et al. 2018; Qiao et al. 2019), particularly how model transfers are done when environmental conditions are not analogous to those of model calibration (Elith et al. 2010; Rödder and Engler 2012; Peterson et al. 2018; Fig. 1).

Figure 1. 

Schematic representation of an ecological niche modeling exercise, potential scenarios of extrapolation in model transfers, and calculation of the mobility oriented parity metric.

To aid in interpreting model transfers, various tools have been developed to identify environmental dissimilarity between calibration areas and projections to other areas or time periods (e.g., Zurell et al. 2012; Owens et al. 2013; Mesgaran et al. 2014; Velazco et al. 2023). Areas with conditions outside the environmental spectrum used in model calibration are regions in which extrapolation may produce non-reliable results, especially if response curves are truncated (Peterson et al. 2011; Fig. 1). In addition, the irregular arrangements and patterns of environmental combinations associated with geographic areas (Colwell and Rangel 2009) can generate situations in which risks from combinatorial extrapolation can be expected (Mesgaran et al. 2014). That is, model extrapolation can happen in conditions that are outside calibration ranges of one or multiple variables, but also in combinatorial conditions of two or more variables that have not been considered, even if they are inside univariate calibration ranges (Fig. 1). These factors make it difficult to anticipate and understand dissimilarities between calibration and transfer conditions. One method for identifying non-analogous conditions in model transfer regions is the mobility oriented parity metric (MOP; Owens et al. 2013). This metric identifies non-analogous environmental conditions in transfer areas (hereafter conditions of interest) that are outside the ranges of calibration conditions (hereafter reference conditions), and calculates environmental dissimilarity between conditions of interest and those of reference.

To account for the irregularity of environmental spaces (see example, Fig. 1), MOP measures multivariate environmental distances between conditions of interest and a predefined percentage of the reference conditions that is environmentally closest to the relevant region of conditions of interest (Fig. 1). In this regard, MOP, similar to metrics such as area of applicability (AOA; Meyer and Pebesma 2021) and Shape (Velazco et al. 2023), focuses on how distinct the conditions of interest are from the reference conditions, in sharp contrast to other metrics which compare the conditions of interest only to the multivariate centroid of reference conditions (e.g., Elith et al. 2010; Mesgaran et al. 2014). However, by calculating distances from a percentage of the points in the reference conditions, MOP is the only metric that considers the density of reference conditions in concert. As such, in this contribution, we concentrate on understanding and improving the MOP metric.
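The distance calculation described above can be sketched in a few lines. This is a minimal illustration in Python, not the authors' R implementation: the function name `mop_distance`, the use of the mean as the summary of the closest distances, the rounding up of the number of closest points, and the toy reference cloud are all assumptions made for the example.

```python
import math

def mop_distance(point, reference, percent=10.0):
    """Mean Euclidean distance from `point` to the closest `percent` of reference points.

    Hypothetical helper: summarizing with the mean and rounding the number of
    closest points up are assumptions of this sketch.
    """
    # Distance from the condition of interest to every reference condition.
    distances = sorted(math.dist(point, ref) for ref in reference)
    # Number of nearest reference conditions to keep (at least one).
    n_closest = max(1, math.ceil(len(distances) * percent / 100.0))
    return sum(distances[:n_closest]) / n_closest

# Toy reference cloud in a two-variable environmental space (made-up values).
reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (5.0, 5.0)]

# A condition of interest inside the dense part of the cloud, and one far from it.
near = mop_distance((0.5, 0.5), reference, percent=40)
far = mop_distance((10.0, 10.0), reference, percent=40)
```

Because only the closest fraction of the reference cloud enters the summary, `near` is governed by the four clustered points, whereas `far` is governed by the isolated point at (5, 5) and one of the clustered points.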

A limitation of how MOP identifies environments outside reference conditions is that it does not identify which variables are outside such conditions. Non-analogous environmental conditions can be found for single or multiple variables in the same geographic area, yet MOP results only show that dissimilarity exists, not which variables are dissimilar. In addition, dissimilar environmental conditions can be at the high or low ends of a variable, information also missing from MOP outputs. Elith et al. (2010) had earlier proposed a solution to this problem called the multivariate environmental similarity surface (MESS), which identifies which variable is the furthest outside its reference range in a transfer area. Another approach, termed ExDet (Mesgaran et al. 2014), identifies which variable is the furthest outside the range of single or multiple variables, though it does not identify which other variables are contributing to the dissimilarity. Hence, no existing method for estimating dissimilarity between conditions of interest and those of reference identifies variables with values outside the reference set, distinguishes whether extreme variable values in transfer regions are higher or lower than ranges of reference, and indicates which variable combinations produce novel conditions (Figs 1, 2).

A further aspect of model transfer metrics that has yet to be assessed is the effect of point-density patterns in the cloud of reference environmental conditions (conditions for model calibration) on model transfers. As distances are measured from each point in the conditions of interest to a closest set of points in the reference conditions, density patterns (i.e., how tightly clustered points are in an environmental region) in reference conditions may affect the results obtained. That is, conditions of interest close to high-density reference regions would yield lower dissimilarity values than those close to low-density reference regions, because points in high-density regions are closer to one another than those in low-density regions.

Another complication that may affect MOP is the number of background or pseudo-absence points used to characterize the study area (Iturbide et al. 2018). The sample size used to represent the calibration area (reference conditions in environmental space) can affect both ENM and MOP results. Larger sample sizes can be better at representing extreme values from the calibration area (Guevara et al. 2018). The size of the sample can also influence the density patterns of reference conditions in environmental space. The effects of samples representing reference conditions have been explored for ENMs before, and they cannot be neglected (Barbet-Massin et al. 2012). Given how the MOP metric is calculated, sample size should also have a noticeable effect on the detection of conditions outside ranges and on dissimilarity values. However, the effect of sample size on MOP results has not been explored before.

MOP and other metrics of environmental similarity between conditions of interest and those of reference offer crucial information when interpreting the results of model transfers. If environments in a transfer region are similar to those of the reference region, interpretations of model predictions are straightforward. On the other hand, if the conditions are non-analogous, interpretation of prediction results should be done carefully. More details about what causes dissimilarity and how large it is can help in delimiting areas where interpretations are safe, and may change the way decisions are made during the modeling process. Our general goal with this study was to improve and enrich the information provided by MOP, permitting more detailed interpretations of results. To this end, we (1) extended MOP to provide information on non-analogous variable identity and the nature of their extremity compared to the reference region, (2) explored sensitivity of MOP to parameter settings and differences in the distribution (point density patterns) of reference conditions, and (3) explored effects of the number of points sampled from the set of reference conditions (sample size) when comparing conditions of interest to those of reference (Fig. 2).

Figure 2. 

Schematic representation of the examples used and challenges performed in this study. Geographic regions representing calibration and transfer areas are shown for the example cases.

Methods

Example cases and data

We performed three computational challenges with MOP, described below, using data for two geographic scenarios: (1) a calibration area represented by the Lower 48 United States (hereafter “USA”) and a broad transfer area within the extent (60°S–80°N, 135–35°W), which covers most of the Americas (hereafter “the Americas”), and (2) an area for model calibration for the bird Transvolcanic Jay (Aphelocoma ultramarina; hereafter “jay accessible area”), obtained via dispersal simulations by Machado-Stredel et al. (2021) and a transfer area representing the entire country of Mexico (Fig. 2). The first case is to exemplify a general, multivariate, comparison between reference conditions (USA) and conditions of interest (the Americas). This initial example is not linked to a particular species, but rather represents an ideal example for demonstrating the analysis proposed. The second case exemplifies an application comparable to that used in an ENM exercise, in which understanding extrapolation risks in a prediction is needed. In this second case, an ENM would be calibrated in the jay accessible area and projected to Mexico. In this particular case, MOP can be used to identify areas with environmental conditions outside calibration ranges (non-analogous conditions) and to establish which variables present values outside such ranges. Interpretation of predictions in areas with non-analogous conditions should be done carefully; model extrapolation can result in false positives (Type I error), whereas truncation can result in false negatives (Type II error, Fig. 1).

Environmental conditions comprised bioclimatic variables from the WorldClim database (v2, www.worldclim.org; Fick and Hijmans 2017) at 10’ (~17 km at the Equator) resolution. These variables were: maximum temperature of warmest month (bio5), minimum temperature of coldest month (bio6), temperature annual range (bio7), precipitation of wettest month (bio13), precipitation of driest month (bio14), and precipitation seasonality (bio15). These variables are commonly used as predictors in ENM applications with diverse taxa, and are considered to be biologically relevant as they represent environmental extremes and seasonality (Petitpierre et al. 2017; Cobos and Peterson 2023). We used spatial polygons from the GADM database (v4.1; www.gadm.org) to mask variables in our areas of interest (i.e., calibration and transfer areas). We downloaded and prepared data for analysis using the packages geodata (Hijmans et al. 2023) and terra (Hijmans 2023) in R 4.3 (R Core Team 2023). The three challenges are described below; R code to reproduce all analyses is available at https://github.com/marlonecobos/new_MOP.

Challenge 1: More in-depth understanding via the MOP metric

In our first challenge (Fig. 2), we ran MOP analyses to illustrate our four new metrics that aid in detection of non-analogous conditions using both example cases (USA vs the Americas, and jay accessible area vs Mexico). We included the six bioclimatic variables for these analyses. To extend the information included in MOP results, we created a means to track variable identities when characterizing conditions of interest outside ranges of reference conditions (Fig. 1). Using this information, we generated additional raster layers for the transfer area: (1) a layer showing areas presenting non-analogous environmental conditions; (2) a layer representing the number of variables having values outside reference ranges; (3) two separate raster layers for each variable, one showing areas with conditions above maximum reference values and one showing areas with conditions below minimum reference values; and (4) two extreme variable-combination summary layers, one showing areas with combinations of variable values higher than maximum reference values, and the other showing areas with combinations of variable values lower than minimum reference values. In addition, we provided options to scale and center variables before analysis, to choose between Euclidean or Mahalanobis distances in calculations, and to re-scale MOP scores to a 0–1 range. MOP re-scaling was done using maximum and minimum values as reference, as follows: $RD_i = (D_i - D_{\min}) / (D_{\max} - D_{\min})$, where $D_i$ is the MOP distance for condition of interest $i$, $D_{\min}$ and $D_{\max}$ are the minimum and maximum MOP distances, and $RD_i$ is the re-scaled distance.
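The range checks and the re-scaling described in this challenge can be illustrated with a short sketch. The Python code below is only an approximation of the idea, not the code used in the study: the function names, the dictionary-based data layout, the underscore-joined combination labels, and the example values are all hypothetical.

```python
def reference_ranges(reference):
    """Per-variable (minimum, maximum) of the reference conditions."""
    names = reference[0].keys()
    return {v: (min(r[v] for r in reference), max(r[v] for r in reference)) for v in names}

def out_of_range(condition, ranges):
    """Names of variables below the reference minimum and above the reference maximum."""
    below = sorted(v for v, (lo, hi) in ranges.items() if condition[v] < lo)
    above = sorted(v for v, (lo, hi) in ranges.items() if condition[v] > hi)
    return below, above

def rescale(distances):
    """Re-scale distances to 0-1 as (D_i - D_min) / (D_max - D_min)."""
    d_min, d_max = min(distances), max(distances)
    if d_max == d_min:
        # All distances are equal; return zeros to avoid dividing by zero.
        return [0.0 for _ in distances]
    return [(d - d_min) / (d_max - d_min) for d in distances]

# Made-up reference conditions (one dictionary per cell in the calibration area).
reference = [
    {"bio5": 25.0, "bio6": -5.0, "bio13": 80.0},
    {"bio5": 35.0, "bio6": 5.0, "bio13": 200.0},
]
ranges = reference_ranges(reference)

# One made-up condition of interest from the transfer area.
cell = {"bio5": 38.0, "bio6": -12.0, "bio13": 150.0}
below, above = out_of_range(cell, ranges)

n_outside = len(below) + len(above)   # number of variables outside reference ranges
low_combination = "_".join(below)     # label for the low-end variable combination
high_combination = "_".join(above)    # label for the high-end variable combination

scaled = rescale([2.0, 4.0, 6.0])
```

Applied to every cell of a transfer area, `n_outside`, the per-variable flags, and the two combination labels would correspond conceptually to the four kinds of layers listed above.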

Challenge 2: Effects of density patterns in reference conditions

We designed our second exploration to illustrate the effects of reference condition density patterns in environmental space on MOP results (Fig. 2). We selected six regions of interest within a two-dimensional environmental space (precipitation of driest month vs. minimum temperature) representing the transfer area (cloud of conditions of interest, Fig. 3). These regions were selected using rectangular polygons of equal area (blocks) as follows: (Block 1) inside the cloud of points of reference, close to a region of this cloud with high density of points; (Block 2) inside the reference cloud, close to a region of this cloud with low density of points; (Block 3) outside the reference cloud, but within reference ranges, and close to a region of the reference cloud of high density of points; (Block 4) outside the reference cloud, within reference ranges, and close to an area of low density of points; (Block 5) outside but not too far from the reference cloud, within the range of only one variable; (Block 6) outside and far from the reference cloud, outside of the ranges of the two variables. MOP dissimilarity values (distances) from each region of interest were summarized and compared graphically. We used the package biosurvey to generate the regions of interest in environmental space (Nuñez-Penichet et al. 2022); comparisons were done using base functions in R.

Figure 3. 

Representation of environmental spaces of the example cases in two dimensions. Reference conditions (conditions in calibration areas) and conditions of interest (conditions in transfer areas) are represented as points and density kernels. Regions of the environmental space of relevance for analysis are highlighted and described in the conditions of interest of one of the examples. USA = Lower 48 United States of America.

Dissimilarity measurements are influenced by the method used to calculate distances as well as by the way data are transformed before performing analysis (Faith et al. 1987). To account for these factors, we ran MOP analyses changing the type of distance used (Euclidean or Mahalanobis), and whether or not variable values were scaled before analyses. To scale variables, we subtracted their means and divided them by their standard deviation (i.e., centered and standardized them). MOP also allows users to change the percentage of points in calibration areas used as a reference for distance calculations (e.g., 5 or 1%). Changing this percentage modifies the results, as larger percentages tend to result in larger distances, especially in conditions of interest that are close to or around regions with low density of reference conditions. For this reason, we also ran analyses with reference percentages of 10, 5, and 1%. We performed all these analyses and explored results in the six regions of interest defined before because changes in the parameters described could influence the effect of reference condition density patterns on MOP results. Analyses were done only for the example case comparing the USA and the Americas, and using only two variables (minimum temperature of coldest month and precipitation of wettest month), to facilitate interpretation via graphical representations in environmental space. We ran all analyses twice, once keeping the raw distance results, and a second time re-scaling distances to between 0 and 1.
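To make the role of scaling and distance type concrete, the following Python sketch compares raw Euclidean, scaled Euclidean, and Mahalanobis distances for two variables. It is an illustration under stated assumptions rather than the procedure used in the study: the helper names are hypothetical, the data are invented, and means, standard deviations, and the variance-covariance terms are assumed to be computed from the reference values.

```python
import math
import statistics

def make_scaler(values):
    """Return a function that centers on the mean and divides by the standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return lambda v: (v - mean) / sd

def covariance_2d(xs, ys):
    """Sample variance-covariance terms (sxx, sxy, syy) for two variables."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    n = len(xs)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return sxx, sxy, syy

def mahalanobis_2d(a, b, cov):
    """Mahalanobis distance between points a and b given (sxx, sxy, syy)."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = a[0] - b[0], a[1] - b[1]
    # Quadratic form with the inverse of the 2 x 2 variance-covariance matrix.
    return math.sqrt((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)

# Made-up reference values: temperature (degrees C) and precipitation (mm).
temp = [-10.0, 0.0, 10.0, 20.0]
precip = [100.0, 300.0, 200.0, 400.0]
scale_t, scale_p = make_scaler(temp), make_scaler(precip)
cov = covariance_2d(temp, precip)

centroid = (5.0, 250.0)   # centroid of the made-up reference values
warmer = (20.0, 250.0)    # differs by 15 degrees C only
wetter = (5.0, 265.0)     # differs by 15 mm only

def scaled_point(p):
    return (scale_t(p[0]), scale_p(p[1]))

raw = [math.dist(centroid, p) for p in (warmer, wetter)]
scaled = [math.dist(scaled_point(centroid), scaled_point(p)) for p in (warmer, wetter)]
maha = [mahalanobis_2d(centroid, p, cov) for p in (warmer, wetter)]
```

With raw values the two differences look identical (15 units each), whereas both scaled Euclidean and Mahalanobis distances treat the 15-degree difference as larger than the 15-mm difference because temperature varies much less than precipitation in the reference values.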

Challenge 3: Effect of calibration area sample size on MOP results

Finally, we tested the effect of sample size (number of points used to represent the reference conditions in the calibration area) on MOP results using four different samples: an initial analysis with the entire set of calibration area data (40,207 cells), and three subsequent subsets (20,000, 10,000, and 5000 cells, respectively). The size of the sample of reference conditions can determine whether rare environmental combinations are included or not in analysis, not only for MOP but also for ENM applications. Therefore, MOP and ENM results can vary depending on this factor. We tested distinct sample sizes to identify if this parameter can alter results and interpretations of MOP outputs. Analysis times on a laptop computer (Intel Xeon, 2.8 GHz, 8 processors, 16 GB of RAM; 4 processors in parallel) were recorded to serve as a benchmark for analysis optimization. We ran MOP analyses using Euclidean distances for the example case that compares the USA and the Americas, using the set of six variables used in the first challenge, scaled and centered before analysis. Distances were measured in reference to the closest 10% of the conditions in the calibration area.
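The reasoning behind this challenge can be sketched with simulated data. The Python example below is a hypothetical illustration, not a reproduction of the analysis: the values are random draws from arbitrary normal distributions, a single variable is used, and the function name and seed are assumptions.

```python
import random

def count_outside(conditions, reference):
    """Count conditions of interest below the minimum or above the maximum of reference values."""
    lo, hi = min(reference), max(reference)
    return sum(1 for c in conditions if c < lo or c > hi)

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Simulated values of one variable in the calibration and transfer areas (arbitrary parameters).
calibration = [rng.gauss(15.0, 5.0) for _ in range(40207)]
transfer = [rng.gauss(18.0, 8.0) for _ in range(5000)]

outside_by_size = {}
for size in (40207, 20000, 10000, 5000):
    # Use every calibration value for the full set; otherwise draw a random subset.
    sample = calibration if size == len(calibration) else rng.sample(calibration, size)
    outside_by_size[size] = count_outside(transfer, sample)
```

Because the range of any subsample cannot be wider than the range of the full set, the number of conditions flagged as outside reference ranges can only stay the same or grow as the sample shrinks, and it grows whenever rare extreme values are left out of the sample.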

Results

In challenge one, we attempted to detect regions of the transfer areas where risks from model extrapolation are higher based on more detailed characterizations of non-analogous conditions (Fig. 4). Our results showed where one or multiple variables presented non-analogous conditions, and which variables presented such novel conditions, considering the lower or upper end of their range (Fig. 4; Suppl. material 1: fig. S1). For the example case of USA vs the Americas, we found novel conditions towards the low end of variables: at high latitudes for variables representing minimum (bio6) and maximum (bio5) temperature, close to the Equator for annual range of temperature (bio7), and in the Atacama Desert for variables representing precipitation (bio13 and bio15). In the same example case, novelty towards the upper values of most variables was found at low latitudes in tropical areas, with extensions at higher latitudes in dry areas for precipitation seasonality (bio15), and areas with extreme winter seasons for annual range of temperature (bio7). For the example case comparing the jay accessible area vs Mexico, novel conditions towards the lower end were found mostly in the northern and northeastern parts of the country associated mainly with precipitation of the wettest month (bio13). Novel conditions towards the upper end of variables, in this example, were associated primarily with annual range of temperature (bio7) in northern parts of the country.

In challenge two, we explored the effects of parameter settings and density patterns in reference conditions on dissimilarity results. MOP dissimilarity values (distances) obtained for the conditions of interest (Figs 5, 6; Suppl. material 1: figs S2–S5, tables S1, S2) showed the following general patterns: all MOP results showed similar patterns of distance among the regions of interest tested; using Euclidean distances resulted in slightly more marked differences among regions of interest close to the reference cloud; and density of reference environments had direct effects on the distances obtained. These general patterns did not change after rescaling MOP distances to between 0 and 1 (Fig. 6; Suppl. material 1: figs S3–S5). Distances measured for conditions of interest located inside the reference cloud were smaller when reference condition point density was high (Block 1) compared to when density was low (Block 2). Distances for conditions of interest outside the reference cloud but inside reference ranges were larger than for conditions of interest inside both the reference cloud and ranges (Blocks 1–4). Distances to Block 5 (conditions outside both the reference cloud and ranges) were smaller than those to Block 4 (outside the reference cloud, inside ranges) owing to a higher density of reference conditions close to Block 5. Distances to Block 6 were always larger than distances to Block 5, which accords with their position relative to the reference cloud.

Reducing the percentage of calibration points used as a reference to measure distances decreased dissimilarity in projection environments inside reference ranges (Figs 5, 6; Suppl. material 1: figs S3–S5, tables S1, S2). Rescaling MOP dissimilarity to between 0 and 1 allowed better comparison of distances obtained across our distinct treatments, highlighting the ability of Euclidean distances to differentiate among regions close to the reference cloud (Fig. 5). Our results suggested that scaled MOP distances were preferable in identifying which conditions of interest were inside and outside the reference cloud; this was more notable when using scaled variables and 10% of the reference cloud as a reference for calculations compared to using a smaller percentage of the reference cloud.

In challenge three, we assessed the effect of calibration area sample size on MOP results. The most noticeable effect of background sample size was detected in the areas identified as non-analogous (Fig. 7). Non-analogous areas increased slightly in MOP outputs obtained with samples compared with those obtained with the entire set of values in the calibration area; these small changes were more noticeable in tropical areas. Perhaps the most dramatic change was the increase in the number of variables detected to have non-analogous conditions in the Amazon region, especially when using a sample of 5000 points (Fig. 7; Suppl. material 1: fig. S6). The increase in areas with non-analogous conditions was detected towards the low end of annual range of temperature (bio7; Suppl. material 1: fig. S6). On the other hand, changes in patterns of dissimilarity (distance) across the projection area were not as noticeable. Times required to perform these analyses were as follows: 140.8 s for the entire calibration area, 84.4 s for sample size 20,000, 34.5 s for 10,000, and 19.8 s for 5000.

Figure 4. 

Extended results from the mobility oriented parity metric in areas with non-analogous conditions. Information on how many variables are outside conditions of reference and the identity of such variables are presented.

Figure 5. 

Re-scaled dissimilarity (distance) results from the mobility oriented parity (MOP) metric for the example case comparing the United States and the Americas using two variables. MOP analyses were performed using Euclidean or Mahalanobis distances, raw or scaled variables, and with reference to the closest 10, 5, or 1% of environmental conditions in the calibration area. Environmental regions of interest (blocks) are highlighted in red. Color palette is shared among all panels in the figure.

Figure 6. 

Dissimilarity values (distances, re-scaled 0–1) for regions of interest in environmental space (blocks), obtained from the mobility oriented parity metric (MOP). Analyses were performed using two variables, Euclidean or Mahalanobis distances, raw or scaled variables, and with reference to the closest 10, 5, or 1% of environmental conditions in the calibration area.

Figure 7. 

Results from the mobility oriented parity metric performed with samples of the calibration area of distinct sizes (40,207; 20,000; 10,000; and 5,000). Analyses were done for the example that compares the Lower 48 United States and the Americas, using 10% of the reference conditions to get mean distances.

Discussion

Understanding how a model extrapolates under non-analogous conditions can be challenging owing to complications involved in multivariate predictions and the combination of novel conditions in the transfer scenarios (Qiao et al. 2019). ENM/SDMs are important tools that are applied to diverse questions in conservation, biological invasions, public health, and other fields, such that implications of model transfers that derive purely from extrapolation cannot be accepted without question (Yates et al. 2018). As one of the tools available to explore such phenomena, MOP has played an important role in helping researchers to identify areas where interpretation should be done carefully. However, interpretation of the results of this metric has been limited by the simplistic way in which non-analogous conditions were identified. These original representations did not provide details on which variables showed conditions outside reference ranges, which restricted our understanding of why these risks existed and, perhaps, how to avoid them. Here, we proposed and explored several ways to expand this metric to elucidate how non-analogous conditions are characterized. In addition, we performed detailed explorations of the effects of changing calculation parameters, which were previously not well understood. Our additions to the MOP algorithm improved interpretability of results considerably, especially in transfer areas (conditions of interest) with novel conditions. Our results can also guide researchers in ways to parameterize MOP analysis depending on the final goals of their studies.

The information produced with the updated MOP metric regarding which variables present non-analogous conditions, and at which end of such variables (low or high) they are observed, can be used to determine whether one or more of these variables should be included in ENMs. Although the decision of which variables to use should be based mainly on their relevance in modeling the phenomenon of interest, MOP results can help in choosing between predictors in cases in which they have similar implications in the model and the information provided is comparable. For instance, MOP results showing that a temperature variable presents large areas with non-analogous conditions in a transfer scenario for the future are a first indication that extrapolation risks can be expected (e.g., bio6 in Fig. 4). In a case in which bio6 had a similar effect in an ENM to that of other variables (e.g., bio5), the new results provided by MOP would help prioritize the use of this latter predictor (bio5) instead. However, even more interpretations can derive from the new MOP results. In the case of non-analogous conditions occurring mainly towards high values of the variable, and suitability responses derived from an ENM showing decreasing values towards the high end of this variable, spurious extrapolation is not likely to occur (Fig. 1). On the other hand, if suitability is increasing towards the high end of this variable, the risk of spurious extrapolation will be higher (Fig. 1). These observations highlight the importance of considering MOP results and ENM-generated response curves in concert. Our implementation of MOP returns layers that show areas where conditions of interest are outside ranges of reference conditions for each variable independently, or combining results for all variables. Results for each variable are likely easier to interpret, whereas results that combine information of multiple variables could help to identify critical areas of extrapolation risks, where not only one but multiple variables indicate non-analogous conditions. As many ENM applications require predictions to conditions outside those of calibration (reference conditions), analyses of extrapolation risks such as the ones that MOP allows are instrumental in understanding how safe interpretations of our predictions can be.

All parameters of the MOP analysis explored in our second challenge (distance type, percentage of reference, and variable scaling) had effects on MOP distances. Perhaps the most notable effect on the pattern of distance values derived from using Mahalanobis distances, which produced shorter values for conditions of interest around the reference conditions than analyses using Euclidean distances did (only visible when distances were re-scaled 0–1; Fig. 5). This effect derives from the way distances are calculated: Euclidean distances are obtained from simple calculations that consider each point independently, whereas Mahalanobis distances are calculated with consideration of a variance-covariance matrix derived from all reference conditions (Faith et al. 1987). Depending on the interest of researchers, both types of distance can be useful. Mahalanobis distance is better for distinguishing, in broad terms, which conditions of interest are close to the reference conditions versus those that are far (e.g., Blocks 3–5 vs Block 6; Figs 5, 6). On the other hand, Euclidean distances give a more detailed differentiation within conditions of interest that are close to reference conditions (e.g., Blocks 3–5; Figs 5, 6). Based on our results, the percentage of reference conditions used in MOP calculations has a major effect on the distances obtained. MOP distances calculated using low percentages of reference conditions will be similar to those obtained with algorithms that only consider the closest reference point when measuring dissimilarities in conditions of interest (e.g., Shape; Velazco et al. 2023). However, if the goal is to have a clearer differentiation in what is outside the reference cloud of environments, we recommend using a larger percentage of reference conditions (e.g., ~10%) and scaling variables before calculating Euclidean distances (Fig. 5). All parameter settings explored in our MOP analyses were useful in detecting conditions of interest that are far from reference environments.
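The roles of the percentage of reference conditions and of variable scaling can be sketched as follows. This is a simplified illustration rather than the implementation in the "mop" package: it assumes that the MOP distance for a point of interest is the mean Euclidean distance to its nearest percentage of reference points, and that scaling means standardizing each variable by the mean and standard deviation of the reference conditions. Function names and data are hypothetical.

```python
import math
import statistics


def scale_by_reference(reference, interest):
    """Standardize both sets using the reference mean and standard deviation
    of each variable (an assumed scaling choice, for illustration only)."""
    n_vars = len(reference[0])
    means = [statistics.mean([r[j] for r in reference]) for j in range(n_vars)]
    sds = [statistics.pstdev([r[j] for r in reference]) for j in range(n_vars)]

    def standardize(points):
        return [tuple((p[j] - means[j]) / sds[j] for j in range(n_vars))
                for p in points]

    return standardize(reference), standardize(interest)


def mop_distance(reference, interest, percentage=10.0):
    """Mean Euclidean distance from each point of interest to its nearest
    `percentage` of reference points (at least one point is always used)."""
    n_nearest = max(1, math.ceil(len(reference) * percentage / 100.0))
    distances = []
    for point in interest:
        ordered = sorted(math.dist(point, ref) for ref in reference)
        distances.append(sum(ordered[:n_nearest]) / n_nearest)
    return distances


# Hypothetical data: four reference points and two points of interest,
# one inside the reference cloud and one far from it.
reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
interest = [(0.0, 0.0), (5.0, 5.0)]

# A low percentage behaves like a nearest-reference-point measure.
nearest_only = mop_distance(reference, interest, percentage=25.0)

# Using all reference points averages over the whole reference cloud.
all_reference = mop_distance(reference, interest, percentage=100.0)

# Scaling before calculating Euclidean distances.
ref_scaled, int_scaled = scale_by_reference(reference, interest)
scaled = mop_distance(ref_scaled, int_scaled, percentage=25.0)
```

With the low percentage, the point that coincides with a reference point receives a distance of zero, whereas with 100% of reference points its distance is greater than zero because more distant reference conditions enter the mean. In every setting of this toy example, the distant point receives the larger distance.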

The density of calibration points used to characterize environmental conditions, which indicates how common certain conditions are in the reference area, had clear effects on our results from challenge two (Fig. 6; Suppl. material 1: figs S3–S5). For instance, distances detected for conditions of interest located inside the range of reference conditions (Block 4) were comparable to or even larger than those for conditions of interest outside the range of one of the variables considered (Block 5). This effect arises because the density of reference environments was higher close to Block 5 (Fig. 3). This pattern was detected in all treatments in this challenge (Fig. 6; Suppl. material 1: fig. S3), which could complicate interpretations with extrapolation of truncated responses in ENMs. However, conditions of interest in Blocks 3–5 were all outside the cloud of reference but also close to it, and the differences in distances between these blocks and those in Block 6 (far from the reference cloud) are large. Therefore, MOP distances can still be considered good indicators of how risky interpretations are in conditions of interest that differ considerably from reference conditions.

Our results from challenge three showed that sample size has an effect on how non-analogous conditions are identified. Because the details of which variable shows non-analogous conditions, at which end of the variable such conditions are found, and how extensive the area with these conditions is are all relevant to interpretation and to modeling decisions, defining an appropriate background sample size is not a trivial matter. In our example, the largest effect on the patterns of novel conditions was found when the sample size was ~10% of the total points (5000). Some ENM algorithms and routines allow users to include raster layers from which background for modeling is sampled. As the sample size used to generate background is usually smaller than the total number of pixels in such layers, only this sampled background (including occurrences) should be used as reference conditions in analyses of extrapolation risk. The sample size needed to create appropriate models and perform MOP analyses could depend on multiple factors; for instance, if extreme conditions in the area of interest are rare, a larger random sample size may be required to represent such rare conditions in analyses (Guevara et al. 2018). Applications in which predefined backgrounds (e.g., target-group backgrounds) are used should not be affected by this factor, as all these points are used in ENM and MOP analyses. As technology advances, computing times are expected to continue decreasing, which suggests that computing time should not be a crucial factor in deciding the size of the sample used to represent reference conditions. However, if the calibration area is very large and/or the resolution of environmental layers is high, computation of MOP can be time-consuming on lower-capacity computers.
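The way in which a sampled background can narrow the reference ranges may be illustrated with a small simulation. The sketch below uses simulated values rather than the data from our example, assumes simple random sampling without replacement, and defines non-analogous conditions by per-variable ranges only; all names and values are hypothetical.

```python
import random


def variable_ranges(points):
    """Minimum and maximum of each variable in a set of conditions."""
    n_vars = len(points[0])
    return [(min(p[j] for p in points), max(p[j] for p in points))
            for j in range(n_vars)]


def fraction_outside(points, ranges):
    """Fraction of points with at least one variable outside the ranges."""
    outside = sum(
        any(value < low or value > high
            for value, (low, high) in zip(point, ranges))
        for point in points
    )
    return outside / len(points)


# Simulated calibration area: 5000 points, two variables (assumed values).
rng = random.Random(1)
full = [(rng.gauss(20.0, 5.0), rng.gauss(1000.0, 300.0)) for _ in range(5000)]

# Ranges from a random background sample cannot exceed those of the full
# set, so smaller samples can flag part of the calibration area itself as
# outside reference ranges.
results = {}
for size in (50, 500, 5000):
    sample = rng.sample(full, size)
    results[size] = fraction_outside(full, variable_ranges(sample))
```

When the sample includes every point, no conditions in the calibration area fall outside the reference ranges; with smaller samples, rare extreme conditions may be missed and would then be treated as non-analogous.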

In conclusion, this study addressed the intricacies of model extrapolation under non-analogous conditions, enhancing the interpretability and utility of the MOP metric. We have expanded the metric to characterize non-analogous conditions better, and explored the impact of calculation parameters, which has significantly improved its interpretability. We emphasize the importance of MOP results in assessing risks of spurious extrapolations in ENMs, and perhaps in guiding variable inclusion in these models. The interplay between MOP outcomes and ENM response curves is crucial to more robust assessments of extrapolation risk. Our findings indicate that the density of reference conditions affects MOP results, emphasizing the need for careful exploration of environmental conditions by researchers. Our research also highlights the role of sample size in identifying non-analogous conditions, underscoring the importance of selecting an appropriate sample size for background characterization. These findings contribute to refining ecological niche modeling methods by clarifying how models extrapolate under non-analogous conditions.

Acknowledgments

We thank the members of the KUENM working group at the University of Kansas Biodiversity Institute for their thinking and work on these topics over the years.

Author Contributions

MEC, HLO, JS, and ATP conceived the idea of the project. MEC ran analyses. MEC wrote the manuscript. MEC, HLO, JS, and ATP reviewed and improved the manuscript.

Data Accessibility Statement

All data used to produce the examples presented here can be obtained with the code provided as part of the example application. R code to reproduce all analyses and figures is available from a GitHub repository (https://github.com/marlonecobos/new_MOP). All analyses required to produce the extended mobility-oriented parity metric were implemented in the new R package “mop” (available at https://CRAN.R-project.org/package=mop).

References

  • Colwell RK, Coddington JA, Hawksworth DL (1994) Estimating terrestrial biodiversity through extrapolation. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences 345: 101–118. https://doi.org/10.1098/rstb.1994.0091
  • Fick SE, Hijmans RJ (2017) WorldClim 2: New 1-km spatial resolution climate surfaces for global land areas. International Journal of Climatology 37: 4302–4315. https://doi.org/10.1002/joc.5086
  • Guevara L, Gerstner BE, Kass JM, Anderson RP (2018) Toward ecologically realistic predictions of species distributions: A cross-time example from tropical montane cloud forests. Global Change Biology 24: 1511–1522. https://doi.org/10.1111/gcb.13992
  • Hijmans R (2023) terra: Spatial data analysis. R package.
  • Iturbide M, Bedia J, Gutiérrez JM (2018) Background sampling and transferability of species distribution model ensembles under climate change. Global and Planetary Change 166: 19–29. https://doi.org/10.1016/j.gloplacha.2018.03.008
  • Jiménez-Valverde A, Peterson AT, Soberón J, Overton JM, Aragón P, Lobo JM (2011) Use of niche models in invasive species risk assessments. Biological Invasions 13: 2785–2797. https://doi.org/10.1007/s10530-011-9963-4
  • Machado-Stredel F, Cobos ME, Peterson AT (2021) A simulation-based method for identifying accessible areas as calibration areas for ecological niche models and species distribution models. Frontiers of Biogeography 13: e48814. https://doi.org/10.21425/F5FBG48814
  • Mesgaran MB, Cousens RD, Webber BL (2014) Here be dragons: A tool for quantifying novelty due to covariate range and correlation change when projecting species distribution models. Diversity and Distributions 20: 1147–1159. https://doi.org/10.1111/ddi.12209
  • Meyer H, Pebesma E (2021) Predicting into unknown space? Estimating the area of applicability of spatial prediction models. Methods in Ecology and Evolution 12: 1620–1633. https://doi.org/10.1111/2041-210X.13650
  • Nuñez-Penichet C, Cobos ME, Soberón J, Gueta T, Barve N, Barve V, Navarro-Sigüenza AG, Peterson AT (2022) Selection of sampling sites for biodiversity inventory: Effects of environmental and geographical considerations. Methods in Ecology and Evolution 13: 1595–1607. https://doi.org/10.1111/2041-210X.13869
  • Nuñez-Penichet C, Osorio-Olvera L, Gonzalez VH, Cobos ME, Jiménez L, DeRaad DA, Alkishe A, Contreras-Díaz RG, Nava-Bolaños A, Utsumi K, Ashraf U, Adeboje A, Peterson AT, Soberon J (2021) Geographic potential of the world’s largest hornet, Vespa mandarinia Smith (Hymenoptera: Vespidae), worldwide and particularly in North America. PeerJ 9: e10690. https://doi.org/10.7717/peerj.10690
  • Owens HL, Campbell LP, Dornak LL, Saupe EE, Barve N, Soberón J, Ingenloff K, Lira-Noriega A, Hensz CM, Myers CE, Peterson AT (2013) Constraints on interpretation of ecological niche models by limited environmental ranges on calibration areas. Ecological Modelling 263: 10–18. https://doi.org/10.1016/j.ecolmodel.2013.04.011
  • Pecl GT, Araújo MB, Bell JD, Blanchard J, Bonebrake TC, Chen I-C, Clark TD, Colwell RK, Danielsen F, Evengård B, Falconi L, Ferrier S, Frusher S, Garcia RA, Griffis RB, Hobday AJ, Janion-Scheepers C, Jarzyna MA, Jennings S, Lenoir J, Linnetved HI, Martin VY, McCormack PC, McDonald J, Mitchell NJ, Mustonen T, Pandolfi JM, Pettorelli N, Popova E, Robinson SA, Scheffers BR, Shaw JD, Sorte CJB, Strugnell JM, Sunday JM, Tuanmu M-N, Vergés A, Villanueva C, Wernberg T, Wapstra E, Williams SE (2017) Biodiversity redistribution under climate change: Impacts on ecosystems and human well-being. Science 355: eaai9214. https://doi.org/10.1126/science.aai9214
  • Peterson AT, Cobos ME, Jiménez-García D (2018) Major challenges for correlational ecological niche model projections to future climate conditions. Annals of the New York Academy of Sciences 1429: 66–77. https://doi.org/10.1111/nyas.13873
  • Petitpierre B, Broennimann O, Kueffer C, Daehler C, Guisan A (2017) Selecting predictors to maximize the transferability of species distribution models: Lessons from cross-continental plant invasions. Global Ecology and Biogeography 26: 275–287. https://doi.org/10.1111/geb.12530
  • Qiao H, Feng X, Escobar LE, Peterson AT, Soberón J, Zhu G, Papeş M (2019) An evaluation of transferability of ecological niche models. Ecography 42: 521–534. https://doi.org/10.1111/ecog.03986
  • Rödder D, Engler JO (2012) Disentangling interpolation and extrapolation uncertainties in ecological niche models: A novel visualization technique for the spatial variation of predictor variable collinearity. Biodiversity Informatics 8: 30–40. https://doi.org/10.17161/bi.v8i1.4326
  • Simões MVP, Saeedi H, Cobos ME, Brandt A (2021) Environmental matching reveals non-uniform range-shift patterns in benthic marine Crustacea. Climatic Change 168: 31. https://doi.org/10.1007/s10584-021-03240-8
  • Velazco SJE, Rose MB, De Marco Jr P, Regan HM, Franklin J (2023) How far can I extrapolate my species distribution model? Exploring shape, a novel method. Ecography 2024: e06992. https://doi.org/10.1111/ecog.06992
  • Yates KL, Bouchet PJ, Caley MJ, Mengersen K, Randin CF, Parnell S, Fielding AH, Bamford AJ, Ban S, Barbosa AM, Dormann CF, Elith J, Embling CB, Ervin GN, Fisher R, Gould S, Graf RF, Gregr EJ, Halpin PN, Heikkinen RK, Heinänen S, Jones AR, Krishnakumar PK, Lauria V, Lozano-Montes H, Mannocci L, Mellin C, Mesgaran MB, Moreno-Amat E, Mormede S, Novaczek E, Oppel S, Crespo GO, Peterson AT, Rapacciuolo G, Roberts JJ, Ross RE, Scales KL, Schoeman D, Snelgrove P, Sundblad G, Thuiller W, Torres LG, Verbruggen H, Wang L, Wenger S, Whittingham MJ, Zharikov Y, Zurell D, Sequeira AMM (2018) Outstanding challenges in the transferability of ecological models. Trends in Ecology and Evolution 33: 790–802. https://doi.org/10.1016/j.tree.2018.08.001

Supplementary material

Supplementary material 1 

Suppl. tables S1, S2, figs S1–S6 (.pdf)
