Research Article
Instagram data for validating Nothofagus pumilio distribution mapping in the Southern Andes: A novel ground truthing approach
Melanie Werner, Johannes Weidinger, Jürgen Böhner, Udo Schickhoff, Maria Bobrowski
‡ University of Hamburg, Hamburg, Germany
Open Access

Abstract

The availability of valid, non-biased species occurrence data has always been a major challenge for biodiversity research and modelling studies. Data from open-source databases or remote sensing are promising approaches to increase the availability of species occurrence data. However, these data may contain spatial, temporal, and taxonomic biases or require ground truthing. In recent years, social media has received attention for its contribution to species occurrence data sampling and ground truthing approaches. The wide reach of social media platforms allows for rapid and large-scale analyses.

Here we introduce a novel Instagram ground truthing approach to validate the occurrence mapping of Nothofagus pumilio across its entire distribution range. The treeline species of the southern Andes has been extensively studied in small-scale studies, but large-scale modelling approaches are largely missing due to limited accessibility to treeline sites resulting in a lack of occurrence data. The content posted on the social media platform Instagram consists of images and videos in which the species N. pumilio and its location can be identified. By searching for suitable posts using hashtags and location tags, we created 1238 Instagram ground truthing points. We compared the performance of our dataset with open-source data from the Global Biodiversity Information Facility (GBIF) through visual, quantitative, and bias analyses, acknowledging that both social media-based and Citizen Science data are subject to sampling and spatial biases due to collection in human-accessible areas. The Instagram ground truthing points were subsequently used to validate remote sensing occurrence data, generated using Sentinel-2 level 2A data and Supervised Classification. The combined approach – Instagram ground truthing and remote sensing – allows for the collection of occurrence data over the entire latitudinal range of N. pumilio, covering approximately 2000 km.

Highlights

  • The use of social media content provides potentially important contributions to species occurrence data sampling and ground truthing

  • In our study we introduce a novel ground truthing approach for species occurrence data sampling based on Instagram data

  • Instagram ground truthing points, combined with Supervised Classification, generate species occurrence data of Nothofagus pumilio over its entire distribution range in the southern Andes

  • The performance of the Instagram ground truthing points is evaluated by comparison with existing data from the GBIF database.

  • Our Instagram ground truthing approach demonstrates a new way of sampling species occurrence data and can be applied to other suitable species and study areas.

Keywords

Citizen Science, Ecological Modelling, Ground Truthing, Instagram, Occurrence Data Sampling, Remote Sensing, Social Media, Southern Andes, Supervised Classification

Introduction

Quantifying spatial and temporal distribution of species and analysing underlying ecological requirements has become increasingly important in high elevation regions due to climate and environmental change (Schickhoff et al. 2022). Worldwide, ecological modelling studies are applied to model treeline species under present, past and future climate conditions (e.g., Dullinger et al. 2004; Bobrowski et al. 2017; Akobia et al. 2022). Whereas high mountains of the Northern Hemisphere are well represented in treeline related research, those of the Southern Hemisphere, such as the Andes, have rather been neglected (Hansson et al. 2021; Hansson et al. 2023). Recently published studies on the southern Andes have focused on local treeline sites (e.g., Daniels and Veblen 2003; Fajardo and Piper 2014; Srur et al. 2016, 2018), while large-scale modelling studies investigating treeline species or vegetation at higher elevations are very rare (Nagy et al. 2023). The limited accessibility of treeline sites due to remoteness and complex topography might have impeded such studies.

Species occurrence data are mainly collected through field research and made available in publications and databases (Feng et al. 2019). However, large-scale vegetation sampling is often costly and time-consuming. Therefore, species occurrence data are frequently downloaded via open-source databases such as the Global Biodiversity Information Facility (GBIF, gbif.org), which hosts datasets compiled from various sources (Edwards 2004; Boakes et al. 2010). However, these data may contain unknown taxonomic, spatial or sampling biases and are seldom evaluated or revisited (Beck et al. 2014; Meyer et al. 2016; Daru et al. 2018). Ensuring the quality of species data is essential for accurate modelling outcomes (Chauvier et al. 2021). Therefore, prior to utilisation, thorough examination, filtration, and potential supplementation of the data are imperative steps.

More recently, Citizen Science projects and social media have become crucial for surveying species occurrences (Jarić et al. 2020; Goldberg 2023). Citizen Science involves the participation of citizens in scientific processes, such as collecting species data (Bonney 1996; Bonney et al. 2009). The number of Citizen Science projects, particularly those that are computer- or app-based, along with the resulting data, is increasing exponentially (Pocock et al. 2017). Such projects can generate large amounts of occurrence data in a comparatively short time (Sumner et al. 2019). Dickinson et al. (2010) even take the view that Citizen Science is the only practical way to study distribution patterns and range shifts of species over large areas. Despite the improving quality of the data, which is increasingly nearing that of expert-recorded data (Aceves‐Bueno et al. 2017), bias persists (Di Cecco et al. 2021). Data collection is concentrated in urban and tourist areas, with inaccessible or remote locations rarely being recorded. Additionally, the data are collected solely by individuals engaged in Citizen Science projects. Leveraging social media contributions can unlock further potential in data collection. An increasing number of geotagged images are posted on social media, and images posted by both experts and non-experts can be analysed in large quantities. Social media platforms such as Facebook, Flickr, Instagram, Twitter and YouTube have recently been used to sample occurrence data and range shifts of species, for example whales and dolphins (e.g., Pace et al. 2019; Gibson et al. 2020; Martino et al. 2021), birds (e.g., Hentati-Sundberg and Olsson 2016), insects (e.g., Virić Gašparić et al. 2022; O’Neill et al. 2023) and plants (e.g., ElQadi et al. 2017). Through the usage of geotagged social media content, as well as identifiable landscape elements and descriptions of the posted image, the actual location of the posts can be traced. In combination with clearly recognisable plant characteristics, it is possible to identify the location of species, representing a promising new ground truthing possibility. With the increasing use of social media and the spread of high-quality digital cameras, a huge potential can be tapped that could also make important contributions to biodiversity monitoring and to the evaluation of potential protected areas (Chowdhury et al. 2023).

While both Citizen Science and social media enhance the sampling of occurrence data, the data remain spatially biased, as records are predominantly collected from locations accessible to humans (Meyer et al. 2016; La Sorte et al. 2024). Within this context, utilising remotely sensed species occurrences emerges as a promising method for reducing such bias and examining large study areas, particularly in regions with limited accessibility, such as high-elevation areas. One example is the use of remote sensing to classify tree species. Very high-resolution data like IKONOS, WorldView, RapidEye and airborne images are mostly used for small-scale studies (Fassnacht et al. 2016) and medium high-resolution data like Landsat and Sentinel data can successfully be used for larger areas (Immitzer et al. 2016; Immitzer et al. 2019). Despite the wide availability of high-resolution data and highly developed remote sensing methods, ground truthing remains indispensable for validating species data accuracy through field validation (on-site sampling) or validation processes with existing datasets (Nagai et al. 2020).

The availability of data for treeline species is limited; however, such species are likely to be suitable candidates for analysis using remote sensing and social media-based species occurrence data sampling. N. pumilio forms an abrupt treeline in the orotemperate belt of the southern Andes in mono-species forest stands (Amigo and Rodríguez-Guitián 2011) and is distributed in a region with many touristic areas where photos are taken, for example, during hiking. Therefore, the species can be recognised in satellite images as well as in Instagram posts. As N. pumilio responds to climate variations by changing radial growth patterns (Lara et al. 2001; Aravena et al. 2002; Daniels and Veblen 2004; Masiokas and Villalba 2004; Álvarez et al. 2015) and seedling establishment patterns above the treeline (Fajardo and Piper 2014), it is a suitable target species also for modelling approaches aiming at analysing treeline sensitivity and treeline shift due to climate change.

In this study, we demonstrate the large-scale sampling of N. pumilio occurrence data using Sentinel-2 imagery and Supervised Classification. To validate the spatial occurrence data, we introduce a novel Instagram ground truthing approach, leveraging occurrence points derived from the social media platform Instagram (www.instagram.com). We hypothesise that this Instagram-based method, due to a high volume of potentially suitable posts and our sampling approach, will generate more occurrence points with reduced spatial bias compared to datasets from the open-source GBIF database. Spatial bias in the resulting species occurrence data is further mitigated by incorporating remote sensing data, which enhances both the quantity and spatial coverage of occurrence information. Unlike presence-only point datasets, remote sensing data provide presence-absence datasets, offering more comprehensive opportunities for ecological modelling.

Material and methods

Study area and target species

Nothofagus pumilio (Poepp et Endl.) Krasser (southern or lenga beech) is the dominant subalpine tree species in the southern Chilean and Argentinean Andes between 35°S and 56°S, encompassing a longitudinal distribution range of more than 500 km (Masiokas and Villalba 2004; Lara et al. 2005; Rodríguez‐Catón et al. 2016). Out of all Nothofagus species, N. pumilio is the most orophilic (Amigo and Rodríguez-Guitián 2011). The dark green, elliptical, and notched broad leaves of the deciduous species turn into an orange-reddish colour in autumn, which is a reliable distinguishing feature of the species in comparison to other evergreen Nothofagus species in this region (Hildebrand-Vogel et al. 1990; Amigo and Rodríguez-Guitián 2011).

The distribution area of the species along the Andean cordillera follows an elevational gradient from north to south, while the west-east extent also depends on precipitation. N. pumilio forms mono-species forests located between 1600 m and 2000 m in the northern parts, whereas the elevational limit decreases to 400 m in the southernmost range at Tierra del Fuego (Cuevas 2000; Lara et al. 2005). The treeline elevation decreases constantly by 60 m per 1° latitude (Lara et al. 2005). The west-east distribution area is defined by the extreme precipitation gradient from the windward to the leeward side of the Andes (Hertel et al. 2008). The eastern distribution boundary is characterised by low precipitation, following the forest-steppe ecotone (Rodríguez‐Catón et al. 2016). Often, two treelines are formed in the eastern regions: a common upper treeline and a lower, xeric treeline towards the arid region (Hertel et al. 2008). On the western side of the Andes, the distribution is restricted to high elevations. In these hyperhumid areas, the species does not occur at lower elevations, where Nothofagus betuloides is the dominant tree species (Young et al. 2007; Amigo and Rodríguez-Guitián 2011). The autumn colouring of the deciduous species and its occurrence in mono-species forests at the treeline are especially important characteristics for the Instagram ground truthing approach, as these features aid in identifying the species in Instagram posts and make it recognisable in satellite images.

Instagram ground truthing approach

As a first step, we developed the Instagram ground truthing approach to ensure proper validation of large-scale remotely sensed occurrence data of N. pumilio. Additionally, the Instagram ground truthing points were quantitatively compared with existing occurrence data from the GBIF database. We used the social media platform Instagram (www.instagram.com) to sample the Instagram ground truthing points. Although other social media platforms have been utilised in studies sampling species occurrences, Instagram has largely been overlooked. Nonetheless, we identify a significant advantage in using Instagram. Instagram users have the possibility to post both photos and short videos. The platform’s lack of text-only posts makes it especially suitable for our approach. At the same time, Instagram is one of the largest social media platforms with 2 billion users worldwide (We Are Social et al. 2024), allowing for the analysis of a significantly larger volume of content compared to lesser-known platforms that exclusively host visual content (e.g., Flickr). Moreover, users can localise their posts with a geographical tag and add descriptions where the content and, e.g., the location can be specified with text or hashtags. The aim of the ground truthing analysis was to identify posts in which the species N. pumilio is clearly identifiable and whose location can be reliably traced and transferred to a map.

We started the Instagram ground truthing approach in 2021 and repeated it in 2022. The analysis consisted of 3 steps: 1) Potential contributions from publicly accessible profiles were searched for using the search bar embedded in the Instagram user interface and two specific search options: hashtags (#nothofaguspumilio and #lenga for exact species information) and places or landscape features (by location tags, locations in hashtags or usernames for exact locations). 2) Posts were checked using a strict catalogue of criteria (Table 1), ensuring that the species can be clearly identified, and that the location is traceable. Autumn pictures were preferentially included in the analysis, as the identification of the species is particularly reliable during this season. In addition, the typical abrupt treeline and mono-species forests were main criteria for selecting the posts. Furthermore, the Instagram posts include a publication date. Due to the chronological structure of the Instagram interface, our analysis has primarily focused on the most recent posts. However, the publication date does not guarantee that the photo was taken in that year. 3) Occurrences verified by the approach were transferred to a map. We first analysed all posts with specific species information (#nothofaguspumilio and #lenga). Next, we searched for specific locations within the species’ distribution range, filling gaps and adding points to the already set occurrence points. This structured approach ensures a homogeneous distribution of the sampled occurrences.

When we manually transferred the locations to a map in step 3), simple descriptions of the locations were not sufficient. The locations still had to be clearly traceable by landscape features (see Table 1). Such landscape elements in Patagonia could be glaciers, characteristic mountain peaks, roads, urban areas, touristic sites, waterbodies, and coastlines. These features should be so characteristic that they can also be recognised in a satellite image. To avoid mistakes, the hashtags and descriptions in the post should match the given location. Images centring on people, as well as those altered through the usage of filters, colour modifications, or emojis, were excluded.

Table 1.

Criteria for selecting Instagram posts to generate Nothofagus pumilio occurrence data. All these criteria must be fulfilled for the image to be included in the analysis.

Criterion | Element or example
Typical characteristics of Nothofagus pumilio | morphological characteristics (leaves, branches, habitus); autumn colouring; abrupt treeline; mono-species forest
Concrete location information | geographical tag; location hashtag; location description in the caption
Recognisable landscape elements | glaciers; mountain peaks or ranges; rivers, lakes; roads; tourist points, cities, villages; coastlines
Fitting hashtags | hashtags describing the location or the plant
Picture criteria | no persons in focus; no photo montages; no emojis; no extreme (colour) falsifications

If all conditions were met, we manually transferred the determined occurrence to a map with SAGA GIS (Conrad et al. 2015, https://saga-gis.sourceforge.io). When creating Instagram ground truthing points, at least one point was created at the actual location where the photo was taken and N. pumilio was identified. However, additional points were created if other occurrences of N. pumilio were visible in the posted image. This is particularly the case when forest or even the treeline of N. pumilio can be seen in the background of landscape photographs. The species’ ecology enables the identification of these occurrences, as its presence at the treeline, combined with autumn colouring, makes the species recognisable, particularly when morphological features are clearly visible in the foreground of the images. In some cases, points were set as far as the neighbouring valley, when the treeline was clearly autumn-coloured in the satellite image used. Background points were assigned to a 1 km grid, as this is commonly the target resolution for model analyses, thereby maintaining the number of background points at a reasonable level. The selection of posts is exemplified in Fig. 1. The landscape elements used to locate the posts, along with the transferred points, are shown in Fig. 2.
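The assignment of background points to a 1 km grid can be illustrated with a minimal sketch. It assumes that points are given in a projected coordinate system in metres and that at most one point is kept per grid cell; the source does not spell out either detail, and the function name and coordinates are hypothetical.

```python
def thin_to_grid(points, cell_size_m=1000.0):
    """Keep the first point that falls into each square grid cell.

    `points` is an iterable of (x, y) tuples in projected metres (assumption).
    """
    seen_cells = set()
    kept = []
    for x, y in points:
        # Floor division maps every coordinate to the index of its grid cell.
        cell = (int(x // cell_size_m), int(y // cell_size_m))
        if cell not in seen_cells:
            seen_cells.add(cell)
            kept.append((x, y))
    return kept


# Hypothetical background points: the first two share one 1 km cell.
background = [
    (500_120.0, 4_500_300.0),
    (500_640.0, 4_500_910.0),
    (501_050.0, 4_500_400.0),
]
thinned = thin_to_grid(background)
```

Keeping the first point per cell is only one possible rule; snapping points to cell centres would serve the same purpose of limiting the number of background points.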

Figure 1. 

Example of the Instagram ground truthing approach at Laguna Capri, Argentina. Nothofagus pumilio can be identified by its habitus and leaves in the foreground, its autumn-colouring and the abrupt treeline in the background. The lake itself and Mount Fitz Roy are reliable landscape elements. A location tag, the post description and hashtags also refer to the location (used with permission by Instagram user fernando.v.fotografia 2022).

Figure 2. 

The transferred points (red) visible at Laguna Capri and at the treeline. Red boxes indicate the landscape elements, Laguna Capri and Mount Fitz Roy, that made it possible to identify the location of the Instagram post shown in Fig. 1.

Remote sensing occurrence data sampling with supervised classification

Analysing multispectral, medium spatial resolution satellite data like Sentinel-2 leads to cost-efficient and robust results in tree species classification over large spatial extents (Fassnacht et al. 2016; Immitzer et al. 2016; Immitzer et al. 2019). Therefore, we compiled large-scale occurrence data over the entire distribution range of N. pumilio using Sentinel-2 level 2A data (BOA) at a resolution of 20 m. As N. pumilio occurs in mono-species forests at the treeline, a medium spatial resolution was sufficient to classify the forest type while allowing analysis over almost 2000 km latitude. Furthermore, the high temporal resolution data allowed phenological differentiation with autumn and summer images. Using the R Package “sen2r” (Ranghetti et al. 2020) we downloaded summer (months December and January) and autumn (month April) Sentinel-2 data. The sensing period was 2019 to 2022. Sentinel-2 scenes with up to 50 % cloud cover were included. Most scenes had a cloud cover of 15 % or less. We used the Sentinel-2 level 2A data “Scene Classification Layer” product (SCL, Fig. 3B) to mask all raster cells not classified as vegetation. This ensured that only data relevant to the analysis was included in the classification. Downloading numerous Sentinel-2 scenes with different acquisition times ensured that as many vegetation pixels as possible from a Sentinel-2 scene were included in the classification, despite high cloud cover.
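The masking step described above can be sketched in a few lines of Python. The sketch keeps only cells whose SCL value is 4 (vegetation) and then fills each cell from the first acquisition in which it is valid. The compositing rule, the function names and the toy values are assumptions for illustration; the preprocessing in the study was carried out in R and SAGA GIS.

```python
VEGETATION_CLASS = 4  # SCL class for vegetation (see the Fig. 3 caption)


def mask_non_vegetation(band, scl):
    """Return a copy of `band` with every non-vegetation cell set to None."""
    return [
        [value if scl_class == VEGETATION_CLASS else None
         for value, scl_class in zip(band_row, scl_row)]
        for band_row, scl_row in zip(band, scl)
    ]


def first_valid_composite(masked_scenes):
    """Per cell, take the first non-None value across several masked scenes."""
    n_rows, n_cols = len(masked_scenes[0]), len(masked_scenes[0][0])
    composite = [[None] * n_cols for _ in range(n_rows)]
    for scene in masked_scenes:
        for r in range(n_rows):
            for c in range(n_cols):
                if composite[r][c] is None:
                    composite[r][c] = scene[r][c]
    return composite


# Hypothetical 2 x 2 reflectance values and SCL classes for two dates
# (SCL 9 = cloud, high probability; SCL 6 = water).
scene_a = mask_non_vegetation([[0.21, 0.35], [0.40, 0.18]], [[4, 9], [4, 6]])
scene_b = mask_non_vegetation([[0.22, 0.33], [0.41, 0.19]], [[4, 4], [9, 6]])
composite = first_valid_composite([scene_a, scene_b])
```

In this toy case the cell that is cloudy on the first date is filled from the second date, while the water cell stays empty in the composite.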

We trained our Supervised Classification with training areas including three classes (1 = N. pumilio, 2 = Evergreen vegetation, 3 = Low vegetation). Training areas were created using summer, autumn and winter Sentinel-2 scenes at selected sites across the range. The winter data made evergreen vegetation clearly recognisable. Autumn colouring at the treeline indicated N. pumilio. We tested various classification algorithms for Supervised Classification, including well-performing standard algorithms like Maximum Likelihood, Minimum Distance, and Spectral Angle, as well as the decision tree-based Random Forest algorithm. The performance of these algorithms was measured using overall accuracy and the Kappa value (Richards 2022). To cover a wide spectral range, we used spectral bands 2 to 7, 8a, 11, and 12. All preprocessing steps and the Supervised Classification, which are visualised in Fig. 3, were carried out in R (R Core Team 2023) and SAGA GIS (Conrad et al. 2015). Map visualisation was performed in ArcGIS Pro software by Esri, version 2.7.0.
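For readers unfamiliar with the two performance measures, the following sketch computes overall accuracy and Cohen's Kappa from a confusion matrix. The matrix values are hypothetical and are not taken from the study; rows are assumed to hold reference classes and columns predicted classes.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's Kappa from a square confusion matrix."""
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical matrix for the three classes
# (N. pumilio, evergreen vegetation, low vegetation).
example = [
    [45, 3, 2],
    [4, 44, 2],
    [1, 2, 47],
]
accuracy, kappa = overall_accuracy_and_kappa(example)
```

Kappa is lower than overall accuracy whenever some agreement would be expected by chance, which is why both values are usually reported together.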

We classified summer and autumn data separately and subsequently extracted and merged the result of the N. pumilio occurrence into one layer. The result was further refined using three different masks. As the classification of the class “low vegetation/grassland” was particularly reliable in the summer classification and the classification of the class “evergreen” in the autumn classification, the result was masked by the result of these classes. In this way, pixels that may have been misclassified were removed. In the north of the study area, N. pumilio occurs only at higher elevations, so that other deciduous species at lower elevations were misclassified as N. pumilio. To remove these occurrences, a Digital Surface Model (DSM, ALOS Global Digital Surface Model “ALOS World 3D – 30m (AW3D30)”, Jaxa EORC 2023) was used to remove occurrences below high-elevation mono-species forests (thresholds: 800 m from 35°S to 40°S; 500 m from 40°S to 45°S; 250 m from 45°S to 50°S).
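The elevation correction can be expressed as a simple lookup. The sketch below uses the thresholds given above; how cells exactly on a band boundary (40°S, 45°S) were treated and the absence of a threshold south of 50°S are assumptions, and the function names are hypothetical.

```python
# (northern limit, southern limit, minimum elevation in m);
# latitudes are positive degrees south.
ELEVATION_THRESHOLDS = [
    (35.0, 40.0, 800.0),
    (40.0, 45.0, 500.0),
    (45.0, 50.0, 250.0),
]


def minimum_elevation_m(lat_south):
    """Return the elevation threshold for a latitude, or None outside the bands."""
    for north, south, threshold in ELEVATION_THRESHOLDS:
        # Assumption: a boundary latitude belongs to the band south of it.
        if north <= lat_south < south:
            return threshold
    return None


def keep_occurrence(lat_south, elevation_m):
    """Keep a classified cell unless it lies below the threshold of its band."""
    threshold = minimum_elevation_m(lat_south)
    return threshold is None or elevation_m >= threshold
```

For example, a cell classified as N. pumilio at 37.5°S and 600 m would be dropped, whereas a cell at 54.8°S is kept regardless of elevation because no threshold applies there.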

Figure 3. 

(A) Sentinel-2 autumn image at the Perito Moreno Glacier, Argentina, (B) Scene Classification Layer of the Sentinel-2 scene; the green area, class 4, shows vegetation, (C) the masked Sentinel-2 scene, and (D) classification result with three classes (red = Nothofagus pumilio, dark green = evergreen vegetation, light green = low vegetation/grassland).

GBIF data and validation process

The large-scale remote sensing data on the occurrence of N. pumilio were validated using the Instagram ground truthing points. This process involved verifying whether the Instagram ground truthing points align with the spatial distribution derived by Supervised Classification. Additionally, we used occurrence data from the GBIF database to validate the spatial distribution and to compare them with the Instagram ground truthing points visually, quantitatively and with a sampling bias analysis. The GBIF database provides data on species of all taxa according to the open-source principle. The Secretariat in Copenhagen coordinates data from various sources, such as museums, research publications, and Citizen Science projects, and makes them available (GBIF 2024a). A search for the species Nothofagus pumilio (Poepp. & Endl.) Krasser resulted in 1,616 entries (as of 6 June 2024, GBIF 2024b). These entries come primarily from the Citizen Science platform iNaturalist, but also from other Citizen Science projects, universities, botanical gardens/arboreta, state institutions, and archives. All available data were downloaded and filtered according to two criteria. First, coordinates had to be provided, which led to the removal of 573 entries. Second, the coordinate uncertainty had to be less than or equal to 1,000 m, resulting in the removal of an additional 485 entries. After filtering, 558 occurrences remained in the GBIF dataset. The data sources included two Citizen Science projects (iNaturalist with 546 occurrences and naturgucker with 5 occurrences) and two museums (Museo Argentino de Ciencias Naturales “Bernardino Rivadavia” with 6 occurrences and NTNU University Museum with 1 occurrence). The sampling years range from 1981 to 2024. Instagram ground truthing points and GBIF points were compared visually, quantitatively and by elevation using a DSM. The R package “sampbias” (Zizka et al. 2021) was used to assess whether the spatial sampling bias, influenced by site accessibility to humans, is reduced in the Instagram ground truthing point dataset. The package quantifies geographic bias and estimates the sampling rate across the study area using a Bayesian approach. Cities, roads, rivers, and lakes from Natural Earth Data (https://www.naturalearthdata.com/) were utilised as bias factors (gazetteers).
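The two GBIF filter criteria can be sketched as follows. The field names follow the Darwin Core terms used in GBIF downloads; treating a record without a stated coordinate uncertainty as failing the second criterion is an assumption of this sketch, as the source does not say how such records were handled. The sample records are hypothetical.

```python
MAX_UNCERTAINTY_M = 1000.0


def has_coordinates(record):
    """First criterion: both latitude and longitude are present."""
    return (record.get("decimalLatitude") is not None
            and record.get("decimalLongitude") is not None)


def has_acceptable_uncertainty(record):
    """Second criterion: coordinate uncertainty of at most 1,000 m.

    Assumption: a missing uncertainty value counts as not acceptable.
    """
    uncertainty = record.get("coordinateUncertaintyInMeters")
    return uncertainty is not None and uncertainty <= MAX_UNCERTAINTY_M


def filter_records(records):
    """Apply both criteria in sequence and report how many records each removed."""
    with_coords = [r for r in records if has_coordinates(r)]
    kept = [r for r in with_coords if has_acceptable_uncertainty(r)]
    removed = {
        "no_coordinates": len(records) - len(with_coords),
        "uncertainty": len(with_coords) - len(kept),
    }
    return kept, removed


# Hypothetical records for illustration only.
records = [
    {"decimalLatitude": -54.8, "decimalLongitude": -68.3, "coordinateUncertaintyInMeters": 30.0},
    {"decimalLatitude": None, "decimalLongitude": None, "coordinateUncertaintyInMeters": None},
    {"decimalLatitude": -41.1, "decimalLongitude": -71.4, "coordinateUncertaintyInMeters": 5000.0},
    {"decimalLatitude": -50.5, "decimalLongitude": -73.0, "coordinateUncertaintyInMeters": 1000.0},
]
kept, removed = filter_records(records)
```

Applied in this order, the counts reported above are consistent: 1,616 entries minus 573 without coordinates minus 485 with too large an uncertainty leaves 558 occurrences.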

Usage of AI

ChatGPT (GPT-4 and GPT-3.5; available at https://chat.openai.com/) was used to enhance sentence structure and grammar in individual sentences.

Results

Instagram ground truthing approach

Numerous posts found by hashtags and location tags were reviewed in 2021 and 2022, resulting in 1238 traceable occurrence points. In total we found 297 suitable posts published between 2017 and 2022. Most posts were published in 2021. A total of 460 points were placed at the actual location of the posts, and 778 points were placed in the visible background area (mainly autumn coloured treeline locations). Posts with specific species information using the hashtags #nothofaguspumilio or #lenga provided 61 occurrence points. Posts with detailed location information, where N. pumilio is recognisable, contributed significantly to the occurrence data.

Comparison with GBIF data

Fig. 4 illustrates that the Instagram ground truthing points from the structured approach are less scattered. This is because the approach targeted the species’ distribution range and gaps between already set points, when searching for specific locations, such as cities, national parks, mountain peaks, and lakes within that range. In contrast, the GBIF points are somewhat more scattered, with a few separate “outliers” visible in the west. The average elevation of the GBIF points is 559.82 m. Nevertheless, the GBIF dataset also includes very high-elevation locations, with the highest recorded location at 2123 m, compared to 1952 m for the Instagram ground truthing points. Instagram ground truthing points are higher on average at 1049.25 m, as images with a visible treeline were preferably selected. However, the differences in mean elevation are mainly due to the fact that the GBIF points were predominantly recorded in the southern part of the study area (45°S to 55°S). Of the 558 points, 419 are located in this area, with only 139 located to the north. In contrast, the Instagram ground truthing points are more homogeneously distributed: 661 of the 1238 points are located between 45°S and 55°S, while 577 points are in the northern part of the study area. However, the GBIF data also supplement the Instagram ground truthing data, particularly in the northernmost parts of the distribution area, where 16 GBIF points are located further north than any point in the Instagram ground truthing dataset. In some cases, GBIF data augment locations with Instagram ground truthing points by adding several GBIF points. With 252 points, nearly half of the GBIF points are concentrated in a few tourist/urban areas. For example, 159 points are located in and around Ushuaia, 24 points in and around Bariloche, 53 points in Torres del Paine National Park, and 16 points solely at a parking area with viewing platform for the Perito Moreno Glacier. To quantify this observed bias, we used the “sampbias” package, which assesses sampling bias based on factors (gazetteers) indicating the influence of human-accessible locations (Zizka et al. 2021). The analysis revealed a significant impact of cities on both datasets, with the effect being higher for the GBIF dataset (bias weight: Instagram ground truthing approach, IGTA = 0.3455; GBIF = 0.4611). Additionally, roads and rivers exert a greater influence on the GBIF dataset. The Instagram ground truthing dataset, however, is more biased only in relation to lakes, which can be attributed to numerous points in national parks such as Torres del Paine and Los Glaciares. Fig. 5 presents the estimated sampling rates for both datasets. In Fig. 5 (B) the undersampled area between cities can be clearly seen. The bias weights and further results from the “sampbias” analysis are included as supplemental material (see Suppl. material 1).

Figure 4. 

(A) 1238 points created by the Instagram ground truthing approach (IGTA) and (B) IGTA points (red) and 558 occurrence points from GBIF database (yellow, GBIF 2024b).

Figure 5. 

Visualised results of the “sampbias” analysis, indicating the sampling rate based on the influence of bias factors (gazetteers: cities, roads, rivers and lakes). In comparison, the IGTA dataset (A) displays more homogeneous sampling, whereas the GBIF dataset (B) shows undersampled areas, represented by dark blue regions.

Supervised classification

We found that the OpenCV Supervised Classification and Random Forest algorithms demonstrated the best performance. For the summer classification, the overall accuracy was 0.93 with a Kappa value of 0.89, and for the autumn classification, the overall accuracy was 0.97 with a Kappa value of 0.96 (see Suppl. material 1 for details on the Supervised Classification validation). The Supervised Classification of summer and autumn Sentinel-2 data resulted in a spatial distribution map of the species N. pumilio from 35°10'S to 55°59'S (Fig. 6A). Detailed strengths (Fig. 6B) and weaknesses (Fig. 7) can be reviewed in the following figures.

In particular, the area extending east of the Northern and Southern Patagonian Ice Fields shows an accurate classification result: three tree species dominate these areas, with N. pumilio and N. antarctica being deciduous species and N. betuloides being an evergreen species (Veblen et al. 1996). The occurrence of N. pumilio at the treeline was clearly distinguished from the occurrence of the evergreen species N. betuloides (class 2) below the treeline and areas of low vegetation (class 3). A clear distinction is also made in more arid regions in the east of the study area. Here, N. pumilio occurrences are clearly separated from scrub- and grassland. In the northern part of the study area, N. pumilio was reliably recorded at the treeline. However, in the valleys of this area there are also deciduous species that are misclassified as N. pumilio. We removed these occurrences by an elevation correction, so that only occurrences of N. pumilio that could be unequivocally identified as such remained. However, at higher elevations, it was not possible to distinguish between the two deciduous and morphologically and ecologically similar species, N. pumilio and N. antarctica.

Errors occur mainly due to high cloud cover and shadow effects. At the southern tip of Chile and Argentina, high cloud cover leads to data coverage problems. An area where the Sentinel-2 scene is not fully available in the sensing period of 2019 to 2022 and other scenes were almost completely covered by clouds is shown in Fig. 7A. A case where data are missing due to mountain shadows is visualised in Fig. 7B. Using the SCL resulted in shadow or shaded valleys being excluded from the analysis. Errors in validation with Instagram ground truthing points occur precisely in these areas.

Figure 6. (A) Nothofagus pumilio occurrence determined by Supervised Classification and (B) occurrence of Nothofagus pumilio at Perito Moreno Glacier, with a classification result corresponding to the natural conditions, shown in comparison with the autumn Sentinel-2 scene in (C). All Instagram ground truthing points (blue, B) cover the identified occurrence.

Figure 7. (A) A data gap at the southern tip of Chile. This is caused by missing Sentinel-2 data in the sensing period between 2019 and 2022 and very high cloud cover. (B) A valley with mountain shadows. To avoid errors in the spectral signals, these areas were removed during analysis using the Sentinel-2 Scene Classification Layer. However, this leads to a gap in the classification result and errors in the validation with the Instagram ground truthing points (blue).

Validation

We validated the classification result with the Instagram ground truthing and GBIF points by checking whether the points match the spatial occurrence. Of the 1238 Instagram ground truthing points, 1142 (92.25 %) are congruent with the remote sensing data, while 96 (7.75 %) lie outside the areas classified as N. pumilio. These errors are probably due to mountain shadows and missing data (Fig. 7). Of the GBIF points, 157 (28.14 %) align with the spatial occurrence, while 401 (71.86 %) do not. However, many of these points lie just outside the determined spatial occurrence. Here, too, errors can arise from shadows and missing data. Other reasons may include the uncertainty of the coordinates, the observation being recorded on roads or paths next to the occurrence rather than directly in the plant stand, or individual trees or stands being recorded in urban areas, evergreen forest stands, or open areas with low vegetation, which the classification does not categorise as N. pumilio areas. Fig. 8 provides two examples that support these hypotheses.
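The point check described above amounts to sampling the classified raster at each point location and counting the hits. The following is a minimal Python sketch, assuming a north-up grid with a known origin and cell size. The grid, the coordinates, and the function names are hypothetical and do not reproduce the GIS workflow used in the study.

```python
N_PUMILIO = 1  # class code used for N. pumilio in the classification


def cell_index(x, y, x_min, y_max, cell_size):
    """Convert map coordinates to (row, col) in a north-up raster."""
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    return row, col


def validate_points(points, raster, x_min, y_max, cell_size):
    """Return (congruent, total): points falling on an N. pumilio cell."""
    congruent = 0
    for x, y in points:
        row, col = cell_index(x, y, x_min, y_max, cell_size)
        inside = 0 <= row < len(raster) and 0 <= col < len(raster[0])
        if inside and raster[row][col] == N_PUMILIO:
            congruent += 1
    return congruent, len(points)


def percent(part, whole):
    """Share of `part` in `whole`, in percent, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Hypothetical 3 x 3 classification with 20 m cells (0 = masked / no data).
raster = [
    [1, 1, 0],
    [2, 1, 3],
    [2, 2, 3],
]
points = [(10, 50), (30, 30), (50, 50), (10, 10)]
print(validate_points(points, raster, x_min=0, y_max=60, cell_size=20))

# The shares reported in the text follow from the raw counts.
print(percent(1142, 1238), percent(157, 558))
```

A point that falls on a masked cell counts as a miss in this sketch, which mirrors how shadow and cloud gaps in the classification translate into validation errors.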

Figure 8. (A) Example of Nothofagus pumilio occurrence points from the GBIF database (GBIF points, yellow) with an uncertainty in coordinates, so that the points are located in a lake and not at the actual sampling location and (B) GBIF points in Ushuaia, where some raster cells, at locations of GBIF points, were not classified as vegetation.

Discussion

Instagram ground truthing approach as novel method for ground truthing

The availability of sufficient, non-biased species occurrence data has always been a major problem for ecological modelling studies (Bobrowski et al. 2021; Chauvier et al. 2021). An increasing wealth of information on the occurrences of species is becoming available through global databases (Michener et al. 2012; Feng et al. 2019). As field research is costly and time-consuming, open-source occurrence databases like the GBIF, with data compiled by Citizen Science, are increasingly used in such studies (Feldman et al. 2021). However, even these advances do not replace the need for ground truthing procedures, as these data may contain spatial, temporal, and taxonomic biases (Beck et al. 2014; Meyer et al. 2016). In this study, we tested a novel Instagram ground truthing approach to generate ground truthing points for the species N. pumilio to validate species occurrence data derived from remote sensing. Furthermore, the Instagram ground truthing points were compared with existing GBIF data, which led us to the conclusion that our approach can generate occurrence data with the potential to increase and improve existing data.

We include the sampling of ground truthing points on Instagram, and thus the re-use of social media posts, in the realm of Citizen Science, although this is controversially discussed. According to the vignette study by Haklay et al. (2021), approximately 50 % of respondents describe the re-use of social media as Citizen Science. While Citizen Science requires the active involvement of participants (Wiggins and Crowston 2011), in social media, content creators are rarely aware that they are participating in a study. Rather, permission to use imagery repurposed for occurrence information was given passively or unintentionally (Jarić et al. 2020). In addition, information about one’s own study and its results (data transparency) is an important point for Citizen Science (Haklay et al. 2021). In our current work, we considered these criticisms as much as possible. By creating our own public Instagram account (www.instagram.com/nothofagus_pumilio_research/), where our study and results are presented, we aim to enable knowledge exchange. In addition, all images sourced from publicly accessible profiles used for our analysis were “liked” to draw the attention of content creators to the account. Users whose images were used in publications were asked for permission in advance and informed of the purpose. Furthermore, Instagram’s chat function was used to exchange information with Instagram users about a photo’s location or the species itself. Despite the discourse on terminology, we primarily focus on discussing the advantages and disadvantages of Citizen Science occurrence data sampling and exploring potential improvements to our method.

N. pumilio is suitable for the Instagram approach due to its distinctive characteristics and visibility in satellite images at the treeline in mono-species forest stands. Additionally, its occurrence in national parks, where tourists often take photos, increases the number of Instagram posts. The species’ autumn colouring further enhances its aesthetic appeal, leading to even more posts. These advantages also benefit occurrence data sampling in unstructured Citizen Science projects. In unstructured projects, user behaviour of Citizen Science participants leads to observations with spatial bias (Di Cecco et al. 2021), mainly concentrated in tourist locations or urban areas, such as near cities or along roads (Reddy and Dávalos 2003; Graham et al. 2004; Fithian et al. 2015; Chauvier et al. 2021). Since the direct sampling location of the Citizen Science participants is usually recorded, remote locations that are difficult to access are not included. Other studies have already addressed this spatial bias in GBIF data and highlighted potential pitfalls for ecological conclusions (Boakes et al. 2010; Beck et al. 2014; Meyer et al. 2016). The analysed GBIF dataset underlines this, as nearly half of its points are located in urban or touristic locations. Points derived from Instagram are also not free of this bias, but as the Instagram ground truthing approach allowed us to identify occurrence points in the background of suitable posts and additional points at autumn-coloured treelines, this spatial sampling bias is efficiently reduced. Additionally, the use of Instagram, with 2 billion users worldwide (We Are Social et al. 2024) and millions of photos uploaded daily (60 million daily uploads by 2014, WirtschaftsWoche 2014), allows us to analyse a large number of posts that are potentially suitable for analysis. The large number of possibly suitable posts on Instagram and a structured sampling approach further reduced spatial bias. After posts were found using specific hashtags (#nothofaguspumilio and #lenga), gaps between these occurrences were closed by searching for specific locations. Posts were found in which the species was not specified by the Instagram user but was clearly recognisable. This ensured that the study area was covered as homogeneously as possible. The improvement of the Instagram ground truthing approach compared to other Citizen Science projects lies particularly in the inclusion of images taken by non-experts, where the species was recorded even though the social media user might not be aware of it. Instagram exclusively shares photos and videos and is one of the largest social media platforms, offering the possibility to analyse a large quantity of suitable posts. This is also reflected in the quantitative comparison. While the GBIF database contains 558 data points from 1981 to 2024 after filtering, the Instagram ground truthing approach was able to create 1238 points from posts dated 2017 to 2022.

Data from Citizen Science projects are improving and even approaching the quality of expert data (Mesaglio and Callaghan 2021). For example, species identifications in iNaturalist, the main source of GBIF data used in this study, must first be confirmed by a 2/3 consensus. iNaturalist also offers suggestions for species identification based on examples, which are reviewed and updated by experts (curators). Only after occurrences are considered complete and certain are they passed on to the GBIF database for publication as valid data (Heberling and Isaac 2018). Although the sampling bias in species identification is decreasing, the sampling bias in terms of coordinate accuracy still needs improvement. Coordinate uncertainties result from georeferencing methods of museum data (Marcer et al. 2022) or weak satellite signals while sampling with mobile devices (Uyeda et al. 2020). We included GBIF data with an uncertainty of up to 1 km, although such a deviation is substantial, especially in a highly complex ecosystem like high mountains. The validation process showed that deviations in the coordinates lead to errors. Precise filtering and selection of data from databases are therefore necessary. Although the manual creation of occurrences during the Instagram ground truthing approach was time-consuming, it reduced such bias.
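A coordinate filter of this kind can be expressed as a simple rule on the Darwin Core field `coordinateUncertaintyInMeters`, which GBIF occurrence records carry. The following is a minimal Python sketch with invented records. How records without a stated uncertainty were treated is not specified here, so dropping them is an assumption, as is the function name.

```python
MAX_UNCERTAINTY_M = 1000  # 1 km threshold stated in the text


def filter_by_uncertainty(records, max_uncertainty=MAX_UNCERTAINTY_M):
    """Keep records whose stated coordinate uncertainty is within the limit.

    Assumption: records with a missing uncertainty value are dropped.
    """
    kept = []
    for record in records:
        uncertainty = record.get("coordinateUncertaintyInMeters")
        if uncertainty is not None and uncertainty <= max_uncertainty:
            kept.append(record)
    return kept


# Invented example records, not real GBIF data.
records = [
    {"id": "a", "coordinateUncertaintyInMeters": 15},
    {"id": "b", "coordinateUncertaintyInMeters": 1000},
    {"id": "c", "coordinateUncertaintyInMeters": 5000},
    {"id": "d", "coordinateUncertaintyInMeters": None},
]

print([r["id"] for r in filter_by_uncertainty(records)])
```

Lowering `max_uncertainty` trades sample size for positional accuracy, which matters most in steep terrain where a few hundred metres can separate forest from lake or rock.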

The main limitation of the Instagram ground truthing approach is the time-consuming and non-automated process. While sampling by Citizen Science projects is a way of collecting data particularly quickly and cost-effectively with a large reach (Kullenberg and Kasperowski 2016; Sumner et al. 2019), the manual search for suitable posts and map transfer is time-consuming. Other studies using social media to investigate species occurrences carried out an automated search of suitable posts via the Application Programming Interface (API) of the chosen social media platform (e.g., Flickr: Alampi Sottini et al. 2019; August et al. 2020). The Instagram API is not as easy to access. At this stage, manual analysis remains unavoidable when using Instagram, and it is similarly required for platforms such as Facebook and Twitter, as demonstrated by O’Neill et al. (2023). In the future, AI-based image recognition processes could be an efficient way to find suitable posts on Instagram more quickly. Recently, Instagram’s terms of use were changed to allow for this purpose. By using Instagram, users agree that their content may be analysed by AI (Meta 2024).

Another limitation is the accuracy of the recording date. Caution is needed regarding the exact timing of when a photo was taken and posted on Instagram, as the publication date may not correspond to the actual date the photo was captured. In our analysis, we primarily focused on recent posts, covering the period from 2017 to 2022, while the GBIF data includes records dating as far back as 1981. Verifying the accuracy of the Instagram date is essential (if necessary, by contacting the post creators), particularly for temporal distribution analyses of species. “Historical” data may not be available on Instagram at all. Additionally, discrepancies in acquisition dates between Sentinel-2 and Instagram data may introduce potential sources of error in the validation process. We used Sentinel-2 data from 2019 to 2022, while Instagram posts date back to 2017. Changes in forest stands, such as deforestation or forest fires, could result in discrepancies.

Supervised classification

The occurrence in mono-species forest stands at the treeline and the phenology of N. pumilio allow a precise creation of training areas and a reliable result of the Supervised Classification. With the Sentinel-2 level 2A data at a resolution of 20 m we achieved an accurate classification over a very large study area of about 2000 km latitudinal extent in the southern Andes. Such a resolution is sufficient for subsequent modelling, which often uses climate data at a resolution of 30 arcseconds (~1 km), for example, WorldClim and Bioclim data (e.g., Bobrowski et al. 2018). Furthermore, the high temporal resolution of the Sentinel-2 data provided many scenes that were analysed in the Supervised Classification. By masking the Sentinel-2 scenes with the SCL, sources of error due to too many non-vegetation classes were avoided. It also removed vegetation grid cells covered by clouds, aerosols, and shadows, which can lead to classification errors. Consequently, the three vegetation classes (1 = N. pumilio, 2 = Evergreen vegetation, 3 = Low vegetation) were trained using training areas that contained only the spectral information of representative vegetation raster cells. However, this also created gaps in the classification result that could not be filled even with the large number of Sentinel-2 scenes.
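The masking step and the resulting gaps can be illustrated with a small Python sketch. It assumes that only cells labelled as vegetation in the SCL (code 4 in the Sen2Cor classification) are kept and that gaps are filled with the first valid value across scenes. Both choices, the toy values, and the function names are assumptions for illustration, not the processing chain used in the study.

```python
SCL_VEGETATION = 4  # Sen2Cor Scene Classification Layer code for vegetation
NODATA = None


def mask_with_scl(band, scl, keep=(SCL_VEGETATION,)):
    """Set every cell whose SCL class is not in `keep` to NODATA."""
    return [
        [value if cls in keep else NODATA for value, cls in zip(b_row, s_row)]
        for b_row, s_row in zip(band, scl)
    ]


def fill_gaps(masked_scenes):
    """Per cell, take the first valid value across the masked scenes."""
    rows, cols = len(masked_scenes[0]), len(masked_scenes[0][0])
    result = [[NODATA] * cols for _ in range(rows)]
    for scene in masked_scenes:
        for r in range(rows):
            for c in range(cols):
                if result[r][c] is NODATA and scene[r][c] is not NODATA:
                    result[r][c] = scene[r][c]
    return result


# Two invented 2 x 2 scenes: reflectance values and their SCL codes
# (3 = cloud shadow, 4 = vegetation, 9 = cloud, high probability).
scene_a = mask_with_scl([[0.31, 0.28], [0.40, 0.22]], [[4, 9], [4, 3]])
scene_b = mask_with_scl([[0.30, 0.27], [0.41, 0.25]], [[4, 4], [9, 3]])

print(fill_gaps([scene_a, scene_b]))
```

In this toy case the lower-right cell is shaded in both scenes and therefore stays empty, which is the kind of persistent gap described for shaded valleys.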

A different source of error is that in some cases stands of N. antarctica were classified as stands of N. pumilio in high and low elevation areas. The two deciduous species are very similar, both phenologically and ecologically, and often share the same range. Hybrids of the two species are also possible (Soliani et al. 2015). Inclusion of N. antarctica occurrence data, e.g., from local forestry services (Corporación Nacional Forestal (CONAF), Chile, and “Ordenamiento Territorial de Bosque Nativo” (OTBN), Argentina), or repeating the Instagram ground truthing approach for this species could further refine the result. Nevertheless, the forest type was determined with certainty at this stage. High mountain deciduous forest was correctly identified even if N. antarctica was partially included.

In more southerly areas, N. pumilio and N. antarctica dominate as deciduous species. In the north, however, many other deciduous species occur at lower elevations, while N. pumilio occurs only at the treeline. For this reason, an elevational correction of the result was necessary. Occurrences below subalpine forest stands of N. pumilio have been removed. Individual stands below these are also less relevant for treeline modelling studies. The thresholds (800 m, 500 m, 250 m) were estimated from literature data and from the classification result in order to obtain the most accurate classification result with the least data loss.
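The elevational correction can be sketched as a rule that removes N. pumilio cells lying below a zone-specific minimum elevation. In the Python sketch below, the three thresholds are those named in the text, but the latitude bands they are paired with, the cell values, and the function names are hypothetical placeholders chosen only to make the example run.

```python
N_PUMILIO = 1
REMOVED = 0

# Thresholds (m a.s.l.) are taken from the text. The latitude bands they
# are paired with here are hypothetical placeholders.
HYPOTHETICAL_BANDS = [
    (-40.0, 800),  # cells north of 40°S
    (-47.0, 500),  # cells between 40°S and 47°S
    (-90.0, 250),  # cells south of 47°S
]


def min_elevation(latitude):
    """Return the minimum elevation for the band containing `latitude`."""
    for southern_limit, threshold in HYPOTHETICAL_BANDS:
        if latitude >= southern_limit:
            return threshold
    raise ValueError("latitude outside the defined bands")


def elevation_correction(cells):
    """cells: (class_code, latitude, elevation_m) tuples.

    N. pumilio cells below the band threshold are set to REMOVED;
    all other cells are returned unchanged.
    """
    corrected = []
    for cls, lat, elev in cells:
        if cls == N_PUMILIO and elev < min_elevation(lat):
            corrected.append(REMOVED)
        else:
            corrected.append(cls)
    return corrected


# Invented cells: (class, latitude in degrees, elevation in m).
cells = [
    (1, -38.0, 1500),
    (1, -38.0, 600),
    (1, -45.0, 600),
    (1, -53.0, 200),
    (2, -38.0, 300),
]
print(elevation_correction(cells))
```

Only cells classified as N. pumilio are affected, so the evergreen and low vegetation classes pass through the correction unchanged.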

Conclusion

Citizen Science and social media-based occurrence sampling is developing and improving rapidly, becoming an important source of species occurrence data, especially for large-scale modelling approaches where alternatives are limited. However, the resulting data are not free from bias and need to be filtered and verified before being used in applications such as ecological models. We conclude that using social media posts on Instagram in a structured Instagram ground truthing approach leads to less-biased occurrence data for N. pumilio in comparison with GBIF data. Sampling biases are further minimised by combining the Instagram ground truthing method with Supervised Classification, as large-scale occurrence data are generated across the entire distribution range of the species, rather than just in urban or tourist locations, where most of the pictures underlying Citizen Science observations or Instagram posts are taken. We further conclude that the Instagram ground truthing approach is a novel method that can complement occurrence data sampling methods and be applied to other suitable species. However, it is essential that landscape elements are visible in the posts, which is more likely for landscape images and less so for detailed images of smaller herbaceous plant or animal species. Future work could focus on creating an automated search for Instagram posts using the Instagram API and AI technology to replace the time-consuming manual search and further increase the availability of suitable posts. We believe that using social media can unlock significant potential for species occurrence data sampling and thus promote research on species in remote and high-elevation regions. Furthermore, the spatial occurrence data of N. pumilio enable presence-absence modelling approaches that can provide detailed insights into the current and future distribution of N. pumilio.

Acknowledgments

We would like to thank all the Instagram users who have supported us with their posts, with information about the location where their pictures were taken and sharing information about the occurrence of N. pumilio and N. antarctica. Special thanks to user fernando.v.fotografia for allowing us to publish his post.

Author contributions

All authors participated during project design. MW conceived and developed the idea, collected and analysed the data, and drafted the first manuscript; JW and MB were significantly involved in the conception of the idea and the development and implementation of the analyses; MB also managed the whole project; JB and US were the reviewers and editors of the manuscript, providing their expertise on the topic and assistance with the literature. All authors have read and agreed to the published version of the manuscript.

Data accessibility statement

The Instagram ground truthing points and the spatial species occurrence data generated in our study are currently in the process of being published with the open access data provider of the University of Hamburg.

Supplementary material: The following materials are available as part of the online article at https://escholarship.org/uc/fb.

References

  • Aceves‐Bueno E, Adeleye AS, Feraud M, Huang Y, Tao M, Yang Y, Anderson SE (2017) The Accuracy of Citizen Science Data: A Quantitative Review. Bulletin Ecological Society of America 98: 278–290. https://doi.org/10.1002/bes2.1336
  • Akobia I, Janiashvili Z, Metreveli V, Zazanashvili N, Batsatsashvili K, Ugrekhelidze K (2022) Modelling the potential distribution of subalpine birches (Betula spp.) in the Caucasus. Community Ecology 23: 209–218. https://doi.org/10.1007/s42974-022-00097-4
  • Alampi Sottini V, Barbierato E, Bernetti I, Capecchi I, Fabbrizzi S, Menghini S (2019) Rural environment and landscape quality: an evaluation model integrating social media analysis and geostatistics techniques. Aestimum 74: 43–62. https://doi.org/10.13128/aestim-7379
  • Álvarez C, Veblen TT, Christie DA, González-Reyes Á (2015) Relationships between climate variability and radial growth of Nothofagus pumilio near altitudinal treeline in the Andes of northern Patagonia, Chile. Forest Ecology and Management 342: 112–121. https://doi.org/10.1016/j.foreco.2015.01.018
  • Amigo J, Rodríguez-Guitián MA (2011) Bioclimatic and phytosociological diagnosis of the species of the Nothofagus genus (Nothofagaceae) in South America. International Journal of Geobotanical Research 1: 1–20. https://doi.org/10.5616/ijgr110001
  • Aravena JC, Lara A, Wolodarsky-Franke A, Villalba R, Cuq E (2002) Tree-ring growth patterns and temperature reconstruction from Nothofagus pumilio (Fagaceae) forests at the upper tree line of southern Chilean Patagonia. Revista chilena de historia natural 75: 361–376. https://doi.org/10.4067/S0716-078X2002000200008
  • Beck J, Böller M, Erhardt A, Schwanghart W (2014) Spatial bias in the GBIF database and its effect on modeling species’ geographic distributions. Ecological Informatics 19: 10–15. https://doi.org/10.1016/j.ecoinf.2013.11.002
  • Boakes EH, McGowan PJK, Fuller RA, Chang-qing D, Clark NE, O’Connor K, Mace GM (2010) Distorted views of biodiversity: spatial and temporal bias in species occurrence data. PLoS Biology 8: e1000385. https://doi.org/10.1371/journal.pbio.1000385
  • Bobrowski M, Bechtel B, Böhner J, Oldeland J, Weidinger J, Schickhoff U (2018) Application of Thermal and Phenological Land Surface Parameters for Improving Ecological Niche Models of Betula utilis in the Himalayan Region. Remote Sensing 10: 814. https://doi.org/10.3390/rs10060814
  • Bonney R (1996) Citizen science: a lab tradition. Living Bird 15: 7–15.
  • Bonney R, Cooper CB, Dickinson J, Kelling S, Phillips T, Rosenberg KV, Shirk J (2009) Citizen Science: A Developing Tool for Expanding Science Knowledge and Scientific Literacy. BioScience 59: 977–984. https://doi.org/10.1525/bio.2009.59.11.9
  • Chauvier Y, Zimmermann NE, Poggiato G, Bystrova D, Brun P, Thuiller W (2021) Novel methods to correct for observer and sampling bias in presence‐only species distribution models. Global Ecology and Biogeography 30: 2312–2325. https://doi.org/10.1111/geb.13383
  • Chowdhury S, Fuller RA, Ahmed S, Alam S, Callaghan CT, Das P, Correia RA, Di Marco M, Di Minin E, Jarić I, Labi MM, Ladle RJ, Rokonuzzaman M, Roll U, Sbragaglia V, Siddika A, Bonn A (2023) Using social media records to inform conservation planning. Conservation Biology 38: e14161. https://doi.org/10.1111/cobi.14161
  • Conrad O, Bechtel B, Bock M, Dietrich H, Fischer E, Gerlitz L, Wehberg J, Wichmann V, Böhner J (2015) System for Automated Geoscientific Analyses (SAGA) v. 2.1.4. Geoscientific Model Development 8: 1991–2007. https://doi.org/10.5194/gmd-8-1991-2015
  • Daniels LD, Veblen TT (2004) Spatiotemporal Influences of Climate on Altitudinal Treeline in Northern Patagonia. Ecology 85: 1284–1296. https://doi.org/10.1890/03-0092
  • Daru BH, Park DS, Primack RB, Willis CG, Barrington DS, Whitfeld TJS, Seidler TG, Sweeney PW, Foster DR, Ellison AM, Davis CC (2018) Widespread sampling biases in herbaria revealed from large-scale digitization. The New Phytologist 217: 939–955. https://doi.org/10.1111/nph.14855
  • Di Cecco GJ, Barve V, Belitz MW, Stucky BJ, Guralnick RP, Hurlbert AH (2021) Observing the Observers: How Participants Contribute Data to iNaturalist and Implications for Biodiversity Science. BioScience 71: 1179–1188. https://doi.org/10.1093/biosci/biab093
  • Dullinger S, Dirnböck T, Grabherr G (2004) Modelling climate change‐driven treeline shifts: relative effects of temperature increase, dispersal and invasibility. Journal of Ecology 92: 241–252. https://doi.org/10.1111/j.0022-0477.2004.00872.x
  • ElQadi MM, Dorin A, Dyer A, Burd M, Bukovac Z, Shrestha M (2017) Mapping species distributions with social media geo-tagged images: Case studies of bees and flowering plants in Australia. Ecological Informatics 39: 23–31. https://doi.org/10.1016/j.ecoinf.2017.02.006
  • Fajardo A, Piper FI (2014) An experimental approach to explain the southern Andes elevational treeline. American Journal of Botany 101: 788–795. https://doi.org/10.3732/ajb.1400166
  • Fassnacht FE, Latifi H, Stereńczak K, Modzelewska A, Lefsky M, Waser LT, Straub C, Ghosh A (2016) Review of studies on tree species classification from remotely sensed data. Remote Sensing of Environment 186: 64–87. https://doi.org/10.1016/j.rse.2016.08.013
  • Feldman MJ, Imbeau L, Marchand P, Mazerolle MJ, Darveau M, Fenton NJ (2021) Trends and gaps in the use of citizen science derived data as input for species distribution models: A quantitative review. PLOS ONE 16: e0234587. https://doi.org/10.1371/journal.pone.0234587
  • Feng X, Park DS, Walker C, Peterson AT, Merow C, Papeş M (2019) A checklist for maximizing reproducibility of ecological niche models. Nature Ecology & Evolution 3: 1382–1395. https://doi.org/10.1038/s41559-019-0972-5
  • Fithian W, Elith J, Hastie T, Keith DA (2015) Bias correction in species distribution models: pooling survey and collection data for multiple species. Methods in Ecology and Evolution 6: 424–438. https://doi.org/10.1111/2041-210X.12242
  • Gibson CE, Williams D, Dunlop R, Beck S (2020) Using social media as a cost‐effective resource in the photo‐identification of a coastal bottlenose dolphin community. Aquatic Conservation: Marine and Freshwater Ecosystems 30: 1702–1710. https://doi.org/10.1002/aqc.3356
  • Goldberg JK (2023) iNaturalist is an open science resource for ecological genomics by enabling rapid and tractable records of initial observations of sequenced biological samples. Biology Letters 19: 20230251. https://doi.org/10.1098/rsbl.2023.0251
  • Graham CH, Ferrier S, Huettman F, Moritz C, Peterson AT (2004) New developments in museum-based informatics and applications in biodiversity analysis. Trends in Ecology & Evolution 19: 497–503. https://doi.org/10.1016/j.tree.2004.07.006
  • Haklay M, Fraisl D, Greshake Tzovaras B, Hecker S, Gold M, Hager G, Ceccaroni L, Kieslinger B, Wehn U, Woods S, Nold C, Balázs B, Mazzonetto M, Ruefenacht S, Shanley LA, Wagenknecht K, Motion A, Sforzi A, Riemenschneider D, Dorler D, Heigl F, Schaefer T, Lindner A, Weißpflug M, Mačiulienė M, Vohland K (2021) Contours of citizen science: a vignette study. Royal Society Open Science 8: 202108. https://doi.org/10.1098/rsos.202108
  • Hansson A, Dargusch P, Shulmeister J (2021) A review of modern treeline migration, the factors controlling it and the implications for carbon storage. Journal of Mountain Science 18: 291–306. https://doi.org/10.1007/s11629-020-6221-1
  • Hansson A, Shulmeister J, Dargusch P, Hill G (2023) A review of factors controlling Southern Hemisphere treelines and the implications of climate change on future treeline dynamics. Agricultural and Forest Meteorology 332: 109375. https://doi.org/10.1016/j.agrformet.2023.109375
  • Heberling JM, Isaac BL (2018) iNaturalist as a tool to expand the research value of museum specimens. Applications in Plant Sciences 6: e01193. https://doi.org/10.1002/aps3.1193
  • Hertel D, Therburg A, Villalba R (2008) Above- and below-ground response by Nothofagus pumilio to climatic conditions at the transition from the steppe–forest boundary to the alpine treeline in southern Patagonia, Argentina. Plant Ecology & Diversity 1: 21–33. https://doi.org/10.1080/17550870802257026
  • Hildebrand-Vogel R, Godoy R, Vogel A (1990) Subantarctic-Andean Nothofagus pumilio Forests. Distribution Area and Synsystematic Overview; Vegetation and Soils as Demonstrated by an Example of a South Chilean Stand. Vegetatio 89: 55–68. https://doi.org/10.1007/BF00134434
  • Immitzer M, Neuwirth M, Böck S, Brenner H, Vuolo F, Atzberger C (2019) Optimal Input Features for Tree Species Classification in Central Europe Based on Multi-Temporal Sentinel-2 Data. Remote Sensing 11: 2599. https://doi.org/10.3390/rs11222599
  • Immitzer M, Vuolo F, Atzberger C (2016) First Experience with Sentinel-2 Data for Crop and Tree Species Classifications in Central Europe. Remote Sensing 8: 166. https://doi.org/10.3390/rs8030166
  • Jarić I, Correia RA, Brook BW, Buettel JC, Courchamp F, Di Minin E, Firth JA, Gaston KJ, Jepson P, Kalinkat G, Ladle R, Soriano-Redondo A, Souza AT, Roll U (2020) iEcology: Harnessing Large Online Resources to Generate Ecological Insights. Trends in Ecology & Evolution 35: 630–639. https://doi.org/10.1016/j.tree.2020.03.003
  • Lara A, Aravena JC, Villalba R, Wolodarsky-Franke A, Luckman B, Wilson R (2001) Dendroclimatology of high-elevation Nothofagus pumilio forests at their northern distribution limit in the central Andes of Chile. Canadian Journal of Forest Research 31: 925–936. https://doi.org/10.1139/cjfr-31-6-925
  • Lara A, Villalba R, Wolodarsky-Franke A, Aravena JC, Luckman BH, Cuq E (2005) Spatial and temporal variation in Nothofagus pumilio growth at tree line along its latitudinal range (35°40′-55° S) in the Chilean Andes. Journal of Biogeography 32: 879–893. https://doi.org/10.1111/j.1365-2699.2005.01191.x
  • La Sorte FA, Cohen JM, Jetz W (2024) Data coverage, biases, and trends in a global citizen‐science resource for monitoring avian diversity. Diversity and Distributions 30: e13863. https://doi.org/10.1111/ddi.13863
  • Marcer A, Chapman AD, Wieczorek JR, Xavier Picó F, Uribe F, Waller J, Ariño AH (2022) Uncertainty matters: ascertaining where specimens in natural history collections come from and its implications for predicting species distributions. Ecography 2022: e06025. https://doi.org/10.1111/ecog.06025
  • Martino S, Pace DS, Moro S, Casoli E, Ventura D, Frachea A, Silvestri M, Arcangeli A, Giacomini G, Ardizzone G, Lasinio GJ (2021) Integration of presence‐only data from several sources: a case study on dolphins’ spatial distribution. Ecography 44: 1533–1543. https://doi.org/10.1111/ecog.05843
  • Mesaglio T, Callaghan CT (2021) An overview of the history, current contributions and future outlook of iNaturalist in Australia. Wildlife Research 48: 289–303. https://doi.org/10.1071/WR20154
  • Meyer C, Weigelt P, Kreft H (2016) Multidimensional biases, gaps and uncertainties in global plant occurrence information. Ecology Letters 19: 992–1006. https://doi.org/10.1111/ele.12624
  • Michener WK, Allard S, Budden A, Cook RB, Douglass K, Frame M, Kelling S, Koskela R, Tenopir C, Vieglais DA (2012) Participatory design of DataONE—Enabling cyberinfrastructure for the biological and environmental sciences. Ecological Informatics 11: 5–15. https://doi.org/10.1016/j.ecoinf.2011.08.007
  • Nagai S, Nasahara KN, Akitsu TK, Saitoh TM, Muraoka H (2020) Importance of the Collection of Abundant Ground‐Truth Data for Accurate Detection of Spatial and Temporal Variability of Vegetation by Satellite Remote Sensing. In: Dontsova K, Balogh-Brunstadt Z, Le Roux G (Eds) Biogeochemical cycles. Ecological drivers and environmental impact. Wiley, 223–244. https://doi.org/10.1002/9781119413332.ch11
  • Nagy L, Eller CB, Mercado LM, Cuesta FX, Llambí LD, Buscardo E, Aragão LEOC, García-Núñez C, Oliveira RS, Barbosa M, Ceballos SJ, Calderón-Loor M, Fernandes GW, Aráoz E, Muñoz AMQ, Rozzi R, Aguirre F, Álvarez-Dávila E, Salinas N, Sitch S (2023) South American mountain ecosystems and global change – a case study for integrating theory and field observations for land surface modelling and ecosystem management. Plant Ecology & Diversity 16: 1–27. https://doi.org/10.1080/17550874.2023.2196966
  • O’Neill D, Häkkinen H, Neumann J, Shaffrey L, Cheffings C, Norris K, Pettorelli N (2023) Investigating the potential of social media and citizen science data to track changes in species’ distributions. Ecology and Evolution 13: e10063. https://doi.org/10.1002/ece3.10063
  • Pace DS, Giacomini G, Campana I, Paraboschi M, Pellegrino G, Silvestri M, Alessi J, Angeletti D, Cafaro V, Pavan G, Ardizzone G, Arcangeli A (2019) An integrated approach for cetacean knowledge and conservation in the central Mediterranean Sea using research and social media data sources. Aquatic Conservation: Marine and Freshwater Ecosystems 29: 1302–1323. https://doi.org/10.1002/aqc.3117
  • Ranghetti L, Boschetti M, Nutini F, Busetto L (2020) “sen2r”: An R toolbox for automatically downloading and preprocessing Sentinel-2 satellite data. Computers & Geosciences 139: 104473. https://doi.org/10.1016/j.cageo.2020.104473
  • R Core Team (2023) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
  • Rodríguez‐Catón M, Villalba R, Morales M, Srur A (2016) Influence of droughts on Nothofagus pumilio forest decline across northern Patagonia, Argentina. Ecosphere 7: e01390. https://doi.org/10.1002/ecs2.1390
  • Schickhoff U, Bobrowski M, Mal S, Schwab N, Singh RB (2022) The World’s Mountains in the Anthropocene. In: Schickhoff U, Singh RB, Mal S (Eds) Mountain Landscapes in Transition. Sustainable Development Goals Series. Springer, Cham., 1–144. https://doi.org/10.1007/978-3-030-70238-0_1
  • Soliani C, Tsuda Y, Bagnoli F, Gallo LA, Vendramin GG, Marchelli P (2015) Halfway encounters: meeting points of colonization routes among the southern beeches Nothofagus pumilio and N. antarctica. Molecular Phylogenetics and Evolution 85: 197–207. https://doi.org/10.1016/j.ympev.2015.01.006
  • Srur AM, Villalba R, Rodríguez-Catón M, Amoroso MM, Marcotti E (2016) Establishment of Nothofagus pumilio at Upper Treelines Across a Precipitation Gradient in the Northern Patagonian Andes. Arctic, Antarctic, and Alpine Research 48: 755–766. https://doi.org/10.1657/AAAR0016-015
  • Srur AM, Villalba R, Rodríguez-Catón M, Amoroso MM, Marcotti E (2018) Climate and Nothofagus pumilio Establishment at Upper Treelines in the Patagonian Andes. Frontiers in Earth Science 6: 57. https://doi.org/10.3389/feart.2018.00057
  • Sumner S, Bevan P, Hart AG, Isaac NJ (2019) Mapping species distributions in 2 weeks using citizen science. Insect Conservation and Diversity 12: 382–388. https://doi.org/10.1111/icad.12345
  • Veblen TT, Hill RS, Read J [Eds] (1996) The ecology and biogeography of Nothofagus forests. Yale University Press.
  • Virić Gašparić H, Mikac KM, Pajač Živković I, Krehula B, Orešković M, Galešić MA, Ninčević P, Varga F, Lemić D (2022) Firefly Occurrences in Croatia – One Step Closer from Citizen Science to Open Data. Interdisciplinary Description of Complex Systems 20: 112–124. https://doi.org/10.7906/indecs.20.2.4
  • Wiggins A, Crowston K (2011) From Conservation to Crowdsourcing: A Typology of Citizen Science. In: 2011 44th Hawaii International Conference on System Sciences. IEEE, 1–10. https://doi.org/10.1109/HICSS.2011.207
  • Young KR, León B (2007) Tree-line changes along the Andes: implications of spatial patterns and dynamics. Philosophical transactions of the Royal Society of London, Series B, Biological sciences 362: 263–272. https://doi.org/10.1098/rstb.2006.1986
  • Zizka A, Antonelli A, Silvestro D (2021) sampbias, a method for quantifying geographic sampling biases in species distribution data. Ecography 44: 25–32. https://doi.org/10.1111/ecog.05102

Supplementary material

Supplementary material 1 

Results of the “sampbias” analysis and information on the performance of the Supervised Classification results testing different classification algorithms (.xlsx)
