Eric Nost


Assistant Professor of Geography, Environment, and Geomatics | University of Guelph


Digital Conservation - Using iNaturalist to Supplement Natural Heritage Information

In this tutorial, we’ll use the open-source mapping software QGIS to visualize citizen science data from the iNaturalist platform and highlight unrecognized areas where species of conservation concern might exist.

In the province of Ontario, the Natural Heritage Information Centre (NHIC) tracks what it calls “species of conservation concern,” including species at risk, in 1km-by-1km grid cells across the province. Each square kilometer cell is evaluated for whether a species is predicted to occur there or has been recorded there. The relatively coarse 1 sq. km. spatial resolution of the dataset is meant to protect sensitive species from disturbances such as poaching. According to NHIC, this dataset “can be used to identify species, ecological communities or natural heritage areas on or near your property or project site.”

But how does NHIC create this grid of species occurrence in the first place? NHIC is a member of the NatureServe network, which curates species location data from a wide range of stakeholders, including: “provincial, territorial, state and federal governments, citizen science platforms and digital biodiversity data resources (e.g., GBIF, iNaturalist), academia, non-governmental organizations, industry, species experts and Traditional Ecological Knowledge (TEK).” Although NatureServe and NHIC already incorporate “digital biodiversity data” including from citizen science sources such as iNaturalist, there may be opportunities to extend the use of these sources. For instance, there may be iNaturalist observations of species in areas not currently identified by NHIC’s grid.

We will compare the areas where NHIC identifies a species as occurring with those where iNaturalist users have observed it. Although citizen science and social media data have their limitations, this comparison can give us a sense of whether these data can better account for species ranges, which might in turn influence land use decisions. We’ll look at bald eagles as our example, providing a proof-of-concept for what the analysis could look like for more sensitive and at-risk species. Due to sensitive data restrictions, we are unable to do that specific analysis here; more detailed data with precise locations is available after obtaining a sensitive data use license from NHIC.

Required software

QGIS is free and open-source mapping software. This tutorial was written for QGIS version 3.28, but may work on the latest long-term release, which is QGIS 3.34 as of the time of writing (May 2, 2024). The software can be downloaded for any operating system here: https://qgis.org/en/site/forusers/download.html

Natural heritage information

1. First, let’s download natural heritage information from this website. Specifically, we want to download the “Species of conservation concern (including species at risk), plant communities, wildlife concentration areas and natural heritage areas at the 1km grid level” dataset. It can be found here. Click download and select “Complete File Geodatabase” at the bottom of the page.

This dataset contains both NHIC’s 1 sq. km. grid for the province and a table with records for each part of the grid where tracked species are estimated to occur. For reference and metadata, please see here.

2. Next, unzip the data you just downloaded. On Macs, this will involve double clicking the file. On PCs, this involves right clicking and opening using a utility such as 7-Zip.

3. Open QGIS. Create a new project by clicking the blank white page icon in the upper left portion of the screen. This will create a blank map for us.

4. Now we will add the dataset we just downloaded to the map.

What do we see here? We have two layers in our list on the left-hand side of the screen now: 1) the 1 sq. km. grid, which will also appear on the map as a series of rectangular boxes; 2) A table that contains information on species occurrences in each part of the grid.

5. At this point, you might be wondering, why do the grid cells appear as rectangles rather than squares (since they are 1km by 1km in size)? This is because of how the mapping software is “projecting” the data’s spatial coordinates onto the map. We just need to change the projection to one more appropriate for our analysis.

The rectangles should now be squares!
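To see why the cells looked stretched, here is a minimal sketch, assuming the grid was initially drawn in unprojected longitude/latitude degrees and treating the Earth as a sphere. On the ground, one degree of longitude shrinks by the cosine of the latitude, so a ground-square cell drawn in plain degrees looks wider than it is tall. The function name is ours, for illustration only.

```python
import math

def apparent_width_to_height(latitude_deg: float) -> float:
    """Drawn width divided by drawn height for a ground-square cell
    when longitude and latitude are plotted as plain x/y degrees
    (spherical approximation)."""
    return 1.0 / math.cos(math.radians(latitude_deg))

# Compare the equator with latitudes typical of southern and central Ontario.
for lat in (0, 43.5, 50):
    ratio = apparent_width_to_height(lat)
    print(f"{lat:>5} deg N: width/height = {ratio:.2f}")
```

At the latitude of southern Ontario the ratio is roughly 1.4, which matches the rectangles we saw before changing the projection.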

6. But how do we know which of these square kilometers are home to bald eagles? To find out, we first need to join the table to the spatial grid. We will do so by matching key identifiers from the species occurrence table with their equivalents in the grid. Search for the “Join attributes by field value” tool in the Processing Toolbox, which can be found by clicking Processing at the top of the screen and then Processing Toolbox. Complete the menu as follows:

7. Now we can highlight locations of species using the “Joined layer” that was produced in the previous step. We will “filter” the grid to show only those square kilometers where NHIC tells us bald eagles occur.

This will filter the map to only the cells where bald eagles are observed or expected to occur. As you can see, there aren’t that many in southern Ontario. Perhaps we can change that by using iNaturalist records to demonstrate a wider range for bald eagles….
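The logic behind steps 6 and 7 can be sketched outside QGIS in plain Python. This is only an illustration of a join followed by a filter: the field names (cell_id, common_name) and the records are made up, and the sketch drops cells with no matching record, which may or may not match the options chosen in the QGIS tool.

```python
# Hypothetical field names and values, for illustration only.
grid_cells = [
    {"cell_id": "A1"},
    {"cell_id": "A2"},
    {"cell_id": "A3"},
]
occurrences = [
    {"cell_id": "A1", "common_name": "Bald Eagle"},
    {"cell_id": "A1", "common_name": "Snapping Turtle"},
    {"cell_id": "A3", "common_name": "Bald Eagle"},
]

def join_by_field(cells, table, key):
    """One-to-many join: one output row per matching table record.
    Cells without any matching record are left out."""
    by_key = {}
    for record in table:
        by_key.setdefault(record[key], []).append(record)
    joined = []
    for cell in cells:
        for record in by_key.get(cell[key], []):
            joined.append({**cell, **record})
    return joined

def filter_by_value(rows, field, value):
    """Keep only rows whose field equals the given value."""
    return [row for row in rows if row[field] == value]

joined = join_by_field(grid_cells, occurrences, "cell_id")
eagle_cells = filter_by_value(joined, "common_name", "Bald Eagle")
print(sorted(row["cell_id"] for row in eagle_cells))  # ['A1', 'A3']
```

Because one cell can hold several tracked species, the join produces one row per cell-and-species pair, and the filter then narrows those rows to a single species.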

What counts as an occurrence? According to NHIC, “The [NatureServe] specifications define what does and does not constitute an element occurrence. For example, for a bald eagle, the NHIC considers a record for a nesting site an element occurrence or part of an element occurrence. But it does not consider a record of a migrating bald eagle an element occurrence or part of an element occurrence.”

iNaturalist observations

8. To potentially demonstrate a wider range for bald eagles, we will load some records from iNaturalist. An iNaturalist dataset containing all observations of bald eagles in Ontario over the past 20+ years is available from the Global Biodiversity Information Facility here.
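If you want to inspect the download before loading it, the sketch below shows one way to read point coordinates from it. It assumes the file is tab-delimited text with the Darwin Core column names decimalLatitude and decimalLongitude, which is our assumption about the export rather than something stated in this tutorial. The sample rows are made up and stand in for the real file.

```python
import csv
import io

# Made-up sample rows standing in for a downloaded file; the column names
# follow Darwin Core terms and are an assumption about the export format.
sample = (
    "gbifID\tspecies\tdecimalLatitude\tdecimalLongitude\n"
    "1\tHaliaeetus leucocephalus\t42.98\t-81.25\n"
    "2\tHaliaeetus leucocephalus\t\t\n"
    "3\tHaliaeetus leucocephalus\t43.55\t-80.25\n"
)

def read_points(handle):
    """Return (longitude, latitude) pairs, skipping rows without coordinates."""
    points = []
    for row in csv.DictReader(handle, delimiter="\t"):
        lat, lon = row["decimalLatitude"], row["decimalLongitude"]
        if lat and lon:
            points.append((float(lon), float(lat)))
    return points

points = read_points(io.StringIO(sample))
print(points)  # [(-81.25, 42.98), (-80.25, 43.55)]
```

To use it on a real file, pass an open file handle instead of the io.StringIO sample. Longitude comes first in each pair because it plays the role of the x coordinate on the map.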

Take a look at the map. Now we have the bald eagle sightings from iNaturalist as well as the bald eagle recorded areas from the provincial monitoring dataset. We can see some overlap but also some differences. There are bald eagles observed throughout the province – not just in the existing grid areas. There are also some areas where no bald eagle sightings have been recorded on iNaturalist, though this is perhaps an artefact of the opportunistic sampling that the platform is based on, where users are typically not making systematic, structured observations.

Analysis

9. What we want to do now is use the iNaturalist data to identify bald eagle “clusters” that are distinct from areas NHIC already identifies as bald eagle occurrence areas. First, we’ll identify the clusters. There are many individual bald eagle observations on our map, but these could be one-off sightings rather than true habitat. So we will filter the iNaturalist observations to clusters: observations that are near other observations.

To do this requires a few preliminary steps:

Remove these unclustered points from the dataset altogether:
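As a rough illustration of the idea, the sketch below keeps only observations that have enough other observations nearby. It is a simple neighbour-count rule that we are substituting for illustration, not necessarily the clustering method used in QGIS, and the 1 km radius and minimum of two neighbours are arbitrary assumptions. The coordinates are made up.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def keep_clustered(points, radius_km=1.0, min_neighbours=2):
    """Keep points with at least `min_neighbours` other points within
    `radius_km`. Both thresholds are illustrative assumptions."""
    kept = []
    for i, p in enumerate(points):
        neighbours = sum(
            1 for j, q in enumerate(points)
            if i != j and haversine_km(p, q) <= radius_km
        )
        if neighbours >= min_neighbours:
            kept.append(p)
    return kept

# Made-up (lon, lat) observations: three close together, one isolated.
observations = [
    (-81.250, 42.980), (-81.252, 42.981), (-81.249, 42.979),
    (-80.000, 44.500),
]
clustered = keep_clustered(observations)
print(len(clustered))  # 3
```

The pairwise comparison is fine for a few thousand points; a larger dataset would call for a spatial index.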

10. Finally, we want to assess the extent to which the NHIC grid cells line up with these bald eagle clusters. Visually, we can see some interesting differences. (Consider adding a basemap for context – to do this, select OpenStreetMap from the XYZ tiles section of the Browser panel). For instance, the area around London includes many observations along the Thames River even though the NHIC grids only sometimes intersect with the river.
For a more rigorous and less visual analysis, we will select the grid cells that contain clusters of bald eagle sightings. First, clear the filter on the grid (see step #7 above) so that we can see all the grid cells, not just the ones the province currently thinks have bald eagles.
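The selection we are about to make can be sketched in a few lines. The sketch assumes the clustered points are already in a projected coordinate system measured in metres and that the grid lines fall on multiples of 1000 m. Both are assumptions made for illustration, as are the coordinates and the set of cells NHIC already flags.

```python
import math

CELL_SIZE_M = 1000  # 1 km grid cells

def cell_index(x_m, y_m, cell_size=CELL_SIZE_M):
    """Index of the grid cell containing a projected (easting, northing) point."""
    return (math.floor(x_m / cell_size), math.floor(y_m / cell_size))

# Made-up projected coordinates in metres, for illustration only.
cluster_points = [
    (480250.0, 4758900.0),
    (480700.0, 4758100.0),
    (483100.0, 4761500.0),
]
# Hypothetical set of cells that NHIC already flags for bald eagles.
nhic_eagle_cells = {(480, 4758)}

# Cells containing at least one clustered observation.
cells_with_clusters = {cell_index(x, y) for x, y in cluster_points}
# Cells with clusters that NHIC does not already flag.
new_cells = cells_with_clusters - nhic_eagle_cells
print(sorted(new_cells))  # [(483, 4761)]
```

The set difference at the end is the quantity of interest: cells where iNaturalist users report clustered sightings but the provincial grid shows no occurrence.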

Reflections