Towards a National Digital Platform for Biodiversity Conservation: Data Repository and Artificial Intelligence Enabled Analytics

 

Executive Summary

Canada’s 2030 Nature Strategy emphasizes a whole-of-society approach to meeting the country’s commitments under the twenty-three targets of the Kunming-Montreal Global Biodiversity Framework (GBF). Central to this strategy are Targets 14 and 21, which call for integrating biodiversity values into decision-making at every level (from local to national) and ensuring the availability of high-quality biodiversity data.

To help address these goals, the Chief Science Advisor convened a roundtable in the spring of 2024 with twenty national and international experts from academia, government and the private sector. The roundtable explored the development of a National Digital Platform for Biodiversity Conservation to improve the storage, standardization, accessibility and use of biodiversity data. The proposed Platform would include a Data Repository and a suite of artificial intelligence (AI)-enabled Tools for analytics.

Participants identified four primary functions and associated tools for the proposed Platform:

  • Data Acquisition and Standardization: Tools for discovering, digitizing and standardizing biodiversity-relevant data and interfacing with existing data repositories.
  • Information Communication and Mobilization: Tools to effectively share data among diverse stakeholders, including citizen science platforms and international partnerships.
  • Design and Planning: Tools to support the creation of biodiversity monitoring programs and assess the effectiveness of conservation actions, consistent with Canada’s GBF commitments.
  • Analysis, Prediction and Forecasting: Tools for characterizing Canadian biodiversity trends and underlying causes, predicting future trends and characterizing high-priority policy interventions.

The discussions also highlighted existing challenges to data accessibility and mobilization, such as insufficient incentives for data sharing, non-digitized data, proprietary interests and the need for compliance with FAIR[1] data principles and OCAP[2] principles for Indigenous data sovereignty.

Proposed solutions include incentivizing data sharing, making corporate reporting mandatory and leveraging existing funding programs to digitize biodiversity-relevant data. The report also outlines actionable recommendations to advance the development of the Platform, including:

  • creating an inventory of biodiversity-relevant data sources for prioritization;
  • evaluating existing AI tools to support biodiversity data applications;
  • identifying the product and service needs of biodiversity information end-users; and
  • assessing methods to incentivize the production of open and AI-ready biodiversity-relevant data across various providers.



Background

Canada’s 2030 Nature Strategy[3] represents the country’s implementation of its obligations under the Kunming-Montreal Global Biodiversity Framework (GBF)[4]. The Strategy commits Canada to a whole-of-government and whole-of-society approach to ensure coordinated action across environmental, economic and social mandates. It calls for action, not only from federal government departments, but also from various sectors of society, leveraging the best available science and knowledge, including the integration of information from multiple knowledge systems.

Target 21 of the GBF requires signatories to “ensure that the best available data, information and knowledge, are accessible to decision makers, practitioners and the public...”. At the same time, Target 14 requires signatories to ensure the “full integration of biodiversity and its multiple values into decision-making ... within and across all levels of government, and across all sectors...”. Together, Targets 14 and 21 emphasize the importance of making biodiversity information readily available and useful for decision-makers, as a key step toward evidence-informed conservation actions.



Context

The Chief Science Advisor, in support of the Minister of Environment and Climate Change, undertook to organize several roundtables of national and international experts to address specific challenges related to the acquisition, utilization and communication of scientific data and information to support the GBF’s implementation.

In this context, the Office of the Chief Science Advisor (OCSA) convened a two-session multidisciplinary roundtable focused on advancing efforts under Targets 14 and 21 of the GBF, with particular emphasis on the following actions outlined in Canada’s 2030 Nature Strategy to further Target 21:

  • Assembling and publishing an online compendium of biodiversity resources to facilitate access to trusted sources of data and information.
  • Improving the availability of existing data and information through digitization, secure storage and open publication, including facilitating access to authoritative information about the state of biodiversity and enhancing links between domestic and international data platforms.

The roundtable included twenty national and international experts from a wide range of sectors, including academia, government and the private sector. It was co-chaired by Dr. Mona Nemer, Chief Science Advisor of Canada, and Dr. Scott Findlay, Researcher in Residence. Participants included lead analysts at the OCSA (see Appendix A for the complete list of participants). The participants’ expertise covered biodiversity and conservation policy, data sciences and archives, and artificial intelligence (AI). This multidisciplinary team provided critical insights into the challenges and opportunities for developing a National Digital Platform for Biodiversity Conservation composed of a Data Repository and a suite of AI-enabled Tools for analytics, which would ensure a comprehensive approach to data acquisition, management and application for biodiversity conservation.

The Data Repository is envisioned as a national, permanent and federated repository and archive for digital biodiversity-related information, including historical biodiversity-relevant data[5], information currently only available in non-digital form and existing historical (“legacy”) digital datasets that are vulnerable to loss. The suite of AI Tools would include a number of applications that support functions ranging from data gathering and data and knowledge quality assurance to scenario modeling and knowledge representation and communication.



Issue and Objectives

Issue: To achieve Targets 14 and 21 of the GBF, Canada must develop tools that facilitate rapid access to high-quality, trusted biodiversity-relevant data and information. Given the diverse stakeholders who need access, this information should be available in formats that facilitate its communication across a wide range of modalities.

Objective: The roundtable was convened to solicit expert advice on designing a National Digital Platform for Biodiversity Conservation. Participants were asked to address two key questions:

  1) What functions should the National Digital Platform for Biodiversity Conservation support?
  2) What criteria should be used to select or prioritize data and information for incorporation into the Platform?



Summary of Deliberations

1) What functions should the National Digital Platform for Biodiversity Conservation support?

Roundtable participants identified an extensive set of functions that the Data Repository and AI Tools might support (Appendix B). The identified functions were of four main types: those supporting (1) data acquisition, maintenance of integrity and data standardization; (2) information communication and mobilization; (3) planning and design, especially of biodiversity monitoring and surveillance programs; and (4) trend analysis, prediction and forecasting. Participants agreed that prioritization of functions was a critical next step in the design of the Data Repository and AI Tools.

  • Data acquisition, integrity, standardization and accessibility functions that require tools for discovering biodiversity-relevant data; assessing and maintaining data integrity (e.g. through quality assurance and quality control protocols); deploying data and metadata standards (e.g. FAIR principles); and interfacing with existing, proposed or in-development biodiversity data archives or repositories such as the Global Biodiversity Information Facility[6]. Interfacing functionality may be delivered in various ways. Given the numerous biodiversity-relevant digital databases already available, this last function is likely best delivered through a federated data architecture.
  • Information communication, sharing and mobilization functions that support the communication, sharing and mobilization of biodiversity-relevant data. This includes facilitating participation in biodiversity citizen science and supporting citizen science platforms; identifying and characterizing scientific controversies related to biodiversity conservation and ecosystem health; and facilitating strength-of-evidence and uncertainty analysis of postulated causal relationships underlying biodiversity loss, recovery or restoration. Sustaining these functions will require enhancing data- and information-sharing among jurisdictions and across different elements of the biodiversity data ecosystem, including among private and public institutions.
  • Design and planning functions that support the design of biodiversity monitoring and assessment programs, as well as policy and regulatory tools that encourage actions by individuals, institutions or corporations that are in line with Canada’s GBF commitments and discourage those that are not. Examples include the design of optimized biodiversity monitoring networks, the evaluation and prioritization of conservation actions and the identification and prioritization of biodiversity science and knowledge generation initiatives.
  • Analysis, prediction and forecasting functions that require tools for a number of activities such as characterizing Canadian biodiversity trends and inferring their underlying drivers, predicting future biodiversity trends under various scenarios, facilitating causal inference about the effects of mitigation efforts and identifying and characterizing legislative, regulatory or policy interventions that may impact Canadian biodiversity.

Roundtable participants also noted the following:

  • The Data Repository and AI Tools should serve as a permanent, accessible archive and inventory of Canadian biodiversity-relevant data, including legacy data. Additionally, if these are to function as an interface for biodiversity-relevant data provided by or involving Indigenous communities, implementation should respect the First Nations principles of OCAP.
  • The utility of the functions listed in Appendix B will depend in part on the targeted end-user community, as different user groups may have varying (functional) priorities based on their specific needs. Thus, an important next step in designing the platform is to identify sectoral and end-user needs.
  • Regardless of how knowledge is transmitted, end-users must be made aware of current scientific uncertainties, especially uncertainties related to the hypothesized causal drivers of biodiversity decline, the effectiveness and efficiency of pressure or threat mitigation strategies, the state of biodiversity under various future scenarios and the effectiveness of recovery, restoration, remediation or rehabilitation actions.
  • Some participants were of the view that functional limitations of the proposed Platform are more likely to arise from the quantity of new data and the effectiveness of data sharing than from limitations of AI tools. This underscores the importance of designing and deploying robust biodiversity monitoring networks to generate new, widely accessible, high-quality, high-resolution data.
  • The functions described in Appendix B vary significantly in their current feasibility because of both data and technological limitations. For example, functions such as evaluating the effectiveness of conservation actions and identifying those most likely to be effective in a particular context require data on specific conservation, recovery or restoration interventions and their implementation and associated outcomes. In the absence of such data, no AI tool can deliver these functions, and currently, this type of data is in short supply.

2) What criteria should be used to select or prioritize data and information for incorporation into the Platform?

Participants identified a number of criteria, several of which pertained to attributes of the data themselves, especially those related to data quality and reliability. Other suggested criteria include discoverability and accessibility, the diversity of uses to which candidate data could be put and, in the case of historical or legacy data, vulnerability to loss. For many participants, selection criteria for data or information sources were of lesser importance than the issue of how best to overcome existing obstacles and limitations to the accessibility of biodiversity-relevant data.

Potential prioritization criteria are summarized in Table 1. Roundtable participants also noted the following:

  • Investments in acquiring biodiversity-relevant information to support Canada’s 2030 Nature Strategy should be prioritized based on their expected utility to decision-makers. Maximizing this utility requires determining the optimal balance between (a) science and knowledge mobilization, devoted to the gathering, digitization, synthesis and communication of existing science and knowledge; and (b) science and knowledge generation, involving the planning and execution of new science and knowledge-generation activities.
  • Some datasets that are critical for understanding the history of biodiversity change are at risk of being lost due to the termination of private or public-sector programs responsible for their management. The loss of these programs could result in the permanent disappearance of valuable historical information.
  • Both the mobilization of existing biodiversity-relevant data and new knowledge generation should be prioritized at the spatial (where) and temporal (when) scales at which decisions are made. For instance, if most decision-making occurs at local or regional levels, priority should be given to data and information that are relevant at these scales and that allow for a rigorous evaluation of spatial and temporal scalability (e.g. the predictive value of larger-scale status and trends to local scales, and vice versa).

Table 1. Prioritization criteria for data or information acquisition or interface functionality in the proposed National Digital Platform for Biodiversity Conservation

| Criterion | Description |
| --- | --- |
| Support of current GBF goals and targets | To what extent will the data or information directly contribute to Canada’s commitments under the GBF? For example, how many GBF targets or indicators (headline or otherwise) will the data or information support? All else being equal, the more GBF goals, targets or indicators directly supported by the data or information, the higher the priority. |
| Species or ecosystems of conservation concern | To what extent are the data or information relevant to the conservation of species or ecosystems of concern? All else being equal, the more relevant the data or information are to these species or ecosystems, the higher the priority. |
| Capacity to support future GBF goals and targets | To what extent will the data or information potentially contribute to Canada’s commitments under future GBFs? Operationalizing this criterion will require foreseeing potential future GBF goals, targets and indicators. |
| Vulnerability to loss | What is the current risk of permanent loss of the data or information? All else being equal, the greater the risk of permanent loss, the higher the priority. |
| Data or information reliability | To what extent are the data reliable? (N.B. this is essentially a question about data quality and the implementation of protocols to assess and improve data quality by those responsible for data management.) |
| Openness | To what extent are the data or information open or able to be made open? For open data, priority should be given to improving interface functionality. Closed data should be a priority for acquisition, especially if current closure reflects largely logistical constraints. |
| FAIR (data) | To what extent are the data or information themselves FAIR? All else being equal, data or information sources that are themselves FAIR should have higher priority. |
| FAIR (metadata) | To what extent are the associated metadata FAIR? All else being equal, data or information sources with FAIR metadata should have higher priority. |
| AI actionability or AI readiness | To what extent are the data or information AI actionable or AI ready? All else being equal, the greater the AI readiness, the higher the priority. |
| Standards | Do the data or information conform to existing data standards? All else being equal, where widely deployed standards exist, data or information that conform to them should have higher priority than those that are non-conformant or non-compatible. |
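
As an illustration only, the "all else being equal, the higher the priority" logic of the Table 1 criteria could be sketched as a simple weighted score. Everything in this sketch is an assumption made for illustration: the criterion keys, the equal weights, the 0-to-1 scoring scale and the example data sources are hypothetical and were not proposed by roundtable participants. The sketch also treats every criterion as "higher is better", which is a simplification; the Openness criterion, for instance, indicates which kind of action to take (interface improvement versus acquisition) more than a rank.

```python
from dataclasses import dataclass, field

# Hypothetical criterion keys paraphrasing the rows of Table 1.
# Equal weights are an assumption; the report does not assign weights.
CRITERIA_WEIGHTS = {
    "current_gbf_support": 1.0,
    "conservation_concern": 1.0,
    "future_gbf_support": 1.0,
    "vulnerability_to_loss": 1.0,
    "reliability": 1.0,
    "openness": 1.0,
    "fair_data": 1.0,
    "fair_metadata": 1.0,
    "ai_readiness": 1.0,
    "standards_conformance": 1.0,
}


@dataclass
class DataSource:
    """A candidate data source with per-criterion scores in [0, 1]."""
    name: str
    scores: dict = field(default_factory=dict)


def priority(source, weights=CRITERIA_WEIGHTS):
    """Return the weighted mean score; criteria without a score count as 0."""
    for criterion, score in source.scores.items():
        if criterion not in weights:
            raise ValueError(f"unknown criterion: {criterion}")
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range for {criterion}: {score}")
    total_weight = sum(weights.values())
    weighted = sum(w * source.scores.get(c, 0.0) for c, w in weights.items())
    return weighted / total_weight


def rank(sources, weights=CRITERIA_WEIGHTS):
    """Sort candidate sources from highest to lowest priority."""
    return sorted(sources, key=lambda s: priority(s, weights), reverse=True)


# Hypothetical candidates, invented purely to exercise the functions.
legacy = DataSource("legacy_survey", {"vulnerability_to_loss": 1.0, "reliability": 0.5})
portal = DataSource("open_portal", {"openness": 1.0})
ranked_names = [s.name for s in rank([portal, legacy])]
```

In practice, weights and scores would need to be set with end-users and data stewards, and a single score would likely complement rather than replace expert judgment.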

Participants identified several obstacles and limitations to biodiversity-relevant data accessibility:

  • Lack of effective incentives to promote biodiversity-relevant data-sharing and the absence of institutional requirements to make such data available, especially when they result from publicly funded initiatives.
  • Lack of institutional requirements to comply with existing data and metadata principles or standards.
  • Proprietary and commercial interests, particularly regarding biodiversity-relevant data collected by businesses and industries.
  • Private sector concerns over legal liability related to information disclosure, especially when it pertains to potentially adverse biodiversity impacts.
  • Lack of data-sharing agreements, data and metadata standards and consistent (or at least compatible) data formats across various public institutions, particularly among municipal, provincial, territorial and federal governments.
  • Limited efforts by scientists, in some cases, to ensure that data are rendered into formats that are at least digital and ideally AI ready, particularly concerning historical or legacy data.
  • Lack of regulatory requirements for various sectors to report biodiversity-relevant data and the absence of tools to facilitate such reporting.

Roundtable participants agreed that overcoming barriers and limitations to the accessibility of biodiversity-relevant data is critical to the success of the proposed Data Repository. Potential solutions include:

  • development and implementation of data and metadata standards by the federal government, ensuring alignment with existing international standards;
  • promotion or creation of training programs focussed on digitizing biodiversity-relevant data and making it AI ready[7];
  • mandatory corporate reporting of biodiversity-relevant data and information;
  • provision of privileged access or functionality within the proposed Platform as an incentive for data contributors to share data, especially given the growing opportunities for commodification and monetization of biodiversity-relevant data. Any proposed incentive measures based on privileged access should be carefully evaluated for associated risks and benefits; and
  • leveraging of existing programs and infrastructures, such as the Canada Foundation for Innovation (CFI) Major Science Initiative (MSI) programs, to digitize and standardize biodiversity-relevant data and render it AI ready.

Participants also raised several issues related to data ownership, provenance and governance:

  • The private sector is increasingly both a producer and consumer of biodiversity-relevant data and information. Therefore, greater effort is needed to understand the biodiversity-relevant information needs and end-uses within this sector and to develop effective incentives for access to proprietary biodiversity-relevant data.
  • The increasing commodification and monetization of data, including biodiversity-relevant data, in the AI era means that private-sector investments in data generation or AI tool development are increasingly influenced by market demand. The resulting volatility suggests that the continued availability of AI-delivered services in support of Canada’s 2030 Nature Strategy will require ongoing public investment in biodiversity-relevant data generation and AI-based services.
  • Long-term sustainability will depend on constant investment. Therefore, efforts to develop, implement and maintain the Platform should include a realistic, flexible and adaptable funding model.
  • Any attempt to include biodiversity-relevant data or information derived from Indigenous Knowledge must adhere to OCAP principles. Additionally, effectively supporting Indigenous Knowledge holders or providers may require approaches that differ significantly from those identified for other data or service providers or contributors.



Recommendations

Given the opportunities and challenges identified above, the following recommendations are provided to support Canada’s 2030 Nature Strategy, and in particular Canada’s commitments under Targets 14 and 21 of the GBF. They require collaboration between the Government of Canada and various domestic and international partners across sectors[8]:

  1. Create an inventory of both digital and non-digital biodiversity-relevant data sources and evaluate their priority for inclusion in the proposed platform. Potential data sources include (a) legacy datasets or databases, especially those at risk of being lost; (b) datasets or databases actively maintained by public, civil society or private institutions; and (c) existing non-digital data sources, including scientific reports developed under municipal, provincial, territorial or federal government legislation, regulation or policy.
  2. Assess the potential of existing AI tools, whether open source or otherwise, to deliver the functions outlined in this report. For tools with high potential, determine the technical and resource requirements needed to realize this potential.
  3. Determine the needs of biodiversity data and information users, particularly regarding end-user products and services.
  4. Assess and evaluate potential methods to incentivize the production of open and AI-ready biodiversity-relevant data across a range of providers, including academia, businesses, government and civil society.
  5. Convene a mission-driven discussion among biodiversity scientists, knowledge holders, data scientists and the AI community to develop a funding proposal for the design, implementation and operation of a national digital platform for biodiversity conservation composed of a data repository and a number of AI-enabled tools to conduct analytics. A key aspect of this effort will be engaging Canadian AI expertise through the Pan-Canadian Artificial Intelligence Strategy and leveraging both academic and business strength in digital and AI fields in Canada.



Appendix A - Roundtable participants

Participants from the Office of the Chief Science Advisor

| Name | Position and Affiliation |
| --- | --- |
| Mona Nemer | Chief Science Advisor of Canada (Roundtable Co-Chair) |
| Scott Findlay | Researcher in Residence at the OCSA, retired Professor of Biology at the University of Ottawa (Roundtable Co-Chair) |
| Kyle Bobiwash | Researcher in Residence at the OCSA and Assistant Professor at the Department of Entomology, Faculty of Agricultural and Food Sciences, University of Manitoba |
| David Castle | Researcher in Residence at the OCSA and Professor at the School of Public Administration and Gustavson School of Business, University of Victoria |
| Gary Slater | Researcher in Residence at the OCSA and Professor at the Department of Physics, University of Ottawa |

External Participants

| Name | Position and Affiliation |
| --- | --- |
| Mark Daley | Professor of Computer Science and Chief AI Officer, University of Western Ontario |
| Jason Duffe | Geomatics Research Manager, Wildlife and Landscape Science Directorate of Environment and Climate Change Canada (ECCC) |
| Charles Francis | Research Scientist, Canadian Wildlife Service (ECCC) and Adjunct Research Professor at Carleton University |
| Kaitlyn Gaynor | Assistant Professor in Zoology and Botany at the University of British Columbia |
| Andy Gonzalez | Professor, McGill University, Liber Ero Chair in Conservation Biology and Founding Director of the Quebec Centre for Biodiversity Science |
| Jeremy Kerr | Professor, Department of Biology, University of Ottawa |
| Dimitris Koreas | Managing Director at Naturalis Biodiversity Center; Executive Director of the pan-European Research Infrastructure Distributed System of Scientific Collections (DiSSCo); Biodiversity Genomics Europe (BGE) Director and Scientific Coordinator |
| Amy Luers | Senior Global Director, Sustainability, Science and Innovation, Microsoft |
| Joel Martin | National Research Council’s Chief Digital Research Officer and Chief Science Officer |
| Joe Miller | Executive Secretary, Global Biodiversity Information Facility, Copenhagen, Denmark |
| Scott Miller | Chief Scientist and Interim Director of Smithsonian Libraries and Archives, Washington, DC |
| Sarah (Sally) Otto | Professor of Zoology at the University of British Columbia |
| Laura Pollock | Assistant Professor in Quantitative Ecology at McGill University |
| Sujeevan Ratnasingha | Adjunct Professor at the University of Guelph and Associate Director at the Centre for Biodiversity Genomics |
| David Rolnick | CIFAR AI Chair in the School of Computer Science at McGill University and Mila Quebec AI Institute |
| Jennifer Sunday | Assistant Professor of Biology and Canada Research Chair in Global Change Ecology at McGill University |
| Graham Taylor | Professor of Engineering at the University of Guelph |

Supporting Staff from the Office of the Chief Science Advisor

| Name | Position and Affiliation |
| --- | --- |
| Nancy Abou-Chahine | Policy Analyst |
| Serge Nadon | Senior Policy Analyst |



Appendix B - Functions of the proposed National Digital Platform for Biodiversity Conservation

| Function class | Function |
| --- | --- |
| Data acquisition, accessibility, integrity and standardization | Assist in the development and deployment of FAIR biodiversity data and metadata standards. |
| Data acquisition, accessibility, integrity and standardization | Interface with existing, proposed or in-development biodiversity data archives or repositories such as GBIF. |
| Information acquisition, mobilization and communication | Promote the sharing of biodiversity-related data or information. |
| Analysis, prediction and forecasting | Identify and characterize (a) spatiotemporal trends in Canadian biodiversity and (b) direct and indirect drivers of biodiversity trends. |
| Analysis, prediction and forecasting | Predict (a) future biodiversity trends under various scenarios (“forecasting”) and (b) effectiveness of conservation and restoration actions at scale. |
| Design and planning | Design optimized biodiversity monitoring and assessment networks: identification of optimal assessment and measurement endpoints (including not only “responses”, e.g. sentinel or indicator species and habitats, but also drivers and pressures); and monitoring network design (indicators and measurement endpoints, location of sampling sites, frequency of sampling, sample size, etc.). |
| Analysis, prediction and forecasting | Facilitate causal inference about the effects of pressures and drivers, including attempts to reduce or mitigate drivers and pressures. |
| Design and planning | Evaluate and prioritize conservation actions, e.g. identify which conservation actions are more likely to be effective based on current knowledge. |
| Data acquisition, accessibility, integrity and standardization | Carry out identification, capture, quality assurance, quality control and evaluation of both digital and non-digital (including unstructured) biodiversity-relevant data (e.g. extraction from public sources, text-mining, image capture). |
| Design and planning | Identify and prioritize biodiversity science and knowledge generation undertakings. |
| Information communication, sharing and mobilization | Facilitate participation in biodiversity citizen science and support citizen science platforms. |
| Analysis, prediction and forecasting | Identify and characterize legislative, regulatory or policy interventions potentially affecting Canadian biodiversity. |
| Analysis, prediction and forecasting | Characterize, assess and evaluate the existing legislative, regulatory and policy landscape to identify gaps, synergies, complementarities and conflicts. |
| Analysis, prediction and forecasting | Inform acceptable risk and action thresholds for biodiversity conservation actions at scale. |
| Information communication, sharing and mobilization | Facilitate rapid production of evidence and expert knowledge summaries on important conservation issues or questions, including knowledge summaries that provide information on the current state of biodiversity in particular regions of Canada. |
| Information communication, sharing and mobilization | Facilitate efficient and accurate transmission or translation of biodiversity knowledge to different end-users, including non-scientists. |
| Information communication, sharing and mobilization | Identify and characterize scientific controversies regarding biodiversity and ecosystem health. |
| Information communication, sharing and mobilization | Facilitate strength of evidence and uncertainty analysis of postulated causal relationships underlying biodiversity loss, recovery or restoration. |
| Data acquisition, accessibility, integrity and standardization | Support biodiversity-relevant data acquisition and resource discovery. |
| Data acquisition, accessibility, integrity and standardization | Facilitate digitization and standardization of unstructured biodiversity-relevant data. |
| Analysis and prediction | Facilitate semantic, ontological or geospatial co-registration or correlation of museum collection data and databases and text- or image-mined biodiversity-relevant data. |
| Analysis and prediction | Support biodiversity-relevant data analytics and visualization. |
