LILY COST Action

Nature-based solutions compilation and categorization

Project

I developed a centralized database of Nature-Based Solutions (NbS) projects by integrating data from 12 platforms using web scraping and NLP.

Tools & Technologies

LLMs · Web Scraping · Python

Impact

  • 2000+ solutions collected
  • 95% validation accuracy
  • 12 data platforms scraped

Client

CSID Lab Heidelberg

Role

Project Owner

Duration

2025 - present

Overview

Nature-Based Solutions (NbS) are interventions that leverage natural ecosystems to address societal challenges like climate change, public health, and biodiversity loss. Despite their growing implementation across Europe, data on NbS projects is fragmented across various platforms, making it difficult to analyze trends or evaluate impact.

Objective

As part of a broader initiative to centralize and standardize NbS data, I contributed to the development of a database that consolidates information from multiple sources to enable better monitoring, evaluation, and research.

Figure 1: NbS types frequency heatmap

Figure 2: NbS location spread

Key Contributions

Database Landscape Review

Conducted a comprehensive survey of existing NbS databases and platforms, documenting their scope, data structures, and accessibility.

Data Collection via NLP & Web Scraping

Automated the extraction of relevant project data (e.g., location, implementation dates, objectives, NbS types) using web scraping and Natural Language Processing techniques.
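As a rough illustration of this step, the sketch below pulls a year range and NbS type mentions out of a scraped HTML page using only the Python standard library. The sample page, the `NBS_TYPES` vocabulary, and the output field names are hypothetical; the real scrapers were platform-specific and relied on richer NLP than keyword matching.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text from an HTML page, skipping script/style content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical vocabulary; a real one would come from an NbS typology.
NBS_TYPES = ["green roof", "rain garden", "urban forest", "wetland"]

# Matches ranges such as "2018-2021" or "2018 to 2021".
YEAR_RANGE = re.compile(r"\b((?:19|20)\d{2})\s*(?:-|to)\s*((?:19|20)\d{2})\b")


def extract_project(html):
    """Return a small record with implementation years and NbS types found."""
    text = html_to_text(html)
    lowered = text.lower()
    match = YEAR_RANGE.search(text)
    return {
        "start_year": int(match.group(1)) if match else None,
        "end_year": int(match.group(2)) if match else None,
        "nbs_types": sorted(t for t in NBS_TYPES if t in lowered),
    }


# Made-up page used only for illustration.
SAMPLE_HTML = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<h1>Green roof programme</h1>"
    "<p>Implemented 2018-2021 in Exampletown, with a rain garden.</p>"
    "</body></html>"
)

record = extract_project(SAMPLE_HTML)
print(record)
```

Keeping text extraction separate from field extraction means the same field logic can be reused across platforms whose page layouts differ.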

Deduplication Using LLMs

Developed a pipeline based on large language models (LLMs) to identify and remove duplicate project entries across disparate data sources.
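One way such a pipeline can be organized is sketched below: a cheap string-similarity pass proposes candidate pairs, and only those pairs are sent to an LLM for a yes/no judgment. This is an assumption about structure, not a description of the actual pipeline. The record fields, the 0.6 threshold, the prompt wording, and `stub_judge` (a placeholder standing in for a real LLM call) are all hypothetical.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize(title):
    """Lowercase and strip punctuation so trivial differences are ignored."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", title.lower()).split())


def candidate_pairs(records, threshold=0.6):
    """Yield cross-source record pairs whose titles look similar."""
    for a, b in combinations(records, 2):
        if a["source"] == b["source"]:
            continue  # this sketch only looks for duplicates across sources
        score = SequenceMatcher(
            None, normalize(a["title"]), normalize(b["title"])
        ).ratio()
        if score >= threshold:
            yield a, b


def build_prompt(a, b):
    return (
        "Do these two entries describe the same project? Answer yes or no.\n"
        f"Entry A: {a['title']} ({a['source']})\n"
        f"Entry B: {b['title']} ({b['source']})"
    )


def deduplicate(records, judge, threshold=0.6):
    """Drop the later record of each pair that the judge calls a duplicate."""
    dropped = set()
    for a, b in candidate_pairs(records, threshold):
        if a["id"] in dropped or b["id"] in dropped:
            continue
        if judge(build_prompt(a, b)).strip().lower().startswith("yes"):
            dropped.add(b["id"])
    return [r for r in records if r["id"] not in dropped]


def stub_judge(prompt):
    """Placeholder for an LLM call; a real judge would query a model."""
    return "yes" if prompt.lower().count("green roofs") >= 2 else "no"


# Made-up records used only for illustration.
records = [
    {"id": 1, "source": "A", "title": "Green Roofs for Exampletown"},
    {"id": 2, "source": "B", "title": "Green roofs for Exampletown (pilot)"},
    {"id": 3, "source": "B", "title": "River restoration at Sample Creek"},
]

unique = deduplicate(records, stub_judge)
print([r["id"] for r in unique])
```

Filtering candidates first keeps the number of LLM calls far below the quadratic number of record pairs, which matters at the scale of a few thousand entries.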

Hazard Target Identification

Parsed NbS objectives to extract and classify climate hazard targets (e.g., heatwaves, floods) as reported by implementers, linking them to potential health and well-being indicators.
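A keyword baseline gives a feel for the classification task, although free-text objectives generally need something more robust. The sketch below is such a baseline only; the hazard labels and keyword lists in `HAZARD_KEYWORDS` are hypothetical and do not reproduce the taxonomy used in the project.

```python
import re

# Hypothetical hazard vocabulary, for illustration only.
HAZARD_KEYWORDS = {
    "heatwave": ["heatwave", "heat wave", "urban heat", "heat stress"],
    "flood": ["flood", "stormwater", "runoff"],
    "drought": ["drought", "water scarcity"],
}


def classify_hazards(objective):
    """Return hazard labels whose keywords appear in the objective text."""
    text = objective.lower()
    found = []
    for hazard, keywords in HAZARD_KEYWORDS.items():
        # \b anchors the keyword at a word start, so "floods" still matches.
        if any(re.search(r"\b" + re.escape(k), text) for k in keywords):
            found.append(hazard)
    return found


# Made-up objective used only for illustration.
example = "Reduce urban heat island effects and manage stormwater runoff during floods."
hazards = classify_hazards(example)
print(hazards)
```

A project can target several hazards at once, so the function returns a list of labels rather than a single class.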

Descriptive and Contextual Analysis

Computed descriptive statistics and performed contextual analysis using remote sensing data and LLMs to quantify trends, coverage, and the environmental context of each NbS project.
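The descriptive part of this analysis, such as the counts behind a type frequency heatmap, can be sketched with the standard library alone. The example below covers only that counting step, not the remote sensing or LLM components, and the `country` and `nbs_types` fields are assumed names for illustration.

```python
from collections import Counter


def type_frequency(projects):
    """Count how many projects mention each NbS type, per country."""
    counts = Counter()
    for project in projects:
        # set() avoids counting a type twice within the same project.
        for nbs_type in set(project["nbs_types"]):
            counts[(project["country"], nbs_type)] += 1
    return counts


def as_matrix(counts):
    """Arrange the counts as rows (countries) by columns (NbS types)."""
    countries = sorted({country for country, _ in counts})
    types = sorted({nbs_type for _, nbs_type in counts})
    rows = [[counts.get((c, t), 0) for t in types] for c in countries]
    return countries, types, rows


# Made-up projects used only for illustration.
projects = [
    {"country": "DE", "nbs_types": ["green roof", "rain garden"]},
    {"country": "DE", "nbs_types": ["green roof"]},
    {"country": "FR", "nbs_types": ["urban forest"]},
]

countries, types, rows = as_matrix(type_frequency(projects))
print(countries, types, rows)
```

The resulting matrix can be handed directly to a plotting library to draw a heatmap.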

Validation with Ground-Truth Labels

Used the Una.city platform, a manually curated NbS dataset, as a validation benchmark for model performance and data quality assurance.
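Validation against a curated benchmark comes down to comparing predicted labels with reference labels for the projects both datasets share. The sketch below shows that comparison under assumed inputs: two dictionaries keyed by a hypothetical project identifier. The structure of the Una.city data is not described here, so the keys and labels are placeholders.

```python
def accuracy(predicted, reference):
    """Share of projects present in both mappings whose labels agree."""
    shared = predicted.keys() & reference.keys()
    if not shared:
        raise ValueError("no overlapping projects to validate against")
    correct = sum(predicted[key] == reference[key] for key in shared)
    return correct / len(shared)


# Made-up labels used only for illustration.
predicted = {"p1": "flood", "p2": "heatwave", "p3": "flood", "p4": "drought", "p9": "flood"}
reference = {"p1": "flood", "p2": "heatwave", "p3": "drought", "p4": "drought"}

score = accuracy(predicted, reference)
print(f"{score:.2f}")
```

Restricting the comparison to shared identifiers keeps projects that are missing from the benchmark from being counted as errors.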