Dynamic classification of global ecosystems based on geographical parameters

NoahKuertoes/global_ecosystem_classifier

INTRODUCTION

With the climate changing rapidly and humans having an increasing impact on global ecosystems, it is no longer sufficient to classify and analyse ecosystems based on their geographical location and historical data alone. Here, we classify ecosystems based on their wider geoecological parameters. The goal is to create a classifier that allows us to classify ecosystems dynamically. In the longer term, it could be used to predict, for example, expected species richness, species extinction rates, soil status, soil deterioration and eventually ecosystem collapse.

Project objective

Minimal viable product (MVP): On EcoVerse we show the development of ecosystems over time.

Goal: Showing a trend of ecosystem development and migration over time

Validation: Highlighting renaturation efforts of the past decade, such as the Great Green Wall Initiative, the Chinese Loess Plateau rehabilitation and the Netherlands' Marker Wadden

Disclaimer

This project was designed and carried out as part of the Data Analytics Consulting Bootcamp 2024-2 at neuefische GmbH. The bootcamp includes a capstone project with a timeframe of four weeks, covering setup, data retrieval, analysis and integration.

Collaborators:

| Member | Role |
|---|---|
| Heiko Främbs | Communications, project management |
| Soma Pasumarthy | Web integration, database maintenance |
| Alexander Schmidt | Data acquisition, data handling |
| Noah Kürtös | Conceptualization, scripts |

About the repository: Collaborators work in independent branches, and every merge into main must be approved by at least one other collaborator. Team communication takes place on this Miro board.

DATA

| Name | Description | Content | Data | URL | Notes |
|---|---|---|---|---|---|
| EarthData | Library of accumulated satellite data | Radiation-based climate data (detailed here) | GLDAS Noah Land Surface Model L4 monthly 0.25 x 0.25 degree V2.1 (GLDAS_NOAH025_M) | | From 2000-2024 |
| | | | GLDAS Noah Land Surface Model L4 monthly 0.25 x 0.25 degree V2.0 (GLDAS_NOAH025_M) | | From 1948-2014 |
| | | Elevation data (relative) | ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer) | | From 2009, as GeoTIFF |
| | | Night-time illumination data | VNP46A1 - VIIRS/NPP Daily Gridded Day Night Band 500m Linear Lat Lon Grid Night | | From 2012 onward, as GeoTIFF. Black Marble |
| | | Vegetation index (NDVI) | MOD13A2.061 Terra Vegetation Indices 16-Day Global 1km | | Earth Engine snippets, as GeoTIFF |
| Copernicus | Sentinel mission satellite data. 90m global resolution (TanDEM-X mission) | Elevation data (absolute) | Copernicus DEM - Global and European Digital Elevation Model | | Mission: from 2011-2015. Data available: from 2019-2026. Earth Engine snippets, as GeoTIFF |
| DEPRECATED: Meteostat | Library of global weather stations | Weather data | Data is sourced through the Python library meteostat (documentation) | | Representation bias towards EU and NA |
| DEPRECATED: Open elevation | Free API alternative to Google | Elevation data (absolute) | Data is sourced through the API (open_elevation_request.py) | | Open elevation doesn't feature data from lat > 60° |

WORKFLOW

1. Literature research

NASA's EarthData features a wide range of global satellite data which can be used to infer climate parameters. These parameters are already widely used in local studies, which are featured in its "News" section. Moreover, the United States Geological Survey provides a "Global Ecosystem Viewer" for global ecosystems (27.10.2016).

Hence, the focus of this project was to integrate and automate the classification of ecosystems using dynamically updated satellite data.

2. Data acquisition and storage

All data is acquired through EarthData or Google Earth Engine. The data is stored on an AWS PostgreSQL server. A migration is currently in progress.

2.1 Exemplary visualisation of input data

1.) VIIRS night time illumination

image


2.) MODIS vegetation index

image


3.) Copernicus elevation data

image

3. Modelling

For modelling, all parameters were aggregated for each grid cell, and a model was trained on 73 different locations covering 15 different ecosystems using RandomForestClassifier from sklearn.

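from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

label_encoder = LabelEncoder()
scaler = StandardScaler()

# define model data: df holds one row per location; all columns except
# the label ("ecosystem") and the location name are used as features
X = df.drop(columns=["ecosystem", "name"])
y = label_encoder.fit_transform(df["ecosystem"])

# split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# scale features (fitted on the training set only, then applied to the test set)
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# modelling
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)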
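
To check how well such a classifier generalises, it can be scored on the held-out test split. The following is a minimal sketch, not the project's actual evaluation: the table is synthetic, and the column names `ndvi`, `elevation` and `night_light` as well as the three ecosystem labels are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Synthetic stand-in for the training table (hypothetical columns and labels)
rng = np.random.default_rng(42)
labels = np.repeat(["desert", "forest", "tundra"], 20)
centers = {
    "desert": (0.1, 300.0, 1.0),
    "forest": (0.8, 500.0, 5.0),
    "tundra": (0.3, 100.0, 0.5),
}
df = pd.DataFrame(
    [
        {
            "name": f"site_{i}",
            "ecosystem": lab,
            "ndvi": centers[lab][0] + rng.normal(0, 0.02),
            "elevation": centers[lab][1] + rng.normal(0, 10.0),
            "night_light": centers[lab][2] + rng.normal(0, 0.1),
        }
        for i, lab in enumerate(labels)
    ]
)

label_encoder = LabelEncoder()
scaler = StandardScaler()

X = df.drop(columns=["ecosystem", "name"])
y = label_encoder.fit_transform(df["ecosystem"])

# stratify=y keeps every ecosystem represented in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Score on the held-out split and decode predictions back to ecosystem names
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
predicted_names = label_encoder.inverse_transform(y_pred)
print(f"Test accuracy: {accuracy:.2f}")
```

With only 73 locations spread over 15 ecosystems, a single 20 % split leaves very few test samples per class, so the score from one split may be noisy. Stratified cross-validation could be a more stable alternative.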

3.1 Training data (2020)

image

3.2 Classification (2012)

image

4. Forecasting

For forecasting, a LinearRegression model from sklearn was fitted for each parameter and each grid cell individually.

import numpy as np
from sklearn.linear_model import LinearRegression

model = LinearRegression()

# Prepare data for the model: year is the single feature, the parameter value is the target
X, y = np.array(df_pixel["year"]).reshape(-1, 1), np.array(df_pixel[parameter])

# Check if there is enough data to fit the model
if len(X) > 1:
    # Fit the model and calculate R-squared on the fitted data
    model.fit(X, y)
    r_squared = model.score(X, y)
    predict_val = model.predict(np.array([[predict_year]]))[0]  # Extract single value
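
The snippet above handles a single grid cell and parameter. A minimal sketch of how it could be wrapped in a loop over all cells and parameters follows. The table layout (`cell_id`, `year`), the parameter name `ndvi` and the helper `forecast_cells` are assumptions for illustration, not the project's code.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def forecast_cells(df, parameters, predict_year):
    """Fit one linear trend per grid cell and parameter and extrapolate it.

    Hypothetical helper: expects the columns 'cell_id' and 'year' plus one
    column per parameter.
    """
    rows = []
    for cell_id, df_pixel in df.groupby("cell_id"):
        row = {"cell_id": cell_id, "year": predict_year}
        for parameter in parameters:
            data = df_pixel[["year", parameter]].dropna()
            if len(data) > 1:
                X = data[["year"]].to_numpy()
                y = data[parameter].to_numpy()
                model = LinearRegression().fit(X, y)
                row[parameter] = model.predict(np.array([[predict_year]]))[0]
                row[f"{parameter}_r2"] = model.score(X, y)
            else:
                # Not enough observations to fit a trend
                row[parameter] = np.nan
                row[f"{parameter}_r2"] = np.nan
        rows.append(row)
    return pd.DataFrame(rows)


# Synthetic example: two cells with exactly linear trends
years = np.arange(2012, 2021)
df = pd.DataFrame(
    {
        "cell_id": np.repeat([0, 1], len(years)),
        "year": np.tile(years, 2),
        "ndvi": np.concatenate(
            [0.50 + 0.01 * (years - 2012), 0.30 - 0.005 * (years - 2012)]
        ),
    }
)
forecast = forecast_cells(df, ["ndvi"], predict_year=2030)
print(forecast)
```

Extrapolating a linear trend far beyond the observed years (for example to 2100) can produce values outside a parameter's valid range, such as an NDVI outside [-1, 1]. Keeping the R-squared value next to each forecast makes it possible to flag cells where the trend is weak.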

4.1 Classification of forecast for 2030

image

4.2 Classification of forecast for 2100

image
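
Classifying a forecast amounts to passing the extrapolated parameters through the same scaler and classifier that were fitted on the training data. The sketch below illustrates this under stated assumptions: the training table is a tiny synthetic one, and the feature names `ndvi` and `elevation` and the two labels are hypothetical.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder, StandardScaler

feature_cols = ["ndvi", "elevation"]  # hypothetical parameter names

# Tiny synthetic training table standing in for the labelled locations
train = pd.DataFrame(
    {
        "ndvi": [0.10, 0.12, 0.15, 0.80, 0.82, 0.85],
        "elevation": [300.0, 320.0, 310.0, 500.0, 520.0, 510.0],
        "ecosystem": ["desert", "desert", "desert", "forest", "forest", "forest"],
    }
)

label_encoder = LabelEncoder()
scaler = StandardScaler()
X_train = scaler.fit_transform(train[feature_cols])
y_train = label_encoder.fit_transform(train["ecosystem"])
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)

# Forecasted parameters per grid cell (e.g. the output of the regression step)
forecast = pd.DataFrame(
    {"cell_id": [0, 1], "ndvi": [0.11, 0.83], "elevation": [305.0, 515.0]}
)

# Reuse the fitted scaler; the column order must match the training features
X_forecast = scaler.transform(forecast[feature_cols])
forecast["ecosystem"] = label_encoder.inverse_transform(model.predict(X_forecast))
print(forecast[["cell_id", "ecosystem"]])
```

The scaler is deliberately not refitted on the forecast: refitting would shift the feature scale and make the predictions inconsistent with the trained model.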
