Merge pull request #412 from geoadmin/develop

New Release v1.23.1 - #patch
geoadmin · Apr 10, 2024 · e84f938 · e84f938
2 parents a6da664 + 254bd9c
commit e84f938
Show file tree

Hide file tree

Showing 3 changed files with 281 additions and 1 deletion.
diff --git a/app/middleware/logging.py b/app/middleware/logging.py
@@ -54,7 +54,7 @@ def __call__(self, request):
         # HttpResponse and JSONResponse sure have
         # (e.g. WhiteNoiseFileResponse doesn't)
         if isinstance(response, (HttpResponse, JsonResponse)):
-            extra["response"]["payload"] = response.content[:200].decode()
+            extra["response"]["payload"] = response.content.decode()[:200]
 
         logger.info("Response %s", response.status_code, extra=extra)
         # Code to be executed for each request/response after

diff --git a/doc/assets/stac-search-endpoint-example.ipynb b/doc/assets/stac-search-endpoint-example.ipynb
@@ -0,0 +1,262 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Demo: STAC API POST /search endpoint example\n",
+    "\n",
+    "This notebook is meant to be a very short and simple demo on how to do a simple search in swisstopo's STAC API and do a quick visualization of the downloaded data.<br>\n",
+    "Here are some useful links, you might be interested in:<br>\n",
+    "- [STAC SPEC (\"Speck-Bsteck\" ;-))](https://stacspec.org/en)\n",
+    "- [STAC API SPEC](https://github.com/radiantearth/stac-api-spec)\n",
+    "- [swisstopo's STAC API](https://data.geo.admin.ch/api/stac/v0.9/)\n",
+    "- [swisstopo's STAC API docu/spec](https://data.geo.admin.ch/api/stac/static/spec/v0.9/api.html)\n",
+    "\n",
+    "**Note: To keep this notebook short and simple, some steps/best practices, such as proper error handling or checking checksums of downloaded files, are not implemented to a full extent**"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Preparations\n",
+    "First we need to import the necessary modules. <br>\n",
+    "Make sure you have installed all the dependencies, e.g. by running: `pip install folium geopandas jupyter mapclassify matplotlib pandas requests xyzservices`,<br>\n",
+    "or e.g. by creating a [virtual environment](https://docs.python-guide.org/dev/virtualenvs/) using the `Pipfile`, which is contained in the same github directory, as this notebook."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import json\n",
+    "import os\n",
+    "import pprint\n",
+    "\n",
+    "import geopandas as gpd\n",
+    "import requests"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We will use the [API's POST /search endpoint](https://data.geo.admin.ch/api/stac/static/spec/v0.9/api.html#tag/STAC/operation/postSearchSTAC) for our search.<br>\n",
+    "For this we need to define, where (i.e. in which collection(s)) and what (i.e. search term) we want to search.<br>\n",
+    "Lets try:\n",
+    "- search for items, which contain \"2023\" in their title\n",
+    "- search in the collection \"ch.are.agglomerationsverkehr\" (also check the according [metadata](https://www.geocat.ch/geonetwork/srv/ger/catalog.search#/metadata/f4b72bb8-aff0-4eab-b1e8-48e698c0e8fb))<br>\n",
+    "- lets limit the returned results to only 1 ( = \"only the first hit\"), to make things less complicated ;-)<br>\n",
+    "<br>\n",
+    "Note: You could also do more advanced searches, using bounding boxes or limiting to time intervals, and the like."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "search_string = \"2023\"\n",
+    "search_collection = \"ch.are.agglomerationsverkehr\"\n",
+    "limit = 1\n",
+    "\n",
+    "payload = {\n",
+    "    \"limit\": limit,\n",
+    "    \"query\": {\n",
+    "        \"title\": {\n",
+    "            \"contains\": search_string\n",
+    "        },\n",
+    "    },\n",
+    "    \"collections\": [search_collection]\n",
+    "}\n",
+    "\n",
+    "\n",
+    "try:\n",
+    "    response = requests.post(\"https://data.geo.admin.ch/api/stac/v0.9/search\",\n",
+    "                             data=json.dumps(payload), headers={\"Content-Type\": \"application/json\"})\n",
+    "# I know, this is a \"too broad exception\" clause ;-)\n",
+    "# For a demo, this will do. But of course, one should be more specific here and\n",
+    "# catch different exceptions specifically\n",
+    "except Exception as e:\n",
+    "    raise(e)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now we can check, what we got back from the STAC API.\n",
+    "If everything went well, we should have something very similar to the \"Response samples\" from the POST /search endpoint's documentation."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "response.json()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "From the returned \"FeatureCollection\" we can now extract the details of the first (and only, since we used \"limit=1\") item can be extracted.<br>\n",
+    "**(Again, for the sake of simplicity, there's no error handling implemented here. But in real life it might well make sense, to check the response of our request. Is there any data returned? Which format, etc. Otherwise there might be exceptions when trying to access certain elements of the response, which don't exist)**"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# if we had more than one item, we could use a loop over all items here.\n",
+    "# In this demo, we only have one item\n",
+    "item = response.json()[\"features\"][0]\n",
+    "assets = item[\"assets\"]\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now, `assets` contains a list of all assets belonging to the item that our search returned.<br>\n",
+    "Let's check, how `assets` looks like"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pprint.pprint(assets)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "There is only one assest, a .gpkg, belonging to our item. We'll use the `list(dict.items())` trick here, to access the data we need.<br>\n",
+    "Again, if we had more than only one asset, we could do everything in a loop over all the assets."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "asset_name = list(assets.items())[0][0]\n",
+    "asset_data = list(assets.items())[0][1]\n",
+    "asset_href = asset_data[\"href\"]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now we have the asset's href and can download it from the STAC API.<br>\n",
+    "Note: In the code below, we will only download the asset, in case it does not exist locally already.<br>\n",
+    "For our demo and especially for very large files this makes sense, as it won't download the file twice.<br>\n",
+    "**In real life you might want to check, if the local version is old, and the STAC API could offer a newer version, which you might want to download.**\n",
+    "<br>\n",
+    "<br>\n",
+    "Additionally, in our case we are dealing with a .pgkp file. In other cases, the API might return a zipped file, which would need to be extracted after downloading. This would require adaptions of the code below.<br>\n",
+    "And, last but not least, in real life you'd want to check if the checksum of the downloaded file matches the one returned by the API."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fname = os.path.basename(asset_href)\n",
+    "# define the local path of the asset file\n",
+    "path = f\"./data/{asset_name.split('.', 1)[0]}/{asset_name}\"\n",
+    "\n",
+    "# create the necessary directories, if not existing\n",
+    "if not os.path.exists(f\"./data/{asset_name.split('.', 1)[0]}\"):\n",
+    "    os.makedirs(f\"./data/{asset_name.split('.', 1)[0]}\")\n",
+    "\n",
+    "# only send the request, if asset file does not yet exist locally\n",
+    "# useful for large files ;-)\n",
+    "# Again: in real life, you'd want to do proper exception handling here, too, as well\n",
+    "# as check if the checksum of the download matches the one specified in the API's response.\n",
+    "if not os.path.exists(path):\n",
+    "    req = requests.get(asset_href)\n",
+    "    with open(path, 'wb') as outfile:\n",
+    "        outfile.write(req.content)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now we can simply load our asset file into a geopandas GeoDataFrame"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "data = gpd.read_file(path)\n",
+    "data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "And finally, with only one line of code, have a quick look at the asset to see, if it is what we expected, while using a swisstopo map as background ;-)<br>\n",
+    "This is done using geopanda's [geopandas.GeoDataFrame.explore](https://geopandas.org/en/stable/docs/reference/api/geopandas.GeoDataFrame.explore.html#geopandas-geodataframe-explore), which creates an interactive leaflet map."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "data.explore(\"Name\", legend=False, tiles=\"SwissFederalGeoportal NationalMapColor\", attr=\"© Data:swisstopo\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.18"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/doc/stac-search-endpoint-example.md b/doc/stac-search-endpoint-example.md
@@ -0,0 +1,18 @@
+# STAC POST /search endpoint example
+
+## Disclaimer
+
+This Readme file and the according Jupyter notebook might not stay in the service-stac repository but probably will be moved into another repo in the mid-term.
+
+## Jupyter notebook
+
+This [jupyter notebook](./assets/stac-search-endpoint-example.ipynb) serves as a simple example for using the STAC API's POST /search endpoint. For the sake of readability, some best practices, such as proper error handling, checking checksums of the downloaded files, and the like, are not (fully) implemented in this notebook.
+
+## Preparation
+
+For this notebook to work, you need a recent Python version installed. You further need to install the following dependencies, e.g. by running `pip install folium geopandas jupyter mapclassify matplotlib pandas requests xyzservices` or by creating a virtual environment, where those dependencies are installed.
+
+## Jupyter notebooks
+
+Please refer to the documentation on how to run Jupyter notebooks on your machine: https://docs.jupyter.org/en/latest/running.html
+Some IDEs support Jupyter notebooks, when the according extensions are installed.