Commit

merge with develop
mathleur committed Dec 9, 2024
2 parents a04776b + a12fdee commit bb8bc38
Showing 153 changed files with 4,568 additions and 451 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yaml
@@ -135,7 +135,7 @@ jobs:
python -m coverage report
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
uses: codecov/codecov-action@v4
with:
files: coverage.xml
deploy:
1 change: 1 addition & 0 deletions .gitignore
@@ -25,3 +25,4 @@ polytope_venv_latest
new_updated_numpy_venv
newest-polytope-venv
serializedTree
new_polytope_venv
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
21 changes: 10 additions & 11 deletions docs/Overview/Overview.md → docs/Algorithm/Overview/Overview.md
@@ -10,30 +10,29 @@ Developed by ECMWF - the European Centre for Medium-Range Weather Forecasts - it

### Traditional Extraction Techniques

Traditional data extraction techniques only allow users to access datacubes "orthogonally" by selecting specific values or ranges along datacube dimensions.
Such data access mechanisms can be seen as extracting so-called "bounding boxes" of data.
These mechanisms are quite limited however as many user requests cannot be formulated using bounding boxes.
Traditional data extraction techniques only allow users to access boxes of data from datacubes.
These techniques are quite restrictive, however, as many user requests cannot be formulated using such boxes.

!!!note "Example"

Imagine for example someone interested in extracting temperature data over the shape of France.
France is not a box shape over latitude and longitude.
Using current extraction techniques, this exact request would therefore be impossible and users would instead need to request a bounding box around France.
Imagine for example someone interested in extracting wind data over the Mediterranean sea.
The Mediterranean is not a box shape over latitude and longitude.
Using current extraction techniques, this exact request would therefore be impossible and users would instead need to request a bounding box around the Mediterranean.
The user would thus get back much more data than they truly need.

In higher dimensions, this becomes an even bigger challenge with only tiny fractions of the extracted data being useful to users.

### Polytope Extraction Technique

As an alternative, Polytope enables users to access datacubes "non-orthogonally".
Instead of extracting bounding boxes of data, Polytope has the capability of querying high-dimensional "polytopes" along several axes of a datacube.
This is much less restrictive than the popular bounding box approach described before.
Instead, Polytope enables users to access high-dimensional "polytopes" from datacubes, rather than only boxes of data.
<!-- Instead of extracting bounding boxes of data, Polytope has the capability of querying high-dimensional "polytopes" along several axes of a datacube. -->
<!-- This is much less restrictive than the popular bounding box approach described before. -->

!!!note "Example"

Using Polytope, extracting the temperature over just the shape of France is now trivially possible by specifying the right polytope.
Using Polytope, extracting the temperature over just the shape of the Mediterranean is now trivially possible by specifying the right polytope.
This returns much less data than by using a bounding box approach.

These polytope-based requests do in fact allow Polytope to fulfill its two main aims.
Indeed, because polytope requests return only the exact data users need, they significantly reduce I/O usage as less data has to be transmitted.
Indeed, because polytope requests return only the data users need, they significantly reduce I/O usage as less data has to be transmitted.
Moreover, because only the data inside the requested polytope is returned, this method completely removes the challenge of post-processing on the user side, as wanted.
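
To make the difference between the two approaches concrete, here is a minimal, self-contained sketch in plain Python. It does not use the Polytope API: the grid, the triangular region and the helper names (`point_in_polygon`, `bounding_box`) are all hypothetical, chosen only to illustrate how many grid points a bounding box returns compared with the polygon it encloses.

```python
from typing import List, Tuple

Point2D = Tuple[float, float]


def point_in_polygon(point: Point2D, polygon: List[Point2D]) -> bool:
    """Ray-casting test: True if `point` lies inside `polygon`."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def bounding_box(polygon: List[Point2D]) -> Tuple[Point2D, Point2D]:
    """Smallest axis-aligned box containing every vertex of `polygon`."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    return (min(xs), min(ys)), (max(xs), max(ys))


# Hypothetical 1-degree (latitude, longitude) grid, purely illustrative.
grid = [(float(lat), float(lon)) for lat in range(11) for lon in range(11)]

# Hypothetical triangular region of interest (not a real coastline).
triangle = [(0.5, 0.5), (9.0, 0.5), (0.5, 9.0)]

(lo_x, lo_y), (hi_x, hi_y) = bounding_box(triangle)
in_box = [p for p in grid if lo_x <= p[0] <= hi_x and lo_y <= p[1] <= hi_y]
in_polygon = [p for p in in_box if point_in_polygon(p, triangle)]

print(len(in_box), len(in_polygon))
```

On this toy grid the bounding box contains 81 points while the triangle contains only 36, so more than half of a box-based extraction would be discarded by the user afterwards.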
File renamed without changes.
41 changes: 41 additions & 0 deletions docs/Algorithm/User_Guide/Building_Features.md
@@ -0,0 +1,41 @@
# Building Features

The Polytope software implements a set of base shapes that might be of interest to users. These are detailed [here](../Developer_Guide/shapes.md).

For many applications however, these shapes are not directly of interest and should rather be used as building blocks for more complex and domain-specific "features", such as timeseries or country areas.

The main requirement when building such features in Polytope is that the feature should be defined on all dimensions of the provided datacube.
This implies that, when defining lower-dimensional shapes in higher-dimensional datacubes, the remaining axes still need to be specified within the Polytope request (most likely as *Select* shapes).

For example, for a given datacube with dimensions "level", "step", "latitude" and "longitude", we could query the following shapes:

- a timeseries of a point which would be defined as

Request(
Point(["latitude", "longitude"], [[p1_lat, p1_lon]]),
Span("step", start_step, end_step),
Select("level", [level1])
)


- a specific country area which would be defined as

Request(
Polygon(["latitude", "longitude"], country_points),
Select("step", [step1]),
Select("level", [level1])
)

- a flight path which would be defined as

Request(
Path(
["latitude", "longitude", "level", "step"],
Box(
["latitude", "longitude", "level", "step"],
[0, 0, 0, 0],
[lat_padding, lon_padding, level_padding, step_padding]
),
flight_points
)
)
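
The requirement that a feature be defined on every dimension of the datacube can be checked mechanically before a request is built. The sketch below is only an illustration under stated assumptions: `Shape` and `missing_axes` are hypothetical stand-ins written for this example and are not part of the Polytope API.

```python
from dataclasses import dataclass
from typing import Iterable, Set, Tuple


@dataclass(frozen=True)
class Shape:
    """Hypothetical stand-in for a shape: its kind and the axes it constrains."""

    kind: str
    axes: Tuple[str, ...]


def missing_axes(datacube_axes: Iterable[str], shapes: Iterable[Shape]) -> Set[str]:
    """Return the datacube axes that no shape in the request constrains."""
    covered = {axis for shape in shapes for axis in shape.axes}
    return set(datacube_axes) - covered


datacube_axes = ["level", "step", "latitude", "longitude"]

# Mirrors the timeseries feature above: every axis is covered.
timeseries = [
    Shape("Point", ("latitude", "longitude")),
    Shape("Span", ("step",)),
    Shape("Select", ("level",)),
]

# A country area that forgets to select a level.
incomplete_area = [
    Shape("Polygon", ("latitude", "longitude")),
    Shape("Select", ("step",)),
]

print(missing_axes(datacube_axes, timeseries))
print(missing_axes(datacube_axes, incomplete_area))
```

An empty result means the feature is fully specified; any axis returned would still need a shape, most likely a *Select*.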
@@ -1,7 +1,7 @@
# Example
Here is a step-by-step example of how to use the Polytope software.

1. In this example, we first specify the data which will be in our Xarray datacube. Note that the data here comes from the GRIB file called "winds.grib", which is 3-dimensional with dimensions: step, latitude and longitude.
1. In this example, we first specify the data which will be in our XArray datacube. Note that the data here comes from the GRIB file called "winds.grib", which is 3-dimensional with dimensions: step, latitude and longitude.

import xarray as xr

@@ -26,13 +26,13 @@ or from PyPI with the command

Polytope's tests and examples require some additional dependencies compared to the main Polytope software.

- **Git Large File Storage**
<!-- - **Git Large File Storage**
Polytope uses Git Large File Storage (LFS) to store large data files used in its tests and examples.
To run the tests and examples, it is thus necessary to install Git LFS, by following instructions provided [here](https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage) for example.
Once Git LFS is installed, individual data files can be downloaded using the command
git lfs pull --include="*" --exclude=""
git lfs pull --include="*" --exclude="" -->

- **Additional Dependencies**

@@ -15,6 +15,8 @@ For a quick guide of how to install and use Polytope, refer to the links below:

- <a href="../Example">Example</a>

- <a href="../Building_Features">Building Features</a>

!!!note
<!-- For more information about Polytopes' APIs, refer to the [API page](../Developer_Guide/API.md). -->
An exhaustive list of all shapes that can currently be requested using Polytope can be found [here](../Developer_Guide/shapes.md).
51 changes: 51 additions & 0 deletions docs/Service/Data_Portfolio.md
@@ -0,0 +1,51 @@
# Data Portfolio

Polytope feature extraction only has access to data that is stored on an FDB. The dataset currently available via Polytope feature extraction is the operational forecast. We plan to add Destination Earth Digital Twin data in the future.

## Operational Forecast Data

The following values are available for each field:

* `class` : `od`
* `stream` : `enfo` `oper`
* `type` : `fc` `pf` `cf`
* `levtype` : `sfc` `pl` `ml`
* `expver` : `0001`
* `domain` : `g`
* `step` : `0/to/360` (All steps may not be available between `0` and `360`)

If `stream` is `enfo`:

* `number` : `0/to/50`

If `levtype` is `pl` or `ml` a `levelist` must be provided:

* `levelist` : `1/to/1000`

`pl` and `ml` also only contain a subset of parameters that are available in grid point. These are:

* `pl`
* `o3`
* `clwc`
* `q`
* `pv`
* `ciwc`
* `cc`
* `ml`
* `q`
* `cat`
* `o3`
* `clwc`
* `ciwc`
* `cc`
* `cswc`
* `crwe`
* `ttpha`

For `sfc` most `params` will be available but not all.

Only data that is contained in the operational FDB can be requested via Polytope feature extraction. The FDB usually only contains the last two days of forecasts.

We sometimes limit the size of requests for area features such as bounding box and polygon to maintain quality of service.

Access to operational data is limited by our release schedule.
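
The constraints listed above can be checked on the client side before a request is submitted. The following is a minimal sketch under stated assumptions: the request is represented as a plain dictionary, and `ALLOWED` and `check_request` are hypothetical names invented for this example rather than part of any Polytope client. Only the fixed-value keys and the `levelist` rule from this page are encoded.

```python
from typing import Dict, List

# Allowed values copied from the list above; the dictionary layout is an assumption.
ALLOWED = {
    "class": {"od"},
    "stream": {"enfo", "oper"},
    "type": {"fc", "pf", "cf"},
    "levtype": {"sfc", "pl", "ml"},
    "expver": {"0001"},
    "domain": {"g"},
}


def check_request(request: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for key, allowed in ALLOWED.items():
        value = request.get(key)
        if value not in allowed:
            problems.append(f"{key}={value!r} is not one of {sorted(allowed)}")
    # Pressure-level and model-level data need an explicit list of levels.
    if request.get("levtype") in {"pl", "ml"} and "levelist" not in request:
        problems.append("levelist is required when levtype is 'pl' or 'ml'")
    return problems


surface_request = {
    "class": "od",
    "stream": "oper",
    "type": "fc",
    "levtype": "sfc",
    "expver": "0001",
    "domain": "g",
}
pressure_request = dict(surface_request, levtype="pl")  # no levelist given

print(check_request(surface_request))
print(check_request(pressure_request))
```

A check like this cannot tell whether a given step, parameter or date is actually present in the FDB; it only catches requests that fall outside the documented portfolio.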