
Feature/irregular grid merged develop #120

Merged

Changes from all commits (52 commits)
a0360ba
small improvements
mathleur Nov 15, 2023
9e45b9e
add meteosuisse local grid test
mathleur Feb 7, 2024
0009eb1
update requirements
mathleur Feb 8, 2024
e547cb9
pull develop
mathleur Feb 8, 2024
9810972
black
mathleur Feb 8, 2024
af28bcb
remove unnecessary transformation function
mathleur Feb 12, 2024
53f4795
add TODO
mathleur Feb 12, 2024
06d95a0
simplify transformations
mathleur Feb 12, 2024
86bf99e
remove unnecessary find_indexes_between function in merge and type ch…
mathleur Feb 12, 2024
a16124f
remove unnecessary cyclic transformation decorator function
mathleur Feb 12, 2024
0eba71c
remove cyclic find_indices_between function
mathleur Feb 12, 2024
2633ed2
remove remap_path_key function for cyclic decorator
mathleur Feb 12, 2024
c37e7a7
separate mappers from reverse transformation and make better reverse …
mathleur Feb 13, 2024
05e5ff2
merge develop
mathleur Feb 13, 2024
b439d66
black
mathleur Feb 13, 2024
76b8b34
simplify transformations
mathleur Feb 14, 2024
cfd0983
clean up
mathleur Feb 14, 2024
c78c3d5
clean up
mathleur Feb 14, 2024
47f8fd0
put the offset calculations etc inside a find_indexes_between in cycl…
mathleur Feb 16, 2024
69250db
add a global datacube init
mathleur Feb 19, 2024
8116292
update readme
mathleur Feb 28, 2024
0100a77
small fix
mathleur Feb 28, 2024
85437c3
remove unused comments
mathleur Feb 28, 2024
0474752
remove numpy version
mathleur Mar 1, 2024
df734b3
remove numpy version
mathleur Mar 1, 2024
23ce604
add __init__.py to subfolders
mathleur Mar 1, 2024
179c6e4
Merge branch 'develop' of github.com:ecmwf/polytope into develop
mathleur Mar 1, 2024
bd0eb3d
remove requirement versions
mathleur Mar 1, 2024
b117f83
remove test requirement versions
mathleur Mar 1, 2024
2b8dc06
remove example requirement versions
mathleur Mar 1, 2024
a0c56db
Version 1.0.3
jameshawkes Mar 1, 2024
0294f09
fix pypi step in CI
mathleur Feb 16, 2024
efae719
Merge pull request #114 from ecmwf/feature/update_readme
mathleur Mar 4, 2024
8284c25
Merge pull request #103 from ecmwf/feature/swiss_grid_test
mathleur Mar 4, 2024
4eeefff
Merge pull request #105 from ecmwf/feature/easier_transformations
mathleur Mar 4, 2024
41011ee
Merge pull request #106 from ecmwf/feature/performance_optimisation
mathleur Mar 4, 2024
a14d993
merge develop
mathleur Mar 4, 2024
e7fb3b2
Merge pull request #104 from ecmwf/feature/fix_dependabots
mathleur Mar 4, 2024
01b5028
Merge branch 'develop' of github.com:ecmwf/polytope into develop
mathleur Mar 4, 2024
5670e4b
merge develop
mathleur Mar 4, 2024
f88e308
Merge pull request #111 from ecmwf/feature/reformat_datacube_init
mathleur Mar 4, 2024
d9e1823
add explicit error message to value error when no unsliceable child i…
mathleur Mar 4, 2024
0f6fcb6
put find_indexes outside of decorator
mathleur Feb 15, 2024
0d9db67
remove all functions from the decorators and put them into the transf…
mathleur Feb 15, 2024
6548e72
remove decorators completely when we create the axes/transformations
mathleur Feb 15, 2024
22f78da
completely remove decorator files and null transformation
mathleur Feb 15, 2024
b3612a5
put base functions in the transformations in the generic transformati…
mathleur Feb 16, 2024
522de07
merge develop
mathleur Mar 4, 2024
00f0161
Merge pull request #119 from ecmwf/feature/more_explicit_error
mathleur Mar 4, 2024
a65c2f1
merge with develop
mathleur Mar 5, 2024
c3f51b3
Merge pull request #109 from ecmwf/feature/refactor_cyclic_transforma…
mathleur Mar 5, 2024
bc9596a
merge develop
mathleur Mar 20, 2024
4 changes: 2 additions & 2 deletions .github/workflows/ci.yaml
@@ -155,8 +155,8 @@ jobs:
           pip install setuptools wheel twine
       - name: Build and publish
         env:
-          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
-          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
+          TWINE_USERNAME: "__token__"
+          TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }}
         run: |
           python setup.py sdist
           twine upload dist/*
2 changes: 1 addition & 1 deletion docs/requirements.txt
@@ -1,3 +1,3 @@
-jinja2<3.1.0
+jinja2>=3.1.3
 Markdown<3.2
 mkdocs>=1.0
20 changes: 10 additions & 10 deletions examples/requirements_examples.txt
@@ -1,13 +1,13 @@
 -r ../requirements.txt
 -r ../tests/requirements_test.txt
 
-matplotlib==3.6.2
-matplotlib-inline==0.1.6
-Pillow==9.3.0
-Shapely==1.8.5.post1
-shp==1.0.2
-Fiona==1.8.22
-geopandas==0.12.2
-plotly==5.11.0
-pyshp==2.3.1
-cfgrib==0.9.10.3
+matplotlib
+matplotlib-inline
+Pillow
+Shapely
+shp
+Fiona
+geopandas
+plotly
+pyshp
+cfgrib
1 change: 1 addition & 0 deletions performance/fdb_performance_3D.py
@@ -20,6 +20,7 @@ def setup_method(self, method):
             "date": {"transformation": {"merge": {"with": "time", "linkers": [" ", "00"]}}},
             "step": {"transformation": {"type_change": "int"}},
             "levelist": {"transformation": {"type_change": "int"}},
+            "longitude": {"transformation": {"cyclic": [0, 360]}},
         }
         self.config = {"class": "od", "expver": "0001", "levtype": "sfc"}
         self.fdbdatacube = FDBDatacube(self.config, axis_options=self.options)
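The new `"cyclic": [0, 360]` axis option declares that longitude wraps around. A minimal sketch of the idea behind a cyclic axis — requests outside the declared range get remapped onto it. The function names here are illustrative, not Polytope's real API, and the real transformation handles more cases:

```python
# Sketch of what a cyclic axis option enables: requested values outside the
# declared range wrap back onto it. Illustrative only, not Polytope's code.

def wrap_to_range(value, low, high):
    """Map a scalar onto the cyclic interval [low, high)."""
    span = high - low
    return low + (value - low) % span

def remap_interval(lower, upper, low=0, high=360):
    """Wrap an interval's start onto the cyclic range, keeping its width."""
    width = upper - lower
    start = wrap_to_range(lower, low, high)
    return (start, start + width)

# A request for longitudes [-10, 10] becomes [350, 370] on the [0, 360) grid
print(remap_interval(-10, 10))  # (350, 370)
```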
72 changes: 25 additions & 47 deletions polytope/datacube/backends/datacube.py
@@ -1,12 +1,10 @@
 import importlib
 import logging
-import math
 from abc import ABC, abstractmethod
 from typing import Any
 
-import xarray as xr
-
-from ...utility.combinatorics import unique, validate_axes
+from ...utility.combinatorics import validate_axes
 from ..datacube_axis import DatacubeAxis
 from ..index_tree import DatacubePath, IndexTree
 from ..transformations.datacube_transformations import (
@@ -16,6 +14,24 @@


class Datacube(ABC):
+    def __init__(self, axis_options=None, datacube_options=None):
+        if axis_options is None:
+            self.axis_options = {}
+        else:
+            self.axis_options = axis_options
+        if datacube_options is None:
+            datacube_options = {}
+        self.axis_with_identical_structure_after = datacube_options.get("identical structure after")
+        self.coupled_axes = []
+        self.axis_counter = 0
+        self.complete_axes = []
+        self.blocked_axes = []
+        self.fake_axes = []
+        self.treated_axes = []
+        self.nearest_search = {}
+        self._axes = None
+        self.transformed_axes = []

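The shared state that each backend used to set up for itself now lives once in `Datacube.__init__`, and subclasses just call `super().__init__(...)`. A stripped-down sketch of the pattern (class and attribute names abbreviated; this is not the full attribute set):

```python
class Base:
    def __init__(self, axis_options=None, datacube_options=None):
        # None defaults instead of mutable {} defaults, which would be
        # shared between instances.
        self.axis_options = axis_options if axis_options is not None else {}
        datacube_options = datacube_options if datacube_options is not None else {}
        self.identical_after = datacube_options.get("identical structure after")
        self.treated_axes = []
        self.axis_counter = 0

class Backend(Base):
    def __init__(self, data, axis_options=None, datacube_options=None):
        super().__init__(axis_options, datacube_options)  # shared state lives in Base
        self.data = data

b = Backend("payload")
print(b.axis_options, b.treated_axes)  # {} []
```

Each instance gets its own fresh lists, so mutating one backend's `treated_axes` cannot leak into another's.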
@abstractmethod
def get(self, requests: IndexTree) -> Any:
"""Return data given a set of request trees"""
@@ -46,33 +62,29 @@ def _create_axes(self, name, values, transformation_type_key, transformation_opt

# first need to change the values so that we have right type
values = transformation.change_val_type(axis_name, values)
-        if self._axes is None:
-            DatacubeAxis.create_standard(axis_name, values, self)
-        elif axis_name not in self._axes.keys():
+        if self._axes is None or axis_name not in self._axes.keys():
             DatacubeAxis.create_standard(axis_name, values, self)
# add transformation tag to axis, as well as transformation options for later
setattr(self._axes[axis_name], has_transform[transformation_type_key], True) # where has_transform is a
# factory inside datacube_transformations to set the has_transform, is_cyclic etc axis properties
# add the specific transformation handled here to the relevant axes
# Modify the axis to update with the tag
decorator_module = importlib.import_module("polytope.datacube.datacube_axis")
decorator = getattr(decorator_module, transformation_type_key)
decorator(self._axes[axis_name])

if transformation not in self._axes[axis_name].transformations: # Avoids duplicates being stored
self._axes[axis_name].transformations.append(transformation)

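`_create_axes` looks the transformation decorator up by name with `importlib`/`getattr`, applies it to the already-created axis object, and then appends the transformation only if it is not already stored. A self-contained sketch of that dynamic-lookup-plus-dedup pattern (the `cyclic` toy decorator and `Axis` class here are invented for illustration):

```python
import importlib

def cyclic(axis):
    """Toy 'decorator': tags an axis object in place, like the
    decorators in polytope.datacube.datacube_axis."""
    axis.is_cyclic = True
    return axis

class Axis:
    def __init__(self, name):
        self.name = name
        self.is_cyclic = False
        self.transformations = []

def apply_transformation(axis, transformation_type_key, transformation):
    # Resolve the decorator from its string key, mirroring
    # importlib.import_module(...) + getattr in _create_axes.
    module = importlib.import_module(__name__)
    decorator = getattr(module, transformation_type_key)
    decorator(axis)
    if transformation not in axis.transformations:  # avoids storing duplicates
        axis.transformations.append(transformation)
    return axis
```

Calling `apply_transformation` twice with the same transformation tags the axis once and stores the transformation once.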
def _add_all_transformation_axes(self, options, name, values):
for transformation_type_key in options.keys():
if transformation_type_key != "cyclic":
self.transformed_axes.append(name)
self._create_axes(name, values, transformation_type_key, options)

def _check_and_add_axes(self, options, name, values):
if options is not None:
self._add_all_transformation_axes(options, name, values)
else:
if name not in self.blocked_axes:
-            if self._axes is None:
-                DatacubeAxis.create_standard(name, values, self)
-            elif name not in self._axes.keys():
+            if self._axes is None or name not in self._axes.keys():
                 DatacubeAxis.create_standard(name, values, self)

def has_index(self, path: DatacubePath, axis, index):
@@ -96,46 +108,12 @@ def get_indices(self, path: DatacubePath, axis, lower, upper, method=None):
"""
path = self.fit_path(path)
indexes = axis.find_indexes(path, self)
-        search_ranges = axis.remap([lower, upper])
-        original_search_ranges = axis.to_intervals([lower, upper])
-        # Find the offsets for each interval in the requested range, which we will need later
-        search_ranges_offset = []
-        for r in original_search_ranges:
-            offset = axis.offset(r)
-            search_ranges_offset.append(offset)
-        idx_between = self._look_up_datacube(search_ranges, search_ranges_offset, indexes, axis, method)
-        # Remove duplicates even if difference of the order of the axis tolerance
-        if offset is not None:
-            # Note that we can only do unique if not dealing with time values
-            idx_between = unique(idx_between)
+        idx_between = axis.find_indices_between(indexes, lower, upper, self, method)

logging.info(f"For axis {axis.name} between {lower} and {upper}, found indices {idx_between}")

return idx_between

-    def _look_up_datacube(self, search_ranges, search_ranges_offset, indexes, axis, method):
-        idx_between = []
-        for i in range(len(search_ranges)):
-            r = search_ranges[i]
-            offset = search_ranges_offset[i]
-            low = r[0]
-            up = r[1]
-            indexes_between = axis.find_indices_between([indexes], low, up, self, method)
-            # Now the indexes_between are values on the cyclic range so need to remap them to their original
-            # values before returning them
-            for j in range(len(indexes_between)):
-                # if we have a special indexes between range that needs additional offset, treat it here
-                if len(indexes_between[j]) == 0:
-                    idx_between = idx_between
-                else:
-                    for k in range(len(indexes_between[j])):
-                        if offset is None:
-                            indexes_between[j][k] = indexes_between[j][k]
-                        else:
-                            indexes_between[j][k] = round(indexes_between[j][k] + offset, int(-math.log10(axis.tol)))
-                        idx_between.append(indexes_between[j][k])
-        return idx_between
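The offset bookkeeping that `_look_up_datacube` did inside the datacube is now the axis's own responsibility: `get_indices` simply delegates to `axis.find_indices_between(...)`. A sketch of the shape of that refactor, with invented axis classes — a plain axis filters its indexes, and a cyclic axis wraps the search range itself and remaps results back, instead of the datacube pre-computing offsets:

```python
class Axis:
    """Plain axis: a range lookup is just a filter over its indexes."""
    def __init__(self, indexes):
        self.indexes = sorted(indexes)

    def find_indices_between(self, low, up):
        return [i for i in self.indexes if low <= i <= up]

class CyclicAxis(Axis):
    """Cyclic axis: shifts the search range onto its base interval and
    shifts the found indexes back, hiding the offset from callers."""
    def __init__(self, indexes, low, high):
        super().__init__(indexes)
        self.low, self.high = low, high

    def find_indices_between(self, low, up):
        span = self.high - self.low
        offset = (low - self.low) // span * span  # how far the request is shifted
        found = super().find_indices_between(low - offset, up - offset)
        return [i + offset for i in found]  # remap back to requested values

# The datacube no longer needs _look_up_datacube; it just delegates:
axis = CyclicAxis([0, 90, 180, 270], 0, 360)
print(axis.find_indices_between(360, 450))  # [360, 450]
```

This sketch only handles a request that sits wholly in one period; the real transformation also splits requests that straddle the wrap point.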

def get_mapper(self, axis):
"""
Get the type mapper for a subaxis of the datacube given by label
22 changes: 5 additions & 17 deletions polytope/datacube/backends/fdb.py
@@ -11,25 +11,13 @@ class FDBDatacube(Datacube):
     def __init__(self, config=None, axis_options=None, datacube_options=None, point_cloud_options=None):
         if config is None:
             config = {}
-        if axis_options is None:
-            axis_options = {}
-        if datacube_options is None:
-            datacube_options = {}
 
+        super().__init__(axis_options, datacube_options)
 
         logging.info("Created an FDB datacube with options: " + str(axis_options))
 
-        self.axis_options = axis_options
-        self.axis_counter = 0
-        self._axes = None
-        treated_axes = []
-        self.complete_axes = []
-        self.blocked_axes = []
-        self.fake_axes = []
         self.unwanted_path = {}
         self.has_point_cloud = point_cloud_options  # NOTE: here, will be True/False
-        self.coupled_axes = []
-        self.axis_with_identical_structure_after = datacube_options.get("identical structure after")
-        self.nearest_search = {}

partial_request = config
# Find values in the level 3 FDB datacube
Expand All @@ -44,12 +32,12 @@ def __init__(self, config=None, axis_options=None, datacube_options=None, point_
values.sort()
options = axis_options.get(name, None)
self._check_and_add_axes(options, name, values)
-            treated_axes.append(name)
+            self.treated_axes.append(name)
self.complete_axes.append(name)

# add other options to axis which were just created above like "lat" for the mapper transformations for eg
for name in self._axes:
-            if name not in treated_axes:
+            if name not in self.treated_axes:
options = axis_options.get(name, None)
val = self._axes[name].type
self._check_and_add_axes(options, name, val)
@@ -250,7 +238,7 @@ def sort_fdb_request_ranges(self, range_lengths, current_start_idx, lat_length):
return (original_indices, sorted_request_ranges)

def datacube_natural_indexes(self, axis, subarray):
-        indexes = subarray[axis.name]
+        indexes = subarray.get(axis.name, None)
return indexes

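Switching from `subarray[axis.name]` to `subarray.get(axis.name, None)` makes the lookup tolerant of axes that a given subarray does not carry: callers receive `None` instead of a `KeyError`. The difference in a nutshell, illustrated here with a plain dict standing in for the FDB subarray:

```python
subarray = {"latitude": [40.0, 41.0], "step": [0, 6]}

# Bracket access raises when the axis is missing:
try:
    subarray["longitude"]
    missing_raises = False
except KeyError:
    missing_raises = True

# .get returns a default instead, which callers can test for:
indexes = subarray.get("longitude", None)
print(indexes)  # None
```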
def select(self, path, unmapped_path):
3 changes: 1 addition & 2 deletions polytope/datacube/backends/mock.py
@@ -8,6 +8,7 @@

 class MockDatacube(Datacube):
     def __init__(self, dimensions, datacube_options={}):
+        super().__init__({}, datacube_options)
         assert isinstance(dimensions, dict)

self.dimensions = dimensions
@@ -22,8 +23,6 @@ def __init__(self, dimensions, datacube_options={}):
         for k, v in reversed(dimensions.items()):
             self.stride[k] = stride_cumulative
             stride_cumulative *= self.dimensions[k]
-        self.coupled_axes = []
-        self.axis_with_identical_structure_after = ""

def get(self, requests: IndexTree):
# Takes in a datacube and verifies the leaves of the tree are complete
96 changes: 42 additions & 54 deletions polytope/datacube/backends/xarray.py
@@ -1,5 +1,6 @@
 from copy import deepcopy
 
+import numpy as np
 import xarray as xr

from .datacube import Datacube, IndexTree
@@ -9,44 +10,32 @@ class XArrayDatacube(Datacube):
"""Xarray arrays are labelled, axes can be defined as strings or integers (e.g. "time" or 0)."""

     def __init__(self, dataarray: xr.DataArray, axis_options=None, datacube_options=None, point_cloud_options=None):
-        if axis_options is None:
-            axis_options = {}
-        if datacube_options is None:
-            datacube_options = {}
-        self.axis_options = axis_options
-        self.axis_counter = 0
-        self._axes = None
+        super().__init__(axis_options, datacube_options)
         self.dataarray = dataarray
-        treated_axes = []
-        self.complete_axes = []
-        self.blocked_axes = []
-        self.fake_axes = []
-        self.nearest_search = None
-        self.coupled_axes = []
-        self.axis_with_identical_structure_after = datacube_options.get("identical structure after")
         self.has_point_cloud = point_cloud_options

         for name, values in dataarray.coords.variables.items():
             if name in dataarray.dims:
-                options = axis_options.get(name, None)
+                options = self.axis_options.get(name, None)
                 self._check_and_add_axes(options, name, values)
-                treated_axes.append(name)
+                self.treated_axes.append(name)
                 self.complete_axes.append(name)
             else:
                 if self.dataarray[name].dims == ():
-                    options = axis_options.get(name, None)
+                    options = self.axis_options.get(name, None)
                     self._check_and_add_axes(options, name, values)
-                    treated_axes.append(name)
+                    self.treated_axes.append(name)
         for name in dataarray.dims:
-            if name not in treated_axes:
-                options = axis_options.get(name, None)
+            if name not in self.treated_axes:
+                options = self.axis_options.get(name, None)
                 val = dataarray[name].values[0]
                 self._check_and_add_axes(options, name, val)
-                treated_axes.append(name)
+                self.treated_axes.append(name)
         # add other options to axis which were just created above like "lat" for the mapper transformations for eg
         for name in self._axes:
-            if name not in treated_axes:
-                options = axis_options.get(name, None)
+            if name not in self.treated_axes:
+                options = self.axis_options.get(name, None)
                 val = self._axes[name].type
                 self._check_and_add_axes(options, name, val)

@@ -55,45 +44,23 @@ def find_point_cloud(self):
if self.has_point_cloud:
return self.has_point_cloud

-    def old_get(self, requests: IndexTree):
-        for r in requests.leaves:
-            path = r.flatten()
-            if len(path.items()) == self.axis_counter:
-                # first, find the grid mapper transform
-                unmapped_path = {}
-                path_copy = deepcopy(path)
-                for key in path_copy:
-                    axis = self._axes[key]
-                    (path, unmapped_path) = axis.unmap_to_datacube(path, unmapped_path)
-                path = self.fit_path(path)
-                subxarray = self.dataarray.sel(path, method="nearest")
-                subxarray = subxarray.sel(unmapped_path)
-                value = subxarray.item()
-                key = subxarray.name
-                r.result = (key, value)
-            else:
-                r.remove_branch()

def get(self, requests: IndexTree):
# TODO: change to work with the irregular grid
axis_counter = self.axis_counter + 1
# if self.has_point_cloud:
# axis_counter = self.axis_counter - 1
# else:
# axis_counter = self.axis_counter

for r in requests.leaves:
# path = r.flatten()
path = r.flatten_with_result()
if len(path.items()) == axis_counter:
if len(path.items()) == self.axis_counter + 1:
# first, find the grid mapper transform
unmapped_path = {}
path_copy = deepcopy(path)
for key in path_copy:
if key != "result":
axis = self._axes[key]
(path, unmapped_path) = axis.unmap_to_datacube(path, unmapped_path)
path = self.fit_path(path)
key_value_path = {key: path_copy[key]}
(key_value_path, path, unmapped_path) = axis.unmap_path_key(key_value_path, path, unmapped_path)
path.update(key_value_path)
path.update(unmapped_path)

unmapped_path = {}
self.refit_path(path, unmapped_path, path)
subxarray = self.dataarray.sel(path, method="nearest")
subxarray = subxarray.sel(unmapped_path)
value = subxarray.item()
@@ -107,13 +74,34 @@ def datacube_natural_indexes(self, axis, subarray):
indexes = next(iter(subarray.xindexes.values())).to_pandas_index()
else:
             if subarray[axis.name].values.ndim == 0:
-                indexes = [subarray[axis.name].values]
+                # NOTE how we handle the two special datetime and timedelta cases to conform with numpy arrays
+                if np.issubdtype(subarray[axis.name].values.dtype, np.datetime64):
+                    indexes = [subarray[axis.name].astype("datetime64[us]").values]
+                elif np.issubdtype(subarray[axis.name].values.dtype, np.timedelta64):
+                    indexes = [subarray[axis.name].astype("timedelta64[us]").values]
+                else:
+                    indexes = [subarray[axis.name].values.tolist()]
             else:
                 indexes = subarray[axis.name].values
         return indexes

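The new branch normalises a 0-dimensional coordinate into a one-element list, special-casing datetime64/timedelta64 values so they keep their temporal type (cast to microsecond precision) while everything else is converted to a plain Python value. A rough stdlib analogue of that normalisation — the numpy-specific casting is elided, and `normalise_index` is an invented name:

```python
from datetime import datetime, timedelta

def normalise_index(value):
    """Wrap a scalar coordinate into a one-element list, keeping temporal
    types intact. (In the real code this is numpy's datetime64/timedelta64
    handling in datacube_natural_indexes.)"""
    if isinstance(value, (datetime, timedelta)):
        return [value]   # keep the temporal type as-is
    if isinstance(value, list):
        return value     # already a sequence of indexes
    return [value]       # plain scalar: box it

print(normalise_index(3.5))                    # [3.5]
print(normalise_index(datetime(2024, 3, 20)))  # [datetime.datetime(2024, 3, 20, 0, 0)]
```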
+    def refit_path(self, path_copy, unmapped_path, path):
+        for key in path.keys():
+            if key not in self.dataarray.dims:
+                path_copy.pop(key)
+            elif key not in self.dataarray.coords.dtypes:
+                unmapped_path.update({key: path[key]})
+                path_copy.pop(key, None)
+        for key in self.dataarray.coords.dtypes:
+            key_dtype = self.dataarray.coords.dtypes[key]
+            if key_dtype.type is np.str_ and key in path.keys():
+                unmapped_path.update({key: path[key]})
+                path_copy.pop(key, None)

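`refit_path` partitions a request path into keys that can go straight to `DataArray.sel` (real dimensions with coordinates) and keys that must be selected separately via `unmapped_path`. The same partitioning with plain dicts — a standalone function rather than a method, and the dimension/coordinate names are invented:

```python
def refit_path(path, dims, coords):
    """Split `path` into a directly selectable part and an unmapped
    remainder, mimicking the shape of XArrayDatacube.refit_path."""
    path_copy = dict(path)
    unmapped = {}
    for key in path:
        if key not in dims:
            path_copy.pop(key)         # not a dimension: drop it entirely
        elif key not in coords:
            unmapped[key] = path[key]  # dimension without coordinate values
            path_copy.pop(key, None)
    return path_copy, unmapped

path = {"latitude": 48.0, "step": 6, "fake_axis": 1}
selectable, unmapped = refit_path(path, dims={"latitude", "step"}, coords={"latitude"})
print(selectable, unmapped)  # {'latitude': 48.0} {'step': 6}
```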
def select(self, path, unmapped_path):
subarray = self.dataarray.sel(path, method="nearest")
path_copy = deepcopy(path)
self.refit_path(path_copy, unmapped_path, path)
subarray = self.dataarray.sel(path_copy, method="nearest")
subarray = subarray.sel(unmapped_path)
return subarray
