Unify Layer listing #554

Merged — 25 commits from layer_listing into master on Jul 14, 2022

Commits (25)
92147ba
add new trait for layer collection listing and return workflows inste…
michaelmattig Jun 15, 2022
06e5355
turning external dataset provider into external layer provider (wip)
michaelmattig Jun 20, 2022
1775fa6
turning external dataset provider into external layer provider (wip)
michaelmattig Jun 20, 2022
ada4d32
Merge branch 'layer_listing' of https://github.com/geo-engine/geoengi…
michaelmattig Jun 20, 2022
ab7f92f
make layer listing and mock provider work again
michaelmattig Jun 22, 2022
6a355ff
implement layer provider for dataset db
michaelmattig Jun 23, 2022
062a5d6
layer collections for the pro version
michaelmattig Jun 24, 2022
e605ecb
migrate gfbio to layer provider
michaelmattig Jun 27, 2022
435b153
migrate external providers
michaelmattig Jun 27, 2022
95a93f6
fix test
michaelmattig Jun 28, 2022
9d4fce3
Merge branch 'master' of https://github.com/geo-engine/geoengine into…
michaelmattig Jun 28, 2022
b857b38
layer listing for netcdfcf
michaelmattig Jun 29, 2022
99987d7
check root collection ids
michaelmattig Jun 29, 2022
c2d3b51
fix external dataset handling for dataset db
michaelmattig Jun 29, 2022
3a902a7
cleanup
michaelmattig Jun 29, 2022
7e0cad2
Merge branch 'master' of https://github.com/geo-engine/geoengine into…
michaelmattig Jul 6, 2022
158facb
add trait for OperatorNames and convenience method creating source op…
michaelmattig Jul 6, 2022
ddfafc8
add an "unsorted collection" for leftover items
michaelmattig Jul 6, 2022
c3a4f3f
rename ids and providers
michaelmattig Jul 12, 2022
9cbb82b
change dataset route
michaelmattig Jul 12, 2022
e7f6ffd
update changelog
michaelmattig Jul 13, 2022
63f2f31
Merge branch 'master' into layer_listing
michaelmattig Jul 14, 2022
38dc9a3
fix fmt
michaelmattig Jul 14, 2022
a3453ee
fix test
michaelmattig Jul 14, 2022
cd031d8
fix doctest
michaelmattig Jul 14, 2022

Files changed

10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- Added a layers API that allows browsing datasets, stored layers and external data in a uniform fashion

- https://github.com/geo-engine/geoengine/pull/554

- Added a `ClassHistogram` plot operator for creating histograms of categorical data

- https://github.com/geo-engine/geoengine/pull/560
@@ -21,6 +25,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Changed

- Refactored dataset ids and external providers

- https://github.com/geo-engine/geoengine/pull/554
- **breaking** the parameters of the source operators changed, which makes old workflow JSONs incompatible
- **breaking** the ids of datasets changed, which makes old dataset definition JSONs incompatible

- Added `Measurement`s to vector data workflows

- https://github.com/geo-engine/geoengine/pull/557
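As a quick illustration of the breaking change above — a minimal sketch of the new id wire format, based on the serde attributes in the `datatypes/src/dataset.rs` diff below. The exact JSON field order and the use of `serde_json` here are assumptions for illustration only.

```rust
use geoengine_datatypes::dataset::{DataId, DatasetId};
use geoengine_datatypes::util::Identifier; // provides `new()` for the `identifier!` types

fn main() {
    // internal data is now identified by `DatasetId` directly; `DataId` wraps it
    let id: DataId = DatasetId::new().into();

    // expected to look roughly like {"type":"internal","datasetId":"<uuid>"};
    // external data serializes with "providerId" and "layerId" fields instead
    println!("{}", serde_json::to_string(&id).expect("serializable"));
}
```

Old workflow JSONs that still carry the previous `dataset` parameter shape therefore no longer deserialize.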
48 changes: 29 additions & 19 deletions datatypes/src/dataset.rs
@@ -1,31 +1,41 @@
use crate::identifier;
use serde::{Deserialize, Serialize};

identifier!(DatasetProviderId);
identifier!(DataProviderId);

identifier!(InternalDatasetId);

identifier!(StagingDatasetId);
// Identifier for datasets managed by Geo Engine
identifier!(DatasetId);

#[derive(Debug, Clone, Hash, Eq, PartialEq, Deserialize, Serialize)]
#[serde(rename_all = "camelCase", tag = "type")]
pub enum DatasetId {
/// The identifier for loadable data. It is used in the source operators to get the loading info (aka parametrization)
/// for accessing the data. Internal data is loaded from datasets, external from `DataProvider`s.
pub enum DataId {
#[serde(rename_all = "camelCase")]
Internal {
dataset_id: InternalDatasetId,
dataset_id: DatasetId,
},
External(ExternalDatasetId),
External(ExternalDataId),
}

#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq, Hash)]
pub struct LayerId(pub String);

impl std::fmt::Display for LayerId {
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
write!(f, "{}", self.0)
}
}

#[derive(Debug, Clone, Hash, Eq, PartialEq, Deserialize, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct ExternalDatasetId {
pub provider_id: DatasetProviderId,
pub dataset_id: String,
pub struct ExternalDataId {
pub provider_id: DataProviderId,
pub layer_id: LayerId,
}

impl DatasetId {
pub fn internal(&self) -> Option<InternalDatasetId> {
impl DataId {
pub fn internal(&self) -> Option<DatasetId> {
if let Self::Internal {
dataset_id: dataset,
} = self
@@ -35,22 +45,22 @@ impl DatasetId {
None
}

pub fn external(&self) -> Option<ExternalDatasetId> {
pub fn external(&self) -> Option<ExternalDataId> {
if let Self::External(id) = self {
return Some(id.clone());
}
None
}
}

impl From<InternalDatasetId> for DatasetId {
fn from(value: InternalDatasetId) -> Self {
DatasetId::Internal { dataset_id: value }
impl From<DatasetId> for DataId {
fn from(value: DatasetId) -> Self {
DataId::Internal { dataset_id: value }
}
}

impl From<ExternalDatasetId> for DatasetId {
fn from(value: ExternalDatasetId) -> Self {
DatasetId::External(value)
impl From<ExternalDataId> for DataId {
fn from(value: ExternalDataId) -> Self {
DataId::External(value)
}
}
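A short usage sketch of the conversions and accessors defined above; the provider layer id string is made up for illustration.

```rust
use geoengine_datatypes::dataset::{DataId, DataProviderId, DatasetId, ExternalDataId, LayerId};
use geoengine_datatypes::util::Identifier;

fn main() {
    // datasets managed by Geo Engine
    let internal: DataId = DatasetId::new().into();
    assert!(internal.internal().is_some());
    assert!(internal.external().is_none());

    // external data is addressed by provider id + layer id
    let external: DataId = ExternalDataId {
        provider_id: DataProviderId::new(),
        layer_id: LayerId("example-collection/example-layer".to_string()),
    }
    .into();
    assert!(external.external().is_some());

    // `LayerId` implements `Display` and prints its inner string
    println!("{}", external.external().unwrap().layer_id);
}
```

The `From` impls keep call sites terse: a plain `DatasetId` or `ExternalDataId` converts into a `DataId` with `.into()`.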
2 changes: 1 addition & 1 deletion operators/benches/expression.rs
@@ -43,7 +43,7 @@ fn ndvi_source(execution_context: &mut MockExecutionContext) -> Box<dyn RasterOp
let ndvi_id = add_ndvi_dataset(execution_context);

let gdal_operator = GdalSource {
params: GdalSourceParameters { dataset: ndvi_id },
params: GdalSourceParameters { data: ndvi_id },
};

gdal_operator.boxed()
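For context, the renamed source parameter in isolation — a sketch of building the operator after the change. The module paths for `GdalSource` and `GdalSourceParameters` are assumed from the crate layout visible in this PR.

```rust
use geoengine_datatypes::dataset::{DataId, DatasetId};
use geoengine_datatypes::util::Identifier;
use geoengine_operators::engine::RasterOperator;
use geoengine_operators::source::{GdalSource, GdalSourceParameters};

fn main() {
    let id: DataId = DatasetId::new().into();

    // `dataset: ...` became `data: ...`, matching the renamed `DataId` type
    let _operator: Box<dyn RasterOperator> = GdalSource {
        params: GdalSourceParameters { data: id },
    }
    .boxed();
}
```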
27 changes: 9 additions & 18 deletions operators/benches/workflows.rs
@@ -3,7 +3,7 @@ use std::hint::black_box;
use std::time::{Duration, Instant};

use futures::TryStreamExt;
use geoengine_datatypes::dataset::DatasetId;
use geoengine_datatypes::dataset::{DataId, DatasetId};
use geoengine_datatypes::primitives::{
Measurement, QueryRectangle, RasterQueryRectangle, SpatialPartitioned,
};
@@ -12,7 +12,6 @@ use geoengine_datatypes::spatial_reference::SpatialReference;

use geoengine_datatypes::util::Identifier;
use geoengine_datatypes::{
dataset::InternalDatasetId,
primitives::{SpatialPartition2D, SpatialResolution, TimeInterval},
raster::{GridSize, RasterTile2D, TilingSpecification},
};
@@ -602,13 +601,11 @@ fn bench_gdal_source_operator_tile_size(bench_collector: &mut BenchmarkCollector
// TilingSpecification::new((0., 0.).into(), [9000, 9000].into()),
];

let id: DatasetId = InternalDatasetId::new().into();
let id: DataId = DatasetId::new().into();
let meta_data = create_ndvi_meta_data();

let gdal_operator = GdalSource {
params: GdalSourceParameters {
dataset: id.clone(),
},
params: GdalSourceParameters { data: id.clone() },
}
.boxed();

Expand Down Expand Up @@ -654,13 +651,11 @@ fn bench_gdal_source_operator_with_expression_tile_size(bench_collector: &mut Be
// TilingSpecification::new((0., 0.).into(), [9000, 9000].into()),
];

let id: DatasetId = InternalDatasetId::new().into();
let id: DataId = DatasetId::new().into();
let meta_data = create_ndvi_meta_data();

let gdal_operator = GdalSource {
params: GdalSourceParameters {
dataset: id.clone(),
},
params: GdalSourceParameters { data: id.clone() },
};

let expression_operator = Expression {
@@ -717,13 +712,11 @@ fn bench_gdal_source_operator_with_identity_reprojection(bench_collector: &mut B
// TilingSpecification::new((0., 0.).into(), [9000, 9000].into()),
];

let id: DatasetId = InternalDatasetId::new().into();
let id: DataId = DatasetId::new().into();
let meta_data = create_ndvi_meta_data();

let gdal_operator = GdalSource {
params: GdalSourceParameters {
dataset: id.clone(),
},
params: GdalSourceParameters { data: id.clone() },
};

let projection_operator = Reprojection {
@@ -781,13 +774,11 @@ fn bench_gdal_source_operator_with_4326_to_3857_reprojection(
// TilingSpecification::new((0., 0.).into(), [9000, 9000].into()),
];

let id: DatasetId = InternalDatasetId::new().into();
let id: DataId = DatasetId::new().into();
let meta_data = create_ndvi_meta_data();

let gdal_operator = GdalSource {
params: GdalSourceParameters {
dataset: id.clone(),
},
params: GdalSourceParameters { data: id.clone() },
};

let projection_operator = Reprojection {
23 changes: 10 additions & 13 deletions operators/src/engine/execution_context.rs
@@ -7,7 +7,7 @@ use crate::mock::MockDatasetDataSourceLoadingInfo;
use crate::source::{GdalLoadingInfo, OgrSourceDataset};
use crate::util::{create_rayon_thread_pool, Result};
use async_trait::async_trait;
use geoengine_datatypes::dataset::DatasetId;
use geoengine_datatypes::dataset::DataId;
use geoengine_datatypes::primitives::{RasterQueryRectangle, VectorQueryRectangle};
use geoengine_datatypes::raster::TilingSpecification;
use geoengine_datatypes::util::test::TestDefault;
@@ -35,7 +35,7 @@ pub trait MetaDataProvider<L, R, Q>
where
R: ResultDescriptor,
{
async fn meta_data(&self, dataset: &DatasetId) -> Result<Box<dyn MetaData<L, R, Q>>>;
async fn meta_data(&self, id: &DataId) -> Result<Box<dyn MetaData<L, R, Q>>>;
}

#[async_trait]
@@ -60,7 +60,7 @@ where

pub struct MockExecutionContext {
pub thread_pool: Arc<ThreadPool>,
pub meta_data: HashMap<DatasetId, Box<dyn Any + Send + Sync>>,
pub meta_data: HashMap<DataId, Box<dyn Any + Send + Sync>>,
pub tiling_specification: TilingSpecification,
}

@@ -94,17 +94,14 @@ impl MockExecutionContext {
}
}

pub fn add_meta_data<L, R, Q>(
&mut self,
dataset: DatasetId,
meta_data: Box<dyn MetaData<L, R, Q>>,
) where
pub fn add_meta_data<L, R, Q>(&mut self, data: DataId, meta_data: Box<dyn MetaData<L, R, Q>>)
where
L: Send + Sync + 'static,
R: Send + Sync + 'static + ResultDescriptor,
Q: Send + Sync + 'static,
{
self.meta_data
.insert(dataset, Box::new(meta_data) as Box<dyn Any + Send + Sync>);
.insert(data, Box::new(meta_data) as Box<dyn Any + Send + Sync>);
}

pub fn mock_query_context(&self, chunk_byte_size: ChunkByteSize) -> MockQueryContext {
@@ -132,13 +129,13 @@
R: 'static + ResultDescriptor,
Q: 'static,
{
async fn meta_data(&self, dataset: &DatasetId) -> Result<Box<dyn MetaData<L, R, Q>>> {
async fn meta_data(&self, id: &DataId) -> Result<Box<dyn MetaData<L, R, Q>>> {
let meta_data = self
.meta_data
.get(dataset)
.ok_or(Error::UnknownDatasetId)?
.get(id)
.ok_or(Error::UnknownDataId)?
.downcast_ref::<Box<dyn MetaData<L, R, Q>>>()
.ok_or(Error::DatasetLoadingInfoProviderMismatch)?;
.ok_or(Error::InvalidMetaDataType)?;

Ok(meta_data.clone())
}
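A minimal sketch of how the renamed pieces fit together, assuming the boxed metadata is produced elsewhere (e.g. by a test helper) and that the types shown are re-exported under `geoengine_operators::engine` and `geoengine_operators::source`.

```rust
use geoengine_datatypes::dataset::{DataId, DatasetId};
use geoengine_datatypes::primitives::RasterQueryRectangle;
use geoengine_datatypes::util::Identifier;
use geoengine_operators::engine::{
    MetaData, MetaDataProvider, MockExecutionContext, RasterResultDescriptor,
};
use geoengine_operators::source::GdalLoadingInfo;

async fn register_and_resolve(
    mut ctx: MockExecutionContext,
    meta: Box<dyn MetaData<GdalLoadingInfo, RasterResultDescriptor, RasterQueryRectangle>>,
) {
    // metadata is now keyed by `DataId` instead of the old `DatasetId` enum
    let id: DataId = DatasetId::new().into();
    ctx.add_meta_data(id.clone(), meta);

    // lookup goes through the generic `MetaDataProvider` impl; a missing key yields
    // `Error::UnknownDataId`, a type mismatch `Error::InvalidMetaDataType`
    let _resolved: Box<dyn MetaData<GdalLoadingInfo, RasterResultDescriptor, RasterQueryRectangle>> =
        ctx.meta_data(&id).await.expect("metadata was registered");
}
```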
4 changes: 2 additions & 2 deletions operators/src/engine/mod.rs
@@ -6,8 +6,8 @@ pub use execution_context::{
ExecutionContext, MetaData, MetaDataProvider, MockExecutionContext, StaticMetaData,
};
pub use operator::{
InitializedPlotOperator, InitializedRasterOperator, InitializedVectorOperator,
OperatorDatasets, PlotOperator, RasterOperator, TypedOperator, VectorOperator,
InitializedPlotOperator, InitializedRasterOperator, InitializedVectorOperator, OperatorData,
OperatorName, PlotOperator, RasterOperator, TypedOperator, VectorOperator,
};
pub use operator_impl::{
MultipleRasterOrSingleVectorSource, MultipleRasterSources, MultipleVectorSources, Operator,
33 changes: 18 additions & 15 deletions operators/src/engine/operator.rs
@@ -3,31 +3,30 @@ use serde::{Deserialize, Serialize};
use crate::error;
use crate::util::Result;
use async_trait::async_trait;
use geoengine_datatypes::dataset::DatasetId;
use geoengine_datatypes::dataset::DataId;

use super::{
query_processor::{TypedRasterQueryProcessor, TypedVectorQueryProcessor},
CloneablePlotOperator, CloneableRasterOperator, CloneableVectorOperator, ExecutionContext,
PlotResultDescriptor, RasterResultDescriptor, TypedPlotQueryProcessor, VectorResultDescriptor,
};

pub trait OperatorDatasets {
/// Get the dataset ids of all the datasets involved in this operator and its sources
fn datasets(&self) -> Vec<DatasetId> {
pub trait OperatorData {
/// Get the ids of all the data involved in this operator and its sources
fn data_ids(&self) -> Vec<DataId> {
let mut datasets = vec![];
self.datasets_collect(&mut datasets);
self.data_ids_collect(&mut datasets);
datasets
}

#[allow(clippy::ptr_arg)] // must allow `push` on `datasets`
fn datasets_collect(&self, datasets: &mut Vec<DatasetId>);
fn data_ids_collect(&self, data_ids: &mut Vec<DataId>);
}

/// Common methods for `RasterOperator`s
#[typetag::serde(tag = "type")]
#[async_trait]
pub trait RasterOperator:
CloneableRasterOperator + OperatorDatasets + Send + Sync + std::fmt::Debug
CloneableRasterOperator + OperatorData + Send + Sync + std::fmt::Debug
{
async fn initialize(
self: Box<Self>,
Expand All @@ -47,7 +46,7 @@ pub trait RasterOperator:
#[typetag::serde(tag = "type")]
#[async_trait]
pub trait VectorOperator:
CloneableVectorOperator + OperatorDatasets + Send + Sync + std::fmt::Debug
CloneableVectorOperator + OperatorData + Send + Sync + std::fmt::Debug
{
async fn initialize(
self: Box<Self>,
Expand All @@ -67,7 +66,7 @@ pub trait VectorOperator:
#[typetag::serde(tag = "type")]
#[async_trait]
pub trait PlotOperator:
CloneablePlotOperator + OperatorDatasets + Send + Sync + std::fmt::Debug
CloneablePlotOperator + OperatorData + Send + Sync + std::fmt::Debug
{
async fn initialize(
self: Box<Self>,
@@ -264,12 +263,16 @@ macro_rules! call_on_typed_operator {
};
}

impl OperatorDatasets for TypedOperator {
fn datasets_collect(&self, datasets: &mut Vec<DatasetId>) {
impl OperatorData for TypedOperator {
fn data_ids_collect(&self, data_ids: &mut Vec<DataId>) {
match self {
TypedOperator::Vector(v) => v.datasets_collect(datasets),
TypedOperator::Raster(r) => r.datasets_collect(datasets),
TypedOperator::Plot(p) => p.datasets_collect(datasets),
TypedOperator::Vector(v) => v.data_ids_collect(data_ids),
TypedOperator::Raster(r) => r.data_ids_collect(data_ids),
TypedOperator::Plot(p) => p.data_ids_collect(data_ids),
}
}
}

pub trait OperatorName {
const TYPE_NAME: &'static str;
}
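A sketch of what the renamed traits ask of an operator implementation, using a made-up `MySource` rather than any real operator from the crate.

```rust
use geoengine_datatypes::dataset::DataId;
use geoengine_operators::engine::{OperatorData, OperatorName};

/// hypothetical parameter struct of a source operator
struct MySourceParameters {
    data: DataId,
}

struct MySource;

impl OperatorData for MySourceParameters {
    // leaf operator: contribute this operator's data id; operators with sources
    // would also recurse into them here
    fn data_ids_collect(&self, data_ids: &mut Vec<DataId>) {
        data_ids.push(self.data.clone());
    }
}

impl OperatorName for MySource {
    // the convenience method for creating source operators (commit 158facb) can
    // refer to this constant
    const TYPE_NAME: &'static str = "MySource";
}
```

The provided `data_ids()` method then collects everything for free by delegating to `data_ids_collect`.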