Private Datasets: Update use case: Creating a dataset (#761)
* AddCommand: add "--public" argument

* Command::before_run(): introduce

* AddCommand::before_run(): add single-tenant check

* DatasetRepository::create_dataset_from_snapshot(): add "publicly_available" argument

* Register RebacService

* Updates after painful rebasing

* CreateDatasetFromSnapshotUseCase: add Options (mostly for DatasetVisibility)

* CreateDatasetFromSnapshotUseCaseImpl: use Options

* DatasetRepository{LocalFs,S3}: move RebacService in favor of use cases

* AddCommand: provide Options for the use case

* Correct import names after rebasing: RebacRepositoryInMem -> InMemoryRebacRepository

* CreateDatasetFromSnapshotUseCase: use default options in tests

* create_dataset_from_snapshot_impl: remove outdated docstring

* Remove "cli_arguments" experiment

* CreateDatasetUseCase: add a todo

* AddCommand: get multi-tenantness from the dataset repo

* CreateDatasetFromSnapshotUseCaseOptions: remove "is_multi_tenant_workspace"

* GQL: DatasetsMut::{createEmpty, createFromSnapshot}(): add "datasetPubliclyAvailable" argument

* DatasetRepositoryWriter: is superset of DatasetRepository

* DatasetVisibility: remove confusing Default derive

* AddCommand: support dataset visibility as a value, not a bool flag

* kamu-cli, prepare_dependencies_graph_repository(): remove outdated deps

* kamu-cli: register SqliteRebacRepository

* kamu-cli: move InMemoryRebacRepository to the right place (configure_in_memory_components)

* DatasetRepositoryWriter: revert: now is not superset of DatasetRepository

* Fix tests after merge conflict fixes

* kamu-adapter-auth-oso: remove unused deps

* CreateDatasetFromSnapshotUseCaseImpl: cleanup of the created dataset, in case of a ReBAC property set failure

* kamu-auth-rebac: add MockRebacRepository hidden by the "testing" feature gate

* CreateDatasetFromSnapshotUseCase: check ReBAC stuff in tests

* test_created_datasets_have_the_correct_visibility_attribute(): add

* test_clearing_the_dataset_if_a_rebac_property_setting_error(): add

* CHANGELOG.md: update

* RebacRepository: absorb "mockall" import into the derive

* kamu-cli, command_needs_transaction(): add AddCommand

* GQL: DatasetsMut::{create_empty,create_from_snapshot}(): use DatasetVisibility

* CreateDatasetFromSnapshotUseCase::execute(): pass "options" by value

* Move ReBAC logic outside of Use Cases

* test_created_datasets_have_the_correct_visibility_attribute(): use OutboxImmediate

* test_clearing_the_dataset_if_a_rebac_property_setting_error(): remove outdated

* RebacServiceImpl::handle_dataset_lifecycle_deleted_message()

* DatasetLifecycleMessage::Renamed(): introduce

* RebacServiceImpl::handle_dataset_lifecycle_renamed_message()

* InMemoryRebacRepository::delete_entity_properties(): implement

* SqliteRebacRepository::delete_entity_properties(): implement

* DatasetLifecycleMessage::Renamed(): removed -- in fact, we don't really need to handle renaming, since we store IDs

* CHANGELOG.md: add recommendation re ordering

* CHANGELOG.md: update

* command_needs_transaction(): AddCommand no longer requires a transaction

* CreateDatasetUseCaseImpl: update todo

* InternalError::bail(): simplify implementation

* CreateDatasetUseCase::execute(): add "options" argument

* CreateDatasetFromSnapshotUseCase::execute(): use CreateDatasetUseCaseOptions as type

* RebacServiceImpl: remove pub modifier for message handlers

* Remove unused kamu-auth-rebac-{inmem,services} crates from tests

* RebacService::{delete_dataset_properties,delete_account_dataset_relation}(): idempotent errors

* RebacServiceImpl: do not lose errors context

* MultiTenantRebacDatasetLifecycleMessageConsumer: extract

* DatasetVisibility: PubliclyAvailable -> Public

* kamu-cli, value_parse_dataset_visibility(): reuse serde_yaml

* CreateDatasetUseCaseOptions: add the Default derive

* kamu-auth-rebac-services: absorb consumer tests

* CHANGELOG.md: correct after rebase

* DatasetVisibility: rename is_publicly_available() to is_public()

* RebacServiceImpl::delete_dataset_properties(): remove an outdated todo

* kamu-auth-rebac-services, test_rebac_properties_added(): update var names for readability

* DatasetLifecycleMessageCreated::dataset_visibility(): add deserialization default
s373r authored Aug 28, 2024
1 parent 81f2327 commit 9f952a5
Showing 92 changed files with 1,024 additions and 124 deletions.
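The last commit above adds a deserialization default for `DatasetLifecycleMessageCreated::dataset_visibility()`, so that messages serialized before this change (which lack the field) still deserialize. A dependency-free sketch of that fallback logic — the real crate presumably wires this through a serde field default, and the concrete default value is not visible in this excerpt, so `Private` below is purely an assumption:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DatasetVisibility {
    Private,
    Public,
}

// Stand-in for the real message struct: `dataset_visibility` may be absent
// in messages serialized before this change, hence Option + explicit fallback.
struct DatasetLifecycleMessageCreated {
    dataset_id: String,
    dataset_visibility: Option<DatasetVisibility>,
}

impl DatasetLifecycleMessageCreated {
    fn dataset_visibility(&self) -> DatasetVisibility {
        // Assumed fallback value; the value the crate actually chose
        // is not shown in this diff.
        self.dataset_visibility.unwrap_or(DatasetVisibility::Private)
    }
}

fn main() {
    let old_msg = DatasetLifecycleMessageCreated {
        dataset_id: String::from("example"),
        dataset_visibility: None,
    };
    assert_eq!(old_msg.dataset_visibility(), DatasetVisibility::Private);
    println!("fallback ok");
}
```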
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,20 @@
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

<!--- Recommendation: for ease of reading, use the following order: -->
<!--- - Added -->
<!--- - Changed -->
<!--- - Fixed -->

## Unreleased
### Added
- Private Datasets, ReBAC integration:
- ReBAC properties update based on `DatasetLifecycleMessage`'s:
- `kamu add`: added hidden `--visibility private|public` argument, assumed to be used in multi-tenant case
- GQL: `DatasetsMut`:
- `createEmpty()`: added optional `datasetVisibility` argument
- `createFromSnapshot()`: added optional `datasetVisibility` argument

## [0.198.0] - 2024-08-27
### Changed
- If a polling/push source does not declare a `read` schema or a `preprocess` step (which is the case when ingesting data from a file upload) we apply the following new inference rules:
11 changes: 11 additions & 0 deletions Cargo.lock


@@ -2,4 +2,4 @@
DROP INDEX IF EXISTS idx_account_token_name;

CREATE UNIQUE INDEX idx_account_token_name
ON access_tokens (account_id, token_name)
WHERE revoked_at IS NULL;
WHERE revoked_at IS NULL;
9 changes: 7 additions & 2 deletions resources/schema.gql
@@ -703,6 +703,11 @@ type DatasetPermissions {

scalar DatasetRef

enum DatasetVisibility {
PRIVATE
PUBLIC
}

type Datasets {
"""
Returns dataset by its ID
@@ -730,11 +735,11 @@ type DatasetsMut {
"""
Creates a new empty dataset
"""
createEmpty(datasetKind: DatasetKind!, datasetAlias: DatasetAlias!): CreateDatasetResult!
createEmpty(datasetKind: DatasetKind!, datasetAlias: DatasetAlias!, datasetVisibility: DatasetVisibility): CreateDatasetResult!
"""
Creates a new dataset from provided DatasetSnapshot manifest
"""
createFromSnapshot(snapshot: String!, snapshotFormat: MetadataManifestFormat!): CreateDatasetFromSnapshotResult!
createFromSnapshot(snapshot: String!, snapshotFormat: MetadataManifestFormat!, datasetVisibility: DatasetVisibility): CreateDatasetFromSnapshotResult!
}

"""
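Both mutations keep the new `datasetVisibility` argument optional for now (per the TODOs in the resolver diff below, pending a frontend update), and a missing argument falls back to the domain default. A minimal, self-contained sketch of that fallback — `Private` is assumed as the default purely for illustration, since the actual default is not shown in this excerpt:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum DatasetVisibility {
    #[default]
    Private,
    Public,
}

// Mirrors `dataset_visibility.map(Into::into).unwrap_or_default()` from the
// resolver change in this commit; the GQL->domain conversion is elided here,
// so `Into` is the identity.
fn resolve(arg: Option<DatasetVisibility>) -> DatasetVisibility {
    arg.map(Into::into).unwrap_or_default()
}

fn main() {
    assert_eq!(resolve(None), DatasetVisibility::Private);
    assert_eq!(resolve(Some(DatasetVisibility::Public)), DatasetVisibility::Public);
    println!("resolved");
}
```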
@@ -143,6 +143,7 @@ impl DatasetAuthorizerHarness {
alias,
MetadataFactory::metadata_block(MetadataFactory::seed(DatasetKind::Root).build())
.build_typed(),
Default::default(),
)
.await
.unwrap()
2 changes: 1 addition & 1 deletion src/adapter/graphql/Cargo.toml
@@ -67,8 +67,8 @@
container-runtime = { workspace = true }
messaging-outbox = { workspace = true }
kamu-accounts-inmem = { workspace = true }
kamu-accounts-services = { workspace = true }
kamu-datasets-services = { workspace = true }
kamu-datasets-inmem = { workspace = true }
kamu-datasets-services = { workspace = true }
kamu-flow-system-inmem = { workspace = true }
kamu-task-system-inmem = { workspace = true }
kamu-task-system-services = { workspace = true }
23 changes: 19 additions & 4 deletions src/adapter/graphql/src/mutations/datasets_mut.rs
@@ -7,7 +7,7 @@
// the Business Source License, use of this software will be governed
// by the Apache License, Version 2.0.

use kamu_core::{self as domain, DatasetRepositoryExt};
use kamu_core::{self as domain, CreateDatasetUseCaseOptions, DatasetRepositoryExt};
use opendatafabric as odf;

use crate::mutations::DatasetMut;
@@ -39,6 +39,9 @@ impl DatasetsMut {
ctx: &Context<'_>,
dataset_kind: DatasetKind,
dataset_alias: DatasetAlias,
// TODO: Private Datasets: GQL: make new parameters mandatory, after frontend update
// https://github.com/kamu-data/kamu-cli/issues/780
dataset_visibility: Option<DatasetVisibility>,
) -> Result<CreateDatasetResult> {
match self
.create_from_snapshot_impl(
@@ -48,6 +51,7 @@
kind: dataset_kind.into(),
kind: dataset_kind.into(),
metadata: Vec::new(),
},
dataset_visibility.map(Into::into).unwrap_or_default(),
)
.await?
{
@@ -69,6 +73,9 @@
ctx: &Context<'_>,
ctx: &Context<'_>,
snapshot: String,
snapshot_format: MetadataManifestFormat,
// TODO: Private Datasets: GQL: make new parameters mandatory, after frontend update
// https://github.com/kamu-data/kamu-cli/issues/780
dataset_visibility: Option<DatasetVisibility>,
) -> Result<CreateDatasetFromSnapshotResult> {
use odf::serde::DatasetSnapshotDeserializer;

@@ -94,23 +101,31 @@
}
}
};

self.create_from_snapshot_impl(ctx, snapshot).await
self.create_from_snapshot_impl(
ctx,
snapshot,
dataset_visibility.map(Into::into).unwrap_or_default(),
)
.await
}

// TODO: Multi-tenancy
// TODO: Multi-tenant resolution for derivative dataset inputs (should it only
// work by ID?)
// work by ID?)
#[allow(unused_variables)]
#[graphql(skip)]
async fn create_from_snapshot_impl(
&self,
ctx: &Context<'_>,
snapshot: odf::DatasetSnapshot,
dataset_visibility: domain::DatasetVisibility,
) -> Result<CreateDatasetFromSnapshotResult> {
let create_from_snapshot =
from_catalog::<dyn domain::CreateDatasetFromSnapshotUseCase>(ctx).unwrap();

let result = match create_from_snapshot.execute(snapshot).await {
let create_options = CreateDatasetUseCaseOptions { dataset_visibility };

let result = match create_from_snapshot.execute(snapshot, create_options).await {
Ok(result) => {
let dataset = Dataset::from_ref(ctx, &result.dataset_handle.as_local_ref()).await?;
CreateDatasetFromSnapshotResult::Success(CreateDatasetResultSuccess { dataset })
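The new `options` parameter on the use case is also why the test diffs below append a trailing `Default::default()` argument to every dataset-creation call. A self-contained sketch of the call shape — type names come from this diff, but the function body is a stand-in, not the real implementation:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum DatasetVisibility {
    #[default]
    Private,
    Public,
}

// Stand-ins for the domain types touched by this commit.
#[derive(Debug, Default)]
struct CreateDatasetUseCaseOptions {
    dataset_visibility: DatasetVisibility,
}

struct DatasetSnapshot {
    name: String,
}

// Illustrative only: the real use case is async and returns a CreateDatasetResult.
fn execute(snapshot: DatasetSnapshot, options: CreateDatasetUseCaseOptions) -> String {
    format!("created {} ({:?})", snapshot.name, options.dataset_visibility)
}

fn main() {
    // Existing call sites (and the many test harnesses in this diff) keep
    // their previous behavior by passing `Default::default()`:
    let out = execute(
        DatasetSnapshot { name: String::from("foo") },
        Default::default(),
    );
    assert_eq!(out, "created foo (Private)");
    println!("{out}");
}
```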
40 changes: 40 additions & 0 deletions src/adapter/graphql/src/scalars/dataset_visibility.rs
@@ -0,0 +1,40 @@
// Copyright Kamu Data, Inc. and contributors. All rights reserved.
//
// Use of this software is governed by the Business Source License
// included in the LICENSE file.
//
// As of the Change Date specified in that file, in accordance with
// the Business Source License, use of this software will be governed
// by the Apache License, Version 2.0.

use kamu::domain;

use crate::prelude::*;

////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

#[derive(Enum, Debug, Clone, Copy, PartialEq, Eq)]
pub enum DatasetVisibility {
Private,
Public,
}

impl From<domain::DatasetVisibility> for DatasetVisibility {
fn from(value: domain::DatasetVisibility) -> Self {
match value {
domain::DatasetVisibility::Private => Self::Private,
domain::DatasetVisibility::Public => Self::Public,
}
}
}

impl From<DatasetVisibility> for domain::DatasetVisibility {
fn from(value: DatasetVisibility) -> Self {
match value {
DatasetVisibility::Private => domain::DatasetVisibility::Private,
DatasetVisibility::Public => domain::DatasetVisibility::Public,
}
}
}

////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
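A quick round-trip through the two `From` impls above; since the real enums live in separate crates (the GQL adapter and the domain crate), stand-in types are mirrored here in one self-contained file:

```rust
// Stand-ins for the GQL scalar and the domain enum shown above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GqlDatasetVisibility { Private, Public }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DomainDatasetVisibility { Private, Public }

impl From<GqlDatasetVisibility> for DomainDatasetVisibility {
    fn from(value: GqlDatasetVisibility) -> Self {
        match value {
            GqlDatasetVisibility::Private => Self::Private,
            GqlDatasetVisibility::Public => Self::Public,
        }
    }
}

impl From<DomainDatasetVisibility> for GqlDatasetVisibility {
    fn from(value: DomainDatasetVisibility) -> Self {
        match value {
            DomainDatasetVisibility::Private => Self::Private,
            DomainDatasetVisibility::Public => Self::Public,
        }
    }
}

fn main() {
    // GQL -> domain -> GQL preserves the variant.
    let domain: DomainDatasetVisibility = GqlDatasetVisibility::Public.into();
    let back: GqlDatasetVisibility = domain.into();
    assert_eq!(back, GqlDatasetVisibility::Public);
    println!("round-trip ok");
}
```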
6 changes: 6 additions & 0 deletions src/adapter/graphql/src/scalars/mod.rs
@@ -15,6 +15,7 @@
mod data_schema;
mod dataset_endpoints;
mod dataset_env_var;
mod dataset_id_name;
mod dataset_visibility;
mod engine_desc;
mod event_id;
mod flow_configuration;
@@ -35,6 +36,7 @@
pub(crate) use data_schema::*;
pub(crate) use dataset_endpoints::*;
pub(crate) use dataset_env_var::*;
pub(crate) use dataset_id_name::*;
pub(crate) use dataset_visibility::*;
pub(crate) use engine_desc::*;
pub(crate) use event_id::*;
pub(crate) use flow_configuration::*;
@@ -47,6 +49,8 @@
pub(crate) use pagination::*;
pub(crate) use task_id::*;
pub(crate) use task_status_outcome::*;

////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

macro_rules! simple_scalar {
($name: ident, $source_type: ty) => {
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
@@ -92,3 +96,5 @@
}

pub(crate) use simple_scalar;

////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
@@ -741,6 +741,7 @@ impl FlowConfigHarness {
.name(dataset_alias)
.push_event(MetadataFactory::set_polling_source().build())
.build(),
Default::default(),
)
.await
.unwrap()
1 change: 1 addition & 0 deletions src/adapter/graphql/tests/tests/test_gql_data.rs
@@ -116,6 +116,7 @@ async fn create_test_dataset(
&DatasetAlias::new(account_name, DatasetName::new_unchecked("foo")),
MetadataFactory::metadata_block(MetadataFactory::seed(DatasetKind::Root).build())
.build_typed(),
Default::default(),
)
.await
.unwrap()
@@ -387,6 +387,7 @@ impl DatasetEnvVarsHarness {
.name("foo")
.push_event(MetadataFactory::set_polling_source().build())
.build(),
Default::default(),
)
.await
.unwrap()
@@ -1617,6 +1617,7 @@ impl FlowConfigHarness {
.name("foo")
.push_event(MetadataFactory::set_polling_source().build())
.build(),
Default::default(),
)
.await
.unwrap()
@@ -1639,6 +1640,7 @@
.build(),
)
.build(),
Default::default(),
)
.await
.unwrap()
2 changes: 2 additions & 0 deletions src/adapter/graphql/tests/tests/test_gql_dataset_flow_runs.rs
@@ -3257,6 +3257,7 @@ impl FlowRunsHarness {
.name("foo")
.push_event(MetadataFactory::set_polling_source().build())
.build(),
Default::default(),
)
.await
.unwrap()
@@ -3279,6 +3280,7 @@
.build(),
)
.build(),
Default::default(),
)
.await
.unwrap()
2 changes: 2 additions & 0 deletions src/adapter/graphql/tests/tests/test_gql_datasets.rs
@@ -748,6 +748,7 @@ impl GraphQLDatasetsHarness {
.kind(DatasetKind::Root)
.push_event(MetadataFactory::set_polling_source().build())
.build(),
Default::default(),
)
.await
.unwrap()
@@ -774,6 +775,7 @@
.build(),
)
.build(),
Default::default(),
)
.await
.unwrap()
1 change: 1 addition & 0 deletions src/adapter/graphql/tests/tests/test_gql_metadata.rs
@@ -66,6 +66,7 @@ async fn test_current_push_sources() {
.kind(DatasetKind::Root)
.name("foo")
.build(),
Default::default(),
)
.await
.unwrap();
3 changes: 3 additions & 0 deletions src/adapter/graphql/tests/tests/test_gql_metadata_chain.rs
@@ -43,6 +43,7 @@ async fn test_metadata_chain_events() {
event: MetadataFactory::seed(DatasetKind::Root).build(),
sequence_number: 0,
},
Default::default(),
)
.await
.unwrap();
@@ -186,6 +187,7 @@ async fn metadata_chain_append_event() {
.name("foo")
.kind(DatasetKind::Root)
.build(),
Default::default(),
)
.await
.unwrap();
@@ -270,6 +272,7 @@ async fn metadata_update_readme_new() {
.name("foo")
.kind(DatasetKind::Root)
.build(),
Default::default(),
)
.await
.unwrap();
1 change: 1 addition & 0 deletions src/adapter/graphql/tests/tests/test_gql_search.rs
@@ -49,6 +49,7 @@ async fn test_search_query() {
.kind(DatasetKind::Root)
.push_event(MetadataFactory::set_polling_source().build())
.build(),
Default::default(),
)
.await
.unwrap();
2 changes: 1 addition & 1 deletion src/adapter/http/Cargo.toml
@@ -78,9 +78,9 @@
uuid = { version = "1", default-features = false, features = ["v4"] }

[dev-dependencies]
container-runtime = { workspace = true }
kamu-accounts-inmem = { workspace = true }
kamu-accounts-services = { workspace = true }
kamu-datasets-services = { workspace = true }
kamu-accounts-inmem = { workspace = true }
kamu-ingest-datafusion = { workspace = true }
messaging-outbox = { workspace = true }
