Generic dataset module and specific s3_datasets module - part 6 (Frontend) #1292

Merged
merged 41 commits into from
May 22, 2024

dlpzx
Contributor

@dlpzx dlpzx commented May 21, 2024

Feature or Bugfix

  • Feature

Detail

As explained in the design for #1123, we are implementing a generic datasets_base module that any type of dataset can use.

In this PR we:

  • Create the DatasetsBase module in the frontend. It depends on the S3_Datasets module
  • Move the DatasetsList view and the DatasetListItem component to DatasetsBase
  • Add a CreateDataset modal that supports creating multiple types of datasets
  • Fix routes and redirects to point at /datasets/ or any other /X-dataset/
  • Move dataset_base services. In backend/datasets_base/api we define the following queries, which are good candidates to become part of the DatasetsBase module:
    • listDatasets - only used in the DatasetsBase/DatasetList view - it should be in DatasetsBase/services
    • listOwnedDatasets - only used in the Shares/SharesBoxList view - it should be in Shares/services
    • listDatasetsCreatedInEnvironment - only used in the Environments/EnvDataset tab - it should be in Environment/services
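
The route change in the list above can be illustrated with a minimal sketch of how a generic list item might pick its link target by dataset type. Everything named here is an assumption made for illustration and is not taken from the PR: the `DATASET_ROUTES` map, the `GENERIC_LIST_ROUTE` constant, the `datasetLink` helper, the `datasetType` and `datasetUri` field names, and the concrete paths.

```javascript
// Hypothetical mapping from dataset type to the route prefix of the module
// that owns the detail view. Keys and paths are illustrative assumptions.
const DATASET_ROUTES = new Map([['S3', '/console/s3-datasets']]);

// Assumed path of the generic list view owned by DatasetsBase.
const GENERIC_LIST_ROUTE = '/console/datasets';

// Build the link for a dataset. Unknown types fall back to the generic list.
function datasetLink(dataset) {
  const prefix = DATASET_ROUTES.get(dataset.datasetType);
  if (!prefix) {
    return GENERIC_LIST_ROUTE;
  }
  return `${prefix}/${encodeURIComponent(dataset.datasetUri)}`;
}

console.log(datasetLink({ datasetType: 'S3', datasetUri: 'abc123' }));
console.log(datasetLink({ datasetType: 'Unknown', datasetUri: 'xyz' }));
```

With a lookup like this, supporting another dataset type would only mean adding one entry to the map, and the generic list view would not need to know about type-specific views.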

If we want to keep everything clean, we could rename all "datasets" to "s3_datasets" or equivalent in the S3_Datasets module. Because it is a cosmetic change that would pollute the PR a lot, I have decided not to include it.

⚠️ UPDATE: Next steps
The Data Shared With You table in the Environments > Datasets tab needs a remake. It contains references to each type of item and is tightly coupled with s3-dataset-shares. For the moment I just made it work with the s3-datasets changes, but we should fix this when completing the work in #1283, possibly in favor of a DataGrid.

Relates

Security

Please answer the questions below briefly where applicable, or write N/A. Based on
OWASP 10.

  • Does this PR introduce or modify any input fields or queries - this includes
    fetching data from storage outside the application (e.g. a database, an S3 bucket)?
    • Is the input sanitized?
    • What precautions are you taking before deserializing the data you consume?
    • Is injection prevented by parametrizing queries?
    • Have you ensured no eval or similar functions are used?
  • Does this PR introduce any functionality or component that requires authorization?
    • How have you ensured it respects the existing AuthN/AuthZ mechanisms?
    • Are you logging failed auth attempts?
  • Are you using or adding any cryptographic features?
    • Do you use standard, proven implementations?
    • Are the used keys controlled by the customer? Where are they stored?
  • Are you introducing any new policies/roles/users?
    • Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

dlpzx added 30 commits May 6, 2024 14:33
…t-model-refactoring-2' into feat/generic-dataset-model-refactoring-3
…eric-dataset-model-refactoring-3

# Conflicts:
#	backend/dataall/modules/dataset_sharing/services/dataset_sharing_service.py
#	backend/dataall/modules/s3_datasets/api/dataset/resolvers.py
#	backend/dataall/modules/s3_datasets/db/dataset_models.py
#	backend/dataall/modules/s3_datasets/services/dataset_service.py
#	backend/dataall/modules/s3_datasets/services/dataset_table_service.py
dlpzx added 2 commits May 21, 2024 16:52
…eric-dataset-model-refactoring-6

# Conflicts:
#	backend/dataall/modules/datasets_base/db/dataset_repositories.py
#	backend/dataall/modules/datasets_base/services/dataset_list_service.py
#	frontend/src/modules/S3_Datasets/views/DatasetView.js
@dlpzx dlpzx marked this pull request as ready for review May 21, 2024 14:55
@dlpzx dlpzx requested a review from noah-paige May 21, 2024 14:56
@dlpzx
Contributor Author

dlpzx commented May 22, 2024

Testing locally:

  • listDatasets from the menu side bar
  • click on a specific Dataset > go to DatasetView
  • createFolder, click Folder, edit Folder
  • click Table, edit table
  • hit NewDataset button from listDatasets > create new dataset > at the end it redirects to DatasetView
  • hit NewDataset button from listDatasets > import new dataset
  • Go to Catalog, click any Dataset/Table/Folder and get redirected to DatasetView/FolderView/TableView
  • from Catalog NewDataset button, window opens
  • the Environments Datasets tab lists owned and shared items

@dlpzx dlpzx requested a review from noah-paige May 22, 2024 06:13
@dlpzx dlpzx force-pushed the feat/generic-dataset-model-refactoring-6 branch from b9d38de to 55531b1 Compare May 22, 2024 06:16
@@ -7,7 +7,8 @@ export const CatalogsModule = {
   resolve_dependency: () => {
     return (
       getModuleActiveStatus(ModuleNames.S3_DATASETS) ||
-      getModuleActiveStatus(ModuleNames.DASHBOARDS)
+      getModuleActiveStatus(ModuleNames.DASHBOARDS) ||
+      getModuleActiveStatus(ModuleNames.DATASETS_BASE)
Contributor

minor: getModuleActiveStatus(ModuleNames.DATASETS_BASE) will always return false if datasets_base is not in config.json

does not affect logic though
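
A minimal sketch of the behaviour described in this comment, assuming that a module missing from the configuration is treated as inactive. The `config` object, the string values in `ModuleNames`, and this `getModuleActiveStatus` body are hypothetical stand-ins. The project's real implementation is not shown in this PR.

```javascript
// Hypothetical stand-in for the modules section of config.json.
// datasets_base is deliberately absent.
const config = {
  modules: {
    s3_datasets: { active: true },
    dashboards: { active: false },
  },
};

// Assumed key values. Only the constant names appear in the diff.
const ModuleNames = {
  S3_DATASETS: 's3_datasets',
  DASHBOARDS: 'dashboards',
  DATASETS_BASE: 'datasets_base',
};

// Assumed behaviour: a module that is missing from the config is inactive.
function getModuleActiveStatus(moduleKey) {
  const entry = config.modules[moduleKey];
  return Boolean(entry && entry.active);
}

// Same shape as the condition in the diff: the extra operand is always false
// here, so the result is decided by the other two operands.
const catalogEnabled =
  getModuleActiveStatus(ModuleNames.S3_DATASETS) ||
  getModuleActiveStatus(ModuleNames.DASHBOARDS) ||
  getModuleActiveStatus(ModuleNames.DATASETS_BASE);

console.log(getModuleActiveStatus(ModuleNames.DATASETS_BASE)); // false
console.log(catalogEnabled); // true
```

Under these assumptions the added operand never changes the outcome of the `||` chain, which matches the remark that the logic is unaffected.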

Contributor Author

I need to review that: maybe not now, but I am counting on being able to define all enabled datasets like this. I'll have a second look, because given the way ModuleNames and dependencies are defined it should work.
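
One way such a setup could work is to derive the status of a base module from the modules that rely on it, instead of requiring its own entry in the configuration. The sketch below is only an illustration under that assumption. The `activeInConfig` set, the `registry` object, its `dependents` field, and the `isEnabled` helper are all hypothetical and do not describe how the project resolves dependencies.

```javascript
// Hypothetical: the set of modules switched on in the configuration.
const activeInConfig = new Set(['s3_datasets']);

// Hypothetical registry: each module lists the modules that pull it in.
const registry = {
  s3_datasets: { dependents: [] },
  datasets_base: { dependents: ['s3_datasets'] },
};

// A module counts as enabled if it is active in the config, or if any
// module that relies on it is enabled. `seen` guards against cycles.
function isEnabled(name, seen = new Set()) {
  if (activeInConfig.has(name)) {
    return true;
  }
  if (seen.has(name)) {
    return false;
  }
  seen.add(name);
  const entry = registry[name];
  if (!entry) {
    return false;
  }
  return entry.dependents.some((dependent) => isEnabled(dependent, seen));
}

console.log(isEnabled('datasets_base')); // true, via s3_datasets
console.log(isEnabled('dashboards')); // false, not configured or registered
```

In this sketch the base module never needs its own configuration entry: enabling any concrete dataset module is enough to enable it.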

Contributor

@noah-paige noah-paige left a comment

looks good! Approving

@dlpzx dlpzx merged commit 372ac51 into main May 22, 2024
9 checks passed
@dlpzx dlpzx deleted the feat/generic-dataset-model-refactoring-6 branch June 6, 2024 12:20