Main v2 #683

nikpodsh · 2023-08-21T16:13:06Z

A new PR without conflicts

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Syncs modularization main branch with main (as there is a bug fix that is useful to have) --- By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com> Co-authored-by: wolanlu <101870655+wolanlu@users.noreply.github.com>

This is a draft PR for demonstration purposes. There are still some minor issues that need to be addressed.

### Feature or Bugfix
- Refactoring

### Detail
This PR contains the following changes:

1. Modularization + refactoring of notebooks

   There are new modules that will play a major role in future refactoring:
   * Core = contains the code needed for the application to operate correctly
   * Common = common code for all modules
   * Modules = the plugins/features that can be inserted into the system (at the moment only notebooks)

   The other part related to modularization is the creation of environment parameters. Environment parameters will replace all hardcoded parameters of the environment configuration. There is a new file, `config.json`, that allows you to configure the application. All existing parameters will be migrated via a db migration in AWS.

2. Extracting permissions and request context (optional for the modularization)

   Engine, user, and user groups were passed as parameters of the context in the request. This forced us to pass a lot of parameters to other methods that did not even need them. This information should be scoped to the request session. There is a new way to retrieve it using `RequestContext`. There is also a new way to use permission checks that requires fewer parameters and makes the code cleaner. The old way has been marked as deprecated.

3. Restructuring of the code (optional for the modularization)

   Since the modularization will touch every part of the API code, it is a good opportunity to introduce a new code structure. There is a small reorganization in the notebook module to address:
   * Allocating resources before validating parameters
   * Unclear responsibilities of the classes
   * Mixed layers

   The new structure is:
   - resolvers = validate input and pass the call to the service layer
   - service layer = business logic
   - repositories = database logic (all queries should be placed here)
   - aws = contains a wrapper client on top of boto3
   - cdk = all logic related to creating stacks, or code for ECS
   - tasks = code that will be executed in AWS Lambda (short-lived tasks)

   All names can be changed.

### Relates
[data-dot-all#295](data-dot-all#295)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

---------
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
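To make the request-scoped context idea from the description above more concrete, here is a minimal sketch built on Python's standard `contextvars` module. Only the name `RequestContext` comes from the PR text; the fields and the helper functions `set_context`, `get_context`, `dispose_context`, and `list_my_groups` are assumptions made for illustration and may not match the actual data.all code.

```python
from contextvars import ContextVar
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class RequestContext:
    """Per-request data that previously had to be passed through every call."""
    db_engine: object  # placeholder for the database engine
    username: str
    groups: List[str] = field(default_factory=list)


# Hypothetical holder: each request (thread or async task) sees its own value.
_request_context: ContextVar[Optional[RequestContext]] = ContextVar(
    "request_context", default=None
)


def set_context(context: RequestContext) -> None:
    """Called once by the API handler before any resolver runs."""
    _request_context.set(context)


def get_context() -> RequestContext:
    """Called from resolvers or services instead of receiving extra parameters."""
    context = _request_context.get()
    if context is None:
        raise RuntimeError("Request context has not been set")
    return context


def dispose_context() -> None:
    """Called by the API handler once the request has been served."""
    _request_context.set(None)


def list_my_groups() -> List[str]:
    # A service method no longer needs engine/username/groups arguments.
    return sorted(get_context().groups)
```

With this shape, a handler would call `set_context(...)` at the start of a request and `dispose_context()` at the end, while service code only ever calls `get_context()`.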

### Feature or Bug-fix - Refactoring ### Detail For the development dockerfile in the frontend, instead of starting from amazon linux image and installing everything manually, it's a lot simpler to start from node image (that is also from ECR for security) and have all node-related environment pre-installed and just focus the dockerfile on project-related configuration. This PR does not introduce any update in the logic, it just updates the development dockerfile and adds a `.dockerignore` file for frontend to ignore `node_modules` and `build` folders. --- By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

) ### Feature or Bug-fix - Refactoring ### Details This PR does 2 things: - Makes the styling consistent across the project - Removes dead code (unused variables, functions, and imports) --- By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

### Feature or Bug-fix - Refactoring ### Detail My recent PR data-dot-all#394 has a lot of changes but all of these changes are to styles only and not the code itself, this way the blame history for all the changed files is lost (it will always blame me). This PR adds the file`.git-blame-ignore-revs` which contains the id of the styling commit and the command that you need to run once to fix the blame history for you. Also Github understands that file and will show the history correctly. This PR is based on [this article](https://www.stefanjudis.com/today-i-learned/how-to-exclude-commits-from-git-blame/) and is suggested by Raul. This file works for the whole repository, and in the future one only needs to add ids of styling commits and the history will be preserved correctly. --- By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

### Feature or Bugfix
- Refactoring (Modularization)

### Relates
- Related issues data-dot-all#295 and data-dot-all#412

### Short Summary
First part of the migration of `Dataset` (`DatasetTableColumn`). TL;DR :)

### Long Summary
Datasets are huge. It is one of the central modules and is spread across the whole application. Migrating the entire Dataset piece at once would be a very difficult task and, more importantly, even more difficult to review. Therefore, I decided to break this work down into "small" steps to make it more convenient to review.

The Dataset API consists of the following items:
* `Dataset`
* `DatasetTable`
* `DatasetTableColumn`
* `DatasetLocation`
* `DatasetProfiling`

This PR only creates the `Dataset` module and migrates `DatasetTableColumn` (and some pieces related to it). Why? Because the plan was to migrate it, see what issues come up along the way, and address them here. The refactoring of `DatasetTableColumn` will be done in another PR.

The issues:
1) Glossaries
2) Feed
3) Long tasks for datasets
4) Redshift

Glossaries: they rely on a GraphQL UNION of different types (including datasets). Created an abstraction for glossary registration. There was an idea to change the frontend, but that would require a lot of work.

Feed: same as glossaries, and solved in a similar way. For feed, changing the frontend API is more feasible, but I wanted it to be consistent with glossaries.

Long tasks for datasets: they were migrated into the tasks folder and don't require dedicated loading of their code (at least for now). But there are two concerns:
1) The deployment uses direct module folder references to run them (e.g. `dataall.modules.datasets....`), so when a module is deactivated, we shouldn't deploy these tasks either. I left a TODO to address this in the future (when we migrate all modules), but we should bear in mind that it might lead to inconsistencies.
2) There is a reference to `redshift` from long-running tasks; this should be addressed in the `redshift` module.

Redshift: it has some references to `datasets`. So there will be either dependencies among modules or a small amount of code duplication (if `redshift` doesn't rely heavily on `datasets`); this will be addressed in the `redshift` module.

Other changes:
- Fixed and improved some tests
- Extracted the Glue handler code related to `DatasetTableColumn`
- Renamed the import mode from tasks to handlers for the async lambda
- A few hacks that will go away with the next refactoring :)

Next steps:
- [Part2 ](#1) in preview :)
- Extract the rest of the datasets functionality (perhaps in a few steps)
- Refactor the extracted modules the same way as notebooks
- Extract tests to follow the same structure

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
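The glossary registration abstraction mentioned above could be shaped roughly like the registry below: each module registers the types it contributes, and the core code derives the GraphQL union members from whatever has been registered. This is only a sketch; the names `GlossaryDefinition`, `GlossaryRegistry`, and their methods are hypothetical, and the `DatasetTableColumn` class here is a placeholder rather than the real database model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Type


@dataclass(frozen=True)
class GlossaryDefinition:
    target_type: str  # value stored alongside a glossary link
    object_type: str  # GraphQL type name that joins the union
    model: Type       # class used to recognise resolved objects


class GlossaryRegistry:
    """Hypothetical registry that modules fill in when they are loaded."""

    _definitions: Dict[str, GlossaryDefinition] = {}

    @classmethod
    def register(cls, definition: GlossaryDefinition) -> None:
        cls._definitions[definition.target_type] = definition

    @classmethod
    def union_members(cls) -> List[str]:
        # The union is built only from types of modules that registered.
        return sorted(d.object_type for d in cls._definitions.values())

    @classmethod
    def resolve_object_type(cls, obj: object) -> Optional[str]:
        # Used as the union's type resolver.
        for definition in cls._definitions.values():
            if isinstance(obj, definition.model):
                return definition.object_type
        return None


class DatasetTableColumn:
    """Placeholder for the real model owned by the datasets module."""


# Executed from the datasets module, so core never imports dataset code directly.
GlossaryRegistry.register(
    GlossaryDefinition("Column", "DatasetTableColumn", DatasetTableColumn)
)
```

The point of the design is the direction of the dependency: the core schema asks the registry for union members instead of importing each module's types.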

### Feature or Bugfix - Refactoring ### Detail Refactoring of DatasetProfilingRun ### Relates - data-dot-all#295 and data-dot-all#412 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

### Feature or Bugfix
- Refactoring

### Detail
Refactoring of the third part of dataset: `DatasetStorageLocation`
- Introduced the Indexers: this code was migrated from `indexers.py` and put into modules.
- Removed unused alarms (which didn't call actual alarm code).
- Introduced `DatasetShareService`, but it seems it will be migrated to the share module.
- All `DatasetXXXServices` will be split into Services (business logic) and Repositories (DAO layer) in future parts.

### Relates
- data-dot-all#412 and data-dot-all#295

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

### Feature or Bugfix - Refactoring ### Detail Refactoring of DatasetTable: Get rid of ElasticSearch connection for every request. Created a lazy way to establish connection. ### Relates data-dot-all#412 and data-dot-all#295 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
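As an illustration of the lazy connection idea described above, the sketch below creates a client only on first use and reuses it afterwards. `LazyConnection` and `FakeSearchClient` are hypothetical names; the fake client stands in for a real search client so the example stays self-contained.

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyConnection(Generic[T]):
    """Creates the underlying client on first use and then reuses it."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._client: Optional[T] = None
        self._lock = threading.Lock()

    def get(self) -> T:
        # Double-checked locking: only the first caller pays the connection cost.
        if self._client is None:
            with self._lock:
                if self._client is None:
                    self._client = self._factory()
        return self._client


class FakeSearchClient:
    """Stand-in for a real search client; counts how often it is constructed."""

    created = 0

    def __init__(self) -> None:
        FakeSearchClient.created += 1


# Nothing is connected at import time; the first get() call creates the client.
search_connection = LazyConnection(FakeSearchClient)
```

Requests that never touch the search index therefore never open a connection, and requests that do share a single client.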

### Feature or Bugfix
- Refactoring

### Detail
- Refactoring of the Dataset entity and the code related to it.
- Refactoring for Votes.
- Introduced DataPolicy (in the same way as ServicePolicy was used).
- Extracted dataset-related permissions.
- Used the new `has_tenant_permission` instead of `has_tenant_perm`, which avoids passing unused parameters.

### Relates
data-dot-all#412 and data-dot-all#295

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

### Feature or Bugfix
- Refactoring

### Detail
Creation of the mechanism to import dependencies for modules. It should make it easier to extract common code shared between modules. Now, one module can refer to another module in order to import it.
- Deleted the `common` folder since it has been replaced.
- Added some checks for whether a module was unintentionally imported, plus logging.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
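One way to picture such a dependency mechanism is a loader that walks the declared dependencies and produces an order in which every module comes after the modules it depends on. The sketch below is an assumption about the general shape, not the actual loader: the `depends_on` hook, the `resolve_load_order` function, and the three example module classes are all hypothetical.

```python
from typing import List, Set, Type


class ModuleInterface:
    """Base class; a module overrides depends_on() to declare what it needs."""

    @staticmethod
    def depends_on() -> List[Type["ModuleInterface"]]:
        return []


class CatalogModule(ModuleInterface):
    pass


class DatasetsModule(ModuleInterface):
    @staticmethod
    def depends_on() -> List[Type[ModuleInterface]]:
        return [CatalogModule]


class SharingModule(ModuleInterface):
    @staticmethod
    def depends_on() -> List[Type[ModuleInterface]]:
        return [DatasetsModule]


def resolve_load_order(
    active: List[Type[ModuleInterface]],
) -> List[Type[ModuleInterface]]:
    """Depth-first walk that returns dependencies before their dependents."""
    ordered: List[Type[ModuleInterface]] = []
    done: Set[type] = set()
    in_progress: Set[type] = set()

    def visit(module: Type[ModuleInterface]) -> None:
        if module in done:
            return
        if module in in_progress:
            raise ValueError(f"Circular dependency involving {module.__name__}")
        in_progress.add(module)
        for dependency in module.depends_on():
            visit(dependency)
        in_progress.discard(module)
        done.add(module)
        ordered.append(module)

    for module in active:
        visit(module)
    return ordered
```

For example, `resolve_load_order([SharingModule])` pulls in `DatasetsModule` and `CatalogModule` even though only sharing was requested, and places them first.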

Modularization of Worksheets - Moved Worksheet related code to its own new module - Merged AthenaQueryResult object into Worksheets - Worksheet related permissions moved to new module - Removed worksheet sharing related (unused) code from the entire repo By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

…ata-dot-all#463) Merge of main (v1.5.2) -> modularization main By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com> Co-authored-by: wolanlu <101870655+wolanlu@users.noreply.github.com> Co-authored-by: Amr Saber <amr.m.saber.mail@gmail.com> Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com> Co-authored-by: kukushking <kukushkin.anton@gmail.com> Co-authored-by: Dariusz Osiennik <osiend@amazon.com> Co-authored-by: Dennis Goldner <107395339+degoldner@users.noreply.github.com> Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it> Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com> Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com> Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com> Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com>

### Feature or Bugfix
- Refactoring

### Detail
First part of the modularization of ML Studio
- Created `SageMakerStudioService` and move `resolvers` business logic into the service
- Redefined `resolvers` and add request validators
- Redefined `db` operations and added them into `SageMakerRepository`
- Created `sagemaker_studio_client` with Sagemaker Studio SDK calls
- Created `ec2_client` with EC2 SDK calls
- Added CloudFormation stack in the `cdk` package of the module
- Added CloudFormation stack as an environment extension in the `cdk` package of the module
- Modified environment creation and edit api calls and views to read `mlStudiosEnabled` from environment parameters table
- Fixed tests and added migration scripts

Additional:
- Standardized naming of functions and objects to "sagemaker_studio_user" and avoid legacy "Notebook" references
- Removed unused api definition for `getSagemakerStudioUserApps`
- Cleaned up unused frontend views and API methods
- Split migration scripts for environments, worksheets and mlstudio
- linting Worksheets module
- prettier frontend

### Relates
data-dot-all#448

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
--------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: wolanlu <101870655+wolanlu@users.noreply.github.com> Co-authored-by: Amr Saber <amr.m.saber.mail@gmail.com> Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com> Co-authored-by: kukushking <kukushkin.anton@gmail.com> Co-authored-by: Dariusz Osiennik <osiend@amazon.com> Co-authored-by: Dennis Goldner <107395339+degoldner@users.noreply.github.com> Co-authored-by: Nikita Podshivalov <nikpodsh@amazon.com> Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it> Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com> Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com> Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com> Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com> Co-authored-by: nikpodsh <124577300+nikpodsh@users.noreply.github.com> Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>

@dbalintx

### Feature or Bugfix - Feature ### Detail - For ML Studio and Environments, the original procedure of synthesizing the template and testing the resulting json has been replaced by the `cdk assertions` library recommended in the [documentation](https://docs.aws.amazon.com/cdk/v2/guide/testing.html#testing_getting_started) of CDK. - In the Environment cdk testing, the `extent` method of the `EnvironmentStackExtension` subclasses has been mocked. Now it only tests the `EnvironmentSetup` stack as if no extensions were registered. - In the MLStudio cdk testing, this PR adds a test for the `SageMakerDomainExtension` mocking the environment stack. It tests the `SageMakerDomainExtension` as a standalone. Open question: 1) The rest of cdk stacks (notebooks, pipelines, datasets...) are tested using the old method of printing the json template. Should I go ahead and migrate to using `cdk assertions` library? If so, I thought of doing it for notebooks only and sync on the testing for datasets and pipelines with @dbalintx and @nikpodsh. 2) With the `cdk assertions` library we can test the resources properties more in depth. I added some tests on the environment stack but we could add more asserts in all stacks. I will add more tests on the MLStudio stack and I made a note in the GitHub project to review this once modularization is complete. ### Relates - data-dot-all#295 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

…that are inactive (data-dot-all#522) Enabled the feature to turn off tests whenever a module is inactive.

Modularization of the sharing. ### Detail Migrated the sharing part (including tasks). Removed unused methods for datasets. There are a few issues that need to be addressed before merging this PR, but it can already be reviewed, since we don't expect major changes here. ### Related data-dot-all#295 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com> Co-authored-by: kukushking <kukushkin.anton@gmail.com> Co-authored-by: Dariusz Osiennik <osiend@amazon.com> Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com> Co-authored-by: Dennis Goldner <107395339+degoldner@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it> Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com> Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com> Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com> Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com> Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>

Modularization of data pipelines

Changes:
- Relevant files moved from dataall/Objects/api and from dataall/db to the newly created module
- Relevant permissions extracted to the newly created module and are being used with the new decorators
- Functions interacting with the DB were outsourced to repository
- extracted and moved Datapipelines related code from core CDK files to the new module
  - dataall/cdkproxy/cdk_cli_wrapper.py
  - dataall/cdkproxy/stacks/pipeline.py
  - dataall/cdkproxy/cdkpipeline/cdk_pipeline.py
  - service policies
- extracted and moved Datapipelines related code from core AWS handlers to the new module
  - dataall/aws/handlers/stepfunction.py
  - dataall/aws/handlers/codecommit.py
  - dataall/aws/handlers/codepipeline.py
  - dataall/aws/handlers/glue.py
- added module interface for the module
- unit tests updated

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

### Feature or Bugfix Refactoring ### Detail Refactoring of the dashboards ### Relates data-dot-all#509 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. - [x] TODO: There was a huge merge conflict. I am testing the changes after the merge --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it> Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com> Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com> Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com> Co-authored-by: kukushking <kukushkin.anton@gmail.com> Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com> Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com> Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>

Removing all Redshift related code from backend, frontend and docs. Cleaned up dataall/core folder structure. By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Added missing permission checks. This PR depends on data-dot-all#537 and should be reviewed after the dashboard modularization. By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it> Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com> Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com> Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com> Co-authored-by: kukushking <kukushkin.anton@gmail.com> Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com> Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com> Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>

…#568) Due to the [Dockerfile change for the local frontend container](data-dot-all#396), the container now requires more memory, and during docker-compose up the deployment gets SIGKILL'd. Increasing the resource limit in the docker-compose.yaml settings fixes the issue. <img width="1010" alt="Screenshot 2023-07-12 at 12 35 17" src="https://github.com/awslabs/aws-dataall/assets/132444646/e47f9ffa-6692-4ae2-b81b-babaea642ebb"> By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

A way to change the deployment pipeline based on the configuration file

- Ability to disable features via a single configuration
- Define feature boundaries for:
  - AWS token/console actions on both dataset & env
  - File uploads/actions

AWS calls should be made only from clients (or old handlers). Some handlers/clients that were not migrated during the modularization have been found. A couple of AWS calls have been moved from the resolvers/tasks to clients.

There is a new permission checker API in the modularization branch. It allows us to get away from passing unused variables into methods.

The previous permission checker used a fixed signature that expected special parameters passed in a certain order. Most of the time those parameters were not used in the method itself; they were only needed by the permission checker decorator. The other issue with the previous permission checker is that it forced us to pass request data as a dictionary: e.g. in the resolver we had some parameters, had to put them into a dictionary, and then unpack them again in the method. That was not memory efficient.

All in all, the new permission checker doesn't require following the previous conventions, which allows us to write a shorter and more concise API, utilize type checking, and avoid unnecessary memory allocation during object creation.

This PR is waiting for another PR to be merged.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

--------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com> Co-authored-by: kukushking <kukushkin.anton@gmail.com> Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com> Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com> Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
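A decorator in the spirit of the new permission checker described above might look like the following sketch. The decorator name `has_tenant_permission` appears elsewhere in this PR, but its signature and internals here are assumptions; `CURRENT_GROUPS`, `TENANT_PERMISSIONS`, `PermissionDenied`, and `create_dataset` are hypothetical in-memory stand-ins so that the example runs on its own.

```python
import functools
from typing import Callable, Dict, Set

# Hypothetical stand-ins for the request context and the permission store.
CURRENT_GROUPS: Set[str] = set()
TENANT_PERMISSIONS: Dict[str, Set[str]] = {"admins": {"MANAGE_DATASETS"}}


class PermissionDenied(Exception):
    pass


def has_tenant_permission(permission: str) -> Callable:
    """Checks the caller's groups without constraining the wrapped signature."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            allowed = any(
                permission in TENANT_PERMISSIONS.get(group, set())
                for group in CURRENT_GROUPS
            )
            if not allowed:
                raise PermissionDenied(f"Missing tenant permission: {permission}")
            # Arguments are forwarded untouched: no dictionary packing needed.
            return func(*args, **kwargs)

        return wrapper

    return decorator


@has_tenant_permission("MANAGE_DATASETS")
def create_dataset(uri: str, label: str) -> dict:
    # Typed, explicit parameters instead of a generic data dictionary.
    return {"uri": uri, "label": label}
```

Because the decorator reads the caller's groups from shared request state rather than from positional arguments, the decorated function can declare only the parameters it actually uses.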

The modularization of the core part. It contains the modularization of the core features and moves the rest into base.

A core feature is a feature we expect to operate in data.all constantly (there is no way to turn it off). Base is code that can't be extracted as a feature. Most likely, it's functional code like establishing a database connection, OpenSearch indexing, etc.

The core features:
```
activity
catalog
cognito_groups
environment
feed
notifications
organizations
permissions
stacks
tasks
vote
vpc
```
This PR is mostly about restructuring.

--------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com> Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com> Co-authored-by: kukushking <kukushkin.anton@gmail.com> Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com> Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com> Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com> Co-authored-by: Balint David <dbalintx@amazon.com>

Merge changes from main into modularization-main. --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com> Co-authored-by: kukushking <kukushkin.anton@gmail.com> Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com> Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com> Co-authored-by: chamcca <40579012+chamcca@users.noreply.github.com> Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com> Co-authored-by: Dhruba <117375130+marjet26@users.noreply.github.com> Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com> Co-authored-by: Srinivas Reddy <srinivasreddych@outlook.com> Co-authored-by: Balint David <dbalintx@amazon.com> Co-authored-by: mourya-33 <134511711+mourya-33@users.noreply.github.com>

Resolving inconsistencies that were introduced during modularization. By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

PR to fix issues on modularization-main - removing duplicated SSM parameter from container stack (issue came from merging of main) - adding missing imports By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Co-authored-by: Nikita Podshivalov <nikpodsh@amazon.com> Co-authored-by: dlpzx <dlpzx@amazon.com>

### Feature or Bugfix - Refactoring ### Detail - move environment CFN custom resources from `base` to `core/environment` - fix typo in dockerfile mkdir for jars of profiling job ### Relates By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

### Feature or Bugfix
- Feature and refactoring

### Detail
Implement a class to add IAM Policy statements to the pivotRole policies from the modules. The additional pivot role policies needed for each module can be defined inside the module code as in the following picture:

<img width="1264" alt="image" src="https://github.com/awslabs/aws-dataall/assets/71252798/d82d2411-7639-494b-9783-f3e17a0d633d">

Policies that are needed independently from the modules are defined in `dataall/core/environment/cdk/pivot_role_core_policies`. I moved all permissions, including those for datasets and data_sharing, inside their modules.

In addition:
- Split MLStudio and Notebooks pivot role permissions
- Renamed policies added to the environment team roles and moved them to `core/environment/cdk`

### Questions
- Some modules share permissions, so there is some code duplication (around 10 lines). In my opinion it is minimal, and it is not worth mixing module permissions, but I wanted to bring it up.
- With this PR, the only pivot role we can update based on modules is the `dataallPivotRole-cdk` that is created as part of the environment stack when auto-create is set to true. Should we implement something for manually created pivot roles? These are roles that are manually deployed in the environment account and cannot be handled by data.all.

### Relates
- data-dot-all#610

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

1) Created a workflow for `modularization-main`. There were many issues with linting and testing, so these checks should be automated
2) The ML Studio extension shouldn't be deployed if ML Studio is not enabled for the env
3) Fix of the wrong env and dataset in the share manager

) ### Feature or Bugfix - Refactoring ### Detail - moved dataset related custom resources to `Dataset` module - created EnvironmentExtension for the custom resources - fix one small bug in creation of datasets - the KMS alias name defaulted to `sse-s3` for created datasets, while it should be the bucketName ### Relates By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

There was a problem if we wanted to disable all modules at once: a GraphQL union type cannot be empty. We resolve all unions programmatically depending on whether a module is present or not. Some unions consist of module-related parts only and, when all modules are disabled, the application fails to start. One possible fix would be to simply not create those unions, but the rest of the schema depends on them. To solve the issue, the following modules have been introduced: `feeds`, `catalog` and `vote`. These are not standalone modules, just dependencies; hence, if no module that uses them is present, they won't be loaded. Also, removed the feeds from worksheets.
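The "dependency-only module" behaviour described above can be sketched as a small closure computation over the enabled modules: a dependency is loaded only if some enabled module reaches it. The function `modules_to_load` and the `DEPENDENCIES` map below are hypothetical, and the particular dependencies listed are illustrative rather than taken from the real configuration.

```python
from typing import Dict, Iterable, Set

# Hypothetical dependency map; `feeds`, `catalog` and `vote` are never enabled
# directly, they are only pulled in by the modules that use them.
DEPENDENCIES: Dict[str, Set[str]] = {
    "datasets": {"feeds", "catalog", "vote"},
    "notebooks": set(),
}


def modules_to_load(
    enabled: Iterable[str], dependencies: Dict[str, Set[str]]
) -> Set[str]:
    """Returns the enabled modules plus everything they transitively depend on."""
    to_load: Set[str] = set()
    pending = list(enabled)
    while pending:
        name = pending.pop()
        if name in to_load:
            continue
        to_load.add(name)
        pending.extend(dependencies.get(name, set()))
    return to_load
```

With every module disabled the result is empty, so none of the union-contributing dependencies are loaded; enabling only `notebooks` likewise leaves `feeds`, `catalog` and `vote` out.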

Merge latest changes from main into modularization-main It includes changes from data-dot-all#626, data-dot-all#630, data-dot-all#648, data-dot-all#649, and data-dot-all#651 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com> Co-authored-by: wolanlu <101870655+wolanlu@users.noreply.github.com> Co-authored-by: Amr Saber <amr.m.saber.mail@gmail.com> Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com> Co-authored-by: kukushking <kukushkin.anton@gmail.com> Co-authored-by: Dariusz Osiennik <osiend@amazon.com> Co-authored-by: Dennis Goldner <107395339+degoldner@users.noreply.github.com> Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it> Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com> Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com> Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com> Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com> Co-authored-by: chamcca <40579012+chamcca@users.noreply.github.com> Co-authored-by: Dhruba <117375130+marjet26@users.noreply.github.com> Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com> Co-authored-by: Srinivas Reddy <srinivasreddych@outlook.com> Co-authored-by: mourya-33 <134511711+mourya-33@users.noreply.github.com> Co-authored-by: Noah Paige <noahpaig@amazon.com> Co-authored-by: dlpzx <dlpzx@amazon.com>

### Feature or Bugfix  - Feature ### Detail - Change cdk context CloudFront feature flag in `cdk.json` from `"@aws-cdk/aws-cloudfront:defaultSecurityPolicyTLSv1.2_2021": false` to `true` By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

…all#665) ### Feature or Bugfix  - Merge request ### Detail - Merge modularization-main into modularization/frontend ### Relates - data-dot-all#398 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Co-authored-by: Amr Saber <amr.m.saber.mail@gmail.com> Co-authored-by: Raul Cajias <cajias@users.noreply.github.com> Co-authored-by: nikpodsh <124577300+nikpodsh@users.noreply.github.com> Co-authored-by: Maryam Khidir <maryamolaide95@gmail.com> Co-authored-by: Nikita Podshivalov <nikpodsh@amazon.com>

In this PR, the tests have been refactored, rearranged, and de-duplicated. The tests should have a layout similar to that of the backend. This PR depends on the changes in data-dot-all#643 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com> Co-authored-by: wolanlu <101870655+wolanlu@users.noreply.github.com> Co-authored-by: Amr Saber <amr.m.saber.mail@gmail.com> Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com> Co-authored-by: kukushking <kukushkin.anton@gmail.com> Co-authored-by: Dariusz Osiennik <osiend@amazon.com> Co-authored-by: Dennis Goldner <107395339+degoldner@users.noreply.github.com> Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it> Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com> Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com> Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com> Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com> Co-authored-by: chamcca <40579012+chamcca@users.noreply.github.com> Co-authored-by: Dhruba <117375130+marjet26@users.noreply.github.com> Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com> Co-authored-by: Srinivas Reddy <srinivasreddych@outlook.com> Co-authored-by: mourya-33 <134511711+mourya-33@users.noreply.github.com> Co-authored-by: Noah Paige <noahpaig@amazon.com> Co-authored-by: dlpzx <dlpzx@amazon.com>

The corresponding module interface for a dependency should be loaded locally - Bugfix There is an issue with the loading of modules for the cdkproxy task. The issue appears because dependencies are loaded in the module scope, when they should be loaded locally for each `ModuleInterface`. For instance, when there is `from dataall.modules.catalog import api` in `dataall.modules.datasets.__init__`, the Catalog API will be loaded for every dataset module interface. By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
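A minimal sketch of the difference between module-scope and local loading, under stated assumptions: `load_dependency`, `is_supported`, the two interface classes, and the string import modes are hypothetical stand-ins and not the actual data.all code. Only the names `ModuleInterface` and `load_modules` are borrowed from the PR text. The point it illustrates is that a dependency loaded inside an interface's constructor is only pulled in when that interface is actually activated.

```python
# Hypothetical stand-in for an import such as
# `from dataall.modules.catalog import api`; it records what was loaded.
LOADED = []


def load_dependency(name):
    LOADED.append(name)
    return name


class ModuleInterface:
    """Hypothetical base class: one subclass per import mode of a module."""

    @staticmethod
    def is_supported(mode):
        raise NotImplementedError


class DatasetApiInterface(ModuleInterface):
    @staticmethod
    def is_supported(mode):
        return mode == "API"

    def __init__(self):
        # Local load: happens only when this interface is instantiated,
        # not when the package that defines it is imported.
        self.catalog_api = load_dependency("catalog.api")


class DatasetCdkProxyInterface(ModuleInterface):
    @staticmethod
    def is_supported(mode):
        return mode == "CDK_PROXY"

    def __init__(self):
        # The cdkproxy task does not need the Catalog API at all.
        self.catalog_api = None


def load_modules(mode, interfaces):
    """Instantiate only the interfaces that support the requested mode."""
    return [cls() for cls in interfaces if cls.is_supported(mode)]
```

With this layout, calling `load_modules("CDK_PROXY", ...)` never touches the Catalog API, whereas a module-scope import would have loaded it for every interface.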

### Feature or Bugfix - Refactoring ### Detail Make naming inside the `db` package consistent. Always use `-models` or `repositories` suffix. ### Relates - 2.0 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Co-authored-by: Nikita Podshivalov <nikpodsh@amazon.com> Co-authored-by: nikpodsh <124577300+nikpodsh@users.noreply.github.com>

### Feature or Bugfix  - Bugfix ### Detail - Add `AdministratorPolicy` as default cfn execution policy when bootstrapping linked envs ### Relates By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Co-authored-by: dlpzx <dlpzx@amazon.com>

### Feature or Bugfix  - Bugfix ### Detail - Add `BatchCreate` and `BatchDelete` glue permissions to the dataset IAM role - Needed by Glue Crawler to add tables/partitions - Add checks to environment resources before deleting environments or env groups (WIP) ### Relates - data-dot-all#670 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

### Feature or Bugfix  - Feature ### Detail - module enablement for `datasets`, `catalog`, and `glossaries` ### Relates - data-dot-all#398 By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

### Feature or Bugfix - Bugfix ### Detail Add `resource_prefix` to Albfront class initialization ### Relates Solve data-dot-all#680 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). N/A I do not introduce any major changes. - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Merge main -> modularization-main ### Security N/A. it's a merge PR. Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
--------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com> Co-authored-by: wolanlu <101870655+wolanlu@users.noreply.github.com> Co-authored-by: Amr Saber <amr.m.saber.mail@gmail.com> Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com> Co-authored-by: kukushking <kukushkin.anton@gmail.com> Co-authored-by: Dariusz Osiennik <osiend@amazon.com> Co-authored-by: Dennis Goldner <107395339+degoldner@users.noreply.github.com> Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it> Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com> Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com> Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com> Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com> Co-authored-by: chamcca <40579012+chamcca@users.noreply.github.com> Co-authored-by: Dhruba <117375130+marjet26@users.noreply.github.com> Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com> Co-authored-by: Srinivas Reddy <srinivasreddych@outlook.com> Co-authored-by: mourya-33 <134511711+mourya-33@users.noreply.github.com> Co-authored-by: Noah Paige <noahpaig@amazon.com> Co-authored-by: dlpzx <dlpzx@amazon.com> Co-authored-by: Jorge Iglesias Garcia <44316552+jorgeig-space@users.noreply.github.com> Co-authored-by: Jorge Iglesias Garcia <jorgeig@amazon.de>

commit 4cb33324 Author: Admin/noahpaig-Isengard <Admin/noahpaig-Isengard> Date: Mon Sep 11 2023 12:12:11 GMT-0400 (Eastern Daylight Time) Conflicts resolved in the console. commit 9e9b0e9 Author: dlpzx <71252798+dlpzx@users.noreply.github.com> Date: Mon Sep 11 2023 10:57:24 GMT-0400 (Eastern Daylight Time) Merge branch 'main' into main-v2 commit 0eefd78 Author: dlpzx <dlpzx@amazon.com> Date: Mon Sep 11 2023 10:55:04 GMT-0400 (Eastern Daylight Time) Upgrade css-tools and revert format of yarn.lock commit 88acc19 Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> Date: Mon Sep 11 2023 08:56:30 GMT-0400 (Eastern Daylight Time) Modularization/fix glossary associations (#742) ### Feature or Bugfix  - Bugfix ### Detail - Unable to load Glossary Term Associations for `DataStorageLocation` - Pass the correct target_type of `Folder` when setting Glossary Term Links - Unable to load Glossary Term Associations when either Dashboards or Datasets Module Disabled - We have inline fragments on the `getGlossary` graphQL API that will throw an error if the ObjectType is not defined (i.e. when Dashboards or Datasets are disabled) - Since we only require a common field returned by any of the Term Association Objects (we only need `label` field) we can remove the inline fragments so there is no need to have object types defined when loading the glossary term associations - Also cleaned up some of the other API Get calls on glossary that return additional information that is unused **NOTE: This PR is branched off of #731 and should be merged ONLY AFTER the prior is to `main-v2`** ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). NA ``` - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? 
- What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? ``` By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Co-authored-by: dlpzx <dlpzx@amazon.com> Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com> commit 9158691 Author: dlpzx <71252798+dlpzx@users.noreply.github.com> Date: Mon Sep 11 2023 01:34:04 GMT-0400 (Eastern Daylight Time) DA v2: fix glossaries permissions and refactor catalog module (#731) ### Feature or Bugfix - Bugfix - Refactoring ### Detail - moved glossaries permissions from core to module.catalog - refactored catalog module to follow services, resolvers, db layer design - fix list_term_associations - remove Columns from glossaries registry --> they cannot be tagged - clean up unused code in glossaries ### Relates V2.0.0 release ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). `N/A` - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? 
- Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit 75171c0 Author: nikpodsh <124577300+nikpodsh@users.noreply.github.com> Date: Fri Sep 08 2023 09:44:08 GMT-0400 (Eastern Daylight Time) Add dependency for Worksheets (#740) ### Feature or Bugfix Added datasets as dependency for worksheets ### Security N/A - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
commit 8d95241 Author: dlpzx <71252798+dlpzx@users.noreply.github.com> Date: Fri Sep 08 2023 05:17:08 GMT-0400 (Eastern Daylight Time) DA v2: Modularized structure of feeds and votes (#735) ### Feature or Bugfix - Refactoring ### Detail - Adjust `feed` module to the new code layer design - Adjust `vote` module to the new code layer design - Clean up unused code ### Relates - V2.0 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). `N/A` - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit a4b831e Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> Date: Thu Sep 07 2023 12:43:38 GMT-0400 (Eastern Daylight Time) DA v2: Fix Handling of Delete Dataset Resources (#737) ### Feature or Bugfix  - Bugfix in V2.0 data.all code ### Detail - When deleting a dataset - we were redefining the `uri` variable used to track the datasetUri. 
This caused errors with: - Folders to not be removed from opensearch catalog - Folders to not be removed from opensearch catalog - Deletion of outstanding shares with no shared items on the dataset (if they exist) - Deletion of Glossary Term Links on the Dataset or Dataset Tables (if they exist) - Deletion of Resource Policies on the Dataset - Deletion of the Dataset CloudFormation Stack (if enabled) - The above handling of deletion should be resolved as of this PR ### Relates - #733 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). NA ``` - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? ``` By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit b43a247 Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> Date: Thu Sep 07 2023 09:21:39 GMT-0400 (Eastern Daylight Time) Fix/cdkpipeline local deploy (#730) ### Feature or Bugfix  - Bugfix ### Detail - Fix bug with local deployments creation of CDK Pipelines. 
Error Caused By: - Not registering `ImportMode.CDK_CLI_EXTENSION` since `load_modules()` for this import mode type was only being called for AWS Deployments - Incorrect path specified for the location of the cdk/ddk app created for CDK Pipelines in local deployments - ### Relates ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). NA ``` - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? ``` By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit 51890cb Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> Date: Wed Sep 06 2023 13:08:11 GMT-0400 (Eastern Daylight Time) DA v2: Fix Get Query on Chat/Feed Feature (#729) ### Feature or Bugfix  - Bugfix ### Detail - If 1 of the dataset, dashboard, or datapipeline modules was disabled, the query to get Feed (i.e. the Chat Feature) would fail with an error similar to error: - `[GraphQL error]: Message: Unknown type 'DataPipeline'. 
Did you mean 'DatasetLink'?, Location: [object Object], Path: undefined` - This is due to the fixed inline fragments on the `getFeed(...)` query, such as `... on DatasetTable {...}`, which reference object types that do not exist when the corresponding module is disabled - In this PR we remove the fragments as the `label` return value is not used in the `FeedComments` View ### Relates - [#723](#723) ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
commit fe3ed03 Author: Mo <136477937+itsmo-amzn@users.noreply.github.com> Date: Wed Sep 06 2023 05:53:00 GMT-0400 (Eastern Daylight Time) Fix frontend issues (#727) ### Feature or Bugfix  - Bugfix ### Detail - disabled `datasets` tab in `environment` view when `datasets` module is disabled - disabled module list item in environment features when module is disabled - fix issue with landing route when `catalog` module is disabled - `worksheets` tab enablement rules By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit 0c22580 Author: dlpzx <71252798+dlpzx@users.noreply.github.com> Date: Tue Sep 05 2023 11:09:46 GMT-0400 (Eastern Daylight Time) DA v2: fix default permissions and update migrations scripts (#728) ### Feature or Bugfix - Bugfix ### Detail There are some issues with the permissions that appear in the invitation request Original: ![image](https://github.com/awslabs/aws-dataall/assets/71252798/3b4f409c-e9f4-4bc7-9f4b-123e4c4d0f0b) Fresh deployment (with mlStudio module disabled): <img width="500" alt="image" src="https://github.com/awslabs/aws-dataall/assets/71252798/996aae76-3069-4562-9baf-d7255d009650"> With a pre-existing deployment: ![image](https://github.com/awslabs/aws-dataall/assets/71252798/e3b82bb8-7431-4dda-90d4-7f632c4fcbbb) old deployment: ``` Invite other teams Add consumption roles create networks create pipelines create notebooks Request datasets access create datasets create redshift ---> removed! already done create ML Studio ---> renamed! already done ``` The following are new or wrong permissions in fresh deployments / backwards ``` List datasets on this environment / LIST_ENVIRONMENT_DATASETS Run athena queries / RUN_ATHENA_QUERY List datasets shared with this environments (TYPO!) / LIST_ENVIRONMENT_SHARED_WITH_OBJECTS?? 
nothing (mlstudio disabled) / LIST_ENVIRONMENT_SGMSTUDIO_USERS ``` This PR includes 3 fixes: 1) `RUN_ATHENA_QUERY`: Good to add, but it was not there before, so we need to update the description in the migration scripts 2) `LIST_ENVIRONMENT_DATASETS`, `LIST_ENVIRONMENT_SHARED_WITH_OBJECTS`: they are needed by default, so we can add a new list `ENVIRONMENT_INVITED_DEFAULT` and add them there instead of adding them to the list that is used for the toggle menu. 3) `LIST_ENVIRONMENT_SGMSTUDIO_USERS`: this permission is not used, we just need to remove it In addition, some permissions have been renamed. I used the `migrations/versions/4a0618805341_rename_sgm_studio_permissions.py` script as it already handles renames Testing: [X] - Local testing of renaming and descriptions [X] - AWS testing of the permissions that appear on screen [ ] - AWS testing with an invited group - check that they can list datasets and shares in environment This is the end result for a deployment with the dashboards modules disabled: ![image](https://github.com/awslabs/aws-dataall/assets/71252798/b2222c1f-6d37-447d-bee5-e5d8521ff145) ### Relates - V2 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). `N/A` - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? 
- Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit 12db96a Author: nikpodsh <124577300+nikpodsh@users.noreply.github.com> Date: Tue Sep 05 2023 08:18:52 GMT-0400 (Eastern Daylight Time) Add conditions on catalog indexer tasks (#725) The catalog indexer fails if all modules are disabled. Here is a small improvement to fix it Checked against: 1) all disabled 2) all enabled 3) datasets enabled only 4) dashboards enabled only ### Security `N/A` . it's a small improvement for disable/enabling tasks based on modules. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
commit 2295e51 Author: dlpzx <71252798+dlpzx@users.noreply.github.com> Date: Mon Sep 04 2023 09:03:18 GMT-0400 (Eastern Daylight Time) fix: pin version of npm@9 for VPC facing architecture (#724) ### Feature or Bugfix - Bugfix ### Detail In the [latest release of npm](https://github.com/npm/cli/releases/tag/v10.0.0) 4 days ago, node 16 is no longer supported, which leads to failure of the frontend image build when using the VPC-facing architecture. In this PR the version of npm is pinned to the previous version. It solves the issue and it is a quick fix, but we still need to upgrade to node 18. There are GitHub issues ( #655 #38 ...) dedicated to the node version upgrade, which should be done in a separate PR. I tried upgrading the version of node directly in the docker image, but it led to (different) deployment errors ### Relates ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? N/A for all By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
commit fb7632d Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> Date: Fri Sep 01 2023 09:43:56 GMT-0400 (Eastern Daylight Time) DA V2: Fix Environment Stack CDKProxy Register by default (#717) ### Feature or Bugfix  - Bugfix ### Detail - When all modules are disabled (or all modules that depend on EnvironmentSetup for their CDK Module Interface) then the Environment Stack is not imported properly or registered as a stack in the StackManager - This PR adds EnvironmentSetup to the `__init__.py` file of the `base/cdkproxy/stacks/` so the Environment stack will always be registered by default - It also separates `mlstudio_stack` and `mlstudio_extension` to avoid circular dependencies when importing `EnvironmentSetup` and `stack` in the same file ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). NA - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
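The registration problem described in #717 can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `StackManager` below is a hypothetical decorator-based registry, not the real data.all class, and `"environment"` is an invented key. It only shows why a stack that is never imported is never registered, and why importing it from a package `__init__.py` makes the registration unconditional.

```python
class StackManager:
    """Hypothetical registry: stacks register themselves when their class is defined."""

    def __init__(self):
        self._stacks = {}

    def register(self, name):
        def decorator(cls):
            # Runs at class-definition time, i.e. when the defining module is imported.
            self._stacks[name] = cls
            return cls
        return decorator

    def is_registered(self, name):
        return name in self._stacks

    def instantiate(self, name):
        if name not in self._stacks:
            raise KeyError(f"Stack '{name}' was never imported, so it is not registered")
        return self._stacks[name]()


manager = StackManager()


# In a real package this class would live in its own file. Importing that file
# from the package's `__init__.py` is what guarantees the decorator runs even
# when no feature module pulls it in.
@manager.register("environment")
class EnvironmentSetup:
    def synth(self):
        return "environment stack"
```

If the only imports of `EnvironmentSetup` live inside optional modules, disabling all of them means the decorator never executes, which matches the symptom in the PR.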
commit bf77f7c Author: dlpzx <71252798+dlpzx@users.noreply.github.com> Date: Fri Sep 01 2023 04:45:00 GMT-0400 (Eastern Daylight Time) DA v2: fix issues with modularization pipelines (#705) ### Feature or Bugfix - Refactoring - datapipelines module is not following design guidelines. `api` =validation of input, `services`=check permissions and business logic - there is a lot of unused code ### Detail - move business logic to `services` - implement validation of parameters in `api` - clean up unused code in `datapipelines`: cat, ls and the `aws` clients ### Relates ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). N/A - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
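A rough sketch of the layering that #705 refers to, where `api` validates input and `services` checks permissions and holds the business logic. All names (`RequiredParameter`, `PermissionDenied`, `PipelineService`, `create_pipeline_resolver`, the `label` and `environmentUri` fields, the in-memory store) are hypothetical and chosen only to show where each responsibility sits.

```python
class RequiredParameter(ValueError):
    """Raised by the api layer when a mandatory input field is missing."""


class PermissionDenied(Exception):
    """Raised by the service layer when the caller lacks the permission."""


class PipelineService:
    """Service layer: permission checks and business logic only."""

    def __init__(self, allowed_groups):
        self.allowed_groups = set(allowed_groups)
        self.pipelines = {}  # in-memory stand-in for the db layer

    def create_pipeline(self, user_groups, data):
        if not self.allowed_groups & set(user_groups):
            raise PermissionDenied("CREATE_PIPELINE")
        uri = f"pipeline-{len(self.pipelines) + 1}"
        self.pipelines[uri] = dict(data)
        return uri


def create_pipeline_resolver(service, user_groups, input_data):
    """Api layer: validate the shape of the input, then delegate to the service."""
    if not input_data:
        raise RequiredParameter("input")
    for field in ("label", "environmentUri"):
        if not input_data.get(field):
            raise RequiredParameter(field)
    return service.create_pipeline(user_groups, input_data)
```

Keeping validation in the resolver means the service can assume well-formed input, and permission logic is not duplicated across resolvers.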
--------- Co-authored-by: Balint David <dbalintx@amazon.com> commit af860f6 Author: dbalintx <132444646+dbalintx@users.noreply.github.com> Date: Thu Aug 31 2023 12:16:17 GMT-0400 (Eastern Daylight Time) Fix canary user pw creation (#718) In rare cases the current way of generating a Canary user password in Cognito can result in a string containing no numerical values, so the following error is thrown during deployment (requires the **enable_cw_canaries** config parameter set to True in cdk.json): `botocore.errorfactory.InvalidPasswordException: An error occurred (InvalidPasswordException) when calling the AdminCreateUser operation: Password did not conform with password policy: Password must have numeric characters` This PR changes the password creation so that the password contains at least 1 uppercase and 1 numerical character. ### Feature or Bugfix - Bugfix ### Detail - <feature1 or bug1> - <feature2 or bug2> ### Relates - <URL or Ticket> ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). n/a - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? 
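One way to guarantee the character classes mentioned in #718 is sketched below. This is not the code from the PR: the function name `generate_canary_password`, the default length, and the choice of the `secrets` module are assumptions. The idea is to pick one character from each required class first, fill the rest from the full alphabet, and shuffle. A real Cognito password policy may also require symbols, which this sketch does not cover.

```python
import secrets
import string


def generate_canary_password(length: int = 12) -> str:
    """Return a random password with at least one uppercase, one lowercase, and one digit."""
    if length < 3:
        raise ValueError("length must be at least 3 to fit all required character classes")

    # One guaranteed character per required class.
    chars = [
        secrets.choice(string.ascii_uppercase),
        secrets.choice(string.ascii_lowercase),
        secrets.choice(string.digits),
    ]

    # Fill the remainder from the combined alphabet.
    alphabet = string.ascii_letters + string.digits
    chars += [secrets.choice(alphabet) for _ in range(length - len(chars))]

    # Shuffle so the guaranteed characters are not always at the front.
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)
```

Sampling the whole string from a mixed alphabet and hoping for a digit is what makes the failure rare but possible. Seeding the required classes removes that chance entirely.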
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. commit 43eb082 Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> Date: Wed Aug 30 2023 09:07:15 GMT-0400 (Eastern Daylight Time) Add env URI to role share query (#706) ### Feature or Bugfix  - Bugfix ### Detail - If I have Group_1 that is invited to Env1 and Env2 and has existing shares for that group to Env1 - I still cannot remove the EnvGroup from Env2 (where no shares exists) - When we count resources to validate before removing group - for share objects in the `count_principal_resources()` we only filter by principalId and principalType but we also need to filter by environmentUri - This PR fixes query to add environmentURI filter when counting envGroup resources before removing groups ### Relates ### Security NA ``` Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? ``` By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. 
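The filter fix described in #706 above can be sketched in plain Python. The `ShareObject` dataclass and the function signature are assumptions made for illustration (the real `count_principal_resources()` is a database query). The field names `principalId`, `principalType`, and `environmentUri` are the ones named in the PR text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShareObject:
    """Hypothetical in-memory stand-in for a share object row."""
    principalId: str
    principalType: str
    environmentUri: str


def count_principal_resources(shares, principal_id, principal_type, environment_uri):
    """Count shares for a principal, restricted to a single environment."""
    return sum(
        1
        for share in shares
        if share.principalId == principal_id
        and share.principalType == principal_type
        # The extra condition: without it, shares in Env1 would also
        # block removing the group from Env2.
        and share.environmentUri == environment_uri
    )


shares = [ShareObject("Group_1", "Group", "Env1")]
shares_in_env1 = count_principal_resources(shares, "Group_1", "Group", "Env1")
shares_in_env2 = count_principal_resources(shares, "Group_1", "Group", "Env2")
```

Here `shares_in_env2` is zero, so the group can be removed from Env2 even though it still has a share in Env1.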
commit 4ced415
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Date: Tue Aug 29 2023 14:49:18 GMT-0400 (Eastern Daylight Time)

DA v2: Fix Params for RAM Invitations Table Sharing (#713)

### Feature or Bugfix
- Bugfix

### Detail
- LF cross-account table shares were originally unable to find the correct RAM share invitation because the lookup filtered on incorrect `Sender` and `Receiver` accounts
- This typically led to an `Insufficient Glue Permissions on GrantPermissions Operation` error when attempting to share a table in v2 data.all

### Relates

### Security
NA

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
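A rough sketch of the invitation lookup that #713 corrects is shown here. It operates on plain dictionaries shaped like the entries in the `resourceShareInvitations` list returned by the AWS RAM `GetResourceShareInvitations` API (`senderAccountId`, `receiverAccountId`, `status`). The helper name and the choice of which account is the sender are assumptions: the sketch treats the account that owns the table as the sender and the account receiving the share as the receiver, which the commit message does not spell out.

```python
from typing import Dict, List, Optional


def find_pending_invitation(
    invitations: List[Dict[str, str]],
    sender_account: str,
    receiver_account: str,
) -> Optional[Dict[str, str]]:
    """Hypothetical helper: first PENDING invitation matching both accounts."""
    for invitation in invitations:
        if (
            invitation.get("senderAccountId") == sender_account
            and invitation.get("receiverAccountId") == receiver_account
            and invitation.get("status") == "PENDING"
        ):
            return invitation
    return None


# Illustrative data only; account IDs and ARNs are made up.
sample_invitations = [
    {
        "resourceShareInvitationArn": "arn:example:invitation/1",
        "senderAccountId": "111111111111",
        "receiverAccountId": "222222222222",
        "status": "PENDING",
    },
    {
        "resourceShareInvitationArn": "arn:example:invitation/2",
        "senderAccountId": "333333333333",
        "receiverAccountId": "222222222222",
        "status": "PENDING",
    },
]

# Correct pairing finds the invitation.
match = find_pending_invitation(sample_invitations, "111111111111", "222222222222")
# Swapping sender and receiver, as in the bug, finds nothing.
swapped = find_pending_invitation(sample_invitations, "222222222222", "111111111111")
```

With the accounts swapped no invitation is found, so it is never accepted, which is consistent with the permissions error reported in the Detail section.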
commit ad53c33
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Date: Mon Aug 28 2023 03:15:25 GMT-0400 (Eastern Daylight Time)

DA v2: Dataset IAM Role Default DB Glue Permissions (#701)

### Feature or Bugfix
- Bugfix

### Detail
- In the case where the `default` Glue Database does not already exist, the Glue script fails with "DATASET IAM ROLE is not authorized to perform: glue:CreateDatabase on resource XXX"
- Root cause: the Glue Job needs to verify that the default database exists and, if it does not, create it
- Adding the glue:CreateDatabase permission on the catalog, and only for the default DB, resolves the issue
- POTENTIAL ISSUE:
  - In the Custom Resource of the Dataset Stack we add permissions for the dataset IAM Role to the default DB if it exists
  - In the scenario where the default DB does not already exist and we create Dataset1 and Dataset2, the Dataset1 Role creates the default DB when running the profiling job and the Dataset2 IAM Role never gets proper LF permissions
  - On testing, I tried a Dataset Stack update, but it determines "No Change Set" on the CloudFormation stack, so the Custom Resource does not re-run to add the required permissions
  - A potential solution is to pass a randomly generated uuid as part of the Custom Resource properties to trigger a run of the Lambda each time?

### Relates
- <URL or Ticket>

### Security

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

---------

Co-authored-by: dlpzx <dlpzx@amazon.com>

commit e097d08
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Date: Thu Aug 24 2023 15:41:40 GMT-0400 (Eastern Daylight Time)

DA v2: Worksheets View need default export (#703)

### Feature or Bugfix
- Bugfix

### Detail
- Worksheet view is not loading, with the error:
```
Element type is invalid. Received a promise that resolves to: [missing argument]. Lazy element type must resolve to a class or function.[missing argument]
```
- From the React [Docs](https://legacy.reactjs.org/docs/code-splitting.html#reactlazy):
```
React.lazy takes a function that must call a dynamic import(). This must return a Promise which resolves to a module with a default export containing a React component.
```

### Relates

### Security
NA

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit 9156189
Author: nikpodsh <124577300+nikpodsh@users.noreply.github.com>
Date: Thu Aug 24 2023 04:45:43 GMT-0400 (Eastern Daylight Time)

Fix chats. (#698)

Fixed chats. Worksheets are no longer eligible as a chat entity.

Security
`N/A`, just removing a few lines of frontend code.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
commit 9ab933a
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Date: Thu Aug 24 2023 02:25:28 GMT-0400 (Eastern Daylight Time)

DA v2: List Datasets Pagination (#700)

### Feature or Bugfix
- Bugfix

### Detail
- When owning 3 data.all datasets with shares created and viewing the Datasets tab in the data.all UI, there were multiple pages showing a different number of datasets on each page (the expected behavior is a single page showing all 3 datasets, since the default page size is 10)
- This happened because the `listDatasets` API returns a query that joins datasets with existing shares, to ensure that we list all datasets that a user owns, stewards, or has been shared
- But we did not filter the query for distinct datasetUris, meaning that if I own 1 dataset with 3 existing shares, each on 1 table in the dataset, and I run the query to find all user datasets, it would return 4 records instead of just 1 for the dataset I own

### Security
NA

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit 12d0596
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Date: Thu Aug 24 2023 02:24:07 GMT-0400 (Eastern Daylight Time)

DA v2: Fix Lakeformation Registered Location Check Dataset Stack (#699)

### Feature or Bugfix
- Bugfix

### Detail
- In v2 code, creating or updating a dataset stack would go back and forth between creating and deleting the `Lakeformation::Resource` construct in CloudFormation, which registered and un-registered the resource
- This led to issues querying the data in Athena, with a `Permissions denied on S3 Path` error
- The root cause was that, when performing `check_existing_lf_registered_location()` in the dataset stack synth, we were passing the wrong variable to resolve the AWS Account Id for the role arn returned from the API call `lakeformation.describe_resource()`

### Relates
- #672
- #671

### Security
NA

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit 9a250e5
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Date: Wed Aug 23 2023 11:58:42 GMT-0400 (Eastern Daylight Time)

Fix Delete Env Validation on Consumption Role Resource (#693)

### Feature or Bugfix
- Bugfix

### Detail
- Resolve incorrect parameters passed when counting consumption role resources before handling deletes of environments

### Relates

### Security

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
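The duplicate rows behind the `listDatasets` pagination fix (#700) can be reproduced with a small in-memory join. This is a simplified sketch, not the query from the PR: the table and column names are assumptions, and `sqlite3` stands in for the project's ORM. The real query is shaped differently, so the number of duplicates here (three) differs from the four records mentioned in the commit message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dataset (datasetUri TEXT PRIMARY KEY, owner TEXT);
    CREATE TABLE share (shareUri TEXT PRIMARY KEY, datasetUri TEXT, principalId TEXT);
    INSERT INTO dataset VALUES ('ds-1', 'alice');
    INSERT INTO share VALUES ('sh-1', 'ds-1', 'team-a');
    INSERT INTO share VALUES ('sh-2', 'ds-1', 'team-b');
    INSERT INTO share VALUES ('sh-3', 'ds-1', 'team-c');
    """
)

# Datasets the user owns or has been shared, found by joining with shares.
JOINED = (
    "FROM dataset d LEFT OUTER JOIN share s ON s.datasetUri = d.datasetUri "
    "WHERE d.owner = ? OR s.principalId = ?"
)

# One row per (dataset, share) pair: the owned dataset appears once per share.
rows_without_distinct = conn.execute(
    "SELECT d.datasetUri " + JOINED, ("alice", "alice")
).fetchall()

# DISTINCT collapses the duplicates, so pagination counts each dataset once.
rows_with_distinct = conn.execute(
    "SELECT DISTINCT d.datasetUri " + JOINED, ("alice", "alice")
).fetchall()
```

Because page boundaries are computed over the returned rows, the duplicated rows inflate the total and spread one dataset across several pages; selecting distinct dataset URIs before paginating restores one row per dataset.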
commit 37c8cb7
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Date: Wed Aug 23 2023 11:58:25 GMT-0400 (Eastern Daylight Time)

re create pivot roles on upgrade to v2 (#696)

### Feature or Bugfix
- Bugfix

### Detail
- Change the name of the pivot role policies created via CDK to handle upgrades from v1.6.2 to v2.X
  - Updates to the environment stack originally threw the error `res-pivotrole-cdk-policy-2 already exists in stack arn:aws:cloudformation:eu-west-1:xxxxxxxx:stack/PIVOTROLESTACKNAME`
- Add the permissions required by the Custom CDK Exec Role to create policy versions for stack upgrades

### Security

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
commit b9b64d1
Author: nikpodsh <124577300+nikpodsh@users.noreply.github.com>
Date: Wed Aug 23 2023 11:56:41 GMT-0400 (Eastern Daylight Time)

Fix auth issue happening on the frontend (#689)

Fix of auth issue in frontend

commit e0b287f
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Date: Wed Aug 23 2023 09:29:46 GMT-0400 (Eastern Daylight Time)

Fix Env Parameters Migration Script (#692)

### Feature or Bugfix
- Bugfix

### Detail
- Resolve errors in the migration script for environment parameters when upgrading from data.all v1.6.2 with pre-existing environments to beta-2.0

### Relates
- #691

### Security

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
---------

Co-authored-by: dlpzx <dlpzx@amazon.com>

commit 2fb7d97
Author: dlpzx <71252798+dlpzx@users.noreply.github.com>
Date: Tue Aug 22 2023 05:09:41 GMT-0400 (Eastern Daylight Time)

Add missing frontend path in albfront stage (#686)

### Feature or Bugfix
- Bugfix

### Detail
- Add `frontend` to the Dockerfile path in the albfront stage

### Relates
- V2.0

### Security

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

commit 568a8fe
Author: nikpodsh <124577300+nikpodsh@users.noreply.github.com>
Date: Tue Aug 22 2023 04:36:42 GMT-0400 (Eastern Daylight Time)

Main v2 (#683)

A new PR without conflicts

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Amr Saber <amr.m.saber.mail@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
Co-authored-by: wolanlu <101870655+wolanlu@users.noreply.github.com>
Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Co-authored-by: kukushking <kukushkin.anton@gmail.com>
Co-authored-by: Dariusz Osiennik <osiend@amazon.com>
Co-authored-by: Dennis Goldner <107395339+degoldner@users.noreply.github.com>
Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it>
Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com>
Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com>
Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com>
Co-authored-by: Mohit Arora <1666133+blitzmohit@users.noreply.github.com>
Co-authored-by: Balint David <dbalintx@amazon.com>
Co-authored-by: chamcca <40579012+chamcca@users.noreply.github.com>
Co-authored-by: Dhruba <117375130+marjet26@users.noreply.github.com>
Co-authored-by: Srinivas Reddy <srinivasreddych@outlook.com>
Co-authored-by: mourya-33 <134511711+mourya-33@users.noreply.github.com>
Co-authored-by: dlpzx <dlpzx@amazon.com>
Co-authored-by: Noah Paige <noahpaig@amazon.com>
Co-authored-by: Mo <136477937+itsmo-amzn@users.noreply.github.com>
Co-authored-by: Raul Cajias <cajias@users.noreply.github.com>
Co-authored-by: Maryam Khidir <maryamolaide95@gmail.com>
Co-authored-by: Jorge Iglesias Garcia <44316552+jorgeig-space@users.noreply.github.com>
Co-authored-by: Jorge Iglesias Garcia <jorgeig@amazon.de>

AmrSaber and others added 30 commits March 30, 2023 13:08

modularization: Disable and skip module test directories for modules that are inactive (data-dot-all#522)

f599a8a

Enabled the feature to turn off tests whenever a module is inactive.
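One way such a switch could work is to derive the list of test directories to skip from the module configuration. The sketch below is an assumption, not the code from data-dot-all#522: the config shape (`modules` mapping each module name to an `active` flag), the `tests/modules` layout, and the function name are all hypothetical.

```python
from typing import Any, Dict, List


def inactive_module_test_dirs(
    config: Dict[str, Any], tests_root: str = "tests/modules"
) -> List[str]:
    """Hypothetical helper: test directories of modules not marked active."""
    modules = config.get("modules", {})
    return sorted(
        f"{tests_root}/{name}"
        for name, settings in modules.items()
        # A module with no explicit flag is treated as inactive in this sketch.
        if not settings.get("active", False)
    )


# Illustrative configuration; module names are examples only.
example_config = {
    "modules": {
        "notebooks": {"active": True},
        "datapipelines": {"active": False},
        "worksheets": {},
    }
}

skipped_dirs = inactive_module_test_dirs(example_config)
```

With pytest, a list like `skipped_dirs` could be assigned to `collect_ignore` in a `conftest.py` so that those directories are not collected, provided the paths are written relative to that `conftest.py`.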

Change the pipeline for the modularization (data-dot-all#545)

ea0e3e9

A way to change the deployment pipeline based on the configuration file

Add generic way to toggle data.all features (data-dot-all#538)

e870882

- Ability to disable features via a single configuration
- Define feature boundaries for
  - AWS token/console actions on both dataset & env
  - File uploads/actions
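A generic feature toggle of this kind can be sketched as a single lookup against the configuration. The config layout (`modules` -> module name -> `features` -> flag), the feature names, and the helper `is_feature_enabled` are assumptions for illustration and may not match the keys introduced in data-dot-all#538.

```python
from typing import Any, Dict


def is_feature_enabled(
    config: Dict[str, Any], module: str, feature: str, default: bool = True
) -> bool:
    """Hypothetical lookup: a feature stays enabled unless the config disables it."""
    features = config.get("modules", {}).get(module, {}).get("features", {})
    return bool(features.get(feature, default))


# Illustrative configuration; feature names are examples only.
example_config = {
    "modules": {
        "datasets": {
            "active": True,
            "features": {"file_uploads": False, "aws_actions": True},
        }
    }
}

uploads_enabled = is_feature_enabled(example_config, "datasets", "file_uploads")
aws_actions_enabled = is_feature_enabled(example_config, "datasets", "aws_actions")
# A feature that is not listed falls back to the default.
unlisted_enabled = is_feature_enabled(example_config, "datasets", "preview_data")
```

Defaulting to enabled keeps existing deployments unchanged when a flag is missing; a stricter deployment could pass `default=False` so that only explicitly listed features are available.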

Refactoring of aws calls (data-dot-all#550)

1b92710

AWS calls should be made only from clients (or old handlers). Some handlers/clients were found that had not been migrated during the modularization. A couple of AWS calls have been moved from the resolvers/tasks to clients.

Resolve modularization inconsistencies (data-dot-all#605)

5b71cac

Resolving inconsistencies that were introduced during modularization. By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

dlpzx and others added 19 commits August 8, 2023 13:26

Merge remote-tracking branch 'upstream/modularization-main' into main-v2

7d88d6e

Resolve merge conflicts

0b7fd96

Changed branch for GitHub workflow

cf01bbf

nikpodsh changed the base branch from main to main-v2 August 22, 2023 08:36

nikpodsh marked this pull request as ready for review August 22, 2023 08:36

nikpodsh merged commit 568a8fe into data-dot-all:main-v2 Aug 22, 2023

