Main v2 #683

Merged
merged 49 commits into from
Aug 22, 2023
Conversation

nikpodsh
Contributor

A new PR without conflicts

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

AmrSaber and others added 30 commits March 30, 2023 13:08
Syncs modularization main branch with main (as there is a bug fix that
is useful to have)

---
By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
Co-authored-by: wolanlu <101870655+wolanlu@users.noreply.github.com>
This is a draft PR for showing purposes. There are still some minor
issues that need to be addressed.

### Feature or Bugfix
- Refactoring

### Detail
The following changes are included in this PR:
1. Modularization + Refactoring of notebooks
There are new modules that will play a major role in the future
refactoring:

   * Core = contains the code needed for the application to operate correctly
   * Common = common code for all modules
   * Modules = the plugins/features that can be inserted into the system (at
     the moment only notebooks)

The other part that is related to modularization is the creation of
environment parameters.
Environment parameters will replace all hardcoded parameters of the
environment configuration.
There is a new file, config.json, that lets you configure the
application.
All existing parameters will be migrated via a db migration in AWS
2. Extracting permissions and request context (Optional for the
modularization)

Engine, user, and user groups had been passed as parameters of the context
in the request. This forced us to pass a lot of parameters to other
methods that didn't even need them. This information should be scoped to
the request session.
There is a new way to retrieve the information using `RequestContext`.
There is also a new way to use permission checks that requires fewer
parameters and makes the code cleaner. The old way was marked as deprecated.

3. Restructure of the code (Optional for the modularization)

Since the modularization will touch all the places in the API code, it
can be a good chance to set a new structure of the code. There is a small
reorganization in the notebook module to address

   * Allocating resources before validating parameters
   * Unclear responsibilities of the classes
   * Mixed layers

   The new structure is:
   - resolvers = validate and pass the call to the service layer
   - service layer = business logic
   - repositories = database logic (all queries should be placed here)
   - aws = contains a wrapper client around boto3
   - cdk = all logic related to creating stacks or code for ecs
   - tasks = code that will be executed in AWS lambda (short-living tasks)

   All names can be changed.
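
The layering and the request-scoped context described above could look roughly like the sketch below. It is only an illustration, not the code of this PR: `set_context`, `get_context`, `NotebookRepository`, `NotebookService` and `create_notebook_resolver` are hypothetical names, the fields of `RequestContext` are assumed, and an in-memory dict stands in for the database.

```python
from contextvars import ContextVar
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RequestContext:
    """Request-scoped data that used to be passed around as parameters."""
    username: str
    groups: tuple


# Hypothetical holder for the context of the request being processed.
_context: ContextVar = ContextVar("request_context")


def set_context(context: RequestContext) -> None:
    _context.set(context)


def get_context() -> RequestContext:
    return _context.get()


@dataclass
class NotebookRepository:
    """Repository layer: the only place that touches storage."""
    rows: dict = field(default_factory=dict)

    def save(self, label: str, owner: str) -> dict:
        row = {"label": label, "owner": owner}
        self.rows[label] = row
        return row


class NotebookService:
    """Service layer: business logic, no input parsing and no queries."""

    def __init__(self, repository: NotebookRepository):
        self._repository = repository

    def create(self, label: str) -> dict:
        owner = get_context().username  # taken from the request scope
        return self._repository.save(label, owner)


def create_notebook_resolver(service: NotebookService, payload: dict) -> dict:
    """Resolver layer: validate first, then delegate to the service."""
    label = payload.get("label")
    if not label:
        raise ValueError("label is required")
    return service.create(label)
```

The resolver rejects invalid input before the service or repository is touched, which is the ordering problem listed above (resources allocated before validation).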

### Relates
[data-dot-all#295](data-dot-all#295)



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
### Feature or Bug-fix
- Refactoring

### Detail
For the development dockerfile in the frontend, instead of starting from
amazon linux image and installing everything manually, it's a lot
simpler to start from node image (that is also from ECR for security)
and have all node-related environment pre-installed and just focus the
dockerfile on project-related configuration.

This PR does not introduce any update in the logic, it just updates the
development dockerfile and adds a `.dockerignore` file for frontend to
ignore `node_modules` and `build` folders.

---
By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

### Feature or Bug-fix
- Refactoring

### Details
This PR does 2 things:
- Makes the styling consistent across the project
- Removes dead code (unused variables, functions, and imports)

---
By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bug-fix
- Refactoring

### Detail
My recent PR data-dot-all#394 has a lot
of changes, but all of them are to styles only and not to the code
itself. As a result, the blame history for all the changed files is lost
(it will always blame me).

This PR adds the file `.git-blame-ignore-revs`, which contains the id of
the styling commit and the command that you need to run once to fix the
blame history for you.

GitHub also understands that file and will show the history correctly.

This PR is based on [this
article](https://www.stefanjudis.com/today-i-learned/how-to-exclude-commits-from-git-blame/)
and is suggested by Raul.

This file works for the whole repository, and in the future one only
needs to add ids of styling commits and the history will be preserved
correctly.

---
By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Refactoring (Modularization)

### Relates
- Related issues data-dot-all#295 and data-dot-all#412 

### Short Summary
First part of migration of `Dataset` (`DatasetTableColumn`) TL;DR :) 

### Long Summary 
Datasets are huge. It's one of the central modules that's spread
everywhere across the application. Migrating the entire Dataset piece
would be a very difficult task and, more importantly, even more difficult
to review. Therefore, I decided to break down this work into "small"
steps to make it more convenient to review.
The Dataset API consists of the following items:
* `Dataset`
* `DatasetTable`
* `DatasetTableColumn`
* `DatasetLocation`
* `DatasetProfiling`

This PR only creates the `Dataset` module and migrates
`DatasetTableColumn` (and some related pieces). Why? Because the
plan was to migrate it, to see what issues would come up along the way
and to address them here. The refactoring of `DatasetTableColumn` will
be in another PR.
The issues: 
1) Glossaries
2) Feed
3) Long tasks for datasets
4) Redshift

Glossaries rely on a GraphQL union of different types (including datasets).
Created an abstraction for glossary registration. There was an idea to
change the frontend, but it would require a lot of work to do this

Feed: same as glossaries. Solved in a similar way. For feed, changing
the frontend API is more feasible, but I wanted it to be consistent with
glossaries

Long tasks for datasets. They were migrated into the tasks folder and don't
require dedicated loading of their code (at least for now). But there
are two concerns:
1) The deployment uses direct module folder references to run them
(e.g. `dataall.modules.datasets....`), so basically when a module is
deactivated, we shouldn't deploy these tasks either. I left a TODO
to address it in the future (when we migrate all modules), but we should
bear in mind that it might lead to inconsistencies.
2) There is a reference to `redshift` from long-running tasks = should
be addressed in the `redshift` module

Redshift: it has some references to `datasets`. So there will be either
dependencies among modules or small code duplication (if `redshift`
doesn't rely heavily on `datasets`) = will be addressed in the `redshift`
module

Other changes:
Fixed and improved some tests 
Extracted glue handler code that is related to `DatasetTableColumn`
Renamed import mode from tasks to handlers for async lambda.
A few hacks that will go away with next refactoring :) 

Next steps:
[Part2 ](#1) in preview :) 
Extract rest of datasets functionality (perhaps, in a few steps)
Refactor extractor modules the same way as notebooks
Extract tests to follow the same structure.



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Refactoring

### Detail
Refactoring of DatasetProfilingRun

### Relates
- data-dot-all#295 and data-dot-all#412 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Refactoring

### Detail
Refactoring of the third part of dataset: `DatasetStorageLocation`
Introduced the Indexers: this code was migrated from the `indexers.py`
and put into modules.
Removed unused alarms (which didn't call actual alarm code)
Introduced `DatasetShareService` but it seems it will be migrated to
share module.
All `DatasetXXXServices` will be split into Services (business logic)
and Repositories (DAO layer) in future parts

### Relates
- data-dot-all#412 and data-dot-all#295

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Refactoring

### Detail
Refactoring of DatasetTable:
Got rid of the ElasticSearch connection being created for every request.
Created a lazy way to establish the connection.
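
A lazy connection can be sketched as follows. This is a generic illustration of the idea, not the code from this PR: `LazyConnection` is a hypothetical name and the `factory` callable stands in for whatever actually opens the ElasticSearch connection.

```python
import threading


class LazyConnection:
    """Creates the underlying connection on first use and then reuses it."""

    def __init__(self, factory):
        self._factory = factory      # callable that opens the real connection
        self._connection = None
        self._lock = threading.Lock()

    def get(self):
        if self._connection is None:
            with self._lock:
                # Re-check inside the lock so only one thread calls the factory.
                if self._connection is None:
                    self._connection = self._factory()
        return self._connection
```

Requests that never touch the index pay nothing, and requests that do share a single connection.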

### Relates
data-dot-all#412 and data-dot-all#295

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
### Feature or Bugfix
- Refactoring

### Detail
Refactoring of the Dataset entity and related code.
Refactoring for Votes
Introduced DataPolicy (the same way as ServicePolicy was used)
Extracted dataset related permissions.
Used the new `has_tenant_permission` instead of `has_tenant_perm`, which
allows us not to pass unused parameters

### Relates
data-dot-all#412 and data-dot-all#295

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Refactoring

### Detail
Creation of the mechanism to import dependencies for modules. It should
make it easier to extract common code between modules.
Now, one module can refer to another module in order to import it.

Deleted the `common` folder since it has been replaced.
Added some checks for whether a module was unintentionally imported + logging
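
One way to think about loading modules with dependencies is a depth-first ordering, sketched below. The function name `resolve_load_order` and the shape of the `dependencies` mapping are assumptions made for illustration; the actual loader in this PR may work differently.

```python
def resolve_load_order(dependencies: dict) -> list:
    """Return module names so that every module comes after its dependencies.

    `dependencies` maps a module name to the names it depends on.
    """
    order = []
    state = {}  # name -> "visiting" | "done"

    def visit(name: str) -> None:
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError(f"circular dependency involving {name!r}")
        if name not in dependencies:
            raise KeyError(f"unknown module {name!r}")
        state[name] = "visiting"
        for dep in dependencies[name]:
            visit(dep)
        state[name] = "done"
        order.append(name)

    for name in sorted(dependencies):
        visit(name)
    return order
```

For example, with the made-up mapping `{"worksheets": ["datasets"], "datasets": ["feed"], "feed": []}` the dependency `feed` is loaded before `datasets`, which is loaded before `worksheets`.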


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
Modularization of Worksheets

- Moved Worksheet related code to its own new module
- Merged AthenaQueryResult object into Worksheets
- Worksheet related permissions moved to new module
- Removed worksheet sharing related (unused) code from the entire repo


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
…ata-dot-all#463)

Merge of main (v1.5.2) -> modularization main

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
Co-authored-by: wolanlu <101870655+wolanlu@users.noreply.github.com>
Co-authored-by: Amr Saber <amr.m.saber.mail@gmail.com>
Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Co-authored-by: kukushking <kukushkin.anton@gmail.com>
Co-authored-by: Dariusz Osiennik <osiend@amazon.com>
Co-authored-by: Dennis Goldner <107395339+degoldner@users.noreply.github.com>
Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it>
Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com>
Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com>
Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com>
### Feature or Bugfix
- Refactoring

### Detail
First part of the modularization of ML Studio 

- Created `SageMakerStudioService` and moved `resolvers` business logic
into the service
- Redefined `resolvers` and added request validators
- Redefined `db` operations and added them into `SageMakerRepository`
- Created `sagemaker_studio_client` with Sagemaker Studio SDK calls
- Created `ec2_client` with EC2 SDK calls
- Added CloudFormation stack in the `cdk` package of the module
- Added CloudFormation stack as an environment extension in the `cdk`
package of the module
- Modified environment creation and edit api calls and views to read
`mlStudiosEnabled` from environment parameters table
- Fixed tests and added migration scripts

Additional:
- Standardized naming of functions and objects to
"sagemaker_studio_user" and avoid legacy "Notebook" references
- Removed unused api definition for `getSagemakerStudioUserApps` 
- Cleaned up unused frontend views and API methods
- Split migration scripts for environments, worksheets and mlstudio
- linting Worksheets module
- prettier frontend

### Relates
data-dot-all#448 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: wolanlu <101870655+wolanlu@users.noreply.github.com>
Co-authored-by: Amr Saber <amr.m.saber.mail@gmail.com>
Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Co-authored-by: kukushking <kukushkin.anton@gmail.com>
Co-authored-by: Dariusz Osiennik <osiend@amazon.com>
Co-authored-by: Dennis Goldner <107395339+degoldner@users.noreply.github.com>
Co-authored-by: Nikita Podshivalov <nikpodsh@amazon.com>
Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it>
Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com>
Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com>
Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com>
Co-authored-by: nikpodsh <124577300+nikpodsh@users.noreply.github.com>
Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
### Feature or Bugfix
- Feature

### Detail
- For ML Studio and Environments, the original procedure of synthesizing
the template and testing the resulting json has been replaced by the
`cdk assertions` library recommended in the
[documentation](https://docs.aws.amazon.com/cdk/v2/guide/testing.html#testing_getting_started)
of CDK.
- In the Environment cdk testing, the `extent` method of the
`EnvironmentStackExtension` subclasses has been mocked. Now it only
tests the `EnvironmentSetup` stack as if no extensions were registered.
- In the MLStudio cdk testing, this PR adds a test for the
`SageMakerDomainExtension` mocking the environment stack. It tests the
`SageMakerDomainExtension` as a standalone.

Open question: 
1) The rest of cdk stacks (notebooks, pipelines, datasets...) are tested
using the old method of printing the json template. Should I go ahead
and migrate to using `cdk assertions` library? If so, I thought of doing
it for notebooks only and sync on the testing for datasets and pipelines
with @dbalintx and @nikpodsh.

2) With the `cdk assertions` library we can test the resources
properties more in depth. I added some tests on the environment stack
but we could add more asserts in all stacks. I will add more tests on
the MLStudio stack and I made a note in the GitHub project to review
this once modularization is complete.

### Relates
- data-dot-all#295 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
…that are inactive (data-dot-all#522)

Enabled the feature to turn off tests whenever a module is inactive.
Modularization of the sharing.

### Detail
Migrated the sharing part (including tasks)
Removed unused methods for datasets

There are a few issues that need to be addressed before merging this
PR. But it can be reviewed, since we don't expect major changes here.

### Related
 data-dot-all#295 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
Co-authored-by: kukushking <kukushkin.anton@gmail.com>
Co-authored-by: Dariusz Osiennik <osiend@amazon.com>
Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Co-authored-by: Dennis Goldner <107395339+degoldner@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it>
Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com>
Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com>
Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com>
Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
Modularization of data pipelines

Changes:
- Relevant files moved from dataall/Objects/api and from dataall/db to
the newly created module
- Relevant permissions extracted to the newly created module and are
being used with the new decorators
- Functions interacting with the DB were outsourced to repository
- extracted and moved Datapipelines related code from core CDK files to
the new module
dataall/cdkproxy/cdk_cli_wrapper.py
dataall/cdkproxy/stacks/pipeline.py
dataall/cdkproxy/cdkpipeline/cdk_pipeline.py
service policies

- extracted and moved Datapipelines related code from core AWS handlers
to the new module
dataall/aws/handlers/stepfunction.py
dataall/aws/handlers/codecommit.py
dataall/aws/handlers/codepipeline.py
dataall/aws/handlers/glue.py

- added module interface for the module
- unit tests updated



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
Refactoring

### Detail
Refactoring of the dashboards

### Relates
data-dot-all#509

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.


- [x] TODO: There was a huge merge conflict. I am testing the changes
after the merge

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it>
Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
Co-authored-by: kukushking <kukushkin.anton@gmail.com>
Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com>
Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com>
Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
Removing all Redshift related code from backend, frontend and docs.
Cleaned up dataall/core folder structure


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
Added missed permission checks.

This PR depends on data-dot-all#537. It should be reviewed after the dashboard
modularization

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it>
Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
Co-authored-by: kukushking <kukushkin.anton@gmail.com>
Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com>
Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com>
Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
…#568)

Due to the [Dockerfile change for the local frontend
container](data-dot-all#396), now it
requires more memory, and during docker-compose up the deployment gets
SIGKILL'd.
Increasing the resource limit in the docker-compose.yaml settings fixes
the issue.

<img width="1010" alt="Screenshot 2023-07-12 at 12 35 17"
src="https://github.com/awslabs/aws-dataall/assets/132444646/e47f9ffa-6692-4ae2-b81b-babaea642ebb">


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
A way to change the deployment pipeline based on the configuration file
- Ability to disable features via a single configuration
- Define feature boundaries for
  - AWS token/console actions on both dataset & env
  - File uploads/actions
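
A feature toggle driven by a single configuration could be checked along these lines. The JSON layout, the feature names and `is_feature_enabled` are assumptions made for this sketch; the real `config.json` schema may differ.

```python
import json

# Hypothetical configuration; the real config.json schema may differ.
CONFIG_TEXT = """
{
  "modules": {
    "datasets": {
      "active": true,
      "features": {"file_uploads": false, "aws_actions": true}
    },
    "worksheets": {"active": false}
  }
}
"""


def is_feature_enabled(config: dict, module: str, feature: str) -> bool:
    """A feature is on only if its module is active and the flag is not disabled.

    Features that are not listed are treated as enabled (an assumption).
    """
    module_config = config.get("modules", {}).get(module, {})
    if not module_config.get("active", False):
        return False
    return bool(module_config.get("features", {}).get(feature, True))
```

A resolver or frontend view can then ask `is_feature_enabled(json.loads(CONFIG_TEXT), "datasets", "file_uploads")` before exposing the action.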
AWS calls should be made only from clients (or old handlers).
Some handlers/clients were found that had not been migrated during the
modularization. A couple of AWS calls have been moved from the
resolvers/tasks to clients
There is a new permission checker API in the modularization branch. It
allows us to get away from passing unused variables into the methods.
The previous permission checker used a certain signature that expected
special parameters passed in a certain order. Most of the time, though,
those parameters were not used in the method and were only
needed for the permission checker decorator.
The other issue with the previous permission checker is that it forced
us to pass request data as a dictionary, e.g. in the resolver we
had some parameters and had to put them into a dictionary and unpack them
again in the method. It was not memory efficient.
All in all, the new permission checker doesn't require following the previous
conventions, which will allow us to write a shorter and more concise API,
utilize type checking, and avoid unnecessary memory allocation during
object creation.


This PR is waiting for another PR to be merged.


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
Co-authored-by: kukushking <kukushkin.anton@gmail.com>
Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com>
Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com>
Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
The modularization of the core part.
It contains the modularization of the core features and moving the rest
into base.
A core feature is a feature we expect to operate in data.all
constantly (there is no way to turn it off)
Base is code that can't be extracted as a feature. Most likely, it's
functional code like establishing a database connection, OpenSearch
indexing, etc.

The core features:
```
activity
catalog
cognito_groups
environment
feed
notifications
organizations
permissions
stacks
tasks
vote
vpc
```
This PR is mostly about restructuring.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
Co-authored-by: kukushking <kukushkin.anton@gmail.com>
Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com>
Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com>
Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
Co-authored-by: Balint David <dbalintx@amazon.com>
Merge changes from main into modularization-main. 

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
Co-authored-by: kukushking <kukushkin.anton@gmail.com>
Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com>
Co-authored-by: chamcca <40579012+chamcca@users.noreply.github.com>
Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
Co-authored-by: Dhruba <117375130+marjet26@users.noreply.github.com>
Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Co-authored-by: Srinivas Reddy <srinivasreddych@outlook.com>
Co-authored-by: Balint David <dbalintx@amazon.com>
Co-authored-by: mourya-33 <134511711+mourya-33@users.noreply.github.com>
Resolving inconsistencies that were introduced during modularization.


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
PR to fix issues on modularization-main

- removing duplicated SSM parameter from container stack (issue came
from merging of main)
- adding missing imports


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Nikita Podshivalov <nikpodsh@amazon.com>
Co-authored-by: dlpzx <dlpzx@amazon.com>
dlpzx and others added 19 commits August 8, 2023 13:26
### Feature or Bugfix
- Refactoring

### Detail
- move environment CFN custom resources from `base` to
`core/environment`
- fix typo in dockerfile mkdir for jars of profiling job

### Relates


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Feature and refactoring

### Detail

Implement a class to add IAM Policy statements to the pivotRole policies
from the modules. The additional pivot role policies needed for each
module can be defined inside the module code as in the following
picture:
<img width="1264" alt="image"
src="https://github.com/awslabs/aws-dataall/assets/71252798/d82d2411-7639-494b-9783-f3e17a0d633d">

Policies that are needed independently from the modules are defined in
`dataall/core/environment/cdk/pivot_role_core_policies`.

I moved all permissions, including those for datasets and data_sharing
inside their modules.
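
As a rough illustration of modules contributing their own statements, the sketch below collects statements from subclasses of a common base class. The class names are hypothetical, and plain dictionaries in IAM JSON form are used instead of CDK constructs to keep the example self-contained; it does not reproduce the class implemented in this PR.

```python
from abc import ABC, abstractmethod


class PivotRoleStatements(ABC):
    """Hypothetical base class; each module subclasses it to contribute statements."""

    @abstractmethod
    def statements(self) -> list:
        ...

    @classmethod
    def collect(cls) -> list:
        """Gather the statements of every direct subclass defined so far."""
        collected = []
        for subclass in cls.__subclasses__():
            collected.extend(subclass().statements())
        return collected


class CorePolicies(PivotRoleStatements):
    def statements(self) -> list:
        return [{"Effect": "Allow", "Action": ["sts:GetCallerIdentity"], "Resource": "*"}]


class DatasetPolicies(PivotRoleStatements):
    def statements(self) -> list:
        return [{"Effect": "Allow", "Action": ["glue:GetTable"], "Resource": "*"}]
```

Only subclasses that have been imported are visible to `collect()`, so a module that is not loaded contributes nothing to the role.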

In addition:
- Split MLStudio and Notebooks pivot role permissions
- Renamed policies added to the environment team roles and moved them to
`core/environment/cdk`

### Questions
- Some modules share permissions so there is some code-duplication
(around 10 lines). In my opinion it is minimal, and it is not worth
mixing module permissions, but I wanted to bring it up.
- With this PR we can only update based on modules the
`dataallPivotRole-cdk` that is created as part of the environment stack
when auto-create is set to true. Should we implement something for
manually created pivot roles? These are roles that are manually deployed
in the environment account and cannot be handled by data.all.


### Relates
- data-dot-all#610 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
1) Created Workflow for `modularization-main`. There were many issues
with linting and testing, so it should be automated
2) The ML Studio extension shouldn't be deployed if ML Studio is
not enabled for the env
3) Fix of wrong env and dataset in share manager

### Feature or Bugfix
- Refactoring

### Detail
- moved dataset related custom resources to `Dataset` module
- created EnvironmentExtension for the custom resources
- fix one small bug in creation of datasets - the KMS alias name
defaulted to `sse-s3` for created datasets, while it should be the
bucketName

### Relates


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
There was a problem if we wanted to disable all modules at once:
A GraphQL union type cannot be empty. We resolve all unions
programmatically depending on whether a module is present or not. Some
unions consist of module-related parts only and, when all modules are
disabled, the application fails to start.
To fix the issue, we could try not to create the unions at all, but the rest
of the schema depends on them. To solve the issue, the following modules have
been introduced: `feeds`, `catalog` and `vote`. Those are not standalone
modules, but just dependencies; hence, if no module using them is present,
they won't be loaded
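
The idea can be sketched as follows. The function names and the `requires` mapping are hypothetical and only illustrate why a dependency module (and the union it defines) is skipped when no active module needs it.

```python
def dependencies_to_load(active_modules: set, requires: dict) -> set:
    """Return the dependency modules needed by at least one active module."""
    needed = set()
    for module in active_modules:
        needed.update(requires.get(module, ()))
    return needed


def render_union(name: str, members: list) -> str:
    """Render a union in GraphQL schema definition language.

    A union must list at least one member type, so an empty list is rejected.
    """
    if not members:
        raise ValueError(f"union {name} must have at least one member type")
    return f"union {name} = " + " | ".join(members)
```

With every module disabled, `dependencies_to_load` returns an empty set, so the dependency that would have defined an empty union is never loaded and the schema stays valid.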

Also, removed the feeds from worksheets
Merge latest changes from main into modularization-main

It includes changes from data-dot-all#626, data-dot-all#630, data-dot-all#648, data-dot-all#649, and data-dot-all#651

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
Co-authored-by: wolanlu <101870655+wolanlu@users.noreply.github.com>
Co-authored-by: Amr Saber <amr.m.saber.mail@gmail.com>
Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Co-authored-by: kukushking <kukushkin.anton@gmail.com>
Co-authored-by: Dariusz Osiennik <osiend@amazon.com>
Co-authored-by: Dennis Goldner <107395339+degoldner@users.noreply.github.com>
Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it>
Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com>
Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com>
Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com>
Co-authored-by: chamcca <40579012+chamcca@users.noreply.github.com>
Co-authored-by: Dhruba <117375130+marjet26@users.noreply.github.com>
Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
Co-authored-by: Srinivas Reddy <srinivasreddych@outlook.com>
Co-authored-by: mourya-33 <134511711+mourya-33@users.noreply.github.com>
Co-authored-by: Noah Paige <noahpaig@amazon.com>
Co-authored-by: dlpzx <dlpzx@amazon.com>
### Feature or Bugfix
<!-- please choose -->
- Feature

### Detail
- Change cdk context CloudFront feature flag in `cdk.json` from
`"@aws-cdk/aws-cloudfront:defaultSecurityPolicyTLSv1.2_2021": false` to
`true`


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
…all#665)

### Feature or Bugfix
<!-- please choose -->
- Merge request

### Detail
- Merge modularization-main into modularization/frontend 

### Relates
- data-dot-all#398 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Amr Saber <amr.m.saber.mail@gmail.com>
Co-authored-by: Raul Cajias <cajias@users.noreply.github.com>
Co-authored-by: nikpodsh <124577300+nikpodsh@users.noreply.github.com>
Co-authored-by: Maryam Khidir <maryamolaide95@gmail.com>
Co-authored-by: Nikita Podshivalov <nikpodsh@amazon.com>
In this PR, the tests have been refactored, rearranged, and
de-duplicated.
Tests should have a layout similar to the backend.
This PR depends on the changes in data-dot-all#643

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
Co-authored-by: wolanlu <101870655+wolanlu@users.noreply.github.com>
Co-authored-by: Amr Saber <amr.m.saber.mail@gmail.com>
Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Co-authored-by: kukushking <kukushkin.anton@gmail.com>
Co-authored-by: Dariusz Osiennik <osiend@amazon.com>
Co-authored-by: Dennis Goldner <107395339+degoldner@users.noreply.github.com>
Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it>
Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com>
Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com>
Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com>
Co-authored-by: chamcca <40579012+chamcca@users.noreply.github.com>
Co-authored-by: Dhruba <117375130+marjet26@users.noreply.github.com>
Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
Co-authored-by: Srinivas Reddy <srinivasreddych@outlook.com>
Co-authored-by: mourya-33 <134511711+mourya-33@users.noreply.github.com>
Co-authored-by: Noah Paige <noahpaig@amazon.com>
Co-authored-by: dlpzx <dlpzx@amazon.com>
The corresponding module interface for a dependency should be loaded
locally

- Bugfix

There is an issue with the loading of modules for the cdkproxy task. The
issue appears because dependencies are loaded in the module scope, but
loading should be local to each `ModuleInterface`.

For instance, when there is `from dataall.modules.catalog import api` in
`dataall.modules.datasets.__init__`, the Catalog API is loaded for
every dataset module interface.
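
A minimal sketch of the idea, not the actual data.all code: the dependency is pulled in inside the interface that needs it rather than at package import time. `load_dependency`, `LOADED`, and the two interface classes are hypothetical stand-ins for illustration; only the `ModuleInterface` name comes from the description above.

```python
# Hypothetical illustration: a module-scope import runs for every load mode,
# while an import placed inside one interface runs only when that interface is used.

LOADED = []  # records which (hypothetical) dependencies were imported


def load_dependency(name):
    """Stand-in for a real import such as `from dataall.modules.catalog import api`."""
    LOADED.append(name)


class ModuleInterface:
    """Hypothetical base class: one subclass per load mode of a module."""


class DatasetApiInterface(ModuleInterface):
    def __init__(self):
        # Local to this interface: only the API load mode needs the catalog API.
        load_dependency("catalog.api")


class DatasetCdkProxyInterface(ModuleInterface):
    def __init__(self):
        # The cdkproxy task does not need the catalog API, so nothing is loaded.
        pass


DatasetCdkProxyInterface()
loaded_after_cdkproxy = list(LOADED)
DatasetApiInterface()
loaded_after_api = list(LOADED)
```

With the import kept local, creating the cdkproxy interface leaves `LOADED` empty, and only the API interface triggers the catalog import.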



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Refactoring

### Detail
Make naming inside the `db` package consistent. Always use `-models` or
`repositories` suffix.

### Relates
- 2.0

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Nikita Podshivalov <nikpodsh@amazon.com>
Co-authored-by: nikpodsh <124577300+nikpodsh@users.noreply.github.com>
### Feature or Bugfix
<!-- please choose -->
- Bugfix


### Detail
- Add `AdministratorPolicy` as default cfn execution policy when
bootstrapping linked envs

### Relates

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: dlpzx <dlpzx@amazon.com>
### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Add `BatchCreate` and `BatchDelete` glue permissions to the dataset
IAM role
  - Needed by Glue Crawler to add tables/partitions
- Add checks to environment resources before deleting environments or
env groups (WIP)

### Relates
- data-dot-all#670 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
<!-- please choose -->
- Feature

### Detail
- module enablement for `datasets`, `catalog`, and `glossaries`

### Relates
- data-dot-all#398 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
### Feature or Bugfix
- Bugfix

### Detail
Add `resource_prefix` to Albfront class initialization

### Relates
Solve data-dot-all#680 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

N/A I do not introduce any major changes.

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard, proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
Merge main -> modularization-main

### Security
N/A. It's a merge PR.




By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
Co-authored-by: wolanlu <101870655+wolanlu@users.noreply.github.com>
Co-authored-by: Amr Saber <amr.m.saber.mail@gmail.com>
Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Co-authored-by: kukushking <kukushkin.anton@gmail.com>
Co-authored-by: Dariusz Osiennik <osiend@amazon.com>
Co-authored-by: Dennis Goldner <107395339+degoldner@users.noreply.github.com>
Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it>
Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com>
Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com>
Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com>
Co-authored-by: chamcca <40579012+chamcca@users.noreply.github.com>
Co-authored-by: Dhruba <117375130+marjet26@users.noreply.github.com>
Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
Co-authored-by: Srinivas Reddy <srinivasreddych@outlook.com>
Co-authored-by: mourya-33 <134511711+mourya-33@users.noreply.github.com>
Co-authored-by: Noah Paige <noahpaig@amazon.com>
Co-authored-by: dlpzx <dlpzx@amazon.com>
Co-authored-by: Jorge Iglesias Garcia <44316552+jorgeig-space@users.noreply.github.com>
Co-authored-by: Jorge Iglesias Garcia <jorgeig@amazon.de>
@nikpodsh nikpodsh changed the base branch from main to main-v2 August 22, 2023 08:36
@nikpodsh nikpodsh marked this pull request as ready for review August 22, 2023 08:36
@nikpodsh nikpodsh merged commit 568a8fe into data-dot-all:main-v2 Aug 22, 2023
noah-paige added a commit that referenced this pull request Jun 25, 2024
commit 4cb33324 
Author: Admin/noahpaig-Isengard <Admin/noahpaig-Isengard> 
Date: Mon Sep 11 2023 12:12:11 GMT-0400 (Eastern Daylight Time) 

    Conflicts resolved in the console.

commit 9e9b0e9 
Author: dlpzx <71252798+dlpzx@users.noreply.github.com> 
Date: Mon Sep 11 2023 10:57:24 GMT-0400 (Eastern Daylight Time) 

    Merge branch 'main' into main-v2

commit 0eefd78 
Author: dlpzx <dlpzx@amazon.com> 
Date: Mon Sep 11 2023 10:55:04 GMT-0400 (Eastern Daylight Time) 

    Upgrade css-tools and revert format of yarn.lock


commit 88acc19 
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> 
Date: Mon Sep 11 2023 08:56:30 GMT-0400 (Eastern Daylight Time) 

    Modularization/fix glossary associations (#742)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Unable to load Glossary Term Associations for `DataStorageLocation`
  - Pass the correct target_type of `Folder` when setting Glossary Term
Links

- Unable to load Glossary Term Associations when either the Dashboards or
Datasets module is disabled
  - We have inline fragments on the `getGlossary` graphQL API that will
throw an error if the ObjectType is not defined (i.e. when Dashboards or
Datasets are disabled)
  - Since we only require a common field returned by any of the Term
Association Objects (we only need the `label` field), we can remove the
inline fragments so there is no need to have object types defined when
loading the glossary term associations
  - Also cleaned up some of the other API Get calls on glossary that
return additional information that is unused

**NOTE: This PR is branched off of #731 and should be merged ONLY AFTER
the prior is merged to `main-v2`**

### Security
NA

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: dlpzx <dlpzx@amazon.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>

commit 9158691 
Author: dlpzx <71252798+dlpzx@users.noreply.github.com> 
Date: Mon Sep 11 2023 01:34:04 GMT-0400 (Eastern Daylight Time) 

    DA v2: fix glossaries permissions and refactor catalog module (#731)

### Feature or Bugfix
- Bugfix
- Refactoring

### Detail
- moved glossaries permissions from core to module.catalog
- refactored catalog module to follow services, resolvers, db layer
design
- fix list_term_associations
- remove Columns from glossaries registry --> they cannot be tagged
- clean up unused code in glossaries

### Relates
V2.0.0 release

### Security
`N/A`


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 75171c0 
Author: nikpodsh <124577300+nikpodsh@users.noreply.github.com> 
Date: Fri Sep 08 2023 09:44:08 GMT-0400 (Eastern Daylight Time) 

    Add dependency for Worksheets (#740)

### Feature or Bugfix
Added datasets as dependency for worksheets 


### Security
N/A



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 8d95241 
Author: dlpzx <71252798+dlpzx@users.noreply.github.com> 
Date: Fri Sep 08 2023 05:17:08 GMT-0400 (Eastern Daylight Time) 

    DA v2: Modularized structure of feeds and votes (#735)

### Feature or Bugfix
- Refactoring

### Detail
- Adjust `feed` module to the new code layer design
- Adjust `vote` module to the new code layer design
- Clean up unused code

### Relates
- V2.0

### Security
`N/A`


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit a4b831e 
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> 
Date: Thu Sep 07 2023 12:43:38 GMT-0400 (Eastern Daylight Time) 

    DA v2: Fix Handling of Delete Dataset Resources (#737)

### Feature or Bugfix
<!-- please choose -->
- Bugfix in V2.0 data.all code


### Detail
- When deleting a dataset, we were redefining the `uri` variable used
to track the datasetUri. This caused errors with:
  - Folders to not be removed from opensearch catalog
  - Deletion of outstanding shares with no shared items on the dataset (if
they exist)
  - Deletion of Glossary Term Links on the Dataset or Dataset Tables (if
they exist)
  - Deletion of Resource Policies on the Dataset
  - Deletion of the Dataset CloudFormation Stack (if enabled)

- The above handling of deletion should be resolved as of this PR
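
The class of bug described above can be reproduced in a few lines. This is a simplified sketch with hypothetical function names, not the code from the dataset service: a loop variable named `uri` rebinds the parameter that holds the dataset URI, so every later cleanup step receives the wrong identifier.

```python
def delete_dataset_buggy(uri, folder_uris, removed):
    """Hypothetical reproduction: the loop rebinds `uri`."""
    for uri in folder_uris:  # bug: overwrites the dataset URI
        removed.append(("folder", uri))
    # `uri` now holds the last folder URI, not the dataset URI
    removed.append(("dataset", uri))


def delete_dataset_fixed(uri, folder_uris, removed):
    """Same flow with a distinct loop variable, so `uri` stays intact."""
    for folder_uri in folder_uris:
        removed.append(("folder", folder_uri))
    removed.append(("dataset", uri))


buggy, fixed = [], []
delete_dataset_buggy("dataset-1", ["folder-1", "folder-2"], buggy)
delete_dataset_fixed("dataset-1", ["folder-1", "folder-2"], fixed)
```

In the buggy variant the final step targets `folder-2` instead of `dataset-1`, which is the kind of mismatch that would make the later deletion steps miss their resources.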


### Relates
- #733 

### Security
NA

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit b43a247 
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> 
Date: Thu Sep 07 2023 09:21:39 GMT-0400 (Eastern Daylight Time) 

    Fix/cdkpipeline local deploy (#730)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Fix bug with the creation of CDK Pipelines in local deployments. The
error was caused by:
  - Not registering `ImportMode.CDK_CLI_EXTENSION`, since `load_modules()`
for this import mode type was only being called for AWS Deployments
  - Incorrect path specified for the location of the cdk/ddk app created
for CDK Pipelines in local deployments

### Relates


### Security
NA

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 51890cb 
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> 
Date: Wed Sep 06 2023 13:08:11 GMT-0400 (Eastern Daylight Time) 

    DA v2: Fix Get Query on Chat/Feed Feature  (#729)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- If one of the dataset, dashboard, or datapipeline modules was disabled,
the query to get Feed (i.e. the Chat Feature) would fail with an error
similar to:
  - `[GraphQL error]: Message: Unknown type 'DataPipeline'. Did you mean
'DatasetLink'?, Location: [object Object], Path: undefined`
- This is due to the fixed inline fragments on the `getFeed(...)` query,
such as `... on DatasetTable {...}`, which reference types that do not
exist when the corresponding module is disabled
   
- In this PR we remove the fragments as the `label` return value is not
used in the `FeedComments` View


### Relates
- [#723](#723)



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit fe3ed03 
Author: Mo <136477937+itsmo-amzn@users.noreply.github.com> 
Date: Wed Sep 06 2023 05:53:00 GMT-0400 (Eastern Daylight Time) 

    Fix frontend issues (#727)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- disabled `datasets` tab in `environment` view when `datasets` module
is disabled
- disabled module list item in environment features when module is
disabled
- fix issue with landing route when `catalog` module is disabled
- `worksheets` tab enablement rules

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 0c22580 
Author: dlpzx <71252798+dlpzx@users.noreply.github.com> 
Date: Tue Sep 05 2023 11:09:46 GMT-0400 (Eastern Daylight Time) 

    DA v2: fix default permissions and update migrations scripts (#728)

### Feature or Bugfix
- Bugfix

### Detail
There are some issues with the permissions that appear in the invitation
request

Original:


![image](https://github.com/awslabs/aws-dataall/assets/71252798/3b4f409c-e9f4-4bc7-9f4b-123e4c4d0f0b)

Fresh deployment (with mlStudio module disabled):

<img width="500" alt="image"
src="https://github.com/awslabs/aws-dataall/assets/71252798/996aae76-3069-4562-9baf-d7255d009650">

With a pre-existing deployment:


![image](https://github.com/awslabs/aws-dataall/assets/71252798/e3b82bb8-7431-4dda-90d4-7f632c4fcbbb)


old deployment:
```
	Invite other teams
	Add consumption roles
	create networks
	create pipelines
	create notebooks
	Request datasets access
	create datasets
	create redshift ---> removed! already done
	create ML Studio ---> renamed! already done
```

The following are new or wrong permissions in fresh deployments /
backwards
```
	List datasets on this environment / LIST_ENVIRONMENT_DATASETS
	Run athena queries / RUN_ATHENA_QUERY
	List datasets shared with this environments (TYPO!) / LIST_ENVIRONMENT_SHARED_WITH_OBJECTS??
	nothing (mlstudio disabled) / LIST_ENVIRONMENT_SGMSTUDIO_USERS
```

This PR includes 3 fixes:
1) `RUN_ATHENA_QUERY`: Good to add, but it was not there before, so we need
to update the description in the migration scripts

2) `LIST_ENVIRONMENT_DATASETS`, `LIST_ENVIRONMENT_SHARED_WITH_OBJECTS`:
they are needed by default, so we can add a new list
`ENVIRONMENT_INVITED_DEFAULT` and add them there instead of adding them
in the list that is used for the toggle menu.

3) `LIST_ENVIRONMENT_SGMSTUDIO_USERS`: this permission is not used, so we
just need to remove it
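
As a rough sketch of fix 2, assuming the permissions are kept as plain lists of names: the default permissions live in their own list and are always granted, while only the toggleable list is offered in the invitation menu. `TOGGLEABLE_PERMISSIONS`, `CREATE_DATASET`, and `permissions_for_invited_group` are hypothetical names for illustration; `ENVIRONMENT_INVITED_DEFAULT` and the three permission names are taken from the text above.

```python
# Hypothetical list shown in the toggle menu of the invitation request.
TOGGLEABLE_PERMISSIONS = ["CREATE_DATASET", "RUN_ATHENA_QUERY"]

# Always granted to invited groups, never shown as a toggle.
ENVIRONMENT_INVITED_DEFAULT = [
    "LIST_ENVIRONMENT_DATASETS",
    "LIST_ENVIRONMENT_SHARED_WITH_OBJECTS",
]


def permissions_for_invited_group(selected):
    """Combine the toggled permissions with the defaults (hypothetical helper)."""
    unknown = set(selected) - set(TOGGLEABLE_PERMISSIONS)
    if unknown:
        raise ValueError(f"Unknown permissions: {sorted(unknown)}")
    return sorted(set(selected) | set(ENVIRONMENT_INVITED_DEFAULT))


granted = permissions_for_invited_group(["RUN_ATHENA_QUERY"])
```

Keeping the defaults out of the toggle list means an invited group can always list datasets and shares, regardless of what the inviter selects.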

In addition some permissions have been renamed. I used the
`migrations/versions/4a0618805341_rename_sgm_studio_permissions.py`
script as it already handles renames

Testing:
[X] - Local testing of renaming and descriptions
[X] - AWS testing of the permissions that appear on screen
[ ] - AWS testing with an invited group - check that they can list
datasets and shares in environment

This is the end result for a deployment with the dashboards modules
disabled:

![image](https://github.com/awslabs/aws-dataall/assets/71252798/b2222c1f-6d37-447d-bee5-e5d8521ff145)


### Relates
- V2

### Security
`N/A`


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 12db96a 
Author: nikpodsh <124577300+nikpodsh@users.noreply.github.com> 
Date: Tue Sep 05 2023 08:18:52 GMT-0400 (Eastern Daylight Time) 

    Add conditions on catalog indexer tasks (#725)

The catalog indexer fails if all modules are disabled. Here is a small
improvement to fix it

Checked against:
1)  all disabled
2)  all enabled
3) datasets enabled only
4) dashboards enabled only
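
The shape of such a condition might look like the sketch below. It is an assumption about the approach, not the code from this PR: `indexer_tasks`, `run_catalog_indexer`, and the task names are hypothetical, and the set of indexable modules is taken only from the cases listed above.

```python
INDEXABLE_MODULES = ("datasets", "dashboards")  # assumption based on the cases checked above


def indexer_tasks(enabled_modules):
    """Return the (hypothetical) indexing tasks for the enabled modules."""
    return [f"index_{name}" for name in INDEXABLE_MODULES if name in enabled_modules]


def run_catalog_indexer(enabled_modules):
    """Skip the run entirely when no indexable module is enabled."""
    tasks = indexer_tasks(enabled_modules)
    if not tasks:
        return "skipped"
    return f"ran {len(tasks)} task(s)"


results = {
    "all disabled": run_catalog_indexer(set()),
    "all enabled": run_catalog_indexer({"datasets", "dashboards"}),
    "datasets only": run_catalog_indexer({"datasets"}),
    "dashboards only": run_catalog_indexer({"dashboards"}),
}
```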

### Security
`N/A`. It's a small improvement for disabling/enabling tasks based on
modules. Based on
[OWASP 10](https://owasp.org/Top10/en/).



By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 2295e51 
Author: dlpzx <71252798+dlpzx@users.noreply.github.com> 
Date: Mon Sep 04 2023 09:03:18 GMT-0400 (Eastern Daylight Time) 

    fix: pin version of npm@9 for VPC facing architecture (#724)

### Feature or Bugfix
- Bugfix

### Detail
In the [latest release of
npm](https://github.com/npm/cli/releases/tag/v10.0.0) 4 days ago, node
16 is no longer supported, which leads to failure of the frontend image
build when using the VPC-facing architecture. In this PR the version of
npm is pinned to the previous version. It solves the issue and it is a
quick fix, but we need to upgrade to node 18. There are GitHub issues (
#655 #38 ...) dedicated to the node version upgrade, which should be
done in a separate PR.

I tried upgrading the version of node directly in the docker image, but
it led to (different) deployment errors.

### Relates


### Security
N/A for all

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit fb7632d 
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> 
Date: Fri Sep 01 2023 09:43:56 GMT-0400 (Eastern Daylight Time) 

    DA V2: Fix Environment Stack CDKProxy Register by default (#717)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- When all modules are disabled (or all modules that depend on
EnvironmentSetup for their CDK Module Interface) then the Environment
Stack is not imported properly or registered as a stack in the
StackManager

- This PR adds EnvironmentSetup to the `__init__.py` file of
`base/cdkproxy/stacks/` so the Environment stack will always be registered
by default
- It also separates `mlstudio_stack` and `mlstudio_extension` to avoid
circular dependencies when importing `EnvironmentSetup` and `stack` in
the same file






### Security
NA


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit bf77f7c 
Author: dlpzx <71252798+dlpzx@users.noreply.github.com> 
Date: Fri Sep 01 2023 04:45:00 GMT-0400 (Eastern Daylight Time) 

    DA v2: fix issues with modularization pipelines (#705)

### Feature or Bugfix
- Refactoring
- The datapipelines module is not following the design guidelines: `api`
= validation of input, `services` = permission checks and business logic
- There is a lot of unused code

### Detail
- move business logic to `services`
- implement validation of parameters in `api`
- clean up unused code in `datapipelines`: cat, ls and the `aws` clients
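
To illustrate the layer split described above, here is a minimal sketch under stated assumptions. All names (`RequiredParameter`, `DataPipelineService`, `create_pipeline`, the `context` dictionary, and the input fields) are hypothetical and chosen for the example; the real resolvers and services may look different.

```python
class RequiredParameter(ValueError):
    """Raised by the api layer when a mandatory input field is missing."""


class DataPipelineService:
    """services layer: permission checks and business logic (hypothetical)."""

    @staticmethod
    def create_pipeline(username, groups, data):
        # Permission check lives in the service, not in the resolver.
        if data["SamlGroupName"] not in groups:
            raise PermissionError(f"{username} is not a member of {data['SamlGroupName']}")
        # Business logic would go here; the sketch just returns a record.
        return {"label": data["label"], "owner": data["SamlGroupName"]}


def create_pipeline(context, input=None):
    """api layer (resolver): validate the input, then delegate to the service."""
    if not input:
        raise RequiredParameter("input")
    for field in ("label", "SamlGroupName"):
        if not input.get(field):
            raise RequiredParameter(field)
    return DataPipelineService.create_pipeline(context["username"], context["groups"], input)


context = {"username": "alice", "groups": ["team-a"]}
pipeline = create_pipeline(context, {"label": "etl", "SamlGroupName": "team-a"})
```

The resolver never touches permissions and the service never re-validates the shape of the input, which keeps each layer testable on its own.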

### Relates

### Security
N/A


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Balint David <dbalintx@amazon.com>

commit af860f6 
Author: dbalintx <132444646+dbalintx@users.noreply.github.com> 
Date: Thu Aug 31 2023 12:16:17 GMT-0400 (Eastern Daylight Time) 

    Fix canary user pw creation (#718)

In rare cases the current way of generating a Canary user password in
Cognito can produce a string containing no numeric characters, in which
case the following error is thrown during deployment (requires the
**enable_cw_canaries** config parameter set to True in cdk.json):
`botocore.errorfactory.InvalidPasswordException: An error occurred
(InvalidPasswordException) when calling the AdminCreateUser operation:
Password did not conform with password policy: Password must have
numeric characters`

This PR changes the password creation so that the result always contains
at least 1 uppercase and 1 numeric character.
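
One way to guarantee the required character classes is sketched below. It is an illustration only, not the code from this PR: the function name `generate_canary_password` and the default length are assumptions, and a real Cognito password policy may demand further classes (lowercase, symbols) that this sketch does not enforce.

```python
import secrets
import string


def generate_canary_password(length: int = 16) -> str:
    """Return a random password with at least one uppercase letter and one digit."""
    if length < 2:
        raise ValueError("length must be at least 2")
    # Pick the guaranteed characters first ...
    required = [
        secrets.choice(string.ascii_uppercase),
        secrets.choice(string.digits),
    ]
    # ... then fill the rest from the full alphabet.
    alphabet = string.ascii_letters + string.digits
    rest = [secrets.choice(alphabet) for _ in range(length - len(required))]
    chars = required + rest
    # Shuffle so the guaranteed characters are not always at the front.
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)
```

Drawing the mandatory characters explicitly avoids the rare failure described above, where a purely random string happens to contain no digit.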


### Feature or Bugfix
- Bugfix


### Detail
- <feature1 or bug1>
- <feature2 or bug2>

### Relates
- <URL or Ticket>

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

n/a

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 43eb082 
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> 
Date: Wed Aug 30 2023 09:07:15 GMT-0400 (Eastern Daylight Time) 

    Add env URI to role share query (#706)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- If I have Group_1 that is invited to Env1 and Env2 and has existing
shares for that group to Env1 - I still cannot remove the EnvGroup from
Env2 (where no shares exist)
- When we count resources to validate before removing group - for share
objects in the `count_principal_resources()` we only filter by
principalId and principalType but we also need to filter by
environmentUri

- This PR fixes query to add environmentURI filter when counting
envGroup resources before removing groups
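
The effect of the missing filter can be shown with a small in-memory sketch. The `ShareObject` dataclass and both counting helpers below are hypothetical stand-ins; the real `count_principal_resources()` runs a database query whose exact shape is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShareObject:
    # Hypothetical, simplified share record.
    principalId: str
    principalType: str
    environmentUri: str


def count_without_env_filter(shares, principal_id, principal_type):
    # Behaviour before the fix: shares in *any* environment are counted.
    return sum(
        1 for s in shares
        if s.principalId == principal_id and s.principalType == principal_type
    )


def count_with_env_filter(shares, principal_id, principal_type, environment_uri):
    # Behaviour after the fix: only shares in the given environment count.
    return sum(
        1 for s in shares
        if s.principalId == principal_id
        and s.principalType == principal_type
        and s.environmentUri == environment_uri
    )


shares = [ShareObject("Group_1", "Group", "Env1")]
blocked_before = count_without_env_filter(shares, "Group_1", "Group") > 0
blocked_after = count_with_env_filter(shares, "Group_1", "Group", "Env2") > 0
```

In this toy data the unfiltered count blocks removal of Group_1 from Env2 even though its only share lives in Env1, while the filtered count does not.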


### Relates

### Security
NA 
```
Please answer the questions below briefly where applicable, or write `N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this includes
fetching data from storage outside the application (e.g. a database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?
```

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 4ced415 
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> 
Date: Tue Aug 29 2023 14:49:18 GMT-0400 (Eastern Daylight Time) 

    DA v2: Fix Params for RAM Invitations Table Sharing (#713)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- LF cross-account table sharing was originally unable to find the
correct RAM Share Invitation because it was filtering on incorrect
`Sender` and `Receiver` Accounts
- Typically led to an `Insufficient Glue Permissions on GrantPermissions
Operation` error when attempting to share a table in v2 data.all
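
As a rough sketch of the kind of filtering involved, the helper below selects a pending invitation by sender and receiver account. The helper name is hypothetical, the dictionary keys (`senderAccountId`, `receiverAccountId`, `status`) follow the invitation shape returned by the AWS RAM `GetResourceShareInvitations` API, and the assumption that the sender is the account owning the share and the receiver the account accepting it is mine, not stated in this PR.

```python
from typing import Optional


def find_pending_invitation(
    invitations: list, sender_account: str, receiver_account: str
) -> Optional[dict]:
    """Return the first PENDING invitation sent by sender_account to receiver_account."""
    for invitation in invitations:
        if (
            invitation.get("senderAccountId") == sender_account
            and invitation.get("receiverAccountId") == receiver_account
            and invitation.get("status") == "PENDING"
        ):
            return invitation
    return None
```

Swapping the two account arguments makes the lookup return `None`, which matches the symptom described above: the invitation is never found, so it is never accepted.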

### Relates


### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

NA

```
- Does this PR introduce or modify any input fields or queries - this includes
fetching data from storage outside the application (e.g. a database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?
```


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit ad53c33 
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> 
Date: Mon Aug 28 2023 03:15:25 GMT-0400 (Eastern Daylight Time) 

    DA v2: Dataset IAM Role Default DB Glue Permissions (#701)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- In the case where the `default` Glue Database does not already exist
- Glue Script is failing with "DATASET IAM ROLE is not authorized to
perform: glue:CreateDatabase on resource XXX"

- Root Cause
- The Glue Job needs to verify that the default database exists and if
it does not it needs to create the default DB
- Adding the glue:CreateDatabase permissions on the catalog and only for
the default DB resolves the issue

- POTENTIAL ISSUE:
- In the Custom Resource of Dataset Stack we add permission for the
dataset IAM Role to the default DB if it exists

- In the scenario where default DB does not already exist and we create
Dataset1 and Dataset2
- Then Dataset1 Role creates default DB when running profiling job and
the Dataset2 IAM Role never gets proper LF Permissions
- On testing - I tried a Dataset Stack update but it determines "No
Change Set" on the Cloudformation Stack so the Custom Resource does not
re-run to add the required permissions
- A potential solution is to pass a randomly generated uuid as part of
the Custom Resource properties to trigger a run of the Lambda each time?
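
The idea floated in the last bullet could look roughly like the sketch below: a fresh UUID is added to the Custom Resource properties so that the template differs on every synth. This only illustrates the proposal; the function name and the `DeploymentNonce` key are hypothetical, and nothing here confirms that the approach was adopted.

```python
import uuid


def with_rerun_trigger(properties: dict) -> dict:
    """Return a copy of the Custom Resource properties with a fresh random value.

    Because the value changes on every call, CloudFormation would see a
    property change and invoke the Custom Resource Lambda again.
    """
    updated = dict(properties)  # do not mutate the caller's dict
    updated["DeploymentNonce"] = str(uuid.uuid4())  # hypothetical property name
    return updated
```

The trade-off is that the stack then never reports "No Change Set": the Lambda runs on every deployment, so it must be idempotent.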

### Relates
- <URL or Ticket>

### Security

Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: dlpzx <dlpzx@amazon.com>

commit e097d08 
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> 
Date: Thu Aug 24 2023 15:41:40 GMT-0400 (Eastern Daylight Time) 

    DA v2: Worksheets View need default export (#703)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Worksheet view is not loading with error:
```
Element type is invalid. Received a promise that resolves to: [missing argument]. Lazy element type must resolve to a class or function.[missing argument]
```

- From React
[Docs](https://legacy.reactjs.org/docs/code-splitting.html#reactlazy):
```
React.lazy takes a function that must call a dynamic import(). This must return a Promise which resolves to a module with a default export containing a React component.
```


### Relates


### Security
NA 
```
Please answer the questions below briefly where applicable, or write `N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this includes
fetching data from storage outside the application (e.g. a database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?
```

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 9156189 
Author: nikpodsh <124577300+nikpodsh@users.noreply.github.com> 
Date: Thu Aug 24 2023 04:45:43 GMT-0400 (Eastern Daylight Time) 

    Fix chats. (#698)

Fixed chats. Worksheets are no longer eligible as a chat entity.


Security `N/A`, just removing a few lines of frontend code. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 9ab933a 
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> 
Date: Thu Aug 24 2023 02:25:28 GMT-0400 (Eastern Daylight Time) 

    DA v2: List Datasets Pagination (#700)

### Feature or Bugfix
<!-- please choose -->
- Bugfix


### Detail
- When owning 3 data.all datasets with shares created and viewing the
Datasets Tab in the data.all UI --> There were multiple pages showing a
different number of datasets on each page (expected behavior to only
have 1 page in the datasets tab showing all 3 datasets since the default
page size is set to 10)

- This was happening because when we call the `listDatasets` API we are
returning a query that joins datasets with existing shares to ensure
that we list all datasets that a user either owns, stewards, or is
shared to

- But we did not filter the query for distinct datasetUris --> Meaning
if I own 1 dataset with 3 existing shares each on 1 table in the dataset
and I run the query to find all user datasets - it would return 4 records
to me instead of just 1 for the dataset I own
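
The row fan-out caused by joining datasets with shares can be reproduced with a toy SQLite schema. The tables, columns and helper below are assumptions made for illustration and do not mirror the real `listDatasets` query, so the exact duplicate count differs from the one described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dataset (datasetUri TEXT PRIMARY KEY, owner TEXT);
    CREATE TABLE share_item (shareItemUri TEXT PRIMARY KEY, datasetUri TEXT, principal TEXT);
    INSERT INTO dataset VALUES ('ds-1', 'alice');
    INSERT INTO share_item VALUES
        ('s-1', 'ds-1', 'team-a'),
        ('s-2', 'ds-1', 'team-b'),
        ('s-3', 'ds-1', 'team-c');
    """
)


def list_dataset_uris(connection, user, distinct):
    # Join datasets with shares so that owned and shared datasets are both found.
    select = "SELECT DISTINCT" if distinct else "SELECT"
    sql = f"""
        {select} d.datasetUri
        FROM dataset d
        LEFT JOIN share_item s ON s.datasetUri = d.datasetUri
        WHERE d.owner = ? OR s.principal = ?
    """
    return [row[0] for row in connection.execute(sql, (user, user)).fetchall()]


without_distinct = list_dataset_uris(conn, "alice", distinct=False)
with_distinct = list_dataset_uris(conn, "alice", distinct=True)
```

Without `DISTINCT` the single owned dataset appears once per joined share row, which inflates the total used for pagination; with `DISTINCT` it appears once.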


### Security

NA
```
Please answer the questions below briefly where applicable, or write `N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this includes
fetching data from storage outside the application (e.g. a database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?
```
By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 12d0596 
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> 
Date: Thu Aug 24 2023 02:24:07 GMT-0400 (Eastern Daylight Time) 

    DA v2: Fix Lakeformation Registered Location Check Dataset Stack (#699)

### Feature or Bugfix
<!-- please choose -->
- Bugfix


### Detail
- In v2 code, when creating/updating a dataset stack it would go back
and forth creating and deleting the `Lakeformation::Resource` construct
in Cloudformation which was registering/un-registering the resource
- This led to issues querying the data in Athena with a `Permissions
denied on S3 Path` error
- The root cause was when performing
`check_existing_lf_registered_location()` in the dataset stack synth we
were passing the wrong variable to resolve the AWS Account Id for the
role arn returned from the API call `lakeformation.describe_resource()`
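
A minimal sketch of resolving the account ID from a role ARN is shown below. The helper names are hypothetical, the response shape (`ResourceInfo` with a `RoleArn` key) follows the Lake Formation `DescribeResource` API, and the actual comparison made in the dataset stack is not shown in this PR, so treat the second function as an assumption about what such a check might look like.

```python
def account_id_from_arn(arn: str) -> str:
    """Extract the 12-digit account ID (fifth colon-separated field) from an ARN."""
    parts = arn.split(":")
    if len(parts) < 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    return parts[4]


def registered_by_account(describe_resource_response: dict, account_id: str) -> bool:
    """Return True if the registered location's role belongs to account_id."""
    role_arn = describe_resource_response.get("ResourceInfo", {}).get("RoleArn")
    if not role_arn:
        return False
    return account_id_from_arn(role_arn) == account_id
```

If the wrong variable is passed as the account ID, a check like this flips between synths, which would be consistent with the register/un-register loop described above.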


### Relates
- #672
- #671

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

NA

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 9a250e5 
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> 
Date: Wed Aug 23 2023 11:58:42 GMT-0400 (Eastern Daylight Time) 

    Fix Delete Env Validation on Consumption Role Resource (#693)

### Feature or Bugfix
<!-- please choose -->
- Bugfix


### Detail
- Resolve incorrect parameters passed when counting consumption role
resources before handling deletes of environments

### Relates

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 37c8cb7 
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> 
Date: Wed Aug 23 2023 11:58:25 GMT-0400 (Eastern Daylight Time) 

    re create pivot roles on upgrade to v2 (#696)

### Feature or Bugfix
<!-- please choose -->
- Bugfix

### Detail
- Change name of pivot role policies created via CDK to handle upgrades
from v1.6.2 to v2.X
- Originally throwing an error on updates to the environment stack:
`res-pivotrole-cdk-policy-2 already exists in stack
arn:aws:cloudformation:eu-west-1:xxxxxxxx:stack/PIVOTROLESTACKNAME`

- Add permissions required by Custom CDK Exec Role to create policy
versions for stack upgrades



### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit b9b64d1 
Author: nikpodsh <124577300+nikpodsh@users.noreply.github.com> 
Date: Wed Aug 23 2023 11:56:41 GMT-0400 (Eastern Daylight Time) 

    Fix auth issue happening on the frontend (#689)

Fix of auth issue in frontend

commit e0b287f 
Author: Noah Paige <69586985+noah-paige@users.noreply.github.com> 
Date: Wed Aug 23 2023 09:29:46 GMT-0400 (Eastern Daylight Time) 

    Fix Env Parameters Migration Script (#692)

### Feature or Bugfix
<!-- please choose -->
- Bugfix


### Detail
- Resolve errors in migration script for environment parameters when
upgrading from data.all v1.6.2 with pre-existing environments to
beta-2.0

### Relates
- #691 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: dlpzx <dlpzx@amazon.com>

commit 2fb7d97 
Author: dlpzx <71252798+dlpzx@users.noreply.github.com> 
Date: Tue Aug 22 2023 05:09:41 GMT-0400 (Eastern Daylight Time) 

    Add missing frontend path in albfront stage (#686)

### Feature or Bugfix
- Bugfix

### Detail
- Add the missing `frontend` segment to the Dockerfile path in the albfront stage

### Relates
- V2.0

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

commit 568a8fe 
Author: nikpodsh <124577300+nikpodsh@users.noreply.github.com> 
Date: Tue Aug 22 2023 04:36:42 GMT-0400 (Eastern Daylight Time) 

    Main v2 (#683)

A new PR without conflicts 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Amr Saber <amr.m.saber.mail@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dlpzx <71252798+dlpzx@users.noreply.github.com>
Co-authored-by: wolanlu <101870655+wolanlu@users.noreply.github.com>
Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Co-authored-by: kukushking <kukushkin.anton@gmail.com>
Co-authored-by: Dariusz Osiennik <osiend@amazon.com>
Co-authored-by: Dennis Goldner <107395339+degoldner@users.noreply.github.com>
Co-authored-by: Abdulrahman Kaitoua <abdulrahman.kaitoua@polimi.it>
Co-authored-by: akaitoua-sa <126820454+akaitoua-sa@users.noreply.github.com>
Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
Co-authored-by: Rick Bernotas <97474536+rbernotas@users.noreply.github.com>
Co-authored-by: David Mutune Kimengu <57294718+kimengu-david@users.noreply.github.com>
Co-authored-by: Mohit Arora <1666133+blitzmohit@users.noreply.github.com>
Co-authored-by: Balint David <dbalintx@amazon.com>
Co-authored-by: chamcca <40579012+chamcca@users.noreply.github.com>
Co-authored-by: Dhruba <117375130+marjet26@users.noreply.github.com>
Co-authored-by: Srinivas Reddy <srinivasreddych@outlook.com>
Co-authored-by: mourya-33 <134511711+mourya-33@users.noreply.github.com>
Co-authored-by: dlpzx <dlpzx@amazon.com>
Co-authored-by: Noah Paige <noahpaig@amazon.com>
Co-authored-by: Mo <136477937+itsmo-amzn@users.noreply.github.com>
Co-authored-by: Raul Cajias <cajias@users.noreply.github.com>
Co-authored-by: Maryam Khidir <maryamolaide95@gmail.com>
Co-authored-by: Jorge Iglesias Garcia <44316552+jorgeig-space@users.noreply.github.com>
Co-authored-by: Jorge Iglesias Garcia <jorgeig@amazon.de>
7 participants