Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Auto-approval for datasets #925

Closed
zsaltys opened this issue Dec 18, 2023 · 6 comments · Fixed by #988
Closed

Auto-approval for datasets #925

zsaltys opened this issue Dec 18, 2023 · 6 comments · Fixed by #988
Labels
effort: medium priority: medium status: in-progress This issue has been picked and is being implemented type: newfeature New feature request
Milestone

Comments

@zsaltys
Copy link
Contributor

zsaltys commented Dec 18, 2023

Is your feature request related to a problem? Please describe.
Sometimes teams publish public data for which they are happy to give everyone access without any approvals. Having to always request approval just adds unnecessary delays and interrupts the approving team as they have to monitor request emails and approve requests.

Describe the solution you'd like
There should be a simple editable checkbox on the dataset that allows to mark a dataset as auto-approve. If a dataset is marked an auto-approve then after submit is clicked the share should immediately go through without an approve step.

@dlpzx dlpzx added type: newfeature New feature request priority: high priority: medium status: not-picked-yet At the moment we have not picked this item. Anyone can pick it up effort: medium effort: large labels Jan 3, 2024
@dlpzx
Copy link
Contributor

dlpzx commented Jan 3, 2024

Very interesting feature @zsaltys! We can start working on ideas and designs directly in this issue. That will help us clarifying the specific tasks

@dlpzx
Copy link
Contributor

dlpzx commented Jan 3, 2024

We can split the work in 2 different tasks:

1. Include "auto-approve" option in Dataset creation and store it as metadata

This list of sub-tasks can be modified with the questions raised below:

  • Modify Dataset creation UI view and add checkbox "auto-approve"
  • Modify createDataset API with the new input field and modify the resolver to store the auto-approve info in RDS
  • Modify Dataset GraphQL object and add new column auto-approve
  • Modify Dataset RDS table and add new column auto-approve
  • Add migration script to add the new column to RDS table

Some ideas/questions to start the discussion:

  • Can we re-use existing metadata? For example, we classify the confidentiality level public, official , secret. At first I thought about re-using the public confidentiality tag as "auto-approve", but I don't think it is enough intuitive for users to infer that if a dataset is tagged that way it is automatically shared. Plus, we would need to deal with backwards compatibility.
  • Should dataset owners be able to modify the auto-approve value for an existing dataset?

2. Modify share workflow to skip approval process on auto-approved datasets

This list of sub-tasks can be modified with the questions raised below:

  • Modify the submitShareObject API resolver to execute 2 state machine transitions if auto-approve is enabled in the dataset (The sharing workflow is defined as a state machine in dataall.modules.dataset_sharing.db.share_object_repositories.ShareItemSM)

Some ideas/questions to start the discussion:

  • The proposed sub-task implements the simplest solution for this feature without modifications to ShareItemSM . We can also consider improvements in the state machine definition to make it more flexible and configurable. @noah-paige I would like to know your thoughts.

@anmolsgandhi anmolsgandhi added this to the v2.3.0 milestone Jan 5, 2024
@anmolsgandhi anmolsgandhi added status: in-progress This issue has been picked and is being implemented and removed status: not-picked-yet At the moment we have not picked this item. Anyone can pick it up labels Jan 11, 2024
@SofiaSazonova
Copy link
Contributor

@dlpzx
Currently confidentiality levels are "unclassified", "official" , "secret".
We can add one "public (auto-approval)". Anyhow, I think, it would be useful to add a legend into GUI: what level means what. May be as a floating info-note or separate link.

But it is not the solution, if we want to keep 'auto-approval' as a separate property of data set, not connected with confidentiality. May I mark e.g. the 'secret' dataset with auto-approval flag? If not, than it's better be connected to confidantiality.

@anmolsgandhi
Copy link
Contributor

To align with the described issue, I propose we make it unrelated to confidentiality levels since confidentiality standards may vary for customers. Rather than a blanket approval for say ‘unofficial/public,' let's introduce a new field (checkbox) for each dataset, enabling auto-approval. Some datasets may be public but the dataset owner may still want to have control over when it should be approved for use. Essentially. we should allow users to mark datasets as auto-approved during dataset creation/import. we store that flag and use it at the time of share request consumption to skip the approval process. Also, this should also retroactively work for already created datasets and the users should able to update metadata and check the auto approve box which would then make that existing dataset auto approved for new shares. Thought? @SofiaSazonova @noah-paige @dlpzx

@noah-paige
Copy link
Contributor

Agree with the above - think keeping auto-approve separate from confidentiality may be best

Currently confidentiality affects the level of data preview that is allowed but also could be extended in the future to change the immutability/editability of the dataset parameters or similar.

I think this new feature of auto-approval can be a separate standalone boolean property of the dataset specific to share requests

Nonetheless, I think adding a tooltip or some legend in the UI to make sense of what each confidentiality level means could be a nice enhancement as well as it is not very intuitive to new data.all users as of now

@dlpzx
Copy link
Contributor

dlpzx commented Jan 12, 2024

To not repeat the same, I agree with the above. For simplicity and to make users aware of the "auto-approve" let's keep it a separate feature

@noah-paige noah-paige linked a pull request Jan 19, 2024 that will close this issue
noah-paige pushed a commit that referenced this issue Jan 23, 2024
### Feature or Bugfix
<!-- please choose -->
- Feature


### Detail
- AutoApprove property was added to UI and API
- If it is True, then the sharing request is approved right after
submission

### Relates
#925

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)? Yes
  - Is the input sanitized? No
- What precautions are you taking before deserializing the data you
consume? Default deserialiser is used
  - Is injection prevented by parametrizing queries? YES
  - Have you ensured no `eval` or similar functions are used? N/A
- Does this PR introduce any functionality or component that requires
authorization? No
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features? No
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users? No
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Sofia Sazonova <sazonova@amazon.co.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
effort: medium priority: medium status: in-progress This issue has been picked and is being implemented type: newfeature New feature request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants