Quicksight groups mapped with data.all groups #86

Open
dlpzx opened this issue Jul 13, 2022 · 4 comments
Labels
priority: medium; status: not-picked-yet (At the moment we have not picked this item. Anyone can pick it up); type: enhancement (Feature enhancement)

Comments

Contributor

dlpzx commented Jul 13, 2022

Is your idea related to a problem? Please describe.
In the current implementation when users start a Quicksight session they are added to a single default group called 'dataall'. All new users are added to this group. They have access to the whole Glue Catalog in the account.

Describe the solution you'd like
I would like to use groups in Quicksight the same way that I use teams in data.all. That means that when users start a Quicksight session they should start a session with a team and in this session they see only the data owned or shared with that data.all team.

Drafted solution
PART 1: groups and users (~16 days)

  • Create Quicksight groups through the Quicksight API when we "invite a team" to an Environment that has Quicksight enabled.
    • backend changes ~ 3 days (mostly for testing)
  • Create Quicksight groups through the Quicksight API when we enable Quicksight on an environment.
    • backend changes ~ 2 days (mostly for testing)
  • Creation of users stays the same as now: we create users when they "Start a session" and add them to the groups they belong to.
    • backend changes ~ 2 days (mostly for testing)
  • Create a mechanism to sync Quicksight users with active users. With the above there is a problem: when a user is removed from a group, they are not removed from the corresponding QS group automatically. We can check and remove them when they "Start a new session", but if they never log in through data.all again they are never removed. We can add a scheduled sync-QS-user-groups task.
    • backend changes ~ 9 days (mostly for testing)
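
The scheduled sync task from the last bullet could boil down to a set difference between data.all team membership and Quicksight group membership. The following is only a minimal sketch under stated assumptions: `SyncPlan` and `plan_group_sync` are hypothetical names, and both memberships are assumed to have been fetched already as plain dictionaries. Applying the plan would presumably go through the Quicksight group membership API, which is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class SyncPlan:
    """Membership changes needed to align Quicksight groups with data.all teams."""
    to_add: set = field(default_factory=set)     # (group, user) pairs missing in Quicksight
    to_remove: set = field(default_factory=set)  # (group, user) pairs that are stale in Quicksight


def plan_group_sync(dataall_teams, quicksight_groups):
    """Compare team membership (treated as the source of truth) with Quicksight groups.

    Both arguments map a group name to an iterable of user names.
    Hypothetical helper: data.all does not necessarily expose data in this shape.
    """
    plan = SyncPlan()
    for group in set(dataall_teams) | set(quicksight_groups):
        expected = set(dataall_teams.get(group, ()))
        actual = set(quicksight_groups.get(group, ()))
        plan.to_add |= {(group, user) for user in expected - actual}
        plan.to_remove |= {(group, user) for user in actual - expected}
    return plan


# Example: bob is in sync, alice is missing in Quicksight, carol is stale.
plan = plan_group_sync(
    {"team-a": ["alice", "bob"]},
    {"team-a": ["bob", "carol"]},
)
print(sorted(plan.to_add), sorted(plan.to_remove))
```

Note that this sketch treats every Quicksight group as managed by data.all, so members of a group that has no matching team would all be planned for removal. A real task would likely restrict the comparison to groups that data.all created.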

PART 2: data access (~ 4 days)

  • At creation of a data.all dataset, we grant Lake Formation permissions to the Quicksight group corresponding to the dataset owner team
    • backend changes ~ 2 days (mostly for testing)
  • At sharing of data.all tables, we grant Lake Formation permissions to the Quicksight group corresponding to the requester team
    • backend changes ~ 2 days (mostly for testing)
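
As an illustration of the two grants above, the request for a Lake Formation `GrantPermissions` call with a Quicksight group as principal could be built as follows. This is a sketch, not the data.all implementation: the helper names, the `default` namespace, the example identifiers and the choice of read-only permissions (`SELECT`, `DESCRIBE`) are all assumptions.

```python
def quicksight_group_arn(region, account_id, group_name, namespace="default"):
    """ARN of a Quicksight group, used here as the Lake Formation principal."""
    return f"arn:aws:quicksight:{region}:{account_id}:group/{namespace}/{group_name}"


def build_table_grant(region, account_id, group_name, database, table):
    """Keyword arguments for a Lake Formation GrantPermissions request.

    Read-only permissions are an assumption; the issue does not state which
    permissions the Quicksight group should receive.
    """
    return {
        "Principal": {
            "DataLakePrincipalIdentifier": quicksight_group_arn(region, account_id, group_name)
        },
        "Resource": {
            "Table": {"CatalogId": account_id, "DatabaseName": database, "Name": table}
        },
        "Permissions": ["SELECT", "DESCRIBE"],
    }


# Hypothetical account, team, database and table names.
request = build_table_grant("eu-west-1", "111122223333", "team-a", "sales_db", "orders")
print(request["Principal"]["DataLakePrincipalIdentifier"])

# With boto3 installed and credentials configured, the request could be sent as:
#   boto3.client("lakeformation").grant_permissions(**request)
```

Keeping the request construction separate from the API call makes the mapping from team to Quicksight group easy to unit test, which matters given that most of the estimated effort is testing.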

------------------------------------------------

Resources needed:

  • Backend developer with experience in Python
    Around 20 days of work

------------------------------------------------

PART 3 (not included as part of this feature request): data access with data sources
After part 2, data sources and datasets created in Quicksight can still be shared by their creator with any user or group in Quicksight. This means that access to the data in those Quicksight resources is not managed through Lake Formation or through the data.all sharing process.

We can leave the responsibility for sharing the datasets and data sources to the creators, who will always be part of the data.all dataset owner group or a requester group. If we want Quicksight users to be pure data consumers, we need to work with custom permissions and data-source sharing, which is out of the scope of this issue. I will open another GitHub issue for discussion.

@dlpzx dlpzx added this to the v1.2.0 milestone Jul 13, 2022
@dlpzx dlpzx removed this from the v1.3.0 milestone Sep 27, 2022
@dlpzx dlpzx added priority: medium status: not-picked-yet and removed data-governance labels Feb 13, 2023

enr0c commented Jan 26, 2024

I do see high value in implementing these proposals, especially for an enterprise environment. The current 'one-size-fits-all' approach of managing access to data via Quicksight management controls is not workable at a larger scale.

Contributor Author

dlpzx commented Jan 29, 2024

Hi @enr0c, thanks for the response! We will try to prioritize this feature. Can you describe in more depth how you currently use Quicksight?

  • What type of Quicksight users do you use? How are they created/mapped to the IdP?
  • How many AWS accounts do they use?
  • How do you manage data access control to the teams?


enr0c commented Jan 29, 2024

We are not using data.all currently, but we are assessing whether we will use it in the future. One challenge we currently see is the usage of Quicksight and the fact that users that belong to an environment have access to all dashboards, regardless of whether the underlying data is accessible to them. Furthermore, one needs to manually select Athena sources for users per environment.

We foresee many thousands of QS users from 20-40 AWS accounts. Identity will be delivered by an external IdP, leveraging SAML or OIDC.
Data access control shall be implemented with Lake Formation.

Thank you for asking :)


enr0c commented Feb 2, 2024

One further clarification - The ticket says:
That means that when users start a Quicksight session they should start a session with a team and in this session they see only the data owned or shared with that data.all team.

This is already an improvement; however, it is not sufficient for our case.

Our requirement

We foresee a multi-account setup. In this ecosystem, multiple roles (delivered by an external IdP via SAML) can be assigned to a user. Permissions are "additive": the union of the permissions shall be applied.
I can be in a team, but that should not determine whether I have access to data, and data permissions should not be tied to any team I belong to…

For important components (e.g. Quicksight or Sagemaker Studio), this is currently not working in data.all.

Example with Sagemaker Studio, can be applied 1:1 to Quicksight

  • User A creates an ML Studio, leveraging role $\color{blue}{\textsf{playersRW}}$. The ML Studio is only accessible by users that have the role $\color{blue}{\textsf{playersRW}}$

Then there are two tables (in Glue, for example, sitting in two different accounts):

  • goalkeepers, which is readable by users that have $\color{blue}{\textsf{playersRW}}$
  • fouls that is readable for users with $\color{green}{\textsf{sanctionsRW}}$
  • Sagemaker studio execution role is: $\color{blue}{\textsf{playersRW}}$

Current behavior

  • User A, having $\color{blue}{\textsf{playersRW}}$ and $\color{green}{\textsf{sanctionsRW}}$, is not able to access the fouls table from within Sagemaker

Expected behavior for the three users:

| User | Role | ML Studio Access | Read goalkeepers table (Quicksight, sagemaker, athena, ...) | Read fouls table (Quicksight, sagemaker, athena, ...) |
| --- | --- | --- | --- | --- |
| User A | $\color{blue}{\textsf{playersRW}}$ , $\color{green}{\textsf{sanctionsRW}}$ | yes | yes | yes |
| User B | $\color{blue}{\textsf{playersRW}}$ | yes | yes | no |
| User C | $\color{green}{\textsf{sanctionsRW}}$ | no | no | yes |
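
The "additive" model behind this expected behavior can be sketched as a simple role check, where access depends on the union of a user's roles and not on a team or on the execution role of the tool. This is only an illustration: `REQUIRED_ROLE` and `can_access` are hypothetical names, and the mapping is taken from the example in this comment.

```python
# Role required for each resource, taken from the example in this comment.
REQUIRED_ROLE = {
    "ml_studio": "playersRW",
    "goalkeepers": "playersRW",
    "fouls": "sanctionsRW",
}


def can_access(user_roles, resource):
    """Additive model: access is granted if any of the user's roles allows it."""
    return REQUIRED_ROLE[resource] in set(user_roles)


users = {
    "User A": ["playersRW", "sanctionsRW"],
    "User B": ["playersRW"],
    "User C": ["sanctionsRW"],
}

# Print one row per user, mirroring the expected-behavior listing.
for name, roles in users.items():
    row = {resource: can_access(roles, resource) for resource in REQUIRED_ROLE}
    print(name, row)
```

Under this check, User A can read the fouls table from any tool, which is the difference from the current behavior described above.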

@dlpzx dlpzx moved this to Backlog in Data.all Backlog Aug 12, 2024