diff --git a/docs/authorization/access-policies-guide.md b/docs/authorization/access-policies-guide.md index 2040d7ff79e99..0f741a95282bd 100644 --- a/docs/authorization/access-policies-guide.md +++ b/docs/authorization/access-policies-guide.md @@ -15,7 +15,9 @@ There are 2 types of Access Policy within DataHub:

-**Platform** Policies determine who has platform-level Privileges on DataHub. These include: +## Platform + +Policies determine who has platform-level Privileges on DataHub. These include: - Managing Users & Groups - Viewing the DataHub Analytics Page @@ -31,7 +33,9 @@ A few Platform Policies in plain English include: - The Data Platform team should be allowed to manage users & groups, view platform analytics, & manage policies themselves - John from IT should be able to invite new users -**Metadata** policies determine who can do what to which Metadata Entities. For example: +## Metadata + +Metadata policies determine who can do what to which Metadata Entities. For example: - Who can edit Dataset Documentation & Links? - Who can add Owners to a Chart? @@ -51,17 +55,14 @@ A few **Metadata** Policies in plain English include: Each of these can be implemented by constructing DataHub Access Policies. -## Access Policies Setup, Prerequisites, and Permissions - -What you need to manage Access Policies on DataHub: +## Using Access Policies +:::note Required Access * **Manage Policies** Privilege This Platform Privilege allows users to create, edit, and remove all Access Policies on DataHub. Therefore, it should only be given to those users who will be serving as Admins of the platform. The default `Admin` role has this Privilege. - - -## Using Access Policies +::: Policies can be created by first navigating to **Settings > Permissions > Policies**. @@ -270,10 +271,5 @@ Policies only affect REST APIs when the environment variable `REST_API_AUTHORIZA Policies are the lowest level primitive for granting Privileges to users on DataHub. Roles are built for convenience on top of Policies. Roles grant Privileges to actors indirectly, driven by Policies -behind the scenes. Both can be used in conjunction to grant Privileges to end users. - - - -### Related Features - -- [Roles](./roles.md) \ No newline at end of file +behind the scenes. 
Both can be used in conjunction to grant Privileges to end users. For more information on roles +please refer to [Authorization > Roles](./roles.md). diff --git a/docs/authorization/policies.md b/docs/authorization/policies.md index 91b0241c7d514..b393c8ffa3757 100644 --- a/docs/authorization/policies.md +++ b/docs/authorization/policies.md @@ -49,14 +49,23 @@ and so on. A Metadata Policy can be broken down into 3 parts: -1. **Actors**: The 'who'. Specific users, groups that the policy applies to. +1. **Resources**: The 'which'. Resources that the policy applies to, e.g. "All Datasets". 2. **Privileges**: The 'what'. What actions are being permitted by a policy, e.g. "Add Tags". -3. **Resources**: The 'which'. Resources that the policy applies to, e.g. "All Datasets". +3. **Actors**: The 'who'. Specific users, groups that the policy applies to. -#### Actors +#### Resources + +Resources can be associated with the policy in a number of ways. -We currently support 3 ways to define the set of actors the policy applies to: a) list of users b) list of groups, and -c) owners of the entity. You also have the option to apply the policy to all users or groups. +1. List of resource types - The entity's type for example: dataset, chart, dashboard +2. List of resource URNs +3. List of tags +4. List of domains + +:::note Important Note +The associations in the list above are an *intersection* or an _AND_ operation. For example, if the policy targets +`1. resource type: dataset` and `3. resources tagged: 'myTag'`, it will apply to datasets that are tagged with tag 'myTag'. +::: #### Privileges @@ -64,55 +73,162 @@ Check out the list of privileges [here](https://github.com/datahub-project/datahub/blob/master/metadata-utils/src/main/java/com/linkedin/metadata/authorization/PoliciesConfig.java) . Note, the privileges are semantic by nature, and does not tie in 1-to-1 with the aspect model. 
-All edits on the UI are covered by a privilege, to make sure we have the ability to restrict write access. +All edits on the UI are covered by a privilege, to make sure we have the ability to restrict write access. See the +[Reference](#reference) section below. + +#### Actors + +We currently support 3 ways to define the set of actors the policy applies to: + +1. list of users (or all users) +2. list of groups (or all groups) +3. owners of the entity + +:::note Important Note +Unlike resources, the definitions for actors are a union of the actors. For example, if user `1. Alice` is associated +with the policy as well as `3. owners of the entity`, then Alice _OR_ any owner of +the targeted resource(s) will be included in the policy. +::: + +## Managing Policies + +Policies can be managed on the **Settings > Permissions > Policies** page. The `Policies` tab will only +be visible to those users having the `Manage Policies` privilege. -We currently support the following: +Out of the box, DataHub is deployed with a set of pre-baked Policies. The set of default policies is created at deploy +time and can be found inside the `policies.json` file within `metadata-service/war/src/main/resources/boot`. This set of policies serves the +following purposes: + +1. Assigns immutable super-user privileges for the root `datahub` user account (Immutable) +2. Assigns all Platform privileges for all Users by default (Editable) + +The reason for #1 is to prevent people from accidentally deleting all policies and getting locked out (the `datahub` super-user account can serve as a backup). +The reason for #2 is to permit administrators to log in via OIDC or another means outside of the `datahub` root account +when they are bootstrapping with DataHub. This way, those setting up DataHub can start managing policies without friction. +Note that these privileges *can* and likely *should* be altered inside the **Policies** page of the UI.
+ +:::note Pro-Tip +To log in using the `datahub` account, simply navigate to `/login` and enter `datahub` as both the username and the password. Note that the password can be customized for your +deployment by changing the `user.props` file within the `datahub-frontend` module. Notice that JaaS authentication must be enabled. +::: + +## Configuration + +By default, the Policies feature is *enabled*. This means that the deployment will support creating, editing, removing, and +most importantly enforcing fine-grained access policies. + +In some cases, these capabilities are not desirable. For example, if your company's users are already used to having free rein, you +may want to keep it that way. Or perhaps it is only your Data Platform team who actively uses DataHub, in which case Policies may be overkill. + +For these scenarios, we've provided a back door to disable Policies in your deployment of DataHub. This will completely hide +the policies management UI and by default will allow all actions on the platform. It will be as though +each user has *all* privileges, both of the **Platform** & **Metadata** flavor. + +To disable Policies, you can simply set the `AUTH_POLICIES_ENABLED` environment variable for the `datahub-gms` service container +to `false`. For example, in your `docker/datahub-gms/docker.env`, you'd place + +``` +AUTH_POLICIES_ENABLED=false +``` + +### REST API Authorization + +Policies only affect REST APIs when the environment variable `REST_API_AUTHORIZATION_ENABLED` is set to `true` for GMS. Some privileges only apply when this setting is enabled; these are marked with a footnote in the Reference tables below. Other Metadata and Platform privileges also apply to the APIs where relevant, as specified in the same tables. + +## Reference + +For a complete list of privileges, see the +definitions [here](https://github.com/datahub-project/datahub/blob/master/metadata-utils/src/main/java/com/linkedin/metadata/authorization/PoliciesConfig.java).
+ +### Platform-level privileges -##### Platform-level privileges These privileges are for DataHub operators to access & manage the administrative functionality of the system. -| Platform Privileges | Description | -|-----------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Generate Personal Access Tokens | Allow actor to generate personal access tokens for use with DataHub APIs. | -| Manage Domains | Allow actor to create and remove Asset Domains. | -| Manage Home Page Posts | Allow actor to create and delete home page posts | -| Manage Glossaries | Allow actor to create, edit, and remove Glossary Entities | -| Manage Tags | Allow actor to create and remove Tags. | -| Manage Business Attribute | Allow actor to create, update, delete Business Attribute | -| Manage Documentation Forms | Allow actor to manage forms assigned to assets to assist in documentation efforts. | -| Manage Policies | Allow actor to create and remove access control policies. Be careful - Actors with this privilege are effectively super users. | -| Manage Metadata Ingestion | Allow actor to create, remove, and update Metadata Ingestion sources. | -| Manage Secrets | Allow actor to create & remove Secrets stored inside DataHub. | -| Manage Users & Groups | Allow actor to create, remove, and update users and groups on DataHub. | -| View Analytics | Allow actor to view the DataHub analytics dashboard. | -| Manage All Access Tokens | Allow actor to create, list and revoke access tokens on behalf of users in DataHub. Be careful - Actors with this privilege are effectively super users that can impersonate other users. 
| -| Manage User Credentials | Allow actor to manage credentials for native DataHub users, including inviting new users and resetting passwords | -| Manage Public Views | Allow actor to create, update, and delete any Public (shared) Views. | -| Manage Ownership Types | Allow actor to create, update and delete Ownership Types. | -| Create Business Attribute | Allow actor to create new Business Attribute. | -| Manage Connections | Allow actor to manage connections to external DataHub platforms. | -| Restore Indices API[^1] | Allow actor to use the Restore Indices API. | -| Get Timeseries index sizes API[^1] | Allow actor to use the get Timeseries indices size API. | -| Truncate timeseries aspect index size API[^1] | Allow actor to use the API to truncate a timeseries index. | -| Get ES task status API[^1] | Allow actor to use the get task status API for an ElasticSearch task. | -| Enable/Disable Writeability API[^1] | Allow actor to enable or disable GMS writeability for data migrations. | -| Apply Retention API[^1] | Allow actor to apply retention using the API. | -| Analytics API access[^1] | Allow actor to use API read access to raw analytics data. | -| Manage Tests[^2] | Allow actor to create and remove Asset Tests. | -| View Metadata Proposals[^2] | Allow actor to view the requests tab for viewing metadata proposals. | -| Create metadata constraints[^2] | Allow actor to create metadata constraints. | -| Manage Platform Settings[^2] | Allow actor to view and change platform-level settings, like integrations & notifications. | -| Manage Monitors[^2] | Allow actor to create, update, and delete any data asset monitors, including Custom SQL monitors. Grant with care. 
| +#### Access & Credentials + +| Platform Privileges | Description | +|--------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| Generate Personal Access Tokens | Allow actor to generate personal access tokens for use with DataHub APIs. | +| Manage Policies | Allow actor to create and remove access control policies. Be careful - Actors with this privilege are effectively super users. | +| Manage Secrets | Allow actor to create & remove Secrets stored inside DataHub. | +| Manage Users & Groups | Allow actor to create, remove, and update users and groups on DataHub. | +| Manage All Access Tokens | Allow actor to create, list and revoke access tokens on behalf of users in DataHub. Be careful - Actors with this privilege are effectively super users that can impersonate other users. | +| Manage User Credentials | Allow actor to manage credentials for native DataHub users, including inviting new users and resetting passwords. | +| Manage Connections | Allow actor to manage connections to external DataHub platforms. | + +#### Product Features + +| Platform Privileges | Description | +|-------------------------------------|--------------------------------------------------------------------------------------------------------------------| +| Manage Home Page Posts | Allow actor to create and delete home page posts. | +| Manage Business Attribute | Allow actor to create, update, delete Business Attribute. | +| Manage Documentation Forms | Allow actor to manage forms assigned to assets to assist in documentation efforts. | +| Manage Metadata Ingestion | Allow actor to create, remove, and update Metadata Ingestion sources. | +| Manage Features | Umbrella privilege to manage all features. | +| View Analytics | Allow actor to view the DataHub analytics dashboard.
| +| Manage Public Views | Allow actor to create, update, and delete any Public (shared) Views. | +| Manage Ownership Types | Allow actor to create, update and delete Ownership Types. | +| Create Business Attribute | Allow actor to create new Business Attribute. | +| Manage Structured Properties | Manage structured properties in your instance. | +| View Tests | View Asset Tests. | +| Manage Tests[^2] | Allow actor to create and remove Asset Tests. | +| View Metadata Proposals[^2] | Allow actor to view the requests tab for viewing metadata proposals. | +| Create metadata constraints[^2] | Allow actor to create metadata constraints. | +| Manage Platform Settings[^2] | Allow actor to view and change platform-level settings, like integrations & notifications. | +| Manage Monitors[^2] | Allow actor to create, update, and delete any data asset monitors, including Custom SQL monitors. Grant with care. | [^1]: Only active if REST_API_AUTHORIZATION_ENABLED is true [^2]: DataHub Cloud only -##### Common metadata privileges +#### Entity Management + +| Platform Privileges | Description | +|-------------------------------------|------------------------------------------------------------------------------------| +| Manage Domains | Allow actor to create and remove Asset Domains. | +| Manage Glossaries | Allow actor to create, edit, and remove Glossary Entities. | +| Manage Tags | Allow actor to create and remove Tags. | + +#### System Management + +| Platform Privileges | Description | +|-----------------------------------------------|--------------------------------------------------------------------------| +| Restore Indices API[^1] | Allow actor to use the Restore Indices API. | +| Get Timeseries index sizes API[^1] | Allow actor to use the get Timeseries indices size API. | +| Truncate timeseries aspect index size API[^1] | Allow actor to use the API to truncate a timeseries index.
| +| Get ES task status API[^1] | Allow actor to use the get task status API for an ElasticSearch task. | +| Enable/Disable Writeability API[^1] | Allow actor to enable or disable GMS writeability for data migrations. | +| Apply Retention API[^1] | Allow actor to apply retention using the API. | +| Analytics API access[^1] | Allow actor to use API read access to raw analytics data. | + +[^1]: Only active if REST_API_AUTHORIZATION_ENABLED is true +[^2]: DataHub Cloud only + +### Common Metadata Privileges + These privileges are to view & modify any entity within DataHub. -| Common Privileges | Description | +#### Entity Privileges + +| Entity Privileges | Description | |-------------------------------------|--------------------------------------------------------------------------------------------| | View Entity Page | Allow actor to view the entity page. | +| Edit Entity | Allow actor to edit any information about an entity. Super user privileges for the entity. | +| Delete | Allow actor to delete this entity. | +| Create Entity | Allow actor to create an entity if it doesn't exist. | +| Entity Exists | Allow actor to determine whether the entity exists. | +| Get Timeline API[^1] | Allow actor to use the GET Timeline API. | +| Get Entity + Relationships API[^1] | Allow actor to use the GET Entity and Relationships API. | +| Get Aspect/Entity Count APIs[^1] | Allow actor to use the GET Aspect/Entity Count APIs. | +| View Entity[^2] | Allow actor to view the entity in search results. | +| Share Entity[^2] | Allow actor to share an entity with another DataHub Cloud instance. | + +[^1]: Only active if REST_API_AUTHORIZATION_ENABLED is true +[^2]: DataHub Cloud only + +#### Aspect Privileges + +| Aspect Privileges | Description | +|-------------------------------------|--------------------------------------------------------------------------------------------| | Edit Tags | Allow actor to add and remove tags to an asset. 
| | Edit Glossary Terms | Allow actor to add and remove glossary terms to an asset. | | Edit Description | Allow actor to edit the description (documentation) of an entity. | @@ -122,35 +238,57 @@ These privileges are to view & modify any entity within DataHub. | Edit Data Product | Allow actor to edit the Data Product of an entity. | | Edit Deprecation | Allow actor to edit the Deprecation status of an entity. | | Edit Incidents | Allow actor to create and remove incidents for an entity. | -| Edit Entity | Allow actor to edit any information about an entity. Super user privileges for the entity. | | Edit Lineage | Allow actor to add and remove lineage edges for this entity. | | Edit Properties | Allow actor to edit the properties for an entity. | | Edit Owners | Allow actor to add and remove owners of an entity. | -| Delete | Allow actor to delete this entity. | -| Search API[^1] | Allow actor to access search APIs. | -| Get Aspect/Entity Count APIs[^1] | Allow actor to use the GET Aspect/Entity Count APIs. | | Get Timeseries Aspect API[^1] | Allow actor to use the GET Timeseries Aspect API. | -| Get Entity + Relationships API[^1] | Allow actor to use the GET Entity and Relationships API. | -| Get Timeline API[^1] | Allow actor to use the GET Timeline API. | + +[^1]: Only active if REST_API_AUTHORIZATION_ENABLED is true +[^2]: DataHub Cloud only + +#### Proposals + +| Proposals Privileges | Description | +|------------------------------------|--------------------------------------------------------------------------------------------| +| Propose Tags[^2] | Allow actor to propose adding a tag to an asset. | +| Propose Glossary Terms[^2] | Allow actor to propose adding a glossary term to an asset. | +| Propose Documentation[^2] | Allow actor to propose updates to an asset's documentation. | +| Manage Tag Proposals[^2] | Allow actor to manage a proposal to add a tag to an asset. 
| +| Manage Glossary Term Proposals[^2] | Allow actor to manage a proposal to add a glossary term to an asset. | +| Manage Documentation Proposals[^2] | Allow actor to manage a proposal to update an asset's documentation. | + +[^1]: Only active if REST_API_AUTHORIZATION_ENABLED is true +[^2]: DataHub Cloud only + +#### System Management + +| System Privileges | Description | +|-------------------------------------|--------------------------------------------------------------------------------------------| | Explain ElasticSearch Query API[^1] | Allow actor to use the Operations API explain endpoint. | | Produce Platform Event API[^1] | Allow actor to produce Platform Events using the API. | -| Create Entity | Allow actor to create an entity if it doesn't exist. | -| Entity Exists | Allow actor to determine whether the entity exists. | -| View Entity[^2] | Allow actor to view the entity in search results. | -| Propose Tags[^2] | Allow actor to propose adding a tag to an asset. | -| Propose Glossary Terms[^2] | Allow actor to propose adding a glossary term to an asset. | -| Propose Documentation[^2] | Allow actor to propose updates to an asset's documentation. | -| Manage Tag Proposals[^2] | Allow actor to manage a proposal to add a tag to an asset. | -| Manage Glossary Term Proposals[^2] | Allow actor to manage a proposal to add a glossary term to an asset. | -| Manage Documentation Proposals[^2] | Allow actor to manage a proposal update an asset's documentation | -| Share Entity[^2] | Allow actor to share an entity with another DataHub Cloud instance. | [^1]: Only active if REST_API_AUTHORIZATION_ENABLED is true [^2]: DataHub Cloud only -##### Specific entity-level privileges +### Specific Entity-level Privileges These privileges are not generalizable.
+#### Users & Groups + +| Entity | Privilege | Description | +|--------------|-------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| Group | Edit Group Members | Allow actor to add and remove members to a group. | +| Group | Manage Group Notification Settings[^2] | Allow actor to manage notification settings for a group. | +| Group | Manage Group Subscriptions[^2] | Allow actor to manage subscriptions for a group. | +| Group | Edit Contact Information | Allow actor to change the contact information such as email & chat handles. | +| User | Edit Contact Information | Allow actor to change the contact information such as email & chat handles. | +| User | Edit User Profile | Allow actor to change the user's profile including display name, bio, title, profile image, etc. | + +[^1]: Only active if REST_API_AUTHORIZATION_ENABLED is true +[^2]: DataHub Cloud only + +#### Dataset + | Entity | Privilege | Description | |--------------|-------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | Dataset | View Dataset Usage | Allow actor to access dataset usage information (includes usage statistics and queries). | @@ -174,101 +312,22 @@ These privileges are not generalizable. | Domain | Manage Data Products | Allow actor to create, edit, and delete Data Products within a Domain | | GlossaryNode | Manage Direct Glossary Children | Allow actor to create and delete the direct children of this entity. | | GlossaryNode | Manage All Glossary Children | Allow actor to create and delete everything underneath this entity. | -| Group | Edit Group Members | Allow actor to add and remove members to a group. 
| -| Group | Manage Group Notification Settings[^2] | Allow actor to manage notification settings for a group. | -| Group | Manage Group Subscriptions[^2] | Allow actor to manage subscriptions for a group. | -| Group | Edit Contact Information | Allow actor to change the contact information such as email & chat handles. | -| User | Edit Contact Information | Allow actor to change the contact information such as email & chat handles. | -| User | Edit User Profile | Allow actor to change the user's profile including display name, bio, title, profile image, etc. | - -#### Resources - -Resource filter defines the set of resources that the policy applies to is defined using a list of criteria. Each -criterion defines a field type (like type, urn, domain), a list of field values to compare, and a -condition (like EQUALS). It essentially checks whether the field of a certain resource matches any of the input values. -Note, that if there are no criteria or resource is not set, policy is applied to ALL resources. - -For example, the following resource filter will apply the policy to datasets, charts, and dashboards under domain 1. - -```json -{ - "resources": { - "filter": { - "criteria": [ - { - "field": "TYPE", - "condition": "EQUALS", - "values": [ - "dataset", - "chart", - "dashboard" - ] - }, - { - "field": "DOMAIN", - "values": [ - "urn:li:domain:domain1" - ], - "condition": "EQUALS" - } - ] - } - } -} -``` -Where `resources` is inside the `info` aspect of a Policy. - -Supported fields are as follows - -| Field Type | Description | Example | -|---------------|------------------------|-------------------------| -| type | Type of the resource | dataset, chart, dataJob | -| urn | Urn of the resource | urn:li:dataset:... | -| domain | Domain of the resource | urn:li:domain:domainX | - -## Managing Policies - -Policies can be managed on the page **Settings > Permissions > Policies** page. 
The `Policies` tab will only -be visible to those users having the `Manage Policies` privilege. - -Out of the box, DataHub is deployed with a set of pre-baked Policies. The set of default policies are created at deploy -time and can be found inside the `policies.json` file within `metadata-service/war/src/main/resources/boot`. This set of policies serves the -following purposes: - -1. Assigns immutable super-user privileges for the root `datahub` user account (Immutable) -2. Assigns all Platform privileges for all Users by default (Editable) - -The reason for #1 is to prevent people from accidentally deleting all policies and getting locked out (`datahub` super user account can be a backup) -The reason for #2 is to permit administrators to log in via OIDC or another means outside of the `datahub` root account -when they are bootstrapping with DataHub. This way, those setting up DataHub can start managing policies without friction. -Note that these privilege *can* and likely *should* be altered inside the **Policies** page of the UI. - -> Pro-Tip: To login using the `datahub` account, simply navigate to `/login` and enter `datahub`, `datahub`. Note that the password can be customized for your -deployment by changing the `user.props` file within the `datahub-frontend` module. Notice that JaaS authentication must be enabled. - -## Configuration - -By default, the Policies feature is *enabled*. This means that the deployment will support creating, editing, removing, and -most importantly enforcing fine-grained access policies. - -In some cases, these capabilities are not desirable. For example, if your company's users are already used to having free reign, you -may want to keep it that way. Or perhaps it is only your Data Platform team who actively uses DataHub, in which case Policies may be overkill. -For these scenarios, we've provided a back door to disable Policies in your deployment of DataHub. 
This will completely hide -the policies management UI and by default will allow all actions on the platform. It will be as though -each user has *all* privileges, both of the **Platform** & **Metadata** flavor. -To disable Policies, you can simply set the `AUTH_POLICIES_ENABLED` environment variable for the `datahub-gms` service container -to `false`. For example in your `docker/datahub-gms/docker.env`, you'd place +[^1]: Only active if REST_API_AUTHORIZATION_ENABLED is true +[^2]: DataHub Cloud only -``` -AUTH_POLICIES_ENABLED=false -``` +#### Misc -### REST API Authorization - -Policies only affect REST APIs when the environment variable `REST_API_AUTHORIZATION` is set to `true` for GMS. Some policies only apply when this setting is enabled, marked above, and other Metadata and Platform policies apply to the APIs where relevant, also specified in the table above. +| Entity | Privilege | Description | +|--------------|-------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| Tag | Edit Tag Color | Allow actor to change the color of a Tag. | +| Domain | Manage Data Products | Allow actor to create, edit, and delete Data Products within a Domain | +| GlossaryNode | Manage Direct Glossary Children | Allow actor to create and delete the direct children of this entity. | +| GlossaryNode | Manage All Glossary Children | Allow actor to create and delete everything underneath this entity. | +[^1]: Only active if REST_API_AUTHORIZATION_ENABLED is true +[^2]: DataHub Cloud only ## Coming Soon @@ -278,7 +337,7 @@ The DataHub team is hard at work trying to improve the Policies feature. We are Under consideration -- Ability to define Metadata Policies against multiple reosurces scoped to particular "Containers" (e.g. 
A "schema", "database", or "collection") +- Ability to define Metadata Policies against multiple resources scoped to particular "Containers" (e.g. a "schema", "database", or "collection") ## Feedback / Questions / Concerns