[FEATURE]: Identify owners for inventory types that we track the history of #2761

Closed
1 of 2 tasks
asnare opened this issue Sep 26, 2024 · 1 comment
Labels
feat/migration-progress Issues related to the migration progress workflow

asnare commented Sep 26, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Problem statement

We will shortly be tracking history for inventory types that are routinely refreshed during migration. The history journal requires an owner for each record: the person (or group) responsible for the underlying resource being migrated. If no owner can be determined, the workspace administrator should be used instead.

Proposed Solution

Each crawler responsible for a refreshable class needs to be updated with code that identifies the owner for its Result type.
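
One possible shape for this, as a minimal sketch rather than the actual UCX design: a small base class that asks the record for its owner and falls back to an administrator otherwise. The names `Ownership`, `owner_of`, `_maybe_direct_owner`, `admin_locator` and `ExampleRecord` are all hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

Record = TypeVar("Record")


class Ownership(ABC, Generic[Record]):
    """Hypothetical base class: resolves the owner for one kind of inventory record."""

    def __init__(self, admin_locator: Callable[[], str]) -> None:
        # Assumption: a callable that returns the user-name of an administrator.
        self._admin_locator = admin_locator

    def owner_of(self, record: Record) -> str:
        # Prefer the record-specific owner; a missing (or empty) value
        # falls back to the administrator.
        return self._maybe_direct_owner(record) or self._admin_locator()

    @abstractmethod
    def _maybe_direct_owner(self, record: Record) -> Optional[str]:
        """Return the owner if the record exposes one, otherwise None."""


@dataclass
class ExampleRecord:
    # Illustrative stand-in for a crawler's Result type.
    name: str
    creator: Optional[str] = None


class ExampleRecordOwnership(Ownership[ExampleRecord]):
    def _maybe_direct_owner(self, record: ExampleRecord) -> Optional[str]:
        return record.creator


ownership = ExampleRecordOwnership(admin_locator=lambda: "admin@example.com")
print(ownership.owner_of(ExampleRecord("a", creator="alice@example.com")))
print(ownership.owner_of(ExampleRecord("b")))
```

Keeping the fallback in the base class means each crawler-specific subclass only has to say where its own owner information lives.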

Documentation

Additional Context

Related issues:

Blocks:

asnare commented Oct 1, 2024

We need to track owners for the following inventory types:

  • ClusterInfo:

    1. Existing creator field, which is optional.
    2. Workspace admin.
  • DirectFsAccess:

    1. For:

      • Queries: the user attribute of the query[1], which is optional.
      • Jobs: the owner of the DBFS or Workspace path[1][2] for the notebook or source, which is optional.
    2. Workspace admin.

  • Grant, Table and UDF:

    1. Workspace admin.

    Note: No attempt is made to determine the creator via table/UDF properties.

  • JobInfo:

    1. Existing creator field, which is optional.
    2. Workspace admin.
  • PipelineInfo:

    1. Existing creator_name field, which is optional.
    2. Workspace admin.
  • PolicyInfo:

    1. Existing creator field, which is optional.
    2. Workspace admin.
  • TableMigrationStatus:

    1. The owner of the source table, as defined above.
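
The rules above for the types with a simple creator field can be sketched as a lookup table. This is an illustration only: the dataclasses are simplified stand-ins for the real inventory classes, and `CREATOR_FIELD` and `resolve_owner` are hypothetical names. `DirectFsAccess` and `TableMigrationStatus` are left out because their owners come from elsewhere (or are not yet available).

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

# Creator attribute per inventory type, following the list above.
# None means no record-level owner is consulted and the admin is used directly.
CREATOR_FIELD: Dict[str, Optional[str]] = {
    "ClusterInfo": "creator",
    "JobInfo": "creator",
    "PipelineInfo": "creator_name",
    "PolicyInfo": "creator",
    "Grant": None,
    "Table": None,
    "UDF": None,
}


@dataclass
class ClusterInfo:  # simplified stand-in, not the real class
    cluster_id: str
    creator: Optional[str] = None


@dataclass
class PipelineInfo:  # simplified stand-in, not the real class
    pipeline_id: str
    creator_name: Optional[str] = None


@dataclass
class Grant:  # simplified stand-in, not the real class
    principal: str


def resolve_owner(record: Any, workspace_admin: Callable[[], str]) -> str:
    """Return the record's creator if one is set, otherwise the workspace admin."""
    field = CREATOR_FIELD[type(record).__name__]
    creator = getattr(record, field) if field is not None else None
    return creator or workspace_admin()


admin = lambda: "admin@example.com"
print(resolve_owner(ClusterInfo("c1", creator="alice@example.com"), admin))
print(resolve_owner(PipelineInfo("p1"), admin))
print(resolve_owner(Grant("users"), admin))
```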

Where the workspace admin is needed (because a more appropriate owner cannot be determined) the algorithm is as follows:

  1. Query all active workspace admins, sort alphabetically by user-name, use the first one.
  2. If there are no workspace admins, query all active account admins associated with the workspace, sort alphabetically by user-name, use the first one.
  3. Raise an error if we still don't have an identity. (Note: this is possible, especially due to accounts being decommissioned and leaving a workspace without an active admin.)
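
The three steps above could be sketched as follows. This assumes the admin listings are supplied by the caller as plain callables; `User`, `first_active` and `find_admin` are hypothetical names, and no real Databricks API is called here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class User:
    # Minimal stand-in for an identity record.
    user_name: str
    active: bool = True


def first_active(users: Iterable[User]) -> Optional[str]:
    """Return the first active user-name in sorted order, or None if there is none."""
    # sorted() uses code-point order; the issue only says "alphabetically".
    names = sorted(u.user_name for u in users if u.active)
    return names[0] if names else None


def find_admin(
    list_workspace_admins: Callable[[], Iterable[User]],
    list_account_admins: Callable[[], Iterable[User]],
) -> str:
    # Step 1: first active workspace admin.
    admin = first_active(list_workspace_admins())
    if admin is None:
        # Step 2: only query account admins when no workspace admin exists.
        admin = first_active(list_account_admins())
    if admin is None:
        # Step 3: nobody is left to act as owner.
        raise RuntimeError("No active workspace or account admin found")
    return admin


workspace = [User("carol@example.com"), User("alice@example.com", active=False), User("bob@example.com")]
account = [User("zed@example.com")]
print(find_admin(lambda: workspace, lambda: account))
```

Passing callables rather than lists keeps the account-level query lazy, so it is only issued when the workspace has no active admin.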

Footnotes

  1. This is not yet available on the DirectFsAccess instances; a schema change will probably be required along with code to expose this information.

  2. The APIs for DBFS and Workspace paths don't expose the owner/creator information, so this information is unavailable. If it were available, this would first be exposed via the .owner attribute of our pathlib emulation.

nfx added the feat/migration-progress label and removed the enhancement and needs-triage labels on Oct 9, 2024
nfx moved this from Triage to Active Backlog in UCX on Oct 9, 2024
nfx pushed a commit that referenced this issue on Oct 9, 2024:

## Changes

As part of #2761 we need a way to determine the user
responsible for some of our inventory types. This PR updates the crawler
framework so that:

- There is a way to identify an owner of a resource referred to by
inventory records.
- When the owner cannot be identified, a workspace or account
administrator is used instead.

### Linked issues

Progresses #2761.

### Functionality

- A component for locating an administrator user.
- Ownership information for the following inventory types:

  - [x] `ClusterInfo`
  - [x] `DirectFsAccess` (stubbed)
  - [x] `Grant`
  - [x] `JobInfo`
  - [x] `PipelineInfo`
  - [x] `PolicyInfo`
  - [x] `Table`
  - [x] `TableMigrationStatus`
  - [x] `UDF`

### Tests

- [x] added unit tests
- [x] added integration tests

nfx closed this as completed on Oct 31, 2024
github-project-automation moved this from Active Backlog to Archive in UCX on Oct 31, 2024
nfx removed this from UCX on Oct 31, 2024