Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ingest/bigquery): Support for View Labels #10648

Conversation

ethan-cartwright
Copy link
Contributor

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@ethan-cartwright
Copy link
Contributor Author

ethan-cartwright commented Jun 13, 2024

todo: add option_name = "labels" to retrieve labels in the views query in the queries.py eg.

SELECT
  t.table_catalog as table_catalog,
  t.table_schema as table_schema,
  t.table_name as table_name,
  t.table_type as table_type,
  t.creation_time as created,
  ts.last_modified_time as last_altered,
  tos.OPTION_VALUE as comment,
  t.is_insertable_into,
  t.ddl as view_definition,
  ts.row_count,
  ts.size_bytes
FROM
  `acryl-staging.ethan_test_dataset.INFORMATION_SCHEMA.TABLES` t
  JOIN `acryl-staging.ethan_test_dataset.__TABLES__` as ts ON ts.table_id = t.TABLE_NAME
  LEFT JOIN `acryl-staging.ethan_test_dataset.INFORMATION_SCHEMA.TABLE_OPTIONS` as tos ON t.table_schema = tos.table_schema
  AND t.TABLE_NAME = tos.TABLE_NAME
  AND tos.OPTION_NAME = "labels"
WHERE
  table_type in ('VIEW', 'MATERIALIZED_VIEW')
  AND t.table_name = 'account_types'
ORDER BY
  table_schema ASC,
  table_name ASC;

@ethan-cartwright ethan-cartwright changed the title [DO NOT MERGE] add support for view labels add support for view labels Jun 13, 2024
@ethan-cartwright ethan-cartwright changed the title add support for view labels feat(ingest/bigquery): Support for View Labels Jun 13, 2024
@ethan-cartwright
Copy link
Contributor Author

Screenshot 2024-06-13 at 7 55 54 PM

Copy link
Collaborator

@asikowitz asikowitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some style comments. Otherwise LGTM although would like a second look on the SQL query

Comment on lines 26 to 30
# Find all matches in the labels string
matches = re.findall(pattern, labels_str)

labels_dict = {key: value for key, value in matches}
return labels_dict
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Find all matches in the labels string
matches = re.findall(pattern, labels_str)
labels_dict = {key: value for key, value in matches}
return labels_dict
# Map ?? to ??
return dict(re.findall(pattern, labels_str))

I don't actually know what the keys and values of the returned dict represent. Let's document that somewhere

Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
@ethan-cartwright
Copy link
Contributor Author

Just some style comments. Otherwise LGTM although would like a second look on the SQL query

@hsheth2 can you take a look?

@ethan-cartwright
Copy link
Contributor Author

Just some style comments. Otherwise LGTM although would like a second look on the SQL query

@hsheth2 can you take a look?

Harshal took a look and gave the go ahead. https://acryldata.slack.com/archives/C02ENS87APK/p1718410928205629?thread_ts=1718392351.832049&cid=C02ENS87APK

@anshbansal anshbansal merged commit c58be15 into datahub-project:master Jun 17, 2024
57 of 58 checks passed
@anshbansal
Copy link
Collaborator

Airflow problem is separate

sleeperdeep pushed a commit to sleeperdeep/datahub that referenced this pull request Jun 25, 2024
Co-authored-by: Ethan Cartwright <ethan.cartwright@acryl.io>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
yoonhyejin pushed a commit that referenced this pull request Jul 16, 2024
Co-authored-by: Ethan Cartwright <ethan.cartwright@acryl.io>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ingestion PR or Issue related to the ingestion of metadata
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants