BigQuery nested fields missing descriptions #8763

Closed
maaaikoool opened this issue Aug 31, 2023 · 4 comments · Fixed by #8950
Labels
accepted An Issue that is confirmed as a bug by the DataHub Maintainers. bug Bug report good-first-issue Issues that are good candidates for DataHub newcomers to tackle

Comments

@maaaikoool
Contributor

Describe the bug
In the BigQuery schemaMetadata aspect, descriptions are missing for fields nested inside a record; only the root record field gets a description.

To Reproduce
Steps to reproduce the behavior:

  1. Ingest any dataset that contains a record field, using the following recipe:
pipeline_name: bigquery_test
source:
  type: bigquery
  config:
    match_fully_qualified_names: true
    include_table_lineage: false
    include_usage_statistics: false
    include_views: false
    project_id: pepe
    schema_pattern:
      allow:
      - "pepe.dataset"
    table_pattern:
      allow:
        - .*some_table_with_record
    use_date_sharded_audit_log_tables: false
    use_exported_bigquery_audit_metadata: false
    profiling:
      enabled: false
sink:
  type: file
  config:
    filename: out.json
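
To check the file sink output for the symptom, a minimal sketch follows. It assumes only that schema fields appear in out.json as JSON objects with a fieldPath key and an optional description key; it does not rely on any other part of the file layout. The function names are hypothetical helpers, not part of DataHub.

```python
import json
from typing import Any, Iterator, List


def iter_schema_fields(node: Any) -> Iterator[dict]:
    """Yield every dict that looks like a schema field (has a 'fieldPath' key)."""
    if isinstance(node, dict):
        if "fieldPath" in node:
            yield node
        # Keep descending: schema fields can sit at any depth in the output.
        for value in node.values():
            yield from iter_schema_fields(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_schema_fields(item)


def fields_missing_description(records: Any) -> List[str]:
    """Return the fieldPath of each field whose description is absent or empty."""
    return [
        field["fieldPath"]
        for field in iter_schema_fields(records)
        if not field.get("description")
    ]


def check_file(path: str) -> List[str]:
    """Load a file-sink output (assumed to be a single JSON document) and scan it."""
    with open(path) as fh:
        return fields_missing_description(json.load(fh))
```

Calling check_file("out.json") after running the recipe should return an empty list if every field carries a description.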

Expected behavior
All fields, including nested ones, should have a description.

Screenshots

[Screenshot: bigquery datahub]

Additional context
Tested with the latest master, commit e7d140f.

@maaaikoool maaaikoool added the bug Bug report label Aug 31, 2023
@maaaikoool
Contributor Author

I debugged this, and the issue seems to be related to this conditional here. The condition is only true for the struct field, so that is the only field that gets a description set.

Mentioning @treff7es, as I see he was testing this in #6062.

@github-actions

github-actions bot commented Oct 1, 2023

This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io

@github-actions github-actions bot added the stale label Oct 1, 2023
@treff7es treff7es added the accepted An Issue that is confirmed as a bug by the DataHub Maintainers. label Oct 1, 2023
@github-actions github-actions bot removed the stale label Oct 2, 2023
@treff7es
Contributor

treff7es commented Oct 2, 2023

@maaaikoool could you please retest?
I tried to reproduce this but was unable to :(
[Screenshot: CleanShot 2023-10-02 at 15 01 08@2x]
And this is how it is ingested:
[Screenshot: CleanShot 2023-10-02 at 15 01 50@2x]

@yoonhyejin yoonhyejin added the good-first-issue Issues that are good candidates for DataHub newcomers to tackle label Oct 4, 2023
@maaaikoool
Contributor Author

maaaikoool commented Oct 4, 2023

Hey @treff7es, after upgrading to v0.11.0.2 I can still reproduce it. Here is a minimal example schema:

[
  {
    "name": "root",
    "type": "RECORD",
    "description": "root",
    "fields": [
      {
        "name": "nested_record",
        "type": "RECORD",
        "description": "nested_record",
        "fields": [
          {
            "name": "value",
            "type": "STRING",
            "description": "value"
          }
        ]
      },
      {
        "name": "updatedAt",
        "type": "STRING",
        "description": "updatedAt"
      }
    ]
  }
]

updatedAt won't have a description 🤔
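
To spell out the expected result, here is a minimal sketch that walks the schema above and pairs every field with its description. The dotted paths are a simplification chosen for illustration, not DataHub's actual fieldPath encoding, and flatten is a hypothetical helper rather than anything in the ingestion source.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# The minimal schema from this comment, as a Python literal.
SCHEMA: List[dict] = [
    {
        "name": "root",
        "type": "RECORD",
        "description": "root",
        "fields": [
            {
                "name": "nested_record",
                "type": "RECORD",
                "description": "nested_record",
                "fields": [
                    {"name": "value", "type": "STRING", "description": "value"}
                ],
            },
            {"name": "updatedAt", "type": "STRING", "description": "updatedAt"},
        ],
    }
]


def flatten(
    fields: List[dict], prefix: str = ""
) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (dotted path, description) for every field, at every nesting level."""
    for field in fields:
        path = f"{prefix}.{field['name']}" if prefix else field["name"]
        # The description is emitted for records and leaf fields alike.
        yield path, field.get("description")
        # Recurse into child fields; leaves simply have none.
        yield from flatten(field.get("fields", []), path)


expected: Dict[str, Optional[str]] = dict(flatten(SCHEMA))
for path, description in expected.items():
    print(f"{path}: {description}")
```

Every one of the four paths, including root.updatedAt, comes out with a description, which is what I would expect the ingested schemaMetadata to show as well.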
