Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(ingestion): match nested LookML files mentioned in 'include' statements #2957

Merged
merged 2 commits into from
Jul 26, 2021
Merged

fix(ingestion): match nested LookML files mentioned in 'include' statements #2957

merged 2 commits into from
Jul 26, 2021

Conversation

jameslamb
Copy link
Contributor

@jameslamb jameslamb commented Jul 26, 2021

The LookML metadata ingestion logic includes a method, resolve_includes(), whose goal is to turn LookML include statements into valid Python glob patterns. This is used to identify .lkml files to be read in and used to create MCE messages.

I believe that a mistake in that logic leads to files in deeply-nested directories being ignored by datahub ingest.

the LookML

include: "/**/some_cool_view.view.lkml"

is treated as equivalent to

glob.glob(f"{config['base_folder']}/**/some_cool_view.lkml")

But I believe that's incorrect. According to https://docs.looker.com/data-modeling/getting-started/ide-folders#wildcard_examples, ** matches all directories at any level of nesting. Without recursive=True, that glob.glob() call says "match files called some_cool_view.view.lkml exactly one directory below base_folder".

This PR proposes a fix to make this project's Python code match LookML's treatment of wildcards more closely.

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable)

Notes for reviewers

I tested this against a DataHub environment I have access to and can confirm that before this change (as of acryl-datahub==0.8.6.1), .view.lkml files nested more than one directory below the base_folder were not found, and after this change they are.

Thanks for your time and consideration.

Copy link
Collaborator

@hsheth2 hsheth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks for catching this @jameslamb!

Copy link
Contributor

@shirshanka shirshanka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@shirshanka shirshanka merged commit 5d396b1 into datahub-project:master Jul 26, 2021
@jameslamb jameslamb deleted the fix/lookml-views branch July 26, 2021 19:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants