dbt docs generate (>1.3) does not work with glue catalog with large number of tables in schema #325

Closed
josephberni opened this issue Apr 28, 2023 · 2 comments · Fixed by #405
Labels
bug Something isn't working

@josephberni
Contributor

Describe the bug

A clear and concise description of what the bug is. What command did you run? What happened?

Currently, running dbt docs generate fails if any schema contains a large number of tables. This is because the package attempts to run the following statement: show table extended in <schema_name> like '<table_1>|<table_2>|...'.

The underlying call to Glue has the constraint 'Length Constraints: Minimum length of 0. Maximum length of 2048.' Once the pipe-separated list of table names exceeds that length, the query fails, which makes it impossible to generate dbt docs when using Glue on any version above 1.3.
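To illustrate the length problem, here is a minimal sketch of one possible workaround: splitting the table names into several patterns so that each one stays within the 2048-character limit quoted above. This is not necessarily how the adapter builds its queries or how any fix was implemented; the function names `chunk_table_patterns` and `build_queries` are made up for this example, and it assumes the limit applies to the pattern string alone.

```python
from typing import Iterable, Iterator, List

# Limit quoted in the Glue error message; assumed to apply to the pattern only.
MAX_PATTERN_LENGTH = 2048


def chunk_table_patterns(
    table_names: Iterable[str], max_length: int = MAX_PATTERN_LENGTH
) -> Iterator[str]:
    """Yield pipe-joined patterns, each at most max_length characters.

    A single name longer than max_length is still yielded on its own.
    """
    current: List[str] = []
    current_length = 0
    for name in table_names:
        # Joining adds one "|" separator before every name except the first.
        added = len(name) + (1 if current else 0)
        if current and current_length + added > max_length:
            yield "|".join(current)
            current, current_length = [], 0
            added = len(name)
        current.append(name)
        current_length += added
    if current:
        yield "|".join(current)


def build_queries(schema: str, table_names: Iterable[str]) -> List[str]:
    """Build one 'show table extended' statement per chunk (hypothetical helper)."""
    return [
        f"show table extended in {schema} like '{pattern}'"
        for pattern in chunk_table_patterns(table_names)
    ]
```

With this approach, a schema with many tables produces several shorter statements instead of one oversized one, and the results would need to be merged afterwards.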

Steps To Reproduce

In as much detail as possible, please provide steps to reproduce the issue. Sample data that triggers the issue, example model code, etc is all very helpful here.

1. Use Glue as your metastore.
2. Have a large number of tables in a given schema.
3. Attempt to run dbt docs generate.

Expected behavior

A clear and concise description of what you expected to happen.

dbt docs generate should work regardless of how many tables are in a given schema.

Screenshots and log output

If applicable, add screenshots or log output to help explain your problem.

System information

The output of dbt --version:

failed to satisfy constraint: Member must have length less than or equal to 2048 (Service: AWSGlue; Status Code: 400; Error Code: ValidationException; Request ID: <REQUEST_ID>; Proxy: null))

The operating system you're using:

databricks-cli==0.17.5
dbt-core==1.4.6
dbt-databricks==1.4.2
dbt-spark[PyHive]==1.4.1
elementary-data==0.7.7
graphviz==0.20.1
pre-commit==3.1.1
PyYAML==6.0
rich==13.3.2
shyaml==0.6.2
sqlfluff-templater-dbt==2.0.6
sqlfluff==2.0.6

The output of python --version:

3.8.10

Additional context

Add any other context about the problem here.

@josephberni added the bug label on Apr 28, 2023
@susodapop

We reverted the fix from #326 in #404 and will need to reimplement it gated behind a config (environment variable), as the fix was a breaking change for users who are not on the Glue catalog.

@susodapop

Hey @josephberni, I've opened a pull request to reimplement this fix behind an environment variable. If I push a beta release of dbt-databricks to PyPI, can you confirm that it works for your use case?
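For readers unfamiliar with the pattern, gating behavior behind an environment variable could look roughly like the sketch below. The variable name `DBT_EXAMPLE_GLUE_CHUNKING` and the function `chunking_enabled` are hypothetical, chosen only for illustration; the thread does not name the actual variable, so check the linked pull request for the real one.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name, for illustration only; not taken from the adapter.
CHUNKING_ENV_VAR = "DBT_EXAMPLE_GLUE_CHUNKING"


def chunking_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when the opt-in flag is set to a truthy value."""
    env = os.environ if environ is None else environ
    return env.get(CHUNKING_ENV_VAR, "").strip().lower() in {"1", "true", "yes"}
```

Because the flag defaults to off, users who are not on the Glue catalog keep the existing behavior unless they opt in.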
