dbt docs generate does not work with glue catalog with large number of tables in schema #727

Closed
lguyaux opened this issue Jul 8, 2024 · 4 comments
Labels
bug Something isn't working

Comments


lguyaux commented Jul 8, 2024

Describe the bug

Currently, running dbt docs generate fails if any schema contains a large number of tables. This is because the adapter runs the statement show table extended in <schema_name> like '<table_1>|<table_2>|...', and with Glue as the metastore the pipe-separated pattern is rejected once it is longer than 2048 characters.
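To illustrate why the statement fails, here is a minimal Python sketch of how the pipe-joined pattern grows with the number of tables, and of one way such a list could be split into patterns that each stay within the 2048-character limit quoted in the Glue error. The helper names (build_pattern, batch_patterns) and the sample table names are hypothetical and are not taken from dbt-databricks; this is not how the adapter itself handles the problem.

```python
from typing import Iterable, List

# Limit quoted in the Glue ValidationException for the 'expression' value.
GLUE_EXPRESSION_LIMIT = 2048


def build_pattern(table_names: Iterable[str]) -> str:
    """Join table names the way the failing statement does: 'a|b|c'."""
    return "|".join(table_names)


def batch_patterns(
    table_names: Iterable[str], limit: int = GLUE_EXPRESSION_LIMIT
) -> List[str]:
    """Greedily pack names into pipe-joined patterns no longer than `limit`."""
    patterns: List[str] = []
    current: List[str] = []
    current_len = 0
    for name in table_names:
        if len(name) > limit:
            raise ValueError(f"table name longer than {limit} characters: {name!r}")
        # Length of the current pattern if `name` were appended (plus one '|').
        candidate_len = len(name) if not current else current_len + 1 + len(name)
        if candidate_len > limit:
            # Close the current batch and start a new one with this name.
            patterns.append("|".join(current))
            current, current_len = [name], len(name)
        else:
            current.append(name)
            current_len = candidate_len
    if current:
        patterns.append("|".join(current))
    return patterns


# Hypothetical schema with 500 tables: one pattern is too long, batches are not.
tables = [f"table_{i}" for i in range(500)]
single = build_pattern(tables)
batches = batch_patterns(tables)
print(len(single) > GLUE_EXPRESSION_LIMIT)
print(all(len(p) <= GLUE_EXPRESSION_LIMIT for p in batches))
```

Each batch could then be used in its own show table extended statement, at the cost of more round trips to the metastore.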

Steps To Reproduce

1. Use Glue as your metastore
2. Have a large number of tables in a given schema
3. Attempt to run dbt docs generate

Expected behavior

dbt docs generate should work regardless of how many tables a schema contains.

Screenshots and log output

09:40:38 Running with dbt=1.8.3
09:40:39 Registered adapter: databricks=1.8.3
09:40:40 Found 161 models, 4 data tests, 117 sources, 1142 macros

09:41:32 Concurrency: 4 threads (target='dev')

09:41:34 Building catalog
09:41:35 Encountered an error while generating catalog: Runtime Error
Runtime Error
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:1 validation error detected: Value 'table1|table2|table3|...' at 'expression' failed to satisfy constraint: Member must have length less than or equal to 2048 (Service: AWSGlue; Status Code: 400; Error Code: ValidationException; Request ID: 0ed28cc6-16cf-4962-9d60-ddcf2fe8ba82; Proxy: null))

System information

The output of dbt --version:

Core:
  - installed: 1.8.3
  - latest:    1.8.3 - Up to date!

Plugins:
  - databricks: 1.8.3 - Up to date!
  - spark:      1.8.0 - Up to date!

The operating system you're using:
macOS Sonoma 14.5

The output of python --version:
Python 3.9.6

Additional context

This looks like a recurrence of #325, which was already fixed, but the fix no longer seems to work.

lguyaux added the bug label Jul 8, 2024
Collaborator

benc-db commented Jul 8, 2024

Thanks for filing. We had fixed this before, but I guess something I changed recently regressed it.

Collaborator

benc-db commented Jul 8, 2024

Do you know when this issue reappeared? I don't have a glue setup to test against, so I could use some help tracking down the issue.

Collaborator

benc-db commented Jul 8, 2024

Ah, have you tried setting DBT_DESCRIBE_TABLE_2048_CHAR_BYPASS=true?

Author

lguyaux commented Jul 10, 2024

> Ah, have you tried setting DBT_DESCRIBE_TABLE_2048_CHAR_BYPASS=true?

Thank you very much @benc-db, it worked with this parameter!
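For anyone scripting this workaround, a minimal sketch of running dbt docs generate with the variable set only for that one process might look like the following. The function names are hypothetical, and the sketch assumes the dbt executable is on the PATH; the variable name and the value true are the ones given in this thread.

```python
import os
import subprocess
from typing import Dict, Mapping, Optional


def bypass_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the bypass variable enabled."""
    env = dict(os.environ if base is None else base)
    # Variable name and value as suggested in this thread.
    env["DBT_DESCRIBE_TABLE_2048_CHAR_BYPASS"] = "true"
    return env


def run_docs_generate() -> None:
    """Run `dbt docs generate` with the bypass set for this process only."""
    # Assumes `dbt` is installed and on PATH; raises CalledProcessError on failure.
    subprocess.run(["dbt", "docs", "generate"], env=bypass_env(), check=True)
```

Calling run_docs_generate() from the project directory leaves the calling shell's environment untouched, which keeps the workaround scoped to the catalog build.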

lguyaux closed this as completed Jul 10, 2024