[Bug] docs generate appears to be returning no table metadata when run with the --no-compile option #1216

Closed · mikealfare opened this issue May 1, 2024 · 1 comment
Labels: bug (Something isn't working), High Severity (bug with significant impact that should be resolved in a reasonable timeframe)

mikealfare commented May 1, 2024

Current Behavior

Starting on the evening of 4/29/2024, TestDocsGenerateBigQuery has been failing across all versions of dbt-bigquery. On 1.7.latest and main, test_run_and_generate passes and test_run_and_generate_no_compile fails. On 1.6.latest and prior, both fail.

Failed test runs can be seen here; they were passing previously and then started failing across the board.

Expected Behavior

We should be able to run docs generate --no-compile.

Steps To Reproduce

For each release branch, I chose a bundle from dbt-core-bundles and took the bundle_requirements_mac_3.8.txt file and used it as a constraints file for dbt-bigquery. I needed to unpin pytz because it was pinned in both the constraints file and dev-requirements.txt. Since the constraints file reflects what would actually get shipped, I chose to unpin in dev-requirements.txt.

Taking 1.7.latest as an example:

  1. Use bundle 1.7.55
  2. Unpin pytz~=2023.3 in dev-requirements.txt to pytz
  3. Install locally against the constraints file:
     `pip install -e . -r dev-requirements.txt -c bundle_requirements_mac_3.8.txt`
  4. Run the offending test class:
     `pytest tests/functional/adapter/test_basic.py -k "TestDocsGenerateBigQuery"`
  5. test_run_and_generate_no_compile fails; test_run_and_generate passes on 1.7.latest and 1.8.0b1, and also fails on earlier branches

Here is a summary of each scenario for each release branch:

| branch/tag | bundle | failed |
| --- | --- | --- |
| main | main | --no-compile |
| 1.7.latest | 1.7.55 | --no-compile |
| 1.6.latest | 1.6.72 | both |
| 1.5.latest | 1.5.78 | both |
| 1.4.latest | 1.4.64 | both |
| 1.3.latest | 1.3.71 | both |

Relevant log output

```python
# this is happening because `catalog.json` is empty, which can be confirmed by manually viewing it

for key in "nodes", "sources":
    for unique_id, expected_node in expected_catalog[key].items():
        found_node = catalog[key][unique_id]
```

```
KeyError: 'model.test.model'
```
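
The emptiness can also be confirmed outside the test with a few lines of Python. This is just a sketch, assuming dbt's default `target/` output directory:

```python
# Minimal sketch (not from the test suite) to confirm the empty catalog by hand;
# assumes dbt's default target/ output directory.
import json

with open("target/catalog.json") as fh:
    catalog = json.load(fh)

# On the failing runs, both counts come back as 0.
print("nodes:", len(catalog.get("nodes", {})))
print("sources:", len(catalog.get("sources", {})))
```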

Environment

- OS: all
- Python: all
- dbt-core: all
- dbt-bigquery: all

Additional Context

This can be reproduced using a constraints file of hard pins that pre-dates the integration test failures, which suggests the change that caused the issue is not in dbt-bigquery or its dependencies. It's either an OS-level change (unlikely, since the failures occur across platforms) or a change in BigQuery itself.

We were relying on INFORMATION_SCHEMA.__TABLES__ in the catalog query, which is not recommended. However, fixing that still appeared to produce the same error.
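
For context, the recommended lookup goes through the dataset-level INFORMATION_SCHEMA.TABLES view rather than the legacy __TABLES__ meta-table. A rough sketch of what that looks like (the project and dataset names are placeholders, and this is not the catalog query dbt-bigquery actually ships):

```python
# Rough sketch of querying the dataset-level INFORMATION_SCHEMA.TABLES view instead of
# the legacy __TABLES__ meta-table. `my-project` / `my_dataset` are placeholders.
from google.cloud import bigquery

client = bigquery.Client()  # assumes application default credentials

sql = """
select table_catalog, table_schema, table_name, table_type
from `my-project.my_dataset.INFORMATION_SCHEMA.TABLES`
"""

for row in client.query(sql).result():
    print(row.table_schema, row.table_name, row.table_type)
```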

Since we see a change in behavior between 1.6 and 1.7, it's worth looking at the diff there for both dbt-bigquery and dbt-core, keeping in mind that it's a change that affects runs without --no-compile, but not with --no-compile.

While debugging, I found that the --no-compile route ran through BigQueryAdapter._get_catalog_schemas but not BigQueryAdapter._catalog_filter_table, perhaps because it failed before reaching the latter. When --no-compile was not used, BigQueryAdapter._get_catalog_schemas was never called and BigQueryAdapter._catalog_filter_table was. In the --no-compile scenario, BigQueryAdapter._get_catalog_schemas received two candidate schemas, which is odd since there should be a single test schema.

It's worth noting we have 10,249 schemas in the database. We should probably drop all of the test#######_test_% schemas to make sure we're not seeing something funny because of that.
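
If we do that cleanup, something along these lines should work; the `test<digits>_test_` regex is an assumption based on the naming pattern above, and this would run against the CI project's credentials:

```python
# Hypothetical cleanup sketch: drop leftover CI datasets whose names look like
# test<digits>_test_<suffix>. The regex is an assumption based on the pattern above.
import re
from google.cloud import bigquery

client = bigquery.Client()  # assumes credentials for the CI project
leftover = re.compile(r"^test\d+_test_")

for dataset in client.list_datasets():
    if leftover.match(dataset.dataset_id):
        client.delete_dataset(dataset.reference, delete_contents=True, not_found_ok=True)
        print("dropped", dataset.dataset_id)
```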

I think this is a pagination issue caused by crossing 10K schemas, not a functional bug. Even if so, we should still look at what happens in a database with more than 10K schemas, since we're not handling that scenario properly.
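
To illustrate the suspicion (this is not dbt-bigquery's actual listing code, just a sketch of how a capped listing could miss a freshly created schema):

```python
# Sketch of the suspected failure mode: if the dataset listing is capped (max_results=10000
# here is an illustrative stand-in for the SDK's pagination setting), any dataset beyond the
# cap, including a just-created test schema, never shows up, so the catalog ends up empty.
from google.cloud import bigquery

client = bigquery.Client()

capped = {d.dataset_id for d in client.list_datasets(max_results=10_000)}
full = {d.dataset_id for d in client.list_datasets()}

# In a project with more than 10,000 datasets, this prints the schemas a capped
# listing silently drops.
print(sorted(full - capped))
```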

mikealfare added the bug and High Severity labels on May 1, 2024
mikealfare commented:
We went beyond the pagination setting for the SDK and were not finding the test schema that was just created, hence the catalog was empty. The CI database was cleared out and tests are now passing. This turns out to be a bug related to pagination, which has been captured in this ticket.

mikealfare self-assigned this on May 1, 2024