Table is overwritten if failed to retrieve tables list #266

Closed
anikolaienko opened this issue Jan 31, 2023 · 2 comments · Fixed by #270

anikolaienko commented Jan 31, 2023

Describe the bug

We run Databricks on AWS and use dbt with dbt-databricks.
The setup is standard: we have no custom macros and simply run an incremental model that writes to a Delta table.
Occasionally the `show table extended in {{ relation }} like '*'` step fails with an infrastructure error.
The Databricks logs report: Command failed because warehouse {warehouse_id} was stopped.

The problem is that DatabricksAdapter catches this exception and returns an empty list of tables. The downstream logic then decides to run `create or replace table ...`, which overwrites the table and loses all previous partitions.
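
To illustrate the failure mode, here is a minimal Python sketch. It is not the actual dbt-databricks code: every name in it (`list_relations_swallowing_errors`, `choose_statement`, `failing_query`, the `events` table) is hypothetical, and the statements it returns are only illustrative. It assumes that an empty relation list is read as "the table is not there yet".

```python
from typing import Callable, List


def list_relations_swallowing_errors(run_query: Callable[[], List[str]]) -> List[str]:
    """Hypothetical stand-in for the behaviour described above."""
    try:
        return run_query()
    except Exception:
        # Any failure, including an infrastructure error, becomes "no tables".
        return []


def choose_statement(table: str, existing: List[str]) -> str:
    """Hypothetical decision step: tables not in the list are (re)created."""
    if table in existing:
        return f"insert into {table}"  # incremental path, keeps existing data
    return f"create or replace table {table}"  # destructive path


def failing_query() -> List[str]:
    # Simulates the warehouse being stopped while the command runs.
    raise RuntimeError("Command failed because warehouse was stopped")


existing = list_relations_swallowing_errors(failing_query)
statement = choose_statement("events", existing)
print(statement)  # the destructive path is chosen even though the table exists
```

The swallowed exception is indistinguishable from a genuinely empty schema, which is why the destructive branch is taken.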

Workaround: revert the table to its previous version and run the model again. This requires constant monitoring to check that none of our models has been accidentally overwritten.
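
As a rough sketch of the revert step: Delta Lake offers `DESCRIBE HISTORY` to list table versions and `RESTORE TABLE ... TO VERSION AS OF` to roll back. The helpers below only build the SQL strings. The function names, the table name and the version number are illustrative assumptions; the version to restore has to be read from the table history first.

```python
def describe_history_sql(table: str) -> str:
    """Build the statement that lists the versions of a Delta table."""
    return f"DESCRIBE HISTORY {table}"


def restore_sql(table: str, version: int) -> str:
    """Build the statement that rolls a Delta table back to a given version."""
    if version < 0:
        raise ValueError("Delta table versions start at 0")
    return f"RESTORE TABLE {table} TO VERSION AS OF {version}"


# Hypothetical table and version, for illustration only.
print(describe_history_sql("analytics.events"))
print(restore_sql("analytics.events", 41))
```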

Steps To Reproduce

  1. Run a dbt model on Databricks on AWS.
  2. Emulate a failure in the `show table extended in {{ relation }} like '*'` step. One way is to modify the macro so that it raises the exception manually:
{% macro spark__list_relations_without_caching(relation) %}
  {% call statement('list_relations_without_caching', fetch_result=True) -%}
    show table extended in {{ relation }} like '*'
    {% do exceptions.raise_database_error("Failed retrieve tables from database.") %}
  {% endcall %}

  {% do return(load_result('list_relations_without_caching').table) %}
{% endmacro %}

Expected behavior

The exception should be raised so that the run fails.
Alternatively, retry retrieving the table list a few times and raise the exception after repeated failures.
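
The retry option could look roughly like the Python sketch below. It is not taken from dbt-databricks: `retry_then_raise`, its parameters and `flaky_list_tables` are hypothetical names, and the attempt count and delay are arbitrary.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_then_raise(operation: Callable[[], T], attempts: int = 3,
                     delay_seconds: float = 0.0) -> T:
    """Call operation up to `attempts` times; re-raise the last error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: let the run fail loudly
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")


calls = {"count": 0}


def flaky_list_tables() -> List[str]:
    # Hypothetical query that fails twice, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("warehouse was stopped")
    return ["events"]


tables = retry_then_raise(flaky_list_tables, attempts=3)
print(tables, calls["count"])
```

The important property is the final `raise`: once the retries are exhausted the error propagates instead of being turned into an empty list.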

System information

The output of dbt --version:

Core:
  - installed: 1.2.2
  - latest:    1.4.1 - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - databricks: 1.2.2 - Update available!
  - spark:      1.2.0 - Update available!

The operating system you're using:
macOS Ventura

The output of python --version:
Python 3.8.12

Additional context

The problem is at line 135 of DatabricksAdapter. Replacing `return []` with `raise e` there would work for my case, but I am not sure whether anyone relies on the current behaviour.
As an alternative, a flag could be added to the adapter config so that the exception is raised when the flag is True.
Let me know what you think.
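
A minimal sketch of the flag idea, in Python. The flag name `raise_on_error` and the function `list_relations_with_flag` are made up for illustration; this is not an existing dbt-databricks option or the adapter's real code.

```python
from typing import Callable, List


def list_relations_with_flag(run_query: Callable[[], List[str]],
                             raise_on_error: bool) -> List[str]:
    """Hypothetical listing step whose error handling is controlled by a flag."""
    try:
        return run_query()
    except Exception:
        if raise_on_error:
            raise  # opt-in: fail the run instead of hiding the error
        return []  # legacy behaviour: pretend there are no tables


def stopped_warehouse() -> List[str]:
    # Simulates the infrastructure error from the report.
    raise RuntimeError("Command failed because warehouse was stopped")


legacy_result = list_relations_with_flag(stopped_warehouse, raise_on_error=False)

try:
    list_relations_with_flag(stopped_warehouse, raise_on_error=True)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

print(legacy_result, error_message)
```

With the flag off, users who depend on the old behaviour are unaffected; with it on, the infrastructure error stops the run before any table is replaced.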

anikolaienko added the bug label Jan 31, 2023

andrefurlan-db (Collaborator) commented

Hi @anikolaienko, thanks for the detailed bug report. This is quite unfortunate. I'll treat this as urgent and get back to you.

andrefurlan-db (Collaborator) commented

Some more info on this: dbt-labs/dbt-spark#54

andrefurlan-db added a commit that referenced this issue Feb 17, 2023
Because of an AWS Glue issue, list relations was set to handle any exception and return an empty list of tables. The problem is that further logic then decides to create or replace the table, which overwrites it and loses all previous partitions.

I believe the problem that the previous fix solves is nowhere near as important to Databricks, and the problem it causes is very bad.

resolves #266

Signed-off-by: Andre Furlan <andre.furlan@databricks.com>