The dbt-hive adapter allows you to use dbt along with Apache Hive and Cloudera Data Platform.
- Install dbt
- Read the introduction and viewpoint
The initial adapter code was developed by bachng2017, who agreed to transfer ownership so that active development could continue. The code base is now actively developed and maintained by Cloudera.
The current version of dbt-hive uses dbt-core 1.8.*. Support for the next available version of dbt-core is in progress.
- Python >= 3.8
- dbt-core ~= 1.8.*
- impyla >= 0.18
```shell
pip3 install --user dbt-hive
```
```yaml
demo_project:
  target: dev
  outputs:
    dev:
      type: hive
      auth_type: LDAP
      user: [username]
      password: [password]
      schema: [schema]
      host: [hive-meta-store-host]
      port: 443
      http_path: [http-path]
      threads: 1
```
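To keep credentials out of the profile file, the same target can read them from the environment. This is a minimal sketch that relies on dbt's built-in `env_var` function; the variable names `DBT_HIVE_USER` and `DBT_HIVE_PASSWORD` are hypothetical and not part of dbt-hive.

```yaml
demo_project:
  target: dev
  outputs:
    dev:
      type: hive
      auth_type: LDAP
      # DBT_HIVE_USER and DBT_HIVE_PASSWORD are hypothetical variable names
      user: "{{ env_var('DBT_HIVE_USER') }}"
      password: "{{ env_var('DBT_HIVE_PASSWORD') }}"
      # remaining keys (schema, host, port, http_path) as in the profile above
```

Both variables must be exported in the shell that runs dbt, otherwise profile parsing fails.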
Name | Supported | Iceberg |
---|---|---|
Materialization: View | Yes | N/A |
Materialization: Table | Yes | Yes |
Materialization: Table with Partitions | Yes | Yes |
Materialization: Incremental - Append | Yes | Yes |
Materialization: Incremental - Append with Partitions | Yes | Yes |
Materialization: Incremental - Insert+Overwrite with Partitions | Yes | No |
Materialization: Incremental - Merge | Yes | Yes |
Materialization: Incremental - Merge with Partitions | No | Yes* |
Materialization: Ephemeral | No | No |
Seeds | Yes | Yes |
Tests | Yes | Yes |
Snapshots | No | No |
Documentation | Yes | No |
Authentication: LDAP | Yes | Yes |
Authentication: Kerberos | Yes | Yes |
Incremental models are explained in the dbt documentation. This section covers the incremental strategies supported by dbt-hive.
Strategy | ACID Table | Iceberg Table |
---|---|---|
Incremental Full-Refresh | Yes | Yes |
Incremental Append | Yes | Yes |
Incremental Append with Partitions | Yes | Yes |
Incremental Insert Overwrite | Not recommended without Partitions* | Not recommended without Partitions* |
Incremental Insert Overwrite with Partitions | Yes | No |
Incremental Merge | Yes | Yes* (only v2) |
Incremental Merge with Partitions | No* | Yes* (only v2) |
Note*:
- Incremental Insert Overwrite without partition columns overwrites the entire table and may cause data loss, so it is not recommended. This applies to Hive ACID, Iceberg v1, and Iceberg v2 tables.
- Incremental Merge is not supported for Iceberg v1 tables because Iceberg v1 tables are not transactional.
- Incremental Merge with partition columns is not supported for Hive ACID tables because they do not support updating the values of partition columns.
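As a rough illustration of the strategies and notes above, an incremental strategy can be set per folder in `dbt_project.yml`. The fragment assumes the standard dbt config keys `materialized`, `incremental_strategy`, `partition_by` and `unique_key`. The folder names (`events`, `customers`) and column names (`event_date`, `customer_id`) are hypothetical, and the accepted values should be checked against the dbt-hive version in use.

```yaml
# dbt_project.yml (fragment); folder and column names are hypothetical
models:
  demo_project:
    events:
      +materialized: incremental
      # insert_overwrite is paired with a partition column, because without
      # one the whole table is overwritten
      +incremental_strategy: insert_overwrite
      +partition_by: ['event_date']
    customers:
      +materialized: incremental
      # merge on an unpartitioned table, keyed on a unique column
      +incremental_strategy: merge
      +unique_key: customer_id
```

Running `dbt run --full-refresh` rebuilds such models from scratch, which corresponds to the Incremental Full-Refresh row in the table.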
Support for `on_schema_change` strategies in dbt-hive:
Strategy | ACID Table | Iceberg Table |
---|---|---|
ignore (default) | Supported | Supported |
fail | Supported | Supported |
append_new_columns | Adds new columns | Adds new columns |
sync_all_columns | Adds new columns and updates datatypes but doesn't remove existing columns | Adds new columns, updates datatypes and removes existing columns |
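The strategy is chosen with dbt's standard `on_schema_change` config on an incremental model. The sketch below sets it in `dbt_project.yml`; the folder name `events` is hypothetical.

```yaml
# dbt_project.yml (fragment); the folder name is hypothetical
models:
  demo_project:
    events:
      +materialized: incremental
      # one of: ignore (default), fail, append_new_columns, sync_all_columns
      +on_schema_change: append_new_columns
```

With `append_new_columns`, columns added to the model appear in the target table on the next incremental run.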
Name | Base | Iceberg |
---|---|---|
Materialization: View | Yes | N/A |
Materialization: Table | Yes | Yes |
Materialization: Table with Partitions | Yes | Yes |
Materialization: Incremental - Append | Yes | Yes |
Materialization: Incremental - Append with Partitions | Yes | Yes |
Materialization: Incremental - Insert+Overwrite with Partitions | Yes | No |
Materialization: Incremental - Merge | No | No |
Materialization: Ephemeral | No | No |
Seeds | Yes | Yes |
Tests | Yes | Yes |
Snapshots | No | No |
Documentation | Yes | No |
Authentication: LDAP | Yes | Yes |
Authentication: Kerberos | Yes | Yes |
Note: Kerberos is qualified only on Unix platforms.