Support for Databricks CATALOG as a DATABASE in DBT compilations #95

Closed
ewengillies opened this issue May 15, 2022 · 0 comments · Fixed by #105
Labels
enhancement New feature or request


Describe the feature

Unity Catalog for Databricks now supports a three-level namespace (`catalog.schema.table`), in which CATALOG plays the role that DATABASE plays in most SQL dialects: https://docs.databricks.com/data-governance/unity-catalog/queries.html#three-level-namespace-notation
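
As a rough illustration of the three-level notation, a query might address a table by catalog, schema, and table name. The names `main`, `analytics`, and `orders` below are hypothetical placeholders, not objects from this issue.

```sql
-- Three-level namespace: <catalog>.<schema>.<table>
-- `main`, `analytics`, and `orders` are made-up names for illustration.
select * from main.analytics.orders;

-- Alternatively, set the current catalog first and use two-level names.
use catalog main;
select * from analytics.orders;
```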

It would be excellent if dbt could treat DATABASE as CATALOG when compiling SQL for Databricks, analogous to how custom databases work in dbt. In fact, the BigQuery connector already seems to support a similar setup via project configuration: https://docs.getdbt.com/docs/building-a-dbt-project/building-models/using-custom-databases

Describe alternatives you've considered

We would either have to avoid DATABASE and custom databases entirely with Databricks, or work around the gap with a pre-build hook that issues a USE CATALOG statement.
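
The hook-based workaround might look roughly like the model below. This is an untested sketch: the catalog name `alternative` and the `seed` reference are placeholders, and it assumes the hook runs in the same session as the model's statements. Relation names rendered by dbt would still not carry the catalog, which is part of why this is only a workaround.

```sql
-- Hypothetical model file, e.g. models/my_model.sql
{{ config(
    materialized = 'table',
    pre_hook = "use catalog alternative"
) }}

-- Two-level names in the compiled SQL would resolve against the
-- catalog selected by the hook (assumption, not verified).
select * from {{ ref('seed') }}
```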

Additional context

PRs #89 and #94 are already working on this; I opened this issue in support of that work!

Who will this benefit?

Anyone who wants to use the custom "databases" feature of dbt with Databricks.

Are you interested in contributing this feature?

Would love to help any way I can.

ewengillies added the enhancement label May 15, 2022
ueshin added a commit that referenced this issue Jun 14, 2022
resolves #95

### Description

Supports multi-catalog.

Enables the `catalog` (or `database`) config, which lets a model be built in a catalog other than the default one.

- model `alternative_catalog`

```sql
{{ config(
    catalog = 'alternative',
    materialized = 'table'
) }}

select * from {{ ref('seed') }}
```
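
Since the description says either `catalog` or `database` can be used, the same model could presumably be written with `database` instead. This is a sketch based only on that sentence and has not been verified against the adapter.

```sql
-- Same hypothetical model, using the `database` key in place of `catalog`.
{{ config(
    database = 'alternative',
    materialized = 'table'
) }}

select * from {{ ref('seed') }}
```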

Also enables running cross-catalog queries.

```sql
select * from {{ ref('seed') }}
union all select * from {{ ref('alternative_catalog')}}
```
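
For orientation, the query above might compile to something like the following. The default catalog `main` and the schema `schema_1` are assumptions chosen for illustration; the actual names depend on the project's profile and the exact quoting depends on the adapter.

```sql
-- Assumed compiled output; `main` and `schema_1` are placeholders.
select * from `main`.`schema_1`.`seed`
union all select * from `alternative`.`schema_1`.`alternative_catalog`
```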

Note: mixing Unity Catalog and Hive metastore tables is not recommended, as it can fail with errors such as:

> org.apache.spark.sql.AnalysisException: Non-Unity-Catalog object `hive_metastore`.`schema_1`.`table_a` can't be referenced in Unity Catalog objects

> org.apache.spark.sql.AnalysisException: Create a persistent view that references both unity catalog and Hive metastore objects is not supported in Unity Catalog

Co-authored-by: allisonwang-db <allison.wang@databricks.com>
ueshin added a commit to ueshin/dbt-databricks that referenced this issue Jun 14, 2022
ueshin added a commit that referenced this issue Jun 15, 2022