diff --git a/_redirects b/_redirects index c6b8538404d..0bfa920992e 100644 --- a/_redirects +++ b/_redirects @@ -336,3 +336,6 @@ https://tutorial.getdbt.com/* https://docs.getdbt.com/:splat 301! /docs/guides/getting-help /guides/legacy/getting-help 302 /docs/guides/migration-guide/* /guides/migration/versions/:splat 301! /docs/guides/* /guides/legacy/:splat 301! +docs/contributing/building-a-new-adapter /docs/contributing/adapter-development/3-building-a-new-adapter 302 +docs/contributing/testing-a-new-adapter /docs/contributing/adapter-development/4-testing-a-new-adapter 302 +docs/contributing/documenting-a-new-adapter /docs/contributing/adapter-development/5-documenting-a-new-adapter 302 \ No newline at end of file diff --git a/website/docs/docs/available-adapters.md b/website/docs/docs/available-adapters.md index eff1ac6eca9..edd487ec3f5 100644 --- a/website/docs/docs/available-adapters.md +++ b/website/docs/docs/available-adapters.md @@ -5,7 +5,7 @@ id: "available-adapters" dbt connects to and runs SQL against your database, warehouse, platform, or query engine. It works by using a dedicated **adapter** for each technology. All the adapters listed below are open source and free to use, just like dbt. -If you have a new adapter, please add it to this list using a pull request! See [Documenting your adapter](/docs/contributing/documenting-a-new-adapter.md) for more information. +If you have a new adapter, please add it to this list using a pull request! See [Documenting your adapter](5-documenting-a-new-adapter) for more information. ### Installation diff --git a/website/docs/docs/contributing/adapter-development/1-what-are-adapters.md b/website/docs/docs/contributing/adapter-development/1-what-are-adapters.md new file mode 100644 index 00000000000..dd450f97fdc --- /dev/null +++ b/website/docs/docs/contributing/adapter-development/1-what-are-adapters.md @@ -0,0 +1,100 @@ +--- +title: "What are adapters? Why do we need them?" +id: "1-what-are-adapters" +--- + +Adapters are an essential component of dbt. At their most basic level, they are how dbt Core connects with the various supported data platforms. At a higher-level, dbt Core adapters strive to give analytics engineers more transferrable skills as well as standardize how analytics projects are structured. Gone are the days where you have to learn a new language or flavor of SQL when you move to a new job that has a different data platform. That is the power of adapters in dbt Core. + + Navigating and developing around the nuances of different databases can be daunting, but you are not alone. Visit [#adapter-ecosystem](https://getdbt.slack.com/archives/C030A0UF5LM) Slack channel for additional help beyond the documentation. + +## All databases are not the same + +There's a tremendous amount of work that goes into creating a database. Here is a high-level list of typical database layers (from the outermost layer moving inwards): +- SQL API +- Client Library / Driver +- Server Connection Manager +- Query parser +- Query optimizer +- Runtime +- Storage Access Layer +- Storage + +There's a lot more there than just SQL as a language. Databases (and data warehouses) are so popular because you can abstract away a great deal of the complexity from your brain to the database itself. This enables you to focus more on the data. 
 + +dbt allows for further abstraction and standardization of the outermost layers of a database (SQL API, client library, connection manager) into a framework that both: + - Opens database technology to less technical users (a large swath of a DBA's role has been automated, similar to how the vast majority of folks with websites today no longer need to be "[webmasters](https://en.wikipedia.org/wiki/Webmaster)"). + - Enables more meaningful conversations about how data warehousing should be done. + +This is where dbt adapters become critical. + +## What needs to be adapted? + +dbt adapters are responsible for _adapting_ dbt's standard functionality to a particular database. Our prototypical database and adapter are PostgreSQL and dbt-postgres, and most of our adapters are somewhat based on the functionality described in dbt-postgres. + +Connecting dbt to a new database will require a new adapter to be built or an existing adapter to be extended. + +The outermost layers of a database map roughly to the areas in which the dbt adapter framework encapsulates inter-database differences. + +### SQL API + +Even amongst ANSI-compliant databases, there are differences in the SQL grammar. +Here are some categories and examples of SQL statements that can be constructed differently: + + +| Category | Area of differences | Examples | +|----------------------------------------------|---------------------------------------------------|------------------------------------------------| +| Statement syntax | The use of `IF EXISTS` |
  • `IF EXISTS, DROP TABLE`
  • `DROP TABLE IF EXISTS` | +| Workflow definition & semantics | Incremental updates |
  • `MERGE`
  • `DELETE; INSERT`
| +| Relation and column attributes/configuration | Database-specific materialization configs |
  • `DIST = ROUND_ROBIN` (Synapse)
  • `DIST = EVEN` (Redshift)
| +| Permissioning | Grant statements that can only take one grantee at a time vs those that accept lists of grantees |
  • `grant SELECT on table dinner.corn to corn_kid, everyone`
  • `grant SELECT on table dinner.corn to corn_kid; grant SELECT on table dinner.corn to everyone`
| + +### Python Client Library & Connection Manager + +The other big category of inter-database differences comes with how the client connects to the database and executes queries against the connection. To integrate with dbt, a data platform must have a pre-existing Python client library or support ODBC via a generic Python library such as `pyodbc`. + +| Category | Area of differences | Examples | +|------------------------------|-------------------------------------------|------------------------------------------------| +| Credentials & authentication | Authentication |
  • Username & password
  • MFA with `boto3` or Okta token
| +| Connection opening/closing | Create a new connection to db |
  • `psycopg2.connect(connection_string)`
  • `google.cloud.bigquery.Client(...)`
| +| Inserting local data | Load seed `.csv` files into Python memory |
  • `google.cloud.bigquery.Client.load_table_from_file(...)` (BigQuery)
  • `INSERT ... INTO VALUES ...` prepared statement (most other databases)
  • | + + +## How dbt encapsulates and abstracts these differences + +Differences between databases are encoded into discrete areas: + +| Components | Code Path | Function | +|------------------|---------------------------------------------------|-------------------------------------------------------------------------------| +| Python Classes | `adapters/` | Configuration (See above [Python classes](##python classes) | +| Macros | `include//macros/adapters/` | SQL API & statement syntax (for example, how to create schema or how to get table info) | +| Materializations | `include//macros/materializations/` | Table/view/snapshot/ workflow definitions | + + +### Python Classes + +These classes implement all the methods responsible for: +- Connecting to a database and issuing queries. +- Providing dbt with database-specific configuration information. + +| Class | Description | +|--------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| AdapterClass | High-level configuration type conversion and any database-specific python methods needed | +| AdapterCredentials | Typed dictionary of possible profiles and associated methods | +| AdapterConnectionManager | All the methods responsible for connecting to a database and issuing queries | +| AdapterRelation | How relation names should be rendered, printed, and quoted. Do relation names use all three parts? `catalog.model_name` (two-part name) or `database.schema.model_name` (three-part name) | +| AdapterColumn | How names should be rendered, and database-specific properties | + +### Macros + +A set of *macros* responsible for generating SQL that is compliant with the target database. + +### Materializations + +A set of *materializations* and their corresponding helper macros defined in dbt using jinja and SQL. They codify for dbt how model files should be persisted into the database. + +## Adapter Architecture + + +Below is a diagram of how dbt-postgres, the adapter at the center of dbt-core, works. + + diff --git a/website/docs/docs/contributing/adapter-development/2-prerequisites-for-a-new-adapter.md b/website/docs/docs/contributing/adapter-development/2-prerequisites-for-a-new-adapter.md new file mode 100644 index 00000000000..271108a620c --- /dev/null +++ b/website/docs/docs/contributing/adapter-development/2-prerequisites-for-a-new-adapter.md @@ -0,0 +1,52 @@ +--- +title: "Prerequisites for a new adapter" +id: "2-prerequisites-for-a-new-adapter" +--- + +To learn what an adapter is and they role they serve, see [What are adapters?](1-what-are-adapters) + +It is very important that make sure that you have the right skills, and to understand the level of difficulty required to make an adapter for your data platform. + +## Pre-Requisite Data Warehouse Features + +The more you can answer Yes to the below questions, the easier your adapter development (and user-) experience will be. See the [New Adapter Information Sheet wiki](https://github.com/dbt-labs/dbt-core/wiki/New-Adapter-Information-Sheet) for even more specific questions. + +### Training +- the developer (and any product managers) ideally will have substantial experience as an end-user of dbt. If not, it is highly advised that you at least take the [dbt Fundamentals](https://courses.getdbt.com/courses/fundamentals) and [Advanced Materializations](https://courses.getdbt.com/courses/advanced-materializations) course. 
+ +### Database +- Does the database complete transactions fast enough for interactive development? +- Can you execute SQL against the data platform? +- Is there a concept of schemas? +- Does the data platform support ANSI SQL, or at least a subset? +### Driver / Connection Library +- Is there a Python-based driver for interacting with the database that is db API 2.0 compliant (e.g. Psycopg2 for Postgres, pyodbc for SQL Server) +- Does it support: prepared statements, multiple statements, or single sign on token authorization to the data platform? + +### Open source software +- Does your organization have an established process for publishing open source software? + + +It is easiest to build an adapter for dbt when the following the /platform in question has: +- a conventional ANSI-SQL interface (or as close to it as possible), +- a mature connection library/SDK that uses ODBC or Python DB 2 API, and +- a way to enable developers to iterate rapidly with both quick reads and writes + + +## Maintaining your new adapter + +When your adapter becomes more popular, and people start using it, you may quickly become the maintainer of an increasingly popular open source project. With this new role, comes some unexpected responsibilities that not only include code maintenance, but also working with a community of users and contributors. To help people understand what to expect of your project, you should communicate your intentions early and often in your adapter documentation or README. Answer questions like, Is this experimental work that people should use at their own risk? Or is this production-grade code that you're committed to maintaining into the future? + +### Keeping the code compatible with dbt Core + +New minor version releases of `dbt-core` may include changes to the Python interface for adapter plugins, as well as new or updated test cases. The maintainers of `dbt-core` will clearly communicate these changes in documentation and release notes, and they will aim for backwards compatibility whenever possible. + +Patch releases of `dbt-core` will _not_ include breaking changes to adapter-facing code. For more details, see ["About dbt Core versions"](core-versions). + +### Versioning and releasing your adapter + +We strongly encourage you to adopt the following approach when versioning and releasing your plugin: +- The minor version of your plugin should match the minor version in `dbt-core` (e.g. 1.1.x). +- Aim to release a new version of your plugin for each new minor version of `dbt-core` (once every three months). +- While your plugin is new, and you're iterating on features, aim to offer backwards compatibility and deprecation notices for at least one minor version. As your plugin matures, aim to leave backwards compatibility and deprecation notices in place until the next major version (dbt Core v2). +- Release patch versions of your plugins whenever needed. These patch releases should contain fixes _only_. 
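
To make the version-alignment guidance above concrete, here is a minimal sketch of how a plugin's `setup.py` might pin its `dbt-core` dependency. The package name `dbt-myadapter` and the exact versions are illustrative assumptions, not a prescribed layout; real adapters choose their own metadata and pinning strategy.

```python
# setup.py -- illustrative sketch only; real adapters include more metadata.
from setuptools import find_namespace_packages, setup

setup(
    name="dbt-myadapter",  # hypothetical plugin package name
    version="1.1.0",       # plugin minor version tracks dbt-core 1.1.x
    packages=find_namespace_packages(include=["dbt", "dbt.*"]),
    install_requires=[
        # "~=1.1.0" allows any dbt-core 1.1.x patch release but not 1.2.0+,
        # so supporting a new dbt-core minor version means cutting a new plugin release.
        "dbt-core~=1.1.0",
    ],
)
```
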
diff --git a/website/docs/docs/contributing/building-a-new-adapter.md b/website/docs/docs/contributing/adapter-development/3-building-a-new-adapter.md similarity index 61% rename from website/docs/docs/contributing/building-a-new-adapter.md rename to website/docs/docs/contributing/adapter-development/3-building-a-new-adapter.md index a4d351b5242..9856f4ab893 100644 --- a/website/docs/docs/contributing/building-a-new-adapter.md +++ b/website/docs/docs/contributing/adapter-development/3-building-a-new-adapter.md @@ -1,46 +1,18 @@ --- title: "Building a new adapter" -id: "building-a-new-adapter" +id: "3-building-a-new-adapter" --- -## What are adapters? +:::tip +Before you build your adapter, we strongly encourage you to first learn dbt as an end user, learn [what an adapter is and they role they serve]((1-what-are-adapters)), as well as [data platform prerequisites](2-prerequisites-for-a-new-adapter) +::: -dbt "adapters" are responsible for _adapting_ dbt's functionality to a given database. If you want to make dbt work with a new database, you'll probably need to build a new adapter, or extend an existing one. Adapters are comprised of three layers: -1. At the lowest level: An *adapter class* implementing all the methods responsible for connecting to a database and issuing queries. -2. In the middle: A set of *macros* responsible for generating SQL that is compliant with the target database. -3. (Optional) At the highest level: A set of *materializations* that tell dbt how to turn model files into persisted objects in the database. - -This guide will walk you through the first two steps, and provide some resources to help you validate that your new adapter is working correctly. Once the adapter is passing most of the functional tests (see ["Testing a new adapter"](testing-a-new-adapter) -), please let the community know that is available to use by adding the adapter to the [Available Adapters](docs/available-adapters) page by following the steps given in [Documenting your adapter](docs/contributing/documenting-a-new-adapter). +This guide will walk you through the first creating the necessary adapter classes and macros, and provide some resources to help you validate that your new adapter is working correctly. Once the adapter is passing most of the functional tests (see ["Testing a new adapter"](4-testing-a-new-adapter) +), please let the community know that is available to use by adding the adapter to the [Available Adapters](available-adapters) page by following the steps given in [Documenting your adapter](5-documenting-a-new-adapter). For any questions you may have, don't hesitate to ask in the [#adapter-ecosystem](https://getdbt.slack.com/archives/C030A0UF5LM) Slack channel. The community is very helpful and likely has experienced a similar issue as you. -## Pre-Requisite Data Warehouse Features - -The more you can answer Yes to the below questions, the easier your adapter development (and user-) experience will be. See the [New Adapter Information Sheet wiki](https://github.com/dbt-labs/dbt-core/wiki/New-Adapter-Information-Sheet) for even more specific questions. - -### Training -- the developer (and any product managers) ideally will have substantial experience as an end-user of dbt. If not, it is highly advised that you at least take the [dbt Fundamentals](https://courses.getdbt.com/courses/fundamentals) and [Advanced Materializations](https://courses.getdbt.com/courses/advanced-materializations) course. 
- -### Database -- Does the database complete transactions fast enough for interactive development? -- Can you execute SQL against the data platform? -- Is there a concept of schemas? -- Does the data platform support ANSI SQL, or at least a subset? -### Driver / Connection Library -- Is there a Python-based driver for interacting with the database that is db API 2.0 compliant (e.g. Psycopg2 for Postgres, pyodbc for SQL Server) -- Does it support: prepared statements, multiple statements, or single sign on token authorization to the data platform? - -### Open source software -- Does your organization have an established process for publishing open source software? - - -It is easiest to build an adapter for dbt when the following the /platform in question has: -- a conventional ANSI-SQL interface (or as close to it as possible), -- a mature connection library/SDK that uses ODBC or Python DB 2 API, and -- a way to enable developers to iterate rapidly with both quick reads and writes - ## Scaffolding a new adapter To create a new adapter plugin from scratch, you can use the [dbt-database-adapter-scaffold](https://github.com/dbt-labs/dbt-database-adapter-scaffold) to trigger an interactive session which will generate a scaffolding for you to build upon. @@ -57,17 +29,32 @@ One of the most important choices you will make during the cookiecutter generati - Most adapters do fall under SQL adapters which is why we chose it as the default `True` value. - It is very possible to build out a fully functional `BaseAdapter`. This will require a little more ground work as it doesn't come with some prebuilt methods the `SQLAdapter` class provides. See `dbt-bigquery` as a good guide. -### Editing setup.py +## Implementation Details + +Regardless if you decide to use the cookiecutter template or manually create the plugin, this section will go over each method that is required to be implemented. The table below provides a high-level overview of the classes, methods, and macros you may have to define for your data platform. + +| file | component | purpose | +|---------------------------------------------------|-------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `./setup.py` | `setup()` function | adapter meta-data (package name, version, author, homepage, etc) | +| `myadapter/dbt/adapters/myadapter/__init__.py` | `AdapterPlugin` | bundle all the information below into a dbt plugin | +| `myadapter/dbt/adapters/myadapter/connections.py` | `MyAdapterCredentials` class | parameters to connect to and configure the database, via a the chosen Python driver | +| `myadapter/dbt/adapters/myadapter/connections.py` | `MyAdapterConnectionManager` class | telling dbt how to interact with the database w.r.t opening/closing connections, executing queries, and fetching data. Effectively a wrapper around the db API or driver. | +| `myadapter/dbt/include/bigquery/` | a dbt project of macro "overrides" in the format of "myadapter__" | any differences in SQL syntax for regular db operations will be modified here from the global_project (e.g. 
"Create Table As Select", "Get all relations in the current schema", etc) | +| `myadapter/dbt/adapters/myadapter/impl.py` | `MyAdapterConfig` | database- and relation-level configs and | +| `myadapter/dbt/adapters/myadapter/impl.py` | `MyAdapterAdapter` | for changing _how_ dbt performs operations like macros and other needed Python functionality | +| `myadapter/dbt/adapters/myadapter/column.py` | `MyAdapterColumn` | for defining database-specific column such as datatype mappings | + +### Editing `setup.py` Edit the file at `myadapter/setup.py` and fill in the missing information. -You can skip this step if you passed the arguments for `email`, `url`, `author`, and `dependencies` to the script. If you plan on having nested macro folder structures, you may need to add entries to `package_data` so your macro source files get installed. +You can skip this step if you passed the arguments for `email`, `url`, `author`, and `dependencies` to the cookiecutter template script. If you plan on having nested macro folder structures, you may need to add entries to `package_data` so your macro source files get installed. ### Editing the connection manager Edit the connection manager at `myadapter/dbt/adapters/myadapter/connections.py`. This file is defined in the sections below. -### The Credentials class +#### The Credentials class The credentials class defines all of the database-specific credentials (e.g. `username` and `password`) that users will need in the [connection profile](configure-your-profile) for your new adapter. Each credentials contract should subclass dbt.adapters.base.Credentials, and be implemented as a python dataclass. @@ -137,17 +124,18 @@ class MyAdapterCredentials(Credentials): Then users can use `collection` OR `database` in their `profiles.yml`, `dbt_project.yml`, or `config()` calls to set the database. -### Connection methods +#### `ConnectionManager` class methods Once credentials are configured, you'll need to implement some connection-oriented methods. They are enumerated in the SQLConnectionManager docstring, but an overview will also be provided here. **Methods to implement:** -- open -- get_response -- cancel -- exception_handler +- `open` +- `get_response` +- `cancel` +- `exception_handler` +- `standardize_grants_dict` -#### open(cls, connection) +##### `open(cls, connection)` `open()` is a classmethod that gets a connection object (which could be in any state, but will have a `Credentials` object with the attributes you defined above) and moves it to the 'open' state. @@ -158,11 +146,11 @@ Generally this means doing the following: - on success: - set connection.state to `'open'` - set connection.handle to the handle object - - this is what must have a cursor() method that returns a cursor! + - this is what must have a `cursor()` method that returns a cursor! - on error: - set connection.state to `'fail'` - set connection.handle to `None` - - raise a dbt.exceptions.FailedToConnectException with the error and any other relevant information + - raise a `dbt.exceptions.FailedToConnectException` with the error and any other relevant information For example: @@ -192,7 +180,7 @@ For example: -#### get_response(cls, cursor) +##### `get_response(cls, cursor)` `get_response` is a classmethod that gets a cursor object and returns adapter-specific information about the last executed command. The return value should be an `AdapterResponse` object that includes items such as `code`, `rows_affected`, `bytes_processed`, and a summary `_message` for logging to stdout. 
@@ -213,9 +201,9 @@ For example: -#### cancel(self, connection) +##### `cancel(self, connection)` -cancel is an instance method that gets a connection object and attempts to cancel any ongoing queries, which is database dependent. Some databases don't support the concept of cancellation, they can simply implement it via 'pass' and their adapter classes should implement an `is_cancelable` that returns False - On ctrl+c connections may remain running. This method must be implemented carefully, as the affected connection will likely be in use in a different thread. +`cancel` is an instance method that gets a connection object and attempts to cancel any ongoing queries, which is database dependent. Some databases don't support the concept of cancellation, they can simply implement it via 'pass' and their adapter classes should implement an `is_cancelable` that returns False - On ctrl+c connections may remain running. This method must be implemented carefully, as the affected connection will likely be in use in a different thread. @@ -231,9 +219,9 @@ cancel is an instance method that gets a connection object and attempts to cance -#### exception_handler(self, sql, connection_name='master') +##### `exception_handler(self, sql, connection_name='master')` -exception_handler is an instance method that returns a context manager that will handle exceptions raised by running queries, catch them, log appropriately, and then raise exceptions dbt knows how to handle. +`exception_handler` is an instance method that returns a context manager that will handle exceptions raised by running queries, catch them, log appropriately, and then raise exceptions dbt knows how to handle. If you use the (highly recommended) `@contextmanager` decorator, you only have to wrap a `yield` inside a `try` block, like so: @@ -258,16 +246,46 @@ If you use the (highly recommended) `@contextmanager` decorator, you only have t +##### `standardize_grants_dict(self, grants_table: agate.Table) -> dict` + +`standardize_grants_dict` is an method that returns the dbt-standardized grants dictionary that matches how users configure grants now in dbt. The input is the result of `SHOW GRANTS ON {{model}}` call loaded into an agate table. + +If there's any massaging of agate table containing the results, of `SHOW GRANTS ON {{model}}`, that can't easily be accomplished in SQL, it can be done here. For example, the SQL to show grants *should* filter OUT any grants TO the current user/role (e.g. OWNERSHIP). If that's not possible in SQL, it can be done in this method instead. + + + +```python + @available + def standardize_grants_dict(self, grants_table: agate.Table) -> dict: + """ + :param grants_table: An agate table containing the query result of + the SQL returned by get_show_grant_sql + :return: A standardized dictionary matching the `grants` config + :rtype: dict + """ + grants_dict: Dict[str, List[str]] = {} + for row in grants_table: + grantee = row["grantee"] + privilege = row["privilege_type"] + if privilege in grants_dict.keys(): + grants_dict[privilege].append(grantee) + else: + grants_dict.update({privilege: [grantee]}) + return grants_dict +``` + + + ### Editing the adapter implementation Edit the connection manager at `myadapter/dbt/adapters/myadapter/impl.py` -Very little is required to implement the adapter itself. On some adapters, you will not need to override anything. On others, you'll likely need to override some of the convert_* classmethods, or override the `is_cancelable` classmethod on others to return False. 
+Very little is required to implement the adapter itself. On some adapters, you will not need to override anything. On others, you'll likely need to override some of the ``convert_*`` classmethods, or override the `is_cancelable` classmethod on others to return `False`. -#### datenow() +#### `datenow()` -This classmethod provides the adapter's canonical date function. This is not used but is required anyway on all adapters. +This classmethod provides the adapter's canonical date function. This is not used but is required– anyway on all adapters. @@ -283,23 +301,24 @@ This classmethod provides the adapter's canonical date function. This is not use dbt implements specific SQL operations using jinja macros. While reasonable defaults are provided for many such operations (like `create_schema`, `drop_schema`, `create_table`, etc), you may need to override one or more of macros when building a new adapter. -### Required macros +#### Required macros The following macros must be implemented, but you can override their behavior for your adapter using the "dispatch" pattern described below. Macros marked (required) do not have a valid default implementation, and are required for dbt to operate. -- `alter_column_type` ([source](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/include/global_project/macros/adapters/common.sql#L140)) -- `check_schema_exists` ([source](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/include/global_project/macros/adapters/common.sql#L224)) -- `create_schema` ([source](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/include/global_project/macros/adapters/common.sql#L21)) -- `drop_relation` ([source](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/include/global_project/macros/adapters/common.sql#L164)) -- `drop_schema` ([source](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/include/global_project/macros/adapters/common.sql#L31)) -- `get_columns_in_relation` ([source](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/include/global_project/macros/adapters/common.sql#L95)) (required) -- `list_relations_without_caching` ([source](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/include/global_project/macros/adapters/common.sql#L240)) (required) -- `list_schemas` ([source](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/include/global_project/macros/adapters/common.sql#L210)) -- `rename_relation` ([source](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/include/global_project/macros/adapters/common.sql#L185)) -- `truncate_relation` ([source](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/include/global_project/macros/adapters/common.sql#L175)) -- `current_timestamp` ([source](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/include/global_project/macros/adapters/common.sql#L269)) (required) - -### Adapter dispatch +- `alter_column_type` ([source](https://github.com/dbt-labs/dbt-core/blob/f988f76fccc1878aaf8d8631c05be3e9104b3b9a/core/dbt/include/global_project/macros/adapters/columns.sql#L37-L55)) +- `check_schema_exists` ([source](https://github.com/dbt-labs/dbt-core/blob/f988f76fccc1878aaf8d8631c05be3e9104b3b9a/core/dbt/include/global_project/macros/adapters/metadata.sql#L43-L55)) +- `create_schema` ([source](https://github.com/dbt-labs/dbt-core/blob/f988f76fccc1878aaf8d8631c05be3e9104b3b9a/core/dbt/include/global_project/macros/adapters/schema.sql#L1-L9)) +- `drop_relation` 
([source](https://github.com/dbt-labs/dbt-core/blob/f988f76fccc1878aaf8d8631c05be3e9104b3b9a/core/dbt/include/global_project/macros/adapters/relation.sql#L34-L42)) +- `drop_schema` ([source](https://github.com/dbt-labs/dbt-core/blob/f988f76fccc1878aaf8d8631c05be3e9104b3b9a/core/dbt/include/global_project/macros/adapters/schema.sql#L12-L20)) +- `get_columns_in_relation` ([source](https://github.com/dbt-labs/dbt-core/blob/f988f76fccc1878aaf8d8631c05be3e9104b3b9a/core/dbt/include/global_project/macros/adapters/columns.sql#L1-L8)) (required) +- `list_relations_without_caching` ([source](https://github.com/dbt-labs/dbt-core/blob/f988f76fccc1878aaf8d8631c05be3e9104b3b9a/core/dbt/include/global_project/macros/adapters/metadata.sql#L58-L65)) (required) +- `list_schemas` ([source](hhttps://github.com/dbt-labs/dbt-core/blob/f988f76fccc1878aaf8d8631c05be3e9104b3b9a/core/dbt/include/global_project/macros/adapters/metadata.sql#L29-L40)) +- `rename_relation` ([source](https://github.com/dbt-labs/dbt-core/blob/f988f76fccc1878aaf8d8631c05be3e9104b3b9a/core/dbt/include/global_project/macros/adapters/relation.sql#L56-L65)) +- `truncate_relation` ([source](https://github.com/dbt-labs/dbt-core/blob/f988f76fccc1878aaf8d8631c05be3e9104b3b9a/core/dbt/include/global_project/macros/adapters/relation.sql#L45-L53)) +- `current_timestamp` ([source](https://github.com/dbt-labs/dbt-core/blob/f988f76fccc1878aaf8d8631c05be3e9104b3b9a/core/dbt/include/global_project/macros/adapters/freshness.sql#L1-L8)) (required) +- `copy_grants` + +#### Adapter dispatch Most modern databases support a majority of the standard SQL spec. There are some databases that _do not_ support critical aspects of the SQL spec however, or they provide their own nonstandard mechanisms for implementing the same functionality. To account for these variations in SQL support, dbt provides a mechanism called [multiple dispatch](https://en.wikipedia.org/wiki/Multiple_dispatch) for macros. With this feature, macros can be overridden for specific adapters. This makes it possible to implement high-level methods (like "create ") in a database-specific way. @@ -344,7 +363,7 @@ The `adapter.dispatch()` macro takes a second argument, `packages`, which repres - "Shim" package examples: [`spark-utils`](https://github.com/dbt-labs/spark-utils), [`tsql-utils`](https://github.com/dbt-msft/tsql-utils) - [`adapter.dispatch` docs](dispatch) -### Overriding adapter methods +#### Overriding adapter methods While much of dbt's adapter-specific functionality can be modified in adapter macros, it can also make sense to override adapter methods directly. In this example, assume that a database does not support a `cascade` parameter to `drop schema`. Instead, we can implement an approximation where we drop each relation and then drop the schema. @@ -363,6 +382,9 @@ While much of dbt's adapter-specific functionality can be modified in adapter ma +#### Grants Macros + +See [this GitHub discussion](https://github.com/dbt-labs/dbt-core/discussions/5468) for information on the macros required for `GRANT` statements: ### Other files #### `profile_template.yml` @@ -381,31 +403,14 @@ To assure that `dbt --version` provides the latest dbt core version the adapter It should be noted that both of these files are included in the bootstrapped output of the `dbt-database-adapter-scaffold` so when using the scaffolding, these files will be included. 
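
For reference, the `__version__.py` mentioned above is typically a one-line module, following the layout used by existing adapters such as dbt-postgres. A minimal sketch is below; the path and version string are illustrative.

```python
# myadapter/dbt/adapters/myadapter/__version__.py
# Single source of truth for the plugin version surfaced by `dbt --version`.
version = "1.1.0"
```
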
-### Testing your new adapter - -This has moved to its own page: ["Testing a new adapter"](testing-a-new-adapter) - -### Documenting your new adapter - -Many community members maintain their adapter plugins under open source licenses. If you're interested in doing this, we recommend: -- Hosting on a public git provider (e.g. GitHub, GitLab) -- Publishing to [PyPi](https://pypi.org/) -- Adding to the list of ["Available Adapters"](available-adapters#community-supported) - -### Maintaining your new adapter - -When your adapter becomes more popular, and people start using it, you may quickly become the maintainer of an increasingly popular open source project. With this new role, comes some unexpected responsibilities that not only include code maintenance, but also working with a community of users and contributors. To help people understand what to expect of your project, you should communicate your intentions early and often in your adapter documentation or README. Answer questions like, Is this experimental work that people should use at their own risk? Or is this production-grade code that you're committed to maintaining into the future? +## Testing your new adapter -#### Keeping the code compatible with dbt Core +This has moved to its own page: ["Testing a new adapter"](4-testing-a-new-adapter) -New minor version releases of `dbt-core` may include changes to the Python interface for adapter plugins, as well as new or updated test cases. The maintainers of `dbt-core` will clearly communicate these changes in documentation and release notes, and they will aim for backwards compatibility whenever possible. +## Documenting your new adapter -Patch releases of `dbt-core` will _not_ include breaking changes to adapter-facing code. For more details, see ["About dbt Core versions"](core-versions). +This has moved to its own page: ["Documenting a new adapter"](5-documenting-a-new-adapter) -#### Versioning and releasing your adapter +## Maintaining your new adapter -We strongly encourage you to adopt the following approach when versioning and releasing your plugin: -- The minor version of your plugin should match the minor version in `dbt-core` (e.g. 1.1.x). -- Aim to release a new version of your plugin for each new minor version of `dbt-core` (once every three months). -- While your plugin is new, and you're iterating on features, aim to offer backwards compatibility and deprecation notices for at least one minor version. As your plugin matures, aim to leave backwards compatibility and deprecation notices in place until the next major version (dbt Core v2). -- Release patch versions of your plugins whenever needed. These patch releases should contain fixes _only_. 
+This has moved to a new spot: ["Maintaining your new adapter"](2-prerequisites-for-a-new-adapter##maintaining-your-new-adapter) \ No newline at end of file diff --git a/website/docs/docs/contributing/testing-a-new-adapter.md b/website/docs/docs/contributing/adapter-development/4-testing-a-new-adapter.md similarity index 99% rename from website/docs/docs/contributing/testing-a-new-adapter.md rename to website/docs/docs/contributing/adapter-development/4-testing-a-new-adapter.md index 96e96dc06d8..2fa0b3aaba3 100644 --- a/website/docs/docs/contributing/testing-a-new-adapter.md +++ b/website/docs/docs/contributing/adapter-development/4-testing-a-new-adapter.md @@ -1,6 +1,6 @@ --- title: "Testing a new adapter" -id: "testing-a-new-adapter" +id: "4-testing-a-new-adapter" --- :::info diff --git a/website/docs/docs/contributing/documenting-a-new-adapter.md b/website/docs/docs/contributing/adapter-development/5-documenting-a-new-adapter.md similarity index 87% rename from website/docs/docs/contributing/documenting-a-new-adapter.md rename to website/docs/docs/contributing/adapter-development/5-documenting-a-new-adapter.md index c74f38904b1..e659d272130 100644 --- a/website/docs/docs/contributing/documenting-a-new-adapter.md +++ b/website/docs/docs/contributing/adapter-development/5-documenting-a-new-adapter.md @@ -1,9 +1,16 @@ --- title: "Documenting a new adapter" -id: "documenting-a-new-adapter" +id: "5-documenting-a-new-adapter" --- -If you've already [built](/docs/contributing/building-a-new-adapter.md), and [tested](/docs/contributing/testing-a-new-adapter.md) your adapter, it's time to document it so the dbt community will know that it exists and how to use it! +If you've already [built](3-building-a-new-adapter), and [tested](4-testing-a-new-adapter) your adapter, it's time to document it so the dbt community will know that it exists and how to use it. + +## Making your adapter available + +Many community members maintain their adapter plugins under open source licenses. If you're interested in doing this, we recommend: +- Hosting on a public git provider (for example, GitHub or Gitlab) +- Publishing to [PyPi](https://pypi.org/) +- Adding to the list of ["Available Adapters"](available-adapters#community-supported) (more info below) ## General Guidelines diff --git a/website/docs/docs/contributing/adapter-development/6-promoting-a-new-adapter.md b/website/docs/docs/contributing/adapter-development/6-promoting-a-new-adapter.md new file mode 100644 index 00000000000..eca75adbbe0 --- /dev/null +++ b/website/docs/docs/contributing/adapter-development/6-promoting-a-new-adapter.md @@ -0,0 +1,119 @@ +--- +title: "Promoting a new adapter" +id: "6-promoting-a-new-adapter" +--- + +## Model for engagement in the dbt community + +The most important thing here is recognizing that people are successful in the community when they join, first and foremost, to engage authentically. + +What does authentic engagement look like? It’s challenging to define explicit rules. One good rule of thumb is to treat people with dignity and respect. + +Contributors to the community should think of contribution *as the end itself,* not a means toward other business KPIs (leads, community members, etc.). [We believe that profits are exhaust.](https://www.getdbt.com/dbt-labs/values/#:~:text=Profits%20are%20exhaust.) Some ways to know if you’re authentically engaging: + +- Is an engagement’s *primary* purpose of sharing knowledge and resources or building brand engagement? 
+- Imagine you didn’t work at the org you do — can you imagine yourself still writing this? +- Is it written in formal / marketing language, or does it sound like you, the human? + +## Who should join the dbt community slack + +### People who have insight into what it means to do hands-on [analytics engineering](https://www.getdbt.com/analytics-engineering/) work + +The dbt Community Slack workspace is fundamentally a place for analytics practitioners to interact with each other — the closer the users are in the community to actual data/analytics engineering work, the more natural their engagement will be (leading to better outcomes for partners and the community). + +### DevRel practitioners with strong focus + +DevRel practitioners often have a strong analytics background and a good understanding of the community. It’s essential to be sure they are focused on *contributing,* not on driving community metrics for partner org (such as signing people up for their slack or events). The metrics will rise naturally through authentic engagement. + +### Founder and executives who are interested in directly engaging with the community + +This is either incredibly successful or not at all depending on the profile of the founder. Typically, this works best when the founder has a practitioner-level of technical understanding and is interested in joining not to promote, but to learn and hear from users. + +### Software Engineers at partner products that are building and supporting integrations with either dbt Core or dbt Cloud + +This is successful when the engineers are familiar with dbt as a product or at least have taken our training course. The Slack is often a place where end-user questions and feedback is initially shared, so it is recommended that someone technical from the team be present. There are also a handful of channels aimed at those building integrations, which tend to be a font of knowledge. + +### Who might struggle in the dbt community +#### People in marketing roles +dbt Slack is not a marketing channel. Attempts to use it as such invariably fall flat and can even lead to people having a negative view of a product. This doesn’t mean that dbt can’t serve marketing objectives, but a long-term commitment to engagement is the only proven method to do this sustainably. + +#### People in product roles +The dbt Community can be an invaluable source of feedback on a product. There are two primary ways this can happen — organically (community members proactively suggesting a new feature) and via direct calls for feedback and user research. Immediate calls for engagement must be done in your dedicated #tools channel. Direct calls should be used sparingly, as they can overwhelm more organic discussions and feedback. + +## Who is the audience for an adapter release + +A new adapter is likely to drive huge community interest from several groups of people: +- People who are currently using the database that the adapter is supporting +- People who may be adopting the database in the near future. +- People who are interested in dbt development in general. + +The database users will be your primary audience and the most helpful in achieving success. Engage them directly in the adapter’s dedicated Slack channel. If one does not exist already, reach out in #channel-requests, and we will get one made for you and include it in an announcement about new channels. + +The final group is where non-slack community engagement becomes important. 
Twitter and LinkedIn are both great places to interact with a broad audience. A well-orchestrated adapter release can generate impactful and authentic engagement. + +## How to message the initial rollout and follow-up content + +Tell a story that engages dbt users and the community. Highlight new use cases and functionality unlocked by the adapter in a way that will resonate with each segment. + +### Existing users of your technology who are new to dbt + - Provide a general overview of the value dbt will deliver to your users. This can lean on dbt's messaging and talking points which are laid out in the [dbt viewpoint.](https://docs.getdbt.com/docs/about/viewpoint) + - Give examples of a rollout that speaks to the overall value of dbt and your product. + +### Users who are already familiar with dbt and the community +- Consider unique use cases or advantages your adapter provide over existing adapters. Who will be excited for this? +- Contribute to the dbt Community and ensure that dbt users on your adapter are well supported (tutorial content, packages, documentation, etc). +- Example of a rollout that is compelling for those familiar with dbt: [Firebolt](https://www.linkedin.com/feed/update/urn:li:activity:6879090752459182080/) + +## Tactically manage distribution of content about new or existing adapters + +There are tactical pieces on how and where to share that help ensure success. + +### On slack: +- #i-made-this channel — this channel has a policy against “marketing” and “content marketing” posts, but it should be successful if you write your content with the above guidelines in mind. Even with that, it’s important to post here sparingly. +- Your own database / tool channel — this is where the people who have opted in to receive communications from you and always a great place to share things that are relevant to them. + +### On social media: +- Twitter +- LinkedIn +- Social media posts *from the author* or an individual connected to the project tend to have better engagement than posts from a company or organization account. +- Ask your partner representative about: + - Retweets and shares from the official dbt Labs accounts. + - Flagging posts internally at dbt Labs to get individual employees to share. + +## Measuring engagement + +You don’t need 1000 people in a channel to succeed, but you need at least a few active participants who can make it feel lived in. If you’re comfortable working in public, this could be members of your team, or it can be a few people who you know that are highly engaged and would be interested in participating. Having even 2 or 3 regulars hanging out in a channel is all that’s needed for a successful start and is, in fact, much more impactful than 250 people that never post. + +## How to announce a new adapter + +We’d recommend *against* boilerplate announcements and encourage finding a unique voice. That being said, there are a couple of things that we’d want to include: + +- A summary of the value prop of your database / technology for users who aren’t familiar. +- The personas that might be interested in this news. +- A description of what the adapter *is*. For example: + > With the release of our new dbt adapter, you’ll be able to to use dbt to model and transform your data in [name-of-your-org] +- Particular or unique use cases or functionality unlocked by the adapter. +- Plans for future / ongoing support / development. +- The link to the documentation for using the adapter on the dbt Labs docs site. +- An announcement blog. 
 + +## Announcing new release versions of existing adapters + +This can vary substantially depending on the nature of the release, but a good baseline is the types of release messages that [we put out in the #dbt-releases](https://getdbt.slack.com/archives/C37J8BQEL/p1651242161526509) channel. + +![Full Release Post](/img/adapter-guide/0-full-release-notes.png) + +Breaking this down: + +- Visually distinctive announcement - make it clear this is a release + +- Short written description of what is in the release + +- Links to additional resources + +- Implementation instructions: + +- Future plans + +- Contributor recognition (if applicable) + + diff --git a/website/docs/docs/contributing/adapter-development/7-verifying-a-new-adapter.md b/website/docs/docs/contributing/adapter-development/7-verifying-a-new-adapter.md new file mode 100644 index 00000000000..bccb8afc5a9 --- /dev/null +++ b/website/docs/docs/contributing/adapter-development/7-verifying-a-new-adapter.md @@ -0,0 +1,41 @@ +--- +title: "Verifying a new adapter" +id: "7-verifying-a-new-adapter" +--- + +## Why verify an adapter? + +The very first data platform dbt supported was Redshift, followed quickly by Postgres ([dbt-core#174](https://github.com/dbt-labs/dbt-core/pull/174)). In 2017, back when dbt Labs (née Fishtown Analytics) was still a data consultancy, we added support for Snowflake and BigQuery. We also turned dbt's database support into an adapter framework ([dbt-core#259](https://github.com/dbt-labs/dbt-core/pull/259/)), and a plugin system a few years later. For years, dbt Labs specialized in those four data platforms and became experts in them. However, covering the surface area of all possible databases, their respective nuances, and keeping them up-to-date and bug-free is a Herculean and/or Sisyphean task that couldn't be done by a single person or even a single team! Enter the dbt community, which enables dbt Core to work on more than 30 different databases (32 as of Sep '22)! + +Free and open-source tools for the data professional are increasingly abundant. This is by and large a *good thing*; however, it requires due diligence that wasn't required in a paid-license, closed-source software world. Before taking a dependency on an open-source project, it is important to determine the answers to the following questions: + +1. Does it work? +2. Does it meet my team's specific use case? +3. Does anyone "own" the code, or is anyone liable for ensuring it works? +4. Do bugs get fixed quickly? +5. Does it stay up-to-date with new Core features? +6. Is the usage substantial enough to self-sustain? +7. What risks do I take on by taking a dependency on this library? + +These are valid, important questions to answer—especially given that `dbt-core` itself only put out its first stable release (major version v1.0) in December 2021! Indeed, up until now, the majority of new user questions in database-specific channels are some form of: +- "How mature is `dbt-`? Any gotchas I should be aware of before I start exploring?" +- "has anyone here used `dbt-` for production models?" +- "I've been playing with `dbt-` -- I was able to install and run my initial experiments. I noticed that there are certain features mentioned on the documentation that are marked as 'not ok' or 'not tested'. What are the risks? +I'd love to make a statement on my team to adopt DBT [sic], but I'm pretty sure questions will be asked around the possible limitations of the adapter or if there are other companies out there using dbt [sic] with Oracle DB in production, etc."
 + +There has been a tendency to trust the dbt Labs-maintained adapters over community- and vendor-supported adapters, but repo ownership is only one among many indicators of software quality. We aim to help our users feel well-informed as to the caliber of an adapter with a new program. + +## Verified by dbt Labs + +The adapter verification program aims to quickly indicate to users which adapters can be trusted for production use. Previously, determining this was uncharted territory for new users and complicated making the business case to their leadership team. We plan to give quality assurances by: +1. appointing a key stakeholder for the adapter repository, +2. ensuring that the chosen stakeholder fixes bugs and cuts new releases in a timely manner (see ["Maintaining your new adapter"](2-prerequisites-for-a-new-adapter#maintaining-your-new-adapter)), +3. demonstrating that it passes our adapter pytest test suite, and +4. assuring that it works for us internally and, ideally, for an existing team using the adapter in production. + + +Every major & minor version of an adapter will be verified internally and given an official :white_check_mark: (custom emoji coming soon) on the [Available Adapters](available-adapters) page. + +## How to get an adapter verified? + +We envision that data platform vendors will be most interested in having their adapter versions verified; however, we are open to community adapter verification. If interested, please reach out to `partnerships` at `dbtlabs.com` or post in the [#adapter-ecosystem Slack channel](https://getdbt.slack.com/archives/C030A0UF5LM). \ No newline at end of file diff --git a/website/sidebars.js index 1f068e8f61d..11280801f7c 100644 --- a/website/sidebars.js +++ b/website/sidebars.js @@ -56,11 +56,27 @@ const sidebarSettings = { items: [ "docs/contributing/oss-expectations", "docs/contributing/contributor-license-agreements", - "docs/contributing/building-a-new-adapter", - "docs/contributing/testing-a-new-adapter", - "docs/contributing/documenting-a-new-adapter", "docs/contributing/slack-rules-of-the-road", "docs/contributing/long-lived-discussions-guidelines", + { + type: "category", + label: "Adapter development", + link: { + type: 'generated-index', + title: 'Adapter Development', + description: 'Learn what an adapter is and what\'s required to make one, as well as how to build, test, document, promote, and verify your new adapter.
Visit the [#adapter-ecosystem](https://getdbt.slack.com/archives/C030A0UF5LM) Slack channel for additional help beyond this section.', + + }, + items: [ + 'docs/contributing/adapter-development/1-what-are-adapters', + 'docs/contributing/adapter-development/2-prerequisites-for-a-new-adapter', + 'docs/contributing/adapter-development/3-building-a-new-adapter', + 'docs/contributing/adapter-development/4-testing-a-new-adapter', + 'docs/contributing/adapter-development/5-documenting-a-new-adapter', + 'docs/contributing/adapter-development/6-promoting-a-new-adapter', + 'docs/contributing/adapter-development/7-verifying-a-new-adapter' + ] + } ], }, { diff --git a/website/static/img/adapter-guide/0-full-release-notes.png b/website/static/img/adapter-guide/0-full-release-notes.png new file mode 100644 index 00000000000..7acb59d8ffa Binary files /dev/null and b/website/static/img/adapter-guide/0-full-release-notes.png differ diff --git a/website/static/img/adapter-guide/1-announcement.png b/website/static/img/adapter-guide/1-announcement.png new file mode 100644 index 00000000000..32f5cd6ba5d Binary files /dev/null and b/website/static/img/adapter-guide/1-announcement.png differ diff --git a/website/static/img/adapter-guide/2-short-description.png b/website/static/img/adapter-guide/2-short-description.png new file mode 100644 index 00000000000..547b856ebb0 Binary files /dev/null and b/website/static/img/adapter-guide/2-short-description.png differ diff --git a/website/static/img/adapter-guide/3-additional-resources.png b/website/static/img/adapter-guide/3-additional-resources.png new file mode 100644 index 00000000000..575157b9d54 Binary files /dev/null and b/website/static/img/adapter-guide/3-additional-resources.png differ diff --git a/website/static/img/adapter-guide/4-installation.png b/website/static/img/adapter-guide/4-installation.png new file mode 100644 index 00000000000..c728ff6952b Binary files /dev/null and b/website/static/img/adapter-guide/4-installation.png differ diff --git a/website/static/img/adapter-guide/5-coming-up.png b/website/static/img/adapter-guide/5-coming-up.png new file mode 100644 index 00000000000..4681ee87a1b Binary files /dev/null and b/website/static/img/adapter-guide/5-coming-up.png differ diff --git a/website/static/img/adapter-guide/6-thank-contribs.png b/website/static/img/adapter-guide/6-thank-contribs.png new file mode 100644 index 00000000000..b2db6df4856 Binary files /dev/null and b/website/static/img/adapter-guide/6-thank-contribs.png differ diff --git a/website/static/img/adapter-guide/adapter architecture - postgres.png b/website/static/img/adapter-guide/adapter architecture - postgres.png new file mode 100644 index 00000000000..d64dbc95026 Binary files /dev/null and b/website/static/img/adapter-guide/adapter architecture - postgres.png differ