-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dbt Constraints / model contracts #6271
Changes from all commits
e879cdd
bd1afc7
4b2a881
e8b52e0
01a07d3
3b31a15
1a1f46a
ebaa54c
fd7a47a
7e3a9be
6df02da
bc3c5bc
4b901fd
e29b571
7d085dc
e6559d4
ca89141
2c4a4cf
1975c6b
f088a03
7421caa
a7395bb
5d06524
380bd96
e6e490d
9c498ef
bed1fec
8c466b0
547ad9e
52bd35b
7582531
7291094
b87f57d
5529334
00f12c2
76bf69c
2e51246
5891eb3
5d2867f
801b2fd
5d59cc1
92d2ea7
4f747b0
eba0b6d
ae56da1
de653e4
cfc53b0
fc7230b
e1c72ac
b215b6c
d550de4
9d87463
ce85c96
49a4120
8ffb654
f2f2707
096f3fd
eae4e76
b8c3812
bb1a6c3
b6dbcf6
751cdc8
6bbd797
d364eeb
4a58ece
ac795dd
d452cae
baf18f0
76c0e4f
10ab3cb
6253ed0
307809d
ab4f396
b99e9be
5935201
f59c9dd
7e28a31
3ddd666
fabe2ce
ffec7d7
d338f33
e34c467
44b2f18
17b1f8e
c652367
f9e020d
926e555
c6bd674
bcc35fc
880ed43
426789e
3d61eda
f163b2c
bf45243
dbef42b
c4de8f3
ad07ced
a501c29
59b0298
e46066b
e80b8cd
257eacd
391bd3b
0441417
e0bcb25
dcf7062
903a2cb
c40ee92
111683a
f8b16bc
97f0c6b
2a33baf
6806a7c
1256e7b
0304dbf
b5b1699
d338faa
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
kind: Features | ||
body: Data type constraints are now native to SQL table materializations. Enforce | ||
columns are specific data types and not null depending on database functionality. | ||
time: 2022-11-18T14:11:20.868062-08:00 | ||
custom: | ||
Author: sungchun12 | ||
Issue: "6079" | ||
PR: "6271" |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -93,6 +93,8 @@ class HasDocs(AdditionalPropertiesMixin, ExtensibleDbtClassMixin, Replaceable): | |
description: str = "" | ||
meta: Dict[str, Any] = field(default_factory=dict) | ||
data_type: Optional[str] = None | ||
constraints: Optional[List[str]] = None | ||
constraints_check: Optional[str] = None | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Clarifying Q that I'm sure has been answered elsewhere: Which among our most popular data platforms actually support the totally flexible All good things we'll want to get documented! In addition to, which constraints are actually enforced. For most of our platforms, it's only really There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, just postgres + databricks. We'll include adapter-specific disclaimers in this PR: dbt-labs/docs.getdbt.com#2601 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [For follow-up issues, out of scope for this PR] This didn't occur to me on the first read-through, but is there a reason why we want this to be a separate configuration, rather than just a subset of the available Answering my own question: It looks like it's because Databricks requires that I still think I might prefer a configuration like: constraints: Optional[List[Union[str, Dict[str, str]]]] = None columns:
- name: price
data_type: numeric
constraints:
- not null
- check: price > 0
name: positive_price # Postgres supports naming constraints Postgres + Databricks also support table-level constraints that are composed of multiple columns. We should also support |
||
docs: Docs = field(default_factory=Docs) | ||
_extra: Dict[str, Any] = field(default_factory=dict) | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
{%- macro get_columns_spec_ddl() -%} | ||
{{ adapter.dispatch('get_columns_spec_ddl', 'dbt')() }} | ||
{%- endmacro -%} | ||
|
||
{% macro default__get_columns_spec_ddl() -%} | ||
{{ return(columns_spec_ddl()) }} | ||
{%- endmacro %} | ||
|
||
{% macro columns_spec_ddl() %} | ||
{# loop through user_provided_columns to create DDL with data types and constraints #} | ||
{%- set user_provided_columns = model['columns'] -%} | ||
( | ||
{% for i in user_provided_columns %} | ||
mikealfare marked this conversation as resolved.
Show resolved
Hide resolved
|
||
{% set col = user_provided_columns[i] %} | ||
{% set constraints = col['constraints'] %} | ||
{% set constraints_check = col['constraints_check'] %} | ||
{{ col['name'] }} {{ col['data_type'] }} {% for x in constraints %} {{ x or "" }} {% endfor %} {% if constraints_check -%} check {{ constraints_check or "" }} {%- endif %} {{ "," if not loop.last }} | ||
{% endfor %} | ||
) | ||
{% endmacro %} | ||
|
||
{%- macro get_assert_columns_equivalent(sql) -%} | ||
{{ adapter.dispatch('get_assert_columns_equivalent', 'dbt')(sql) }} | ||
{%- endmacro -%} | ||
|
||
{% macro default__get_assert_columns_equivalent(sql) -%} | ||
{{ return(assert_columns_equivalent(sql)) }} | ||
{%- endmacro %} | ||
|
||
{% macro assert_columns_equivalent(sql) %} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This macro is failing in Snowflake because
I could dispatch&replicate the macro and handle upper/lower-case conversion but I am wondering if it should be handled in Core directly and not in the adapter. In addition, shouldn't we check if columns are quoted to know if we should perform lower/upper-case conversion or not? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @b-per Great point. And that (Sorry everyone: #2986) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @b-per Great feedback. I'll update the macro to uppercase all the columns across both the schema and sql files. I'll also figure out quote columns! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated this to uppercase all columns and added extra uppercase checks and quotes to the columns in the SQL file and added a
sungchun12 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
{#- loop through user_provided_columns to get column names -#} | ||
{%- set user_provided_columns = model['columns'] -%} | ||
{%- set column_names_config_only = [] -%} | ||
{%- for i in user_provided_columns -%} | ||
{%- set col = user_provided_columns[i] -%} | ||
{%- set col_name = col['name'] -%} | ||
{%- set column_names_config_only = column_names_config_only.append(col_name) -%} | ||
{%- endfor -%} | ||
{%- set sql_file_provided_columns = get_columns_in_query(sql) -%} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Making an assumption here that column ordering is consistent (and deterministic) in all our data platforms. Follow-on TODO: We prefer option 2 in https://github.com/dbt-labs/dbt-core/pull/6271/files#r1069332715, let's make sure dbt always ensures order itself, and doesn't make yaml-ordering matter so much There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks @MichelleArk for opening #6975! |
||
|
||
{#- uppercase both schema and sql file columns -#} | ||
{%- set column_names_config_upper= column_names_config_only|map('upper')|join(',') -%} | ||
{%- set column_names_config_formatted = column_names_config_upper.split(',') -%} | ||
{%- set sql_file_provided_columns_upper = sql_file_provided_columns|map('upper')|join(',') -%} | ||
{%- set sql_file_provided_columns_formatted = sql_file_provided_columns_upper.split(',') -%} | ||
|
||
{%- if column_names_config_formatted != sql_file_provided_columns_formatted -%} | ||
{%- do exceptions.raise_compiler_error('Please ensure the name, order, and number of columns in your `yml` file match the columns in your SQL file.\nSchema File Columns: ' ~ column_names_config_formatted ~ '\nSQL File Columns: ' ~ sql_file_provided_columns_formatted ~ ' ' ) %} | ||
{%- endif -%} | ||
|
||
{% endmacro %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -25,6 +25,10 @@ | |
|
||
create {% if temporary: -%}temporary{%- endif %} table | ||
{{ relation.include(database=(not temporary), schema=(not temporary)) }} | ||
{% if config.get('constraints_enabled', False) %} | ||
{{ get_assert_columns_equivalent(sql) }} | ||
{{ get_columns_spec_ddl() }} | ||
{% endif %} | ||
Comment on lines
+28
to
+31
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Follow-up (related to ongoing discussion in #6750): We might want to distinguish between the "contract" and the "constraints." That distinction would make it possible for users to define (e.g.) What ought to be part of the contract (for detecting breaking changes, #6869)? "Data shape" for sure (column names, types, nullability). Additional constraints? Tests? We need to decide There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Assigning myself to #6750 for further refinement, before we pick this up for implementation |
||
as ( | ||
{{ sql }} | ||
); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,7 +18,7 @@ | |
from dbt.contracts.graph.manifest import Manifest | ||
from dbt.contracts.graph.nodes import ManifestNode, BaseNode | ||
from dbt.contracts.graph.unparsed import UnparsedNode, Docs | ||
from dbt.exceptions import DbtInternalError, ConfigUpdateError, DictParseError | ||
from dbt.exceptions import DbtInternalError, ConfigUpdateError, DictParseError, ParsingError | ||
from dbt import hooks | ||
from dbt.node_types import NodeType, ModelLanguage | ||
from dbt.parser.search import FileBlock | ||
|
@@ -306,6 +306,19 @@ def update_parsed_node_config( | |
else: | ||
parsed_node.docs = Docs(show=docs_show) | ||
|
||
# If we have constraints_enabled in the config, copy to node level, for backwards | ||
# compatibility with earlier node-only config. | ||
if config_dict.get("constraints_enabled", False): | ||
parsed_node.constraints_enabled = True | ||
|
||
parser_name = type(self).__name__ | ||
if parser_name == "ModelParser": | ||
original_file_path = parsed_node.original_file_path | ||
error_message = "\n `constraints_enabled=true` can only be configured within `schema.yml` files\n NOT within a model file(ex: .sql, .py) or `dbt_project.yml`." | ||
raise ParsingError( | ||
f"Original File Path: ({original_file_path})\nConstraints must be defined in a `yml` schema configuration file like `schema.yml`.\nOnly the SQL table materialization is supported for constraints. \n`data_type` values must be defined for all columns and NOT be null or blank.{error_message}" | ||
) | ||
Comment on lines
+309
to
+320
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Confirming that this is necessary only:
As @gshank commented earlier, parse time is the right time for this sort of validation, versus compile-time, per our slightly nonstandard meanings of "parsing" + "compilation." I do see the downside of, we return parse-time errors as soon as we hit them, rather than batching up all the errors we can and returning them all. If it were a warning instead, then we could return them all together! FWIW - I'm aligned with If we can't resolve that tech debt before the v1.5 release, I'd probably opt for removing the parse-time check, and letting the database return its own validation error at runtime. It's a less-nice UX to see the error later in the dev cycle, but I think it's better to offer the full flexibility around configuring There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You are correct on why this is a Parsing Error today. I'm fine with the less-nice UX in favor of |
||
|
||
# unrendered_config is used to compare the original database/schema/alias | ||
# values and to handle 'same_config' and 'same_contents' calls | ||
parsed_node.unrendered_config = config.build_config_dict( | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -119,6 +119,8 @@ def add( | |
column: Union[HasDocs, UnparsedColumn], | ||
description: str, | ||
data_type: Optional[str], | ||
constraints: Optional[List[str]], | ||
constraints_check: Optional[str], | ||
meta: Dict[str, Any], | ||
): | ||
tags: List[str] = [] | ||
|
@@ -132,6 +134,8 @@ def add( | |
name=column.name, | ||
description=description, | ||
data_type=data_type, | ||
constraints=constraints, | ||
constraints_check=constraints_check, | ||
meta=meta, | ||
tags=tags, | ||
quote=quote, | ||
|
@@ -144,8 +148,10 @@ def from_target(cls, target: Union[HasColumnDocs, HasColumnTests]) -> "ParserRef | |
for column in target.columns: | ||
description = column.description | ||
data_type = column.data_type | ||
constraints = column.constraints | ||
constraints_check = column.constraints_check | ||
meta = column.meta | ||
refs.add(column, description, data_type, meta) | ||
refs.add(column, description, data_type, constraints, constraints_check, meta) | ||
return refs | ||
|
||
|
||
|
@@ -914,6 +920,75 @@ def parse_patch(self, block: TargetBlock[NodeTarget], refs: ParserRef) -> None: | |
self.patch_node_config(node, patch) | ||
|
||
node.patch(patch) | ||
self.validate_constraints(node) | ||
|
||
def validate_constraints(self, patched_node): | ||
error_messages = [] | ||
if ( | ||
patched_node.resource_type == "model" | ||
and patched_node.config.constraints_enabled is True | ||
): | ||
validators = [ | ||
self.constraints_schema_validator(patched_node), | ||
self.constraints_materialization_validator(patched_node), | ||
self.constraints_language_validator(patched_node), | ||
self.constraints_data_type_validator(patched_node), | ||
] | ||
error_messages = [validator for validator in validators if validator != "None"] | ||
|
||
if error_messages: | ||
original_file_path = patched_node.original_file_path | ||
raise ParsingError( | ||
f"Original File Path: ({original_file_path})\nConstraints must be defined in a `yml` schema configuration file like `schema.yml`.\nOnly the SQL table materialization is supported for constraints. \n`data_type` values must be defined for all columns and NOT be null or blank.{self.convert_errors_to_string(error_messages)}" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Good on you for having the explicit callout here! This does not work for:
Unless!:
I lean toward option 2, as a way to expand this support beyond just tables. In a future where we support project-level governance rules ("all public models must have constraints enabled"), without that support, we'd also be requiring that every public model be There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For Option 2, I'm aligned on approach to get There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Happy to have this in a follow-up issue/PR! I don't think we need it for the initial cut / enough to unlock a beta experience There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [For follow-up issues, out of scope for this PR] After a bit more investigation, we'll need to use a modified version of |
||
) | ||
|
||
def convert_errors_to_string(self, error_messages: List[str]): | ||
n = len(error_messages) | ||
if not n: | ||
return "" | ||
if n == 1: | ||
return error_messages[0] | ||
error_messages_string = "".join(error_messages[:-1]) + f"{error_messages[-1]}" | ||
return error_messages_string | ||
|
||
def constraints_schema_validator(self, patched_node): | ||
schema_error = False | ||
if patched_node.columns == {}: | ||
schema_error = True | ||
schema_error_msg = "\n Schema Error: `yml` configuration does NOT exist" | ||
schema_error_msg_payload = f"{schema_error_msg if schema_error else None}" | ||
return schema_error_msg_payload | ||
|
||
def constraints_materialization_validator(self, patched_node): | ||
materialization_error = {} | ||
if patched_node.config.materialized != "table": | ||
materialization_error = {"materialization": patched_node.config.materialized} | ||
materialization_error_msg = f"\n Materialization Error: {materialization_error}" | ||
materialization_error_msg_payload = ( | ||
f"{materialization_error_msg if materialization_error else None}" | ||
) | ||
return materialization_error_msg_payload | ||
|
||
def constraints_language_validator(self, patched_node): | ||
language_error = {} | ||
language = str(patched_node.language) | ||
if language != "sql": | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [For follow-up issues, out of scope for this PR] It should be possible to support dbt-py models, by saving the Python model in a temp table, and then inserting it into a preexisting table (created per contract). The downside of that approach is, the full (potentially expensive) transformation has already taken place before the contract could be validated. As a general rule, let's treat all of the "validators" defined here as potential TODOs. We'll either want to remove them as limitations, or keep raising explicit errors (with potentially reworked wording). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jtcohen6 I'm on board with this example config, feels a lot more elegant compared to the first draft: columns:
- name: price
data_type: numeric
constraints:
- not null
- check: price > 0
name: positive_price # Postgres supports naming constraints I'm aligned with you on I'm aligned on python models as long as we document/warn the user of performance hits as a result. I'm aligned with you on "validators". My main goal is to provide maximally useful terminal logs that invoke clear and quick actions to fix. This makes iteration a delight rather than a chore. Thanks for such a thorough review and eye towards the future. I'm juiced up to finish these PRs! |
||
language_error = {"language": language} | ||
language_error_msg = f"\n Language Error: {language_error}" | ||
language_error_msg_payload = f"{language_error_msg if language_error else None}" | ||
return language_error_msg_payload | ||
|
||
def constraints_data_type_validator(self, patched_node): | ||
data_type_errors = set() | ||
for column, column_info in patched_node.columns.items(): | ||
if column_info.data_type is None: | ||
data_type_error = {column} | ||
data_type_errors.update(data_type_error) | ||
data_type_errors_msg = ( | ||
f"\n Columns with `data_type` Blank/Null Errors: {data_type_errors}" | ||
) | ||
data_type_errors_msg_payload = f"{data_type_errors_msg if data_type_errors else None}" | ||
return data_type_errors_msg_payload | ||
|
||
|
||
class TestablePatchParser(NodePatchParser[UnparsedNodeUpdate]): | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,7 +9,14 @@ | |
{%- elif unlogged -%} | ||
unlogged | ||
{%- endif %} table {{ relation }} | ||
as ( | ||
{% if config.get('constraints_enabled', False) %} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The initial piece (lines 8-12) is the same for both conditions. This if block should be moved down to just the piece that is the difference. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done |
||
{{ get_assert_columns_equivalent(sql) }} | ||
{{ get_columns_spec_ddl() }} ; | ||
insert into {{ relation }} {{ get_column_names() }} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On at least Postgres (at least recent versions), you do not need to provide the list of column names when inserting:
On other databases (e.g. Redshift), it may indeed be required There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good to know! I guess this is where dbt's opinions aim for explicit over implicit in compilation |
||
{% else %} | ||
as | ||
{% endif %} | ||
( | ||
{{ sql }} | ||
); | ||
{%- endmacro %} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
{% macro postgres__get_columns_spec_ddl() %} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is any part of this (+ other pg-specific code here) reusable? Any sense in making it the "default" implementation, within the "global project," instead of just for Postgres? Asking just to confirm — I'm sure you've already given it thought, across the many adapter-specific implementations! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, this was a toughie to discern. I biased towards the global macro default bundling the data type DDL and wrapping the select statement because other adapters may benefit from it(think: bigquery, snowflake, etc.). At the same time, the I recommend keeping it as is given the above, but given your team maintains this and you have a bigger picture view of how adapters will use this, I'm happy to make the |
||
{# loop through user_provided_columns to create DDL with data types and constraints #} | ||
{%- set user_provided_columns = model['columns'] -%} | ||
( | ||
{% for i in user_provided_columns %} | ||
{% set col = user_provided_columns[i] %} | ||
{% set constraints = col['constraints'] %} | ||
{% set constraints_check = col['constraints_check'] %} | ||
{{ col['name'] }} {{ col['data_type'] }} {% for x in constraints %} {{ x or "" }} {% endfor %} {% if constraints_check -%} check {{ constraints_check or "" }} {%- endif %} {{ "," if not loop.last }} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A few things here that are interesting, and worth a second look — at least to document, if we don't decide to actually do anything about them. When an explicit column list is provided, both on "older-school" databases that need to use
-- models/important_model.sql
select
1 as id,
10000 as revenue,
100 as num_customers version: 2
models:
- name: important_model
config:
materialized: table
constraints_enabled: True
columns:
- name: id
data_type: numeric
- name: num_customers
data_type: int
- name: revenue
data_type: int
Average revenue per customer is going to look a bit off! Two options here:
insert into ...
(id, num_customers, revenue)
select id, num_customers, revenue from -- I added this
(
select
1 as id,
10000 as revenue,
100 as num_customers
) model_subq; -- postgres requires named subqueries The latter strategy is more or less what we do in incremental models, to handle cases where an incremental model's columns have been reordered since first run. The good news: In all cases, we are guaranteed that the column names, types, and order in the resulting table match up exactly with the columns defined in the config. That is better, since it's static structured data that we know in advance, record in the manifest, and can use to communicate the "API" of this table to downstream consumers. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I vote option 1 because it has less database performance hits because option 2 using subqueries depends a lot more on the database to optimize it(which varies). I'll make this change! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure we create any temp tables in the current implementation, so where would we do the mid-transaction check in option 1? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @dave-connors-3 The way we replace preexisting views/tables on Postgres + Redshift is by creating a "temp" relation (with a suffix So we could try to stick another check in the middle of that process. But I actually think There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
{% endfor %} | ||
) | ||
{% endmacro %} | ||
|
||
{% macro get_column_names() %} | ||
{# loop through user_provided_columns to get column names #} | ||
{%- set user_provided_columns = model['columns'] -%} | ||
( | ||
{% for i in user_provided_columns %} | ||
{% set col = user_provided_columns[i] %} | ||
{{ col['name'] }} {{ "," if not loop.last }} | ||
{% endfor %} | ||
) | ||
{% endmacro %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with this nomenclature for now. I'll be curious to hear feedback from beta testers.
One thing that's potentially misleading: In addition to enabling
constraints
(if they're defined), this also enables (requires) the verification of the number, order, and data types of columns.A long time ago, there was an idea to call this
strict
: #1570. (While searching for this issue, I was proud to find that I still know @jwerderits' GitHub handle by memory.)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm open to changing it after community feedback! There are exceptions that I flow through into
DatabaseErrors
vs.ParsingErrors
because we get error handling for free(think: data type validation, number of columns mismatch SQL).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[For follow-up issues, out of scope for this PR]
After a bit more discussion, I'm thinking about renaming
constraints_enabled
to either:stable: true|false
contracted: true|false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: If it has a default value
False
then it doesn't need to be anOption type
, it could just bebool
.It's a shame the python typing lib uses the word
Optional
as a keyword; it's the default value that makes providing the attribute optional when instantiating the class.On another note, a bool with a default value of
None
forces the use ofOptional[bool]
which only makes sense if you must have a lenient (public) API or if there's inheritance compatibility issues.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davehowell Good catch! It looks like @gshank already had the same thought, and implemented this change over in #7002, along with the config rename :)