-
Notifications
You must be signed in to change notification settings - Fork 196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incremental table hints and incremental in resource decorator #2033
Conversation
✅ Deploy Preview for dlt-hub-docs canceled.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@steinitzu pls take a look at:
- IMO it is way better if we set Incremental on the column level in Schema and ignore all non-fitting cases
- I think the code that injects Incremental should use existing mechanism (apply_hints) OR I do not understand some edge cases you had
dlt/extract/resource.py
Outdated
@@ -102,6 +103,7 @@ def __init__( | |||
section: str = None, | |||
args_bound: bool = False, | |||
SPEC: Type[BaseConfiguration] = None, | |||
incremental: Optional[TIncrementalConfig] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please pass it in hint
. Resource hints may hold incremental and you can extend make_hints
to set it up:
class TResourceHints(TResourceHintsBase, total=False):
name: TTableHintTemplate[str]
# description: TTableHintTemplate[str]
# table_sealed: Optional[bool]
columns: TTableHintTemplate[TTableSchemaColumns]
incremental: Incremental[Any]
dlt/extract/decorators.py
Outdated
@@ -642,7 +648,9 @@ def make_resource(_name: str, _section: str, _data: Any) -> TDltResourceImpl: | |||
selected, | |||
cast(DltResource, data_from), | |||
True, | |||
incremental=incremental, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should extend make_hints
to accept incremental config. then we'll be able to use existing mechanism that injects incremental into functions
dc917a3
to
867ee67
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- let's save just
incremental
: True into the schema. see my comment why - I added a test that makes sure we can add incremental during extraction (but limited...)
- we should also add "primary_key" to IncrementalArgs. mind that empty key ([] or ()) is also a valid hint. (this key is used for boundary deduplication)
tests:
Do we detect nested incremental columns ie $.item.id will map into "item__id"? if you allow for that then please test it
dlt/extract/hints.py
Outdated
@@ -86,6 +86,7 @@ def make_hints( | |||
table_format: TTableHintTemplate[TTableFormat] = None, | |||
file_format: TTableHintTemplate[TFileFormat] = None, | |||
references: TTableHintTemplate[TTableReferenceParam] = None, | |||
incremental: Incremental[Any] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we allow TIncrementalConfig
here as well?
dlt/extract/hints.py
Outdated
@@ -221,7 +228,7 @@ def apply_hints( | |||
columns: TTableHintTemplate[TAnySchemaColumns] = None, | |||
primary_key: TTableHintTemplate[TColumnNames] = None, | |||
merge_key: TTableHintTemplate[TColumnNames] = None, | |||
incremental: Incremental[Any] = None, | |||
incremental: Union[Incremental[Any], IncrementalArgs] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is TIncrementalConfig
dlt/common/jsonpath.py
Outdated
E.g. the jsonpath `$.foo` or `foo` would return `foo`. | ||
The jsonpath `$.foo.items[*]` would return None. | ||
""" | ||
path = compile_path(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @burnash added very similar function. let's unify our code and make sure this is tested
dlt/extract/hints.py
Outdated
def _merge_incremental_column_hint(dict_: Dict[str, Any]) -> None: | ||
incremental = dict_.pop("incremental") | ||
if isinstance(incremental, Incremental): | ||
incremental_hint = incremental.to_table_hint() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the ticket asked to put the full incremental info into table now when I think of it - it will be confusing. it describes the behavior of the data source, not the data in the actual table. my take is to just set incremental to True
if it is present on a column,
it is sufficient for what we plan upstream. we could add row order later if we really need it (using x-
annotation)
I do not thing this impacts this PR too much
dlt/extract/incremental/__init__.py
Outdated
@@ -166,6 +178,17 @@ def __init__( | |||
self._bound_pipe: SupportsPipe = None | |||
"""Bound pipe""" | |||
|
|||
def to_table_hint(self) -> Optional[IncrementalArgs]: | |||
"""Table hint is only returned when all properties are serializable""" | |||
if self.last_value_func not in (min, max): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we we do not need to care. we just return True if we can localize incremental to a column
* Extract incremental settings to a dict in table schema * Support passing incremental settings to @resource decorator
8b45dfb
to
8a0c020
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Description
incremental
hint to table schema, containing all incremental settings used in extractincremental
in@resource
decorator. Either as dict (IncrementalArgs
typeddict or incremental instance)Related Issues
Additional Context