add lag / attribution window to incremental #970

Closed
rudolfix opened this issue Feb 14, 2024 · 2 comments · Fixed by #1957
Labels
community (This issue came from slack community workspace), support (This issue is monitored by Solution Engineer)

Comments

@rudolfix
Collaborator

Background
In many cases a certain portion of the data should be reacquired during incremental loading. For example, we want to always capture the last 7 days of data when getting a daily analytics report, or we want to refresh Slack message replies using a moving window of 7 days.
Technically, we would always pass start_date +/- the lag to the function accepting the incremental.

Requirements

    • add a new optional field to the Incremental class that will hold the lag.
    • the lag should be expressed as a float and interpreted depending on the type of the cursor: for datetimes it is a lag value in seconds; for any other type, apply the + / - operator depending on the last_value_func.
    • we support only min and max for last_value_func. for custom functions we do not have a "+" operator defined.
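
To make the requirements above concrete, here is a minimal sketch of how the lag could be applied to a stored cursor value. It is not the dlt implementation: `apply_lag` and `CursorValue` are hypothetical names, and the choice to subtract for max and add for min is an assumption based on the "+ / -" rule above.

```python
from datetime import datetime, timedelta
from typing import Callable, Union

# Hypothetical alias for the cursor types covered by this sketch.
CursorValue = Union[datetime, int, float]


def apply_lag(last_value: CursorValue, lag: float, last_value_func: Callable) -> CursorValue:
    """Hypothetical helper: shift the stored cursor value by `lag`."""
    # Only min and max have a known direction; custom functions are rejected.
    if last_value_func not in (min, max):
        raise ValueError("lag is only defined for min and max last_value_func")
    # For datetime cursors the lag is a number of seconds; otherwise use it as-is.
    delta = timedelta(seconds=lag) if isinstance(last_value, datetime) else lag
    # A max cursor moves forward, so step back; a min cursor moves backward, so step forward.
    return last_value - delta if last_value_func is max else last_value + delta


# Reacquire the last 7 days of a daily report (lag given in seconds).
start = apply_lag(datetime(2024, 2, 14), 7 * 24 * 3600, max)
```

With a max cursor the shifted value would then be passed as the start of the next load, so rows inside the window are fetched again; deduplicating them (for example by primary key) is outside this sketch.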
@VioletM added the community label Feb 27, 2024
@rudolfix moved this from Todo to Planned in dlt core library Apr 22, 2024
@rudolfix moved this from Planned to Todo in dlt core library Jun 3, 2024
@VioletM added the support label Oct 6, 2024
@rudolfix moved this from Planned to In Progress in dlt core library Oct 14, 2024
@axellpadilla
Contributor

Hi, I was just looking for a way to add lag to the incremental process (for example, some distributed sources can have unsynchronized datetimes, where an "older" record is added with a timestamp seconds before the last cursor value).

Is there an example of the intended usage?

Thanks!

@donotpush
Collaborator

@axellpadilla this will be released soon (including docs)

@github-project-automation bot moved this from In Progress to Done in dlt core library Oct 29, 2024
Projects
Status: Done
4 participants