Add tests for staging #100
General comment - How do the recency tests and freshness checks relate to each other? Are we running both freshness and recency tests on source?
If we are only running recency tests, I suggest we move the datepart and interval to something more aggressive so that we are alerted earlier to issues. In theory, the data should never be older than 20 min in the upstream tables.
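For reference, a tighter recency test could look roughly like the sketch below. It assumes the recency test from the dbt_utils package and the closed_at timestamp column; the model name and the 30-minute threshold are placeholders, not values taken from this PR.

```yaml
# Hypothetical sketch: model name and threshold are assumptions.
version: 2

models:
  - name: stg_history_transactions   # assumed model name
    tests:
      # Fails if the newest closed_at value is older than `interval` dateparts.
      - dbt_utils.recency:
          datepart: minute
          field: closed_at
          interval: 30               # illustrative threshold only
```

A threshold slightly above the expected 20 min lag leaves some slack for normal pipeline delay while still alerting early.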
My other nit is that not null tests for asset_code and asset_issuer aren't super effective in my opinion. These fields are always blank for the native asset, which is essentially null.
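One possible way to make these checks meaningful, sketched under assumptions: restrict them to non-native rows with dbt's where test config, and also reject empty strings. The model name and the asset_type column name are assumed here (only the value native appears elsewhere in this PR), and dbt_utils.not_empty_string assumes a dbt_utils version that ships that test.

```yaml
# Hypothetical sketch: model name and asset_type column are assumptions.
version: 2

models:
  - name: stg_trust_lines            # assumed model name
    columns:
      - name: asset_issuer
        tests:
          # Skip native-asset rows, where the field is blank by design.
          - not_null:
              config:
                where: "asset_type != 'native'"
          # Blank strings are not NULL, so check for them separately.
          - dbt_utils.not_empty_string:
              config:
                where: "asset_type != 'native'"
```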
- asset_code
- asset_issuer
- liquidity_pool_id
nit: I think you could use ledger_key instead of these three columns
This looks like an expensive test, and it is taking a while to run on 86B records. Should we run these tests only on newer data?
+1, agreed. Thank you for catching this. Generally, these tests will be expensive on state tables, and this is one of the more expensive ones because of the table size. Let's just run on newer data, almost in incremental mode, until we come up with a better solution.
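A minimal sketch of limiting an expensive test to recent rows, using dbt's built-in where test config rather than a custom test. The model name, the tested column, the two-day window, and the BigQuery-style timestamp function are all assumptions for illustration.

```yaml
# Hypothetical sketch: model, column, window, and SQL dialect are assumptions.
version: 2

models:
  - name: stg_trust_lines            # assumed model name
    columns:
      - name: ledger_key
        tests:
          - not_null:
              config:
                # Only scan rows closed in the last 2 days (BigQuery SQL assumed).
                where: "closed_at >= timestamp_sub(current_timestamp(), interval 2 day)"
```

Whether this actually reduces cost depends on how the table is partitioned; filtering on the partitioning column is what lets the warehouse skip older data.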
The plan is to have tests at two levels:
In this PR we are addressing the staging part; sources are being handled separately. Freshness checks and recency tests are similar in that both look for stale data to flag, the difference being that recency tests are triggered as part of
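For comparison, a source freshness check is declared on the source definition and evaluated by the dbt source freshness command. A sketch, with the source name, the choice of table, and the thresholds all assumed:

```yaml
# Hypothetical sketch: source name and thresholds are assumptions.
version: 2

sources:
  - name: example_source             # placeholder source name
    tables:
      - name: history_transactions
        # Column used to measure how recently data was loaded.
        loaded_at_field: closed_at
        freshness:
          warn_after: {count: 30, period: minute}
          error_after: {count: 60, period: minute}
```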
Will do
Thanks for the explanation, it helps clarify the differences between the two tests
Few nits. My only other comment is that tests that perform full table scans may need adjustment to scan only incremental data, especially on larger tables like trust_lines, history_operations and history_transactions.
remove quote
3267ba0 to aecc5c8
89488b7 to a2553fa
a5dc772 to 43f579c
43f579c to 2b78a7c
stringify
9142e7d to 95e2ffe
models/staging/stg_contract_data.yml
Outdated
- incremental_accepted_values:
    date_column_name: "closed_at"
    greater_than_equal_to: "2 day"
    values: ["credit_alphanum4", "credit_alphanum12", "native"]
    quote: true
Need to remove this test for now. Context in https://stellarorg.atlassian.net/browse/HUBBLE-574
fa93501 to 2cf340c
update update update deps
2cf340c to 886670f