-
Notifications
You must be signed in to change notification settings - Fork 14k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(reports): Clear last value when state is WORKING #19941
fix(reports): Clear last value when state is WORKING #19941
Conversation
@@ -94,35 +94,39 @@ def __init__( | |||
self._start_dttm = datetime.utcnow() | |||
self._execution_id = execution_id | |||
|
|||
def set_state_and_log( | |||
def update_report_schedule_and_log( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I felt that update_report_schedule_and_log
was more reflective of the function logic.
self._report_schedule.last_state = state | ||
self._report_schedule.last_eval_dttm = dttm | ||
self._report_schedule.last_eval_dttm = datetime.utcnow() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need to pass in the timestamp which isn't used elsewhere.
superset/reports/commands/execute.py
Outdated
@@ -132,7 +136,7 @@ def create_log( | |||
end_dttm=datetime.utcnow(), | |||
value=self._report_schedule.last_value, | |||
value_row_json=self._report_schedule.last_value_row_json, | |||
state=state, | |||
state=self._report_schedule.state, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need to pass in the state given the report schedule record has been updated with the new state.
Codecov Report
@@ Coverage Diff @@
## master #19941 +/- ##
==========================================
- Coverage 66.52% 66.33% -0.20%
==========================================
Files 1714 1712 -2
Lines 65052 64049 -1003
Branches 6722 6722
==========================================
- Hits 43279 42489 -790
+ Misses 20061 19848 -213
Partials 1712 1712
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
39d2abf
to
6903c48
Compare
Is the report scheduler currently adding two execution logs for every execution, one in WORKING state, one in COMPLETE? Was the original intention to update current WORKING execution to COMPLETE (or FAILED) once an execution is done? |
.all() | ||
): | ||
schedule.last_value = None | ||
schedule.last_value_row_json = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@john-bodley do you think there's a chance that if there is a report that's in a working state, that we could potentially wind up altering the value after the report has completed? Just based off your migration timing of around a minute and most of these reports take under a minute to run, we may wind up with some reports that wind up with a completed state and no value, i.e., the commit happens here after the state has changed on the report side. Just wondering if it's safer not to change reports that are in progress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eschutho I guess with any migration there's a risk of a race condition happening whilst the migration is running. I've replicated the session.commit()
to ensure that there's also a commit directly after the ReportSchedule
which likely only has a handful of reports in WORKING
state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Just one question, but not a blocker.
Yes, the logs are immutable, so you'll get two logs, one for each state.
No, I don't believe this was the intention. |
@john-bodley I suppose it could give the user a start time. It's usually the same as the scheduled time, but if the report is taking a long time to run, it could indicate to the user that something is happening. |
Then I don't understand why we have an |
I think this PR makes things less wrong so it's directionally correct. There's likely merit in rethinking what the desirable UX is from a reporting standpoint, though that can be handled in a different PR. |
* fix(reports): Clear last value when state is WORKING * Update cbe71abde154_fix_report_schedule_and_log.py Co-authored-by: John Bodley <john.bodley@airbnb.com>
SUMMARY
This PR fixes an issue with the alert feature, specifically the execution log reports different alert values for different states within the same execution run. Per this Slack thread the underlying issue is the report schedule synopsis—from which the execution log is created, i.e., a point-in-time snapshot—is when the state is
WORKING
thelast_value
andlast_value_row_json
need to be cleared otherwise they incorrectly reflect the values of the previous execution run.I’m actually not sure whether there’s any value logging the
WORKING
state to the execution log. Does it provide any relevant information to the user?BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Before Migration
After Migration
TESTING INSTRUCTIONS
Added unit tests and migration. The migration successfully ran in under a minute on a replica of Airbnb's production database with ~ 50,000 execution logs.
ADDITIONAL INFORMATION