-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Remove note from grid view response #51225
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The grid view is not using the note, and the attempt at joining it didn't make it "cheap" to include, so let's just drop it. For the specific scenario I'm testing, this reduces the response time from ~9s to ~7s.
| tis_of_dag_runs, _ = paginated_select( | ||
| statement=select(TaskInstance) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| tis_of_dag_runs, _ = paginated_select( | |
| statement=select(TaskInstance) | |
| tis_of_dag_runs, _ = paginated_select( | |
| statement=select(TaskInstance) | |
| .options(selectinload(TaskInstance.task_instance_note)) |
Instead of dropping the note column entirely, I wonder if we can get back most of the performance win by switching from OUTER JOIN to a direct select‐in load on the note relationship. https://docs.sqlalchemy.org/en/13/orm/loading_relationships.html#sqlalchemy.orm.selectinload
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be something like this:
statement = (
select(TaskInstance)
.options(selectinload(TaskInstance.task_instance_note))
.where(TaskInstance.dag_id == dag.dag_id)
.where(TaskInstance.run_id.in_([dr.run_id for dr in dag_runs]))
)
tis_of_dag_runs, _ = paginated_select(
statement=statement,
filters=[],
order_by=SortParam(allowed_attrs=["task_id", "run_id"], model=TaskInstance).set_value("task_id"),
offset=offset,
limit=None,
)
This will emit two smaller queries. One for all TIs, one for their notes, rather than one giant join. But we'll still have ti.note populated. Can we try this as well and see if it improves time better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically as performant. I was also aiming for a smaller response, but this field is worth keeping (I have another follow up cutting a lot more out, keep an eye out for that one 😄). Thanks!
|
I think the note was added in the past to add the marker on top of the grid icon that for a task a note is existing. That was nice but if it is a big performance hit, then I assume it is for the time being better without - also also maybe notes can be retrieved async in another call. |
jscheffl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am approving but comments from @sunank200 make sense to check before merge.
Yes, we used to do that. But when we added task state icons, I wasn't confident that the "task has a note" indicator in the grid view was still useful or too noisy. Happy to try Ankit's suggestion, if not, we can go ahead with removing it and figuring out notes later. |
|
It's close enough, it's worth keeping. Moved to #51247. |
The grid view is not using the note, and the attempt at joining it didn't make it "cheap" to include, so let's just drop it.
For the specific scenario I'm testing, this reduces the response time from ~9s to ~7s.
Related to #50928