refactor(sessions): add dataloaders for session queries #5222

Merged
RogerHYang merged 10 commits into sessions2 from dataloaders-for-sessions on Oct 29, 2024

Conversation

RogerHYang
Contributor

No description provided.

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Oct 29, 2024
@Arize-ai Arize-ai deleted a comment from review-notebook-app bot Oct 29, 2024
@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Oct 29, 2024
@@ -68,6 +71,10 @@ class DataLoaders:
latency_ms_quantile: LatencyMsQuantileDataLoader
min_start_or_max_end_times: MinStartOrMaxEndTimeDataLoader
record_counts: RecordCountDataLoader
session_first_inputs: SessionFirstInputLastOutputsDataLoader
Contributor

Suggested change
- session_first_inputs: SessionFirstInputLastOutputsDataLoader
+ session_inputs: SessionIODataLoader

Contributor Author

I think `session_inputs` can be ambiguous, since it can imply a list of inputs per session, whereas `first_input` matches the GraphQL field, which is more explicit.
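
To make the naming point concrete, here is a minimal sketch of a batching loader that resolves exactly one first input per session key. It is a stand-in written with `asyncio` only, not the loader class used in this PR; `TinyBatchLoader`, `load_first_inputs`, `FIRST_INPUTS`, and `BATCH_CALLS` are hypothetical names introduced for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Optional, Tuple

# Hypothetical in-memory data: session id -> first input value.
FIRST_INPUTS = {1: "hello", 2: "hi"}
# Records each batch the loader sends, to show that keys are coalesced.
BATCH_CALLS: List[List[int]] = []


class TinyBatchLoader:
    """Collects keys requested in the same event-loop tick into one batch."""

    def __init__(self, load_fn: Callable[[List[int]], Awaitable[List[Any]]]) -> None:
        self._load_fn = load_fn
        self._pending: List[Tuple[int, asyncio.Future]] = []
        self._scheduled = False

    def load(self, key: int) -> asyncio.Future:
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._pending.append((key, future))
        if not self._scheduled:
            # The task only starts once the caller yields to the loop,
            # so every key queued before that point joins the same batch.
            self._scheduled = True
            loop.create_task(self._dispatch())
        return future

    async def _dispatch(self) -> None:
        batch, self._pending = self._pending, []
        self._scheduled = False
        try:
            results = await self._load_fn([key for key, _ in batch])
        except Exception as exc:
            for _, future in batch:
                if not future.done():
                    future.set_exception(exc)
            return
        # Results must come back in the same order as the keys.
        for (_, future), result in zip(batch, results):
            if not future.done():
                future.set_result(result)


async def load_first_inputs(session_ids: List[int]) -> List[Optional[str]]:
    # One value (or None) per session id, which is what "first_input" implies.
    BATCH_CALLS.append(list(session_ids))
    return [FIRST_INPUTS.get(session_id) for session_id in session_ids]


async def demo() -> List[Optional[str]]:
    loader = TinyBatchLoader(load_first_inputs)
    return await asyncio.gather(loader.load(1), loader.load(2), loader.load(3))
```

Each key maps to a single scalar rather than a list, which is the distinction the `first_input` name is meant to carry.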

src/phoenix/server/api/dataloaders/session_num_traces.py (outdated)
Comment on lines 27 to 32
stmt = (
select(models.Trace.project_session_rowid.label("id_"))
.join_from(models.Span, models.Trace)
.where(models.Span.parent_id.is_(None))
.where(models.Trace.project_session_rowid.isnot(None))
)
Contributor

Do we need to handle the case of multiple root spans in a single trace?

Contributor Author

We're punting on that for now, since there's no good way to display them.

Contributor

How does the query plan look with the row-number / filter-on-rank approach?

Contributor Author

Here's the query plan from Postgres:

[Screenshot: query plan, 2024-10-29 3:19:35 PM]
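
For readers following this thread, the row-number / filter-on-rank approach can be sketched roughly as below. This is an illustration under stated assumptions, not the query from the PR: it uses the standard-library `sqlite3` module instead of SQLAlchemy and Postgres, the table and column names (`traces`, `spans`, `trace_rowid`, `start_time`, `input_value`) are hypothetical, and the ordering used to pick the "first" root span is an assumption. Only `project_session_rowid` and `parent_id` come from the snippet under review.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE traces (
            id INTEGER PRIMARY KEY,
            project_session_rowid INTEGER,
            start_time TEXT
        );
        CREATE TABLE spans (
            id INTEGER PRIMARY KEY,
            trace_rowid INTEGER,
            parent_id TEXT,
            start_time TEXT,
            input_value TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO traces VALUES (?, ?, ?)",
        [
            (1, 1, "2024-10-29T10:00:00"),
            (2, 1, "2024-10-29T11:00:00"),
            (3, 2, "2024-10-29T09:00:00"),
            (4, None, "2024-10-29T08:00:00"),  # trace without a session
        ],
    )
    conn.executemany(
        "INSERT INTO spans VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, None, "2024-10-29T10:00:01", "hello"),
            (2, 1, None, "2024-10-29T10:00:02", "second root"),  # extra root span
            (3, 1, "abc", "2024-10-29T10:00:03", "child"),
            (4, 2, None, "2024-10-29T11:00:01", "later trace"),
            (5, 3, None, "2024-10-29T09:00:01", "hi"),
            (6, 4, None, "2024-10-29T08:00:01", "no session"),
        ],
    )
    return conn


def first_inputs(conn: sqlite3.Connection) -> dict:
    """Return {session id: input of the earliest root span in that session}."""
    rows = conn.execute(
        """
        SELECT id_, value FROM (
            SELECT t.project_session_rowid AS id_,
                   s.input_value AS value,
                   ROW_NUMBER() OVER (
                       PARTITION BY t.project_session_rowid
                       -- Assumed ordering: earliest trace, then earliest span.
                       ORDER BY t.start_time, s.start_time, s.id
                   ) AS rn
            FROM spans AS s
            JOIN traces AS t ON s.trace_rowid = t.id
            WHERE s.parent_id IS NULL
              AND t.project_session_rowid IS NOT NULL
        )
        WHERE rn = 1
        """
    ).fetchall()
    return dict(rows)
```

Because the outer filter keeps only `rn = 1`, a trace with several root spans still yields a single row per session, so this shape also sidesteps the multiple-root-span question raised earlier, at the cost of silently ignoring the extra roots.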

src/phoenix/server/api/dataloaders/session_num_traces.py (outdated)
Comment on lines 35 to 36
models.Span.attributes[INPUT_VALUE].label("value"),
models.Span.attributes[INPUT_MIME_TYPE].label("mime_type"),
Contributor

Same non-blocking comment as before.

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Oct 29, 2024
@RogerHYang RogerHYang merged commit e220ed2 into sessions2 Oct 29, 2024
30 checks passed
@RogerHYang RogerHYang deleted the dataloaders-for-sessions branch October 29, 2024 22:44
Labels
size:L This PR changes 100-499 lines, ignoring generated files.
Projects
Status: Done

2 participants