chore: split events lookup from exception grouping query #24711
Conversation
```python
def parse_embedded_events_and_persons(self, columns: list[str], events: list):
    person_indices: list[int] = []
    for index, col in enumerate(columns):
        if col == "person":
            person_indices.append(index)

    if len(person_indices) > 0 and len(events) > 0:
        person_idx = person_indices[0]
        distinct_ids = list({event[person_idx] for event in events})
        persons = get_persons_by_distinct_ids(self.team.pk, distinct_ids)
        persons = persons.prefetch_related(Prefetch("persondistinctid_set", to_attr="distinct_ids_cache"))
        distinct_to_person: dict[str, Person] = {}
        for person in persons:
            if person:
                for person_distinct_id in person.distinct_ids:
                    distinct_to_person[person_distinct_id] = person

        for column_index in person_indices:
            for index, result in enumerate(events):
                distinct_id: str = result[column_index]
                events[index] = list(result)
                if distinct_to_person.get(distinct_id):
                    person = distinct_to_person[distinct_id]
                    events[index][column_index] = {
                        "uuid": person.uuid,
                        "created_at": person.created_at,
                        "properties": person.properties or {},
                        "distinct_id": distinct_id,
                    }
                else:
                    events[index][column_index] = {
                        "distinct_id": distinct_id,
                    }

    return [dict(zip(columns, value)) for value in events]
```
Get to remove all of this, which was copied over from the EventsQuery anyway.
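For reference, a minimal sketch of the transformation this removed helper performed; the data below is purely illustrative:

```python
# Illustrative input: one column is "person", whose values are distinct_ids.
columns = ["uuid", "timestamp", "person"]
events = [["event-1", "2024-08-30T12:00:00Z", "distinct-id-123"]]

# After parse_embedded_events_and_persons, each row becomes a dict and the
# "person" cell is replaced by an embedded person object (or just the
# distinct_id when no matching person was found):
expected = [
    {
        "uuid": "event-1",
        "timestamp": "2024-08-30T12:00:00Z",
        "person": {"distinct_id": "distinct-id-123"},  # plus uuid/created_at/properties when matched
    }
]
```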
Size Change: 0 B | Total Size: 1.11 MB
📸 UI snapshots have been updated. 2 snapshot changes in total: 0 added, 2 modified, 0 deleted.
Triggered by this commit.
```ts
const where = [
    `has(${stringifyFingerprints(
        fingerprints
    )}, JSONExtract(ifNull(properties.$exception_fingerprint,'[]'),'Array(String)'))`,
]
```
I really don't like writing HogQL on the frontend and having to parse it on the backend
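A minimal sketch of the alternative being hinted at here: send the fingerprints as structured data and let the backend assemble the HogQL expression. The helper name and escaping rules below are assumptions for illustration, not the implementation in this PR:

```python
def fingerprint_where_clause(fingerprints: list[list[str]]) -> str:
    """Hypothetical backend helper: build the same has(...) filter the frontend
    currently assembles via stringifyFingerprints, from structured input."""

    def quote(value: str) -> str:
        # Escape backslashes and single quotes for a ClickHouse/HogQL string literal.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    literal = "[" + ", ".join(
        "[" + ", ".join(quote(part) for part in fp) + "]" for fp in fingerprints
    ) + "]"
    return (
        f"has({literal}, "
        "JSONExtract(ifNull(properties.$exception_fingerprint, '[]'), 'Array(String)'))"
    )

# e.g. fingerprint_where_clause([["TypeError", "cannot read properties"]])
```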
```diff
@@ -21,10 +21,10 @@ const meta: Meta = {
     post: {
         '/api/projects/:team_id/query': async (req, res, ctx) => {
             const query = (await req.clone().json()).query
-            if (query.fingerprint) {
-                return res(ctx.json(errorTrackingGroupQueryResponse))
+            if (query.kind === 'ErrorTrackingQuery') {
```
Should `ErrorTrackingQuery` be an enum somewhere so we apply defensive programming?
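One hedged way to do that (names are illustrative, and the same idea applies on the frontend side; PostHog may already expose the query kinds elsewhere) is to keep the kind strings in a single enum and compare against that instead of a bare string literal:

```python
from enum import Enum


class NodeKind(str, Enum):
    # Illustrative subset; only the kinds this PR touches.
    ERROR_TRACKING_QUERY = "ErrorTrackingQuery"
    EVENTS_QUERY = "EventsQuery"


def is_error_tracking_query(query: dict) -> bool:
    # The str mixin lets the enum member compare equal to the raw "kind" string.
    return query.get("kind") == NodeKind.ERROR_TRACKING_QUERY
```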
…PostHog/posthog into dn-fix/split-error-event-lookup
Problem
Inspired by #24511
The `ErrorTrackingQuery` used to fetch the errors for a given fingerprint was super slow because we were using `groupArray(tuple(...))` over all the events in the period. While we could have rewritten the events lookup as a subquery, we would then have two levels of "limit" in the query (one at the group level and one at the events level). It all just felt like the one query was doing too much.
Changes
Split things into two queries: everything in the `ErrorTrackingQuery` that relates to individual events is now fetched with an `EventsQuery` instead.
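Roughly, the split looks like this from the API consumer's point of view; the field names below are illustrative rather than the exact schema:

```python
# Query 1: the slimmed-down ErrorTrackingQuery, which only aggregates per
# exception group and no longer groupArray()s every event into the response.
error_tracking_query = {
    "kind": "ErrorTrackingQuery",
    "fingerprint": ["TypeError", "cannot read properties"],  # illustrative fingerprint
}

# Query 2: a plain EventsQuery that fetches the individual events for that
# group, filtered by the same fingerprint, with its own limit and ordering.
events_query = {
    "kind": "EventsQuery",
    "select": ["uuid", "timestamp", "person"],
    "where": [
        "has([['TypeError', 'cannot read properties']], "
        "JSONExtract(ifNull(properties.$exception_fingerprint, '[]'), 'Array(String)'))"
    ],
}
```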
Does this work well for both Cloud and self-hosted?
Yes
How did you test this code?
Tests should remain exactly the same except for the removed ones