-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Event funnel property handling exceeding resource limits #304
Comments
Yeah this isn't totally surprising, #278 made these already complex queries more complex. I think we probably need to adjust it so that we just look for the event character (and not that of the properties) in the case that properties aren't specified. As things are right now, it will look at all properties individually. Unfortunately I suspect this might be a bit complicated to actually implement. The short term solution I'd propose would be to use a |
This has been implemented in #497, and should be deployed tomorrow. |
I also have an idea about removing unnecessary regular expression matching, which should reduce CPU resources required. Currently the funnel SQL generation results in queries like this: SELECT
COUNT(
CASE
WHEN REGEXP_CONTAINS(funnel_analysis.events, mozfun.event_analysis.create_funnel_regex([step_1.match_string],True))
THEN 1
ELSE NULL
END
) AS step_1,
COUNT(
CASE
WHEN (REGEXP_CONTAINS(funnel_analysis.events, mozfun.event_analysis.create_funnel_regex([step_1.match_string],True)))
AND (REGEXP_CONTAINS(funnel_analysis.events, mozfun.event_analysis.create_funnel_regex([step_1.match_string,step_2.match_string],True)))
THEN 1
ELSE NULL
END
) AS step_2,
COUNT(
CASE
WHEN (REGEXP_CONTAINS(funnel_analysis.events, mozfun.event_analysis.create_funnel_regex([step_1.match_string],True)))
AND (REGEXP_CONTAINS(funnel_analysis.events, mozfun.event_analysis.create_funnel_regex([step_1.match_string,step_2.match_string],True)))
AND (REGEXP_CONTAINS(funnel_analysis.events, mozfun.event_analysis.create_funnel_regex([step_1.match_string,step_2.match_string,step_3.match_string],True)))
THEN 1
ELSE NULL
END
) AS step_3,
COUNT(
CASE
WHEN (REGEXP_CONTAINS(funnel_analysis.events, mozfun.event_analysis.create_funnel_regex([step_1.match_string],True)))
AND (REGEXP_CONTAINS(funnel_analysis.events, mozfun.event_analysis.create_funnel_regex([step_1.match_string,step_2.match_string],True)))
AND (REGEXP_CONTAINS(funnel_analysis.events, mozfun.event_analysis.create_funnel_regex([step_1.match_string,step_2.match_string,step_3.match_string],True)))
AND (REGEXP_CONTAINS(funnel_analysis.events, mozfun.event_analysis.create_funnel_regex([step_1.match_string,step_2.match_string,step_3.match_string,step_4.match_string],True)))
THEN 1
ELSE NULL
END
) AS step_4 However, I believe the additional regular expressions for each subsequent step are supersets of the previous steps' regular expressions, so also matching all those previous regular expressions should be unnecessary, and we could have simpler queries like this: SELECT
COUNT(
CASE
WHEN REGEXP_CONTAINS(funnel_analysis.events, mozfun.event_analysis.create_funnel_regex([step_1.match_string],True))
THEN 1
ELSE NULL
END
) AS step_1,
COUNT(
CASE
WHEN REGEXP_CONTAINS(funnel_analysis.events, mozfun.event_analysis.create_funnel_regex([step_1.match_string,step_2.match_string],True))
THEN 1
ELSE NULL
END
) AS step_2,
COUNT(
CASE
WHEN REGEXP_CONTAINS(funnel_analysis.events, mozfun.event_analysis.create_funnel_regex([step_1.match_string,step_2.match_string,step_3.match_string],True))
THEN 1
ELSE NULL
END
) AS step_3,
COUNT(
CASE
WHEN REGEXP_CONTAINS(funnel_analysis.events, mozfun.event_analysis.create_funnel_regex([step_1.match_string,step_2.match_string,step_3.match_string,step_4.match_string],True))
THEN 1
ELSE NULL
END
) AS step_4 Assigning this to myself. |
It looks like the changes in #278 caused the generated queries to get more complex and requiring more CPU resources. This has caused some dashboard queries (specifically of this dashboard: https://mozilla.cloud.looker.com/dashboards-next/270) to fail with:
@wlach You probably have the most context here. Any ideas how this could be fixed?
The text was updated successfully, but these errors were encountered: