Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#862 much faster version of aggregate_15_min_mvt #863

Merged
merged 3 commits into from
Feb 13, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -15,70 +15,96 @@ DECLARE

BEGIN

WITH aggregate_insert AS (
INSERT INTO miovision_api.volumes_15min_mvt(
intersection_uid, datetime_bin, classification_uid, leg, movement_uid, volume
)
WITH temp AS (
-- Cross product of dates, intersections, legal movement for cars, bikes, and peds to aggregate
SELECT
im.intersection_uid,
dt.datetime_bin,
im.classification_uid,
im.leg,
im.movement_uid,
CASE
--set unacceptable gaps as nulls
WHEN un.datetime_bin IS NOT NULL THEN NULL
--gap fill with zeros (restricted to certain modes in having clause)
ELSE (COALESCE(SUM(v.volume), 0))
END AS volume
-- Cross product of dates, intersections, legal movement for cars, bikes, and peds to aggregate
0 AS volume
FROM miovision_api.intersection_movements AS im
CROSS JOIN generate_series(
start_date,
end_date - interval '15 minutes',
interval '15 minutes'
) AS dt(datetime_bin)
JOIN miovision_api.intersections AS mai USING (intersection_uid)
--To avoid aggregating unacceptable gaps
LEFT JOIN miovision_api.unacceptable_gaps AS un ON
un.intersection_uid = im.intersection_uid
--remove the 15 minute bin containing any unacceptable gaps
AND dt.datetime_bin = un.datetime_bin
--To get 1min bins
LEFT JOIN miovision_api.volumes AS v ON
--help query choose correct partition
WHERE
--0 padding for certain modes (padding)
im.classification_uid IN (1,2,6,10)
AND im.intersection_uid = ANY(target_intersections)

UNION ALL

--real volumes
SELECT
v.intersection_uid,
datetime_bin_15(v.datetime_bin) AS datetime_bin,
v.classification_uid,
v.leg,
v.movement_uid,
SUM(volume)
FROM miovision_api.volumes AS v
--only aggregate common movements
JOIN miovision_api.intersection_movements USING (
intersection_uid, classification_uid, leg, movement_uid
)
WHERE
v.datetime_bin >= start_date
AND v.datetime_bin < end_date
AND v.datetime_bin >= dt.datetime_bin
AND v.datetime_bin < dt.datetime_bin + interval '15 minutes'
AND v.intersection_uid = im.intersection_uid
AND v.classification_uid = im.classification_uid
AND v.leg = im.leg
AND v.movement_uid = im.movement_uid
AND v.intersection_uid = ANY(target_intersections)
GROUP BY
v.intersection_uid,
datetime_bin_15(v.datetime_bin),
v.classification_uid,
v.leg,
v.movement_uid
),

aggregate_insert AS (
INSERT INTO miovision_api.volumes_15min_mvt(
intersection_uid, datetime_bin, classification_uid, leg, movement_uid, volume
)
SELECT DISTINCT ON (
v.intersection_uid, v.datetime_bin, v.classification_uid, v.leg, v.movement_uid
)
v.intersection_uid,
v.datetime_bin,
v.classification_uid,
v.leg,
v.movement_uid,
CASE
--set unacceptable gaps as nulls
WHEN un.datetime_bin IS NOT NULL THEN NULL
--gap fill with zeros (restricted to certain modes in temp CTE)
ELSE v.volume
END AS volume
FROM temp AS v
JOIN miovision_api.intersections AS i USING (intersection_uid)
--set unacceptable gaps as null
LEFT JOIN miovision_api.unacceptable_gaps AS un USING (
intersection_uid, datetime_bin
)
WHERE
-- Only include dates during which intersection is active
-- (excludes entire day it was added/removed)
dt.datetime_bin > mai.date_installed + interval '1 day'
v.datetime_bin >= i.date_installed + interval '1 day'
AND (
mai.date_decommissioned IS NULL
OR (dt.datetime_bin < mai.date_decommissioned - interval '1 day')
i.date_decommissioned IS NULL
OR (v.datetime_bin < i.date_decommissioned - interval '1 day')
)
--exclude movements already aggregated
AND v.volume_15min_mvt_uid IS NULL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this excluded cause we have the clearing function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes exactly! it is redundant - we always run clear + aggregate together whether through run-api or airflow.

AND im.intersection_uid = ANY(target_intersections)
GROUP BY
im.intersection_uid,
dt.datetime_bin,
im.classification_uid,
im.leg,
im.movement_uid,
un.datetime_bin
HAVING
--retain 0s for certain modes (padding)
im.classification_uid IN (1,2,6,10)
OR SUM(v.volume) > 0
ORDER BY
v.intersection_uid,
v.datetime_bin,
v.classification_uid,
v.leg,
v.movement_uid,
--select real value instead of padding value if available
v.volume DESC
RETURNING intersection_uid, volume_15min_mvt_uid, datetime_bin, classification_uid, leg, movement_uid, volume
)

--To update foreign key for 1min bin table
UPDATE miovision_api.volumes AS v
SET volume_15min_mvt_uid = a_i.volume_15min_mvt_uid
Expand Down Expand Up @@ -107,4 +133,4 @@ GRANT EXECUTE ON FUNCTION miovision_api.aggregate_15_min_mvt(date, date, integer
TO miovision_api_bot;

GRANT EXECUTE ON FUNCTION miovision_api.aggregate_15_min_mvt(date, date, integer [])
TO miovision_admins;
TO miovision_admins;
Loading