Session windows #243

mwylde · 2023-08-11T02:01:13Z

This PR adds support for session windows, addressing #239.

Session windows divide time up by a minimum gap size and are useful for sessionization. For example, a session window could be defined on clickstream data to determine metrics for a particular usage period.

Session windows can be used like our existing time window functions (hop and tumble). For example:

select bid.auction, count(*), avg(bid.price), session(interval '30 seconds')
from nexmark
where bid is not null
group by bid.auction, session(interval '30 seconds')

This PR introduces only an unoptimized implementation of session windows without support for partial aggregates.

This PR also includes a large rework of how windows are handled in SQL. This was necessary because the existing implementation relied on windows being fixed width in order to determine the window output.

Previously, we would compile a window-groupby into two nodes: a window aggregation, and a merge node. The window aggregation was responsible for outputting all of the aggregate fields (in the above query, that would be the count(*)), while the subsequent merge was responsible for combining the aggregates with the group by fields. Key fields (like bid.auction) would be pulled from the key, while window fields (session(interval '30 seconds')) would be computed statically from the timestamp and the window size. However, this doesn't work with dynamically-sized windows.

Now instead each aggregation closure takes both a key and the window, and is responsible for outputting the full struct. In addition to allowing us to support non-fixed sized windows like session windows, it also saves a (non-fused) node graph and simplifies the code significantly.

jacksonrnewhouse

The timestamp used for session windows needs to be fixed. I described the behavior of that field in the comments.

jacksonrnewhouse · 2023-08-11T23:58:00Z

arroyo-sql/src/operators.rs

-    pub field_names: Vec<Column>,
-    pub field_computations: Vec<AggregationExpression>,
+    pub aggregates: Vec<(Column, AggregationExpression)>,
+    pub group_bys: Vec<(Column, syn::Expr, TypeDef)>,


I don't love how this leaks generating code into this phase of the planning, but everything is complex enough right now I don't think it is worth trying to fix this right now. Will be a goal for reworking how code is generated.

arroyo-worker/src/operators/windows.rs

jacksonrnewhouse · 2023-08-12T00:36:34Z

arroyo-worker/src/operators/windows.rs

+                .get_key_state('s')
+                .await
+                // I have no idea if this timestamp is correct -- this datastructure does not make sense to me
+                .insert(timestamp, key, windows.unwrap())


I think this is wrong. Given that events can arrive out of order, imagine you have a 24 hour window in the future, with gaps of 30 minutes, it all fills in at 00:00, but the last event you process is at 00:10. This'd get written with that 00:10 timestamp. Then, if you restarted at 1:00, it would expire the record of your window. This should be the max of all timestamps you've seen for the key, I believe.

You can compute this by maxing over the windows, or storing it along with the vec of Windows.

Good point, this works for hop/tumble windows because they're fixed width, so the watermark + the safe_retention_micros is enough to ensure we don't delete data we might need in the future.

However I'm not sure your approach would work either, since at insertion time we can't know how long we need to store it—every time the window gets extended we'd have to rewrite all of the events for that key.

Instead what do you think about just opting out of the optimized delete behavior and relying on the clear_time_range calls in handle_timer?

My comments were for the 's' table, which gets rewritten every time we change the windows. For the window state table 's', you are correct that we can't use the expiration, so should be relying on clear_time_range.

From offline conversation:

For the window table 'w' we can get the correct behavior by using the max endtime of all of the windows, which ensures that we do not clear up the windows vec until all windows are finished

The state table 's' is more problematic; since we can't know at insertion time when the window that contains that record will close (since it can be arbitrarily extended) we can't use any watermark-based cleanup—currently the only type of deletion currently supported by the state backend. For now, we will introduce a max session window size of 24 hours; any sessions longer than that will get broken up. All state will be retained for 24 hours then deleted. As a follow on, we will add support for tombstone-based deletion and compaction in the s3 state backend.

mwylde · 2023-08-14T20:50:35Z

Updated with tests for the window handling logic, fixed timestamps for the window table, and a max session size to correctly handling deletion of the state data.

mwylde requested a review from jacksonrnewhouse August 11, 2023 02:01

mwylde force-pushed the session_windows branch from fe61f80 to f8330b5 Compare August 11, 2023 17:10

jacksonrnewhouse suggested changes Aug 12, 2023

View reviewed changes

mwylde added 10 commits August 14, 2023 14:04

Partial implementation of session windows

80a55be

sessions state-backed

feea136

Work on session windows

a6cdaa6

Work

cdee8b9

checkpoint -- basic windows are working but not other windows

d10a062

Updated AggregatingWindow

195249b

Tests passing

b54b7b6

Clean up unused code

4b97cd7

Working session windows

9f02e9d

Add session window tests and max session size

f31f429

mwylde force-pushed the session_windows branch from ad17a1a to f31f429 Compare August 14, 2023 21:04

jacksonrnewhouse approved these changes Aug 14, 2023

View reviewed changes

Merge branch 'master' into session_windows

069ed48

mwylde enabled auto-merge (rebase) August 14, 2023 22:06

mwylde merged commit 944ecfb into master Aug 14, 2023

mwylde mentioned this pull request Aug 16, 2023

Add support for session window #239

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Session windows #243

Session windows #243

mwylde commented Aug 11, 2023 •

edited

Loading

jacksonrnewhouse left a comment

jacksonrnewhouse Aug 11, 2023

jacksonrnewhouse Aug 12, 2023

jacksonrnewhouse Aug 12, 2023

mwylde Aug 12, 2023 •

edited

Loading

jacksonrnewhouse Aug 14, 2023

mwylde Aug 14, 2023

mwylde commented Aug 14, 2023

Session windows #243

Session windows #243

Conversation

mwylde commented Aug 11, 2023 • edited Loading

jacksonrnewhouse left a comment

Choose a reason for hiding this comment

jacksonrnewhouse Aug 11, 2023

Choose a reason for hiding this comment

jacksonrnewhouse Aug 12, 2023

Choose a reason for hiding this comment

jacksonrnewhouse Aug 12, 2023

Choose a reason for hiding this comment

mwylde Aug 12, 2023 • edited Loading

Choose a reason for hiding this comment

jacksonrnewhouse Aug 14, 2023

Choose a reason for hiding this comment

mwylde Aug 14, 2023

Choose a reason for hiding this comment

mwylde commented Aug 14, 2023

mwylde commented Aug 11, 2023 •

edited

Loading

mwylde Aug 12, 2023 •

edited

Loading