Description
I'm trying to partition a stream into windows according to a predicate on the elements. That is, I want to implement a function like:
public static IObservable<IObservable<T>> Window<T>(this IObservable<T> source, Func<T, bool> isStartOfNewWindow);
(Incidentally, I feel like this is a common use case that should be part of the library.)
I looked in the IntroToRx docs and found the recommended approach is this:
public static IObservable<IObservable<T>> Window<T>(this IObservable<T> source, Func<T, bool> isStartOfNewWindow)
{
    var shared = source.Publish().RefCount();
    var windowEdge = shared.Where(isStartOfNewWindow).Publish().RefCount();
    return shared.Window(windowEdge, _ => windowEdge);
}
A simple test reveals this does appear to work well:
var source = Observable.Interval(TimeSpan.FromSeconds(1));
var windowed = source.Window(x => x == 0 || x % 5 == 2).SelectMany(o => o.ToList());
windowed.Subscribe(xs => Console.WriteLine(string.Join(", ", xs)));
This prints "0, 1", then "2, 3, 4, 5, 6", then "7, 8, 9, 10, 11", etc., as expected.
However, if I now prepend some items to the source sequence, it does not work correctly:
var source = Observable.Interval(TimeSpan.FromSeconds(1)).Prepend(-1); // Prepend -1
var windowed = source.Window(x => x == -1 || x % 5 == 2).SelectMany(o => o.ToList()); // Change x == 0 condition to x == -1, as that's now the first item
windowed.Subscribe(xs => Console.WriteLine(string.Join(", ", xs)));
This prints the same as the first example, i.e. it ignores the added -1, even though that should now participate in the first window and the first line should be "-1, 0, 1". I observe the same behaviour with the analogous Buffer operator. I also notice that defining source instead by
var source = Observable.Defer(async () => { return Observable.Return(-1L); }).Concat(Observable.Interval(TimeSpan.FromSeconds(1)));
has the same bad behaviour, but
var source = Observable.Defer(async () => { await Task.Delay(1); return Observable.Return(-1L); }).Concat(Observable.Interval(TimeSpan.FromSeconds(1)));
does not, and includes the -1 correctly. Obviously, though, I don't want to introduce artificial delays into my streams as a solution.
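One possible explanation, offered as an assumption rather than a confirmed diagnosis of Window itself: Publish().RefCount() connects to the source on the first subscription, so an item that the source emits synchronously on subscribe is gone before any later subscriber attaches. That would also fit the Task.Delay observation, since the delayed -1 arrives after every subscription is in place. The sketch below isolates that effect; RefCountTimingDemo and Run are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

public static class RefCountTimingDemo
{
    // Hypothetical helper: returns what each of two subscribers received.
    public static (List<long> First, List<long> Second) Run()
    {
        // Return(-1) emits as soon as it is subscribed to;
        // Never keeps the shared stream open afterwards.
        var shared = Observable.Return(-1L)
            .Concat(Observable.Never<long>())
            .Publish()
            .RefCount();

        var first = new List<long>();
        var second = new List<long>();

        // The first subscription makes RefCount connect to the source,
        // so -1 is pushed before the next statement runs.
        var sub1 = shared.Subscribe(x => first.Add(x));

        // The second subscriber attaches after -1 has already gone by.
        var sub2 = shared.Subscribe(x => second.Add(x));

        sub1.Dispose();
        sub2.Dispose();
        return (first, second);
    }
}
```

Here the first subscriber receives -1 and the second receives nothing, which is the same shape of loss as in the Prepend example: two subscriptions to the shared source are set up one after the other, and the synchronous item only reaches the one that exists at connect time.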
What is the correct way to implement the function I need, regardless of the timing of the events in the input sequence? If it's what I already did, can a fix be implemented in Window and Buffer for this behaviour?
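For comparison, here is a minimal sketch of a variant that subscribes to the source only once, so it does not depend on the relative timing of two subscriptions. WindowByPredicate is a hypothetical name, not an Rx.NET API. The sketch also assumes that items arriving before the first match should open an initial window, whereas the IntroToRx version appears to drop them.

```csharp
using System;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class WindowByPredicateExtensions
{
    // Hypothetical operator: starts a new window whenever the predicate matches.
    public static IObservable<IObservable<T>> WindowByPredicate<T>(
        this IObservable<T> source, Func<T, bool> isStartOfNewWindow)
    {
        return Observable.Create<IObservable<T>>(observer =>
        {
            Subject<T> current = null; // the window currently receiving items

            // Single subscription to the source: every item is seen exactly once.
            return source.Subscribe(
                x =>
                {
                    // Assumption: the very first item always opens a window.
                    if (current == null || isStartOfNewWindow(x))
                    {
                        current?.OnCompleted();       // close the previous window
                        current = new Subject<T>();
                        observer.OnNext(current);     // hand out the new window first
                    }
                    current.OnNext(x);                // then push the item into it
                },
                ex =>
                {
                    current?.OnError(ex);
                    observer.OnError(ex);
                },
                () =>
                {
                    current?.OnCompleted();
                    observer.OnCompleted();
                });
        });
    }
}
```

With the input -1, 0, 1, ..., 7 and the predicate x == -1 || x % 5 == 2, this should yield the windows "-1, 0, 1", "2, 3, 4, 5, 6" and "7". The windows are plain subjects, so a consumer has to subscribe to each window as it is emitted (as SelectMany does) or it will miss items. An exception thrown by the predicate is not caught in this sketch.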