[WIP] Broadcast algorithm #214

Draft · wants to merge 2 commits into main

Conversation

phausler
Member

This algorithm allows values from one source to be broadcast to multiple consumers.

Multiple instances derived from the same broadcast share the values produced by iteration. They also share thrown failures and non-throwing termination. Cancellation is handled on a per-instance basis, but if all iterations are cancelled and there are no more references to the base iteration, then the base iteration is cancelled.

There is room for improvement: currently each side holds its own buffer of the elements yet to be consumed. This could be changed to an index into a shared buffer; however, the algorithm to share that buffer is quite complex and could be tricky to implement. So as an initial draft this is left at O(N * M) storage, where N is the number of active iterations and M is the number of elements that have yet to be consumed. In degenerate cases this can be viewed as O(N^2) storage, but in ideal cases it is O(1). A shared buffer would bring the degenerate case down to O(N) and keep the normal case at O(1).
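
For illustration, here is a minimal usage sketch. It assumes the draft exposes a `broadcast()` extension on `AsyncSequence`; the exact spelling of the final API may differ.

```swift
import AsyncAlgorithms

// Hypothetical usage of the draft API: one shared upstream iteration,
// observed by two independent consumers.
let base = [1, 2, 3].async
let shared = base.broadcast()

// Each consumer gets its own iterator, but every iterator observes the
// elements produced while it is attached (earlier elements are not replayed).
let first = Task { for await element in shared { print("A:", element) } }
let second = Task { for await element in shared { print("B:", element) } }

await first.value
await second.value
```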

@phausler phausler added the v1.1 Post-1.0 work label Oct 14, 2022
@phausler phausler marked this pull request as draft October 14, 2022 17:11
@twittemb
Contributor

twittemb commented Oct 15, 2022

Hi @phausler

Happy to see we are beginning to introduce this kind of operator for a future version :-).

I have a few questions regarding this draft though:

  • it looks like an equivalent of Combine's share operator (multicast + auto-start), meaning that consumers who attach later will miss the earlier values; is that the intent?
  • shouldn't we think about a family of "sharing" operators with some options (like replaying values)? Some could use buffers, and some could rely on AsyncChannel to force every consumer to consume an element before the next one is requested.
  • From what I understand, the base is iterated in a dedicated task regardless of the demand from the consumers. Am I right? Isn't that odd?

Thanks.

@phausler
Member Author

So yes, it is the intent that prior values are NOT buffered for later consumption when new downstreams are attached (otherwise the best-case storage would be O(N^2)... which imho is unexpected; if you need that behavior, you can perhaps compose a buffer on top).

This is by no means the only algorithm in this category: I'm sure there are composable variants or other distinct algorithms in the same "multicast" family. This is just the one I see most commonly requested/used.

I think (with some extra complexity) I can modify this to respond to per-element demand to pump the iterator along; that is likely tied to the consolidation of the buffers I mentioned in the PR description. This is still a WIP, so my next steps are to implement that incremental demand and the shared buffers (and to measure the impact on complexity and storage).
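
As a rough sketch of that composition idea: `broadcast()` below is the draft API from this PR, and `buffer(policy:)` is the package's buffering algorithm as it exists today (its spelling has changed over time). Whether this particular pairing gives a consumer the semantics it wants depends on where the buffer sits and on the final design.

```swift
import AsyncAlgorithms

// Hypothetical composition: broadcast the source, then give one consumer an
// unbounded buffer so its consumption rate is decoupled from the shared
// iteration. Earlier elements are still not replayed to late joiners.
let source = (0..<10).async
let shared = source.broadcast()
let buffered = shared.buffer(policy: .unbounded)

for await element in buffered {
  print(element)
}
```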

@FranzBusch
Member

Nice start! I agree that we should get rid of the Task in here, and it ought to be possible. This is definitely one of the more interesting algorithms implementation-wise :D

Comment on lines +135 to +146
static func task(_ state: ManagedCriticalState<State>, base: Base) -> Task<Void, Never> {
  Task {
    do {
      // The single shared iteration of the upstream sequence: each element
      // is emitted into the shared state for all attached downstream sides.
      for try await element in base {
        State.emit(state, result: .success(element))
      }
      // A nil element signals non-throwing termination to every side.
      State.emit(state, result: .success(nil))
    } catch {
      // A thrown error is likewise shared with every downstream side.
      State.emit(state, result: .failure(error))
    }
  }
}
Member

One thing I just noticed after the discussion in the forums around deferred: this is actually creating multiple upstream iterators, if I understand the code correctly. IMO we really need to avoid this; otherwise this algorithm is not capable of transforming a unicast AsyncSequence into a broadcast one. @phausler please correct me if I am misunderstanding the code here.

Member Author

No, this creates one singular task to iterate, so only one upstream iterator is made.

@jdberry commented Dec 21, 2024

I'm clearly being dense, because it also looks to me like this would create a new Task at least each time broadcast is called, and for each Task there would be a separate iterator of the base sequence.

Ah, so maybe the intent is that broadcast is only called once, and the resulting broadcast sequence is then copied to duplicate it? That's a bit awkward.

@phausler
Member Author

@tcldr this might be of interest; it is a slightly different approach.

@tcldr

tcldr commented Nov 11, 2022

@phausler Yes, this served as the inspiration for the work on shared! I originally intended to see if I could adapt this by adding history and a disposing iterator and then fell down the rabbit hole. :) (I borrowed your original test cases, too.)

The thing that began to interest me is how this would be used with something like AsyncChannel. Because this algorithm uses internal buffers, the utility of AsyncChannel is lost. I realised that if back pressure is to be transmitted from the site of consumption to the site of production, it needs to be maintained throughout the pipeline.

The other thing I realised is that while a back-pressure-supporting algorithm can be made to work in a 'depressurised' way (by chaining a buffer), the inverse isn't possible.

So I thought that if I create a back-pressure-supporting multicast algorithm, it could be used as a primitive for both use cases: those where back pressure is a requirement as well as those where it isn't (by chaining buffers).
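
As a sketch of that composition (the names are hypothetical: `shared()` stands in for a back-pressure-supporting multicast, and `buffer(policy:)` for the package's buffering algorithm):

```swift
import AsyncAlgorithms

let upstream = (0..<100).async

// Back pressure preserved end to end: the upstream is only advanced when
// every consumer is ready for the next element.
let pressured = upstream.shared()

// 'Depressurised': chaining a buffer absorbs elements so that a slow
// consumer no longer stalls the upstream. The inverse, recovering back
// pressure from an algorithm built on internal buffers, is not possible.
let relaxed = pressured.buffer(policy: .bufferingLatest(16))
```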

Comment on lines +206 to +208
// Cancel the base iteration once this side holds the last reference to it.
if iteration.isKnownUniquelyReferenced() {
  Iteration.cancel(iteration)
}


I might be missing something here, but if the last two iterators enter deinit at the same time (and both perform the isKnownUniquelyReferenced check before releasing iteration), could that trigger a race condition such that neither ends up calling Iteration.cancel?

@Sajjon

Sajjon commented Dec 29, 2022

I would love to use this soon; do you plan to continue working on this in January, @phausler? :)

Also, what is the relationship between this PR and #227?

@Sajjon

Sajjon commented Jan 4, 2023

How does this PR relate to #242?

@FranzBusch
Member

They are both proposals for how to solve this. After 1.0 of this library has shipped, we need to take another look and discuss which approach we want.

@Sajjon

Sajjon commented May 15, 2024

@FranzBusch almost 18 months later, any update? :) I think multicast is a crucial missing piece in Async Swift.

@dehrom

dehrom commented Jun 24, 2024

@FranzBusch it's a really useful feature 🙏

@BrentMifsud

Any movement on this?

@Kyome22

Kyome22 commented Sep 1, 2024

On macOS, it is common to want to use one data source in multiple windows, so I am eagerly awaiting the implementation of a broadcast algorithm like the one proposed here. It is needed to replace Combine's CurrentValueSubject with Swift Concurrency.
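
For illustration, a sketch of that use case; the names here are hypothetical, with `broadcast()` standing in for the draft API from this PR:

```swift
import AsyncAlgorithms

// A single AsyncStream does not broadcast: each element is delivered to at
// most one consumer. Broadcasting it would let several windows observe the
// same sequence of model updates.
let (updates, continuation) = AsyncStream<Int>.makeStream()
let shared = updates.broadcast()

// Each window runs its own loop over the shared sequence.
// (Timing matters: values yielded before a window attaches are not replayed.)
let windowA = Task { for await value in shared { print("window A:", value) } }
let windowB = Task { for await value in shared { print("window B:", value) } }

continuation.yield(1)
continuation.yield(2)
continuation.finish()

await windowA.value
await windowB.value
```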
