Feature Request: Ordered ResultPool for function call order preservation #115

Closed
nolotz opened this issue May 16, 2023 · 7 comments

@nolotz

nolotz commented May 16, 2023

Hello,

First, I would like to express my gratitude for your hard work on the conc library. The structured concurrency it brings to Go makes complex tasks a lot more manageable.

I am writing to propose a new feature that would further enrich the functionality of the library. The idea is to create a new type of entity that combines the functionalities of ResultPool and Stream. The main goal of this entity would be to provide a concurrent task runner that not only collects task results but also maintains the order of the calls to the functions. In simpler terms, it would give you a slice ordered by the call order of functions.

Currently, the ResultPool is great for running tasks concurrently and collecting the results, but it doesn't necessarily maintain the order of the function calls. On the other hand, the Stream entity allows for processing an ordered stream of tasks in parallel but does not collect the results.

The proposed entity could be very useful in situations where you want to run tasks concurrently, collect their results, and also preserve the order of the tasks.

Please, let me know what you think about this proposal. I am also open to contributing towards the development of this feature if that would be acceptable.

Thank you for your time and consideration.

Solves #110

@camdencheek
Member

Hi @nolotz! Have you taken a look at the iter package? In particular, iter.Map(). Because it knows the size of the set of results in advance, it can pre-allocate a slice and return results in the same order as the input set of tasks. Does that work for your use case?
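
To illustrate the pre-allocation idea described above, here is a minimal sketch that uses only the standard library. It is not conc's implementation: `orderedMap` is a hypothetical helper, and unlike a real pool it starts one goroutine per input with no concurrency limit.

```go
package main

import (
	"fmt"
	"sync"
)

// orderedMap is a hypothetical helper, not part of conc. It runs f on every
// input concurrently and returns the results in input order.
func orderedMap[T, R any](input []T, f func(T) R) []R {
	// The number of results is known up front, so the slice is pre-allocated.
	results := make([]R, len(input))

	var wg sync.WaitGroup
	for i, v := range input {
		wg.Add(1)
		go func(i int, v T) {
			defer wg.Done()
			// Each goroutine writes only to its own index, so no lock is needed.
			results[i] = f(v)
		}(i, v)
	}
	wg.Wait()
	return results
}

func main() {
	squares := orderedMap([]int{1, 2, 3, 4}, func(n int) int { return n * n })
	fmt.Println(squares) // [1 4 9 16]
}
```

Because every task owns a fixed slot, the order of the output depends only on the order of the input, not on which goroutine finishes first.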

@nolotz
Author

nolotz commented Jun 1, 2023

Hi @camdencheek,

Thank you for your prompt response and suggestion. I indeed took a look at the iter package, specifically the iter.Map(). It is a powerful tool that effectively returns results in the same order as the input set of tasks.

However, the use case I am envisioning requires a blend of ResultPool and Stream functionalities. This would offer not just the ordered return of results but also allow concurrent execution of the tasks along with handling potential errors and context cancellations, similar to what ResultPool and Stream provide individually.

The proposal for a ResultStream entity is to cover scenarios where maintaining the order of task execution, concurrent processing, and result collection are all important, offering more flexibility and control in handling complex concurrency requirements.

I hope this provides more clarity on the proposed feature. I look forward to hearing your thoughts on this.

@camdencheek
Member

> but also allow concurrent execution of the tasks along with handling potential errors and context cancellations

Just to make sure we're on the same page, iter.Map() also executes its tasks concurrently, and there is a variant, iter.MapErr(), that handles errors (though errors won't cancel the context yet).

One idea I'm toying around with is making ResultPool always maintain result order. It wouldn't be super expensive (it would just add some complexity), but that would make the abstraction much more useful for a lot of cases. Would that work for your use case? The only thing different between ResultPool compared to the hypothetical ResultStream would be that you don't get your ordered set of (ordered) results until you call Wait(), whereas with ResultStream, you could start operating on the stream results before everything is finished.
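
As a rough sketch of what an order-preserving pool could look like, assuming each call to `Go` reserves a result slot at submission time: `orderedPool` below is a hypothetical type written against the standard library only. It is not conc's ResultPool and says nothing about how the library would implement this. Error handling, panics, and concurrency limits are left out.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// orderedPool is a hypothetical type for illustration, not conc's ResultPool.
type orderedPool[T any] struct {
	wg      sync.WaitGroup
	mu      sync.Mutex
	results []T
}

// Go reserves the next result slot at submission time, then runs f concurrently.
func (p *orderedPool[T]) Go(f func() T) {
	p.mu.Lock()
	idx := len(p.results)
	var zero T
	p.results = append(p.results, zero) // placeholder for this task's result
	p.mu.Unlock()

	p.wg.Add(1)
	go func() {
		defer p.wg.Done()
		res := f() // run the task without holding the lock
		// append may have reallocated the slice, so the write happens under the lock.
		p.mu.Lock()
		p.results[idx] = res
		p.mu.Unlock()
	}()
}

// Wait blocks until all tasks finish and returns results in submission order.
func (p *orderedPool[T]) Wait() []T {
	p.wg.Wait()
	return p.results
}

func main() {
	var p orderedPool[int]
	for i := 0; i < 5; i++ {
		i := i
		p.Go(func() int {
			// Earlier tasks sleep longer, so they finish last.
			time.Sleep(time.Duration(5-i) * time.Millisecond)
			return i
		})
	}
	fmt.Println(p.Wait()) // [0 1 2 3 4]
}
```

As in the comment above, results only become available once `Wait()` returns; a streaming variant would need to hand over each result as soon as all earlier ones are done.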

@sagikazarmark

Just to add an additional use case: an ordered pool would be great in the following example:

	for _, searchPath := range f.Paths {
		searchPath := searchPath

		for _, searchName := range f.Names {
			searchName := searchName

			pool.Go(func() ([]string, error) {
				return search(fsys, searchPath, searchName)
			})
		}
	}

I basically want to search in a file tree and return results in the order of paths and file names used for searching.

iter.Map does not seem to be optimal here due to the two-dimensional nature of the lists.

I considered introducing an intermediate structure to generate a single slice and call iter.Map on that (in fact, I may still do that), but that wouldn't be that elegant IMO.
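
For reference, the intermediate-structure workaround could look roughly like the sketch below. Everything here is an assumption for illustration: `searchTask`, `flatten`, and `searchAll` are hypothetical names, `search` is a stand-in with a simplified signature (no `fsys`, no error), and a plain `sync.WaitGroup` replaces the ordered map so the example stays within the standard library.

```go
package main

import (
	"fmt"
	"sync"
)

// searchTask is a hypothetical intermediate structure pairing one path with one name.
type searchTask struct {
	Path, Name string
}

// flatten expands the two lists into a single slice, ordered by path, then by name.
func flatten(paths, names []string) []searchTask {
	tasks := make([]searchTask, 0, len(paths)*len(names))
	for _, p := range paths {
		for _, n := range names {
			tasks = append(tasks, searchTask{Path: p, Name: n})
		}
	}
	return tasks
}

// search is a stand-in for the real lookup; it just joins path and name.
func search(t searchTask) []string {
	return []string{t.Path + "/" + t.Name}
}

// searchAll runs search for every task concurrently and concatenates the
// per-task results in task order.
func searchAll(paths, names []string) []string {
	tasks := flatten(paths, names)
	perTask := make([][]string, len(tasks))

	var wg sync.WaitGroup
	for i, t := range tasks {
		wg.Add(1)
		go func(i int, t searchTask) {
			defer wg.Done()
			perTask[i] = search(t) // each goroutine writes to its own index
		}(i, t)
	}
	wg.Wait()

	var out []string
	for _, r := range perTask {
		out = append(out, r...)
	}
	return out
}

func main() {
	fmt.Println(searchAll([]string{"/etc", "/home"}, []string{"a.yaml", "b.yaml"}))
	// [/etc/a.yaml /etc/b.yaml /home/a.yaml /home/b.yaml]
}
```

The extra type and the flattening pass are the overhead that an ordered pool would remove, since the nested loops could then submit tasks directly.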

@sagikazarmark

> One idea I'm toying around with is making ResultPool always maintain result order. It wouldn't be super expensive (it would just add some complexity), but that would make the abstraction much more useful for a lot of cases. Would that work for your use case? The only thing different between ResultPool compared to the hypothetical ResultStream would be that you don't get your ordered set of (ordered) results until you call Wait(), whereas with ResultStream, you could start operating on the stream results before everything is finished.

@camdencheek is this something you are still considering?

@camdencheek
Member

> is this something you are still considering?

In fact, it is. Take a look, but I expect this to be part of the 1.0 release!

@camdencheek
Member

@nolotz, closing because I think #126 addresses your request, but feel free to leave feedback on that PR.
