Implement chunked foreach/non allocating chunked iteration. #37

Open
rafaqz opened this issue Jun 19, 2021 · 2 comments

Comments

@rafaqz
Collaborator

rafaqz commented Jun 19, 2021

As pointed out in #15, broadcast can do pretty much everything. But one issue is that you have to call collect, which allocates an array?

In GeoData.jl I'm using broadcast to iterate over arrays, e.g. to calculate the ranges of non-nodata values for a trim method. Using broadcast means the operation is chunked, but I still need to call collect on the broadcast object, which allocates.

It would be good if there was a way to iterate over the object with chunking that didn't allocate. Maybe that's possible already?
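
To make the request concrete, here is a rough sketch of what a chunked foreach could look like, using only Base. The name `chunked_foreach` and the explicit `chunksize` argument are hypothetical and not part of any existing API, and a plain `Array` stands in for a disk-backed array. It visits every element one chunk at a time and reuses a single chunk-sized buffer, so the full array is never collected. Note that elements are visited in chunk order, not linear order.

```julia
# Hypothetical helper (not an existing API): apply `f` to every element of `A`,
# one chunk at a time, reusing a single chunk-sized buffer.
function chunked_foreach(f, A::AbstractArray{T,N}, chunksize::NTuple{N,Int}) where {T,N}
    buffer = Array{T,N}(undef, chunksize)   # allocated once, reused for every chunk
    # First index of each chunk along every dimension.
    starts = map((ax, c) -> first(ax):c:last(ax), axes(A), chunksize)
    for start in Iterators.product(starts...)
        # Index ranges of this chunk, clipped at the array edge.
        ranges = map((s, c, ax) -> s:min(s + c - 1, last(ax)), start, chunksize, axes(A))
        # Part of the buffer matching the (possibly smaller) edge chunk.
        buf = view(buffer, map(r -> 1:length(r), ranges)...)
        buf .= view(A, ranges...)            # stand-in for a single chunked disk read
        foreach(f, buf)
    end
    return nothing
end

# Usage: accumulate a total without materialising anything larger than a chunk.
A = reshape(collect(1.0:24.0), 4, 6)
total = Ref(0.0)
chunked_foreach(x -> total[] += x, A, (3, 4))
```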

@rafaqz
Collaborator Author

rafaqz commented Jun 21, 2021

Maybe we can achieve this in a more general way by using Iterators.Stateful to iterate over the chunks, caching the chunks along the column as we load them and swapping them out as required. An iterator would have to be in the correct order, unlike broadcast.

On the first iteration we can allocate the required memory for the number of chunks along the column, and copy to it from disk when the iterator gets to the next chunk.

This may also resolve issues with methods like replace, which currently index linearly and hang.
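
A minimal sketch of the caching idea for the 2D case, using only Base. The type `CachedColumns` and its fields are hypothetical, and a plain matrix stands in for the disk array. It iterates in linear (column-major) order while holding only one band of chunks in memory: all rows by `chunkcols` columns. The cache is allocated once and refilled whenever iteration enters the next band. The cache is only valid for sequential iteration from the start, which is why a stateful iterator seems like the right fit.

```julia
# Hypothetical sketch: ordered iteration over a matrix with a reusable cache
# covering every chunk along the first dimension for the current band of columns.
struct CachedColumns{T,M<:AbstractMatrix{T}}
    data::M
    chunkcols::Int      # assumed chunk width along the second dimension
    cache::Matrix{T}    # size(data, 1) x chunkcols, reused for each band
end

CachedColumns(data::AbstractMatrix{T}, chunkcols::Int) where {T} =
    CachedColumns(data, chunkcols, Matrix{T}(undef, size(data, 1), chunkcols))

Base.length(it::CachedColumns) = length(it.data)
Base.eltype(::Type{<:CachedColumns{T}}) where {T} = T

function Base.iterate(it::CachedColumns, i::Int = 1)
    i > length(it.data) && return nothing
    nrows = size(it.data, 1)
    col, row = divrem(i - 1, nrows) .+ 1          # 1-based column and row of element i
    band, offset = divrem(col - 1, it.chunkcols)  # 0-based band and column within it
    if row == 1 && offset == 0
        # Entering a new band: copy it into the cache (stand-in for a disk read).
        cols = (band * it.chunkcols + 1):min((band + 1) * it.chunkcols, size(it.data, 2))
        it.cache[:, 1:length(cols)] .= view(it.data, :, cols)
    end
    return it.cache[row, offset + 1], i + 1
end

# Usage: elements arrive in the same order as linear indexing.
A = reshape(collect(1.0:24.0), 4, 6)
values_in_order = collect(CachedColumns(A, 4))
```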

@meggart
Collaborator

meggart commented Jan 13, 2022

I am still not sure I completely understand the issue here. What exactly is the use case for iteration that cannot be accomplished using reduce or reducedim? You can also do reductions over broadcasted objects, so it should not be necessary to allocate the full array.
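
For illustration, a small sketch of reducing over a lazy broadcast object in plain Base Julia. An in-memory `Array` stands in for a disk array here, and the `nodata` value is made up for the example, so this only shows the pattern rather than how any particular package implements it.

```julia
nodata = -9999.0                      # made-up sentinel for this example
A = [1.0 nodata 3.0; nodata 5.0 6.0]  # in-memory stand-in for a disk array

# Lazy broadcast object: nothing is computed or stored at this point.
valid = Broadcast.instantiate(Broadcast.broadcasted(x -> x != nodata, A))

# Reduce directly over the lazy object instead of collecting it first.
nvalid = sum(valid)

# The same pattern with an explicit reduce, masking nodata before taking the maximum.
masked = Broadcast.instantiate(Broadcast.broadcasted(x -> x == nodata ? -Inf : x, A))
largest = reduce(max, masked)
```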
