Dask-awkward arrays are not copyable when Coffea has been imported #878

Closed
jrueb opened this issue Aug 16, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@jrueb
Contributor

jrueb commented Aug 16, 2023

Describe the bug
dask-awkward recently fixed a bug, so dask-awkward arrays can now be copied (dask-contrib/dask-awkward#324). However, once Coffea has been imported, dask-awkward arrays cannot be copied at all. I assume this is because of
https://github.com/CoffeaTeam/coffea/blob/558f853d7fcf7313ff8093a48fb1d02bef2a5980/src/coffea/__init__.py#L51C69-L51C69

I consider the ability to copy arrays essential. For example, when I apply different JEC uncertainties, I create copies of my data, adjust the jets in each copy according to one of the variations, and run my code on each copy.

To Reproduce

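from copy import copy
import awkward as ak
import dask_awkward as dak
import coffea  # the bug only appears once coffea has been imported


# Build a dask-awkward array with a single partition, then try to copy it.
a = dak.from_awkward(ak.Array([1]), 1)
a_copy = copy(a)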

a_copy will be None, but should be a dask-awkward array (a copy of a).

Coffea version 2023.7.0rc0.

@jrueb jrueb added the bug Something isn't working label Aug 16, 2023
@lgray
Collaborator

lgray commented Aug 16, 2023

This will be fixed once awkward arrays (and dask-awkward arrays) have transient_attrs, a planned feature (scikit-hep/awkward#1391). Right now they are not copyable because we need to keep a pointer to the original dask-awkward array for various cross-reference functionality (and, if you recall your earlier issue, we cannot simply store a weakref). Copying or pickling the dask-awkward array mixes descriptions of work with actual work, which results in chaos. The best solution for now is to register a constructor to None in copyreg, which is done on import of coffea.
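
For readers unfamiliar with copyreg, here is a minimal sketch of how registering a constructor that returns None makes `copy` return None. It uses only the standard library. `LazyHandle`, `_make_none` and `_reduce_to_none` are hypothetical names for illustration, not the actual coffea or dask-awkward code, and the exact registration in coffea may differ.

```python
import copy
import copyreg


class LazyHandle:
    """Hypothetical stand-in for a lazy array; not the dask-awkward class."""

    def __init__(self, description):
        self.description = description


def _make_none():
    # Constructor that ignores the original object and builds None instead.
    return None


def _reduce_to_none(obj):
    # A reducer returns (callable, args); the copy module calls callable(*args)
    # to rebuild the object, so the "rebuilt" object here is always None.
    return (_make_none, ())


handle = LazyHandle("read jets")
before = copy.copy(handle)  # default behaviour: a new LazyHandle instance

# Register the reducer globally for LazyHandle. This is an assumed analogue
# of the import-time registration described in the comment above.
copyreg.pickle(LazyHandle, _reduce_to_none)
after = copy.copy(handle)  # the copy module now consults the registered reducer

print(type(before).__name__, after)
```

The registration is global to the process, which would explain why merely importing a package that performs it changes what `copy` returns for every instance of the registered class.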

Unless you have a better solution in mind for not pickling dask-awkward arrays, you shouldn't really need to copy dask-awkward arrays at all. The concept of doing so is deeply strange, since dask arrays are handles to future work: a copy is the same as multiple references to the same object. I would suggest finding another way to do what you want to do.
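
One possible reading of "another way" is to derive each variation from the shared handle rather than copying it first. The sketch below uses a hypothetical pure-Python `LazyJets` class, not the dask-awkward API, and the scale factors are placeholders rather than real JEC uncertainties. It assumes every operation on a handle returns a new handle and leaves the original untouched.

```python
class LazyJets:
    """Hypothetical lazy handle: stores a recipe, runs it only on compute()."""

    def __init__(self, recipe):
        self._recipe = recipe  # zero-argument callable producing jet pts

    def scaled(self, factor):
        # Returns a new handle; the original recipe is left untouched.
        return LazyJets(lambda: [pt * factor for pt in self._recipe()])

    def compute(self):
        return self._recipe()


nominal = LazyJets(lambda: [30.0, 40.0])

# Illustrative scale factors only, not real JEC uncertainties.
variations = {
    "nominal": nominal,
    "up": nominal.scaled(1.5),
    "down": nominal.scaled(0.5),
}

results = {name: jets.compute() for name, jets in variations.items()}
print(results)
```

Because `scaled` never mutates `nominal`, all variations can share the same starting handle, and no copy is needed before branching.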

@lgray
Collaborator

lgray commented Dec 6, 2023

Fixed in #949

@lgray lgray closed this as completed Dec 6, 2023