Bugfix for deepcopy / pickling discrete spaces #2378
tests fail on 3.10 because |
Finally a case in which having all these different CI runs pays off! The easiest way would be to add a warning that the cell space only works with 3.11 and above. SPEC 0 says we can drop 3.10 now. I say we do that directly after 3.0 is branched off, and require Python 3.11 for Mesa 3.1 and above. |
Using super here was me being lazy. I was surprised it worked (But not for |
Just to clarify something here, since I haven't worked with __getstate__ or __setstate__ yet: this code changes things so that when I pickle a cell, its connections become an empty dictionary, and when I unpickle a CellSpace, the connections are restored. Is this correct? Wouldn't this mean that if I pickle a cell by itself and unpickle it, I lose the connections? Not sure how I feel about this. It's probably still reasonable for the time being, but it could also lead to very unexpected behavior. |
Yes, that is a major drawback of this fix. Cells can now only exist in DiscreteSpaces, and this bugfix results in an implicit coupling between the two. It might be good to document this explicitly. However, I don't see another option for resolving this recursion error. We can't store the connections with the refs to the other cells in the return from |
Awesome. Everything has drawbacks; if this helps now, we can revisit later. Experimental 🚀🚀. |
I still don't fully understand why the recursion error occurs in the first place, and only for large grids. The number of connections per cell stays the same, no? But I don't think it's very important right now. While conceptually and practically it's awesome that cells are independent of the space, I would estimate in 99% of use cases they are used with a space. And the remaining use cases would still require a workflow that involves pickling. And working in large grids. This is pretty much the definition of a rare edge case. YAGNI work. |
I also don't fully understand it. It has something to do with how deepcopy/pickle tracks the number of calls. I have found various discussions online about large data structures with many identical objects giving rise to this problem. In our case, on my machine with a max recursion depth of 3000, the error triggers for a 19 * 19 grid (361 cells) but not for an 18 * 18 grid (324 cells). 19 * 18 (so 342 cells) also fails. The critical limit seems to be 332 cells, which I found through testing with a (1 * At this point, I gave up trying to dig deeper because, as you say, people are unlikely to need cells outside of discrete spaces. |
332 * (1 + 8) = 2988, which with a little overhead seems remarkably close. Almost strange this isn’t the case. |
Exactly. I am really surprised that Moore vs. von Neumann or toroidal does not matter, as far as I can tell. |
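One possible reading of the numbers above, offered as a sketch and not as a confirmed diagnosis: `copy.deepcopy` follows object references depth-first, so when cells hold direct references to their neighbours, the copy walks from cell to cell before any call returns. The recursion depth then grows with the length of that walk, roughly the number of cells, and not with the number of connections per cell. The toy below uses hypothetical `ToyCell` and `build_grid` names (not Mesa's classes) and the interpreter's current recursion limit, so the exact threshold will differ from the 332 cells reported above.

```python
import copy


class ToyCell:
    """Hypothetical stand-in for a grid cell; this is not Mesa's Cell class."""

    def __init__(self, coordinate):
        self.coordinate = coordinate
        self.connections = {}  # offset -> neighbouring ToyCell


def build_grid(n):
    """Build an n x n non-toroidal grid with von Neumann connections."""
    cells = {(i, j): ToyCell((i, j)) for i in range(n) for j in range(n)}
    for (i, j), cell in cells.items():
        for di, dj in ((1, 0), (0, 1), (-1, 0), (0, -1)):
            neighbour = cells.get((i + di, j + dj))
            if neighbour is not None:
                cell.connections[(di, dj)] = neighbour
    return cells


def deepcopy_overflows(n):
    """Return True if deep-copying an n x n grid raises RecursionError."""
    try:
        copy.deepcopy(build_grid(n))
    except RecursionError:
        return True
    return False
```

With the default recursion limit, `deepcopy_overflows(3)` should come back `False` while a sufficiently large grid such as `deepcopy_overflows(40)` should come back `True`, even though every cell has at most four connections in both cases.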
def __getstate__(self):
    """Return state of the Cell with connections set to empty."""
    # fixme, once we shift to 3.11, replace this with super. __getstate__
@quaquel We're now requiring Python 3.11+, so this can be updated.
As highlighted in #2373, deepcopy (and by extension pickle) breaks on discrete spaces. As a fix, in `Cell.__getstate__`, I set "connections" to an empty dict. In `DiscreteSpace.__setstate__`, I rebuild the connections via `DiscreteSpace._connect_cells()`. This latter method already existed in `Grid` and `VoronoiGrid`, but not yet in `Network`. Thus, I moved the method stub to `DiscreteSpace` and implemented it for `Network`.

This also adds unit tests for everything. I first confirmed that the unit tests failed and reproduced the original error.