Replace use of id() with global counter-based id. #2313

levskaya · 2022-07-21T05:54:18Z

Historically we key'd on id() to record sharing relationships during
lifting and outer module adoption. This was dumb, and after recently
fixing one bad bug arising from id reuse, we should use something
sound instead.

levskaya · 2022-07-21T06:09:58Z

Seeing an odd, seemingly unrelated doctest failure in one case here that confuses me, I'll have to look into it later.
(edit - turned out to be a from an accidental print statement that snuck in from another PR via editor buffer cache.)

flax/core/lift.py

flax/ids.py

levskaya · 2022-07-22T00:21:13Z

ha - I noticed from a failing user's unit test that any construction-created/self-bound unique id scheme will break under default deepcopy (which will copy the supposedly unique id), so I need to add a custom __deepcopy__ method to fix this.

Historically we key'd on id() to record sharing relationships during lifting and outer module adoption. This was dumb, and after recently fixing one bad bug arising from id reuse, we should use something sound instead.

marcvanzee · 2022-07-22T09:43:47Z

ha - I noticed from a failing user's unit test that any construction-created/self-bound unique id scheme will break under default deepcopy (which will copy the supposedly unique id), so I need to add a custom __deepcopy__ method to fix this.

Is this actually what we want? Supposing a deepcopy will return an identical copy of the object, so maybe the id should be the same as well?

levskaya · 2022-07-22T19:39:12Z

Is this actually what we want? Supposing a deepcopy will return an identical copy of the object, so maybe the id should be the same as well?

Probably not since it would break the previous behavior, we really want the semantics to be "share by reference" which meant "key on object instance identity" following what we intended w. the old id()-based sharing. In any case there are internal code/tests that assume the previous behavior: deepcopy should return a new non-shared instance. (some people seem to use it as a hack to clone layer definitions)

jheek · 2022-07-25T14:28:10Z

deepcopy should return a new non-shared instance.

So is id() the right approach after all? Or perhaps we should error out on deep copies?

levskaya · 2022-07-25T18:20:42Z

It's pretty easy to maintain the per-instance behavior (non shared ids) by defining __copy__, __deep_copy__ on an wrapped, hashable id type. I'm hesitant to break user code immediately, though we could choose to have Module have error-raising copy, deepcopy implementations to prevent its use and encourage use of clone instead in the few existing use sites.

I would just rather never have to worry about id() lifetime issues again, it's a bit silly to introduce lifetime concerns into python of all languages!

levskaya · 2022-07-25T21:23:54Z

Oops, this was accidentally merged early because of a miscommunication, we can roll it back if this change is ultimately unwanted (it should have no user-visible effects either way).

marcvanzee · 2022-08-01T14:18:04Z

I have no problem with this change, your explanation makes sense to me so I am fine with the change as it is.

For reference: this is a strict improvement over the id() approach which was flawed since it could result in multiple objects having the same id. This is because id() just returns the address of the C pointer to the object instance, which can be reused after garbage collection. This can lead to bugs.

levskaya requested a review from jheek July 21, 2022 05:55

marcvanzee reviewed Jul 21, 2022

View reviewed changes

flax/core/lift.py Outdated Show resolved Hide resolved

flax/ids.py Outdated Show resolved Hide resolved

levskaya force-pushed the idfix branch 3 times, most recently from 60dae36 to 831ee54 Compare July 21, 2022 20:43

levskaya added the pull ready label Jul 21, 2022

Replace use of id() with global counter-based id.

963fa4a

Historically we key'd on id() to record sharing relationships during lifting and outer module adoption. This was dumb, and after recently fixing one bad bug arising from id reuse, we should use something sound instead.

levskaya force-pushed the idfix branch from 831ee54 to 963fa4a Compare July 22, 2022 04:58

copybara-service bot merged commit db4a74b into google:main Jul 25, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replace use of id() with global counter-based id. #2313

Replace use of id() with global counter-based id. #2313

levskaya commented Jul 21, 2022

levskaya commented Jul 21, 2022 •

edited

Loading

levskaya commented Jul 22, 2022 •

edited

Loading

marcvanzee commented Jul 22, 2022

levskaya commented Jul 22, 2022

jheek commented Jul 25, 2022

levskaya commented Jul 25, 2022 •

edited

Loading

levskaya commented Jul 25, 2022

marcvanzee commented Aug 1, 2022

Replace use of id() with global counter-based id. #2313

Replace use of id() with global counter-based id. #2313

Conversation

levskaya commented Jul 21, 2022

levskaya commented Jul 21, 2022 • edited Loading

levskaya commented Jul 22, 2022 • edited Loading

marcvanzee commented Jul 22, 2022

levskaya commented Jul 22, 2022

jheek commented Jul 25, 2022

levskaya commented Jul 25, 2022 • edited Loading

levskaya commented Jul 25, 2022

marcvanzee commented Aug 1, 2022

levskaya commented Jul 21, 2022 •

edited

Loading

levskaya commented Jul 22, 2022 •

edited

Loading

levskaya commented Jul 25, 2022 •

edited

Loading