Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix randomness for threading #7925

Open
wants to merge 2 commits into
base: dev
Choose a base branch
from

Conversation

marcus-wirtz-snkeos
Copy link

Description

Fixes #7922 by updating the random state of the Randomizable transform BEFORE copying the transforms. In the current implementation self.randomizable() is only called within the __call__() function and thus only updated inside the copy.

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • Quick tests passed locally by running ./runtests.sh --quick --unittests --disttests.
  • In-line docstrings updated.
  • Documentation updated, tested make html command in the docs/ folder.

@YanxuanLiu
Copy link
Collaborator

/build

if isinstance(_transform, ThreadUnsafe):
if isinstance(_transform, Randomizable):
# update the random state before deepcopy, otherwise there is no randomness
_transform.randomize(data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can definitely update the random state here, but I guess the issue here is that if the transform is thread unsafe, we can't guarantee that the same transform will be performed on all keys, which may cause problems.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of my understanding, the state is frozen for a single thread after the subsequent deepcopy of the Transform. Since all keys are processed by this copied Transform, a consistent state is guaranteed.

Copy link
Author

@marcus-wirtz-snkeos marcus-wirtz-snkeos Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I realized that .randomize() is not necessarily updating the random state self.R (cf. monai.transforms.transform.RandomizableTranform)
image

Therefore the correct way here would be to call the _transform.set_random_state() which is implemented in the Randomizable base class und updates self.R

def set_random_state(self, seed: int | None = None, state: np.random.RandomState | None = None) -> Randomizable:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KumoLiu are there more transforms which inherit directly from ThreadUnsafe? I can only find Randomizable in the monai codebase, which would be covered here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, only Randomizable but all random transform inherit from RandomizableTransform. I'm not sure whether this change can also works well with invert. May also need to check that.

@ericspod
Copy link
Member

I'd like @atbenmurray to have a chance to review this before approving please.

@atbenmurray
Copy link
Contributor

Hi folks. Thanks @marcus-wirtz-snkeos for taking the time to raise the issue and PR. I need to take a careful look at this fix. From a design standpoint, we are very much focused on an "as if the pytorch team wrote it" design philosophy and I need to destruct test the change from this standpoint.

Signed-off-by: marcus.wirtz <marcus.wirtz@snkeos.com>
@johnzielke
Copy link
Contributor

Thanks everyone for the amazing work on Monai.
Seeing this, and having looked at parts of Monai random generation before, this is my humble opinion on this topic:
While the docs do mention issues with randomness and threading, I would not expect them to have these consequences. If I recall correctly, it used to be that there would be errors when calling transforms from multiple threads, but the deep copying every iteration was introduced in March of last year if I read the git history correctly.

In my opinion this should be forbidden by default and throw an error that needs to be disabled with a flag to prevent users from accidentally stumbling on this. The problem with the proposed solution is that there would be no reproducibility since a new randomstate is used every time. That is fine in my opinion if users have to use a flag to manually enable this behavior and will turn off threading when they need reproducibility.

But in the future, the whole random generation of Monai needs a refactor that solves the problem of multi-threading and randomness (see #7582 ) .
This could be completed together with the move to a new random generator api (see this PR that I tried my hand at, but realized that without discussion with the core team would need to make too many breaking changes).

@atbenmurray
Copy link
Contributor

Thanks everyone for the amazing work on Monai. Seeing this, and having looked at parts of Monai random generation before, this is my humble opinion on this topic: While the docs do mention issues with randomness and threading, I would not expect them to have these consequences. If I recall correctly, it used to be that there would be errors when calling transforms from multiple threads, but the deep copying every iteration was introduced in March of last year if I read the git history correctly.

In my opinion this should be forbidden by default and throw an error that needs to be disabled with a flag to prevent users from accidentally stumbling on this. The problem with the proposed solution is that there would be no reproducibility since a new randomstate is used every time. That is fine in my opinion if users have to use a flag to manually enable this behavior and will turn off threading when they need reproducibility.

But in the future, the whole random generation of Monai needs a refactor that solves the problem of multi-threading and randomness (see #7582 ) . This could be completed together with the move to a new random generator api (see this PR that I tried my hand at, but realized that without discussion with the core team would need to make too many breaking changes).

Thanks for bringing this up @johnzielke. I'll take a look at these items also.

@marcus-wirtz-snkeos
Copy link
Author

@johnzielke thanks for the feedback, fully agreeing. This fix can only be a temporary one, since the earlier introduced deepcopy() is problematic per se.

I verified with local batch generation that there is no randomness for the threading=True code as it is right now and also that there is randomness (though not deterministic) with my proposed fix.

Originally I tried to use _transform.randomize(data) rather than _transform.set_random_state(), which would only iterate the random state and therefore maintain reproducibility. I experienced some issues though with certain transforms (maybe not implementing self.randomize() correctly). I'll have a look on that again and keep you posted!

@marcus-wirtz-snkeos
Copy link
Author

marcus-wirtz-snkeos commented Jul 26, 2024

Should work now, the issue was in some of my custom Transforms indeed not implementing .randomize() correctly. @atbenmurray can you run the destruction checks to bring this as a temporary workaround?

@lukas-folle-snkeos
Copy link

@atbenmurray @ericspod what is the state of this PR?

@atbenmurray
Copy link
Contributor

atbenmurray commented Oct 4, 2024

@atbenmurray @ericspod what is the state of this PR?

@lukas-folle-snkeos, I'm refamiliarizing myself with it. Ideally, we'd like to do more to improve the randomness for threading, but if this change isn't breaking any scenarios, then we can go ahead with it and think about that subsequently.

@johnzielke
Copy link
Contributor

I don't think anyone relies on the current non-randomization behavior. I think there is an issue with the proposed approach, which I think are both part of randomize() not being the "correct" function in this case

  1. Calling randomize is usually done inside the "call" function already, so this would call it twice.
  2. Since the .randomize() is called on the transform shared across all threads, this approach might lead to race conditions with multiple threads calling it.
    I see two solutions for this:
    A. Use set_random_state() instead of randomize(). E.g.
if threading and isinstance(_transform, ThreadUnsafe):
            _shared_transform = _transform
            _transform = deepcopy(_transform)
            if isinstance(_transform, Randomizable):
                seed = _shared_transform.R.randint(0, 2**32 - 1) # Max value allowed as seed for numpy.random.RandomState 
                _transform.set_random_state(seed)

This makes sure that each iteration uses a different randomstate. You do not have reproducibility though, since the inidividual threads might not be calling this in a reproducible order
B. Use some kind of thread-local variable to keep individual "persistent" individual instances of the transforms.
While I think this is the most future-proof option, I think it needs a bit more thinking to prevent memory and other possible issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

No randomness for threading=True
7 participants