Skip to content
This repository has been archived by the owner on Apr 14, 2021. It is now read-only.

with_clean_env / with_original_env is not threadsafe? #4232

Closed
jjb opened this issue Jan 24, 2016 · 11 comments
Closed

with_clean_env / with_original_env is not threadsafe? #4232

jjb opened this issue Jan 24, 2016 · 11 comments

Comments

@jjb
Copy link
Contributor

jjb commented Jan 24, 2016

I have a script that fires off many thread which each run a shell commands. Here is the code which invokes the shell commands:

Bundler.with_clean_env { `#{command}` }

On one of my runs I got this error:

RuntimeError: can't add a new key into hash during iteration
                        []= at org/jruby/RubyHash.java:1007
                    replace at org/jruby/RubyHash.java:1744
          with_original_env at /Users/john/.rbenv/versions/jruby-9.0.4.0/lib/ruby/gems/shared/gems/bundler-1.11.2/lib/bundler.rb:198
             with_clean_env at /Users/john/.rbenv/versions/jruby-9.0.4.0/lib/ruby/gems/shared/gems/bundler-1.11.2/lib/bundler.rb:205

The suspicious bundler code is here. Line 198 refers to ENV.replace(ORIGINAL_ENV):

    def with_original_env
      bundled_env = ENV.to_hash
      ENV.replace(ORIGINAL_ENV)
      yield
    ensure
      ENV.replace(bundled_env.to_hash)
    end

    def with_clean_env
      with_original_env do
        ENV["MANPATH"] = ENV["BUNDLE_ORIG_MANPATH"]
        ENV.delete_if {|k, _| k[0, 7] == "BUNDLE_" }

        if ENV.key?("RUBYOPT")
          ENV["RUBYOPT"] = ENV["RUBYOPT"].sub "-rbundler/setup", ""
        end

        if ENV.key?("RUBYLIB")
          rubylib = ENV["RUBYLIB"].split(File::PATH_SEPARATOR)
          rubylib.delete(File.expand_path("..", __FILE__))
          ENV["RUBYLIB"] = rubylib.join(File::PATH_SEPARATOR)
        end

        yield
      end
    end

Subsequent runs of my script never caused the problem again.

I tried writing a script to stress test this race condition and reproduce the problem but I wasn't able to.

I'm not entirely sure where in that code (or elsewhere) the problem is originating. Is it simply that any two threads in a runtime can't iterate over and add a new key to a hash at the same time? Or does the violating key addition have to happen within the actual iteration block? If the former then I think it's simply a matter of a low-probability race condition. If the latter, then I have no idea where this could be occurring.

@indirect
Copy link
Member

It's impossible for those methods to be threadsafe. They modify global variables. Adding a mutex wouldn't help because that simply ensures that the threads have to take turns modifying global state. You'll need to handle the thread safety yourself if you call those methods.

@indirect
Copy link
Member

Oops, after reading your ticket again, I think a mutex would prevent this particular exception. My bad, sorry.

However, the underlying problem would still be there, even though the exception would not be. If you're using these methods in threads, you need to make sure they have exclusive execution because they modify global state.

@segiddins
Copy link
Member

@indirect since the methods take blocks, we could add our own mutex inside them, a BUNDLER_ENV_MUTEX of sorts

@jjb
Copy link
Contributor Author

jjb commented Jan 25, 2016

the underlying problem would still be there, even though the exception would not be. If you're using these methods in threads, you need to make sure they have exclusive execution because they modify global state

ahh, gotcha. I just rewrote my script so that there is only one invocation of Bundler.with_clean_env in which everything is run, and that works fine.

@indirect
Copy link
Member

@segiddins think we should? the mutex will be held the entire time the subcommand is running, which effectively undoes threading....

@indirect
Copy link
Member

I guess we could release the mutex by forking, setting the env back, and then releasing the lock? That sounds... much more complicated than what we have now. 😝

@njam
Copy link
Contributor

njam commented Feb 15, 2016

Maybe there should be a Bundler.clean_env method that just returns the clean/original environment?

Then other ruby applications could retrieve this and pass it on for when launching new processes (e.g. with Process.exec(env, command)), and we wouldn't change the state of any environment variables in the parent process.

(there is Bundler.ORIGINAL_ENV - but is it public API? And what is the difference between "clean_env" and "original_env"?)

@segiddins
Copy link
Member

I think a Bundler.clean_env method would be fine?

@njam
Copy link
Contributor

njam commented Feb 15, 2016

Agree!
And what is the difference between "clean_env" and "original_env"? Shouldn't there be just 1 mechanism that resets the environment to how it was before bundler modified it? If the previous environment contained bundler-related variables then I think there's no reliable way to revert those back.
(sorry for the off topic)

@indirect
Copy link
Member

At one time, the difference was that one went up one layer of env, while the other removed all Bundler values. I'm not sure if that's still true today. A Bundler.clean_env method seems great to me.

homu added a commit that referenced this issue Feb 23, 2016
Introduce "Bundler.clean_env"

Via #4232
Which should return the environment as it was *before* bundler modified it.

Should this use "original_env" or "clean_env" or both?
> At one time, the difference was that one went up one layer of env, while the other removed all Bundler values. I'm not sure if that's still true today.

You can assign this ticket to me if you want.
@RochesterinNYC
Copy link
Contributor

Closed by #4303.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants