Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

using and import only load on current process #19073

Closed
wants to merge 2 commits into from
Closed

Conversation

amitmurthy
Copy link
Contributor

As discussed in JuliaLang/Distributed.jl#20, this PR changes the behavior of using to only load on the current process.

cc: @vtjnash

@vchuravy vchuravy added the parallelism Parallel or distributed computation label Oct 24, 2016
@vchuravy
Copy link
Member

Since this is breaking it should have a line in NEWS.md

@StefanKarpinski StefanKarpinski added the needs news A NEWS entry is required for this change label Oct 24, 2016
@amitmurthy
Copy link
Contributor Author

Updated. Would like a review by @vtjnash before merging.

@@ -221,7 +220,8 @@ Module file paths
-----------------

The global variable LOAD_PATH contains the directories Julia searches for
modules when calling ``require``. It can be extended using ``push!``::
modules when using ``using`` (or ``import``) is invoked.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate using ?

@@ -170,15 +170,19 @@ the following code::
Starting Julia with ``julia -p 2``, you can use this to verify the following:

- :func:`include("DummyModule.jl") <include>` loads the file on just a single process (whichever one executes the statement).
- ``using DummyModule`` causes the module to be loaded on all processes; however, the module is brought into scope only on the one executing the statement.
- ``@everywhere import DummyModule`` causes the module to be loaded on all processes. However, it is not brought into scope on
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the module not brought into scope, or is this referring to the exports of the module (using vs import distinction) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the exports. Good catch.

@amitmurthy
Copy link
Contributor Author

Also, currently @everywhere using Foo results in Foo being loaded N times on each process (N = number of processes). There is code in loading.jl to avoid this, so this behavior seems to be a bug.

@amitmurthy
Copy link
Contributor Author

Bump @vtjnash

Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this will be a nice simplification to _require_from_serialized!

else
content = remotecall_fetch(open, node, read, path_to_try)
restored = _include_from_serialized(path_to_try)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can remove the toplevel_load bool and just always use this codepath. The other path is only relevant if it needed to also broadcast the file to all workers.

else
eval(Main, :(Base.include_from_node1($path)))
end
eval(Main, :(Base.include_from_node1($path)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This risks making the node module state inconsistent since the behavior of __precompile__ is not uniform across nodes, resulting in a race condition that this block of code previously tried to heuristically avoid. To solve, I think we need to remove the myid test from __precompile__, but also to make it possible to ignore a __precompile__ statement (perhaps some global dictionary of sourcepath names for which __precompile__ is temporarily disabled?). Then when we catch a precompilableerror, we can first request the cache compile file from node 1 (the package_locks synchronization might be able to help synchronize these requests). Once we know if that will succeed for fail, we can proceed with either loading the cache file, or include it with __precompile__ disabled for that file. (this should also be changed to be the case on node 1, not just worker nodes, as I think this logic should be more robust – ignoring the cache whenever it is unusable – but just hadn't been implemented yet)

@bjarthur
Copy link
Contributor

would be great to have this in 0.6.

@amitmurthy
Copy link
Contributor Author

@bjarthur #19910 should bring in major changes making this PR moot. Leaving it open so that the issue is not forgotten.

@andreasnoack
Copy link
Member

@vtjnash Is this now obsolete with #22588 merged?

@amitmurthy amitmurthy closed this Jul 17, 2017
@amitmurthy amitmurthy deleted the amitm/using branch July 17, 2017 13:37
@amitmurthy
Copy link
Contributor Author

This was incomplete. Closed it.

@vtjnash
Copy link
Member

vtjnash commented Jul 17, 2017

#22588 didn't address the @everywhere using question (which is now implemented simply by the builtin load hook at https://github.com/JuliaLang/julia/pull/22588/files#diff-a03d5ce5729a81495000698e643fce27R320), but it does obsolete the diff in this PR.

@KristofferC KristofferC removed the needs news A NEWS entry is required for this change label Nov 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parallelism Parallel or distributed computation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants