Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

using and import only load on current process #19073

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,10 @@ This section lists changes that do not have deprecation warnings.
* `broadcast` now handles tuples, and treats any argument that is not a tuple
or an array as a "scalar" ([#16986]).

* `using` and `import` now only loads on the calling process. Use `@everywhere using ModName` or
`@everywhere import ModName` to load a module/package on all processes.


Library improvements
--------------------

Expand Down
60 changes: 14 additions & 46 deletions base/loading.jl
Original file line number Diff line number Diff line change
Expand Up @@ -151,42 +151,19 @@ function _include_from_serialized(path::String)
end

# returns an array of modules loaded, or an Exception that describes why it failed
# and also attempts to load the same file across all nodes (if toplevel_node and myid() == master)
# and it reconnects the Base.Docs.META
function _require_from_serialized(node::Int, mod::Symbol, path_to_try::String, toplevel_load::Bool)
function _require_from_serialized(mod::Symbol, path_to_try::String, toplevel_load::Bool)
local restored = nothing
local content::Vector{UInt8}
if toplevel_load && myid() == 1 && nprocs() > 1
# broadcast top-level import/using from node 1 (only)
if node == myid()
if myid() == 1
if toplevel_load
content = open(read, path_to_try)
restored = _include_from_serialized(content)
else
content = remotecall_fetch(open, node, read, path_to_try)
restored = _include_from_serialized(path_to_try)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can remove the toplevel_load bool and just always use this codepath. The other path is only relevant if it needed to also broadcast the file to all workers.

end
restored = _include_from_serialized(content)
isa(restored, Exception) && return restored
others = filter(x -> x != myid(), procs())
refs = Any[
(p, @spawnat(p,
let m = try
_include_from_serialized(content)
catch ex
isa(ex, Exception) ? ex : ErrorException(string(ex))
end
isa(m, Exception) ? m : nothing
end))
for p in others ]
for (id, ref) in refs
m = fetch(ref)
if m !== nothing
warn("Node state is inconsistent: node $id failed to load cache from $path_to_try. Got:")
warn(m, prefix="WARNING: ")
end
end
elseif node == myid()
restored = _include_from_serialized(path_to_try)
else
content = remotecall_fetch(open, node, read, path_to_try)
content = remotecall_fetch(open, 1, read, path_to_try)
restored = _include_from_serialized(content)
end

Expand All @@ -203,18 +180,18 @@ end
# returns `true` if require found a precompile cache for this mod, but couldn't load it
# returns `false` if the module isn't known to be precompilable
# returns the set of modules restored if the cache load succeeded
function _require_search_from_serialized(node::Int, mod::Symbol, sourcepath::String, toplevel_load::Bool)
if node == myid()
function _require_search_from_serialized(mod::Symbol, sourcepath::String, toplevel_load::Bool)
if 1 == myid()
paths = find_all_in_cache_path(mod)
else
paths = @fetchfrom node find_all_in_cache_path(mod)
paths = @fetchfrom 1 find_all_in_cache_path(mod)
end

for path_to_try in paths::Vector{String}
if stale_cachefile(sourcepath, path_to_try)
continue
end
restored = _require_from_serialized(node, mod, path_to_try, toplevel_load)
restored = _require_from_serialized(mod, path_to_try, toplevel_load)
if isa(restored, Exception)
if isa(restored, ErrorException) && endswith(restored.msg, " uuid did not match cache file.")
# can't use this cache due to a module uuid mismatch,
Expand Down Expand Up @@ -399,7 +376,7 @@ function require(mod::Symbol)
# attempt to load the module file via the precompile cache locations
doneprecompile = false
if JLOptions().use_compilecache != 0
doneprecompile = _require_search_from_serialized(1, mod, path, last)
doneprecompile = _require_search_from_serialized(mod, path, last)
if !isa(doneprecompile, Bool)
return # success
end
Expand All @@ -422,7 +399,7 @@ function require(mod::Symbol)
# spawn off a new incremental pre-compile task from node 1 for recursive `require` calls
# or if the require search declared it was pre-compiled before (and therefore is expected to still be pre-compilable)
cachefile = compilecache(mod)
m = _require_from_serialized(1, mod, cachefile, last)
m = _require_from_serialized(mod, cachefile, last)
if isa(m, Exception)
warn("The call to compilecache failed to create a usable precompiled cache file for module $name. Got:")
warn(m, prefix="WARNING: ")
Expand All @@ -435,23 +412,14 @@ function require(mod::Symbol)
# just load the file normally via include
# for unknown dependencies
try
if last && myid() == 1 && nprocs() > 1
# include on node 1 first to check for PrecompilableErrors
eval(Main, :(Base.include_from_node1($path)))

# broadcast top-level import/using from node 1 (only)
refs = Any[ @spawnat p eval(Main, :(Base.include_from_node1($path))) for p in filter(x -> x != 1, procs()) ]
for r in refs; wait(r); end
else
eval(Main, :(Base.include_from_node1($path)))
end
eval(Main, :(Base.include_from_node1($path)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This risks making the node module state inconsistent since the behavior of __precompile__ is not uniform across nodes, resulting in a race condition that this block of code previously tried to heuristically avoid. To solve, I think we need to remove the myid test from __precompile__, but also to make it possible to ignore a __precompile__ statement (perhaps some global dictionary of sourcepath names for which __precompile__ is temporarily disabled?). Then when we catch a precompilableerror, we can first request the cache compile file from node 1 (the package_locks synchronization might be able to help synchronize these requests). Once we know if that will succeed for fail, we can proceed with either loading the cache file, or include it with __precompile__ disabled for that file. (this should also be changed to be the case on node 1, not just worker nodes, as I think this logic should be more robust – ignoring the cache whenever it is unusable – but just hadn't been implemented yet)

catch ex
if doneprecompile === true || JLOptions().use_compilecache == 0 || !precompilableerror(ex, true)
rethrow() # rethrow non-precompilable=true errors
end
# the file requested `__precompile__`, so try to build a cache file and use that
cachefile = compilecache(mod)
m = _require_from_serialized(1, mod, cachefile, last)
m = _require_from_serialized(mod, cachefile, last)
if isa(m, Exception)
warn(m, prefix="WARNING: ")
# TODO: disable __precompile__(true) error and do normal include instead of error
Expand Down
12 changes: 6 additions & 6 deletions doc/manual/modules.rst
Original file line number Diff line number Diff line change
Expand Up @@ -186,8 +186,7 @@ Relative and absolute module paths

Given the statement ``using Foo``, the system looks for ``Foo``
within ``Main``. If the module does not exist, the system
attempts to ``require("Foo")``, which typically results in loading
code from an installed package.
attempts to load code, typically from an installed package.

However, some modules contain submodules, which means you sometimes
need to access a module that is not directly available in ``Main``.
Expand Down Expand Up @@ -221,7 +220,8 @@ Module file paths
-----------------

The global variable LOAD_PATH contains the directories Julia searches for
modules when calling ``require``. It can be extended using ``push!``::
modules when ``using`` (or ``import``) is invoked.
It can be extended using ``push!``::

push!(LOAD_PATH, "/Path/To/My/Module/")

Expand Down Expand Up @@ -272,8 +272,8 @@ For file dependencies, a change is determined by examining whether the modificat
of each file loaded by ``include`` or added explicity by ``include_dependency`` is unchanged,
or equal to the modification time truncated to the nearest second
(to accommodate systems that can't copy mtime with sub-second accuracy).
It also takes into account whether the path to the file chosen by the search logic in ``require``
matches the path that had created the precompile file.
It also takes into account whether the path to the file chosen by the search logic (during
module/package loading) matches the path that had created the precompile file.

It also takes into account the set of dependencies already loaded into the current process
and won't recompile those modules, even if their files change or disappear,
Expand Down Expand Up @@ -312,7 +312,7 @@ into a full-compilation process.

In particular, if you define a ``function __init__()`` in a module,
then Julia will call ``__init__()`` immediately *after* the module is
loaded (e.g., by ``import``, ``using``, or ``require``) at runtime for
loaded (e.g., by ``import`` or ``using``) at runtime for
the *first* time (i.e., ``__init__`` is only called once, and only
after all statements in the module have been executed). Because it is
called after the module is fully imported, any submodules or other
Expand Down
8 changes: 6 additions & 2 deletions doc/manual/parallel-computing.rst
Original file line number Diff line number Diff line change
Expand Up @@ -170,15 +170,19 @@ the following code::
Starting Julia with ``julia -p 2``, you can use this to verify the following:

- :func:`include("DummyModule.jl") <include>` loads the file on just a single process (whichever one executes the statement).
- ``using DummyModule`` causes the module to be loaded on all processes; however, the module is brought into scope only on the one executing the statement.
- ``@everywhere import DummyModule`` causes the module to be loaded on all processes. Only the module (not its exports) is
brought into scope on all processes.
- As long as ``DummyModule`` is loaded on process 2, commands like ::

using DummyModule # Load and bring module and module exports into scope locally.
@everywhere import DummyModule # Load and bring module (not exports) into scope everywhere.
rr = RemoteChannel(2)
put!(rr, MyType(7))

allow you to store an object of type ``MyType`` on process 2 even if ``DummyModule`` is not in scope on process 2.
- ``@everywhere using DummyModule`` will load and and bring into scope ``DummyModule`` on all processes.

You can force a command to run on all processes using the :obj:`@everywhere` macro.
As seen previously, you can force a command to run on all processes using the :obj:`@everywhere` macro.
For example, :obj:`@everywhere` can also be used to directly define a function on all processes::

julia> @everywhere id = myid()
Expand Down
2 changes: 1 addition & 1 deletion test/compile.jl
Original file line number Diff line number Diff line change
Expand Up @@ -120,7 +120,7 @@ try
# the module doesn't reload from the image:
t = redirected_stderr("WARNING: replacing module Foo4b3a94a1a081a8cb.\nWARNING: Method definition ")
try
@test isa(Base._require_from_serialized(myid(), Foo_module, cachefile, #=broadcast-load=#false), Array{Any,1})
@test isa(Base._require_from_serialized(Foo_module, cachefile, #=broadcast-load=#false), Array{Any,1})
finally
close(STDERR)
redirect_stderr(olderr)
Expand Down