code loading docs #26787

Merged
merged 6 commits into from
Apr 13, 2018
StefanKarpinski
Sponsor Member

No description provided.

)
```

Note that for efficiency reasons, `roots` maps are not actually materialized as dictionaries when loading code and are instead queried through internal APIs. Given this `roots` map, in the code a `App` the statement `import Priv` will cause Julia to look up `roots[:Priv]`, which yields `ba13f791-ae1d-465a-978b-69c3ad90f72b`, the UUID of the `Priv` package that is to be loaded in that context. This UUID identifies which `Priv` package to load and use when the main application evaluates `import Priv`.
Contributor

I fail to make sense of "in the code a App the statement ...". Is "a" intended to be "of"?


Note that for efficiency reasons, `roots` maps are not actually materialized as dictionaries when loading code and are instead queried through internal APIs. Given this `roots` map, in the code a `App` the statement `import Priv` will cause Julia to look up `roots[:Priv]`, which yields `ba13f791-ae1d-465a-978b-69c3ad90f72b`, the UUID of the `Priv` package that is to be loaded in that context. This UUID identifies which `Priv` package to load and use when the main application evaluates `import Priv`.

**The depedency graph** of a project environment is determined by the contents of the manifest file, if present, or if there is no manifest file, `graph` is empty. A manifest file contains a stanza for each direct or indirect dependency of a project, including for each one, its UUID and exact version information and optionally an explicit path to to its source code. Consider the following example manifest file for `App`:
Contributor

"The depedency graph". Misspelling of dependency.


## Conclusion

Federated package management and precise software reproducibility are difficult but wothy goals in a package system. In combination, these goals lead to a more complex package loading mechanism than most dynamic languages have, but it also yields scalability and reproducibility that is more commonly associated with static languages. Fortunately, most Julia users can remain oblivious to the technical details of code loading and simply use the built-in package manager to add a package `X` to the appropriate project and manifest files and then write `import X` to load `X` without a further thought.
Contributor

Misspelling: "wothy goals"

Sponsor Member

@KristofferC KristofferC left a comment

Nicely done Stefan! I like the examples very much.

)
```

Note that for efficiency reasons, `roots` maps are not actually materialized as dictionaries when loading code and are instead queried through internal APIs. Given this `roots` map, in the code a `App` the statement `import Priv` will cause Julia to look up `roots[:Priv]`, which yields `ba13f791-ae1d-465a-978b-69c3ad90f72b`, the UUID of the `Priv` package that is to be loaded in that context. This UUID identifies which `Priv` package to load and use when the main application evaluates `import Priv`.
Sponsor Member

Note that for efficiency reasons ...

Maybe this sentence can be removed? Describing how things are implemented internally feels a bit out of place here?

)
```

Again, for efficicency reasons, Julia doesn't actually materialize this graph during package loading, instead it computes portions of the graph as needed. Given this dependency `graph`, when Julia sees `import Priv` in the `Pub` package—which has UUID `ba13f791-ae1d-465a-978b-69c3ad90f72b`—it looks up:
Sponsor Member

Again, for efficicency reasons...

Same here as above.

Sponsor Member Author

I’m just worried that people are going to go look for these APIs which don’t actually exist anywhere. Maybe they should, but currently they don’t.


In the example manifest file above, to find the path of the first `Priv` package—the one with UUID `ba13f791-ae1d-465a-978b-69c3ad90f72b`—Julia looks for its stanza in the manifest file, sees that it has a `path` entry, looks at `deps/Priv` relative to the `App` project directory—let's suppose the `App` code lives in `/home/me/projects/App`—sees that `/home/me/projects/App/deps/Priv` exists and therefore loads `Priv` from there.

If, on the other hand, Julia was loading the *other* `Priv` package—the one with UUID `2d15fe94-a1f7-436c-a4d8-07a9a496e01c`—it finds its stanza in the manifest, see that it does *not* have a `path` entry, but that it does have a `git-tree-sha1` entry. It then computes the `slug` for this UUID/SHA-1 pair, which is `HDkr` (the exact details of this computation aren't important, but it is consistent and deterministic). This means that the path to this `Priv` package will be `packages/Priv/HDkr/src/Priv.jl` in one of the package depots. Suppose the contents of `DEPOT_PATH` is `["/users/me/.julia", "/usr/local/julia"]`; then Julia will look at the following paths to see if they exist:
Sponsor Member

Just making a reminder here that we should likely increase the number of characters in the slug.
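
For readers following the hunk above, here is a minimal sketch of how the candidate locations could be enumerated. `candidate_paths` is a hypothetical helper written for illustration, not part of Julia's loader, and the slug `HDkr` is taken as given from the quoted text rather than computed. The hunk is cut off before it says which candidate wins, so the final line only assumes that the first existing path is the one used.

```julia
# Hypothetical helper: build one candidate source path per depot.
candidate_paths(depots, name, slug) =
    [joinpath(depot, "packages", name, slug, "src", name * ".jl") for depot in depots]

# Values quoted in the hunk above.
depots = ["/users/me/.julia", "/usr/local/julia"]
paths = candidate_paths(depots, "Priv", "HDkr")
foreach(println, paths)

# Assumption: the first candidate that exists on disk is the one loaded.
# `findfirst` returns `nothing` when no candidate exists.
first_existing = findfirst(isfile, paths)

# The `path` case from the first paragraph is simpler: the entry is
# resolved relative to the project directory.
vendored = joinpath("/home/me/projects/App", "deps/Priv")
```

With a four-character slug the directory names stay short, which is presumably why the reminder above about lengthening the slug matters for collision resistance.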


## Conclusion

Federated package management and precise software reproducibility are difficult but wothy goals in a package system. In combination, these goals lead to a more complex package loading mechanism than most dynamic languages have, but it also yields scalability and reproducibility that is more commonly associated with static languages. Fortunately, most Julia users can remain oblivious to the technical details of code loading and simply use the built-in package manager to add a package `X` to the appropriate project and manifest files and then write `import X` to load `X` without a further thought.
Sponsor Member

wothy -> worthy?


### Environment stacks

The third and final kind of environment is one that combines other environments by overlaying several of them, making the packages in each available in a single composite environment. These composite environments are called *environment stacks*. The Julia `LOAD_PATH` global defines an environment stack—the environment in which the Julia process operates. If you want your Julia process to have access only to the packages in one project or package directory, make it the only entry in `LOAD_PATH`. It is often quite useful, however, to have access to some of your favorite tools—standard libraries, profilers, debuggers, personal utilities, etc.—even if they are not depdenecies of the project you're working on. By pushing an environment containing these tools onto the load path, you immediately have access to them in top-level code without needing to add them to your project.
Sponsor Member

are not depdenecies of

dependencies
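
As a small illustration of the environment stack described in the hunk above, the sketch below appends a tools environment to `LOAD_PATH`. The directory `/home/me/envs/tools` is a made-up placeholder; any project environment or package directory could stand in for it.

```julia
# Hypothetical directory holding an environment with favorite tools.
tools_env = "/home/me/envs/tools"

# Append it to the stack. Entries earlier in LOAD_PATH are consulted first,
# so the project being worked on keeps precedence over the tools.
push!(LOAD_PATH, tools_env)

@show LOAD_PATH
```

Appending rather than prepending is a deliberate choice here: the tools stay reachable from top-level code without shadowing anything the project itself declares.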


Julia has two mechanisms for loading code:

1. **Code inclusion:** e.g. `include("source.jl")`. Inclusion allows you to split a single program across multiple source files. The expression `include("source.jl")` causes the contents of the file `source.jl` to be evaluated inside of the module where the `include` call occurs, much as if the text of `source.jl` were pasted into that file in place of the `include` call. If `include("source.jl")` is called multiple times, `source.jl` is evaluated multiple times. The included path, `source.jl`, is interpreted relative to the file where the `include` call occurs. This makes it simple to relocate a subtree of source files. In the REPL, included paths are interpreted relative to the current working directory, `pwd()`.
Sponsor Member

Given #26788 maybe talking a bit more about what include does might be warranted?

Member

In particular "much as if the text of source.jl were pasted into that file in place of the include call" is only valid if the include occurs at top level.

Sponsor Member Author

I'll let someone else (maybe @vtjnash or @JeffBezanson?) write more about include since that's not really my area of expertise. But sure, more could be said about it, I just don't know what.
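
The point raised above, that the "pasted text" analogy only holds at top level, can be shown with a short sketch. The file name and variable are arbitrary, and the file is written to a temporary directory only so the example is self-contained.

```julia
# Create a throwaway source file that assigns a variable.
dir = mktempdir()
path = joinpath(dir, "source.jl")
write(path, "x = 1\n")

function f()
    x = 0            # local variable inside `f`
    include(path)    # evaluated at the module's global scope, not inside `f`
    return x         # still the local `x`
end

result = f()
@show result
```

If the text of `source.jl` had literally been pasted into `f`, the local `x` would have been reassigned. Instead, `include` defines a global `x` in the enclosing module and leaves the local untouched.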

!!! note
    You only need to read this chapter if you want to understand the technical details of package loading in Julia. If you just want to install and use packages, simply use Julia's built-in package manager to add packages to your environment and write `import X` or `using X` in your code to load packages that you've added.

A *package* is a source tree with a standard layout providing functionality that can be reused by other Julia projects. A package is loaded by `import X` or `using X` statements. These statements also make the module named `X`, which results from loading the package code, available within the module where the import statement occurs. The meaning of `import X` is context-dependent: its meaning and behavior depend on what code it occurs in. What it does depends on the answers to two questions:
Sponsor Member

The meaning of import X is context-dependent

This makes it sound like the meaning of import changes. Maybe something like "What X refers to is context-dependent...".

Each kind of environment defines these three maps differently, as detailed in the following sections.

!!! note
    For clarity of exposition, the examples throughout this chapter include fully materialized data structures for `roots`, `graph` and `paths`. However, these maps are really only abstractions—for efficiency, Julia's package loading code does not actually materialize them. Instead, it quries them through internal APIs and lazily computes only as much of each structure as is necessary to load a given package.
Sponsor Member

Typo quries


Given this `roots` map, in the code of `App` the statement `import Priv` will cause Julia to look up `roots[:Priv]`, which yields `ba13f791-ae1d-465a-978b-69c3ad90f72b`, the UUID of the `Priv` package that is to be loaded in that context. This UUID identifies which `Priv` package to load and use when the main application evaluates `import Priv`.

**The dependency graph** of a project environment is determined by the contents of the manifest file, if present, or if there is no manifest file, `graph` is empty. A manifest file contains a stanza for each direct or indirect dependency of a project, including for each one, its UUID and exact version information and optionally an explicit path to to its source code. Consider the following example manifest file for `App`:
Sponsor Member

duplicated to
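
The `roots` lookup described in the hunk above can be sketched with a materialized dictionary. As the docs themselves note, Julia does not actually build this structure, so this is only an illustration. `resolve_root` is a hypothetical helper, and only the `Priv` entry is included because it is the only UUID the quoted text gives for `roots`.

```julia
using UUIDs

# Materialized stand-in for the `roots` map of the `App` project.
roots = Dict{Symbol,UUID}(
    :Priv => UUID("ba13f791-ae1d-465a-978b-69c3ad90f72b"),
)

# Hypothetical helper: map a top-level `import Name` to a UUID,
# or `nothing` when the name is not a direct dependency.
resolve_root(roots, name::Symbol) = get(roots, name, nothing)

priv_uuid = resolve_root(roots, :Priv)
missing_uuid = resolve_root(roots, :NotADep)

@show priv_uuid missing_uuid
```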


Package directories provide a kind of environment that approximates package loading in Julia 0.6 and earlier, and which resembles package loading in many other dynamic languages. The set of packages available in a package directory corresponds to the set of subdirectories it contains that look like packages: if `X/src/X.jl` is a file in a package directory, then `X` is considered to be a package and `X/src/X.jl` is the file you load to get `X`. Which packages can "see" each other as dependencies depends on whether they contain project files or not and what appears in the `[deps]` sections of those project files.

**The roots map** is determined by the subdirectories of a package directory for which `X/src/X.jl` exists and whether `X/Project.toml` exists and has a top-level `uuid` entry. Specifically `:x => uuid` goes in `roots` for each such `X` where `uuid` is defined as:
Sponsor Member

I think the last X here should be lowercase, to refer to the placeholder in :x => uuid? Was a bit confusing along with package name X.

Sponsor Member Author

The lowercase :x is just a typo — it should be :X.
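
To make the `X/src/X.jl` rule from the hunk above concrete, here is a rough sketch that lists the names a package directory would provide. `packages_in` is a hypothetical helper, not a Julia API. It deliberately stops at names: the hunk is cut off before it defines how each `uuid` is chosen, so that part is left out.

```julia
# Hypothetical helper: names of subdirectories that "look like packages",
# i.e. those for which `X/src/X.jl` is a file.
function packages_in(dir::AbstractString)
    names = String[]
    for entry in readdir(dir)
        if isfile(joinpath(dir, entry, "src", entry * ".jl"))
            push!(names, entry)
        end
    end
    return names
end

# Build a throwaway package directory to try it on.
pkgdir = mktempdir()
mkpath(joinpath(pkgdir, "X", "src"))
touch(joinpath(pkgdir, "X", "src", "X.jl"))
mkpath(joinpath(pkgdir, "NotAPackage"))   # no src/NotAPackage.jl inside

found = packages_in(pkgdir)
@show found
```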

An *environment* determines what `import X` and `using X` mean in various code contexts and what files these statements cause to be loaded. Julia understands three kinds of environments:

1. **A project environment** is a directory with a project file and an optional manifest file. The project file determines what the names and identities of the direct dependencies of a project are. The manifest file, if present, gives a complete dependency graph, including all direct and indirect dependencies, exact versions of each dependency, and sufficient information to locate and load the correct version.
2. **A package directory** is a directory containing the source trees of a set of packages as subdirectories. This kind of environment was the only kind that existed in Julia 0.6 and earlier. If `X` is a subdirectory of a package directory and `X/src/X.jl` exists, then the package `X` is available in the package directory environment and `X/src/X.jl` is the source file by which it is loaded.
Sponsor Member

Would be good to mention that packages also have project files. And, does that mean it makes sense to say that a package is also a project environment?

Sponsor Member Author

Each package with a project file does have an associated environment, yes. I feel like that might just be confusing things here though since that environment doesn't really come into play here.

Contributor

I think it would be good to point out a clarification that you made someday on Slack - that having a manifest file would allow one to replicate the exact state of the software at some future time thus ensuring replicability, which is what Julia's aim is about. (This was the discussion with David Anthoff IIRC.)
Having that cleared up after the bullet points where you introduce the manifest file would make its existence cogent and clear.

Sponsor Member Author

@StefanKarpinski StefanKarpinski Apr 13, 2018

Ok, good idea. The documentation of this is a bit fractured because half of it—the part about how packages get loaded—belongs in the language manual since it's built into the language itself, while the other half—the part about how to use Pkg3 to create and manipulate project and manifest files doesn't since it's an external tool. But a little bit of repetition won't hurt.

@JeffBezanson
Sponsor Member

Excellent!


This example map includes three different kinds of package locations:

1. The private `Priv` package is "[vendored](https://stackoverflow.com/a/35109534/659248)" inside inside of `App` repository.
Contributor

derp on double inside

@miguelraz
Contributor

@StefanKarpinski I remember you also discussed about how there is a crucial abstraction to make when coding up Pkg3: that of code loading vs dependency resolution.
I now understand that argument, and see that this document reflects the design and implementation of code loading. Where should the discussion for dependency resolution go? It seems fitting that at the very end of the discussion you could have a paragraph about how that gets done, because all the tools to build up to it have been presented.

@StefanKarpinski
Sponsor Member Author

StefanKarpinski commented Apr 13, 2018

Where should the discussion for dependency resolution go? It seems fitting that at the very end of the discussion you could have a paragraph about how that gets done, because all the tools to build up to it have been presented.

This documentation hasn't been written yet. The "owner" of the version resolution code is @carlobaldassi and it lives in the Pkg3 stdlib/repo (and an older copy lives in the Pkg stdlib). It should certainly be documented, but I also suspect it should perhaps be re-implemented based on SAT with preferences, which would be more flexible and probably somewhat easier to reason about. I've been looking at this paper as a basis for such an implementation: Solving Satisfiability Problems with Preferences. The algorithm seems quite straightforward and they get good performance by hacking it into minisat. We'd probably want a pure Julia implementation and we don't need anywhere near state-of-the-art performance, so I think this is a pretty doable project (for a future time).

@StefanKarpinski
Sponsor Member Author

Circle CI is failing but it's hard to imagine how it could be related.

@StefanKarpinski StefanKarpinski merged commit b6d81e3 into master Apr 13, 2018
@StefanKarpinski StefanKarpinski deleted the sk/doc-code-loading branch April 13, 2018 22:10
mbauman added a commit that referenced this pull request Apr 19, 2018
* origin/master: (22 commits)
  separate `isbitstype(::Type)` from `isbits` (#26850)
  bugfix for regex matches ending with non-ASCII (#26831)
  [NewOptimizer] track inbounds state as a per-statement flag
  change default LOAD_PATH and DEPOT_PATH (#26804, fix #25709)
  Change url scheme to https (#26835)
  [NewOptimizer] inlining: Refactor todo object
  inference: enable CodeInfo method_for_inference_limit_heuristics support (#26822)
  [NewOptimizer] Fix _apply elision (#26821)
  add test case from issue #26607, cfunction with no args (#26838)
  add `do` in front-end deparser. fixes #17781 (#26840)
  Preserve CallInst metadata in LateLowerGCFrame pass.
  Improve differences from R documentation (#26810)
  reserve syntax that could be used for computed field types (#18466) (#26816)
  Add support for Atomic{Bool} (Fix #26542). (#26597)
  Remove argument restriction on dims2string and inds2string (#26799) (#26817)
  remove some unnecessary `eltype` methods (#26791)
  optimize: ensure merge_value_ssa doesn't drop PiNodes
  inference: improve tmerge for Conditional and Const
  ensure more iterators stay type-stable
  code loading docs (#26787)
  ...
graph = Dict{UUID,Dict{Symbol,UUID}}(
# Priv – the private one:
UUID("ba13f791-ae1d-465a-978b-69c3ad90f72b") => Dict{Symbol,UUID}(
:Zebra => UUID("f7a24cb4-21fc-4002-ac70-f0e3a0dd3f62"),
Member

Am I missing something, or why is not Pub included here with Zebra?

Sponsor Member Author

@StefanKarpinski StefanKarpinski Apr 21, 2018

No, you're right, I just missed it: #26874. Good catch!
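
For context on the exchange above, a lookup in the `graph` map can be sketched as follows, using only the two UUIDs visible in the quoted hunk. The `Pub` entry that #26874 adds is left out because its UUID does not appear in this thread. The variable names are illustrative.

```julia
using UUIDs

# UUIDs copied from the quoted hunk.
priv = UUID("ba13f791-ae1d-465a-978b-69c3ad90f72b")   # Priv, the private one
zebra = UUID("f7a24cb4-21fc-4002-ac70-f0e3a0dd3f62")

# Materialized stand-in for the dependency graph: package UUID => (name => UUID).
graph = Dict{UUID,Dict{Symbol,UUID}}(
    priv => Dict{Symbol,UUID}(:Zebra => zebra),
)

# `import Zebra` evaluated inside the private `Priv` package resolves through
# that package's own entry in the graph.
resolved = graph[priv][:Zebra]

# With the hunk as quoted, `import Pub` inside `Priv` would find no entry,
# which is the omission pointed out in the comment above.
has_pub = haskey(graph[priv], :Pub)

@show resolved has_pub
```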
