The dep workflow: default constraints, minimal manifest? #213
Comments
First, this is an awesome writeup. Thanks for doing this! I still have some unspecified dread about it being recommended - or at least, standard operating procedure - that constraints are omitted from that manifest. I can't point to a concrete reason for it, but this is a new (to my mind) way of thinking about how we could operate given the unique properties of the tool we have. And in a complex space like package management, one with significant and permanent consequences, I view new things with some trepidation :) That said, I do want to like this direction, and I think we should consider it heavily over the coming months of experimentation. With dep/gps, we/I made a very deliberate choice to still have the import graph be queen, rather than replicating it into the manifest. That's what makes an approach like this possible at all. It's also the single property that makes this tooling very idiomatically Go, as it arises naturally from the way that import paths allow for extensible, universal deduction of upstream source location. As for the elephant in the room:
Yeah, so, this is a big deal. The problem here is constraint solving, which is an NP-hard search (as Russ demonstrated). We have to be cognizant of how the design of the tool, and the recommendations we make about its use, could create an ecosystem with pathological solving cases. I can say some simple things about this:
I've opened sdboyer/gps#165 as one way of trying to get out ahead of this problem a bit - if we can generate random graphs that help identify patterns that gps' solver has problems with, then it could give us a better empirical sense of where the risks may lie. Beyond that, a couple things, mostly nits:
First, a clarification - dep doesn't care what's vendored when it comes to making decisions about versions to select. It treats So, assuming you meant "if
Anything the solver picks on its own is going to be a rev with a corresponding tag or branch (there's one possible exception to this, but not relevant here) - it will record both the tag or branch AND the rev.
In general, this is true. |
Thanks. I'll pretend that I understand half of that, but I think I get the gist, and I appreciate the experimentation in this direction. The reason behind this line of thought for me is the confusion I see when there is a CLI command to add dependencies to the manifest. That suggests a different workflow that is competing with what the import graph brings. Unions of the manifest and import graph, locking and vendoring things that aren't imported yet, I can feel it getting messy. I think dep can and should avoid that in one way or another. This isn't the only way... Making the manifest minimal puts more emphasis on just writing code and adding imports. The type compatibility check would be a pretty awesome way to enhance the solver by making the default behaviour smarter, which may not be possible if everything is persisted to a manifest instead of allowing recalculation. If the manifest file is only machine-modified, through some series of CLI commands, outputting everything to the file is okay because it can be updated with every run, but then why have a separate manifest and lock file? It's a different story if the manifest file is intended to be a primary part of the dep UI with just a few simple CLI commands to enforce it. In the magical world where it's both human & machine editable with comments and order preserved, would it be too magical for the tool to muck with that manifest file's constraints on behalf of the user? Perhaps not. But if so, making the manifest constraints optional seems like the best option. Here's a simple case: One of my projects uses
"golang.org/x/net": {
    "branch": "master"
}
After What does If we can get that magical world of a human/machine editable manifest that doesn't feel surprising in use, that may be best because it provides a visual of all the immediate dependencies without having to hand write it. Otherwise, I quite like the idea of specifying the minimum necessary, and using the CLI to provide visuals of the graph (status). As far as |
Apologies for the wall of text. I'm thinking "out loud" and had another thought. Consider this not-so-uncommon diamond scenario where two packages have the same dependency. My project depends on When I go to upgrade After inspecting the code, I notice that
[overrides]
"sun" = "^v2.0.0"
This isn't here to say that my project depends on sun. It's an override to resolve a downstream conflict. NOTE: I really like the idea of the type compatibility check. Might later versions of dep hit the conflict and determine to use sun v2 without user intervention? What happens when things change? Say What does my I don't know if I can explain it, but somehow the minimal manifest just feels better to me here. More organized, with less clutter. I'd go so far as to suggest that the manifest doesn't list The tools in my past (Bundler, Hex) have this huge list of dependencies, mostly immediate, but some that are hoisted up just to get around conflicts. That happens a lot when early adopting prerelease versions before the rest of the ecosystem catches up. It's a huge jumble, and it's never really clear whether a dependency can be removed. The import graph promises to clean that up. What I don't want here is a union of the manifest and import graph. I want to know when I can drop a dependency. |
I'm sorry, I'm having trouble following all of the exposition. Do you think you could pare down the "walls of text" to the specific workflow[s] you have in mind? |
This has been an exploration of the UX of using dep in one particular direction. The gist of everything above is to have two inputs into the solver. The import graph and a human-managed "manifest". In the cases above, "manifest" is essentially optional, and can be used to override the default common case (which would be designed to be what people usually want). This makes the manifest fairly minimal in use (just the exceptions) and the CLI surface minimal too (just enforce the manifest + import graph in a controlled manner). Alternatives could include:
Happy to collapse my mutterings and reframe them in terms of the project's direction if desired, preferably after I have a better understanding of what the team's direction with the UX actually is. Thanks! |
Just to be clear, this is how it actually works now. The current project's import graph is represented in |
@sdboyer I'm concerned with how it feels to use and what expectations users are having when using the tool. How does the UX of working with the CLI and manifest (if applicable) convey how it actually works? When users see When I run It took some explaining, which can be turned into documentation, that would be good. It would be great if the documentation wasn't necessary. What dep is actually doing, thanks to the import graph, is wonderful stuff. I really ❤️ it. How can the tool's CLI best reflect it? |
I was the squeaky wheel here so I want to explain my reasons for championing having Moving forward we're going to pursue only having dep create an initial manifest during I think once these changes are implemented |
Just wanted to add my thanks to @nathany for pushing in this direction. I think it's a good outcome. |
Echoing @adg here - very glad you pushed through with this, @nathany 😄 Given this general change in direction in favor of one of the broader goals (hand-edited manifests), I'm going to close this issue. Now that we're heading at least in this general direction, this issue is too broad to be helpful for further discussion. So, let's replace it with some more specific follow ups, as needed. I've kicked that off with #233. |
Related issues include #303 and #277. Like @peterbourgon I’ve struggled to fully understand/follow this thread, largely because I’m struggling to pick up/use the language that describes various situations/cases. Some preamble; no points here, just establishing context/whether I've understood things to this point:
So now an attempt to formalise the language (please excuse the rather loose/imprecise/inaccurate set notation) used to describe
Given the above and this comment:
To my mind it’s entirely within our gift whether we want to describe such a scenario as an error or not. We can detect the existence of such cases at any point in time, describe them in terms of these sets; how we choose to handle them is up to us/ the I think the above attempt to formalise the language then helps when defining the required user commands we want. Notice I use specific commits intentionally here because I’m not interested for now in the version resolution step, I’m simply interested in
# initialise a directory as the root of a vendoring process
#
dep init
# acknowledging https://github.com/golang/dep/issues/303 because I agree
# it should be possible to add to the set C without having to have declared the
# dependency in code first
# "control" a package we will import/have imported. This could (/should?) prompt
# us to ask whether we want to specify:
#
# * whether the commits of the transitive dependencies of github.com/pkg/errors
# should also be controlled by dep
# * assuming yes, whether we want to specify the commit
#
dep add github.com/pkg/errors@ff09b135c25aae272398c51a07235b90a75aa4f0
# "control" a main package we will use at some point - notice the -p flag used
# to indicate it’s a program dependency, i.e.
#
dep add -p github.com/myitcv/gopherjs/cmd/reactGen@648bf1950ae20f0ad155e4faabc276252c7f3ff9
# let’s verify that the tree d/vendor/ is identical to the state that results from replaying
# lock.json
#
dep verify
# let’s prune the repositories beneath d/vendor/ that are in the set Q (note that dep verify
# would have given us details of the superfluous repositories)
#
dep prune
A user is interested in various of the sets listed above, functions of those sets and manipulating these sets. The commands the Would such an approach help to pin down the different areas/concerns and make precise what we're trying to do with various commands, the separate areas of concern? It would also help in formalising feedback. |
Yes, this thread definitely introduces some new terms and injects some additional uncertainty into the discussion. I do have formal properties like these in my mind with respect to Some notes, some corrections:
"code I have on disk" is a bit ambiguous - it could refer to, at minimum, either what's in a project's
If we're being precise here, it's better not to think of these as mappings onto "repositories", but a more generalized concept of a "source" that can produce N code trees, each of which is a version of a "project." But yes, the basic idea that what's in the manifest is "an additional level on top" is generally sound; the manifest is not strictly necessary for running a solve. That fact was largely what motivated @nathany to open this issue.
Yep, this is how they're defined in gps. The
Some disambiguation here is needed - there is no valid exit state of
If the lock contains a hash digest of the code tree, then one-sided verification is possible. This is a useful property, and much of the focus of #121.
This is a bit ambiguous, but may be a key point of departure: while such things can happen, none of them are an output state of
Can and should! But it should also be emphasized that
Having a notion of "control" like this would change
Now, there may be some benefit to an
Yes, while these are the terms I think in, I do think it would benefit broader discussion to formalize them. As evidenced by my response here, though, I think we need to be cautious that in defining these terms, we do not get distracted by a sense that we have to give the user access to manipulate all the sets we can define. |
@sdboyer - I very much appreciate the comprehensive reply, thank you!
To that end I've moved my straw man to https://docs.google.com/document/d/1xlo-fKGt5oJq8z8yQSTPShN__obQFmdBzbBS6puUhXg/edit?usp=sharing to help try and move the formalisation forward. Feel free to take a copy if you want to take control of the doc (after all the
The reason behind this choice of phrase is that the compiler or indeed any other Go tool consumes
I agree it's beyond the remit of
Is there a reason to deviate from the term repository (as seen here); is it insufficient/inadequate in some way? I'm not entirely sure what you mean by:
I've added to the document what I understand as the definition of "repository", "commit" and "version". Do you agree with the base definitions?
Agreed, that seems like a sensible decision.
By requiring that V = D for a "good state" you're saying it is an error from
O can be non-empty in the case a dependency from either A or B is satisfied by a location other than Just to be clear (and I've clarified this in the Google Doc), any dependency in O could be satisfied by a GOPATH entry or not be satisfied at all (i.e. constitute a (temporary) compile error) But again, back to the question above, is this an error? From the compiler's perspective it is not. From a user's perspective this may not be what was intended... but again
So to confirm I have understood the current position correctly:
I think this is where I see
In any case, do we need to make this decision on pruning now? Can we not have a v1 that doesn't prune? Then take lessons from there?
Excellent point.
It was intentionally non-specific 👍 But I agree with your point: that if you discount anything other than
Indeed. Just to be clear, the intention behind trying to formalise the language (clarified at the top of the Google Doc) was to make it easier to be precise about certain situations in either discussion, specification, feedback etc. The interface provided to the user via
As I'm sure has been discussed before, it's one thing telling the user when the situation occurs (and explaining how to correct it), it's quite another to automatically remove/prune by default. This is just reiterating my point from above though.
Except that these packages belong to repositories and it's the repository you're opting into. They are the ones referenced in the lock file in terms of their location and a commit.
Unless I've misunderstood things, the case brought up by #303 is a direct result of the decision to have
Agreed (per my comments earlier in this response). No intention to distract here, just trying to help pin down the language/implementation. |
Yes. Introducing the notion of a "project" - a tree of code - has far-reaching effects, one of which is to make the linked documentation too loose. Lack of definition around the differences between what the different VCS types provide with respect to how a "version" is defined and encoded is already a problem. And, while the go tool, and dep, currently only support working with version control-backed sources, that is not guaranteed to be the case long term - e.g. #175. The sooner we move to and define a more abstracted notion of a 'source', and get rigorous in terms of how it relates to import paths, the better.
They're missing a couple things (s/commit/revision/; revisions and branches/versions have a relation; differentiation between semver and non-semver versions) but I don't have the bandwidth to write up precise definitions right now.
Yes.
If you look through the discussion there, you'll see I'm very hesitant about doing any pruning automatically beyond removing nested
Yes, it will. And it'll re-add that dependency when the import graph indicates it should. It's far more costly to invent new forms of state management that allow a user to mark a removal as "temporary" (what does that even mean? how is it different than a package being
Usually yes to the network request (though for other reasons, and that could be eliminated); yes, a local cache is used.
In what way is it the repository you're opting into? The opt-in is to packages contained within a repository. If you don't need packages from a source, you don't need the source.
Let me be more pointed in directing you to a particular comment, rather than just the issue: #36 (comment). It's possible for us to support something like the |
This is a meta-issue after some discussion with @sdboyer on #vendor to better understand the dep tool and to both explain my understanding and present what I think the workflow could be like based on what I currently know. Currently I'm only tackling the simple case of building a new app.
The dep tool is unique compared to its contemporaries because it is aware of the import graph contained in source code. That means the import statements in your source code are the source of truth. Bundler, Hex, NPM, etc. look to a manifest file for a list of dependencies you manually maintain. This impacts what the so-named (#168) "manifest" file is actually for.
So to use dep, you can start by just writing code. At some point you need to dep init to create the manifest and lock file. The lock file is a full transitive list of all imported code with commit SHAs that make reproducible builds possible. The "manifest" need not list all the dependencies and constraints, in fact it could be essentially empty†. Let me explain.
Say you import "foo/bar" and run dep ensure to ensure that everything imported is correctly vendored and tracked.
The dep tool can be smart enough to fetch the latest stable version v1.3.4 of "foo/bar" rather than pulling master (as with go get) and rather than pulling v2.0.0.beta.1. Pre-releases are always opt-in, as the solver gives preference to stable versions.
At this point there is no need for anything to exist in the manifest, but the lock file will contain a specific commit SHA for "foo/bar" which you commit along with your code (and optionally the vendored copy of "foo/bar" for the hermetic view of the world).
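
The "latest stable version" default can be illustrated with a small Go sketch. This is not gps's solver: `parse`, `less` and `latestStable` are hypothetical helpers, and the parsing is deliberately simplified. It treats anything after major.minor.patch as a pre-release marker, so both the "v2.0.0.beta.1" spelling used above and the standard semver form "v2.0.0-beta.1" are skipped unless explicitly opted into.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version is a simplified, hypothetical version record for this sketch.
type version struct {
	raw        string
	nums       [3]int
	prerelease bool
}

// parse accepts tags such as "v1.3.4", "v2.0.0-beta.1" and "v2.0.0.beta.1".
// It returns false for anything that is not a version, e.g. a branch name.
func parse(tag string) (version, bool) {
	v := version{raw: tag}
	s := strings.TrimPrefix(tag, "v")
	// A "-suffix" marks a pre-release in standard semver.
	if i := strings.IndexByte(s, '-'); i >= 0 {
		v.prerelease = true
		s = s[:i]
	}
	parts := strings.Split(s, ".")
	if len(parts) < 3 {
		return v, false
	}
	if len(parts) > 3 {
		v.prerelease = true // extra dotted components, as in "v2.0.0.beta.1"
	}
	for i := 0; i < 3; i++ {
		n, err := strconv.Atoi(parts[i])
		if err != nil || n < 0 {
			return v, false
		}
		v.nums[i] = n
	}
	return v, true
}

// less compares major, then minor, then patch.
func less(a, b version) bool {
	for i := range a.nums {
		if a.nums[i] != b.nums[i] {
			return a.nums[i] < b.nums[i]
		}
	}
	return false
}

// latestStable returns the highest non-pre-release tag, or "" if none exists.
func latestStable(tags []string) string {
	var best version
	found := false
	for _, t := range tags {
		v, ok := parse(t)
		if !ok || v.prerelease {
			continue // branches and pre-releases are opt-in only
		}
		if !found || less(best, v) {
			best, found = v, true
		}
	}
	if !found {
		return ""
	}
	return best.raw
}

func main() {
	tags := []string{"master", "v1.2.0", "v1.3.4", "v2.0.0.beta.1"}
	fmt.Println(latestStable(tags)) // prints v1.3.4
}
```

With the tags from the example above, the sketch selects v1.3.4 and ignores both master and the v2 beta.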
Now let's say you import "baz/qux". Even if "foo/bar" v1.3.5 is available, running dep ensure should vendor "baz/qux" but not touch "foo/bar" at all.
A separate command should exist to dep update foo/bar, which would upgrade to v1.3.5. If no constraints are manually specified, dep update x always gets you the latest stable version, so once v2.0.0 is out, dep update foo/bar would give you that.
IMO, this should be fine for most cases. Upgrading versions is always done in a controlled manner and builds are reproducible thanks to the lock file. Additionally, you can use dep status to see what's going to happen before running dep update x and simply choose not to update to new major versions immediately.
The "manifest" comes into play when the default behaviour isn't what you want.
^v2.0.0.beta.1
These are all exceptions to the general rule of latest stable version. In all these cases, it is helpful to include a comment in the manifest indicating why the constraint is specified.
Next time you run dep ensure, the constraints in the manifest will override the default behaviour (latest stable version) when solving. However, if v2.0.0 is already vendored, you may see an error, and need to run dep update to update to the older specified version.
How does this sound so far?
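
The proposed split between ensure and update, as described above, can be sketched in Go. This models the proposal, not dep's implementation: `Lock`, `ensure`, `update` and the `solve` callback are hypothetical, and the "baz/qux" version number is made up for the example.

```go
package main

import "fmt"

// Lock maps a project root to the version pinned for it. This is simplified;
// a real lock would also record a revision.
type Lock map[string]string

// ensure returns a copy of lock with an entry for every import. Projects that
// are already locked keep their version; only new imports consult the solver,
// represented here by the hypothetical solve callback.
func ensure(imports []string, lock Lock, solve func(string) string) Lock {
	out := Lock{}
	for k, v := range lock {
		out[k] = v
	}
	for _, imp := range imports {
		if _, locked := out[imp]; !locked {
			out[imp] = solve(imp)
		}
	}
	return out
}

// update re-solves a single project and leaves every other entry untouched.
func update(project string, lock Lock, solve func(string) string) Lock {
	out := Lock{}
	for k, v := range lock {
		out[k] = v
	}
	out[project] = solve(project)
	return out
}

func main() {
	// What the solver would choose today (assumed data for illustration).
	available := map[string]string{"foo/bar": "v1.3.5", "baz/qux": "v0.1.0"}
	solve := func(p string) string { return available[p] }

	lock := Lock{"foo/bar": "v1.3.4"}

	// A new import appears: baz/qux is added, foo/bar is not touched.
	lock = ensure([]string{"foo/bar", "baz/qux"}, lock, solve)
	fmt.Println(lock["foo/bar"], lock["baz/qux"]) // v1.3.4 v0.1.0

	// An explicit update moves foo/bar forward.
	lock = update("foo/bar", lock, solve)
	fmt.Println(lock["foo/bar"], lock["baz/qux"]) // v1.3.5 v0.1.0
}
```

Keeping the two operations separate is what makes upgrades deliberate: adding an import never moves a version that is already locked.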
If this makes sense, I'd suggest a few changes that could help direct some open issues:
init and never modified by the tool. Comments and organization of the file are up to the humans maintaining it. The initial file should include a wall of comments as documentation on usage. To support comments and for readability, the format preferably wouldn't be JSON (Move to TOML for manifest and lock #119), but the lock file could remain JSON.
dep ensure foo/bar variant should go away in favour of a separate update command. Neither command should alter the manifest. To add a dependency, write an import statement in your Go code, and run dep ensure to vendor the latest stable version.
dep update foo/bar v1.3.4 could override any constraints specified in the manifest too, vendoring v1.3.4. But assuming this doesn't update the manifest (not worth the trouble), maybe it's not worth supporting constraints on the CLI.
The end result of all this is a manifest designed for humans but that rarely needs editing. For the common case, just write your code, run dep ensure, and commit your work as usual. That's as simple as it gets.
† Yet to confirm that the solver can cope with a large number of unspecified constraints (empty manifest) with a preference for the latest stable version.