Feature Request Registry: Refer to master branch if no version is defined #1634

Open
schlichtanders opened this issue Jan 26, 2020 · 33 comments

Comments

@schlichtanders

Dear Julia developers,

because I recently switched my private development environment from Windows to Mac I needed a clean way to make my setup portable. I put everything into private git repositories and then tried to replicate everything on Mac, which turned out to be astonishingly complicated and not straightforward. This feature request is the idea I came up with after trying many different approaches.

Current ways to replicate a private development environment on another machine

The key problem is best described in #1005. Let me quote it here again:

Let's say we have created some local unregistered packages (A, B, C, D) in the following dependency graph:

    A
   / \
  B    C
 /
D

If we want to dev A we need to "manually" resolve the dependency graph by doing:

dev path/to/D
dev path/to/B
dev path/to/C
dev path/to/A

Custom Script

One way is to write a custom script which installs all nested dependencies IN THE CORRECT ORDER.

That is not trivial to do, and it ends up replicating what should be the responsibility of Pkg, registries, or some other Julia package infrastructure. Hence, this should be discouraged.
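For illustration only: the ordering such a script has to compute is a topological sort of the dependency graph. Below is a minimal sketch in Python (not Julia, and not something Pkg provides), assuming a hand-written `deps` map for the hypothetical packages A to D from the graph above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: package -> packages it depends on.
deps = {"A": {"B", "C"}, "B": {"D"}, "C": set(), "D": set()}


def dev_order(deps):
    """Return the packages so that every dependency precedes its dependents."""
    return list(TopologicalSorter(deps).static_order())


order = dev_order(deps)
for name in order:
    # Print the Pkg REPL command one would type, deepest dependency first.
    print(f"dev path/to/{name}")
```

The map itself still has to be collected and kept up to date by hand, which is the maintenance burden described above.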

[Maybe possible in the Future] Using Manifest files

Once #1088 is merged, it will become possible to find nested dependencies by looking into the Manifest file of a package.

This would make it easy to develop locally on one machine; however, when working across several machines it still has a couple of disadvantages:

  • the Manifest file has to be checked into the repository (which is rather discouraged because a Manifest file is a local thing in 99% of cases)
  • locally during development, the Manifest file will still have paths pointing to locally checked-out develop version. I.e. the working Manifest file always differs from the checked in one. And one needs to be very careful not to commit/push a Manifest file with local dependencies as this will break the setup on the other machines
  • each single package needs its own curated Manifest file with correct dependencies, i.e. you have to handle the above complication for as many Manifest files as packages you develop

Hence this should not be the recommended option for cross-machine replication either.

Using a Local Registry

Finally I experimented with private Registries. The initial setup is not perfectly easy yet, but thanks to https://github.com/GunnarFarneback/LocalRegistry.jl I could manage it.

The local registry has the advantage that only one central configuration needs to be maintained (namely the private registry) - way better than the X number of Manifest files. Also the registry is independent of Manifest files, so that no Manifest file needs to be checked-in (as usually recommended). Finally, Registries are perfectly supported, so that add, dev and everything just works.

The only disadvantage is that the current Registry implementation needs to have at least one version for every registered package. However there is yet no stable version for the packages, as they are in private development stage for now. What I did is to fix a version onto a specific git hash, however it would be far more intuitive for this stage to fix the master branch instead.

Proposal to refer to master branch when no Version is given

I think registries are the perfect place to share your development state. Whether it is for yourself or for your company, a private registry will be part of your julia infrastructure if you want to deal with your own private packages. Currently this only works well if your packages already reached at least one stable version, however before that you need to constantly maintain a dummy version.

My proposal is that if no version is given, instead of throwing an error that no version could be found, the package is installed by referring to master branch, just as if you would have installed it manually using ] add mypackage#master.

Benefits:

  • this mimics the behaviour of ] develop which checks-out the repository in master by default
  • with this you no longer need to manually curate the dummy version to point to the current master; Pkg update will check master directly and fetch updates if there are any (this is the standard behaviour for packages installed as a branch reference)
  • no "mis-use" of Manifest.toml needed to share your development state
  • as the Versions.toml file is just empty, it is clear for everyone and any machine that the respective package has no stable version yet. In particular there is no need to point a real-version to a branch, which was discussed and discouraged on discourse.
  • you finally have one go-to place to easily and clearly share all your development, even in the state where versions are not stable yet.

It would be a pleasure for me to implement it myself (it would be my first Julia contribution), in case this feature request gets approved.

@StefanKarpinski
Member

StefanKarpinski commented Jan 27, 2020

This issue seems like it should be on the Pkg.jl repo. (Transferred).

@StefanKarpinski StefanKarpinski transferred this issue from JuliaLang/julia Jan 27, 2020
@schlichtanders
Author

@StefanKarpinski Thanks a lot for transferring the issue.

Are others also interested in realising this feature? Any arguments speaking against it?

To summarize: I think this would simplify package development drastically by having one system (a registry) for all phases of development (no-first-version-yet as well as versioning-already-in-place)

@KristofferC
Member

If you have a package in the registry with no registered version, wouldn't add Package#master just work?

@schlichtanders
Author

@KristofferC this only works if you have a single package without any further dependencies in development.

As soon as your packages depend on one another, all dependencies are looked up in the registry (which is good), but they are then installed using the registered dummy version (the current workaround, since as of today you need to add a version) and not from the master branch.

Having the registry pointing to master if no version is registered fixes exactly this

@00vareladavid
Contributor

00vareladavid commented Feb 11, 2020

@schlichtanders I think #1628 might close this issue. It seems simpler than changing how registry resolution works

@schlichtanders
Author

@00vareladavid it is not clear to me how #1628 provides a solution for what I suggested here.

It speaks about adding a new source config in the current project, which suggests to me that it cannot be maintained by a package developer like me, but that each user has to maintain it on their own.

This feature request for the registry is meant to easily share your current setup without the need of a Manifest file in a clean, simple and intuitive manner.

@00vareladavid
Contributor

The source table will be automatically maintained by Pkg: it will detect unregistered dependencies and add them to the table.

One way is to write a custom script which installs all nested dependencies IN THE CORRECT ORDER.

This will be automatically handled by Pkg as well.

No need for manifest files or registries, since the source table will be in the Project file.

@rapus95

rapus95 commented Feb 12, 2020

Nevertheless the proposal described by @schlichtanders and the solution you propose @00vareladavid are orthogonal. I like the idea proposed by schlichtanders, which simply proposes to use #master as the implicit initial version for any registered package. Thus, whenever someone requests to add a package to the registry but doesn't provide a hash & version, the registry entry will point to HEAD of master. That way you can check in a bleeding-edge/nightly package even before considering anything to be settled. It helps resolving dependencies and spreads a package before considering any API to be settled by any means. If I'd open a repo to discuss some new language feature (like a new interface) I'd like the registry to point to the current development state. Once all collaborators and I found common ground for the API of the first test iteration we'd release 0.1

@fredrikekre
Member

This would make the registry mutable.

@rapus95

rapus95 commented Feb 12, 2020

this rather depends on your perspective, as Pkg#master is resolved by looking up the location on the registry and just going by the master on the found location. That's not fixed either.

@rapus95

rapus95 commented Feb 12, 2020

I might reformulate the proposal:
store no versions in the registry, just the location, if no hash&version are provided. And if Pkg gets an empty version return on checking with registry, default it to #master. I meant it to be a local only implementation. Not to literally point to and update the master HEAD hash in the registry.

Simply said: If there are no tagged versions, the user probably wants to checkout master. That's the best I can assume in any case.

It'd also help to discover packages and register package names before subscribing to SemVer

@schlichtanders
Author

thanks @rapus95 for reformulating. That is exactly what I meant!

@schlichtanders
Author

To repeat, this proposal does not want to change the registry format at all, all stays the same as before.
It only adds the default that if no version is defined in the registry, the master branch of the repository is checked-out when resolved instead of throwing an error (e.g. either directly with Pkg.add or indirectly as a dependency of another package)

To the best of my current understanding, this would only be a minor change of default behaviour

@rapus95

rapus95 commented Feb 14, 2020

If that would still be considered a too-large-for-minor feature, it could be added with a flag.
by for example -masterfallback or -allownoversion though, the latter might be misleading

@fredrikekre
Member

fredrikekre commented Feb 14, 2020

But that already works with add Example#master, and IMO that is much better since it is explicit that you use a fluctuating version.

@schlichtanders
Author

@fredrikekre add Example#master works only if Example has only properly registered dependencies. That is not the case described here; here we are concerned with more complex dependencies.

Hence for the scenario here, if you use a LocalRegistry with dummy versions, you would have to do add Example#master; add ExampleDependency1#master; add ExampleDependency1Dependency1#master; ... add ExampleDependencyNDependencyMDependencyK...#master.
This is tedious and error-prone: if you forget to check out one dependency with #master, it will instead point to the current dummy version, which might rapidly go out of sync unless you maintain the dummy versions too.

If you don't use a LocalRegistry, things are even worse, because you have to make sure you are installing packages in the proper order, from deepest to highest, i.e. add ExampleDependencyNDependencyMDependencyK...#master; add ExampleDependency1Dependency1#master; add ExampleDependency1#master; add Example#master; which is even more error-prone.

@fredrikekre
Member

The trick is to add them all in the same command, then you don't have to worry about ordering or anything.

@schlichtanders
Author

The trick is to add them all in the same command, then you don't have to worry about ordering or anything.

If this really works, then it improves things slightly; however, I still need to construct this possibly giant add expression and keep it updated whenever my package dependencies change somewhere. That remains a maintenance task which I would rather have solved by a system meant for package maintenance (like Pkg). Also, the user experience of installing a package is much more daunting than a simple add registry ... add Example.

To add a positive reason for this registry change: at some point my packages will become stable enough for a version, and then I want registries anyway.

@rapus95

rapus95 commented Feb 14, 2020

But that already works with add Example#master, and IMO that is much better since it is explicit that you use a fluctuating version.

What do you mean by fluctuating? I thought there is no automatic updating in Julia? And if updating has to be done explicitly, we can't consider it fluctuating, as it references the state when it was added. Also, that only works if the package had been registered in the registry. And as I understood it, for that, you need to specify a version. So all this is about is enhancing the usability and explorability of as yet unversioned but already named packages. Allow registering packages without a version and make versionless packages default to master on the client side (if a given flag is provided)

The trick is to add them all in the same command, then you don't have to worry about ordering or anything.

Once you get to indirect dependencies this becomes hell if you need to fetch them manually. As an end result one probably will write a crawler macro which does that construct recursively. As opposed to something along the lines of isempty(returnedversions()) && (version=:master).

@StefanKarpinski
Member

The current way version resolution works is that you can, from the registry alone, decide all of the versions of every package you need to install. What's being proposed would change that model considerably. You would have packages that have no registered versions. That means that you have no idea what they depend on until you've installed them. So maybe we could support that. You install a package with no registered versions and we just install master and look at what's required by the master version. But it's a whole different package resolution/installation strategy that needs to be developed, tested and maintained, not just a minor feature. It would require a total refactoring of how packages get resolved and installed. It also seems like with #1628 it might be possible to just commit a manifest file and have everyone in sync without this change.

@schlichtanders
Author

changing dependencies is a good point to consider.

As far as I understand the Julia registry, dependencies are maintained in the registry via the Deps.toml file. See e.g. my little LocalRegistry, built with LocalRegistry.jl, where the dependencies of one of my packages have been tagged by [0] automatically.

The versions are maintained in the Versions.toml file. And for the same package, I already added two dummy versions with the help of LocalRegistry.jl, namely [0.1.0] and [0.1.1]. As you see, they are both different from the tag [0] in the Deps.toml.

This suggests that dependencies can be maintained separately from the concrete versions, and hence should also be maintainable without any versions at all, and without changing the entire package resolution/installation strategy.

Of course then you still need to maintain the direct dependencies for every project, point taken, but this is still much better to maintain and a much better user experience compared to the current solutions with self-crawling all the nested dependencies for every project.

@StefanKarpinski
Member

The keys are patterns which match concrete versions in the Versions.toml file. This is done consistently for Deps.toml and Compat.toml. The actual meaning of these files is that they represent data about each version in Versions.toml in a compressed format where each stanza heading is a pattern that matches all the versions to which a stanza applies (which has that set of properties) and not to any others. A header in Deps.toml or Compat.toml that doesn't match any versions in Versions.toml is nonsense—it does not mean anything. We could try to make it mean something, but that feels like a slippery slope that's undermining the well-defined and fairly logical way that these files represent information about package versions.
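To make the "compressed format" point concrete, here is a toy sketch in Python. It assumes a simplified matching rule in which a header such as `0` or `0.1` is treated as a version prefix (real registry headers can also be ranges, which this ignores), and the dependency names `DepX` and `DepY` are hypothetical. It only illustrates the idea; it is not how Pkg reads registries.

```python
def matches(pattern: str, version: str) -> bool:
    # Simplified assumption: a header such as "0" or "0.1" is a version prefix.
    # Real registry headers can also be ranges, which this sketch ignores.
    prefix = pattern.split(".")
    return version.split(".")[: len(prefix)] == prefix


def expand(versions, stanzas):
    """Uncompress stanzas into per-version dependency sets.

    Returns (per_version, unmatched), where `unmatched` lists the headers
    that describe no registered version at all.
    """
    per_version = {v: set() for v in versions}
    unmatched = []
    for pattern, deps in stanzas.items():
        hits = [v for v in versions if matches(pattern, v)]
        if not hits:
            unmatched.append(pattern)
        for v in hits:
            per_version[v].update(deps)
    return per_version, unmatched


stanzas = {"0": ["DepX", "DepY"]}  # hypothetical Deps.toml content

with_versions = expand(["0.1.0", "0.1.1"], stanzas)
no_versions = expand([], stanzas)
print(with_versions)
print(no_versions)
```

With two registered versions the `[0]` stanza expands to facts about each of them; with an empty Versions.toml the same stanza expands to nothing and ends up in `unmatched`.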

@rapus95

rapus95 commented Feb 18, 2020

How about designing it as an interactive fallback which holds if resolving doesn't find a version? (Either by version restrictions or due to missing versions at all).

...resolving...
Couldn't resolve to a version.
Would you like to install the latest version (#version*)? [y/n]

where version* resolves to the latest registered version or #master if there's none. After accepting that there's an explicit entry which can be added to the Manifest.toml. The only other thing we'd need then is an argument option which allows us to -y recursively.

@schlichtanders
Author

schlichtanders commented Feb 18, 2020

@StefanKarpinski I see now that it is not a one-line change then, unfortunately. Nevertheless I think the main advantage of this feature suggestion is still worth it.

If I understood it correctly, the current toml-tablenames in Deps.toml fail in that they semantically describe a version. e.g. [0] describes the versions 0.0.0 to 0.99.99 kind of, but not no-version-at-all.

For me it seems best to relax this definition just for [0], that it also refers to the case where no version at all is given. That slightly breaks the logic, but not much and has the benefit that dependencies defined for no-version will probably be used forward when the first version 0.0.1 is out.

@rapus95

rapus95 commented Feb 18, 2020

@schlichtanders the definition for [0] is correct as it allows all versions which have a 0 as their major version. So it'd be a [] which allows all versions that are tagged, if I generalize correctly. By that there indeed is no taggable "I want it even if there are no versions tagged". That's very consistent though, I'd say. That's why I suggested making it an option to explicitly fall back to master in that case. That way we don't change anything about the format and the dependency management, but instead add it as a pure convenience function which needs to be supplied by the caller instead of the callee

@StefanKarpinski
Member

StefanKarpinski commented Feb 18, 2020

You are focusing too much on the syntax. Think about it this way: there is no such thing as the dependencies of a package. Only specific versions of packages have dependencies. The registry records concrete facts about the dependencies and compatibility of specific versions of packages. What you are suggesting is that we change that so that there is some notion of dependencies and compatibility of a package absent any particular version. What happens when you change the dependencies in your local copy of one of those packages? There's no registration process, so now the actuality of that package is out of sync with what the registry claims about the package as a whole. I'm afraid that's a non-starter. It would be better to allow installing the master version of a registered package without versions and just follow whatever dependencies that version has, although, frankly, that also seems like a nightmare that's unlikely to work well.

Again, we should see if #1628 doesn't fix the situation without any of this since it will allow sharing manifests for unregistered packages more smoothly.

@schlichtanders
Author

thanks @StefanKarpinski for your patience with me here. So Deps.toml is not an option for a fallback to master. I understood that.

Seems like the proposal then would be:

  • allow no-version-at-all
  • fallback to master branch
  • derive dependencies dynamically in this case by first cloning the master and then checking the dependencies

Much more work than initially thought, but still might be worth it because of the already mentioned reasons.

@schlichtanders
Author

@rapus95 thank you too for your support. I am not yet sure what is the best way to have a fallback (no-version-at-all, explicit-flag, ...). I prefer the no-version-at-all because it is self-explanatory, but I am open.
The dependency issue however applies to all these different suggestions and hence should be clarified first. Is it reasonable to extend Pkg to allow dynamic dependency lookup for the purpose of this feature request?

@StefanKarpinski
Member

Is it reasonable to extend Pkg to allow dynamic dependency lookup for the purpose of this feature request?

I think you'll have to come up with a plan for how to do this. You'll probably want to model no-registered packages as having a single version, with the only version being the master one. You'll have to fill in the dependencies of that one version and then follow its graph of dependencies, etc. Note that this means that if anything might depend on one of these packages with no registered versions, you'll have to download it to figure out the potential dependency graph no matter whether you end up needing to use it in the end or not.

@schlichtanders
Author

Considering naming, I would describe the standard packages as "registered package with fixed version", and the newly proposed packages which fall back to master as "registered packages without any fixed version".

I would only need to download a package to check for dependencies in the following case:

  • it is directly going to be installed or a dependency of something going to be installed
  • and the package has no registered version yet, hence the fallback to clone master
    In any case, the situation is that I need this package.

Let me sketch what needs to be implemented/done:

  1. we trigger package installation via Pkg.add("mypackage")
  2. we cannot find a registered version, but can grab the repository url from the registry and hence fall back to Pkg.add("mypackagerepositoryurl#master")
  3. the package gets cloned with the important difference that we don't throw an error if a dependency has no registered version yet
  4. instead we recurse to point (1.) such that Pkg.add("mydependency") actually triggers Pkg.add("mydependencyrepositoryurl#master")
  5. Maybe all this has to be done depth first recursively such that if Pkg.add("mypackagerepositoryurl#master") finishes, all sub-dependencies are already installed. However as we know that after the installation everything will be installed, the order of installation may actually be not important at all, we just assume that everything needed will be there
  6. we would need to implement a roll-back mechanism in case an error happens somewhere in the dependencies, like referring to a package that cannot be found in the registry at all. Of course in this case nothing should be installed.

Overall, this looks quite feasible to me. The most complicated part seems to be to have proper roll-back in case of error.
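As a rough illustration of the steps listed above, here is a toy sketch in Python rather than Julia. The `registry` dictionary, its layout, and all package names are hypothetical, and each package carries a single dependency list instead of one per version. The sketch picks the newest registered version or falls back to master, recurses depth first, and gets roll-back by working on a copy. It performs no version resolution or compatibility checking at all.

```python
def add(name, registry, installed):
    """Return a new installed-map; `installed` itself is never modified,
    so an exception anywhere leaves the old state intact (the roll-back)."""
    staged = dict(installed)
    _install(name, registry, staged)
    return staged


def _install(name, registry, staged):
    if name in staged:
        return  # already handled (this also guards against cycles)
    if name not in registry:
        raise LookupError(f"{name} not found in registry")
    entry = registry[name]
    # Step 2: no registered version -> fall back to master.
    staged[name] = max(entry["versions"]) if entry["versions"] else "master"
    # Steps 3-5: recurse into the dependencies, depth first.
    for dep in entry["deps"]:
        _install(dep, registry, staged)


# Hypothetical registry; versions are tuples so that max() picks the newest.
registry = {
    "MyPackage": {"versions": [], "deps": ["MyDependency", "Stable"]},
    "MyDependency": {"versions": [], "deps": []},
    "Stable": {"versions": [(0, 1, 0), (0, 2, 0)], "deps": []},
    "Broken": {"versions": [], "deps": ["Unregistered"]},
}

installed = {}
result = add("MyPackage", registry, installed)
print(result)

try:
    add("Broken", registry, installed)
    failed = False
except LookupError:
    failed = True  # nothing was installed; `installed` is unchanged
```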

@StefanKarpinski
Member

StefanKarpinski commented Feb 19, 2020

  • it is directly going to be installed or a dependency of something going to be installed

The problem is that you need to know what the dependencies of such a package are before you know what's going to be installed. You have to install any of these registered-with-no-version packages if you're even considering installing anything that depends on it. You can't know if you are going to install something until you've built the entire graph of versions that you might potentially install and then done version resolution. For example, you might want to add package A whose most recent version A-1.2.3 depends on your registered-with-no-versions package B, which in turn depends on C which has a conflict with A-1.2.3. The resolution might therefore decide that you have to use an older version of A, say A-1.0.3, which doesn't depend on B at all. So you end up needing to install B to build the dependency graph, but not actually using it.
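The A/B/C example can be played through with a toy brute-force resolver. This is a Python sketch with entirely hypothetical data (the version of C and the way the conflict is encoded as an incompatible pair are assumptions), and it is not how Pkg's resolver works. Note that the dependencies of B's master version have to be in the table before resolution starts, even though B is not part of the final answer.

```python
from itertools import product

# Hypothetical data for the example: candidate versions, newest first.
versions = {"A": ["1.2.3", "1.0.3"], "B": ["master"], "C": ["1.0.0"]}
# Dependencies are facts about specific versions, not about packages.
deps = {
    ("A", "1.2.3"): ["B"],
    ("A", "1.0.3"): [],
    ("B", "master"): ["C"],  # only known after looking at B's master
    ("C", "1.0.0"): [],
}
# Pairs of package versions that cannot be installed together.
conflicts = {frozenset({("A", "1.2.3"), ("C", "1.0.0")})}


def closure(root, choice):
    """Packages actually needed when each package uses the version in `choice`."""
    needed, stack = {}, [root]
    while stack:
        pkg = stack.pop()
        if pkg in needed:
            continue
        needed[pkg] = choice[pkg]
        stack.extend(deps[(pkg, choice[pkg])])
    return needed


def resolve(root):
    """Try every combination, preferring newer versions of earlier packages."""
    names = list(versions)
    for combo in product(*(versions[n] for n in names)):
        choice = dict(zip(names, combo))
        needed = closure(root, choice)
        picked = set(needed.items())
        if not any(pair <= picked for pair in conflicts):
            return needed
    return None


solution = resolve("A")
print(solution)
```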

Let me sketch what needs to be implemented/done:

You need to resolve a compatible set of versions first. You can't just try installing things and see what happens. What happens when you pick some version early on, then install one of your semiregistered packages and then find that it has a conflict with something you've already committed to? The whole thing just fails even though there may be a way to install things? I'm afraid that's unacceptable. What you're proposing is trying to solve an NP-hard optimization problem—version resolution—by hoping for the best. I can assure you that doesn't work.

@schlichtanders
Author

thanks for showing the real complexity of installing packages. Still that might be worth it... but I am not sure any longer

so I agree that it seems better to first wait for #1628 and see whether it is a good enough solution

@StefanKarpinski thanks a lot!

@StefanKarpinski
Member

StefanKarpinski commented Feb 20, 2020

It is possible to do it: while building the graph of potentially installable versions, you have to download the master version of any no-version package when you get to it and then continue building the graph of potentially installable versions from there. Then you do normal version resolution. You might end up installing no-version packages that you didn't actually need, but that's not the worst thing in the world and it's at most one version per no-version package. Still, it's a lot of complexity for something that might be handled better by sharing manifests.
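A rough Python sketch of that graph-building step, under stated assumptions: `registry` maps each package to its registered versions and their dependencies, an empty entry means "registered with no versions", and `fetch_master_deps` is a hypothetical stand-in for downloading master and reading its Project file. None of these names are Pkg API.

```python
def build_graph(root, registry, fetch_master_deps):
    """Collect every (package, version) node that might be installed.

    Returns the graph plus the list of no-version packages whose master
    had to be downloaded just to learn their dependencies.
    """
    graph, downloaded = {}, []
    seen, todo = set(), [root]
    while todo:
        pkg = todo.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        entry = registry[pkg]  # version -> list of dependencies
        if not entry:
            # No registered versions: model master as the single version.
            entry = {"master": fetch_master_deps(pkg)}
            downloaded.append(pkg)
        for version, dep_names in entry.items():
            graph[(pkg, version)] = list(dep_names)
            todo.extend(dep_names)
    return graph, downloaded


# Hypothetical registry: B is registered but has no versions.
registry = {
    "A": {"1.2.3": ["B"], "1.0.3": []},
    "B": {},
    "C": {"1.0.0": []},
}
master_deps = {"B": ["C"]}  # what reading B's master would reveal

graph, downloaded = build_graph("A", registry, lambda pkg: master_deps[pkg])
print(graph)
print(downloaded)
```

Normal version resolution would then run on `graph`. B shows up in `downloaded` whether or not the resolver ends up selecting it.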


6 participants