Uncurated Hackage Layer #6

Open: wants to merge 7 commits into base: master

gbaz
Collaborator

@gbaz gbaz commented Jan 26, 2018

There is a tension between two purposes of Hackage -- first as a central repository of Haskell code, and second as a curated store that has artifacts that are intended to be correctly built and depended upon in a self-contained fashion. The aim of this proposal is to separate these two purposes, by allowing authors to distinguish if they wish to opt-out of following the PVP and the attendant curation process that helps to maintain correct dependency information.

Rendered proposal: https://github.com/gbaz/ecosystem-proposals/blob/gbaz-uncurated/proposals/0000-uncurated-layer.rst

@angerman

Thank you @gbaz for writing this up (so quickly)! I am, however, not supportive of it in its current form. I believe it will ultimately lead to fewer packages on hackage and many more on the uncurated hackage.

I think there's also a problem understanding what hackage is. hackage has always been a curated package repository, where trustees curate packages.

Curated packages cannot depend on uncurated packages, and the hackage server will detect this as an error at upload time.

This seems like a good way to end up with lots of packages that end up in the uncurated package index, even though the maintainer might be sympathetic to the PVP but can't be bothered to start talking to everyone in his dependency chain to please follow the PVP...

Uncurated packages may be "adopted" into the curated ecosystem by trustees. Metadata revisions necessarily remove the x-uncurated property from the revised cabal metadata.

... or try to pressure trustees to do the work for him.

Curated package uploads will be checked on upload to ensure they don't have dependencies on uncurated packages. Further, the curated index will only provide information on curated packages.

So we are eventually going to force PVP, and exclude non PVP from visibility? Would cabal-install still see both indices or only the curated one? If it's the latter, it would be a loss as opposed to what we have right now.

As such I believe this will lead to more fragmentation and a split, rather than what is intended.


As I've mentioned on the SLURP proposal, I'm in favor of providing raw-hackage, which you seem to want as well (my raw-hackage = your hackage/uncurated -- the immutable package store hackage already has, with an empty append-only revisions index). I'd also prefer it to be at raw.hackage.haskell.org, such that all I would need to do is change the domain to switch the hackage.

On top of that I'd like to see hackage become an overlay (where improvements to the overlay logic may be needed). Hackage right now as I understand is the immutable package store with an append only revisions index. What I'm suggesting is to turn the append only revisions index just into an overlay.


As I understand we want the PVP so we can have tooling like matrix.hho and other tooling that might come down the line. My suggestion would be that we do not enforce this at the package index level, but at the tooling level. If you try to use a package that doesn't follow the PVP with the tools and want to reap the benefits of the tooling, the tooling will politely inform you that it can not handle the package without PVP.


I'd also like to see a more open policy where the community could provide PVP PRs against the overlay if they want a certain (package, version) to follow the PVP. (Of course not every package can be made to follow the PVP, but some might). Maybe the maintainer doesn't see the need to follow a strict PVP, but someone else might and would want to provide the necessary change, so that he can use the tooling that relies on the PVP. If all that's needed is a PR against the hackage overlay, that seems like a rather low barrier to contribution. If packages have their repository / maintainer contacts set, the author could be pinged about this as well.

At this point the binary flag you propose could come into play and work as a whitelist of who would like to be notified about PRs against the overlay?

@gbaz
Collaborator Author

gbaz commented Jan 27, 2018

As I've mentioned on the SLURP proposal, I'm in favor of providing raw-hackage, which you seem to want as well (my raw-hackage = your hackage/uncurated -- the immutable package store hackage already has, with an empty append-only revisions index). I'd also prefer it to be at raw.hackage.haskell.org, such that all I would need to do is change the domain to switch the hackage.

What is the difference between the two ideas then, outside of the domain name? (I'm fine with bikeshedding how the uncurated repo is served any which way).

@cumber

cumber commented Jan 27, 2018

It sounds like the big difference is @angerman is suggesting that the curated layer should contain every package in the raw layer, it would just also provide revisions for some of them. Whereas this proposal suggests the curated layer would have a subset of the packages in the raw layer (and not permit dependency links into the raw layer).

@angerman

@cumber correct!

I’d also want to move the PVP enforcement into the tools that take advantage of PVP and not into hackage. Hackage and cabal-install seem to work with the current upload policy? Am I missing something?

I’d also want to make community participation in hackage package revisions easier by turning hackage into an overlay over hackage-raw/uncurated, that can be driven from a git repository.

@gbaz
Collaborator Author

gbaz commented Jan 27, 2018

I think you're coming at this from the start with overlays in mind, since you've been working on one, and see the benefits. But this proposal starts with a different idea in mind -- collections. Just as stackage is a collection, curated hackage is a collection. It happens the "base material" from which curated hackage will be built is the total of packages on hackage today. But as things evolve, there will be more things in the "uncurated" layer than in the curated layer. As a collection, the things in the curated layer will need the property that collections typically have -- that they are closed under dependencies.

The specific problem with the other approach is that if you have a dependency link into a package that itself does not specify revision bounds, then you transitively also do not specify revision bounds on the dependencies of that package. So that does not suffice. For a curated system to work, it needs to assume that the pvp is followed throughout the ecosystem -- otherwise things fall apart.

An alternate proposal would be to require that curated packages can only depend on uncurated packages if they themselves specify all transitive dependencies they inherit -- but that sounds nightmarish to maintain. Further you lose the generally desirable property of transitive closure.
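To make "closed under dependencies" concrete, here is a minimal sketch of the kind of check a server could run at upload time. It is an illustration only, not part of the proposal: the package names, the `deps` mapping, and both function names are made up, and version bounds are ignored entirely.

```python
from typing import Dict, Set


def missing_dependencies(curated: Set[str], deps: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each curated package to the dependencies it has outside the curated set."""
    problems = {}
    for pkg in sorted(curated):
        outside = deps.get(pkg, set()) - curated
        if outside:
            problems[pkg] = outside
    return problems


def is_closed(curated: Set[str], deps: Dict[str, Set[str]]) -> bool:
    """A collection is closed when no member depends on something outside it."""
    return not missing_dependencies(curated, deps)


# Hypothetical dependency graph, for illustration only.
deps = {"app": {"lib-a", "lib-b"}, "lib-a": {"core"}, "lib-b": set(), "core": set()}

print(is_closed({"app", "lib-a", "lib-b", "core"}, deps))    # True
print(missing_dependencies({"app", "lib-a", "core"}, deps))  # {'app': {'lib-b'}}
```

In the second call `lib-b` plays the role of an uncurated package, so `app` could not be accepted as curated until `lib-b` is adopted.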

As to why curated should not be just a "revisions" overlay -- the idea is that people should be able to depend on an index that contains packages that are known to be installable (this is the purpose of curation). That's what the curated index provides. One can of course also use the curated index as an overlay on uncurated. So all the combinations are possible: 1) everything, with no revisions, 2) everything, with revisions, 3) only curated things. (Again, for the last to work, this requires that curated does not depend on uncurated).

The question then becomes, which of the three setups should ship by default with cabal (in the default config file). This proposal does not specify this. My feeling would be that the curated index should be the default, since it consists of things that are and will continue to be cabal-installable. But this is certainly a question for discussion -- and one in fact that need not be resolved for this proposal to be implemented.

@gbaz
Collaborator Author

gbaz commented Jan 27, 2018

You also write "This seems like a good way to end up with lots of packages that end up in the uncurated package index, even though the maintainer might be sympathetic to the PVP but can't be bothered to start talking to everyone in his dependency chain to please follow the PVP..."

In the abstract, this is a good concern. In practice, I think that this will be less of one. In my experience, and based on conversations with trustees, most things actually try to follow the pvp, or otherwise stay pretty green. Further, while we have a lot of packages on hackage, there are a relatively small number that are commonly in dependency chains.

By back of the envelope accounting, of roughly 12,000 packages on hackage, only 1/3 have any rev-deps, only 1/6 have more than one, and 5% (650) have more than 10.

So that seems like a manageable amount to worry about :-)

Edit: oh, and one more thought -- if indeed there ends up being a logjam of requests for adoption of uncurated packages into curated, we can always either add more trustees or move adoption rights to a group beyond just trustees. I would also hope in the future that tooling can be produced (perhaps in conjunction with the matrix builder) that can render adoption mostly automatic.

@tfausak

tfausak commented Jan 27, 2018

Thanks for typing this up! I don’t currently have time to give it a full review, but I am broadly in favor of it. The only part I strongly disagree with is (eventually) hiding uncurated packages from the UI entirely. I have no problem with distinguishing curated packages, but I feel that uncurated packages should always be visible.

@gbaz
Collaborator Author

gbaz commented Jan 27, 2018

To be clear, the idea (and this certainly is up for discussion) is that there would be a filter feature that does not now exist, which lets certain things (e.g. deprecated packages, packages that are executables) be filtered out. This would be some checkboxes in an in-page javascript thing, probably tied to the functionality that now exists in datatables. The default setting for these would be to filter the uncurated collection, but users could change this.

Also, in such filtering interfaces we would certainly want to indicate how many results are filtered away and hidden, so that users could expand to see all the missing things with a single click.

The general thought is people search hackage for packages that are cabal-installable. Curation is what indicates that. If people want to find packages usable from a given stackage lts, they tend to look at stackage. So making curated packages the ones that are most easily discoverable suits the needs of hackage users.

If we manage to move into a situation where there are multiple collections on hackage (of which curated is just one), then we'd need to revisit this. E.g. if stackage lts releases were also presented as searchable collections, etc. then we'd definitely want to have a different approach.

@angerman

angerman commented Jan 27, 2018

@gbaz you are correct in that I see this through my experience with overlays.

I think you're coming at this from the start with overlays in mind, since you've been working on one, and see the benefits. But this proposal starts with a different idea in mind -- collections. Just as stackage is a collection, curated hackage is a collection. It happens the "base material" from which curated hackage will be built is the total of packages on hackage today. But as things evolve, there will be more things in the "uncurated" layer than in the curated layer. As a collection, the things in the curated layer will need the property that collections typically have -- that they are closed under dependencies.

My basic question here is why can't we improve overlays (allow filtering), such that we can use the same approach to arrive at collections as well?

As such, I would rather like to see myself as wondering if we can't reach what this proposal supports by slightly adjusting the overlay logic we already have and as such provide a uniform mechanism that can be used for a variety of use cases we could come up with?

Under this assumption even stackage could be represented as an overlay over hackage.

Edit: The basic question I have is: assuming we have some hypothetical overlay solution that may be slightly different / improved over the current overlay solution we have, can't we simply use that as the implementation for this proposal, without needing any custom modifications to hackage just for this proposal, while providing a basic building block that could be used for other purposes as well?

@angerman

@gbaz the current curated hackage (with revisions) does work as it is, or am I missing something?

As to why curated should not be just a "revisions" overlay -- the idea is that people should be able to depend on an index that contains packages that are known to be installable (this is the purpose of curation). That's what the curated index provides.

I must be missing something from this proposal then. The way I understand it is to add an additional metadata flag, and a tightening of PVP requirements. Yet, this is a departure from the status quo, which seems to work (for me only?).

@gbaz
Collaborator Author

gbaz commented Jan 27, 2018

My feeling is that overlays and collections are two almost opposite ideas. Overlays allow monotone addition of data. Collections are "coherent subsets" of data. The union of two closed sets is itself closed. A subset of a closed set may be open and need completion under a closure operation. I don't see how overlays can produce collections, since overlays are really about unions. In fact, I could imagine that under an imaginary new collections setup, overlays could also provide overlay modifications to collections. Anyway, we're far afield here.
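The two set-theoretic claims above can be checked with a small sketch. Everything here is hypothetical (the three-package graph and the helper names are invented for illustration): it shows that the union of two dependency-closed sets stays closed, while a subset may need completing.

```python
from typing import Dict, Iterable, Set


def is_closed(pkgs: Set[str], deps: Dict[str, Set[str]]) -> bool:
    """True when every dependency of every member is itself a member."""
    return all(deps.get(p, set()) <= pkgs for p in pkgs)


def closure(seed: Iterable[str], deps: Dict[str, Set[str]]) -> Set[str]:
    """Smallest superset of `seed` that is closed under `deps`."""
    closed: Set[str] = set()
    todo = list(seed)
    while todo:
        pkg = todo.pop()
        if pkg not in closed:
            closed.add(pkg)
            todo.extend(deps.get(pkg, ()))
    return closed


# Hypothetical graph: both "a" and "b" depend on "c".
deps = {"a": {"c"}, "b": {"c"}, "c": set()}
left, right = {"a", "c"}, {"b", "c"}

print(is_closed(left | right, deps))         # True: union of closed sets
print(is_closed({"a", "b"}, deps))           # False: subset of a closed set
print(sorted(closure({"a", "b"}, deps)))     # ['a', 'b', 'c']
```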

More directly, if I am to understand your second question -- you are asking if curated hackage with revisions works as is, and so why we need to depart from it?

The answer, I think, is it works mainly, for many of us, but not for some people. In particular, on the one side, trustees get frustrated that packages choose not to adhere to versioning standards, leading to a lot of breakage. On the other side, authors can get frustrated when they just want to toss up something as e.g. a research project or the like, or for whatever reason just don't want to follow versioning guidelines, but they find themselves getting requests to change their versioning.

This is because of the dual role Hackage plays, which I alluded to above. You wrote earlier "hackage has always been a curated package repository, where trustees curate packages." Well, yes and no. Hackage hasn't always had trustees, or curation, or revisions. It has had a variety of changes over the years. But it has always been the place to put released Haskell packages. We want it to be able to continue to play that role for everyone, while also providing important features relating to the health of a package ecosystem, which is a consideration (that we would have an ecosystem so large and complex that we needed to maintain its health!) that was hard to imagine back when getting 100 packages onto it was considered a major milestone.

By letting people who want to use hackage as simply a place to release haskell packages do so without fuss, and also establishing a curated layer which is managed as an ecosystem we ideally make everyone happier. Trustees need not worry about packages which opt-out -- they can adopt them, or they can leave them be, but there doesn't need to be a weird middle ground. Authors who don't want to worry about the ecosystem can decide not to. Then someone else can decide to make it their problem (through offering co-maintainership ideally, for the purpose of creating revisions) or not.

At the moment, on hackage, we de facto have packages that are curated and those that effectively have opted-out either by direct request to trustees, or by just doing their own thing but not being on anyone's "radar" by not being a noticeable part of the revdeps graph. But this is all somewhat confused and muddled, and when a package that tries to do the right thing depends on one that does not (and here I know of examples, but don't want to list them, because I don't want to foster animosity -- I want to remove it) then that causes a bunch of work for the trustees.

In a sense this proposal is to make more formal and clear to everyone (and enforce technically) a lot of what has already been semi-worked-out informally, and to let people signal their existing practices more effectively to prevent miscommunication.

@ElvishJerricco

+1 @angerman's idea. I think we really need to do the thing that just gets out of people's way. Having to opt in or out, and being told you can't do things because of another person's opting (which, maybe, can change over time), is going to lead to frustration eventually. For example, I would love to follow the PVP, but there are a lot of libraries I like to depend on that don't. Being in a sort of half-way state, where I can do my best but ultimately have a major failure point is much better than not letting me do my best. I think there's a good portion of libraries I would like to write that would end up uncurated even though I'd like to be using PVP. Also, as I alluded to earlier, what if one of my dependencies opts out of the PVP in a later version? I suppose this doesn't need to be possible, but it is another wrench in the plan, as I think some people don't want to be locked into the PVP forever. All in all, "uncurated" will behave like a virus, and eventually everyone will be forced to make all new packages uncurated, defeating the purpose of the proposal.

Having an uncurated layer, with an overlay modeling the current model of unilateral curation by trustees, is going to make it much easier for users to stay out of each others' way. It will likely be harder on trustees, which I believe may be one of the prevailing concerns against this idea. But frankly, I think making things harder on the trustees is better than making things harder on all users.

This has to have a good user experience. Throwing multiple new wrenches into a beginner's face as proposed is going to go very poorly: 1) Now when they upload their first package, they have to figure out why one would opt in or out. 2) They have to deal with Hackage rejecting dependency graphs that the user doesn't understand aren't fully PVP compliant, thus leading to newcomers choosing to opt out of PVP by default just to shut Hackage up. I think that as is, this proposal somewhat violates the principle of least astonishment, which seems like a pretty critical principle in the current environment.

That said, I would still prefer this proposal as is over doing nothing. So +1 from me.

@DanBurton

Uncurated packages may be "adopted" into the curated ecosystem by trustees. Metadata revisions necessarily remove the x-uncurated property from the revised cabal metadata.

I don't understand how "x-uncurated" is considered an opt out of trustee revisions, if trustees have the power to "adopt" any uncurated packages.

Hackage will provide two package repository roots -- http://hackage.haskell.org and http://hackage.haskell.org/uncurated. These roots will provide index-01.tar.gz files that contain the information, respectively, for curated packages, or for all packages. The uncurated root will contain no revision information.

I'm not sure I understand this correctly, either. Doesn't the hackage.haskell.org root already include all "unrevised" information in addition to revision information? The uncurated root therefore provides strictly less information. What's the point? Can't tooling just access whatever info it wants?

I understand the stated motivation in theory, but don't see what meaningful change this proposal accomplishes in practice. I like the general idea but something is not quite clicking for me.

@gbaz
Collaborator Author

gbaz commented Jan 27, 2018

I don't understand how "x-uncurated" is considered an opt out of trustee revisions, if trustees have the power to "adopt" any uncurated packages.

Thanks to this question, I just realized that I think I got something wrong in the proposal. The reason it opts-out of revisions is that packages in the uncurated index do not include revisions. However, the implication in the proposal is that all packages in the uncurated index do not include revisions. This is wrong, I think. Rather, the uncurated index should be [curated with revisions + uncurated without revisions]. And the curated index should be [curated with revisions + adopted uncurated with revisions].

That should make clear the difference.
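As a rough sketch of that corrected composition (illustrative only: the `Release` record, its field names, and the sample releases are all made up, and I am assuming an adopted release still shows up unrevised in the uncurated index, which the proposal does not spell out):

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Release:
    name: str
    version: str
    uncurated: bool  # per-version flag set by the author at upload
    adopted: bool    # hypothetical marker: a trustee adopted this release
    revisions: int   # number of metadata revisions on record

Entry = Tuple[str, str, int]  # (name, version, revision number served)


def uncurated_index(releases: List[Release]) -> List[Entry]:
    """Everything: curated releases with revisions, uncurated ones without."""
    return sorted((r.name, r.version, 0 if r.uncurated else r.revisions) for r in releases)


def curated_index(releases: List[Release]) -> List[Entry]:
    """Curated releases plus adopted uncurated ones, all with revisions."""
    return sorted((r.name, r.version, r.revisions) for r in releases if not r.uncurated or r.adopted)


releases = [
    Release("foo", "1.0", uncurated=False, adopted=False, revisions=2),
    Release("bar", "0.1", uncurated=True, adopted=False, revisions=0),
    Release("baz", "2.0", uncurated=True, adopted=True, revisions=1),
]

print(uncurated_index(releases))  # [('bar', '0.1', 0), ('baz', '2.0', 0), ('foo', '1.0', 2)]
print(curated_index(releases))    # [('baz', '2.0', 1), ('foo', '1.0', 2)]
```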

I'm not sure I understand this correctly, either. Doesn't the hackage.haskell.org root already include all "unrevised" information in addition to revision information? The uncurated root therefore provides strictly less information. What's the point? Can't tooling just access whatever info it wants?

This is also a pretty good point. Arguably we could just make all the information available and leave it to tooling to interpret it as it sees fit. However, tooling that exists currently can't do that. By providing two index files we make the choice available without any downstream changes.

Again, long term, with mythical collection support, we wouldn't want to end up with an index-per-collection. But in the meantime, this lets us get to the goal in a more modular way.

I also think @ElvishJerricco has two good concerns, first about if "uncurated" will have a sort of inevitable gravitational pull (the tendency of the universe towards chaos, entropy consuming all things, etc.). I was hoping adoption would stem that. But I would really like current hackage trustees themselves to weigh in, since they have the best feel for the dynamics of the ecosystem as it stands.

The second is about new user experience, principle of least astonishment, etc. My take at the moment is the following -- the choice of flag isn't forced on uploaders. Rather, it is false by default. If the transitive check fails (and ideally it does not), then there is a message explaining this, with a link to a good explanation, and suggesting they request adoption of the packages that caused the check to fail.

In general, recall that transitivity-enforcement is step five of a five part plan. It might be worthwhile to include something in the proposal that before conducting this step, we need to reassess the state of things to be confident it can be done relatively painlessly, and otherwise seek other mitigating measures or reassess. For example, we could render it a warning first, and keep track of how many times and in what cases the server issues the warning. That way we know if it is an issue in practice or not.

That said, I realized I can motivate the need for transitivity and two indexes by the following chain of implications.

First: We have to start curated with the existing full package set.
Second: Therefore we need curation to be a per-version flag so packages can migrate out.
Third: Therefore the solver can well behave differently in an index that contains both curated and uncurated versions, compared to one that only contains curated versions, even if there are some versions of all dependencies in both.
Fourth: Therefore we need to provide a distinct curated index, and additionally the index must be transitively closed.
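The third step can be illustrated with a toy example. This is only a sketch: `pick_newest` is a made-up stand-in for a real solver (it just takes the highest version), and the package `foo` with its two releases is hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Version = Tuple[int, ...]
Key = Tuple[str, Version]


def pick_newest(index: Set[Key], package: str) -> Optional[Version]:
    """Toy stand-in for a solver: choose the highest available version."""
    versions = [version for (name, version) in index if name == package]
    return max(versions) if versions else None


# Hypothetical package whose newer release opted out of curation.
releases: Dict[Key, bool] = {
    ("foo", (1, 0)): True,   # curated
    ("foo", (2, 0)): False,  # uncurated
}

full_index = set(releases)
curated_only = {key for key, curated in releases.items() if curated}

print(pick_newest(full_index, "foo"))    # (2, 0)
print(pick_newest(curated_only, "foo"))  # (1, 0)
```

Both indexes contain some version of `foo`, yet the chosen version differs, which is why a single mixed index cannot stand in for a curated one.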

@angerman

My feeling is that overlays and collections are two almost opposite ideas. Overlays allow monotone addition of data. Collections are "coherent subsets" of data. The union of two closed sets is itself closed. A subset of a closed set may be open and need completion under a closure operation. I don't see how overlays can produce collections, since overlays are really about unions. In fact, I could imagine that under an imaginary new collections setup, overlays could also provide overlay modifications to collections. Anyway, we're far afield here.

@gbaz thank you for clearing this up! Your and my understanding/interpretation of overlay seem to diverge a little. To me an overlay can be restrictive on the set it overlays (e.g. via a predicate) and does not necessarily need to be a union. Maybe we just need a different word? From my point of view an improved overlay mechanism could in fact yield a subset based on a predicate, while at the same time provide augmentation of the underlying set (for example to satisfy the predicate).

I'm sorry if this confusion of terminology has led to unnecessary misunderstandings!

Again my intention was only to push for a mechanism that would let us model collections and patches on top of hackage in a unified manner that would be generic and allow for the same tooling to be used for all use cases.

Again the benefits I see are (given raw hackage):

  • model hackage (as is)
  • model hackage (as proposed in here)
  • model stackage lts (as is)
  • model mobilehaskell and head overlays
  • potentially model the composition of stackage lts and mobile haskell or head (there is no clean composition here as patches just don't compose well, but this might be of little concern in practice if patches are orthogonal and merging of two "overlays" would result in no merge conflicts)
  • model a blessed set of packages (I can only use packages that are MIT or BSD licensed, including transitive dependencies due to company policy, ...)
  • Eta's Etlas (I believe).
  • model something we haven't thought of yet, that fits in the generic framework of augment (patch) and restrict (predicate).
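
A minimal sketch of what "augment (patch) and restrict (predicate)" could look like, using the license example from the list. This is purely illustrative and not an existing overlay mechanism: `apply_overlay`, the metadata shape, and the three packages are invented, and transitive dependencies are not considered.

```python
from typing import Callable, Dict

Metadata = Dict[str, str]
Index = Dict[str, Metadata]


def apply_overlay(base: Index, patches: Index, keep: Callable[[str, Metadata], bool]) -> Index:
    """Augment: merge patches into the base metadata. Restrict: keep what passes the predicate."""
    patched = {name: {**meta, **patches.get(name, {})} for name, meta in base.items()}
    return {name: meta for name, meta in patched.items() if keep(name, meta)}


# Hypothetical raw index; "baz" has no license recorded until the overlay patches it.
raw: Index = {
    "foo": {"license": "MIT"},
    "bar": {"license": "GPL-3"},
    "baz": {"license": ""},
}
patches: Index = {"baz": {"license": "BSD3"}}


def permissive(name: str, meta: Metadata) -> bool:
    return meta.get("license") in {"MIT", "BSD3"}


print(sorted(apply_overlay(raw, patches, permissive)))  # ['baz', 'foo']
```

Here the patch is what lets `baz` satisfy the predicate, which is the "augment to satisfy the predicate" case mentioned earlier.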

The desired end-state will have the following properties:

1) Packages will have an additional flag set in the Hackage package database, that indicates if they are curated or not. This flag is set *per version*.
2) Package authors will set this flag *on upload*, by setting the "x-uncurated" property of the cabal file of a package to "true". If no "x-uncurated" property is set, this will be considered "false".
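
For illustration, the default described in point 2 could be read like this. This is a naive line-based sketch and not how Cabal parses package descriptions; `is_uncurated` and the sample cabal text are hypothetical.

```python
def is_uncurated(cabal_text: str) -> bool:
    """Return the value of the "x-uncurated" field, treating a missing field as false."""
    for line in cabal_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "x-uncurated":
            return value.strip().lower() == "true"
    return False  # no field set: the release counts as curated


opted_out = "name: foo\nversion: 1.0\nx-uncurated: true\n"
default = "name: foo\nversion: 1.1\n"

print(is_uncurated(opted_out))  # True
print(is_uncurated(default))    # False
```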

IMHO it is better to avoid additional negations when naming booleans, especially since you're putting in a sensible default anyway. (Imagine code like if not (uncurated p)).

Instead just call it x-curated, default true?

Collaborator Author

My thought was that unset flags by convention default to "false". But this is a good argument the other way. Don't have a strong opinion here.

I also prefer to avoid double negations.

Member

In the spirit of avoiding boolean blindness can we please call the flag x-curation and its values curated and uncurated?

@bergmark

bergmark commented Jan 27, 2018

Thanks @gbaz, I think this looks promising.

I'm reading this while I'm in a bit of a hurry so please excuse me if I missed something here.

I'm a bit confused on what x-uncurated means. Is it that a release is currently only part of the uncurated index but can be revised by a trustee to be curated, or is it an opt out for maintainers to say that a package will never be a part of the curated index?

As @ElvishJerricco wondered, will everything gravitate towards being uncurated? I think that depends on the answer to my question above. If any version of a package that is uploaded with curation disabled can be revised for inclusion to the curated index I do not expect this to be an issue since trustees will be able to include it. When there is a new release of that package the curated index will not contain it by default so the process must be repeated. The curated index would only contain older versions for a while. This is similar to how stackage nightly works: at times it may be months before the latest release of a package makes it into nightly. This may cause other packages that require the newest release to also be left out of nightly. This causes issues sometimes, but rarely; eventually stackage nightly catches up to the latest versions. I speculate that it would be a similarly sized issue for the curated index.

I do expect that this proposal would lead to more work for trustees, but there are at least two solutions to the issue that we can do today. 1. Onboard more trustees, 2. Package maintainers can add co-maintainers that perform revisions.
Maintainers may be reluctant to give someone maintainer status for just revisions, if this is the case I suggest that we add a "revisionist" permission level to individual packages so that a user can make revisions to a package but not upload new versions.

I need to head out for today, hopefully I'll have some time to re-read this during the weekend.

@hvr
Member

hvr commented Jan 27, 2018

@bergmark

  1. Onboard more trustees, 2. Package maintainers can add co-maintainers that perform revisions.

Indeed, that's the idea to keep it simple and to incentivise maintainers teaming up with co-maintainers of their choice as curation inevitably requires some level (even if very minor) of cooperation & communication with a maintainer. I'd like to see how well this works out before considering any more complicated mechanisms (like the "revisionist" bit) in anticipation of something that isn't known to be a problem yet.

@maxnordlund

maxnordlund commented Jan 27, 2018

At the risk of repeating something already known, some other package managers[1] use the concept of a lock file to solve some parts of these issues.

The idea is that you can depend on uncurated packages by pinning their revision, thus ensuring a green build across machines. While mostly used for end consumers, I believe it can be tweaked to be used by packages uploaded to Hackage.

An alternate proposal would be to require that curated packages can only depend on uncurated packages if they themselves specify all transitive dependencies they inherit -- but that sounds nightmarish to maintain. Further you lose the generally desirable property of transitive closure.

This is very true, but by using a lock file you can give the solver a green build plan. The only caveat is the loss of transitive closure when multiple curated packages depend on different versions of an uncurated package.

I guess this is where manual intervention by trustees is needed, but it might allow more packages to have curated status even when depending on uncurated ones. It should also help to triage those uncurated packages that need blessing and/or some work to adhere to the PVP.


Edit:
I forgot Elm's package manager, which actually tries to enforce semver; this also seems doable in Hackage. Well, at least the check, which should help with loosening of version bounds inside those hypothetical lock files for uncurated packages.

[1] Ruby's Bundler, Node's yarn and npm, Rust's Cargo and Python's pip. With pip this works by convention, by specifying exact requirements instead of ranges.

@23Skidoo
Member

23Skidoo commented Jan 27, 2018

If I understand the concept correctly, cabal-install already supports lock files (see cabal [new-]freeze), and stack is basically built around this idea.

@phadej

phadej commented Jan 27, 2018

I like this proposal in general, and I really like the per-version property.


There is a current "non-solution" of uploading a boundless version and then making a revision to add bounds. It's a non-solution because the current index (which is curated) is populated with a boundless version for no reason.

This proposal makes that more robust:

  • upload boundless with x-curated: False
  • make a revision adding bounds and removing x-curated field

I could use this approach myself when there are two different indices: I would probably upload a boundless version so I don't need to do revisions for test or benchmark dependencies on Stackage. (I find that very annoying.)


Curated packages cannot depend on uncurated packages, and the hackage server will detect this as an error at upload time.

I don't think this check is a hard requirement for this proposal to go forward.

Rather, the community should write documentation on how to use different indices, and improve tooling where necessary:

  • cabal does support multiple indices, but switching them is inconvenient.
  • AFAIK stack doesn't support index change atm.

As there aren't uncurated packages in the curated index, the check could be as simple as: there should be an install plan for a new upload.

FWIW, this can (should?) be done even now. Or to put it differently: if we don't want to do it for current Hackage, we shouldn't in the curated/uncurated setting.

If the check is done without using a solver, i.e. "for each dependency there is some curated version", I'm afraid it won't work as intended in the long run.

If the check is implemented, it should only check the library's and executables' dependencies. Tests and benchmarks don't need to be there. For example, it's common to benchmark against similar packages, which may be uncurated.
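As an illustration of the component filtering only, here is a small sketch. It deliberately uses the naive "each dependency has some curated version" test that the previous paragraph warns about, not a real install-plan check, which would need the solver. The `blocking_dependencies` helper is hypothetical, the curated set is made up, and `some-uncurated-rival` is an invented package name.

```python
from typing import Dict, List, Set

# Only these component kinds count toward the curated check (assumption
# taken from the comment above: tests and benchmarks are exempt).
CHECKED_COMPONENTS = {"library", "executable"}


def blocking_dependencies(
    components: Dict[str, List[str]],
    curated: Set[str],
) -> Set[str]:
    """Return dependencies of checked components with no curated version."""
    blockers: Set[str] = set()
    for kind, deps in components.items():
        if kind not in CHECKED_COMPONENTS:
            continue  # tests and benchmarks may depend on uncurated packages
        blockers.update(dep for dep in deps if dep not in curated)
    return blockers


# Made-up upload: the benchmark compares against an uncurated rival package.
upload = {
    "library": ["base", "containers"],
    "executable": ["base", "optparse-applicative"],
    "benchmark": ["base", "criterion", "some-uncurated-rival"],
}
curated_names = {"base", "containers", "optparse-applicative", "criterion"}
```

With this data the upload passes, because the only uncurated dependency sits in the benchmark component; the same dependency in the library component would be reported as a blocker.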


I don't understand how "x-uncurated" is considered an opt out of trustee revisions, if trustees have the power to "adopt" any uncurated packages.

Thanks to this question, I just realized that I think I got something wrong in the proposal. The reason it opts-out of revisions is that packages in the uncurated index do not include revisions. However, the implication in the proposal is that all packages in the uncurated index do not include revisions. This is wrong, I think. Rather, the uncurated index should be [curated with revisions + uncurated without revisions]. And the curated index should be [curated with revisions + adopted uncurated with revisions].

I think the opposite: the uncurated index should be truly "uncurated", i.e. no revisions at all.

Good properties are:

  • Metadata of packages in the uncurated index is the same as in the tarballs. This means that "uncurated = completely unmutated".
  • Then actions of Hackage Trustees won't interfere with e.g. Stackage curation.

Alternatively, a technically elegant solution is to have three separate disjoint indices:

  1. uncurated (x-curated: False)
  2. curated first upload
  3. (curated) revisions

In the current version of the proposal we have Uncurated = 1 + 2, Curated = 2 + 3. If someone needs "curated with revisions + uncurated without revisions", that would be everything: 1 + 2 + 3.

Note: "curated with revisions + adopted uncurated with revisions" is 2 + 3.
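The arithmetic above can be read as set unions over three disjoint layers. The sketch below only illustrates that reading; the `Layer` enum, the `view` helper, and the entry names are hypothetical and do not correspond to anything in hackage-server.

```python
from enum import Enum
from typing import FrozenSet, List, Tuple


class Layer(Enum):
    UNCURATED = 1       # uploads marked x-curated: False
    CURATED_UPLOAD = 2  # curated first uploads
    REVISION = 3        # (curated) revisions


# Index definitions as stated in the comment above.
UNCURATED_INDEX = frozenset({Layer.UNCURATED, Layer.CURATED_UPLOAD})  # 1 + 2
CURATED_INDEX = frozenset({Layer.CURATED_UPLOAD, Layer.REVISION})     # 2 + 3
EVERYTHING = UNCURATED_INDEX | CURATED_INDEX                          # 1 + 2 + 3


def view(entries: List[Tuple[str, Layer]], layers: FrozenSet[Layer]) -> List[str]:
    """Return the entries visible through an index built from `layers`."""
    return [name for name, layer in entries if layer in layers]


# Made-up entries, one per layer.
entries = [
    ("foo-1.0", Layer.CURATED_UPLOAD),
    ("foo-1.0-r1", Layer.REVISION),
    ("bar-0.1", Layer.UNCURATED),
]
```

Because the layers are disjoint, any combined index is just a union, so a client could in principle pick whichever combination it needs.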


UI idea: as Hackage lists all versions of a package, non-curated versions could be shown in a different color or in italics, so they differ from curated ones. The same way deprecated & preferred versions are highlighted.


first about if "uncurated" will have a sort of inevitable gravitational pull (the tendency of the universe towards chaos, entropy consuming all things, etc.). I was hoping adoption would stem that. But I would really like current hackage trustees themselves to weigh in, since they have the best feel for the dynamics of the ecosystem as it stands.

I think we should ask the maintainers of the "5% (650)" central packages you (@gbaz) identified how they would behave if this proposal is implemented.

Some authors (including me) have upper bounds on all dependencies, but I personally don't (AFAIK) maintain (so I can say package should have upper bounds) anything which is very close to the dependency root of Hackage.

Some authors don't want to have upper bounds unconditionally on every dependency, but aren't against revisions (like haskell-infra/hackage-trustees#125 (comment)), and I think Hackage Trustees can also in the future work with them, to have their packages in the curated index.

Hackage Trustees (though I speak only for myself) can handle the workload of adopting some versions of the rest of the closure, if there are maintainers who are completely against making x-curated: True releases themselves.

@gelisam
Collaborator

gelisam commented Jan 27, 2018

Curated packages cannot depend on uncurated packages

[...]

Hackage trustees will recognize and respect the uncurated flag, and not contact those who set it with any issues.

[...]

Uncurated packages may be "adopted" into the curated ecosystem by trustees.

What about the following scenario: I want my package to have good bounds and I would be happy to be contacted about issues, but I can't because one of my dependencies is uncurated and I don't have the energy to get it adopted. Later on, somebody does get that dependency adopted; I would like to be contacted so I can now make my package curated as well, but I won't be because the fact that my package is uncurated-despite-my-best-wishes will be interpreted as please-don't-contact-me!

Furthermore, I think wishing-to-be-contacted is a per-maintainer decision, not a per-revision decision. So it should be possible to change this decision when a package changes hands, or when a maintainer changes their mind. It also has no impact on building packages, so there's no reason to associate it with a version, a revision, or to keep it immutable. Maybe it could be a mutable flag on our Hackage profile page?

@gbaz
Collaborator Author

gbaz commented Jan 28, 2018

@bergmark writes

I'm a bit confused on what x-uncurated means. Is it that a release is currently only part of the uncurated index but can be revised by a trustee to be curated, or is it an opt out for maintainers to say that a package will never be a part of the curated index?

It means the former, which is what you were hoping for :-)

@gelisam raises a very good question:

What about the following scenario: I want my package to have good bounds and I would be happy to be contacted about issues, but I can't because one of my dependencies is uncurated and I don't have the energy to get it adopted. Later on, somebody does get that dependency adopted; I would like to be contacted so I can now make my package curated as well, but I won't be because the fact that my package is uncurated-despite-my-best-wishes will be interpreted as please-don't-contact-me!

This is a very good point. I do think the information should remain per-package at least for the following reason however -- it may be you have stuff you care about and want curated, and you also have stuff you don't want curated and don't care about at all (e.g. point-in-time research artifacts).

Anyway, I think there's a good solution, here coming from the comment from @tomjaguarpaw about boolean blindness. We can have x-curation: curated (by default), x-curation: uncurated, x-curation: uncurated-adoption-sought and x-curation: uncurated-no-contact or the like. Obviously the flag names need work. This also gives a great onramp to users -- the transitive check can suggest the adoption-sought flag, and trustees can look to fix up the blockers.
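A rough sketch of what a four-valued field could look like to tooling. The value strings are the tentative ones from the paragraph above, which itself says the names need work; the `parse_curation` and `wants_adoption` helpers are hypothetical.

```python
from enum import Enum
from typing import Optional


class Curation(Enum):
    CURATED = "curated"
    UNCURATED = "uncurated"
    ADOPTION_SOUGHT = "uncurated-adoption-sought"
    NO_CONTACT = "uncurated-no-contact"


def parse_curation(field: Optional[str]) -> Curation:
    """Parse an x-curation value; a missing field means curated (the default)."""
    if field is None:
        return Curation.CURATED
    # Raises ValueError for anything outside the four known values.
    return Curation(field.strip().lower())


def wants_adoption(curation: Curation) -> bool:
    """True when the maintainer is asking trustees to help fix blockers."""
    return curation is Curation.ADOPTION_SOUGHT
```

Compared with a boolean, an unknown value is rejected instead of silently collapsing into one of two states, and the "uncurated against my wishes" case gets its own name.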

I want to give a few days for the discussion here to proceed (especially for those that don't consider discussing package management their idea of weekend fun :-P). Following that, I'll revise the proposal to try to take a lot of the important ideas and points raised here into account.

@snoyberg
Collaborator

Thanks for writing this up @gbaz. I'd like to focus on how this is intended to interact with Stackage. I believe the proposal is designed to account for desires of Stackage as a downstream consumer, which some commenters here may not be aware of, but the proposal seems to be gearing towards.

Some authors of packages in Stackage wish to opt-out entirely from revisions and requirements of maintaining PVP bounds. Stackage would like to respect their wishes, and not pull in revisions that these authors do not agree to. (In some cases, such revisions have introduced unnecessarily strict upper bounds, causing authors to need to un-revise those bounds or upload new versions.) At the same time, Hackage Trustees wish to be able to provide revisions for consumption by cabal-install's dependency solver.

One response to all of this would be for Stackage to simply ignore all revisions, what I believe is known as the rev0 proposal. This unfortunately has its own limitations:

  • Packages which follow the PVP from rev0 would then retain unnecessarily strict upper bounds for Stackage's usage, since Stackage would not receive the revisions which relax their upper bounds
  • There would be significant user and author confusion by Stackage and Hackage having completely differing views of the world. It would be completely understandable for a Stackage user to file a report against a rev0 view of the cabal file, and an author to not understand why the user does not see the revisions already made.

As I understand it, this is the logic behind your comment above, which seemed to cause confusion:

Rather, the uncurated index should be [curated with revisions + uncurated without revisions]. And the curated index should be [curated with revisions + adopted uncurated with revisions].

In this world, Stackage would use the uncurated index as its upstream. Authors who have opted out of curation will have no revisions from trustees appear in Stackage, as they desire. However, authors who choose to introduce PVP-style bounds and then relax them with revisions will not be asked by Stackage curators/users to make a separate package upload to relax version bounds.
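One way to read this design is as a rule for which revision of a package's cabal file each index serves. The sketch below encodes that reading only; the `served_revision` function and its parameters are hypothetical, and it assumes that a package uploaded as uncurated stays at revision 0 in the uncurated index even after adoption, which the discussion does not spell out.

```python
from typing import Optional


def served_revision(
    index: str,
    uploaded_curated: bool,
    adopted: bool,
    latest_revision: int,
) -> Optional[int]:
    """Return the cabal-file revision an index serves, or None if excluded."""
    if index == "curated":
        # Curated index: curated with revisions + adopted uncurated with revisions.
        if uploaded_curated or adopted:
            return latest_revision
        return None
    if index == "uncurated":
        # Uncurated index: curated with revisions + uncurated without revisions.
        # Assumption: adoption does not change what this index serves.
        return latest_revision if uploaded_curated else 0
    raise ValueError(f"unknown index: {index}")
```

Under this reading, a Stackage build using the uncurated index would see relaxed bounds for authors who revise their own curated packages, and only rev0 for authors who opted out.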

That's at least my understanding of this design. Have I read too much into it?

A few other questions which I think haven't been asked above, apologies if they're repeats:

  • Should uncurated packages block author revisions to avoid confusion? Otherwise, we could end up in the same situation where an author uploads a package with restrictive bounds but opts out of curation, then uses revisions to relax the bounds for cabal-install, but Stackage does not see those changes.
  • What is displayed as far as version information on the Hackage page? A large concern raised throughout these designs is the problem of each ecosystem displaying conflicting information. It would be ideal if the uncurated packages showed the rev0 information by default, perhaps with a link to display revision information.
  • There's a greater question that all of this begs, which is whether we'd all be better off if Stackage ignored version bounds entirely. If I'm not mistaken, many people reviewing the initial SLURP proposal brought up this idea. On the one hand, I hesitate to even raise this idea here, since it's fairly tangential. But on the other hand, such a decision could have a major impact on design of a feature like this. I'll try to do a write-up this week of the arguments I heard for changing Stackage, and perhaps that discussion will shed light on this proposal.

Again, thank you for taking the ball on this proposal.

@tomjaguarpaw
Member

I don't know the rationale for Stackage not ignoring version bounds. Has this been discussed somewhere before that someone can link me to?

@lspitzner

Have there been any constructive proposals about changing (trustee) guidelines? If so, why don't we take the discussion to the relevant proposal/issue/PR? If not, do we really have to continue seeing and replying to non-constructive comments?

If you don't like this proposal, which at least tries to reduce friction inside the community, make a better proposal! If you think we need a code of conduct, or a change to guidelines, or whatever other change, make a constructive first step!

@gbaz
Collaborator Author

gbaz commented Jul 10, 2018

@sol the diff to hackage guidelines in line with this proposal was made here: https://github.com/haskell/hackage-server/pull/676/files

I think it should be fine from the standpoint of your concerns.

@lspitzner as I linked above, trustee guidelines in fact were changed and strengthened as a result of discussions: https://github.com/haskell-infra/hackage-trustees/pull/154/files

@lspitzner

indeed, yet @vedksah is still complaining without any constructive input from their side, afaict.

It is way too easy to always pick some aspect(s) of what others propose and complain, or state some "-1" without putting in the work that an alternative (proposal) would require. "My concerns have not been addressed!" "What is the progress on vague idea X?" "We should institute some better policy!" If this happens consistently, the discussion is one-sided, and probably will not help in resolving the underlying conflict.

Perhaps this mostly fits on the last few comments, but it is a dynamic worth mentioning.

@tomjaguarpaw
Member

@vedksah What concrete change do you actually want to see and what direct benefit would it have?

@sol
Member

sol commented Jul 10, 2018

Thanks @gbaz!

@simonpj

simonpj commented Jul 10, 2018

We ask for a code of conduct and all we get are more empty words

I don't know if this is the right place to discuss this question (where is?) but I get widely differing opinions about having a Haskell Community Code of Conduct. Some people (including me) would welcome it. Others have strongly opposed having one. Everyone asks (rightly) how one would police it. Even aside from the difficulty of finding moderators, no one can prevent someone posting on Twitter, Reddit or many other public fora -- the Haskell community is extremely open.

Would a code of conduct without teeth be a step forward? I tend to think so, if only as a point of reference. To avoid the complexity of deciding who "we" are in posting "our" code of conduct, I was even thinking of posting "Simon's personal code of conduct", with no attempt to impose it on anyone, just put it out there.

But what I think is constructive might not be right. I'm not an active participant in many of these fora, so I might be way off beam.

The more we can move from blame to constructive suggestions the better.

@simonpj @simonmar I implore you to do something! This has to stop!

I'm simonpj@microsoft.com if you want to write to me.

@akhra

akhra commented Jul 11, 2018

@simonpj active or not, you're recognized and respected throughout the Haskellosphere, including as a moral guidepost. And for what it's worth, I think presenting your personal COC is both useful in its own right, and entirely in line with why you are already held up as our gold standard of classy behavior.

@mboes

mboes commented Jul 11, 2018

@simonpj It's my impression that @vedksah is talking about a Code of Conduct for Hackage Trustees. At any rate, that's what he mentioned upthread here. If so, then that's a much easier problem to solve. It's much easier to enforce it, because that particular CoC would apply to a fixed group of people, who presumably would eventually lose their trustee status given repeated or egregious violations of said CoC.

@gbaz's first response to @vedksah seems to be that such a CoC already exists. It's called the Hackage Trustee Guidelines, and they have been recently strengthened. In which case, the questions become:

  • should the behaviour (s)he refers to be considered "misconduct"?
  • is the behaviour (s)he refers to misconduct of a trustee acting in capacity of a trustee?
  • If yes to both, do the current guidelines make that clear?
  • Further, are the current guidelines effective at reducing the likelihood of such misconduct occurring in the future? (If there's a process for designating trustees, surely there can be an effective process to remove trustees.)

I don't have answers to these questions, but this pull request is the wrong forum to ask them and to answer them. I suggest this be taken over to https://github.com/haskell-infra/hackage-trustees.

@vedksah

vedksah commented Jul 16, 2018

lose their trustee status given repeated or egregious violations

That's exactly what needs to be done!

We have more than enough evidence for the pattern of ongoing misconduct

a CoC already exists. It's called the Hackage Trustee Guidelines, and they have been recently strengthened. In which case, the questions become:

What exactly was "strengthened"? What does the CoC effectively protect against? Where is the procedure for dishonorable discharge?

haskell-infra/hackage-trustees#154 (comment) says

I think this is the product of a fair amount of discussion, so I'll go ahead and merge at this point.

Where can I find this pretended "fair amount of discussion"?

In which case, the questions become:

  • should the behaviour (s)he refers to be considered "misconduct"?

Obviously! How else would you describe the behavior?

  • is the behaviour (s)he refers to misconduct of a trustee acting in capacity of a trustee?

Irrelevant! How can you trust somebody in a responsible position who merely acts polite in the streets but abuses in the sheets?

  • If yes to both, do the current guidelines make that clear?

Clear as mud.

  • Further, are the current guidelines effective at reducing the likelihood of such misconduct occurring in the future?

Definitely not. It's not surprising given who wrote them.

@tomjaguarpaw
Member

We have more than enough evidence

To me that looks more like "vague accusations". @vedksah I've no idea who, if anyone, is in the right in this matter but the manner in which you are dealing with the issue is doing you no favour. I'd appreciate an answer to my earlier question. Without specific details nothing can be done.

@gbaz
Collaborator Author

gbaz commented Jul 16, 2018

It's not surprising given who wrote them.

The changes to the guidelines were drafted by Adam Bergmark with input given by other trustees. One can tell who wrote them by checking the authorship of the PR.

@tonymorris

I think calls for a Code of Conduct from one of the most abusive Haskell users are … puzzling.

Codes of Conduct are almost universally used as a weapon for political gain. shrug I'm almost tempted to withdraw my strong objections in this one very specific case. Almost.

@tonymorris

I was even thinking of posting "Simon's personal code of conduct", with no attempt to impose it on anyone, just put it out there.

I only just noticed this comment.

FWIW, my team of Haskell programmers has a similar Code of Conduct, which is simply a personal statement, "We will try to do this thing." It cannot possibly be used for the insidious purposes for which Codes of Conduct are often deployed, as is being proposed in this case. It does not impose penalties; there are no elevated positions such as judges of morality (except yourself, unto yourself), no witch-hunts, gaslighting, abuse and all the other nefarious properties that accompany these political weapons called Codes of Conduct. A personal "Code of Conduct" is a constructive tool for reflection. It is nothing else.

Here is our team's current CoC, and is subject to change as we learn more about ourselves. Anyone is welcome to adopt some or all of it for their own constructive purpose.

https://gist.github.com/tonymorris/90522094bb964fd0d7bb42acd43ff4fb

To summarise, I am in favour of (and highly recommend) personal Codes of Conduct. I am firmly against political weapons.

Peace.

@simonpj

simonpj commented Jul 24, 2018

A personal "Code of Conduct" is a constructive tool for reflection. It is nothing else.

Interesting. There has been very little comment on my suggestion on this thread, although one person wrote privately to say "don't do this". When I asked why s/he said (I paraphrase) that even Simon's personal CoC could (and probably would) be used as a weapon; and that in any case it would be ineffective.

So I've backed off again. I do not want to cause more harm than good. You highly recommend personal codes of conduct -- but does "personal" mean "private"? That is, are you recommending that I -- and perhaps others -- publish our personal codes, as a tool for reflection?

Simon

@tonymorris

Well, my team's Code of Conduct is not private. It is somewhat of a joke, but that's rather the point. Surely it cannot be used as a weapon? Maybe that's a little optimistic, but I'm willing to learn and burn if I am wrong.

My own personal "Code of Conduct" is not available publicly, so maybe there is a conflict there in my recommendation. I think you are right to tread carefully here.

@snoyberg
Collaborator

[Image: screen shot 2018-07-24 at 18 51 19]

@tomjaguarpaw
Member

@snoyberg Could you clarify the intention behind your message? Certainly it demonstrates that a certain person has acted in a vulgar way. Is there also a subtext of "... and therefore he shouldn't be trusted" or "... and therefore he shouldn't be listened to"?

@akhra

akhra commented Jul 25, 2018

@tomjaguarpaw I know it's not deliberate, but you're kind of muckraking right now. @snoyberg called out a lapse in temper, in a manner which was itself a lapse in temper; we don't need to push for that cycle to continue. Rather the opposite, by preference.

Call a spade a spade, but please don't go spade-baiting. We all stumble.

@tonymorris

It wasn't a lapse in temper. God I wish you guys could figure it out. That was a lament.

@tomjaguarpaw
Member

I'm not willing to presume that Michael's post was a lapse in temper. For all I know it is a reference to something I'm unaware of, carries a very clear meaning for those who are familiar with it, and moves the conversation along productively. That's why I asked for a clarification. On the other hand, if it was a lapse in temper then, well, we all have them from time to time. I'm sure Michael will delete the message and as mature individuals we will all forget about it and move on to more productive discussion.

@snoyberg
Collaborator

@tomjaguarpaw It definitely wasn't a lapse in temper. @simonpj is engaging in a good faith discussion right now on the general concept of a Code of Conduct (CoC), and Tony is raising in the discussion his own personal CoC. It's completely relevant to point out what seems to be included within that CoC. I'm not trying to imply anything like the potential interpretations you mentioned.

@tonymorris

I have no idea what my personal CoC has to do with responding appropriately and effectively to abuse.

@tomjaguarpaw
Member

@snoyberg Sorry for being dense but could you please explain further? The interpretations I mentioned were not supposed to put words in your mouth but as a genuine indication that I really don't get what you're trying to say.

@ElvishJerricco

I'm unsubscribing from this issue. This is getting useless. Commenting to point out I believe zero or negative progress is being made.

@tonymorris

^ Same. Cheerio everyone.

@lspitzner

I would like to thank @snoyberg for expressing so clearly that they consider some screenshot to be constructive input to this discussion. It reaffirms the previously mentioned suspicion that this discussion is one-sided, and I can only agree with the conclusion - this is not going forward.

@snoyberg
Collaborator

This will be my last clarification, if you want further @tomjaguarpaw, feel free to reach out privately.

I'll be honest: I thought the original comment (without comment) spoke for itself, and that my clarification was sufficient. Put bluntly: Simon is asking for feedback on sharing his own CoC, and he's getting that feedback from someone who is demonstrably vulgar and abusive in public discussions. Simon should know where his information is coming from, and the nature of community behavior engaged in by such proponents.

@vedksah

vedksah commented Nov 7, 2018

Hey everyone

We're working on a new code of conduct

https://www.snoyman.com/blog/2018/11/proposal-stack-coc

If you want to contribute just head over to

https://github.com/snoyberg/stack-coc

and join the discussion. Everyone's welcome to participate, this is an open process!

Let's make the first step towards a healthier community!

@gbaz
Collaborator Author

gbaz commented Nov 7, 2018

This ticket was to discuss the uncurated hackage proposal. Its use as a general discussion and update and lament thread, especially by a user who is widely recognized as an unstable and abusive troll (including on this thread), and who has been banned from the haskell subreddit more times than I can count, is not appropriate. And whatever the merits of a CoC, vedksah, who just on this thread used language that would be inappropriate under any conceivable CoC is the last person whose opinion matters on one.

Many people have already "voted with their feet" and unsubbed from this thread for this reason.

Until such point as there's something further to say on uncurated hackage itself, I'm going to lock the thread.

@haskell haskell locked as off-topic and limited conversation to collaborators Nov 7, 2018