
[new PEP] Use SPDX license expressions in Core package metadata #2

Closed
wants to merge 40 commits into from

Conversation

pombredanne
Owner

@pombredanne pombredanne commented Aug 16, 2019

@pombredanne
Owner Author

This was originally at python#1148 and is now closed

@ncoghlan ncoghlan left a comment

Thanks for putting this together! The basic concept seems sound to me, but I think we should tread very softly on the deprecation side of things, since it doesn't gain consumers much (since they still need to deal with old releases that don't have this new field), but imposes a cost on all publishers, even those that aren't publishing projects that get consumed by large organisations.

pep-9999.rst Outdated

The use of license-related classifiers in this field will be deprecated in the
future and its documentation has been updated accordingly. Tools are encouraged
to provide a warning when this field is used with license-related classifiers.


I don't like the idea of deprecating these fields, as SPDX is driven by the needs of large consumer organisations, and if they're not offering to pay project maintainers to update their licensing metadata (e.g. through a Tidelift subscription or a consulting contract), then it isn't reasonable to push new work onto publishers solely for the benefit of these organisations.

Instead, I'd prefer to see these sections say something along these lines:

License:

If the License-Expression field is present, publishing tools MUST NOT also populate the License field. However, for compatibility with existing publishing and installation processes, the License field SHOULD continue to be accepted if the License-Expression field is absent. Publishing tools MAY infer License-Expression from the provided License information if they are able to do so unambiguously.

Classifiers:

If the License-Expression field is present, publishing tools MUST NOT also provide any licensing related Classifiers entries. However, for compatibility with existing publishing and installation processes, licensing related Classifiers entries SHOULD continue to be accepted if the License-Expression field is absent. Publishing tools MAY infer License-Expression from the provided Classifiers entries if they are able to do so unambiguously.

However, no new licensing related classifiers will be added, with anyone requesting them being directed to use the License-Expression field instead.
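For illustration, a publishing tool could enforce the exclusivity rule above roughly as follows. This is only a sketch under assumptions: the metadata is modelled as a plain dict, the `Classifier` key and the `check_license_fields` name are hypothetical, and a leading `License ::` prefix is assumed to identify licensing related classifiers.

```python
# Sketch of the proposed rule: when License-Expression is present, neither the
# License field nor license classifiers may be populated. The dict layout and
# function name are assumptions for illustration, not an existing API.

LICENSE_CLASSIFIER_PREFIX = "License ::"


def check_license_fields(metadata):
    """Return a list of problems found in a core-metadata mapping."""
    problems = []
    if not metadata.get("License-Expression"):
        # Legacy metadata: License and classifiers continue to be accepted.
        return problems
    if metadata.get("License"):
        problems.append("License must not be set alongside License-Expression")
    license_classifiers = [
        classifier
        for classifier in metadata.get("Classifier", [])
        if classifier.startswith(LICENSE_CLASSIFIER_PREFIX)
    ]
    if license_classifiers:
        problems.append(
            "license classifiers must not be set alongside License-Expression"
        )
    return problems
```

Metadata that only carries the legacy `License` field or classifiers yields no problems, which matches the "SHOULD continue to be accepted" wording.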


Note that omitting the deprecations doesn't actually make things any more complicated for metadata consumers, since they have to deal with at least the 1,428,826 project releases already on PyPI anyway, and none of those will have the License-Expression field.

The important part of this PEP is providing a way for folks that already care to be unambiguous about their licensing, and to offer a low impact migration path if they want to send PRs to other open source projects that they would also like to see clarify their licensing.


I'll second @ncoghlan.

Let's not make existing packages incompatible with Metadata 2.2 and instead, add License-Expression as an opt-in unambiguous alternative to existing mechanisms. Having it be exclusive to License Classifiers and "License" metadata is a nice approach for that.

Owner Author

@ncoghlan you wrote:

I don't like the idea of deprecating these fields, as SPDX is driven by the needs of large consumer organisations,

and:

The important part of this PEP is providing a way for folks that already care to be unambiguous about their licensing, and to offer a low impact migration path if they want to send PRs to other open source projects that they would also like to see clarify their licensing.

Agreed.
Note that I think that using SPDX here is not driven by the needs of large consumer organisations exclusively. Small development teams, authors and FOSS supporters all benefit from improved clarity in licensing.

But to your point, yes, it could be a new burden, so the next push no longer uses a License-Expression field but instead repurposes the existing License field, which provides immediate compatibility with v2.1 without any change.

Beyond this I have integrated your comments in the latest push.


@pombredanne I don't generally agree that there are significant benefits to regular people to mandate SPDX. Additionally, I don't think it's worth it to repurpose the existing License field.

If people want to fill in this field, then adding a SPDX-License-Expression tag makes sense.

Though, I'm tempted to suggest that you make this a versioned tag (like SPDX-3.0-License-Expression) because SPDX does not remain consistent on license tags.

Owner Author

@Conan-Kudo you wrote in #2 (comment):

@pombredanne Between the times that the expression grammar was changed (+ -> WITH, / -> OR, etc.),

From what I can see there never was any such change: the expression grammar was introduced in SPDX 2.0 and there are no material changes in version 2.1. I extracted the texts from each version for the Expression spec part and posted the two versions there https://gist.github.com/pombredanne/49b2b8699d15403bec21030fe359c797/revisions

  • + is not replaced by WITH: both were always there
  • / was never a substitute for OR: you may be referring to the way this has been used in Rust Cargo manifests... but that's the Rust way and this was never specified in the SPDX specs.

the stupid case sensitivity issue (and vs AND)

Being a Pythonista I am with you there: everything should be in proper lower case and snake case! :D

That said that's minor since this is only a canonical form issue. In practice you can ignore case when parsing (e.g. being lenient on inputs) and output something with the proper case. The license-expression library does exactly this FWIW, so I do not see this as a problem.
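As a rough illustration of "lenient on input, canonical on output", the sketch below accepts the operator keywords in any case and emits them in upper case. It is not the code of the license-expression library; `normalize_operators` is a hypothetical helper, and license ids are passed through untouched.

```python
import re

# Operator keywords are matched case-insensitively and emitted in upper case.
_OPERATORS = {"and": "AND", "or": "OR", "with": "WITH"}

# A token is a parenthesis or any run of characters without spaces or parens.
_TOKEN = re.compile(r"\(|\)|[^\s()]+")


def normalize_operators(expression):
    """Return the expression with and/or/with upper-cased and spacing tidied."""
    tokens = _TOKEN.findall(expression)
    normalized = [_OPERATORS.get(token.lower(), token) for token in tokens]
    text = " ".join(normalized)
    # Remove the padding that the join added just inside parentheses.
    return text.replace("( ", "(").replace(" )", ")")
```

For example, `normalize_operators("MIT and (Apache-2.0 or BSD-3-Clause)")` gives `"MIT AND (Apache-2.0 OR BSD-3-Clause)"`.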

and the change of the GNU license tags (GPL-2.0 -> GPL-2.0-only, GPL-3.0+ -> GPL-3.0-or-later, etc.),

I am siding with you there too. And I voiced my concerns publicly when that change happened in the fall of 2017. I was against it, but I was not in the majority so I accepted the community consensus. You should have voiced your concerns in public then ... I would have mucho appreciated the support. In the end the SPDX community showed deference to rms and the request of the FSF to use these updated ids for their own licenses. Even though I was against it (especially since I was in the middle of a major licensing clarification work with the Linux kernel maintainers), when a license author (and someone as prominent in the community as rms himself) comes and kindly asks for changes to how their things are named, I think it is OK to make the change. If you still disagree, please bring your concern to the SPDX mailing list, as well as to rms and the FSF, but that ship has sailed now IMHO.

I've lost faith in the SPDX organization to keep this stable and reliable.

I do not recall that you joined the discussions at SPDX back then (but I have a crass memory, so forgive me if this is wrong), and that's really something that I would have appreciated as we are thinking alike on many of these topics and you seem to care about these.

That said, things are versioned and evolve in SPDX, they are not set in stone! The same way I expect software to evolve with bug fixes and new features. To the credit of the volunteers caring for the SPDX spec and license list, things are versioned and they are trying hard to keep backward compatibility AND ensure that new and retired license ids are never reused and have a clear mapping. I think that's quite OK in general. I am not sure what else I could

openSUSE, Rust, and the Linux kernel all implement SPDX license identifiers differently based on when they pulled the rules. And that's the most frustrating part of it all! Suddenly many things failed validation when they used to pass because the tools and data were updated to invalidate them.

I can understand your frustration, but the SPDX spec and ids list are versioned for a good reason: to cope with changes. You cannot blame tools that may not yet work with newer versions of the specs or that stop supporting previous versions. This is no different from code: there are at times some major changes and they may break compatibility. That said, tools can handle that alright: for instance the license-expression library that I maintain is perfectly happy with past, current and future versions of the SPDX license list as well as mixing all versions together (and FWIW with any list of license symbols you can feed it with) so this means that it is possible for a tool to deal with updates in an orderly way.

That said, if I dive in the specifics:

There is also a mapping table at https://docs.google.com/spreadsheets/d/14AdaJ6cmU0kvQ4ulq9pWpjdZL5tkR03exRSYJmPGdfs/pub which looks to me like it brings order by mapping several legacy openSUSE ids to a single SPDX id, such as for AGPL... and I see that as a good thing. Now you seem much closer than I am to openSUSE so you likely know better.

AGPL-3.0-only	Affero GPL
AGPL-3.0-only	AGPL-3.0
AGPL-3.0-only	AGPLv3
AGPL-3.0-or-later	AGPLv3+
AGPL-3.0-or-later	AGPL-3.0+
AGPL-3.0-or-later	SUSE-AGPL-3.0+
  • Rust is very clear on what they use and that sounds very clean and consistent to me.
    @wking @dwijnand If you could comment on the Rust/Cargo adventure with SPDX license expressions that would be awesome!

See https://doc.rust-lang.org/cargo/reference/manifest.html#package-metadata
And https://github.com/rust-lang/cargo/blob/fe0e5a48b75da2b405c8ce1ba2674e174ae11d5d/src/doc/src/reference/manifest.md#L254
And https://github.com/rust-lang/cargo/blame/fe0e5a48b75da2b405c8ce1ba2674e174ae11d5d/src/doc/src/reference/manifest.md#L254

# This is an SPDX 2.1 license expression for this package. Currently
# crates.io will validate the license provided against a whitelist of
# known license and exception identifiers from the SPDX license list
# 2.4. Parentheses are not currently supported.
#
# Multiple licenses can be separated with a `/`, although that usage
# is deprecated. Instead, use a license expression with AND and OR
# operators to get more explicit semantics.
license = "..."
  • For the Linux kernel, I have been on the front line as I helped Greg Kroah-Hartman, Thomas Gleixner and Kate Stewart to streamline the kernel licensing by using SPDX expressions in source code and the FSFE REUSE conventions. The bulk of the initial work took place in fall 2017 just when the FSF requested the GPL id change @ SPDX... this has been handled nicely and without many hiccups by pinning the version of the licenses being used. There are tons of discussions on lkml on this and many patches since thousands of files were changed, but tell me if you want some pointers on specifics.

That said you may be referring to things like this thread https://lists.opensuse.org/opensuse-factory/2018-02/threads3.html https://lists.opensuse.org/opensuse-factory/2018-02/msg00464.html

As I mentioned here, I was on the front line when that happened and I am with you on the issue. I explained here why this happened with rms and the FSF. And I was against it, but that was not the consensus. It would have been great for folks from openSUSE to chime in at the time.

I don't want that for Python, at all.

I highly respect your opinions there: since you care and want something different for Python, may I kindly suggest that you submit a PR on top of this PR/branch with your suggested updates or specific comments to evolve it?

Or if you think this PEP is completely off base and wrong, would you mind starting a concurrent PEP with an alternative proposal so that we can all discuss and review both proposals to work out something constructive?

@Conan-Kudo Conan-Kudo Sep 7, 2019

@pombredanne I think it's perfectly fine to allow people to put SPDX expressions in License, but adding a field that can be optionally used that must be SPDX compliant means that if that field is detected, software can guarantee they can process it that way.

We could be explicit in the packaging documentation about which exact version of the SPDX license list we support at any time. Would this work out for you?

I'm aware that the identifiers list is actually versioned. What I'm saying is that this PEP should specify a version and have a local copy of it, and updating the version of the identifiers should require a revision to this PEP.
EDIT: You did in fact ask me this, and the answer is yes, it would work for me. I'd prefer a local copy embedded because historically it's been a pain to request specific versions of the identifiers, and having a local copy avoids that problem. I want updating to new SPDX identifier versions to require PEP updates.

And for what it's worth, I have been subscribed to the main SPDX mailing list (I subscribed when I was more interested in migrating Fedora to SPDX identifiers as part of my Fedora Rust SIG work), and I voiced my concern about the change when it happened as well. That was the straw that broke the camel's back for me, as the discussion did not resolve the issue well for me. I would have probably been less annoyed if it was easy to directly request specific versions of the identifier list for machine parsing.

Owner Author

@Conan-Kudo thanks for your answer details... I was afk for a while and I am back.

You wrote in #2 (comment):

I want updating to new SPDX identifier versions to require PEP updates.

That's reasonable (this is more or less what we ended up doing for the Linux kernel: we store the list of valid identifiers in the kernel doc)... but at the same time, this means an update to a foundational document (the metadata doc/PEP) which is fairly significant and not to be done lightly.

To me the key thing would be how often would this possibly happen in the future? The rate at which new FOSS and related licenses evolve is slow enough. Here are some anecdotes:

  1. SPDX adds about 20 new licenses per year
  2. In ScanCode this is more like 1 or 2 per week.

That said, the newly added licenses are mostly either old, seldom used licenses that were not yet "discovered" or new licenses that are not much used as of now. So in 99% of the cases the new licenses could be qualified as exotic.

Therefore, I think we could freeze a version of the SPDX license list in the PEP alright and the need for an update should be rather rare (maybe once in three years or so).

@Steap @ncoghlan @pfmoore @cjerdonek what would you think about this? This would mean being strict in this section: https://github.com/pombredanne/spdx-pypi-pep/pull/2/files#diff-7a25ca1769914c1141cb5c63dc781f32R223 and specify that we use a defined version of the list and that adopting future version would require an update to the metadata doc and version.

  • The positive: there is no ambiguity about which licenses ids are supported
  • The negative: adopting a new version of the license list every other year would require a new PEP, which is a disruption but that would be every couple years or so only.
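To make the "frozen list" idea concrete, here is a minimal sketch of validation against a pinned license list. The `PINNED_LICENSE_IDS` set is only a tiny illustrative subset, and the names `PINNED_LIST_VERSION` and `unknown_license_ids` are hypothetical; a real tool would embed the complete list for the version named in the metadata spec.

```python
# Version of the SPDX license list that the (hypothetical) spec would pin.
PINNED_LIST_VERSION = "3.6"

# Illustrative subset only: a real tool would ship the full pinned list.
PINNED_LICENSE_IDS = frozenset({
    "Apache-2.0",
    "BSD-3-Clause",
    "GPL-2.0-only",
    "GPL-2.0-or-later",
    "MIT",
})


def unknown_license_ids(license_ids):
    """Return, sorted, the ids that are not in the pinned license list."""
    return sorted(set(license_ids) - PINNED_LICENSE_IDS)
```

With this approach, accepting a newer license id means shipping a new pinned set, which is exactly the spec update discussed above.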

Owner Author

@Conan-Kudo you wrote in #2 (comment)

I think it's perfectly fine to allow people to put SPDX expressions in License, but adding a field that can be optionally used that must be SPDX compliant means that if that field is detected, software can guarantee they can process it that way.

My personal inclination, as documented in the current version, is to avoid field inflation and reuse the License field. There is no ambiguity at all about whether it contains a parsable SPDX license expression or not. Since the field is in use and a warning would be issued, it provides the proper gentle nagging that will help authors evolve towards more accurate license documentation. A field that's new and optional is likely to have a lower impact and create a bigger disruption:

We have today two fields used for license (and this is confusing). And we would go to three fields all optional if we add a new one, a likely source of more confusion for authors IMHO.

With that said, if there is a consensus to use a separate field, I will update the draft to use that instead.

@Steap @ncoghlan @pfmoore @cjerdonek @pradyunsg what's your last take on this topic?

This and the freezing of the list of licenses discussed in #2 (comment) are IMHO the last two objections/concerns to address and resolve before moving this PEP to an official draft.

@pradyunsg pradyunsg Oct 2, 2019

Let's do these discussions in dedicated issues for each of these points? It'd be weird to discuss these inline on a PR.

Also requested the same at https://discuss.python.org/t/improving-license-clarity-with-better-package-metadata/2154/64.

Owner Author

@pombredanne pombredanne left a comment

@ncoghlan @pradyunsg Thank you ++ for your review.
I have pushed an updated version


pep-9999.rst Outdated
* Updated the documentation of two fields: ``License`` and ``Classifiers``


License Expression Library Reference implementation


opinion: It is important to separate the discussion of the standard from the discussion about the implementation.

This section (and the discussion of a library being developed for licensing) should probably be dropped from the PEP -- it's not relevant to the metadata update and going into too much detail of how-to-do-this, unnecessarily "fixing" what we would be doing here. eg: the library could well be implemented independently and not be a part of the pypa organization here.

Owner Author

@pombredanne pombredanne Aug 19, 2019

@pradyunsg you wrote:

opinion: It is important to separate the discussion of the standard from the discussion about the implementation.

This section (and the discussion of a library being developed for licensing) should probably be dropped from the PEP -- it's not relevant to the metadata update and going into too much detail of how-to-do-this, unnecessarily "fixing" what we would be doing here. eg: the library could well be implemented independently and not be a part of the pypa organization here.

This makes sense, though I added the section about a reference implementation based on a comment from @pfmoore

FWIW, the library already exists and is not part of PyPA.

Owner Author

I went back and forth and ended up keeping this as a reworded reference implementation section for now. That said, I am quite happy to remove that section if there is a consensus that it does not belong there.

@pombredanne
Owner Author

Note to self: distutils docs define the license field as a "short string" of no more than 200 characters, per https://github.com/python/cpython/blob/dae1229729920e3aa2be015453b7f702dff9b375/Doc/distutils/setupscript.rst#L462

'short string'
A single line of text, not more than 200 characters.

and also this interesting bit referencing the English spelling of licence.
See https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Lib/distutils/dist.py#L254

The license field is a text indicating the license covering the package where the license is not a selection from the "License" Trove classifiers. See the Classifier field. Notice that there's a licence distribution option which is deprecated but still acts as an alias for license.

There is also the "UNKNOWN" business... See https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Lib/distutils/dist.py#L1189

I am not sure "unknown" is still a thing with newer packaging tools though it likely is still used at least by setuptools based on https://github.com/pypa/setuptools/blob/375138c7a477278ee7bcc5e4d78cbe243ef5c008/setuptools/monkey.py#L104
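The two constraints noted above (the 200 character "short string" limit and the "UNKNOWN" placeholder) could be handled along these lines. This is a sketch only: the helper names are hypothetical, and treating "UNKNOWN" as "no license provided" is an assumption about how a tool might choose to interpret it.

```python
# Limit taken from the distutils "short string" definition quoted above.
MAX_SHORT_STRING = 200

# Placeholder that older tooling writes when no license was given (assumed
# here to mean "missing").
UNKNOWN_PLACEHOLDER = "UNKNOWN"


def clean_license_field(value):
    """Return the stripped license text, or None if missing or UNKNOWN."""
    if value is None:
        return None
    value = value.strip()
    if not value or value == UNKNOWN_PLACEHOLDER:
        return None
    return value


def is_short_string(value):
    """Check the distutils rule: a single line of at most 200 characters."""
    return "\n" not in value and len(value) <= MAX_SHORT_STRING
```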

@pradyunsg pradyunsg left a comment

Some typos I spotted on a skim. :)

@ncoghlan ncoghlan left a comment

I like the direction this has taken!

pep-9999.rst Outdated
- making the existing `License` and new `License-File` fields mandatory
including stricter enforcement in tools and Pypi publishing.

- restricting the upload of packages to the public Pypi index to the packaes

packaes -> packages

Worth noting: SPDX license list has metadata for whether the license is approved by OSI and FSF.

Owner Author

Fixed in 86eb7a8

pep-9999.rst Outdated
- SPDX license ids are becoming a de-facto way to reference common licenses
everywhere, whether or not a license expression syntax is used. But they often
need to be supplemented with extra license ids or conventions to accept extra
or generic licenses such as "Proprietary" or "Public domain" not tracked by

  • Public domain should be argued based on https://wiki.spdx.org/view/Legal_Team/Decisions/Dealing_with_Public_Domain_within_SPDX_Files

  • I haven't ever seen good rationale why Proprietary needs to be there. It feels that in medium-size companies legal department tells tech department things, but nothing concrete.

  • There's also Add “NONE” to the license expression syntax spdx/spdx-spec#49, i.e. you could say NONE to mean there is no license (and again, no lawyer has ever explained to me how All rights reserved differs from license = NONE in this technical context)

  • I don't see how LicenseRef-PublicDomain or LicenseRef-Proprietary are any worse. They are however valid SPDX license expressions, so generic tools/libraries will understand them.

  • Also, if there's some actual proprietary license, not "All rights reserved", then LicenseRef-OurCompanyLicense is valid license expression and more correct and descriptive.


LicenseRef-OurCompanyLicense is valid license expression

I spent a few minutes looking at the SPDX site and couldn't find any confirmation of this (not saying you're incorrect, but rather that the information is hard to find) and there's no way I'd have known to even look for an expression like this if all I knew was "I need to record that the package I'm publishing is licensed under our company license".

My concern here is that if it's too hard for people to find a reasonable thing to put in this metadata, they'll end up not bothering, and just supplying a LICENSE.txt file, or if the field is made mandatory we'll get endless support requests on the packaging tracker.


Look at appendix iv in https://spdx.org/spdx-specification-21-web-version

Some examples:
LicenseRef-23
LicenseRef-MIT-Style-1
DocumentRef-spdx-tool-1.2:LicenseRef-MIT-Style-2

Unfortunately, indeed, https://spdx.org/ids-how doesn't mention the LicenseRef and DocumentRef use cases directly. That's something to be raised with SPDX people.
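For readers who have not seen the syntax, the shape of these references can be checked with a short pattern. This is a sketch based on the Appendix IV grammar, where an idstring is made of letters, digits, "-" and "."; the `is_license_ref` name is hypothetical, and the check is purely syntactic (it says nothing about whether the reference resolves to an actual license text).

```python
import re

# idstring per the SPDX 2.1 expression grammar: letters, digits, "-" and ".".
_IDSTRING = r"[A-Za-z0-9.\-]+"

# Optional "DocumentRef-<id>:" prefix followed by "LicenseRef-<id>".
LICENSE_REF = re.compile(
    rf"(?:DocumentRef-{_IDSTRING}:)?LicenseRef-{_IDSTRING}"
)


def is_license_ref(token):
    """Return True if the whole token has the shape of a license reference."""
    return LICENSE_REF.fullmatch(token) is not None
```

The three examples listed above all match, while a bare word such as `Proprietary` does not.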


@pfmoore if a company went through the cost of writing its own proprietary license, it's not hard to figure out what SPDX expression they should use for it.

I repeat: SPDX is a universal standard, having a Python-specific deviation won't really help anyone.

Owner Author

@phadej FWIW, I happen to be one of the SPDX co-founders though I speak here exclusively with my Python hat on and not on behalf of SPDX.

LicenseRef-XXX license ids are only valid within the context of a full SPDX document and not outside in a solo expression like we are considering here. So using LicenseRef-Proprietary is no more valid than using Proprietary in this context ... but the latter is much simpler to write down, remember and specify.

There are some ongoing discussions at SPDX to define license id "namespaces" to cope with this but this is not fully there yet. In the meantime, there are not many other ways than to use expressions with extra non-SPDX-listed ids (which is what npm, SUSE and ClearlyDefined do for now).

@pf_moore you wrote:

My concern here is that if it's too hard for people to find a reasonable thing to put in this metadata, they'll end up not bothering, and just supplying a LICENSE.txt file, or if the field is made mandatory we'll get endless support requests on the packaging tracker.

exactly: which is why I am not making anything mandatory and I find that using a generic Proprietary for anything off SPDX is simpler.

pep-9999.rst Outdated
'''''''''''''''''''''''''

A `License Expression` is a string using the SPDX license expression syntax as
documented in the SPDX specification [#spdx]_ using either Version 2.1

I'd explicitly discourage use of + combinator. Recent license lists added GPL-2.0-or-later and GPL-2.0-only identifiers, and deprecated GPL-2.0. I don't remember proper rationale, but that change removed most needs for +.


I stand corrected: https://spdx.org/ids-how it's an issue with FSF license only.

Owner Author

The + is mostly abandoned at this stage. The main area where it showed up was the GNU licenses and, as you rightly pointed out, SPDX changed these to -only and -or-later based on requests from the FSF.

I don't remember proper rationale, but that change removed most needs for +.

That was a request from rms and the FSF. See https://www.gnu.org/licenses/identify-licenses-clearly.html and then https://www.fsf.org/blogs/rms/rms-article-for-claritys-sake-please-dont-say-licensed-under-gnu-gpl-2
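As a small illustration of that id change, a tool could translate the older GNU ids to their explicit replacements before validating. The table below only covers the GPL ids mentioned in this thread, and `modernize_license_id` is a hypothetical helper, not part of any existing library.

```python
# Deprecated GNU ids and their explicit replacements (partial table, limited
# to the GPL ids discussed in this thread).
DEPRECATED_GNU_IDS = {
    "GPL-2.0": "GPL-2.0-only",
    "GPL-2.0+": "GPL-2.0-or-later",
    "GPL-3.0": "GPL-3.0-only",
    "GPL-3.0+": "GPL-3.0-or-later",
}


def modernize_license_id(license_id):
    """Return the -only / -or-later form of a deprecated GNU id, if known."""
    return DEPRECATED_GNU_IDS.get(license_id, license_id)
```

Ids that are not in the table, such as `MIT`, are returned unchanged.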

pep-9999.rst Outdated
- any SPDX-listed license short-form identifiers that are published in the
SPDX License List [#spdxlist]_ using either Version 3.6 of this list or any
later compatible version. Note that the SPDX working group never removes any
license identifiers: instead they may only one as obsolete.

When we (Haskell/Hackage) took SPDX into use for the license field, we didn't include any identifiers already deprecated in the first version we used (IIRC license list 3.0).

Fortunately for us, suffix-less GPL-2.0 was already deprecated. So one has to explicitly write GPL-2.0-only or GPL-2.0-or-later.

pep-9999.rst Outdated
with type, file an text keys. This is mandatory unless there is a LICENSE or
LICENCE fie provided.

- Haskell Cabal [#cabal]_ specifies a single string with a list of accepted

This is incorrect. The license field is documented at https://cabal.readthedocs.io/en/latest/developing-packages.html#pkg-field-license

  • it's not a list, it's a proper SPDX License Expression (with an additional NONE)

Cabal used to have its own short list of licenses, but we moved to SPDX because it gives us

  • more licenses
  • expressions to combine them

Owner Author

Thank you ++ for your review!
I fixed this in 3290f56
I also added your name to the Acknowledgement section

pep-9999.rst Outdated
When processing the `License` field to determine if it contains a valid license
expression, tools:

- MUST ignore the case of the `License` field.

What's the rationale to this?

I mean, the choice is irrelevant as there are no ambiguities.


Anecdotally all mis-cased cases I'm aware of were actual mistakes, so IMO one COULD report a warning if non-canonical casing is used.


spdx/spdx-spec#63 is a related issue.

Owner Author

What's the rationale to this?

Accept anything even if the case differs, since case really does not matter here.

Anecdotally all mis-cased cases I'm aware of were actual mistakes,

Indeed!

so IMO one COULD report a warning if non-canonical casing is used.

Good point: reporting a warning is a good way to handle this.

Owner Author

@phadej this has been updated in d584c48 to report a warning!
Thank you ++
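
The behaviour agreed on in this thread (match identifiers regardless of case, but warn when the casing is not canonical) could be sketched as follows. `KNOWN_IDS` is a tiny stand-in for the SPDX License List and `canonical_id` is a hypothetical helper name; neither comes from the PEP.

```python
import warnings

# Tiny illustrative subset; a real tool would load the full SPDX License List.
KNOWN_IDS = ("MIT", "Apache-2.0", "GPL-2.0-only", "BSD-3-Clause")
_BY_LOWER = {known.lower(): known for known in KNOWN_IDS}


def canonical_id(identifier: str) -> str:
    """Look up an identifier ignoring case and return its canonical form."""
    try:
        canonical = _BY_LOWER[identifier.lower()]
    except KeyError:
        raise ValueError(f"unknown license identifier: {identifier!r}") from None
    if canonical != identifier:
        # Accepted, but flagged: mis-cased identifiers are usually mistakes.
        warnings.warn(
            f"non-canonical casing {identifier!r}; use {canonical!r}",
            UserWarning,
            stacklevel=2,
        )
    return canonical
```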

@pombredanne
Owner Author

@phadej @ncoghlan I pushed a change and we now use LicenseRef-Public-Domain and LicenseRef-Proprietary as you both suggested.

pep-9999.rst Outdated

Several package authors have expressed difficulty and/or frustrations with the
possibilities to express licensing in package metadata. This also applies to
Liux distribution packagers. This has triggered several license-related

Collaborator

s_Liux_GNU/Linux_

Could we also mention *BSD? And since this is also an issue with Macports, maybe we could find a more "generic" term. How about "package maintainers in various operating systems"?

no package uses them in PyPI as of the writing of this PEP.

The remainder of the `Classifiers` using a `License::` prefix map to a simple
single license expression using the corresponding SPDX license identifiers.

Collaborator

I wrote https://framagit.org/upt/upt-pypi/blob/master/upt_pypi/licenses.py#L15 . Should we provide a ready-to-use mapping in an annex?
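
As a rough illustration of what such a ready-to-use mapping might look like, here is a minimal sketch. It is not the content of the linked file; the handful of entries and the `classifiers_to_spdx` helper are assumptions chosen for illustration, and ambiguous classifiers are deliberately reported rather than guessed.

```python
# Illustrative subset: classifiers that map to exactly one SPDX identifier.
CLASSIFIER_TO_SPDX = {
    "License :: OSI Approved :: MIT License": "MIT",
    "License :: OSI Approved :: ISC License (ISCL)": "ISC",
    "License :: OSI Approved :: Mozilla Public License 2.0 (MPL 2.0)": "MPL-2.0",
    "License :: OSI Approved :: GNU General Public License v2 (GPLv2)": "GPL-2.0-only",
    "License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)": "GPL-3.0-or-later",
}


def classifiers_to_spdx(classifiers):
    """Split license classifiers into mapped SPDX ids and unmapped classifiers."""
    mapped, unmapped = [], []
    for classifier in classifiers:
        if not classifier.startswith("License ::"):
            continue  # not a license classifier
        spdx_id = CLASSIFIER_TO_SPDX.get(classifier)
        if spdx_id is None:
            unmapped.append(classifier)  # e.g. the generic "BSD License"
        else:
            mapped.append(spdx_id)
    return mapped, unmapped
```

A classifier such as "License :: OSI Approved :: BSD License" does not identify a single SPDX license, so it ends up in the unmapped list for a human to resolve.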

Owner Author

@Steap that would be great. Would you mind updating the draft directly and adding yourself as a co-author? Your call... but that would be great!

:::::::::::::::::::::::::::

The License-File is a string that is a package-root relative path to a license
file. The license file content __must__ be UTF-8-encoded text.

Counter-proposal: it is a simple filename for a license file in the dist-info directory.

I see the license as part of the metadata, not the program code, so it seems better to have all the info inside dist-info.

(Your program wants to access the license file to display it? Use the functions in importlib.metadata.)
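
A minimal sketch of that idea, assuming license files have been copied somewhere under the distribution's dist-info directory. The filename heuristic (`LICENSE_STEMS`, `looks_like_license`) and the helper `read_license_texts` are hypothetical; only `importlib.metadata.distribution`, `Distribution.files` and `PackagePath.read_text` are standard-library calls.

```python
from importlib import metadata

# Assumed naming heuristic; the PEP under review would make this explicit.
LICENSE_STEMS = {"license", "licence", "copying", "notice"}


def looks_like_license(filename: str) -> bool:
    """Guess from the file name whether this is a license file."""
    return filename.split(".", 1)[0].lower() in LICENSE_STEMS


def read_license_texts(dist_name: str) -> dict:
    """Return {path: text} for license-like files inside the dist-info dir."""
    dist = metadata.distribution(dist_name)  # raises PackageNotFoundError
    texts = {}
    for path in dist.files or []:  # files is None when RECORD is missing
        in_dist_info = any(part.endswith(".dist-info") for part in path.parts)
        if in_dist_info and looks_like_license(path.name):
            texts[str(path)] = path.read_text(encoding="utf-8")
    return texts
```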


On the mailing list, it was suggested to reuse the rules for RECORD. With that, it would be a path relative to the site-packages directory, which makes it easy to point to project-0.42.dist-info/license.txt.

Owner Author
@pombredanne Jul 17, 2020

@merwok that's exactly what should be written here. You are entirely correct, and this is actually a bug that I am about to fix shortly:

  1. wheel accepts root-relative paths in license_files. For instance:
[metadata]
license_files=
 LICENSE
 foo/bar/NOTICE
  2. this will produce a dist-info with these two files:
LICENSE
NOTICE

So the behaviour is what you are advocating for. And there is one oddity you made me discover in wheel!

When we have this setup.cfg:

[metadata]
license_files=
 LICENSE
 etc/LICENSE

the built wheel will contain only one LICENSE file, with the content of etc/LICENSE, i.e. the last reference to a filename wins.
I think this is reasonable behaviour and it could be mentioned in the PEP for reference.
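
The "last reference wins" behaviour described above can be modelled in a few lines. This is only a sketch of the observed outcome, not wheel's actual code; `flatten_license_files` is a hypothetical name.

```python
from pathlib import PurePosixPath


def flatten_license_files(license_files):
    """Map each copied file name to the source path that ends up providing it."""
    copied = {}
    for source in license_files:
        # Only the base name is kept in dist-info, so a later entry with the
        # same base name silently replaces an earlier one.
        copied[PurePosixPath(source).name] = source
    return copied
```

With `["LICENSE", "etc/LICENSE"]` the result has a single `LICENSE` key whose source is `etc/LICENSE`, matching the oddity reported here.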

Owner Author
@pombredanne Jul 17, 2020

@merwok this also highlights another issue. With this setup.cfg:

[metadata]
license_files=
 LICENSE
 license

a wheel will have both:

 LICENSE
 license

That's on a POSIX filesystem that's case-sensitive for paths.
But if that wheel were installed on a case-insensitive FS, such as on Windows and more recently macOS APFS, one of the two files may get overwritten, which is IMHO not acceptable for METADATA. IMHO we need to specify that each License-File entry must be unique ignoring case. That will be for tools to honor (and for wheel to be fixed accordingly).

@agronholm and @pfmoore what's your take on this?
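
The proposed rule (each License-File entry must be unique ignoring case) could be checked with something like the following sketch. `case_insensitive_collisions` is a hypothetical helper, and using `str.casefold` as the comparison key is an assumption about how "ignoring case" would be defined.

```python
from collections import defaultdict


def case_insensitive_collisions(entries):
    """Return groups of entries that differ only by case."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry.casefold()].append(entry)
    # Any group with more than one member would clash on a
    # case-insensitive filesystem.
    return [names for names in groups.values() if len(names) > 1]
```

A build tool could refuse to produce a wheel whenever this returns a non-empty list.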


Note that "the paths of license files in the source tree" is different information from "the paths of the license files copied to the dist-info directory". Hence, while setup.cfg has license_files, that is not the same information as what will be stored in the dist-info metadata.


There is one dist-info file that’s not defined by PEP 376: entry_points.txt
https://packaging.python.org/specifications/entry-points/#file-format

It is not referenced from METADATA but is present in RECORD, so maybe it’s precedent enough!
(the naming doesn’t use the all-caps style because the spec was retrofitted from a setuptools invention)


Good point, it works as a precedent for shipping license files - but as you say, it doesn't indicate how we reference other files from metadata. Also (and this is something else the proposal here needs to consider) what if someone tries to ship a license file called entry-points.txt? Yes, I know it's a silly thing to do, but standards need to cover edge cases...

I think if we're going to standardise shipping license files (as opposed to the original scope of this PEP which was just about specifying what license was in use), we probably need to reserve a namespace in .dist-info - say that all licenses must go under .dist-info/licenses, or something.

On a procedural note, by the way, this discussion is getting too complex to be handled just on the tracker, it should be part of the main discussion thread. @pombredanne could you summarise the discussion so far, and post that summary to the Discourse thread to allow others to comment? Thanks.
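
To make the reserved-namespace idea concrete, here is a minimal sketch that places every license file under a `licenses` subdirectory of dist-info, so a file named like METADATA or RECORD cannot clash with the real metadata files. The directory name follows the suggestion above; `license_destination` and the choice to keep the source-relative path are assumptions for illustration.

```python
from pathlib import PurePosixPath


def license_destination(dist_info_dir: str, source: str) -> str:
    """Compute where a license file would be stored inside dist-info."""
    relative = PurePosixPath(source)
    if relative.is_absolute() or ".." in relative.parts:
        raise ValueError(f"license path must stay inside the project: {source!r}")
    # Keeping the relative path also avoids base-name collisions such as
    # LICENSE vs etc/LICENSE.
    return str(PurePosixPath(dist_info_dir) / "licenses" / relative)
```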

Owner Author

@pfmoore Let me summarize all the discussion over the week-end and start a new discourse topic.

what if someone tries to ship a license file called entry-points.txt

FWIW, you can do that alright, but it is even worse than that: I can overwrite dist-info/METADATA by listing an arbitrary file named METADATA in the license_files section of setup.cfg :|
This is clearly a wheel bug. I will tackle it as such, and it is not a topic for this PEP.

What is a topic for this PEP is that License-File, as currently specified, is definitely NOT a solution.

@pradyunsg Jul 17, 2020

Counter-proposal: it is a simple filename for a license file in the dist-info directory.

I actually think the PEP's current approach is much better than including the license file like this.

A case that comes to mind is a Sphinx theme I'm working on, which is gonna be under the MIT license and vendors material-icons with its own different license in the same directory (copyrighted to Google). Referencing the specific files in the distribution is much better IMO, since I would like my metadata to not indicate that the entire project is copyrighted by Google. :)

I think what's in the PEP right now is a much more capable mechanism to handle such instances.


It's not that simple. Some licences are applied by using a comment header and attaching multiple files (e.g. LGPL needs a file with full GPL text and a file with full LGPL text, some other licenses need LICENSE and NOTICE). While merging the text to one file is technically possible, it's non-standard.

@pradyunsg left a comment

I prefer Rejected Idea 1 over what the PEP currently proposes, but other than that, I think this looks basically ready to go into the PEPs repository. :)

pradyunsg commented Sep 7, 2020

@pombredanne Nudge. Consider going ahead and submitting this PEP (see PEP 1 for the details like sponsorship). We can definitely iterate on this as it moves forward even after the PEP is submitted, but it'll be nice to have a PEP number to link to from discussions about licensing in Python packages. :-)

@pradyunsg

You might also want to update the base branch here. :)

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pypa/packaging-problems#41

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Reported-by: Aliaksei Urbanski @Jamim
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne and others added 24 commits September 27, 2020 18:55
- Refactor intro with new and improved abstract, scope, non-scope,
motivation and rationale sections
- Add new Backwards Compatibility, Security and How to Teach sections
- Move Reference Implementation out of appendix as its own section
- Add new Rejected ideas section
- Add new License Expression example using setuptools in Appendix

Reported-by: Chris Jerdonek @cjerdonek
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Reported-By: Pradyun Gedam <pradyunsg@gmail.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

Co-Authored-By: Pradyun Gedam <pradyunsg@gmail.com>
Reported-By: Pradyun Gedam <pradyunsg@gmail.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

Co-Authored-By: Pradyun Gedam <pradyunsg@gmail.com>
Reported-By: Pradyun Gedam <pradyunsg@gmail.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

Co-Authored-By: Pradyun Gedam <pradyunsg@gmail.com>
Reported-By: Pradyun Gedam <pradyunsg@gmail.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

Co-Authored-By: Pradyun Gedam <pradyunsg@gmail.com>
Reported-by: Nick Coghlan @ncoghlan
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Reported-by: Nick Coghlan @ncoghlan
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
The case does not matter, but there is a canonical case: if the case
is not the standard canonical case, tools should issue a warning.

Reported-by: Oleg Grenrus @phadej
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Reported-by: Oleg Grenrus @phadej
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Cabal uses both expressions and license files as proposed in this PEP

Reported-by: Oleg Grenrus @phadej
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Reported-by: Oleg Grenrus @phadej
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
This helps ensure that the expression is fully parseable by a
conforming license expression processor

Reported-by: Oleg Grenrus @phadej
Reported-by: Nick Coghlan @ncoghlan
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Reported-by: Nick Coghlan @ncoghlan
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Reported-by: Pradyun Gedam @pradyunsg
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>

Co-authored-by: Pradyun Gedam <pradyunsg@gmail.com>
Use latest SPDX spec 2.2 and SPDX license list 3.10

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@pombredanne pombredanne changed the title [WIP: new PEP] Use SPDX license expressions in Core package metadata [new PEP] Use SPDX license expressions in Core package metadata Sep 27, 2020
Reported-by: Miro Hrončok @hroncok
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@pombredanne
Owner Author

The belated PEP PR has at last been submitted as python#1625!
I will also follow up on the python-dev ML.
I am closing this PR now (but will keep it around to preserve the discussion) and you can track the formal PEP PR if you like.
Thank you for your patience!

9 participants