SPDX and Non SPDX License Support #10479
Thanks for taking the time to put this together! The team and community have spent a significant amount of time considering the various trade-offs involved in how the I believe the current format is the best, or least worst, tradeoff between encouraging people to use a standard license declaration while allowing them to do something more complex or customized when necessary. As such, I'm not going to reopen this discussion, and with the exception of landing changes that make it easier to combine SPDX license identifiers with |
Pity that one such move is stopping you from another - which can be nicely sold as "an improvement on the first, moving forward using new insights obtained through additional user feedback" as that first change is still very recent and both address the same topic. Anyhow, this is your call, your decision, not ours to make, and you have made up your mind clearly. Thank you for all the hard work, both in the past and in the future, on Now, life goes on. Salute, |
@othiym23 @GerHobbelt Honestly the level of collaboration with the community on this issue was not great. Anyone reviewing the dialog will see that NPM's response to the issue (speaking of the people from NPM that participated) was greeted with "what is wrong with what we have". This flavour of dialog is not collaboration, particularly when the conversation is shut down. It is not responsible. Atm you have a captive audience and seem to be emboldened by this fact, which is frustrating. Perhaps changes in the manifest should have been first vetted by the Technical Steering Committee before forcing such changes upon our software that diminish the utility of the metadata without adequately considering consequences or alternatives. I will file this as an issue for the committee. Evolving to something that ensures that license metadata is available to developers and search tools from the manifest is in everyone's interest (regardless of the license applied to the software). |
npm doesn't have a Technical Steering Committee, and isn't a part of the Node.js project (aside from being distributed in its installers), nor do we presently participate in Node's TSC. When it comes to npm CLI product decisions, the buck does stop with me, for now, but the team tends to make product decisions by consensus, and we both value and accommodate community input wherever possible. In this case, in fact, the to move to SPDX license IDs was suggested (and the code written) by a member of the community, and both
We're in complete agreement here. The CLI has gotten a bit ahead of the web site here (the web team has been busy with other things), but one of these days this patch will land, and there have already been discussions about how to surface the license text on the web site as well. |
i'm really sad about this breaking change that disables my choice, breaks my code from several years, discriminates against licenses not favoured by spdx and is not open for alternative license repositories. please give me at least the option to disable spdx-checking in |
How does it break your code?
Why isn't it acceptable to have I can't speak entirely to the motivations of @kemitchell (hi!) in choosing SPDX over other license vocabularies, but it is an enterprise, industry standard with a decent amount of support behind it. I'm not bound to it specifically, nor do I think npm is. It might be possible to create a superset of acceptable licenses for use as IDs, as long as they can be mapped to URLs that link to some kind of normative resource for the license. What I can say is that I'm unwilling to inflict more work on the package maintainers who've already moved to the new syntax, and feel pretty strongly that any solution that results in breaking changes for packages that are currently passing validation is an unacceptable level of churn, and that there isn't enough broken in the current validation to warrant the scope of changes proposed both on this issue and in #8918. I do believe that the current solution is a significant improvement over the freeform situation that was in place before; I've had to deal with the technical side of license conformance and validation before, and want to ensure that the time spent by software engineers writing license checkers and validators is as small as possible. The current behavior strikes a balance between encouraging good behavior and preventing people from getting stuff done – it has pretty good coverage of licenses (based on a standard), and at worst it prints a warning, which can be noisy, but isn't an error. |
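To make the "at worst it prints a warning" behavior concrete, here is a minimal sketch of a check that warns rather than fails. It is not npm's actual validator: the function name `checkLicense` and the tiny `KNOWN_IDS` list are assumptions for illustration, and compound SPDX expressions are deliberately not parsed.

```javascript
// Illustrative subset only; the real SPDX license list is far longer.
const KNOWN_IDS = new Set(['MIT', 'ISC', 'Apache-2.0', 'BSD-2-Clause', 'BSD-3-Clause']);

// Hypothetical helper: collects warnings instead of throwing, so a bad
// license string never blocks an install or publish.
function checkLicense(value) {
  const warnings = [];
  if (typeof value !== 'string' || value.trim() === '') {
    warnings.push('license should be a non-empty string');
    return { ok: false, warnings };
  }
  const text = value.trim();
  // Case 1: a single known identifier.
  if (KNOWN_IDS.has(text)) {
    return { ok: true, kind: 'spdx', warnings };
  }
  // Case 2: the "SEE LICENSE IN <file>" escape hatch discussed in this thread.
  const ref = /^SEE LICENSE IN (.+)$/.exec(text);
  if (ref) {
    return { ok: true, kind: 'file', file: ref[1], warnings };
  }
  // Anything else produces a warning, not an error.
  warnings.push(`license "${text}" is not a recognized SPDX identifier`);
  return { ok: false, warnings };
}

// Usage: print any warnings to stderr and carry on.
for (const w of checkLicense('My Custom License').warnings) {
  console.error('WARN ' + w);
}
```

The point of returning a result object rather than throwing is that callers decide how noisy to be, which is the trade-off being debated here.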
I have posted the issue for the Node Technical Committee to examine. The way this has been handled feels unilateral. Truly it should not be this way. This affects every node developer since we have no choice but to prepare manifests and use NPM that is bundled with node. The current solution degrades search and discovery and adds more work for someone to investigate a non-SPDX license. I see that StrongLoop's solution was not to include the second license (their proprietary license) in their dual license packages – to exclude it from their manifests. As you can see, even from this limited example, this is not working. It is not helpful to defend this when it is possible to move to a better approach. Alternatives are available to ensure the inclusion of license types in an agnostic way while still respecting your desire to validate against SPDX. We are both well intentioned with respect to the issue but I feel strongly that this has not been adequately resolved. I have referenced this for the Node Technical Committee here: |
@othiym23 The solution above impacts those that have had to move to SEE LICENSE IN to revert back to including the license of choice. Everything else is backwards compatible. I am certain there will be more satisfaction with a balanced solution. You could best determine whether or when you wish to enforce SPDX() for validation with a warning. I am recommending you make developers aware over some time before introducing noise with validation warnings. I will add that changes to NPM's website will not satisfy as a solution. The package.json goes beyond NPM and includes discovery in other tooling including private package repositories, search tools and sites. The package manifest is much more than it was five years ago and is used for much more than package fetching itself. |
Hi @othiym23. Special greetings also to @scriptjs, @GerHobbelt, and @yetzt. Nice to meet more folks who care about this! There's been a lot written here, which has been the trend for issues on this topic, since the very beginning. I will also follow nodejs/node#3949. As far as I know, I was the first to PR validation of SPDX license IDs, in #8179 and related PRs linked there. I've since become affiliated with npm, but wasn't at the time. What I do with and for CLI remains wholly on my time and dime. I suppose I have a slight insider advantage, in that I can corner @othiym23 when I visit npm's offices, and he's too polite to kick me out or flee. In return, he @-mentions me on PRs he knows I can't leave alone 😉. Indeed, I'm happy to answer any questions about my motivations or thought process behind adding validation, or to point out past conversations where it's come up. Long story short, it was about making license audit easier for my clients, who are both companies and freelancers. The last straw was an audit of existing I can't take credit for spec'ing SPDX in metadata, though. Nor can anyone at npm. The SPDX language in the package.json docs came from similar guidelines for RubyGems. RG recently accepted a PR of mine to add validation, too. Other language repo docs mention it; I'm not sure whether they validate. As for the PR itself: A lot of thought here. Much respect! It's not an easy problem. But before diving into all the careful details, I'd step back, because I think the details obscure the fundamental question being asked: Should npm default to expecting a machine-readable Before, I very strongly support |
@kemitchell Hello. I agree with validation by default, but this does not need to be a situation where validation via SPDX excludes license types that are not in SPDX. In fact you can have both, and backwards compatibility, with what has been proposed. Further, other sources of license validation may also be included if desired. The objective of license validation is a good one but it should not be done to the exclusion or preference for any type of license. This is metadata. That said, this is an important enough issue since it impacts much more than npm and the way we all describe this in our manifests and for an increasing amount of tooling for discovery and search. |
First and foremost: @scriptjs, I hope you know, more than anything, that I respect your thought on this, and am just really excited to run into someone else who cares about licensing! I worry about opening up to a variety of validation schemes, because from an audit point of view, a diversity of ways to express the same standard license makes using the metadata without human intervention impractical. This is especially true since MIT, the BSD licenses (often ambiguously specified), ISC (thanks, As string-to-license maps go, SPDX is by far the most inclusive. It includes many near-variants and even vanity licenses like The current npm approach is:
Going back to my generalization of the problem, this combination has a few advantages:
There are certain edge cases that no automated system is going to avoid completely, like packages that include conflicting license terms or metadata. Detecting and handling those in software is an npm-scale project unto itself. |
@kemitchell I hope you realize I already agree with more than 90% of what you have written. The notion of validation against other schemes is only a selling point of the suggested scheme since it is open to this possibility in the future should something else come up. I realize SPDX is a decent set of open licenses. That said, it is a subset. Being a subset of open licences excludes every proprietary license and open license that has not been submitted to the license review committee of SPDX. This set me off at first because warnings in NPM on build tools and fetching must be handled. No one wants this occurring in their code. I appreciate the effort you have made in a SPDX validator but this does not negate the fact that any other form of licensing metadata is now excluded from the license property. They are now exchanged for statements that do not make the metadata as useful. It would have been helpful to have consulted at this stage but I feel this was pushed out without considering the impact fully. This is occurring at a time where the package.json is only growing in importance for other tools for configuration and private package management. Thus the impact is more far reaching than NPM. Anyway I have been in general disagreement with an approach that cannot include license types for all software. Metadata is for discovery by humans and machines. This comes unglued with references and not data. There will be nothing to handle the references to infer any license type when it is not a SPDX licence and that is a sad day for anyone outside the Linux Foundation. |
@scriptjs, yes, I think we understand each other. So glad to know that. A few last thoughts: If a competitor standard for standard open-source license metadata cropped up---I'd say it's unlikely, but possible---I might support switching to it, but I'd never support running two "standards" for exactly the same semantics in parallel. One or the other should win out, to make it easy for programs consuming the JSON. SPDX covers the vast majority of open-source licenses folks want to use, based on current behavior in the public registry. The number of packages that don't use license terms with assigned SPDX identifiers is very small. Even just "MIT", the BSDs, Apache, and ISC, all on OSI's much more limited list, cover a huge percentage. SPDX as a whole does not exclude any licenses. It doesn't assign identifiers for every license, proprietary or arguably open-source. (I churn out several new custom software licenses per month.) But arbitrary license text is supported. This is so because SPDX is in fact a broader standard for freestanding XML metadata files, distinct from package manager metadata, that may reference non-standard, arbitrary license term content with special identifiers ( npm uses only the subset of SPDX license expressions without When I'm reviewing license terms, even npm's approach is too much for my taste. If a package isn't clearly marked with a single, OSI-approved, SPDX-identified form license, I want to inspect the whole package. This really goes to my measure of usefulness: JSON is for programs, the rest is for people. Whatever software can't do reliably on the basis of structured data, people (ahem, lawyers) will do. The metadata and even the LICENSE file, if there is one, become just two of a hundred factors indicating how risky it is to use the code. I'm sorry to hear the error messages may have caused you pain. I'd be lying if I let on I didn't see some pain like that coming. 
In my defense, LTS establishes for the first time a real, accountable right to expect stability from Node and the npm that comes with it. More importantly, though, we're about to hit a quarter-million packages in the npm public registry. That registry serves a package manager whose clear priorities are making it fast, space-efficient, and easy to construct massively nested deps trees. In my mind, it wasn't just "Could npm be the first open-source community to make clear licensing the norm?". It was also "If the npm community doesn't start taking machine-readable licensing seriously now, will using npm in a natural way ever be safe when licensing is a concern?". |
@kemitchell, @othiym23 I wish JSON were for machines but sadly most programmers do their share of reading and writing it manually, particularly in initializing app and module development. Again, I am sold on all the reasons for license validation and SPDX as an appropriate choice for validation. You also understand where I am coming from: non-SPDX license types should somehow be included in the license property of the manifest, otherwise we all lose human and machine discovery. References alone obfuscate the issue if the license is not a SPDX license. We don't need to degrade the capability or quality of the manifest to achieve our goals. It is rather sad that an open manifest standard has never come about. You can imagine we would not be debating anything if this were so. As a programmer, the number of manifests and pieces of configuration can make one sick these days and more noise only makes things worse. I seem to have determined a solution that will work in the interim. It will validate without throwing warnings and allows a non-SPDX license within the metadata without conflicting with the SPDX validation scheme. If we can agree on this solution and commit to a decent discussion about how this might evolve for the future, I would be satisfied. The solution is as follows and currently requires no changes to NPM or to the validation scheme.
Examples
This alleviates the primary concern that non-SPDX licenses cannot be included in the package.json metadata for developers and machines to discover. It will still validate free form, and what is within brackets could be harvested by tools. Can we agree on this as a recommended solution? I.e., you may optionally include the non-SPDX license(s) following the reference. |
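For readers wondering how "what is within brackets could be harvested by tools" might look in practice, here is a rough sketch. It assumes the value takes the shape `SEE LICENSE IN <file> (<name>, <name>, ...)` with round brackets and comma-separated names. That shape, the function name `harvestLicenseNames`, and the sample strings are all hypothetical; the exact syntax proposed in this comment is not preserved here, and this is not an npm convention.

```javascript
// Hypothetical parser for "SEE LICENSE IN <file> (<name>[, <name>...])".
// Returns null when the value is not a file reference at all.
function harvestLicenseNames(value) {
  const match = /^SEE LICENSE IN\s+(.+?)\s*(?:\(([^()]*)\))?$/.exec(value.trim());
  if (!match) return null;
  // The bracketed part is optional; an absent part yields an empty list.
  const names = match[2]
    ? match[2].split(',').map((s) => s.trim()).filter(Boolean)
    : [];
  return { file: match[1], names };
}

// Usage with made-up values:
console.log(harvestLicenseNames('SEE LICENSE IN LICENSE.md (Example Commercial License)'));
console.log(harvestLicenseNames('SEE LICENSE IN LICENSE.md'));
```

A tool that does not know about the bracket convention would still see an ordinary file reference, which is the backwards-compatibility argument being made.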
@othiym23 i'm putting most of my code in the public domain. therefore i can appreciate the need for machine-readable metadata, working mostly on open data projects myself. but making changes that make the existing metadata of five years of published code invalid and stipulating one closed and restricted repository of licenses (SPDX) as the only, once and forever choice for licensing your code (with some awkward escape hatch as an afterthought) contradicts and defies all the openness, egality, inclusiveness and beauty of node and npm i once fell in love with. i suggested to create an (npm-style) open repository of all licenses with no discrimination (which could possibly include a graph of license compatibility for more automation, yay!). please make openness and inclusion a priority instead of an afterthought. |
We've gone back-and-forth pretty quick so far, but I want to take some time to think through your proposal. A couple things I have in mind:
I'll let it roll around at least a few days before weighing in again. |
@kemitchell with my "i'm trying to break everything" hat on, i will write a valid |
@yetzt: ROFL. Lawyers are inherently lazy. Like programmers 😉 [Edit: Miss the functional programming pun? 😄] |
@yetzt: Much love to German hackers! Really sad I could not go to CCC this summer. I can't give legal advice on licensing and how the public domain works over the Internet---especially under German law---but I assure you there are many great options for making your work as open as possible with npm right now. License metadata validation is in fact all about making sure folks can be sure they have the rights to use your work without thinking about it, using software. If you want to give others all the rights in the world to use your stuff, the idea is that npm should help you make that as clear as possible. If you're interested in the public domain, please do take time to read up on how works actually get into the public domain. I strongly recommend you start at Wikipedia. Many great hackers misunderstand the public domain! I'd also strongly recommend you skim some "licenses":
The SPDX legal working group is a little old-school, but definitely not closed. See the new SPDX license request process, inclusion principles, and wiki tracker. They track copies of all the licenses they accept in a Git repository, too. Members of the SPDX tech group recently worked together with me to make structured data for the license IDs available as JSON on their website, which is how both npm and RubyGems now find out about them. There are plans to make license texts available in a structured way, too. Really: Who wants to tend a list of form contracts? The SPDX people have done it for years with little recognition. I'd also like to point out, AFAIK, the only practical effect a bad license string is supposed to have on any npm package is a warning to stderr. npmjs.com is not showing some licenses correctly at the moment, but the PR to fix that is on its way. |
@kemitchell There will always be a potential mismatch between metadata and actual files as long as there are human beings regardless of SPDX. The reality is mostly unlikely and if discovered can be corrected by module authors. Obviously there is an expectation the license identified will be included in the package when it is identified in the package metadata. What I prefer to see in place is the proposal that started this thread that would leave the license metadata fully intact for developers, leaving things backwards compatible rather than undermining this resource that we use in our work. Developers are the main users of NPM and I sincerely hope you appreciate this fully. I have suggested the second proposal in an attempt to mitigate the impact of eliminating license types from the manifest completely (that is my primary concern). The problem you are suggesting to avoid is the problem you create for every developer that would have to inspect a file that is not a SPDX license. I note that the second proposal does adequately address the issue that @yetzt has raised, but does work with the current long term release of node without modifications. Please, I don't want to get into the semantics of interpreting licenses. For practical purposes, packages generally contain one or more licenses. It is up to an author to identify the license and include it if it has not been identified by reference. Let's not go there. The issue that we must resolve is one of metadata and discovery by humans and machines, which is in jeopardy as a result of the changes. Lastly, I am confused somewhat by your role in this. I am looking at this as a developer. As one of thousands of developers that access and discover modules and data on multiple systems every day. I tend to work more with a private package repository than NPM most days. The package.json is the source of truth about the package. I tend to write as many or more packages than I consume from NPM. 
Metadata we have been able to assess quickly is about to become increasingly ambiguous for good across our ecosystem. This is something I feel everyone needs to be concerned about. Even worse is being painted into a corner and forced to write awkward, unhelpful text as metadata that assists no one, just for the sake of silencing warnings, while degrading the information I am passing to users of our software. |
For the record, my current position/hoped-for compromise:
This I estimate is the least disruptive & minimal change to what currently is. (Without having looked at the npm code itself; estimated by looking at the proposed change itself.) P.S.: A decent alternative solution to the same problem as suggested in #8918 (comment) is also fine but may be more work? (while I'd prefer that one from an unambiguous information dissemination perspective) Or should we kick this up to SPDX themselves to get something like that included in their specification so as to cover commercial and other 'unsupported / one-off' licenses in their spec and thus line it up for inclusion in npm (and others)? Looking at http://spdx.org/sites/spdx/files/SPDX-2.0.pdf section: Appendix IV: SPDX License Expressions (pages 81-88) however indicates that this might be counter to their process as they already have another way to potentially 'solve' this issue at least from their own perspective: 'DocumentRef-'. |
@othiym23, this has brought one interesting use case to mind. I can summarize, then I'm afraid I've given all I can here. The upshot for me is that while I am not in favor of any changes to how npm handles license metadata at this time, there was good input here. The use case:
At this point, "LibCo Production Use License" has all the meaning it needs at DevCo. It's been pseudo-standardized, in that it's now a string identifier with an unambiguous connection to a particular legal outcome. DevCo programmers cruising an index of DevCo's private registry, which looks at npm-standard package.json props, know the consequences of using packages with that license value---perhaps other libs from LibCo, too---and they know they can trust the metadata alone. DevOps can roll a shell script that checks for Current The prospect of this pain doesn't change my view, however, for three reasons:
To say it another way: In my mind, npm package metadata for license terms should be a lingua franca among all developers with npm installed. That lingua franca should be optimized for packages that do the most traveling, meaning open-source packages. It should be as simple as possible to parse, interpret, and translate. It should have one escape hatch for "the license vocabulary all npm users share can't express what's going on here". Organizations will always develop their own "dialects" of license-talk, with their own lists of non-standard license nicknames that local speakers understand. This should be encouraged; it has to be accepted. But nobody else should have to know any of these dialects to understand content in valid npm-standard |
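The DevCo use case above, where a locally understood license name acts as an identifier that scripts can check, might look something like the following. This is a sketch only: the allow-list contents, the function name `findUnapproved`, and the sample manifests are invented for illustration, with "LibCo Production Use License" borrowed from the example in this comment.

```javascript
// Hypothetical organization-local allow-list (a license "dialect").
const ALLOWED = new Set(['MIT', 'ISC', 'LibCo Production Use License']);

// Returns the names of packages whose license value is missing,
// not a string, or not on the local allow-list.
function findUnapproved(manifests, allowed = ALLOWED) {
  return manifests
    .filter((pkg) => typeof pkg.license !== 'string' || !allowed.has(pkg.license))
    .map((pkg) => pkg.name);
}

// Usage with made-up manifest data:
const manifests = [
  { name: 'left-thing', license: 'MIT' },
  { name: 'libco-widgets', license: 'LibCo Production Use License' },
  { name: 'mystery-box', license: 'SEE LICENSE IN LICENSE.md' },
  { name: 'no-license-field' },
];
console.log(findUnapproved(manifests)); // names that need a human to look
```

Exact string matching is what makes the local name usable as an identifier, and it is also why a bare file reference gives such a script nothing to match on.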
I am opening this issue that was closed while discussion was ongoing for an appropriate solution to #8918. Discussion has been ongoing for months over this in #8918 as well as #8291, #8557, #8773, #8795 so it has touched a nerve. I urge NPM to listen and collaborate for an appropriately considered solution that will work for everyone.
The latest recommended solution is as follows, with these benefits:
Valid SPDX licenses
Non SPDX licenses
May Emit Warning
Backwards compatible but a SPDX License.
My recommendation is to inform NPM users of the change to the license property and give module developers some time before driving everyone crazy with SPDX warnings, as was done when you imposed it. Perhaps blog about the change first to allow voluntary revisions until a certain date after which warnings could be emitted. One way or the other, I urge you to engage users before disturbing software and build systems with noise.
I have not heard anyone come out against SPDX, only against the way you have chosen to implement it, which is not backwards compatible with about 5 years of data, excludes non-SPDX licenses from package metadata, and creates a non-standard SPDX description of "SEE LICENSE IN" that makes the language of the metadata awkward, i.e.
Metadata is a source of truth, and these types of phrases are meaningless and only require more investigation into a repo or package.