Makefile: Add a Makefile and build script #22

wking · 2018-01-02T23:05:15Z

Based on the script in-flight with spdx/license-list-XML#592, but using an external test-data directory. This script does not cover fetching those directories, but it does cover the build process once they're fetched. I didn't build the JAR myself, so I'm just hashing the JAR that Bintray is currently returning and assuming that is uncompromised.

Docs on $@, $<, and $* are here.

This pull request is an alternative to #21, we'll want one or the other.

goneall · 2018-01-02T23:26:15Z

Could you document or propose changes to the license-list-XML travis files to use these when there is a PR or commit? For example, do we clone this repo into the license-list-XML working directory?

wking · 2018-01-03T00:00:17Z

On Tue, Jan 02, 2018 at 03:26:16PM -0800, goneall wrote: Could you document or propose changes to the license-list-XML travis files to use these when there is a PR or commit? For example, do we clone this repo into the license-list-XML working directory?

To minimize the number of balls in the air, I've punted on auto-publish for now. I can fill that in (with follow-up PRs?) if you like. I'd like the eventual license-list-XML .travis.yml to look something like: matrix: include: - language: java before_install: - git clone --depth 1 https://github.com/goneall/license-test-files - git clone --depth 1 https://github.com/spdx/license-list-data script: - 'if [[ master = "${TRAVIS_BRANCH}" ]]; then make -C license-list-data LICENSE_XML=../ TEST_DATA=../license-test-files publish; else make TEST_DATA=license-test-files validate-canonical-match; fi' - language: node_js node_js: - "node"

Based on the script in-flight with [1], but using an external test-data directory. This script does not cover fetching those directories, but it does cover the build process once they're fetched. I didn't build the JAR myself, so I'm just hashing the JAR that Bintray is currently returning and assuming that is uncompromised. Docs on $@, $<, and $* in [2]. The key in goneall.gpg is 0xEA760AF7CBD0E4F9013F06FB8CCBE63DE91BFF1C, imported with: $ gpg --search 0x8CCBE63DE91BFF1C gpg: data source: http://5.45.108.219:11371 (1) Gary O'Neall (Key for signing jar files) <gary@sourceauditor.com> 4096 bit RSA key 8CCBE63DE91BFF1C, created: 2017-12-13 Keys 1-1 of 1 for "0x8CCBE63DE91BFF1C". Enter number(s), N)ext, or Q)uit > 1 gpg: key 8CCBE63DE91BFF1C: public key "Gary O'Neall (Key for signing jar files) <gary@sourceauditor.com>" imported gpg: Total number processed: 1 gpg: imported: 1 The source keyserver is from hkp://pool.sks-keyservers.net. I then exported the key with: $ gpg --export 0x8CCBE63DE91BFF1C >goneall.gpg When exported without ASCII armor, such keys can be used as a keyring. The --no-default-keyring option disables the user's default keyring (if any. There probably won't be one on Travis). And the --keyring ./goneall.gpg slots in our local file. We need the ./ (at least with GnuPG 2.1.20), because gpg2(1) has: If the filename does not contain a slash, it is assumed to be in the GnuPG home directory ("~/.gnupg" if --homedir or $GNUPGHOME is not used). The output may still include: gpg: WARNING: This key is not certified with a trusted signature! gpg: There is no indication that the signature belongs to the owner. but that's ok, because gpg still exits zero. And because of our keyring manipulation, we know the key is trusted. Signatures from any other key will instead show: gpg: Can't check signature: No public key and exit 2. [1]: spdx/license-list-XML#592 [2]: https://www.gnu.org/software/make/manual/html_node/Automatic-Variables.html

goneall · 2018-01-03T06:20:47Z

I like the makefile approach. I would like to move this into the license-list-XML repo and reference the local test data.

Since the script version already has this working and most of your recommendations have been implemented, I would like to roll out the scripts first then follow-up with an improved makefile implementation.

I would also like to implement an incremental build functionality. It takes a long time to process all of the license list.

We could detect which files were modified since the last build and enhance the SPDX tools to take a list of license-listXML files to be "compiled".

Using Make to orchestrate this would be a lot easier than implementing in a script.

wking · 2018-01-03T15:10:51Z

I would like to move this into the license-list-XML repo and reference the local test data.

I'd rather not, for the reasons given here and in spdx/license-test-files#6. Managing separate checkouts (as I do in spdx/license-list-XML#593) isn't hard, and could be automated with scripts/submodules. I think that's worth it to stay dry.

Since the script version already has this working and most of your recommendations have been implemented...

This PR is also implemented ;).

...I would like to roll out the scripts first then follow-up with an improved makefile implementation.

The scripts still auto-tag releases, and I think that's a mistake. Also, I don't see a hurry to auto-publish. spdx/license-list-XML#593 has enough to auto-validate without pulling in output-format-specific code. We can circle back and auto-publish in follow-up work.

goneall · 2018-01-03T17:05:38Z

I commented out the tagging for the releases.

Once we auto-publish and test, I no longer need to manually run the scripts myself. This saves us time and allows me to go on vacations ;)

In terms of the repositories for the tests, I don't think you and are are going to agree. I do think putting the tests in a repository separate from the source XML files places a burden on contributors and maintainers of the license-list-XML repo with little or no practical benefit.

As an example, the recent PR to change the test for the Bittorrent-1.0 license would cause a failure for the next commit to the license list XML. The commit would likely be unrelated to Bittorrent. It would be confusing for the committer. Since the submittal is in a different repository, it won't be very obvious what caused the error. To fix the problem, the license XML would have to be updated. Since they are in separate repositories, you would have to coordinate the submittals to avoid test failures. If you submit the license XML first, it may fail since it doesn't match the test. If you submit the test first you have the problem describe above. As an alternative, if we maintained the tests and the license XML in the same repository you could make one commit that included both the test update and license XML update. This would logically group the changes together. Make the changes atomic. Make it easier to revert and keep consistent.

wking · 2018-01-03T17:33:30Z

On Wed, Jan 03, 2018 at 05:05:38PM +0000, goneall wrote: Once we auto-publish and test, I no longer need to manually run the scripts myself. This saves us time and allows me to go on vacations ;)

I think everyone's on board with auto-testing as soon as possible. But I don't see a problem with publishing lagging a month or so if you happen to be on vacation. If license-list-XML expects to tag a release, I expect we can either negotiate the timing to avoid your vacations or grant other folks write access and have them push the release tag instead.

In terms of the repositories for the tests, I don't think you and are are going to agree.

Is there some middle ground where we can agree? Would you accept a submodule entry in license-list-XML to include the test data?

I do think putting the tests in a repository separate from the source XML files places a burden on contributors and maintainers of the license-list-XML repo with little or no practical benefit.

And I think the burden is the other way around ;). But I guess we can wait and see how your way plays out.

As an example, the recent PR to change the test for the Bittorrent-1.0 license would cause a failure for the next commit to the license list XML.

No it wouldn't, because license-list-XML is pinning a specific test-data commit [1].

Since they are in separate repositories, you would have to coordinate the submittals to avoid test failures.

With separate repositories, you'd still need to coordinate the *order* of license-list-XML landings (at least, if you wanted to stay with commits that were in the test-data master branch). For example, if the test-data repo landed a BitTorrent-1.0 fix and then a GPL-2.0 fix (for example), license-list-XML would have to merge their BitTorrent-1.0 PR before they merged thir GPL-2.0 fix. I don't think that's a large cost.

Make the changes atomic.

They'd still be atomic, since a single license-list-XML commit would bump the pinned test-data commit hash and update the XML. [1]: https://github.com/spdx/license-list-XML/pull/593/files#diff-354f30a63fb0907d4ad57269548329e3R6

goneall · 2018-01-03T20:02:27Z

Would you accept a submodule entry in license-list-XML to include the test data?

This would solve some of my concerns by making the changes and history more visible. Would this still require submitting the test file and license file in separate repos? If I were to make a change that required a new or updated test along with the XML file, I would like to be able to make that change in a single commit.

goneall · 2018-01-03T20:16:31Z

@wking separate question and topic:

If I were to change the interface to the RdfaLicenseGenerator tool to take in specific file patterns in place of a directory name, would you be able to update the makefile to only compile the files which have changed since the last time the license-list-data files were generated?

I am thinking the incremental builds would greatly improve the performance of the license checks and could also be used to generate the output files if we take the additional step of deleting output files on deletion of the associated XML file.

wking · 2018-01-03T20:23:37Z

On Wed, Jan 03, 2018 at 12:02:27PM -0800, goneall wrote: Would this still require submitting the test file and license file in separate repos? If I were to make a change that required a new or updated test along with the XML file, I would like to be able to make that change in a single commit.

You'd still need two commits: * One like [1] to license-test-files to update the test data. * Another like [2] to update the XML and bump the submodule. However, with the spdx/license-list-XML#492 approach, you *already* have those same two commits. Unless you drop [1], in which case folks need to pull in all of license-list-XML if they want access to this test data. The submodule approach (or the fairly similar commit-pinning from [3]) is giving you the ability to recycle [1] when you're working up [2] instead of requiring you to repeat [1] like you did in [4]. [1]: spdx/license-test-files@7771d37 [2]: wking/license-list-XML@86c2af5 [3]: spdx/license-list-XML@992d62e [4]: spdx/license-list-XML@2c1c392

wking · 2018-01-03T20:45:24Z

On Wed, Jan 03, 2018 at 08:16:32PM +0000, goneall wrote: If I were to change the interface to the RdfaLicenseGenerator tool to take in specific file patterns in place of a directory name, would you be able to update the makefile to only compile the files which have changed since the last time the license-list-data files were generated?

The cleanest rule for something like this would be to use static pattern rules [1], wildcard [2], substitution references [3], and shell [4]: LICENSE_PATHS = $(wildcard $(LICENSE_XML)/src/*.xml) LICENSES = $(LICENSE_PATHS:$(LICENSE_XML)/src/%.xml=%) VERSION = $(shell git -C "${LICENSE_XML}" describe --always || echo 'UNKNOWN') DATE = $(shell date -I) text/%.txt: $(LICENSE_XML)/src/%.xml spdx-tools-$(TOOL_VERSION).jar-valid resources/licenses-full.json java -jar spdx-tools-$(TOOL_VERSION).jar LicenseTextGenerator $< $@ $(VERSION) $(DATE) … additional rules for the other output types …

I am thinking the incremental builds would greatly improve the performance of the license checks and could also be used to generate the output files…

How often do you build? While efficient build tooling is nice, the current build only takes ~5min [5].

… if we take the additional step of deleting output files on deletion of the associated XML file.

The make rule I floated above is using mtimes to determine file updates. Using Git to determine file removals would require also tracking the last-built commit. That's possible (especially if it's always “the current commit's first parent”), but I'm not sure it's worth the trouble to trim a 5min build. [1]: https://www.gnu.org/software/make/manual/html_node/Static-Usage.html [2]: https://www.gnu.org/software/make/manual/html_node/Wildcard-Function.html [3]: https://www.gnu.org/software/make/manual/html_node/Substitution-Refs.html [4]: https://www.gnu.org/software/make/manual/html_node/Shell-Function.html [5]: https://travis-ci.org/spdx/license-list-XML/builds/324695123

wking · 2018-01-03T21:02:07Z

build.sh

+TOOLS=${TOOLS:-spdx-tools-*.jar}
+LICENSE_XML=${LICENSE_XML:-../license-list-XML}
+TEST_DATA=${TEST_DATA:-../license-test-files/simpleForGenerator}
+EXPECTED_WARNINGS="\"Duplicates licenses: AGPL-3.0-or-later, AGPL-3.0-only\",\"Duplicates licenses: GFDL-1.1-or-later, GFDL-1.1-only\",\"Duplicates licenses: GFDL-1.2-or-later, GFDL-1.2-only\",\"Duplicates licenses: GFDL-1.3-or-later, GFDL-1.3-only\",\"Duplicates licenses: GPL-1.0-or-later, GPL-1.0-only\",\"Duplicates licenses: GPL-2.0-or-later, GPL-2.0-only\",\"Duplicates licenses: GPL-3.0-or-later, GPL-3.0-only\",\"Duplicates licenses: LGPL-2.0-or-later, LGPL-2.0-only\",\"Duplicates licenses: LGPL-2.1-or-later, LGPL-2.1-only\",\"Duplicates licenses: LGPL-3.0-or-later, LGPL-3.0-only\",\"Duplicates licenses: MPL-2.0, MPL-2.0-no-copyleft-exception\""


We'll be able to drop this line if spdx/license-list-XML#593 lands with its expected-warnings file.

goneall · 2018-01-03T21:43:29Z

but I'm not sure it's worth the trouble to trim a 5min build.

OK - Let's leave it as is.

goneall · 2018-01-05T00:42:53Z

I'm still of the opinion to store the test files in the license-list-XML repository.

I'm OK re-opening the discussion once we decide on the format of the test repository. Jack is still working out the proposal.

As far as keeping the 2 repositories in sync, we could exclude the canonical text tests from the license list test data repo, or, we could automatically update the test data repo when the license-list-XML canonical text is updated using the same Travis-CI scripts which test the license XML files.

wking · 2018-01-05T01:02:13Z

On Fri, Jan 05, 2018 at 12:42:53AM +0000, goneall wrote: As far as keeping the 2 repositories in sync, we could exclude the canonical text tests from the license list test data repo…

That gets around the issue somewhat for canonical/original text. But either the alt/optional sections will remain unexercised in license-list-XML or *all* of the test data (including other variants) will have to go into license-list-XML.

… or, we could automatically update the test data repo when the license-list-XML canonical text is updated using the same Travis-CI scripts which test the license XML files.

If we end up with all the test data (including variants) in license-list-XML, I don't see a point to maintaining this repository. Folks who need the test data can just pull license-list-XML. I think we're missing an opportunity for a useful separation of concerns, but I can't think of any reasons beyond those I've already mentioned.

goneall · 2018-01-05T05:02:55Z

I think we're missing an opportunity for a useful separation of concerns

I do agree with this point - let's revisit this once we have the test repository fully up and running. We can always switch over to a different approach. In the tools, I have a much more robust testing framework which uses positive and negative tests. The only thing missing is a consensus on the file structure for the tests.

newtires · 2018-01-09T22:10:33Z

MIT License

Copyright (c) [year] [fullname]

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Update README examples

goneall · 2020-09-10T20:53:04Z

Since we have the testing up and running in the license-list-XML repo, closing this PR

wking force-pushed the makefile branch from 5b76b23 to f456c0a Compare January 2, 2018 23:06

wking mentioned this pull request Jan 2, 2018

Add automated testing and deployment of the license list to license-list-data repo spdx/license-list-XML#592

Closed

wking force-pushed the makefile branch from f456c0a to aeaf45c Compare January 3, 2018 06:14

wking mentioned this pull request Jan 3, 2018

Makefile: Use Travis to run canonical-matching tests spdx/license-list-XML#593

Merged

wking commented Jan 3, 2018

View reviewed changes

wking mentioned this pull request Jan 9, 2018

/licenses/licenses.json: All isDeprecatedLicenseId properties are false spdx/tools#144

Closed

carmenbianca pushed a commit to carmenbianca/license-list-data that referenced this pull request Apr 13, 2019

Merge pull request spdx#22 from rtgdk/patch

daa3a1f

Update README examples

ghost approved these changes Jul 6, 2020

View reviewed changes

goneall closed this Sep 10, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Makefile: Add a Makefile and build script #22

Makefile: Add a Makefile and build script #22

wking commented Jan 2, 2018

goneall commented Jan 2, 2018

wking commented Jan 3, 2018 via email

goneall commented Jan 3, 2018

wking commented Jan 3, 2018 •

edited

Loading

goneall commented Jan 3, 2018

wking commented Jan 3, 2018 via email

goneall commented Jan 3, 2018

goneall commented Jan 3, 2018

wking commented Jan 3, 2018 via email

wking commented Jan 3, 2018 via email

wking Jan 3, 2018

goneall commented Jan 3, 2018

goneall commented Jan 5, 2018

wking commented Jan 5, 2018 via email

goneall commented Jan 5, 2018

newtires commented Jan 9, 2018

goneall commented Sep 10, 2020

Makefile: Add a Makefile and build script #22

Makefile: Add a Makefile and build script #22

Conversation

wking commented Jan 2, 2018

goneall commented Jan 2, 2018

wking commented Jan 3, 2018 via email

goneall commented Jan 3, 2018

wking commented Jan 3, 2018 • edited Loading

goneall commented Jan 3, 2018

wking commented Jan 3, 2018 via email

goneall commented Jan 3, 2018

goneall commented Jan 3, 2018

wking commented Jan 3, 2018 via email

wking commented Jan 3, 2018 via email

wking Jan 3, 2018

Choose a reason for hiding this comment

goneall commented Jan 3, 2018

goneall commented Jan 5, 2018

wking commented Jan 5, 2018 via email

goneall commented Jan 5, 2018

newtires commented Jan 9, 2018

goneall commented Sep 10, 2020

wking commented Jan 3, 2018 •

edited

Loading