Support repo license #24872

Merged
209 commits merged on Oct 1, 2024

yp05327
Contributor

@yp05327 yp05327 commented May 23, 2023

Close #278
Close #24076

Solutions:

Screen shot

Single License:
image

Multiple Licenses:
image

Triggers:

  • Push commit to default branch
  • Create repo
  • Mirror repo
  • When Default Branch is changed, licenses should be updated
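
For the push trigger, a minimal sketch of the check could look like the following. `PushEvent`, `looksLikeLicenseFile`, and `needsLicenseUpdate` are hypothetical names for illustration, not Gitea types, and the file-name rule (root-level files starting with "license" or "copying") is an assumption. The other triggers (create, mirror, default branch change) would presumably re-run detection unconditionally.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// PushEvent is a hypothetical, simplified view of a push; it is not a Gitea type.
type PushEvent struct {
	Branch        string   // branch that received the push
	DefaultBranch string   // the repository's current default branch
	ChangedFiles  []string // paths touched by the pushed commits
}

// looksLikeLicenseFile reports whether p is a root-level file whose name
// starts with "license" or "copying" (case-insensitive). The naming rule
// is an assumption made for this sketch.
func looksLikeLicenseFile(p string) bool {
	if path.Dir(p) != "." {
		return false
	}
	name := strings.ToLower(path.Base(p))
	return strings.HasPrefix(name, "license") || strings.HasPrefix(name, "copying")
}

// needsLicenseUpdate returns true only for pushes to the default branch
// that touch at least one candidate license file.
func needsLicenseUpdate(ev PushEvent) bool {
	if ev.Branch != ev.DefaultBranch {
		return false
	}
	for _, f := range ev.ChangedFiles {
		if looksLikeLicenseFile(f) {
			return true
		}
	}
	return false
}

func main() {
	ev := PushEvent{
		Branch:        "main",
		DefaultBranch: "main",
		ChangedFiles:  []string{"README.md", "LICENSE"},
	}
	fmt.Println(needsLicenseUpdate(ev))
}
```

Skipping pushes that do not touch a candidate file keeps detection off the hot path for ordinary commits.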

Todo:

  • Save license info to the DB when a commit changes a license file
  • DB Migration
  • A nominal test?
  • Select which library to use (Support repo license #24872 (comment))
  • API Support
  • Add repo license table
  • Select license in settings if there are several licenses (Not recommended)
  • License board (later, not in this PR)
    image

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label May 23, 2023
@pull-request-size pull-request-size bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 23, 2023
@silverwind
Member

@JakobDev
Contributor

The last commit on google/licensecheck is from one year ago, so it looks like it is no longer developed. I would suggest using go-enry/go-license-detector.

@silverwind
Member

silverwind commented May 23, 2023

The last commit on google/licensecheck is from one Year ago, so it looks like it is not developed anymore. I would suggest using go-enry/go-license-detector.

They should be compared by more than just that, ideally by feeding them a corpus of oddly licensed repos. One thing I'd be interested in is whether either of the two can detect the license field in package.json, for example.
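
To make the package.json case concrete, here is a standalone sketch of reading that field with the Go standard library. It says nothing about what either library actually does; `packageJSONLicense` is a hypothetical helper, and only the plain string form of the field is handled (any other shape is treated as "no license found").

```go
package main

import (
	"encoding/json"
	"fmt"
)

// packageJSONLicense extracts the "license" field from package.json content.
// Only the plain string form is handled; a missing field or any other shape
// (object, array, null) yields an empty string without an error.
func packageJSONLicense(data []byte) (string, error) {
	var manifest struct {
		License json.RawMessage `json:"license"`
	}
	if err := json.Unmarshal(data, &manifest); err != nil {
		return "", err // the file itself is not valid JSON
	}
	var license string
	if err := json.Unmarshal(manifest.License, &license); err != nil {
		return "", nil // missing or non-string license field
	}
	return license, nil
}

func main() {
	license, err := packageJSONLicense([]byte(`{"name": "demo", "license": "MIT"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(license)
}
```

A corpus test could compare the value returned here against what each library reports for the same repository.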

@silverwind silverwind added the type/feature Completely new functionality. Can only be merged if feature freeze is not active. label May 23, 2023
@silverwind
Member

silverwind commented May 23, 2023

Select license in settings if there are several licenses?

I wouldn't. License detection should be fully automatic and not overridable, so that the git repo remains portable and is not tied to a Gitea-specific license override that cannot be migrated.

Save Licenses info in to DB when there's a change to license file in the commit?

Some form of cache would be nice, so that the license is not calculated on every render but only on push to the branch.

License board?

Seems nice but I'd say something for later, and only if we can outsource the license metadata.
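
The caching idea above could be sketched as follows. This is only an in-memory illustration under assumptions: `licenseCache`, `OnPush`, and `ForRender` are hypothetical names, and a real implementation would presumably persist the result in the database rather than a map.

```go
package main

import (
	"fmt"
	"sync"
)

// licenseCache is a hypothetical in-memory stand-in for a database table
// that maps a repository ID to its detected license identifiers.
type licenseCache struct {
	mu     sync.RWMutex
	byRepo map[int64][]string
}

func newLicenseCache() *licenseCache {
	return &licenseCache{byRepo: make(map[int64][]string)}
}

// OnPush runs the (potentially expensive) detector once and stores its result.
func (c *licenseCache) OnPush(repoID int64, detect func() []string) {
	licenses := detect()
	c.mu.Lock()
	defer c.mu.Unlock()
	c.byRepo[repoID] = licenses
}

// ForRender returns the stored result without running any detection.
func (c *licenseCache) ForRender(repoID int64) []string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.byRepo[repoID]
}

func main() {
	calls := 0
	detect := func() []string {
		calls++
		return []string{"MIT"}
	}

	cache := newLicenseCache()
	cache.OnPush(1, detect)
	for i := 0; i < 3; i++ {
		_ = cache.ForRender(1) // page renders never call the detector
	}
	fmt.Println(cache.ForRender(1), "detector calls:", calls)
}
```

The point of the split is that the detector runs once per relevant push, while page renders only read the stored result.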

@techknowlogick
Member

Thanks for this PR <3

Could you add at least a nominal test? While upstream is tested, we should ensure that our use of the library is acceptable.

I do think the storage of information in the database is a good idea as that's what we do with language stats.

As for which library we use, there are pros and cons for both. The Google one is used by pkg.go.dev, and we already use enry's library for language stats. I suspect that both come reasonably close to identifying the actual license, so unless there is a significant performance impact or one library is wildly inaccurate, either is fine.

@yp05327
Contributor Author

yp05327 commented May 24, 2023

I tried the CLI tool of go-license-detector.

There are two problems:

  • Performance problem
    go-license-detector searches for license files automatically, which seems good,
    but if an unrelated large file's name starts with "license", it takes a long time to get the result.
    For example, in this folder:
    image
    Running ./license-detector ./1 is very fast, but ./license-detector ./ takes a long time.
    image
    image
    (It was too slow, so I canceled the process.)

    In contrast, google/licensecheck takes only about 10s to finish the detection.
    image
    So google/licensecheck seems to perform better.
    (Is this caused by different I/O handling? The license-detector test used its own I/O code, while the licensecheck test used charset. I'm not sure about this yet.)
    Test code:
    image
    Test Result:
    image
    (Stuck at licensedb.InvestigateLicenseText for a long time)

  • Detecting multiple licenses
    It seems that go-license-detector cannot detect multiple licenses in one file.
    In https://github.com/yp05327/test, there are two licenses in LICENSE.
    In https://github.com/yp05327/test2, there is only one license in LICENSE,
    but the result is the same:
    image
    image
    google/licensecheck can detect all of them.
    image
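
One possible mitigation for the first problem, independent of which library is chosen, is to filter candidates by name and size before handing them to the detector. The sketch below is an assumption for illustration only: `fileInfo`, `licenseCandidates`, and the 1 MiB cap are hypothetical and not taken from either library or from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// maxLicenseSize is an assumed upper bound for a plausible license file.
const maxLicenseSize = 1 << 20 // 1 MiB

// fileInfo is a hypothetical, minimal description of a file in the repo root.
type fileInfo struct {
	Name string
	Size int64
}

// licenseCandidates keeps files whose names start with "license"
// (case-insensitive) and whose size does not exceed maxLicenseSize,
// so that large unrelated files never reach the detector.
func licenseCandidates(files []fileInfo) []fileInfo {
	var out []fileInfo
	for _, f := range files {
		if !strings.HasPrefix(strings.ToLower(f.Name), "license") {
			continue
		}
		if f.Size > maxLicenseSize {
			continue
		}
		out = append(out, f)
	}
	return out
}

func main() {
	files := []fileInfo{
		{Name: "LICENSE", Size: 1071},
		{Name: "license-dump.bin", Size: 2 << 30},
		{Name: "README.md", Size: 100},
	}
	for _, f := range licenseCandidates(files) {
		fmt.Println(f.Name)
	}
}
```

With such a pre-filter, a large file that merely happens to be named like a license would be skipped instead of being read in full.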

@pull-request-size pull-request-size bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Oct 1, 2024
@yp05327 yp05327 changed the title Display license of a repo Support repo license Oct 1, 2024
@wxiaoguang wxiaoguang dismissed their stale review October 1, 2024 03:18

dismiss change request

@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/blocked A maintainer has reservations with the PR and thus it cannot be merged labels Oct 1, 2024
@techknowlogick techknowlogick merged commit 70b7df0 into go-gitea:main Oct 1, 2024
26 checks passed
@lunny lunny modified the milestones: 1.24.0, 1.23.0 Oct 1, 2024
zjjhot added a commit to zjjhot/gitea that referenced this pull request Oct 2, 2024
* giteaofficial/main:
  Fix javascript error when an anonymous user visiting migration page (go-gitea#32144)
  Make oauth2 code clear. Move oauth2 provider code to their own packages/files (go-gitea#32148)
  Support repo license (go-gitea#24872)
  Fix the logic of finding the latest pull review commit ID (go-gitea#32139)
  Ensure `GetCSRF` doesn't return an empty token (go-gitea#32130)
  Bump minio-go to latest version (go-gitea#32156)
@yp05327 yp05327 deleted the add-repo-license-display branch October 3, 2024 00:03
Labels
  • lgtm/done: This PR has enough approvals to get merged. There are no important open reservations anymore.
  • modifies/api: This PR adds API routes or modifies them
  • modifies/dependencies
  • modifies/go: Pull requests that update Go code
  • modifies/internal
  • modifies/migrations
  • modifies/templates: This PR modifies the template files
  • modifies/translation
  • size/XL: Denotes a PR that changes 500-999 lines, ignoring generated files.
  • type/changelog: Adds the changelog for a new Gitea version
  • type/feature: Completely new functionality. Can only be merged if feature freeze is not active.
Development

Successfully merging this pull request may close these issues.

  • Make it more obvious what license a repo is using
  • Display a License tab