Kubernetes support and a few feature ideas #335

Open
Atharex opened this issue Oct 3, 2021 · 6 comments

@Atharex
Atharex commented Oct 3, 2021

Hi!

I like the direction scancode.io is evolving in and would love to use it as part of my DevOps stack in Kubernetes :)

By deconstructing your docker-compose file, I've put together this k8s manifest file, which should be able to deploy scancode.io as a Kubernetes deployment. The input is, of course, the image built from your Dockerfile. Give it a try and let me know if it works correctly already (I had to rename the file to .txt, because .yaml is not supported as an attachment):
k8s.txt

Also I wanted to briefly mention a few ideas...

  1. In particular, I'd love to get scancode.io running as a persistent service in Kubernetes, where I can centrally schedule scanning runs on the Docker images in my private registry and also draw up reports for stakeholders in the company.

  2. Another really great feature would be to be able to point scancode to a github repository (e.g. https://github.com/nexB/scancode.io) and have it automatically download the main branch/latest release and perform a license scan on it, without manually uploading the zip/tar files of the releases

  3. Ideally both of the previous scenarios would support on-demand triggers and scheduled runs

  4. Also it would be great to have some sort of policy checks (e.g. like vulnerability policies in container vuln. scanners) where for each project (either a repository or docker image/VM) we could define a set of policies to have a project checked against. The result of a check would be either safe (passed), potentially safe (warning) or unsafe (failed), where each state would hold a custom defined set of licenses (e.g. permissive licenses for safe and any copyleft as unsafe) and they would be shown as such on the dashboard

@tdruez
Contributor

tdruez commented Oct 4, 2021

Hi @Atharex thanks for your input!

Here's some answers:

> Give it a try and let me know if it works correctly already

I'm actually curious to know if it works on your side first :)

> Another really great feature would be to be able to point scancode to a github repository (e.g. https://github.com/nexB/scancode.io) and have it automatically download the main branch/latest release and perform a license scan on it, without manually uploading the zip/tar files of the releases

You can paste URLs such as "https://github.com/nexB/scancode.io/archive/refs/heads/main.zip" or "https://github.com/nexB/scancode.io/archive/refs/tags/v21.9.6.zip" in the "Download URLs" field.

> Ideally both of the previous scenarios would support on-demand triggers and scheduled runs

One way to trigger Pipeline on demand is to use the REST API https://scancodeio.readthedocs.io/en/latest/rest-api.html#create-a-project
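As a rough sketch of what such an on-demand trigger could look like from Python (standard library only): the endpoint path and the field names (`name`, `input_urls`, `pipeline`, `execute_now`) are assumed from the REST API page linked above and should be checked against the installed version. The helper names, base URL, project name, and pipeline name are placeholders for illustration.

```python
import json
import urllib.request


def build_project_request(base_url, name, input_url, pipeline):
    """Build (but do not send) a POST request that creates a project."""
    # Field names are assumptions taken from the REST API documentation.
    payload = {
        "name": name,
        "input_urls": input_url,
        "pipeline": pipeline,
        "execute_now": True,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/projects/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_project(base_url, name, input_url, pipeline):
    """Send the request and return the decoded JSON response (network call)."""
    request = build_project_request(base_url, name, input_url, pipeline)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A CI job or webhook handler could then call something like `create_project("http://localhost", "my-scan", archive_zip_url, pipeline_name)` whenever a new image or release appears.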

> Also it would be great to have some sort of policy checks (e.g. like vulnerability policies in container vuln. scanners) where for each project (either a repository or docker image/VM) we could define a set of policies to have a project checked against. The result of a check would be either safe (passed), potentially safe (warning) or unsafe (failed), where each state would hold a custom defined set of licenses (e.g. permissive licenses for safe and any copyleft as unsafe) and they would be shown as such on the dashboard

There's support for license policies with a pass/warning/error system, see https://scancodeio.readthedocs.io/en/latest/scancodeio-settings.html#scancodeio-policies-file

[Image: Screenshot 2021-10-04 at 3 27 47 PM]

You can click on the chart to reach a list filtered by the compliance alert.
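
For readers who want to see how a pass/warning/error policy behaves, here is a minimal Python sketch. It is not ScanCode.io's implementation: the `LICENSE_POLICIES` table, the license keys chosen, and the helper names are hypothetical, and the real configuration lives in the policies file linked above.

```python
# Illustrative only: a hand-rolled policy table, not ScanCode.io code.
# Alert levels ordered from least to most severe.
SEVERITY = {"ok": 0, "warning": 1, "error": 2}

# Hypothetical policy: permissive -> ok, weak copyleft -> warning,
# strong copyleft -> error.
LICENSE_POLICIES = {
    "mit": "ok",
    "apache-2.0": "ok",
    "mpl-2.0": "warning",
    "gpl-3.0": "error",
}


def compliance_alert(license_key, default="warning"):
    """Return the alert for one license key; unknown licenses get `default`."""
    return LICENSE_POLICIES.get(license_key, default)


def project_compliance(license_keys):
    """Return the most severe alert found across all detected licenses."""
    alerts = [compliance_alert(key) for key in license_keys]
    return max(alerts, key=SEVERITY.__getitem__, default="ok")
```

Treating unknown licenses as a warning rather than a pass is a design choice of this sketch, so that anything outside the policy gets a human look.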

@Atharex
Author

Atharex commented Oct 4, 2021

> Hi @Atharex thanks for your input!
>
> Here's some answers:
>
> > Give it a try and let me know if it works correctly already
>
> I'm actually curious to know if it works on your side first :)

Yes, it works on my side. I'm just having problems because my security scanner complains about the vulnerabilities in the python:3.9 base image, so I also tried to use CentOS 7 as a base image. The vulnerabilities are not present in that case, but then the rpm_inspector binary stops working because some libraries are missing :/ (for example "GCRYPT_1.6 not found", even though I have the gcrypt libraries installed)
[Image: sample-vulns]

> > Another really great feature would be to be able to point scancode to a github repository (e.g. https://github.com/nexB/scancode.io) and have it automatically download the main branch/latest release and perform a license scan on it, without manually uploading the zip/tar files of the releases
>
> You can paste URLs such as "https://github.com/nexB/scancode.io/archive/refs/heads/main.zip" or "https://github.com/nexB/scancode.io/archive/refs/tags/v21.9.6.zip" in the "Download URLs" field.

True, but such URLs are not preserved across other repositories (e.g. https://github.com/kubernetes/kubernetes/archive/refs/heads/main.zip does not work and the releases are always named differently, so I would have to figure their name out in advance before giving in the input). In an environment where I want automated scheduled and triggered scanning runs, I would need to implement a separate workflow to figure out the valid URLs of the packages and then feed those URLs into scanpipe. Having to always manually find a repository's package URL and insert it into a scancode pipeline (even through a REST API, it is still an additional step) breaks the automation a bit.
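
One possible shape for that separate workflow is sketched below in Python (standard library only). It asks the public GitHub REST API for a repository's default branch and builds an archive URL from it. `archive_url` and `default_branch` are hypothetical helper names, unauthenticated API calls are rate limited, and none of this is part of ScanCode.io.

```python
import json
import urllib.request


def archive_url(owner, repo, ref, kind="heads"):
    """Build a GitHub zip archive URL for a branch ("heads") or a tag ("tags")."""
    if kind not in ("heads", "tags"):
        raise ValueError("kind must be 'heads' or 'tags'")
    return f"https://github.com/{owner}/{repo}/archive/refs/{kind}/{ref}.zip"


def default_branch(owner, repo):
    """Look up the default branch name via the GitHub REST API (network call)."""
    api_url = f"https://api.github.com/repos/{owner}/{repo}"
    request = urllib.request.Request(
        api_url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["default_branch"]
```

With these, `archive_url(owner, repo, default_branch(owner, repo))` avoids hard-coding `main` for every repository.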

> > Ideally both of the previous scenarios would support on-demand triggers and scheduled runs
>
> One way to trigger Pipeline on demand is to use the REST API https://scancodeio.readthedocs.io/en/latest/rest-api.html#create-a-project

For on-demand triggers yes, but I thought about a native support for scheduled runs, instead of having an external cron job triggering them.

> > Also it would be great to have some sort of policy checks (e.g. like vulnerability policies in container vuln. scanners) where for each project (either a repository or docker image/VM) we could define a set of policies to have a project checked against. The result of a check would be either safe (passed), potentially safe (warning) or unsafe (failed), where each state would hold a custom defined set of licenses (e.g. permissive licenses for safe and any copyleft as unsafe) and they would be shown as such on the dashboard
>
> There's support for license policies with a pass/warning/error system, see https://scancodeio.readthedocs.io/en/latest/scancodeio-settings.html#scancodeio-policies-file
>
> [Image: Screenshot 2021-10-04 at 3 27 47 PM]
>
> You can click on the chart to reach a list filtered by the compliance alert.

Nice, I did not find that feature! It is too hidden in my opinion... I believe it would deserve a more prominent location in your documentation :) preferably also as a tutorial.

@tdruez
Contributor

tdruez commented Oct 4, 2021

> True, but such URLs are not preserved across other repositories (e.g. https://github.com/kubernetes/kubernetes/archive/refs/heads/main.zip does not work and the releases are always named differently, so I would have to figure their name out in advance before giving in the input).

This should be handled by our code fetching library: aboutcode-org/fetchcode#38
Not sure what the progress is on this one.

> For on-demand triggers yes, but I thought about a native support for scheduled runs, instead of having an external cron job triggering them.

We're in the process of migrating from Celery to RQ. See #333
We should be able to leverage RQ's scheduling feature to implement this in SC.io: https://python-rq.org/docs/scheduling/#scheduling-jobs-for-execution
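
To make the RQ idea concrete, here is a small Python sketch; it is not ScanCode.io code. It relies on `Queue.enqueue_in` as described on the RQ scheduling page linked above. `schedule_pipeline_run` is a hypothetical helper name, and a worker started with the scheduler enabled (`rq worker --with-scheduler`) plus a reachable Redis server are assumed.

```python
from datetime import timedelta


def schedule_pipeline_run(queue, job_func, project_id, delay_hours=24):
    """Enqueue `job_func(project_id)` to run after `delay_hours`.

    `queue` is expected to be an `rq.Queue`; any object exposing a
    compatible `enqueue_in(time_delta, func, *args)` method also works.
    `job_func` must be importable by the RQ worker process.
    """
    return queue.enqueue_in(timedelta(hours=delay_hours), job_func, project_id)
```

Note that `enqueue_in` schedules a single delayed run; a recurring schedule would need the job to re-enqueue itself or some other mechanism on top.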

> Nice, I did not find that feature! It is too hidden in my opinion... I believe it would deserve a more prominent location in your documentation :) preferably also as a tutorial.

Absolutely, entered as #337

@aalexanderr
Contributor

aalexanderr commented Oct 4, 2021

@Atharex we're deploying scancode.io in k8s too!
We used https://kompose.io/ to transform docker-compose.yaml to k8s definitions

The definitions we have are currently a bit of a mess, as it's still a work in progress, but we can open a PR when ready (I'll just say it should be soon™, as each of my time estimates has been blowing up in my face really hard lately)

Our deployment includes OIDC from #336 & keycloak

@Atharex
Author

Atharex commented Oct 4, 2021

Sounds good @aalexanderr I'm looking forward to an official k8s manifest 👍

Regarding OIDC user management, it does sound nice. I'd love to see some Azure Active Directory integration there, but including keycloak in the deployment might be too much for ppl who already have an existing instance or another provider like Okta running in their infrastructure.

@aalexanderr
Contributor

Here is our deployment. It's not on par with typical Helm charts yet (e.g. no quick way to use your own components rather than the Bitnami ones, etc.), but it works and should get better with time :)
https://github.com/xerrni/scancode-kube
