The OSS Review Toolkit (ORT) aims to assist with the tasks that commonly need to be performed in the context of license compliance checks, especially for (but not limited to) Free and Open Source Software dependencies.
It does so by orchestrating a highly customizable pipeline of tools that abstract away the underlying services. These tools are implemented as libraries (for programmatic use) and exposed via a command line interface (for scripted use):
- Analyzer - determines the dependencies of projects and their meta-data, abstracting which package managers or build systems are actually being used.
- Downloader - fetches all source code of the projects and their dependencies, abstracting which Version Control System (VCS) or other means are used to retrieve the source code.
- Scanner - uses configured source code scanners to detect license / copyright findings, abstracting the type of scanner.
- Advisor - retrieves security advisories for used dependencies from configured vulnerability data services.
- Evaluator - evaluates license / copyright findings against customizable policy rules and license classifications.
- Reporter - presents results in various formats such as visual reports, Open Source notices or Bill-Of-Materials (BOMs) to easily identify dependencies, licenses, copyrights or policy rule violations.
Preliminary binary artifacts for ORT are currently available via JitPack. Please note that due to limitations with the JitPack build environment, the reporter is not able to create the Web App report.
Install the following basic prerequisites:
- Git (any recent version will do).
Then clone this repository. If you intend to run tests, you need to clone with submodules by running `git clone --recurse-submodules`. If you have already cloned non-recursively, you can initialize submodules afterwards by running `git submodule update --init --recursive`.
Install the following basic prerequisites:
- Docker 18.09 or later (and ensure its daemon is running).
- Enable BuildKit for Docker.
Change into the directory with ORT's source code and run `docker build -t ort .`.
Install these additional prerequisites:
- Java Development Kit (JDK) version 11 or later; also remember to set the `JAVA_HOME` environment variable accordingly.
Change into the directory with ORT's source code and run `./gradlew installDist` (on the first run this will bootstrap Gradle and download all required dependencies).
ORT can now be run using
./cli/build/install/ort/bin/ort --help
Note that if you make any changes to ORT's source code, you would have to regenerate the distribution using the steps above.
To avoid that, you can also build and run ORT in one go (if you have the prerequisites from the Build natively section installed):
./gradlew cli:run --args="--help"
Note that in this case the working directory used by ORT is that of the `cli` project, not the directory `gradlew` is located in (see gradle/gradle#6074).
As with building ORT from sources, you can either run ORT from a Docker image (which comes with all runtime dependencies) or run ORT natively (in which case some additional requirements need to be fulfilled).
After you have built the image as described above, simply run `docker run <DOCKER_ARGS> ort <ORT_ARGS>`. You typically use `<DOCKER_ARGS>` to mount the project directory to analyze into the container for ORT to access it, like:
docker run -v /workspace:/project ort --info analyze -f JSON -i /project -o /project/ort/analyzer
You can find further hints for using ORT with Docker in the documentation.
First of all, make sure that the locale of your system is set to `en_US.UTF-8`, as using other locales might lead to issues with parsing the output of some external tools.
Then install any missing external command line tools as listed by
./cli/build/install/ort/bin/ort requirements
or
./gradlew cli:run --args="requirements"
Then run ORT like
./cli/build/install/ort/bin/ort --info analyze -f JSON -i /project -o /project/ort/analyzer
or
./gradlew cli:run --args="--info analyze -f JSON -i /project -o /project/ort/analyzer"
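For scripted use, the analyzer invocation shown above can be assembled programmatically. The following is only a minimal sketch: the helper names `analyze_command` and `run_analyzer` are hypothetical, and the flags are exactly those used in the example command line above.

```python
import subprocess


def analyze_command(ort_binary, project_dir, output_dir, output_format="JSON"):
    """Build the argument list for the analyzer call shown in this section.

    The function name and parameters are illustrative, not part of ORT.
    """
    return [
        ort_binary,
        "--info",
        "analyze",
        "-f", output_format,
        "-i", project_dir,
        "-o", output_dir,
    ]


def run_analyzer(ort_binary, project_dir, output_dir):
    """Run the analyzer and raise CalledProcessError on a non-zero exit code."""
    command = analyze_command(ort_binary, project_dir, output_dir)
    return subprocess.run(command, check=True)
```

Calling `run_analyzer("./cli/build/install/ort/bin/ort", "/project", "/project/ort/analyzer")` would then correspond to the native command line above. Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.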
A basic ORT pipeline (using the analyzer, scanner and reporter) can easily be run on Jenkins CI by using the Jenkinsfile in a (declarative) pipeline job. Please see the Jenkinsfile itself for documentation of the required Jenkins plugins. The job accepts various parameters that are translated to ORT command line arguments. Additionally, one can trigger a downstream job which e.g. further processes scan results. Note that it is the downstream job's responsibility to copy any artifacts it needs from the upstream job.
Please see Getting Started for an introduction to the individual tools.
ORT supports several environment variables that influence its behavior:
Name | Default value | Purpose |
---|---|---|
ORT_DATA_DIR | ~/.ort | All data, like caches, archives, storages (read & write) |
ORT_CONFIG_DIR | $ORT_DATA_DIR/config | Configuration files, see below (read only) |
ORT_HTTP_USERNAME | Empty (n/a) | Generic username to use for HTTP(S) downloads |
ORT_HTTP_PASSWORD | Empty (n/a) | Generic password to use for HTTP(S) downloads |
http_proxy | Empty (n/a) | Proxy to use for HTTP downloads |
https_proxy | Empty (n/a) | Proxy to use for HTTPS downloads |
ORT looks for its configuration files in the directory pointed to by the `ORT_CONFIG_DIR` environment variable. If this variable is not set, it defaults to the `config` directory below the directory pointed to by the `ORT_DATA_DIR` environment variable, which in turn defaults to the `.ort` directory below the current user's home directory.
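The fallback chain described above can be illustrated with a short sketch. This is not ORT's actual implementation; the function name `resolve_ort_dirs` is hypothetical, and treating an empty variable the same as an unset one is an assumption made here.

```python
import os
from pathlib import Path


def resolve_ort_dirs(env=None, home=None):
    """Return (data_dir, config_dir) following the documented defaults.

    Assumption: an empty variable is treated like an unset one.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # ORT_DATA_DIR defaults to the .ort directory below the user's home.
    data_dir = Path(env["ORT_DATA_DIR"]) if env.get("ORT_DATA_DIR") else home / ".ort"

    # ORT_CONFIG_DIR defaults to the config directory below the data directory.
    if env.get("ORT_CONFIG_DIR"):
        config_dir = Path(env["ORT_CONFIG_DIR"])
    else:
        config_dir = data_dir / "config"

    return data_dir, config_dir
```

Note that setting only `ORT_DATA_DIR` also moves the configuration directory, whereas setting only `ORT_CONFIG_DIR` leaves the data directory at its default.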
The following provides an overview of the various configuration files that can be used to customize ORT behavior:
The main configuration file for the operation of ORT. This configuration is maintained by an administrator who manages the ORT instance. In contrast to the configuration files in the following, this file rarely changes once ORT is operational.
Format | Scope | Default location | Default value |
---|---|---|---|
HOCON | Global | $ORT_CONFIG_DIR/ort.conf | Empty (built-in) |
The reference configuration file gives a good impression of the content of the main ORT configuration file. It consists of sections related to different sub components of ORT. The meaning of these sections and the properties they can contain is described together with the corresponding sub components.
While the file is rather static, there are means to override configuration options for a specific run of ORT or to customize the configuration to a specific environment. The following options are supported, in order of precedence:
- Properties can be defined via environment variables by using the full property path as the variable name. For instance, one can override the Postgres schema by setting `ort.scanner.storages.postgresStorage.schema=test_schema`. The variable's name is case sensitive. Some programs like Bash do not support dots in variable names. For this case, the dots can be replaced by double underscores, i.e., the above example is turned into `ort__scanner__storages__postgresStorage__schema=test_schema`.
- In addition to that, one can override the values of properties on the command line using the `-P` option. The option expects a key-value pair. Again, the key must define the full path to the property to be overridden, e.g. `-P ort.scanner.storages.postgresStorage.schema=test_schema`. The `-P` option can be repeated on the command line to override multiple properties.
- Properties in the configuration file can reference environment variables using the syntax `${VAR}`. This is especially useful to reference dynamic or sensitive data. As an example, the credentials for the Postgres database used as scan results storage could be defined in the `POSTGRES_USERNAME` and `POSTGRES_PASSWORD` environment variables. The configuration file can then reference these values as follows:

      postgres {
        url = "jdbc:postgresql://your-postgresql-server:5444/your-database"
        username = ${POSTGRES_USERNAME}
        password = ${POSTGRES_PASSWORD}
      }
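The naming conventions for overrides can be made concrete with a small sketch. The helpers below are hypothetical and only mirror the rules stated above (dots replaced by double underscores, `-P` taking a key-value pair); how ORT itself parses these values is not shown here.

```python
def to_env_name(property_path):
    """Replace dots by double underscores for shells that reject dots in names."""
    return property_path.replace(".", "__")


def to_property_path(env_name):
    """Turn a double-underscore variable name back into a property path."""
    return env_name.replace("__", ".")


def parse_property_option(argument):
    """Split the key-value pair given to -P at the first '=' sign."""
    key, separator, value = argument.partition("=")
    if not separator or not key:
        raise ValueError(f"expected KEY=VALUE, got {argument!r}")
    return key, value
```

For example, `to_env_name("ort.scanner.storages.postgresStorage.schema")` yields the Bash-compatible name from the first item. The round trip is only unambiguous as long as property names do not themselves contain double underscores.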
A list of copyright statements that are considered garbage, for example statements that were incorrectly classified as copyrights by the scanner.
Format | Scope | Default location | Default value |
---|---|---|---|
YAML / JSON | Global | $ORT_CONFIG_DIR/copyright-garbage.yml | Empty (n/a) |
A file to correct invalid or missing package metadata, and to set the concluded license for packages.
Format | Scope | Default location | Default value |
---|---|---|---|
YAML / JSON | Global | $ORT_CONFIG_DIR/curations.yml | Empty (n/a) |
A directory that contains license texts which are not provided by ORT.
Format | Scope | Default location | Default value |
---|---|---|---|
Text | Global | $ORT_CONFIG_DIR/custom-license-texts/ | Empty (n/a) |
A Kotlin script that enables the injection of how-to-fix texts in markdown format for ORT issues into the reports.
Format | Scope | Default location | Default value |
---|---|---|---|
Kotlin script | Global | $ORT_CONFIG_DIR/how-to-fix-text-provider.kts | Empty (n/a) |
A file that contains user-defined categorization of licenses.
Format | Scope | Default location | Default value |
---|---|---|---|
YAML / JSON | Global | $ORT_CONFIG_DIR/license-classifications.yml | Empty (n/a) |
Configurations to resolve any issues or rule violations by providing a mandatory reason, and an optional comment to justify the resolution on a global scale.
Format | Scope | Default location | Default value |
---|---|---|---|
YAML / JSON | Global | $ORT_CONFIG_DIR/resolutions.yml | Empty (n/a) |
A configuration file, usually stored in the project's repository, for license finding curations, exclusions, and issues or rule violations resolutions in the context of the repository.
Format | Scope | Default location | Default value |
---|---|---|---|
YAML / JSON | Repository (project) | [analyzer-input-dir]/.ort.yml | Empty (n/a) |
A single file or a directory with multiple files containing configurations to set provenance-specific path excludes and license finding curations for dependency packages to address issues found within a scan result. The `helper-cli`'s `GeneratePackageConfigurationsCommand` can be used to populate a directory with template package configuration files.
Format | Scope | Default location | Default value |
---|---|---|---|
YAML / JSON | Package (dependency) | $ORT_CONFIG_DIR/package-configurations/ | Empty (n/a) |
The file containing any policy rule implementations to be used with the evaluator.
Format | Scope | Default location | Default value |
---|---|---|---|
Kotlin script (DSL) | Evaluator | $ORT_CONFIG_DIR/rules.kts | Empty (n/a) |
The analyzer is a Software Composition Analysis (SCA) tool that determines the dependencies of software projects inside the specified input directory (`-i`). It does so by querying the detected package managers; no modifications to your existing project source code, like applying build system plugins, are necessary for that to work. The tree of transitive dependencies per project is written out as part of an OrtResult in YAML (or JSON, see `-f`) format to a file named `analyzer-result.yml` in the specified output directory (`-o`). The output file exactly documents the status quo of all package-related meta-data. It can be further processed or manually edited before passing it to one of the other tools.
Currently, the following package managers are supported:
- Bower (JavaScript)
- Bundler (Ruby)
- Cargo (Rust)
- Carthage (iOS / Cocoa)
- Composer (PHP)
- Conan (C / C++, experimental as the VCS locations often do not contain the actual source code, see issue #2037)
- dep (Go)
- DotNet (.NET, with currently some limitations)
- Glide (Go)
- Godep (Go)
- GoMod (Go, experimental as only proxy-based source artifacts but no VCS locations are supported)
- Gradle (Java)
- Maven (Java)
- NPM (Node.js)
- NuGet (.NET, with currently some limitations)
- PIP (Python)
- Pipenv (Python)
- Pub (Dart / Flutter)
- SBT (Scala)
- SPDX (SPDX documents used to describe projects or packages)
- Stack (Haskell)
- Yarn (Node.js)
Taking an ORT result file with an analyzer result as the input (`-i`), the downloader retrieves the source code of all contained packages to the specified output directory (`-o`). The downloader takes care of things like normalizing URLs and using the appropriate VCS tool to check out source code from version control.
Currently, the following Version Control Systems (VCS) are supported:
This tool wraps underlying license / copyright scanners with a common API so all supported scanners can be used in the same way to easily run them and compare their results. If passed an ORT result file with an analyzer result (`-i`), the scanner will automatically download the sources of the dependencies via the downloader and scan them afterwards.
We recommend using ORT with one of the following scanners as their integration has been thoroughly tested:
Additionally, the following reference implementations exist:
For a comparison of some of these, see this Bachelor Thesis.
In order to not download or scan any previously scanned sources again, or to reuse scan results generated via other services, the scanner can be configured to use so-called storage backends. Before processing a package, it checks whether compatible scan results are already available in one of the storages declared; if this is the case, they are fetched and reused. Otherwise, the package's source code is downloaded and scanned. Afterwards, the new scan results can be put into a storage for later reuse.
It is possible to configure multiple storages to read scan results from or to write scan results to. For reading, the declaration order in the configuration is important, as the scanner queries the storages in this order and uses the first matching result. This allows a fine-grained control over the sources, from which existing scan results are loaded. For instance, you can specify that the scanner checks first whether results for a specific package are available in a local storage on the file system. If this is not the case, it can look up the package in a Postgres database. If this does not yield any results either, a service like ClearlyDefined can be queried. Only if all of these steps fail, the scanner has to actually process the package.
When storing a newly generated scan result the scanner invokes all the storages declared as writers. The storage operation is considered successful if all writer storages could successfully persist the scan result.
The configuration of storage backends is located in the ORT configuration file. (For the general structure of this file and the set of options available refer to the reference configuration.) The file has a section named storages that lists all the storage backends and assigns them a name. Each storage backend is of a specific type and needs to be configured with type-specific properties. The different types of storage backends supported by ORT are described below.
After the declaration of the storage backends, the configuration file has to specify which ones of them the scanner should use for looking up existing scan results or to store new results. This is done in two list properties named storageReaders and storageWriters. The lists reference the names of the storage backends declared in the storages section. The scanner invokes the storage backends in the order they appear in the lists; so for readers, this defines a priority for look-up operations. Each storage backend can act as a reader; however, some types do not support updates and thus cannot serve as writers. If a storage backend is referenced both as reader and writer, the scanner creates only a single instance of this storage class.
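The read and write behavior described above can be sketched as follows. This is only an illustration under stated assumptions: `InMemoryStorage`, `read_scan_result` and `write_scan_result` are hypothetical names, the storages are plain dictionaries, and the check for "compatible" scan results is reduced to a lookup by package identifier.

```python
class InMemoryStorage:
    """Hypothetical stand-in for a storage backend, keyed by package identifier."""

    def __init__(self, name, writable=True):
        self.name = name
        self.writable = writable
        self.results = {}

    def read(self, package_id):
        return self.results.get(package_id)

    def write(self, package_id, result):
        if not self.writable:
            return False  # e.g. a read-only backend cannot persist results
        self.results[package_id] = result
        return True


def read_scan_result(readers, package_id):
    """Query readers in declaration order and return the first match."""
    for storage in readers:
        result = storage.read(package_id)
        if result is not None:
            return storage.name, result
    return None  # no storage had a result, so the package must be scanned


def write_scan_result(writers, package_id, result):
    """Invoke every writer; succeed only if all of them persisted the result."""
    # A list (not a generator) ensures all writers are called even after a failure.
    outcomes = [storage.write(package_id, result) for storage in writers]
    return all(outcomes)
```

With readers declared as local file storage, then Postgres, then ClearlyDefined, a result that exists in both Postgres and ClearlyDefined is taken from Postgres, since it comes first in the list.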
The following subsections describe the different storage backend implementations supported by ORT. Note that the name of a storage entry (like `fileBasedStorage`) can be freely chosen. That name is then used to refer to the storage from the `storageReaders` and `storageWriters` sections.
By default, the scanner stores scan results on the local file system in the current user's home directory (i.e. `~/.ort/scanner/scan-results`) for later reuse. Settings like the storage directory and the compression flag can be customized in the ORT configuration file (`-c`) with a respective storage configuration:
ort {
  scanner {
    storages {
      fileBasedStorage {
        backend {
          localFileStorage {
            directory = "/tmp/ort/scan-results"
            compression = false
          }
        }
      }
    }
    storageReaders: [
      "fileBasedStorage"
    ]
    storageWriters: [
      "fileBasedStorage"
    ]
  }
}
Any HTTP file server can be used to store scan results. Custom headers can be configured to provide authentication credentials. For example, to use Artifactory to store scan results, use the following configuration:
ort {
  scanner {
    storages {
      artifactoryStorage {
        backend {
          httpFileStorage {
            url = "https://artifactory.domain.com/artifactory/repository/scan-results"
            headers {
              X-JFrog-Art-Api = "api-token"
            }
          }
        }
      }
    }
    storageReaders: [
      "artifactoryStorage"
    ]
    storageWriters: [
      "artifactoryStorage"
    ]
  }
}
To use PostgreSQL for storing scan results, you need at least version 9.4, a database with the `client_encoding` set to `UTF8`, and a configuration like the following:
ort {
  scanner {
    storages {
      postgresStorage {
        url = "jdbc:postgresql://example.com:5444/database"
        schema = "schema"
        username = "username"
        password = "password"
        sslmode = "verify-full"
      }
    }
    storageReaders: [
      "postgresStorage"
    ]
    storageWriters: [
      "postgresStorage"
    ]
  }
}
While the specified schema already needs to exist, the scanner will itself create a table called `scan_results` and store the data in a jsonb column.
If you do not want to use SSL, set the `sslmode` to `disable`; other possible values are explained in the documentation. For other supported configuration options see ScanStorageConfiguration.kt.
ClearlyDefined is a service offering curated metadata for Open Source components. This includes scan results that can be used by ORT's scanner tool (if they have been generated by a compatible scanner version with a suitable configuration). This storage backend queries the ClearlyDefined service for scan results of the packages to be processed. It is read-only; so it will not upload any new scan results to ClearlyDefined. In the configuration the URL of the ClearlyDefined service needs to be set:
ort {
  scanner {
    storages {
      clearlyDefined {
        serverUrl = "https://api.clearlydefined.io"
      }
    }
    storageReaders: [
      "clearlyDefined"
    ]
  }
}
The advisor retrieves security advisories from configured services. It requires the analyzer result as an input.
The advisor needs to be configured in the ORT configuration file:
ort {
  advisor {
    nexusiq {
      serverUrl = "https://nexusiq.ossreviewtoolkit.org"
      username = myUser
      password = myPassword
    }
  }
}
Currently Nexus IQ Server (`-a NexusIQ`) is the only supported security data provider.
The evaluator is used to perform custom license policy checks on scan results. The rules to check against are implemented as scripts (currently Kotlin scripts with a dedicated DSL, but support for other scripting languages can be added as well). See rules.kts for an example file.
The reporter generates a wide variety of documents in different formats from ORT result files. Currently, the following formats are supported (reporter names are case-insensitive):
- Amazon OSS Attribution Builder document (experimental, `-f AmazonOssAttributionBuilder`)
- AsciiDoc Template (`-f AsciiDocTemplate`)
  - Content customizable with Apache Freemarker templates and AsciiDoc
  - Supports all AsciiDoc backends
  - PDF style customizable with Asciidoctor PDF themes
- CycloneDX BOM (`-f CycloneDx`)
- Excel sheet (`-f Excel`)
- GitLabLicenseModel (`-f GitLabLicenseModel`)
  - A nice tutorial video has been published by GitLab engineer @mokhan.
- NOTICE file in two variants
  - List license texts and copyrights by package (`-f NoticeTemplate`)
  - Summarize all license texts and copyrights (`-f NoticeTemplate -O NoticeTemplate=template.id=summary`)
  - Customizable with Apache Freemarker templates
- SPDX Document, version 2.2 (`-f SpdxDocument`)
- Static HTML (`-f StaticHtml`)
- Web App (`-f WebApp`)
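Since reporter names are case-insensitive, a value passed to `-f` can be matched regardless of how it is capitalized. The sketch below illustrates the idea only; `find_reporter` is a hypothetical helper, and the name list simply repeats the `-f` values from the list above.

```python
# The -f values listed above, in their documented spelling.
REPORTER_NAMES = [
    "AmazonOssAttributionBuilder",
    "AsciiDocTemplate",
    "CycloneDx",
    "Excel",
    "GitLabLicenseModel",
    "NoticeTemplate",
    "SpdxDocument",
    "StaticHtml",
    "WebApp",
]


def find_reporter(requested):
    """Return the documented spelling for a name given in any case, or None."""
    by_lowercase = {name.lower(): name for name in REPORTER_NAMES}
    return by_lowercase.get(requested.lower())
```

With this lookup, `find_reporter("statichtml")` and `find_reporter("StaticHTML")` both resolve to `StaticHtml`, while an unknown name yields `None`.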
ORT is being continuously used on Linux, Windows and macOS by the core development team, so these operating systems are considered to be well supported.
To run the ORT binaries (also see Installation from binaries) at least Java 11 is required. Memory and CPU requirements vary depending on the size and type of project(s) to analyze / scan, but the general recommendation is to configure Java with 8 GiB of memory (`-Xmx8g`) and to use a CPU with at least 4 cores.
If ORT requires external tools in order to analyze a project, these tools are listed by the `ort requirements` command. If a package manager is not listed there, support for it is integrated directly into ORT and does not require any external tools to be installed.
ORT is written in Kotlin and uses Gradle as the build system, with Kotlin script instead of Groovy as the DSL.
When developing on the command line, use the committed Gradle wrapper to bootstrap Gradle in the configured version and execute any given tasks. The most important tasks for this project are:
Task | Purpose |
---|---|
assemble | Build the JAR artifacts for all projects |
detekt | Run static code analysis on all projects |
test | Run unit tests for all projects |
funTest | Run functional tests for all projects |
installDist | Build all projects and install the start scripts for distribution |
All contributions need to pass the `detekt`, `test` and `funTest` checks before they can be merged.
For IDE development we recommend the IntelliJ IDEA Community Edition which can directly import the Gradle build files. After cloning the project's source code recursively, simply run IDEA and use the following steps to import the project.
- From the wizard dialog: Select Import Project. From a running IDEA instance: Select File -> New -> Project from Existing Sources...
- Browse to ORT's source code directory and select either the `build.gradle.kts` or the `settings.gradle.kts` file.
- In the Import Project from Gradle dialog select Use auto-import and leave all other settings at their defaults.
To set up a basic run configuration for debugging, navigate to `OrtMain.kt` in the `cli` module and look for the `fun main(args: Array<String>)` function. In the gutter next to it, a green "Play" icon should be displayed. Click on it and select `Run 'org.ossreviewtoolkit.Main'` to run the entry point, which implicitly creates a run configuration. Double-check that running ORT without any arguments will simply show the command line help in IDEA's Run tool window. Finally, edit the created run configuration to your needs, e.g. by adding an argument and options to run a specific ORT sub-command.
For running tests and individual test cases from the IDE, the kotest plugin needs to be installed. Afterwards tests can be run via the green "Play" icon from the gutter as described above.
Copyright (C) 2017-2020 HERE Europe B.V.
See the LICENSE file in the root of this project for license details.
OSS Review Toolkit (ORT) is a Linux Foundation project and part of ACT.