Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce a JSON build manifest #10761

Merged
merged 5 commits into from
Jun 30, 2023
Merged

Conversation

matejak
Copy link
Member

@matejak matejak commented Jun 27, 2023

Description:

Summarize product's built content in a JSON file.
The format isn't definitive, but the PR illustrates that it is easy to provide certain key information about content from compiled artifacts.

Rationale:

Although right now, a datastream can be used as a single source of truth, it is difficult to parse it. The JSON format is much better in this regard, and it allows e.g. machine-readable high-level comparison of two different content releases.
An accompanying tool able to compare contents of two versions of ComplianceAsCode will be added to this PR later.

Review Hints:

Build a product, and examine the manifest.json in the build directory.

@matejak matejak added this to the 0.1.69 milestone Jun 27, 2023
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Used by openshift-ci bot. label Jun 27, 2023
@openshift-ci
Copy link

openshift-ci bot commented Jun 27, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@github-actions
Copy link

Start a new ephemeral environment with changes proposed in this pull request:

Fedora Environment
Open in Gitpod

Oracle Linux 8 Environment
Open in Gitpod

Copy link
Collaborator

@jan-cerny jan-cerny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main concern is that the process of generating a data stream is complex and therefore there can be some difference between the manifest generated this way and the actual contents of data stream.

expected_filename = os.path.join(main_dir, expected_filename_template.format(id=rid))
if os.path.exists(expected_filename):
contents.add(content_id)
output_dict[rid] = list(contents)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, the list under the rule id is often empty, even if the remediation files exist.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't reproduce this, could you provide an example?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can reproduce it using ./build_product rhel9 . There is a large amount of examples basically you can pick any rule that has this list empty in the manifest, eg. package_sudo_installed, aide_periodic_cron_checking, partition_for_tmp, and you will find that in fact there exists a bash or ansible.

add_custom_command(
OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/manifest.json"
COMMAND env "PYTHONPATH=$ENV{PYTHONPATH}" "${PYTHON_EXECUTABLE}" "${SSG_BUILD_SCRIPTS}/generate_manifest.py" --output "${CMAKE_CURRENT_BINARY_DIR}/manifest.json" --build-root "${CMAKE_CURRENT_BINARY_DIR}"
DEPENDS "${CMAKE_CURRENT_BINARY_DIR}/cpe-oval-unlinked.xml"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the dependency cpe-oval-unlinked?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean: shouldn't this cmake command depend on a different target?

Copy link
Member Author

@matejak matejak Jun 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the generator uses compiled artifacts. While most of artifacts are compiled at the beginning of the build, the per-rule OVALs are made available only after that step that is marked as a dependency.
I will add a comment to clarify that, it is indeed confusing.
Moreover, it doesn't appear to be working.

@matejak
Copy link
Member Author

matejak commented Jun 28, 2023

My main concern is that the process of generating a data stream is complex and therefore there can be some difference between the manifest generated this way and the actual contents of data stream.

Yes, this manifest definitely reflects capabilities of the project. Although we are pretty sure that whatever is compiled makes its way to the datastream, it is not guaranteed.
OTOH, the manifest can remain stable regardless of what output types are produced, and datastreams won't be the default output forever. On the top of it, manifests are extremely easy to produce and to parse, which makes them attractive.

matejak added 3 commits June 29, 2023 10:59
Collect built artifacts, and save the collection to a json file
that represents the structure of the content available for the built
product in a form that is more easy to process than the
scanner-facing artifacts s.a. datastreams.
jan-cerny added a commit to jan-cerny/scap-security-guide that referenced this pull request Jun 29, 2023
This commit introduces a new script compare_versions.py which can show
differences between two ComplianceAsCode versions.  Lists added or
removed rules, profiles, changes in profile composition and changes in
remediations and platforms.  For comparison, you can use git tags or
ComplianceAsCode JSON manifest files directly.  This new feature
leverages the ComplianceAsCode JSON manifests, which have been
introduced by ComplianceAsCode#10761
@jan-cerny jan-cerny self-assigned this Jun 30, 2023
@jan-cerny jan-cerny added the Highlight This PR/Issue should make it to the featured changelog. label Jun 30, 2023
OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/manifest-${PRODUCT}.json"
COMMAND env "PYTHONPATH=$ENV{PYTHONPATH}" "${PYTHON_EXECUTABLE}" "${SSG_BUILD_SCRIPTS}/generate_manifest.py" --output "${CMAKE_CURRENT_BINARY_DIR}/manifest-${PRODUCT}.json" --build-root "${CMAKE_CURRENT_BINARY_DIR}"
# The manifest requires compiled artifacts on right places and also per-rule OVAL.
# It is not clear when those things assume their places, so manifest is compiled late
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation problem. It causes a CI fail in the cmakelint test.

@matejak matejak marked this pull request as ready for review June 30, 2023 08:59
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Used by openshift-ci bot. label Jun 30, 2023
@codeclimate
Copy link

codeclimate bot commented Jun 30, 2023

Code Climate has analyzed commit f215139 and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 100.0% (50% is the threshold).

This pull request will bring the total coverage in the repository to 53.5% (0.0% change).

View more on Code Climate.

Copy link
Collaborator

@jan-cerny jan-cerny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have generated and reviewed the manifest for RHEL 9 and the data in there seem to be correct.

@jan-cerny jan-cerny merged commit e2bfb40 into ComplianceAsCode:master Jun 30, 2023
jan-cerny added a commit to jan-cerny/scap-security-guide that referenced this pull request Jun 30, 2023
This commit introduces a new script compare_versions.py which can show
differences between two ComplianceAsCode versions.  Lists added or
removed rules, profiles, changes in profile composition and changes in
remediations and platforms.  For comparison, you can use git tags or
ComplianceAsCode JSON manifest files directly.  This new feature
leverages the ComplianceAsCode JSON manifests, which have been
introduced by ComplianceAsCode#10761
jan-cerny added a commit to jan-cerny/scap-security-guide that referenced this pull request Jul 3, 2023
This commit introduces a new script compare_versions.py which can show
differences between two ComplianceAsCode versions.  Lists added or
removed rules, profiles, changes in profile composition and changes in
remediations and platforms.  For comparison, you can use git tags or
ComplianceAsCode JSON manifest files directly.  This new feature
leverages the ComplianceAsCode JSON manifests, which have been
introduced by ComplianceAsCode#10761
wokis pushed a commit to wokis/ComplianceAsCode that referenced this pull request Jul 7, 2023
This commit introduces a new script compare_versions.py which can show
differences between two ComplianceAsCode versions.  Lists added or
removed rules, profiles, changes in profile composition and changes in
remediations and platforms.  For comparison, you can use git tags or
ComplianceAsCode JSON manifest files directly.  This new feature
leverages the ComplianceAsCode JSON manifests, which have been
introduced by ComplianceAsCode#10761
rhmdnd pushed a commit to openshift/cac-content-fork that referenced this pull request Jul 21, 2023
This commit introduces a new script compare_versions.py which can show
differences between two ComplianceAsCode versions.  Lists added or
removed rules, profiles, changes in profile composition and changes in
remediations and platforms.  For comparison, you can use git tags or
ComplianceAsCode JSON manifest files directly.  This new feature
leverages the ComplianceAsCode JSON manifests, which have been
introduced by ComplianceAsCode/content#10761
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Highlight This PR/Issue should make it to the featured changelog.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants