Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OTA-1348: Fix dkrv2_openshift_secondary_metadata_scraper openshift_secondary_metadata_extraction test #978

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -15,11 +15,14 @@ use self::cincinnati::plugins::prelude_plugin_impl::*;
use tokio::sync::Mutex as FuturesMutex;

pub static DEFAULT_OUTPUT_ALLOWLIST: &[&str] = &[
"/LICENSE$",
"/channels/.+\\.ya+ml$",
"/blocked-edges/.+\\.ya+ml$",
"/raw/metadata.json$",
"/version$",
// The patterns may be potentially matched against different relative
// paths depending on the underlying decoder or the specific archive.
// for example: `archive/version` or `version`
"(^|/)LICENSE$",
"(^|/)channels/.+\\.ya+ml$",
"(^|/)blocked-edges/.+\\.ya+ml$",
"(^|/)raw/metadata.json$",
"(^|/)version$",
Comment on lines +21 to +25
Copy link
Contributor

@PratikMahajan PratikMahajan Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm particularly concerned about this change.
we've similar code fragment in the other secondary parser https://github.com/openshift/cincinnati/blob/master/cincinnati/src/plugins/internal/graph_builder/github_openshift_secondary_metadata_scraper/plugin.rs#L13-L19

does this change affect the outcome of unit/e2es etc ?

Copy link
Contributor Author

@DavidHurta DavidHurta Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When debugging the unit tests I have written down my findings in the 005098f commit:

...
Currently, the regex patterns inside the
dkrv2_openshift_secondary_metadata_scraper plugin are matched
against the archive files (e.g. version); however, regex
patterns inside the github_openshift_secondary_metadata_scraper
plugin are matched against the files prefixed with the archive
directory (e.g. archive/version). This is most likely dependent
on the underlying decoder library or the specific archive.
...

I did not have much luck in understanding exactly why. I have concluded that the chosen decoder or the archive likely influences the functionality.

Let me run the CI on the PR changes excluding this one to check e2e 👍 Without this change, the unit tests locally did not pass for me as nothing was getting matched by the original DEFAULT_OUTPUT_ALLOWLIST regex patterns.

Copy link
Contributor Author

@DavidHurta DavidHurta Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The e2e job passes with and without this change. Check PR test history. Although, I do not know what the e2e job is testing.

The cargo test job starts failing with panicked at 'no files were extracted':

thread 'plugins::internal::graph_builder::dkrv2_openshift_secondary_metadata_scraper::plugin::network_tests::openshift_secondary_metadata_extraction' panicked at 'no files were extracted', cincinnati/src/plugins/internal/graph_builder/dkrv2_openshift_secondary_metadata_scraper/plugin.rs:723:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

This means the current regex patterns on master were not able to match anything, and thus nothing was extracted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The c466179 in the full PR test history reverted the commits regarding the modifications to the DEFAULT_OUTPUT_ALLOWLIST variable.

I am dropping the reverts.

];

pub static DEFAULT_METADATA_IMAGE_REGISTRY: &str = "";
Expand Down Expand Up @@ -189,7 +192,7 @@ impl DkrV2OpenshiftSecondaryMetadataScraperPlugin {
let data_dir = tempfile::tempdir_in(&settings.output_directory)?;

let mut ns_registry = settings.registry.clone();
ns_registry.push_str(&settings.repository);
ns_registry.push_str(&format!("/{}", settings.repository));

let registry = registry::Registry::try_from_str(&ns_registry)
.context(format!("trying to extract Registry from {}", &ns_registry))?;
Expand Down Expand Up @@ -459,6 +462,171 @@ impl DkrV2OpenshiftSecondaryMetadataScraperPlugin {
}
}

#[cfg(test)]
mod tests {
use crate::plugins::internal::dkrv2_openshift_secondary_metadata_scraper::plugin::DEFAULT_OUTPUT_ALLOWLIST;

#[test]
fn default_output_allowlist() {
struct TestCase {
name: String,
filepath: String,
expected_match: bool,
}
let test_cases = vec![
// Expected Match == True
// Required files from the graph-data repository
TestCase {
name: String::from("Should match a required file"),
filepath: String::from("LICENSE"),
expected_match: true,
},
TestCase {
name: String::from("Should match a required type of files"),
filepath: String::from("channels/name-x.y.z.yaml"),
expected_match: true,
},
TestCase {
name: String::from("Should match a required type of files"),
filepath: String::from("blocked-edges/x.y.z-name.yaml"),
expected_match: true,
},
TestCase {
name: String::from("Should match a required file"),
filepath: String::from("raw/metadata.json"),
expected_match: true,
},
TestCase {
name: String::from("Should match a required file"),
filepath: String::from("version"),
expected_match: true,
},
// Expected Match == False
// Some of the other files present in the graph-data repository
TestCase {
name: String::from("Should not match other graph-data files"),
filepath: String::from("build-suggestions/4.17.yaml"),
expected_match: false,
},
TestCase {
name: String::from("Should not match other graph-data files"),
filepath: String::from(".github/dependabot.yml"),
expected_match: false,
},
TestCase {
name: String::from("Should not match other graph-data files"),
filepath: String::from("deploy/1-image-stream.yaml"),
expected_match: false,
},
TestCase {
name: String::from("Should not match other graph-data files"),
filepath: String::from("graph-data.rs/src/main.rs"),
expected_match: false,
},
TestCase {
name: String::from("Should not match other graph-data files"),
filepath: String::from("hack/OWNERS"),
expected_match: false,
},
TestCase {
name: String::from("Should not match other graph-data files"),
filepath: String::from("internal-channels/stable.yaml"),
expected_match: false,
},
TestCase {
name: String::from("Should not match other graph-data files"),
filepath: String::from("raw/OWNERS"),
expected_match: false,
},
TestCase {
name: String::from("Should not match other graph-data files"),
filepath: String::from(".gitignore"),
expected_match: false,
},
TestCase {
name: String::from("Should not match other graph-data files"),
filepath: String::from("CONTRIBUTING.md"),
expected_match: false,
},
TestCase {
name: String::from("Should not match other graph-data files"),
filepath: String::from("Dockerfile"),
expected_match: false,
},
TestCase {
name: String::from("Should not match other graph-data files"),
filepath: String::from("OWNERS"),
expected_match: false,
},
TestCase {
name: String::from("Should not match other graph-data files"),
filepath: String::from("OWNERS_ALIASES"),
expected_match: false,
},
TestCase {
name: String::from("Should not match other graph-data files"),
filepath: String::from("README.md"),
expected_match: false,
},
TestCase {
name: String::from("Should not match other graph-data files"),
filepath: String::from("requirements.txt"),
expected_match: false,
},
// Expected Match == False
// System files
TestCase {
name: String::from("Should not match system files"),
filepath: String::from("/etc/alternatives/unversioned-python-man"),
expected_match: false,
},
TestCase {
name: String::from("Should not match system files"),
filepath: String::from(
"/usr/lib/python3.6/site-packages/setuptools-39.2.0.dist-info/LICENSE.txt",
),
expected_match: false,
},
];
let mut regexes = Vec::new();
for str_regex in DEFAULT_OUTPUT_ALLOWLIST {
match regex::Regex::new(&str_regex) {
Ok(regex) => regexes.push(regex),
Err(err) => panic!(
"invalid regex pattern {} in DEFAULT_OUTPUT_ALLOWLIST\nerr: {}",
str_regex, err
),
}
}
for tc in test_cases {
let actual = regexes.iter().any(|re| re.is_match(&tc.filepath));
assert_eq!(
tc.expected_match,
actual,
"matching default output allowlist regexes failed\ntest case: {}: {}\nexpected match {}; got {}",
&tc.name,
tc.filepath,
tc.expected_match,
actual
);
// The matching should not change in case the files are prefixed with the
// archive directory
let actual = regexes
.iter()
.any(|re| re.is_match(&format!("archive/{}", tc.filepath)));
assert_eq!(
tc.expected_match,
actual,
"matching default output allowlist regexes failed\ntest case (with a prefixed archive): {}: {}\nexpected match {}; got {}",
&tc.name,
tc.filepath,
tc.expected_match,
actual
);
}
}
}

#[cfg(test)]
#[cfg(feature = "test-net")]
mod network_tests {
Expand All @@ -468,6 +636,8 @@ mod network_tests {

#[tokio::test(flavor = "multi_thread")]
async fn openshift_secondary_metadata_extraction() -> Fallible<()> {
env_logger::init();

let fixtures = PathBuf::from(
"./src/plugins/internal/graph_builder/dkrv2_openshift_secondary_metadata_scraper/test_fixtures",
);
Expand All @@ -480,7 +650,7 @@ mod network_tests {

let _m = mockito::mock(
"GET",
"/sha256=3d8d70c6090d4b843f885c8a0c80d01c5fb78dd7c8d16e20929ffc32a15e2fde/signature-3",
"/sha256=c38b30c516858d763a705b3e1a709617c1a04704b2e635d91c7fe52d906816ae/signature-3",
)
.with_status(200)
.with_body_from_file(signature_path.canonicalize()?)
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
FROM quay.io/app-sre/busybox as downloader
FROM registry.access.redhat.com/ubi9-minimal as downloader
ADD https://github.com/openshift/cincinnati-graph-data/archive/e8692fe50ccbfa525cce340f781d56d5a1d4364d.tar.gz /graph-data.tar.gz
RUN microdnf install -y tar gzip
RUN mkdir -p /graph-data
RUN tar xav -C /graph-data -f /graph-data.tar.gz --no-same-owner

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
-----BEGIN PGP PUBLIC KEY BLOCK-----

mDMEZyopcxYJKwYBBAHaRw8BAQdAXH1M1WCrEClQqIx8keNmUPmMPiAjzhEoCBph
+jnewT20H0RhdmlkIEh1cnRhIDxkaHVydGFAcmVkaGF0LmNvbT6IkwQTFgoAOxYh
BAFg3HqS5RZ949wgw26Qne9O4omXBQJnKilzAhsDBQsJCAcCAiICBhUKCQgLAgQW
AgMBAh4HAheAAAoJEG6Qne9O4omXyigA/ikgLwiwS9DwRTB59fknPxf1RmNKvsDG
e6r/UGUDG6GAAP4iYVunXoUzNtp67f6SvtYLABL8TVIUpcpp5MOe6wWqBrg4BGcq
KXMSCisGAQQBl1UBBQEBB0D8r9JYjG2cl4RwkGomUN+IMNTI4ozkOac7J2K/tKfD
NgMBCAeIeAQYFgoAIBYhBAFg3HqS5RZ949wgw26Qne9O4omXBQJnKilzAhsMAAoJ
EG6Qne9O4omXM0QA/in7Dw9H10eZC2dp/7qdD23RIBIMtoM3ITJwuPwDC7QVAQCB
fEPBrHTx/mE5HD3+qRTlzX8bpgtTP4z0M9Y3fzfEAg==
=HUqV
-----END PGP PUBLIC KEY BLOCK-----

This file was deleted.

Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
{
"critical": {
"image": {
"docker-manifest-digest": "sha256:3d8d70c6090d4b843f885c8a0c80d01c5fb78dd7c8d16e20929ffc32a15e2fde"
"docker-manifest-digest": "sha256:c38b30c516858d763a705b3e1a709617c1a04704b2e635d91c7fe52d906816ae"
},
"type": "atomic container signature",
"identity": {
Expand Down
Binary file not shown.
2 changes: 1 addition & 1 deletion graph-builder/tests/images/build-n-push-buildah.sh
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ for target in "${@}"; do
tag="${tag_raw/_/-}"

full_tag="${REPO_BASE}-${repo}:${tag}"
buildah bud -t "${full_tag}" "${target}"
buildah bud -t "${full_tag}" --format docker "${target}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This graph-builder/tests/images/build-n-push-buildah.sh script is run in two places.

This change is needed for the former script as described in the change's commit. However, I have not run the latter script with the new changes. I would like to run it after all tests are fixed to verify the change and to more easily notice any tests start breaking.

Just checked, it seems running the script won't be necessary probably; the existing images produced by the latter script already have the docker format.

$ podman manifest inspect registry.ci.openshift.org/cincinnati-ci-public/cincinnati-test-emptyfirsttag-public-manual:0.0.0 | jq .mediaType
WARN[0000] The manifest type application/vnd.docker.distribution.manifest.v2+json is not a manifest list but a single image. 
"application/vnd.docker.distribution.manifest.v2+json"


if [[ -e "${arch_file}" ]]; then
img_arch="$(head -n1 ${arch_file})"
Expand Down
6 changes: 3 additions & 3 deletions hack/get_ci_secrets.sh
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
#!/usr/bin/env bash
#
# This script helps to set up a local development environment with secrets that are stored on the CI cluster.
# This script helps to set up a local development environment with secrets that are stored on the app.ci cluster.
# It fetches these secrets and populates local files with them.
# The script needs to be `source`d to work: `source hack/get_ci_secrets.sh`.

Expand All @@ -12,8 +12,8 @@ CINCINNATI_CI_PUBLIC_DOCKERJSON_PATH=$(mktemp)
if (
set -xeuo pipefail

oc get secrets --namespace=cincinnati-ci ci-image-sa-dockercfg-vjdrw -o 'go-template={{index .data ".dockercfg"}}' | base64 -d > "${CINCINNATI_CI_DOCKERCFG_PATH}"
oc get secrets --namespace=cincinnati-ci-public ci-image-sa-dockercfg-cwj4w -o 'go-template={{index .data ".dockercfg"}}' | base64 -d > "${CINCINNATI_CI_PUBLIC_DOCKERCFG_PATH}"
oc get secrets --namespace=cincinnati-ci ci-image-sa-dockercfg-fcq47 -o 'go-template={{index .data ".dockercfg"}}' | base64 -d > "${CINCINNATI_CI_DOCKERCFG_PATH}"
oc get secrets --namespace=cincinnati-ci-public ci-image-sa-dockercfg-68k8s -o 'go-template={{index .data ".dockercfg"}}' | base64 -d > "${CINCINNATI_CI_PUBLIC_DOCKERCFG_PATH}"

export CINCINNATI_CI_DOCKERCFG_PATH
export CINCINNATI_CI_PUBLIC_DOCKERCFG_PATH
Expand Down