Merge release_4.14 to master #811

Merged
merged 44 commits into from
Nov 2, 2023

praveenkumar
Member

@praveenkumar praveenkumar commented Oct 30, 2023

Merge the release_4.14 branch into master. There were some conflicts, which were resolved manually; the microshift script changes are in e6e743c.

praveenkumar and others added 30 commits June 6, 2023 12:59
As of now, internalP is part of the kubelet drop-in unit file for the OCP
bundle only, but it should be the same for OKD. It looks like d90d53d
introduced this regression.
4.14 is in dev phase so mirror url changed to `ocp-dev-preview`
podman requires a container runtime, and networking plugins to work.
They are only marked as Recommends/Suggests in podman's spec file, so we
need to ensure they get installed.

commit 14fdeea installs these explicitly, but podman's spec file
recommends `crun` and not `runc`, no idea if these are the same or not.
Better to rely on what podman .spec file provides rather than hardcoding
it.

microshift repositories are enabled after podman installation, as they
provide a podman build which is only meant to be used in an openshift
cluster, which, in particular, requires `runc` at runtime without
explicit rpm dependencies. The standard rhel build does not have such a
requirement.
As of now, the LVM devices file contains the device name and disk ID
associated with a physical volume identifier (PVI), and we noticed that the
device name differs between hypervisors: for Hyper-V it is `/dev/sda` and
for libvirt it is `/dev/vda` [0].

In this PR we follow `man lvmdevices`, which states that if this file is
removed, lvm will not use a devices file.
```
The  LVM  devices file lists devices that lvm can use.
The default file is /etc/lvm/devices/system.devices, and
the lvmdevices(8) command is used to add or remove device
entries.  If the file does not exist, or if lvm.conf includes
use_devicesfile=0, then lvm will not use a devices file.
```

With this patch I don't see any degradation in boot time:
```
<=== with the patch ===>
INFO CRC instance is running with IP 127.0.0.1
INFO CRC VM is running
^C
real	0m13.727s
user	0m0.142s

<=== without the patch ===>
INFO CRC instance is running with IP 127.0.0.1
INFO CRC VM is running
INFO Updating authorized keys...
^C

real	0m14.216s
user	0m0.158s

```

[0] https://www.baeldung.com/linux/vda-vs-sda
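
A minimal sketch of the removal step described in this commit, assuming the default path quoted from `man lvmdevices`. The function name `remove_lvm_devices_file` and its optional root-prefix argument are hypothetical and only exist so the sketch can be tried against a scratch directory.

```shell
# Hypothetical helper: delete the LVM devices file so that lvm stops
# matching physical volumes against a hypervisor-specific device name.
# $1 is an optional root prefix (empty means the real filesystem root).
remove_lvm_devices_file() {
    local root="${1:-}"
    local devices_file="${root}/etc/lvm/devices/system.devices"
    # rm -f also succeeds when the file is already absent.
    rm -f -- "${devices_file}"
}
```

Calling `remove_lvm_devices_file` with no argument targets `/etc/lvm/devices/system.devices` and normally needs root privileges.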
This PR adds a deployment resource for the route controller which uses the
correct image tag, the one we cached. Once a bundle is created using this
PR, we will also need to make a change on the crc side. The resource file is
located in the `/opt/crc` dir.

Also, if we take this in, it will not have any effect on the created
bundle until we implement the logic on the crc side; until then it will just
be an unused extra file in the bundle.

- crc-org/crc#3502
We now have arm64/amd64 images for hostpath-csi-driver, so there is no need
to build it on internal brew and put it on `quay.io/crcont`.
The default service account doesn't have permission to list routes across
all namespaces, but the `openshift-ingress` namespace has a `router`
service account which has that permission. Without this permission,
the routes created by applications are not readable and the
following errors occur:

```
E0627 12:47:51.662778       1 reflector.go:138] /remote-source/app/main.go:53: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User "system:serviceaccount:openshift-ingress:default" cannot list resource "services" in API group "" at the cluster scope
W0627 12:48:09.285708       1 reflector.go:324] /remote-source/app/main.go:64: failed to list *v1.Route: routes.route.openshift.io is forbidden: User "system:serviceaccount:openshift-ingress:default" cannot list resource "routes" in API group "route.openshift.io" at the cluster scope
E0627 12:48:09.286674       1 reflector.go:138] /remote-source/app/main.go:64: Failed to watch *v1.Route: failed to list *v1.Route: routes.route.openshift.io is forbidden: User "system:serviceaccount:openshift-ingress:default" cannot list resource "routes" in API group "route.openshift.io" at the cluster scope
W0627 12:48:30.587710       1 reflector.go:324] /remote-source/app/main.go:53: failed to list *v1.Service: services is forbidden: User "system:serviceaccount:openshift-ingress:default" cannot list resource "services" in API group "" at the cluster scope
E0627 12:48:30.588814       1 reflector.go:138] /remote-source/app/main.go:53: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User "system:serviceaccount:openshift-ingress:default" cannot list resource "services" in API group "" at the cluster scope
```

This prevents crc from updating the routes in `/etc/hosts` as
expected, and it is a blocker for the current release.

This PR fixes this issue.
During microshift bundle creation (d90d53d)
we created `sparsify_lvm` to move fast and have a microshift bundle sooner.
With this PR we use the sparsify helper for the microshift bundle
and remove the `sparsify_lvm` helper.
This helper makes sure that qemu image creation happens for all bundle
types and uses the generic `create_qemu_image` function. It is also helpful
in the next commit, where we only downgrade the kernel version for macOS.
68c6383 moved the kernel changes from an aarch64-only block to a
block which is run when the macOS bundle is enabled. However, the
kernel downgrade is done before any bundle is generated, so the kernel
will be downgraded for all our bundles, linux, macos, windows.

This PR fixes this and uses the crc-org#637 logic
to only downgrade the kernel for macOS.
After upgrading our CI to RHEL-9, we can use https://access.redhat.com/solutions/134403
to downgrade the kernel instead of depending on the internal repo, which also affects our
ability to generate the macOS bundle on CI.

fixes: crc-org#753
…rnal repo access"

This reverts commit 4d83775. Since we
no longer depend on the internal repo to downgrade the kernel version,
it is better to build mac bundles from CI.
microshift is officially released in RH repos, and microshift.sh already
requires a subscription, so we can use `yum download` instead of `brew
download-build`.
I've checked that setting `MICROSHIFT_NVR` works for 4.13 releases.
With 114d1c8 we removed the internal
cert along with the internal repo, but this cert is still required for our
patch image build, where we copy the images from the internal
registry; without it the following error occurs.
```
$ skopeo copy --dest-authfile updated_pullsecret.json --all --src-cert-dir=repos/ docker://registry-proxy.engineering.redhat.com/rh-osbs/openshift-crc-cluster-kube-apiserver-operator:v4.14.0-202307041530.p0.g8b64249.assembly.stream docker://quay.io/crcont/openshift-crc-cluster-kube-apiserver-operator:4.14.0-ec.3
time="2023-07-07T03:27:58-04:00" level=fatal msg="initializing source docker://registry-proxy.engineering.redhat.com/rh-osbs/openshift-crc-cluster-kube-apiserver-operator:v4.14.0-202307041530.p0.g8b64249.assembly.stream: pinging container registry registry-proxy.engineering.redhat.com: Get \"https://registry-proxy.engineering.redhat.com/v2/\": x509: certificate signed by unknown authority"
```
Before 114d1c8 it made sense to name the directory `repos` because
it held the internal repo used to download the kernel rpms, but since this
directory now only contains the internal cert, it is better to rename it to `pki`.
openshift/okd bundles use a `qemu+tcp` connection, which doesn't need
`sudo` to get the details about network, uri and capabilities. For the
microshift bundle we use the same preflight checks but with the `qemu:///system`
uri, which needs `sudo` to get those details; otherwise the CI fails with
the following error:
```
+ '[' microshift == okd ']'
+ virsh -c qemu:///system uri
error: failed to connect to the hypervisor
```
It looks like with f1dc411 I added the
content of the cert using git diff, and it included a `-` at the beginning of
each line :(. This PR fixes it.
- This is more consistent with the naming of the container image
- This allows to workaround a problem with bundle v4.13.3 which had an invalid route_controller.json file
    - crc-org#747
With 3edc3ce we wanted to rename the
file to `routes-controller.yaml.in`, but by mistake it was renamed to
`router-controller.yaml.in`, while the uses of the file correctly refer
to `routes-*`. This PR fixes that rename mistake.
For microshift bundle creation `VM_PREFIX` is not set and the CI fails with
the following error. This PR adds `$VM_NAME` as the default value in case
no value is set.
```
 ./createdisk.sh: line 157: VM_PREFIX: unbound variable
```
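
The default-value fix can be sketched with shell parameter expansion. This is only an illustration under the assumption that the script runs with `set -u` (which is what produces the `unbound variable` error); the function name `resolve_vm_prefix` is hypothetical and not taken from `createdisk.sh`.

```shell
# Hypothetical helper: print VM_PREFIX, falling back to VM_NAME when
# VM_PREFIX is unset or empty. The subshell keeps `set -u` local.
resolve_vm_prefix() {
    (
        set -u
        # ${VAR:-default} does not trip `set -u` when VAR is unset.
        printf '%s\n' "${VM_PREFIX:-${VM_NAME}}"
    )
}
```

A script would typically inline the same expansion, for example `VM_PREFIX=${VM_PREFIX:-$VM_NAME}`, before the first use of `VM_PREFIX`.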
We need to remove the crc domain which is created as part of the bundle
creation process; otherwise we are not able to test this bundle with the
`crc` binary, which also wants to create the same domain and fails with the
following error:
```
 level=info msg="Creating CRC VM for MicroShift 4.13.4..."
Error creating machine: Error in driver during machine creation: virError(Code=9, Domain=20, Message='operation failed: domain 'crc' already exists with uuid 5ab0edb0-4c93-484f-8563-a91529d467ba')
```
During bundle creation the `crc` VM is already in the shutdown state, so
the `virsh destroy` command fails. We only need to undefine the VM.
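
A rough sketch of the cleanup described in the two commits above, assuming the `qemu:///system` connection mentioned earlier for the microshift bundle. The `VIRSH` variable and the `cleanup_crc_domain` function are hypothetical names introduced here so the command can be swapped out; they are not from the snc scripts.

```shell
# Hypothetical: command used to talk to libvirt; override VIRSH to change it.
VIRSH="${VIRSH:-sudo virsh -c qemu:///system}"

# Remove the domain definition left behind by bundle creation.
# No `virsh destroy` here: the VM is already shut off, so destroy would fail.
cleanup_crc_domain() {
    local domain="${1:-crc}"
    # VIRSH is intentionally unquoted so it splits into command + arguments.
    ${VIRSH} undefine "${domain}"
}
```

Depending on how the domain was created, `virsh undefine` may need extra flags such as `--nvram` for UEFI guests; that is not covered by this sketch.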
…dy present

Recently there was an issue with brew where jobs were stuck in the `free`
state for a day before succeeding, but our jenkins jobs fail to wait
that long. When we rebuild the job, since the KAO image is already built on
brew, it does not fetch the repo, and the routes and dnsmasq images never
get built for that specific version of openshift.

This PR makes sure we always build those images as long as the `From`
section of the Dockerfile changes.
For the podman-env setting we use an SSH connection to start the container.
It seems that containers get killed after a timeout. The way this is
handled on the podman-machine side is to enable lingering for the `core`
user, and this PR is supposed to do the same.
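
Enabling lingering is done with `loginctl enable-linger`. The sketch below assumes it is run inside the VM with sufficient privileges; the wrapper name `enable_linger_for` is hypothetical.

```shell
# Hypothetical wrapper: keep the user's systemd user manager (and the
# containers it runs) alive after the SSH session ends.
enable_linger_for() {
    local user="${1:-core}"
    loginctl enable-linger "${user}"
}
```

Whether lingering is active can then be checked with `loginctl show-user core --property=Linger`.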
Recently we found out that when the mcp is not in the updated state, created
bundles have machine config issues like the following:
```
Marking Degraded due to: machineconfig.machineconfiguration.openshift.io "rendered-master-5e0b4b6fd5ad9c5c64801e18039b9233" not found
```

This is what the mcp looks like when it is not updated properly:
```
$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-5e0b4b6fd5ad9c5c64801e18039b9233   False     True       True       1              0                   0                     1                      3d13h
worker   rendered-worker-e6687df4c217440327c4dc1dcf0f507c   True      False      False      0              0                   0                     0                      3d13h
```
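
One way to catch this state before creating a bundle is to check the `UPDATED` column of the `oc get mcp` output shown above. The following is a sketch only; `mcp_all_updated` is a hypothetical name, and it assumes the default table layout where `UPDATED` is the third column.

```shell
# Hypothetical check: read `oc get mcp` table output on stdin and fail
# (non-zero exit) if any pool does not report UPDATED=True.
mcp_all_updated() {
    # NR > 1 skips the header row; column 3 is UPDATED.
    awk 'NR > 1 && $3 != "True" { bad = 1 } END { exit bad }'
}
```

Usage could look like `oc get mcp | mcp_all_updated || echo "mcp is not updated yet"`.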
It uses the `MICROSHIFT_PRERELEASE` environment variable to create a
bundle for an upcoming release by getting the microshift-* rpms from
mirror.openshift.com
Only `okd` lacks arm64 support; the other presets have it, so
we just need to avoid creating the arm64 bundle for `okd`. This
PR updates the `if` condition around it.
This PR refactors the set_bundle_variables function to make
sure the image can also be generated for the microshift bundle.
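
The condition can be pictured as follows. This is a sketch, not the actual `if` from the scripts: the function name `preset_supports_arm64` is hypothetical, and the only fact taken from the commit message is that `okd` is the one preset without arm64 support.

```shell
# Hypothetical predicate: succeeds for every preset except okd.
preset_supports_arm64() {
    case "$1" in
        okd) return 1 ;;
        *)   return 0 ;;
    esac
}
```

A caller would guard the arm64 step with something like `if preset_supports_arm64 "$preset"; then ...; fi`, where `$preset` stands for whatever variable holds the bundle type.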
In microshift we don't configure an app route different from the base
domain, as we can in OCP using the ingress configuration, so the app route
for microshift becomes `apps.crc.testing` whereas for OCP it is
`apps-crc.testing`. This patch makes sure it is written to the bundle
metadata correctly.
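
The difference between the two app domains can be sketched as a small helper. The name `apps_domain` and the default base domain argument are assumptions made for illustration; only the two resulting values come from the commit message.

```shell
# Hypothetical helper: derive the app domain written to bundle metadata.
# microshift uses a subdomain of the base domain ("apps."), while the
# other bundle types use the "apps-" prefix.
apps_domain() {
    local bundle_type="$1"
    local base_domain="${2:-crc.testing}"
    case "${bundle_type}" in
        microshift) printf 'apps.%s\n' "${base_domain}" ;;
        *)          printf 'apps-%s\n' "${base_domain}" ;;
    esac
}
```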
…tion

It looks like internally [0] anonymous operations were changed from `git`
to `https` because the git protocol lacks encryption.

[0] https://issues.redhat.com/browse/RHELBLD-10855

This should fix:
```
git remote add upstream git://pkgs.devel.redhat.com/containers/ose-cluster-kube-apiserver-operator
+ git fetch upstream rhaos-4.14-rhel-8 --no-tags
fatal: unable to connect to pkgs.devel.redhat.com:
pkgs.devel.redhat.com[0: x.x.x.x]: errno=No route to host
```
…capability

4.14 adds more capabilities, such as image-registry, build and deployment,
as addons which can be enabled on top of `None`. Since we use `None` by
default, those capabilities need to be added explicitly to keep them
part of the cluster, because we provided these features before.

fixes: crc-org#806
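
For illustration, the relevant `install-config.yaml` stanza might look like the output of the sketch below. The capability identifiers `ImageRegistry`, `Build` and `DeploymentConfig` are the names OpenShift 4.14 uses for these features as far as I know; the function name is hypothetical, and the list is not the full set the project enables.

```shell
# Hypothetical helper: print a capabilities stanza that starts from the
# None baseline and explicitly re-enables the 4.14 additions.
print_capabilities_stanza() {
    cat <<'EOF'
capabilities:
  baselineCapabilitySet: None
  additionalEnabledCapabilities:
  - ImageRegistry
  - Build
  - DeploymentConfig
EOF
}
```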
This PR adds the `clearpart` option `--disklabel` to use `gpt` as the
default disk label. To allow this we also need to add a `biosboot`
partition so that it does not fail with the following error.
```
your bios-based system needs a special partition to boot from a gpt disk label
```

- https://pykickstart.readthedocs.io/en/latest/kickstart-docs.html#clearpart
- https://access.redhat.com/solutions/3370891

Note: The upstream PR is merged but only available from 4.15; the microshift
team is not interested in backporting it to the 4.13/4.14 branches and
suggested that the crc team keep this modification in its own repo.

- openshift/microshift#2331
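
As a sketch of what the kickstart change amounts to, the helper below prints the two relevant directives. The `--all --initlabel` options and the 1 MiB size are assumptions based on common kickstart usage, not taken from the microshift kickstart file, and `print_gpt_partitioning` is a hypothetical name.

```shell
# Hypothetical helper: emit kickstart lines that force a GPT disk label
# and add the biosboot partition a BIOS-based system needs to boot from GPT.
print_gpt_partitioning() {
    cat <<'EOF'
clearpart --all --initlabel --disklabel=gpt
part biosboot --fstype=biosboot --size=1
EOF
}
```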
If we don't boot the ISO in uefi mode, then the UEFI firmware files
won't get installed in the generated qcow2 image. This would prevent
this image from booting with UEFI.
Since we start the ISO in `efi` mode, there is no bootloader for legacy
support by default. This PR adds one so that the qcow2 image is able to
boot in both UEFI and legacy modes.
@openshift-ci openshift-ci bot requested review from anjannath and cfergeau October 30, 2023 11:59
@praveenkumar praveenkumar force-pushed the merge_4.14_to_master branch 4 times, most recently from e6e743c to a1eac3b Compare November 2, 2023 08:14
openshift-ci bot commented Nov 2, 2023

@praveenkumar: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-snc | ec38e09 | link | true | /test e2e-snc |
| ci/prow/e2e-microshift | 1152bdf | link | true | /test e2e-microshift |

@cfergeau cfergeau left a comment

$ git grep dev-prev
microshift.sh:    MIRROR=${MIRROR:-https://mirror.openshift.com/pub/openshift-v4/$ARCH/clients/ocp-dev-preview}
repos/mirror-microshift.repo:baseurl=https://mirror.openshift.com/pub/openshift-v4/$basearch/microshift/ocp-dev-preview/latest-4.14/elrhel-9/os/

These also need to be updated?

@praveenkumar
Member Author

> $ git grep dev-prev
> microshift.sh: MIRROR=${MIRROR:-https://mirror.openshift.com/pub/openshift-v4/$ARCH/clients/ocp-dev-preview}
> repos/mirror-microshift.repo:baseurl=https://mirror.openshift.com/pub/openshift-v4/$basearch/microshift/ocp-dev-preview/latest-4.14/elrhel-9/os/
> These also need to be updated?

No, the microshift.sh one is for the prerelease flag, and the mirror url is used in case a prerelease bundle needs to be created.

@openshift-ci openshift-ci bot added the lgtm label Nov 2, 2023
openshift-ci bot commented Nov 2, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cfergeau

