Fix upgrade size calculation inside k8s #537

Merged: 5 commits, Sep 13, 2024

@Itxaka (Member) commented Sep 13, 2024

Because we use the suc-upgrade script to upgrade inside k8s containers, the host rootfs is mounted under `$HOST_DIR`. The upgrade image filesystem is actually the rootfs of the container, so it is mounted on `/`.

The size calculation for the upgrade was walking into this dir, because the host dir is mounted below `/`. We were calling the upgrade with `--source dir:/`, so the calculation descended into `/$HOST_DIR` and counted its size as well.

This created two issues. First, the upgrade size would be doubled, because it counted the upgrade image plus the host contents. Second, counting an installed system is problematic: dirs like `/tmp` or `/proc` can contain temp files, broken symlinks, and self symlinks that won't work inside the container (self links point to the running system, so a self link in the mounted host would point to a different process than in the running container).

To fix this, we try to read the `$HOST_DIR` variable when calculating the sizes. The suc script always sets it, so if we are running an upgrade under k8s we will get it and can skip counting that dir.

`KUBERNETES_SERVICE_HOST` is always set for pods running under k8s, so we
can infer that we are under k8s if it is set.

If that is the case and the `HOST_DIR` variable is not set, we default to the safe
value where system-upgrade-controller mounts the host root.
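
The fallback logic in the last two paragraphs could look something like the sketch below. It is an assumption-laden illustration, not the PR's code: `hostDirToSkip` is a hypothetical name, and the default mount point is assumed here to be `/host`, which the description above does not state explicitly. The lookup function has the same signature as `os.LookupEnv`, so the logic can be exercised without changing the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// defaultHostDir is an ASSUMPTION for this sketch: the directory where
// system-upgrade-controller is taken to mount the host root by default.
const defaultHostDir = "/host"

// hostDirToSkip returns the directory to exclude from the size calculation,
// or "" when nothing needs to be skipped. (Hypothetical helper.)
func hostDirToSkip(lookup func(string) (string, bool)) string {
	// An explicit HOST_DIR always wins.
	if dir, ok := lookup("HOST_DIR"); ok && dir != "" {
		return dir
	}
	// KUBERNETES_SERVICE_HOST is set for pods running under k8s, so its
	// presence is used to infer that we are inside a cluster.
	if _, inK8s := lookup("KUBERNETES_SERVICE_HOST"); inK8s {
		return defaultHostDir
	}
	return ""
}

func main() {
	fmt.Printf("host dir to skip: %q\n", hostDirToSkip(os.LookupEnv))
}
```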

Signed-off-by: Itxaka <itxaka@kairos.io>
@Itxaka Itxaka requested a review from a team September 13, 2024 08:57
@Itxaka (Member Author) commented Sep 13, 2024

This requires a backport to the 2.13.x branch.

codecov bot commented Sep 13, 2024

Codecov Report

Attention: Patch coverage is 83.33333% with 2 lines in your changes missing coverage. Please review.

Project coverage is 49.90%. Comparing base (83afd40) to head (6f61df4).
Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/config/spec.go | 83.33% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #537      +/-   ##
==========================================
+ Coverage   49.84%   49.90%   +0.05%     
==========================================
  Files          48       48              
  Lines        4580     4589       +9     
==========================================
+ Hits         2283     2290       +7     
- Misses       2028     2029       +1     
- Partials      269      270       +1     


@jimmykarily (Contributor) left a comment

Good catch!

Itxaka and others added 2 commits September 13, 2024 12:46
We don't want to count any runtime dirs such as /proc, /dev, or /run

Signed-off-by: Itxaka <itxaka@kairos.io>
Co-authored-by: Dimitris Karakasilis <dimitris@karakasilis.me>
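
One way to express "is this path inside a runtime dir" is sketched below. This is a hedged illustration rather than the commit's actual code: `underAny` is a hypothetical helper, and the list simply mirrors the commit message above. A later release-note entry on this page ("Do not skip /run when counting the size") indicates the real list changed afterwards, so treat it as illustrative only.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// runtimeDirs mirrors the directories named in the commit message.
// Illustrative only; not necessarily the list used by the agent.
var runtimeDirs = []string{"/proc", "/dev", "/run"}

// underAny reports whether path is one of dirs or lies below one of them.
// Matching is done on whole path components, so "/procfs" does not match
// "/proc". (Hypothetical helper.)
func underAny(path string, dirs []string) bool {
	p := filepath.Clean(path)
	for _, d := range dirs {
		d = filepath.Clean(d)
		if p == d || strings.HasPrefix(p, d+string(filepath.Separator)) {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{"/proc/1/status", "/procfs", "/run", "/usr/run/app"} {
		fmt.Printf("%-16s skipped=%v\n", p, underAny(p, runtimeDirs))
	}
}
```

Comparing against the directory plus a trailing separator avoids the common mistake of a bare prefix check matching sibling directories with a similar name.
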
@Itxaka (Member Author) commented Sep 13, 2024

Tested with k8s and it works as expected!

k3s upgrade test from k8s installs to disk with custom config [provider, provider-upgrade-latest-k8s-with-kubernetes]
/home/itxaka/projects/kairos/tests/provider_upgrade_latest_k8s_test.go:59
State dir: /tmp/1292742850
Using ssh port: 36857
  STEP: checking if it has default service active @ 09/13/24 12:40:46.999
  STEP: copy the config @ 09/13/24 12:40:47.194
  STEP: find the correct device (qemu vs vbox) @ 09/13/24 12:40:47.282
  STEP: installing @ 09/13/24 12:40:47.359
  STEP: rebooting after install @ 09/13/24 12:41:04.496
  STEP: checking default services are on after first boot @ 09/13/24 12:42:04.706
  STEP: Checking agent provider correct start @ 09/13/24 12:42:05.177
  STEP: checking kubeconfig @ 09/13/24 12:42:05.242
  STEP: checking current version @ 09/13/24 12:42:05.3
  STEP: wait system-upgrade-controller @ 09/13/24 12:42:05.361
  STEP: wait for all containers to be in running state @ 09/13/24 12:42:05.361
  STEP: triggering an upgrade @ 09/13/24 12:42:15.746
  STEP: checking upgraded version @ 09/13/24 12:42:16.192
• [1021.403 seconds]
------------------------------
SS

Ran 1 of 21 Specs in 1021.404 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 20 Skipped
PASS

Signed-off-by: Itxaka <itxaka@kairos.io>
@Itxaka Itxaka merged commit 7c6c195 into main Sep 13, 2024
13 of 14 checks passed
@Itxaka Itxaka deleted the fix_upgrade_kubernetes branch September 13, 2024 11:07
Itxaka added a commit that referenced this pull request Sep 13, 2024
Co-authored-by: Dimitris Karakasilis <dimitris@karakasilis.me>
(cherry picked from commit 7c6c195)
renovate bot referenced this pull request in kairos-io/provider-kairos Sep 19, 2024
…4.1 (#638)

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [github.com/kairos-io/kairos-agent/v2](https://redirect.github.com/kairos-io/kairos-agent) | `v2.14.0` -> `v2.14.1` | [![age](https://developer.mend.io/api/mc/badges/age/go/github.com%2fkairos-io%2fkairos-agent%2fv2/v2.14.1?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![adoption](https://developer.mend.io/api/mc/badges/adoption/go/github.com%2fkairos-io%2fkairos-agent%2fv2/v2.14.1?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![passing](https://developer.mend.io/api/mc/badges/compatibility/go/github.com%2fkairos-io%2fkairos-agent%2fv2/v2.14.0/v2.14.1?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![confidence](https://developer.mend.io/api/mc/badges/confidence/go/github.com%2fkairos-io%2fkairos-agent%2fv2/v2.14.0/v2.14.1?slim=true)](https://docs.renovatebot.com/merge-confidence/) |

---

### Release Notes

<details>
<summary>kairos-io/kairos-agent
(github.com/kairos-io/kairos-agent/v2)</summary>

### [`v2.14.1`](https://redirect.github.com/kairos-io/kairos-agent/releases/tag/v2.14.1)

[Compare Source](https://redirect.github.com/kairos-io/kairos-agent/compare/v2.14.0...v2.14.1)

#### What's Changed

- Bump go to 1.23.1 by
[@Itxaka](https://redirect.github.com/Itxaka) in
[https://github.com/kairos-io/kairos-agent/pull/526](https://redirect.github.com/kairos-io/kairos-agent/pull/526)
- Display net info on QR code page by
[@Itxaka](https://redirect.github.com/Itxaka) in
[https://github.com/kairos-io/kairos-agent/pull/525](https://redirect.github.com/kairos-io/kairos-agent/pull/525)
- chore(deps): update securego/gosec action to v2.21.1 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos-agent/pull/528](https://redirect.github.com/kairos-io/kairos-agent/pull/528)
- fix(deps): update module github.com/kairos-io/kairos-sdk to v0.4.3 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos-agent/pull/529](https://redirect.github.com/kairos-io/kairos-agent/pull/529)
- Fail if remote url address doesnt exist by
[@Itxaka](https://redirect.github.com/Itxaka) in
[https://github.com/kairos-io/kairos-agent/pull/527](https://redirect.github.com/kairos-io/kairos-agent/pull/527)
- fix(deps): update module github.com/google/go-github/v63 to v64 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos-agent/pull/532](https://redirect.github.com/kairos-io/kairos-agent/pull/532)
- chore(deps): update google/osv-scanner-action action to v1.8.5 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos-agent/pull/534](https://redirect.github.com/kairos-io/kairos-agent/pull/534)
- fix(deps): update module github.com/twpayne/go-vfs/v4 to v5 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos-agent/pull/533](https://redirect.github.com/kairos-io/kairos-agent/pull/533)
- Improve get partitions and reset spec by
[@Itxaka](https://redirect.github.com/Itxaka) in
[https://github.com/kairos-io/kairos-agent/pull/530](https://redirect.github.com/kairos-io/kairos-agent/pull/530)
- fix(deps): update module k8s.io/mount-utils to v0.31.1 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos-agent/pull/536](https://redirect.github.com/kairos-io/kairos-agent/pull/536)
- Fix upgrade size calculation inside k8s by
[@Itxaka](https://redirect.github.com/Itxaka) in
[https://github.com/kairos-io/kairos-agent/pull/537](https://redirect.github.com/kairos-io/kairos-agent/pull/537)
- Do not skip /run when counting the size by
[@Itxaka](https://redirect.github.com/Itxaka) in
[https://github.com/kairos-io/kairos-agent/pull/538](https://redirect.github.com/kairos-io/kairos-agent/pull/538)
- fix(deps): update module github.com/google/go-github/v63 to v64 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos-agent/pull/539](https://redirect.github.com/kairos-io/kairos-agent/pull/539)
- fix(deps): update module github.com/twpayne/go-vfs/v4 to v5 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos-agent/pull/540](https://redirect.github.com/kairos-io/kairos-agent/pull/540)
- fix(deps): update module github.com/kairos-io/kairos-sdk to v0.4.4 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos-agent/pull/541](https://redirect.github.com/kairos-io/kairos-agent/pull/541)
- fix(deps): update module github.com/google/go-github/v64 to v65 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos-agent/pull/542](https://redirect.github.com/kairos-io/kairos-agent/pull/542)
- Move to use our ghw clone by
[@Itxaka](https://redirect.github.com/Itxaka) in
[https://github.com/kairos-io/kairos-agent/pull/535](https://redirect.github.com/kairos-io/kairos-agent/pull/535)
- Run tests in parallel and output github formats on workflow by
[@Itxaka](https://redirect.github.com/Itxaka) in
[https://github.com/kairos-io/kairos-agent/pull/543](https://redirect.github.com/kairos-io/kairos-agent/pull/543)

**Full Changelog**:
kairos-io/kairos-agent@v2.14.0...v2.14.1

</details>

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
renovate bot referenced this pull request in marinatedconcrete/config Sep 22, 2024
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [kairos-io/kairos](https://redirect.github.com/kairos-io/kairos) | patch | `v3.1.2` -> `v3.1.3` |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the Dependency
Dashboard for more information.

---

### Release Notes

<details>
<summary>kairos-io/kairos (kairos-io/kairos)</summary>

### [`v3.1.3`](https://redirect.github.com/kairos-io/kairos/releases/tag/v3.1.3)

[Compare Source](https://redirect.github.com/kairos-io/kairos/compare/v3.1.2...v3.1.3)

##### Release highlights:

- In the previous release, we introduced a fix for the broken
permissions of the user's home directory. It turned out that the fix
only applied to users created by the top level `users:` key in the
Kairos configuration file. In this release, users created in various
stages will also get their home directory permissions fixed. If, for some
reason, you don't want the script to recursively fix the home
directory permissions, you can create [a sentinel
file](https://redirect.github.com/kairos-io/packages/blob/2fce89f6499a722768b1c58b6eace5ed7e45742d/packages/static/kairos-overlay-files/files/usr/bin/fix-home-dir-ownership#L5-L7)
to skip the fix and apply it on your own as you see fit.
- Fixed an issue where we didn't calculate the upgrade image size and
always created an image with the default size
([https://github.com/kairos-io/kairos/issues/2818](https://redirect.github.com/kairos-io/kairos/issues/2818))
- Fixed an issue in Kairos upgrades through Kubernetes, where various
host directories were also included in the image size calculation
([https://github.com/kairos-io/kairos-agent/pull/537](https://redirect.github.com/kairos-io/kairos-agent/pull/537))
- We now display the webui URL below the QR code so that people don't have
to plug in a keyboard just to find the IP address of the node
([https://github.com/kairos-io/kairos/issues/2826](https://redirect.github.com/kairos-io/kairos/issues/2826))
- Fixed a bug in Alpine flavors where we passed the edgevpn arguments
incorrectly in the openrc service file
([https://github.com/kairos-io/kairos/issues/2789](https://redirect.github.com/kairos-io/kairos/issues/2789))
- Lots of version bumps on dependencies (mostly automated).

##### Known Issues

- \[Carry over from previous releases] RPi EFI booting no longer
supported on kernels shipped with Ubuntu 24.04+
[#2249](https://redirect.github.com/kairos-io/kairos/issues/2249)

##### What's Changed

- Add permissions to generic arm release pipeline by
[@mauromorales](https://redirect.github.com/mauromorales) in
[https://github.com/kairos-io/kairos/pull/2840](https://redirect.github.com/kairos-io/kairos/pull/2840)
- Update tj-actions/changed-files action to v45 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos/pull/2816](https://redirect.github.com/kairos-io/kairos/pull/2816)
- Add upgrade uki test by
[@jimmykarily](https://redirect.github.com/jimmykarily) in
[https://github.com/kairos-io/kairos/pull/2776](https://redirect.github.com/kairos-io/kairos/pull/2776)
- Update dependency go to v1.23.1 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos/pull/2845](https://redirect.github.com/kairos-io/kairos/pull/2845)
- Generate relative paths to files by
[@jimmykarily](https://redirect.github.com/jimmykarily) in
[https://github.com/kairos-io/kairos/pull/2846](https://redirect.github.com/kairos-io/kairos/pull/2846)
- 🤖 Make arm64 workers use docker mirror by
[@Itxaka](https://redirect.github.com/Itxaka) in
[https://github.com/kairos-io/kairos/pull/2850](https://redirect.github.com/kairos-io/kairos/pull/2850)
- 🐛 Fix wifi cloud-config example by
[@jimmyjones2](https://redirect.github.com/jimmyjones2) in
[https://github.com/kairos-io/kairos/pull/2820](https://redirect.github.com/kairos-io/kairos/pull/2820)
- 📖 Add alpine wifi cloud-config by
[@jimmyjones2](https://redirect.github.com/jimmyjones2) in
[https://github.com/kairos-io/kairos/pull/2819](https://redirect.github.com/kairos-io/kairos/pull/2819)
- Update anchore/grype Docker tag to v0.80.1 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos/pull/2852](https://redirect.github.com/kairos-io/kairos/pull/2852)
- Update aquasec/trivy Docker tag to v0.55.0 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos/pull/2781](https://redirect.github.com/kairos-io/kairos/pull/2781)
- Update aquasec/trivy Docker tag to v0.55.1 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos/pull/2854](https://redirect.github.com/kairos-io/kairos/pull/2854)
- Update github/codeql-action action to v3.26.6 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos/pull/2799](https://redirect.github.com/kairos-io/kairos/pull/2799)
- Fix test printing old value for debugging by
[@jimmykarily](https://redirect.github.com/jimmykarily) in
[https://github.com/kairos-io/kairos/pull/2855](https://redirect.github.com/kairos-io/kairos/pull/2855)
- Update google/osv-scanner-action action to v1.8.5 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos/pull/2853](https://redirect.github.com/kairos-io/kairos/pull/2853)
- Update quay.io/kairos/framework Docker tag to v2.11.5 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos/pull/2856](https://redirect.github.com/kairos-io/kairos/pull/2856)
- Update github/codeql-action action to v3.26.7 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos/pull/2858](https://redirect.github.com/kairos-io/kairos/pull/2858)
- Update quay.io/kairos/framework Docker tag to v2.11.7 by
[@renovate](https://redirect.github.com/renovate) in
[https://github.com/kairos-io/kairos/pull/2859](https://redirect.github.com/kairos-io/kairos/pull/2859)
- Split the uploading of trivy and grype results by
[@jimmykarily](https://redirect.github.com/jimmykarily) in
[https://github.com/kairos-io/kairos/pull/2860](https://redirect.github.com/kairos-io/kairos/pull/2860)

##### New Contributors

- [@jimmyjones2](https://redirect.github.com/jimmyjones2) made
their first contribution in
[https://github.com/kairos-io/kairos/pull/2820](https://redirect.github.com/kairos-io/kairos/pull/2820)

**Full Changelog**:
kairos-io/kairos@v3.1.2...v3.1.3

</details>

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>