
Error due to "Helm manifest was not ready for retrieval" when Kubernetes Object Status for Helm is enabled #9182

Closed
APErebus opened this issue Dec 18, 2024 · 3 comments
Labels
kind/bug This issue represents a verified problem we are committed to solving


APErebus commented Dec 18, 2024

Version

2025.1

Latest Version

I could reproduce the problem in the latest build

What happened?

When Kubernetes Object Status for Helm (also referred to as Step Verification in the Step configuration) is enabled, users may see the following verbose log message repeated:

Helm manifest was not ready for retrieval. Retrying in 1s.

followed by a TaskCanceledException, causing the step to fail.

This can be caused by an empty Helm chart (which produces no manifests), but the version of Helm and the number of previous upgrades of that chart can also trigger this error.

It is possible that users running a version earlier than 2025.1.3480 may also run into this issue, even with Step Verification/KOS disabled. See #9160.
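For the empty-chart case, a minimal way to see the problem locally is to render such a chart with helm template; the release and chart names below are placeholders:

```bash
# Hypothetical: a chart whose templates render to nothing produces an
# empty manifest, leaving the status reporter with nothing to retrieve.
helm template my-release ./chart-with-no-templates
# (no output)
```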

Reproduction

  1. Use Helm < 3.13
  2. Deploy the same chart more than 10 times
  3. Enable Step Verification for the Helm step
  4. Run the deployment again
  5. Note the error (a shell sketch of these steps follows)
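A minimal sketch of the reproduction, assuming placeholder release, chart, and namespace names:

```bash
# 1. Confirm the Helm version is older than 3.13:
helm version --short

# 2. Deploy the same chart more than 10 times; with helm upgrade's
#    default --history-max of 10, revision 1 is purged from the
#    release history once revision 11 exists:
for i in $(seq 1 11); do
  helm upgrade --install my-release ./my-chart -n my-namespace
done

# 3-5. Enable Step Verification on the Helm step in Octopus, run the
#      deployment again, and watch for the retry messages followed by
#      the TaskCanceledException.
```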

Error and Stacktrace

10:59:08   Verbose  |       Helm manifest was not ready for retrieval. Retrying in 1s.
10:59:08   Verbose  |       Helm manifest was not ready for retrieval. Retrying in 1s.
10:59:08   Verbose  |       Helm manifest was not ready for retrieval. Retrying in 1s.
10:59:08   Verbose  |       Helm manifest was not ready for retrieval. Retrying in 1s.
10:59:13   Verbose  |       Helm manifest was not ready for retrieval. Retrying in 1s.
10:59:13   Verbose  |       Helm manifest was not ready for retrieval. Retrying in 1s.
10:59:13   Verbose  |       Helm manifest was not ready for retrieval. Retrying in 1s.
10:59:13   Error    |       System.Threading.Tasks.TaskCanceledException: A task was canceled.
10:59:13   Error    |       at Calamari.Kubernetes.Conventions.Helm.HelmManifestAndStatusReporter.PollForManifest(RunningDeployment deployment, HelmCli helmCli, String releaseName, Int32 revisionNumber) in C:\BuildAgent\work\e61a42f6adc5dcb6\source\Calamari\Kubernetes\Conventions\Helm\HelmManifestAndStatusReporter.cs:line 77
10:59:13   Error    |       at Calamari.Kubernetes.Conventions.Helm.HelmManifestAndStatusReporter.<>c__DisplayClass5_0.<<StartBackgroundMonitoringAndReporting>b__0>d.MoveNext() in C:\BuildAgent\work\e61a42f6adc5dcb6\source\Calamari\Kubernetes\Conventions\Helm\HelmManifestAndStatusReporter.cs:line 43
10:59:13   Error    |       --- End of stack trace from previous location ---

More Information

This issue is due to the lack of the `helm get metadata` command in Helm versions earlier than 3.13.

We retrieve the current revision from the metadata command so that we can retrieve the manifest for the deploying revision.

However, if the `helm get metadata` command fails or doesn't exist, we fall back to revision 1.

Also, because the --history-max flag on the `helm upgrade` command defaults to 10, it's possible the revision 1 manifest no longer exists. As a result, the `helm get manifest` call never returns anything and eventually times out with the TaskCanceledException.
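To illustrate the failure mode (release and namespace names are placeholders; exact output varies by Helm version):

```bash
# On Helm 3.13+, the deploying revision is available from the release
# metadata:
helm get metadata my-release -n my-namespace

# On Helm < 3.13 the "metadata" subcommand does not exist, so the call
# fails and revision 1 is used instead. With the default --history-max
# of 10, revision 1 has been purged after the 11th upgrade, so this
# lookup never returns a manifest:
helm get manifest my-release -n my-namespace --revision 1
```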

Workaround

There are a few easy workarounds:

  1. Upgrade the Helm executable to version 3.13 or later. For reference, the current version of Helm is 3.16.2. (A version-check sketch follows this list.)
  2. If using the octopusdeploy/worker-tools execution container, upgrade to the latest version (octopusdeploy/worker-tools:6.3.0-ubuntu.22.04), which includes a newer Helm.
  3. Change from using a Kubernetes API target to the new Kubernetes Agent target. It contains its own tooling that is continuously updated to the latest versions of helm and kubectl to match the target cluster.
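To check which Helm version a worker or execution container is using, and one way to upgrade it (this uses Helm's official install script; verify it is appropriate for your environment before running):

```bash
# Check the installed Helm version:
helm version --short

# One way to install the latest Helm 3 (official install script):
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
```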
@APErebus APErebus added the kind/bug This issue represents a verified problem we are committed to solving label Dec 18, 2024
@Clare-Octopus

One thing to note: if you are upgrading your Helm version, ensure it is compatible with the Kubernetes version on your cluster. Please see Helm's official documentation on this here: https://helm.sh/docs/topics/version_skew/#supported-version-skew

@octoreleasebot

Release Note: Mitigated an issue where Helm manifests could not be retrieved for Kubernetes Object Status. This feature requires Helm 3.13 or later.


Octobob commented Dec 19, 2024

🎉 The fix for this issue has been released in:

| Release stream | Release     |
|----------------|-------------|
| 2025.1         | 2025.1.4505 |
