Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unnecessary instances of app.kubernetes.io/managed-by #3074

Merged
merged 12 commits into from
Sep 5, 2024

Conversation

jaronoff97
Copy link
Contributor

@jaronoff97 jaronoff97 commented Jun 25, 2024

Description:
This PR resolves a bug where we applying the app.kubernetes.io/managed-by to resources unnecessarily. We also relied on this label to be present which caused a nasty bug with instrumentation. Because we expect this to be on all CRDs and we use that for querying for upgrades, when we go to query for instrumentations, any that were applied by helm (which sets this label itself) would be omitted, causing them to fail to be upgraded. This change is technically breaking, but is also a bug fix because we are no longer failing the user expectation for auto instrumentation fixes.

Link to tracking Issue(s):

Testing: Unit, manual

Documentation:

@jaronoff97 jaronoff97 requested a review from a team June 25, 2024 19:21
Copy link
Contributor

@swiatekm swiatekm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ok, but I'm not 100% confident there's no edge case when upgrading to the operator version with this change.

@jaronoff97
Copy link
Contributor Author

jaronoff97 commented Jun 26, 2024

@swiatekm-sumo i think if we were going the other way (beginning to filter) that would be problematic, but this should encapsulate all collector CRs now, not just the ones created without helm. I will test this prior to merging.

@jpkrohling
Copy link
Member

Hey folks, are there plans to merge this? Someone asked me about this one today, they would benefit from this fix.

@jaronoff97
Copy link
Contributor Author

@jpkrohling i still need to do some testing on this. I want to be sure we're not going to break existing installations more than we need to. I'll see if I can prioritize that for this week, but I'm juggling a few responsibilities currently.

@kennytrytek-wf
Copy link

kennytrytek-wf commented Aug 23, 2024

I want to be sure we're not going to break existing installations

@jaronoff97 Release https://github.com/open-telemetry/opentelemetry-operator/releases/tag/v0.107.0 has already broken existing installations, since upgrade 105 is not being applied due to this bug. If the feature flag that was removed in the collector is present in the custom resource, then the collector pod will begin crashing due to invalid configuration. A workaround is to edit the CR to remove the offending flag, though this is tedious if you manage many collectors.

@jaronoff97
Copy link
Contributor Author

@kennytrytek-wf thanks for the report. I will make this my priority and this will block our next release. Sincere apologies for the tediousness there.

@kennytrytek-wf
Copy link

@jaronoff97 Thank you! We're currently updating all our CRs. :)

@swiatekm swiatekm self-requested a review August 27, 2024 10:03
@swiatekm
Copy link
Contributor

At this point I think we should either split the upgrade fix into its own PR or add another changelog entry. What we're doing here is way beyond just removing the labels.

@jaronoff97 jaronoff97 merged commit 09d0758 into open-telemetry:main Sep 5, 2024
33 checks passed
@jaronoff97 jaronoff97 deleted the fix-managed-by-gross branch September 5, 2024 12:00
@kennytrytek-wf
Copy link

Thank you for the quick turnaround on getting this merged and released!

@jaronoff97
Copy link
Contributor Author

no problem sorry it took so long. Testing this was really difficult 😭

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Issue with Deploying Otel Operator and Instrumentation CR instance via Helm
9 participants