When using OpenTelemetry in our EKS cluster via the ADOT addon, the Python auto-instrumentation init container occasionally fails with exit code 2, preventing the main application from starting. The issue occurs non-deterministically and can be temporarily resolved by deleting the affected pod.
Logs from opentelemetry-auto-instrumentation-python (init):
This issue might be specific to the ADOT implementation, as upstream OpenTelemetry uses the standard cp command while ADOT uses a custom Rust utility. The switch to the Rust utility was made in this PR and the error is thrown here.
What did you expect to see?
Autoinstrumentation init container starts with no error.
What did you see instead?
Autoinstrumentation init container fails with exit code 2.
Open Questions
I'm struggling with a few questions around this issue:
What could cause the cp command to run twice? A restart of the init container?
Is it possible to ignore this error and let the application start? I couldn't see any option here.
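To illustrate the first question: the Kubernetes documentation notes that init containers can be restarted or re-executed and should therefore be idempotent. Below is a minimal sketch of how a copy step that refuses to overwrite an existing destination fails on a second run, while an overwrite-tolerant copy does not. This is plain Python, not the ADOT Rust utility's code, and I have not confirmed that this is the actual failure mode; the function names and directory names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def copy_strict(src: Path, dst: Path) -> None:
    """Copy a directory tree; raises FileExistsError if dst already exists."""
    shutil.copytree(src, dst)


def copy_idempotent(src: Path, dst: Path) -> None:
    """Copy a directory tree; safe to run again on an existing dst (Python 3.8+)."""
    shutil.copytree(src, dst, dirs_exist_ok=True)


def run_twice(copy_fn) -> str:
    """Run copy_fn twice against the same destination and report the outcome."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-ins for the image contents and the shared volume.
        src = Path(tmp) / "autoinstrumentation"
        dst = Path(tmp) / "shared-volume"
        src.mkdir()
        (src / "sitecustomize.py").write_text("# placeholder\n")

        copy_fn(src, dst)  # first init container run
        try:
            copy_fn(src, dst)  # simulated re-run against the same volume
        except FileExistsError:
            return "second run failed"
        return "second run ok"


if __name__ == "__main__":
    print("strict:", run_twice(copy_strict))
    print("idempotent:", run_twice(copy_idempotent))
```

If the real utility behaves like `copy_strict`, any re-execution of the init container against an already populated volume would fail, which would match the non-deterministic behavior described above.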
I switched from the adot-operator-autoinstrumentation-python image to the adot-autoinstrumentation-python image, and now it's looking better. But it opens up more questions:
What is the difference between the two images and which one is recommended?
Where is the source code for adot-operator-autoinstrumentation-python image? However, this is irrelevant if the image is deprecated.
Where can I find the source code or helm chart of adot-operator where the --auto-instrumentation-python-image is configured?
Environment
Platform: EKS 1.30
EKS Addon ADOT Version: v0.109.0-eksbuild.2
Auto-instrumentation Image: public.ecr.aws/aws-observability/adot-operator-autoinstrumentation-python:0.48b0