-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support timeouts in otel-cli exec #55
Comments
It's maybe not ideal to use the same timeout for various parts of otel-cli but right now I am leaning towards only having one to keep the UX simple and to make it default in every part of otel-cli. We never want to hang someone's script because of otel-cli if we can avoid it. I wrote up #55 to make sure someone circles back and adds a timeout to cmd/exec.go implements OTEL_EXPORTER_OTLP_TIMEOUT and `timeout` in the config file Restores default listen host on just server tui and server json. This could maybe have been cleaner but this seems ok to move things forward. Signed-off-by: Amy Tobey <atobey@equinix.com>
Modifies otel-cli so that it fails gracefully when no explicit configuration has been set. This means that the default endpoint is no longer present, and you will need to pass --endpoint or set OTEL_EXPORTER_OTLP_ENDPOINT to get otel-cli to emit spans like it used to by default. This change makes it so otel-cli can be added to programs unconfigured, shipped, and later enabled via adding the OTEL_EXPORTER_OTLP_ENDPOINT environment variable. * remove default endpoint, stub initTracer() Having a default means each invocation of otel-cli might timeout on trying to connect over OTLP. With this change, in the absence of --endpoint or OTEL_EXPORTER_OTLP_ENDPOINT otel-cli will now not configure opentelemetry-go at all, but the rest of otel-cli will work as usual, just no spans are sent out. Signed-off-by: Amy Tobey <atobey@equinix.com> * use a single global timeout flag It's maybe not ideal to use the same timeout for various parts of otel-cli but right now I am leaning towards only having one to keep the UX simple and to make it default in every part of otel-cli. We never want to hang someone's script because of otel-cli if we can avoid it. I wrote up #55 to make sure someone circles back and adds a timeout to cmd/exec.go implements OTEL_EXPORTER_OTLP_TIMEOUT and `timeout` in the config file Restores default listen host on just server tui and server json. This could maybe have been cleaner but this seems ok to move things forward. Signed-off-by: Amy Tobey <atobey@equinix.com> * add notes about fail-open behavior to README Signed-off-by: Amy Tobey <atobey@equinix.com> * rename short variable to be more readable per code review from @ahayworth Co-authored-by: Andrew Hayworth <ahayworth@gmail.com> Signed-off-by: Amy Tobey <atobey@equinix.com> * update help text for --timeout per code review Signed-off-by: Amy Tobey <atobey@equinix.com> Co-authored-by: Andrew Hayworth <ahayworth@gmail.com>
Hi 🙂 That flag seems to default to 1s which came as a bit of a surprise when I tried Since the timeout is also relevant for sending the span at the end of an exec this error is a bit misleading IMO:
The result is (I guess in combination with #258) that if I wrap a command that takes more than the allowed deadline, this error is printed and no root-span is submitted to the endpoint 🙁 |
@zerok gotcha. Yeah that's awkward. The automatic timeout is a leftover from early development when I was worried about otel-cli hanging and messing up peoples' scripts. I feel that otel-cli should never run without some timeout, so it doesn't end up lingering on machines. So, my inclination is to wrap that error and return better information so you know you need to set a big timeout for your commands. |
0.4.1 is released and the issue should be solved. Please reopen if the issue persists. |
When #54 merges, there will be a global timeout flag. As things are,
otel-cli exec
will accept that flag and then ignore it. Even a simple take on timing out the subprocess seems very handy for users.It perhaps overlaps with the timeout(1) command, but can set status after too so it seems worth it.
The text was updated successfully, but these errors were encountered: