Eager exporting for BatchSpanProcessor #3094

Closed
1 of 2 tasks
seemk opened this issue Jul 14, 2022 · 8 comments · Fixed by #3958
Comments

@seemk
Contributor

seemk commented Jul 14, 2022

The batch span processor currently exports batches on a strict interval: a 5 s delay and a 512-span batch size per OTel's default configuration, which caps throughput at roughly 100 spans per second. It does not export eagerly: if more than 512 spans remain in the span queue after an export, the BSP waits another 5 seconds instead of starting a new export. So if the application generates more than about 100 spans per second, the BSP queue eventually fills up and spans are dropped.
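
To make the arithmetic concrete, here is a minimal sketch of the timer-only behavior, assuming stand-in names (`Span`, `queue`, `exporter`); this is a simplified model, not the actual BatchSpanProcessor source:

```ts
// A simplified model of the current timer-only behavior; hypothetical names.
type Span = { name: string };

const scheduledDelayMillis = 5_000; // OTEL_BSP_SCHEDULE_DELAY default
const maxExportBatchSize = 512;     // OTEL_BSP_MAX_EXPORT_BATCH_SIZE default
const maxQueueSize = 2_048;         // OTEL_BSP_MAX_QUEUE_SIZE default

const queue: Span[] = [];
const exporter = { export(batch: Span[]): void { /* send over the wire */ } };

// One batch per tick, regardless of backlog. The sustained ceiling is
// maxExportBatchSize / scheduledDelayMillis = 512 / 5 s ≈ 102 spans/s.
setInterval(() => {
  const batch = queue.splice(0, maxExportBatchSize);
  if (batch.length > 0) exporter.export(batch);
  // Whatever is still queued waits for the next tick; once `queue` holds
  // maxQueueSize spans, newly ended spans are silently dropped.
}, scheduledDelayMillis);
```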

Most SDKs seem to do eager exporting as per my comment here.

Even though the delay and batch size are configurable via environment variables, a default throughput cap of roughly 100 spans per second, with every span beyond that silently dropped, seems worth improving.

  • This only affects the JavaScript OpenTelemetry library
  • This may affect other libraries, but I would like to get opinions here first
@dyladan
Member

dyladan commented Jul 14, 2022

Would you suggest eager exporting be the default or a configurable setting? Is there relevant spec guidance?

@seemk
Contributor Author

seemk commented Jul 14, 2022

I guess, reading the spec strictly, the current BSP is compliant.

What if we changed the current implementation so that, if more than OTEL_BSP_MAX_EXPORT_BATCH_SIZE spans remain in the buffer after an export, another export starts right away? If fewer remain, the usual timer is started.
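
Roughly like this, with hypothetical helper names (`exportBatch`, `startTimer`, `flush`); the real processor's internals differ:

```ts
// A sketch of the proposed eager re-export, using hypothetical names.
type Span = { name: string };

const scheduledDelayMillis = 5_000;
const maxExportBatchSize = 512;
const queue: Span[] = [];

async function exportBatch(batch: Span[]): Promise<void> {
  /* hand the batch to the exporter */
}

function startTimer(): void {
  setTimeout(() => void flush(), scheduledDelayMillis);
}

async function flush(): Promise<void> {
  await exportBatch(queue.splice(0, maxExportBatchSize));
  if (queue.length >= maxExportBatchSize) {
    void flush();  // a full batch is already waiting, so export eagerly
  } else {
    startTimer();  // otherwise wait out the usual scheduled delay
  }
}
```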

@dyladan
Member

dyladan commented Jul 14, 2022

I made a comment on the spec, but there are two ways to solve this. One is to export multiple batches every x ms. The other is to export a batch as soon as the queue meets or exceeds the max export batch size.
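
The second option could look roughly like this (hypothetical names again): the queue length is checked on every ended span, so a flush fires as soon as a full batch exists rather than on the next timer tick.

```ts
// A sketch of option 2: trigger an export the moment the queue holds a
// full batch, instead of waiting for the timer. Names are hypothetical.
type Span = { name: string };

const maxExportBatchSize = 512;
const maxQueueSize = 2_048;
const queue: Span[] = [];

function flush(): void { /* export one batch, as sketched above */ }

function onEnd(span: Span): void {
  if (queue.length >= maxQueueSize) return; // queue full: span is dropped
  queue.push(span);
  if (queue.length >= maxExportBatchSize) flush();
}
```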

@github-actions

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the stale label Sep 19, 2022
@github-actions

This issue was closed because it has been stale for 14 days with no activity.

@morigs
Contributor

morigs commented Nov 7, 2022

I think this should be reopened. The problem is actually quite serious: it is too easy to lose spans when you don't know the span rate beforehand. Right now the BSP works as a rate limiter, not as a batcher.
@dyladan Both solutions are good. I personally prefer the second because it keeps memory usage at the same level and avoids OOMs.
What do you think? How should this be implemented? Are changes to the spec required?

@dyladan
Member

dyladan commented Nov 7, 2022

I agree this is not stale. Thanks for bringing it back up. I also think both solutions are acceptable.

@GregLahaye

Hey all, the BSP spec has been updated to specify that it should perform eager exporting: open-telemetry/opentelemetry-specification#3024

I see there's already a PR open (#3458), but I'm happy to help with a fix if needed, as we've had many issues with spans being silently dropped.
