
batchprocessor: send_batch_max_size_bytes limit #6046

Open
jangaraj opened this issue Sep 10, 2022 · 10 comments

Comments

@jangaraj
Contributor

Is your feature request related to a problem? Please describe.
The Golang gRPC server has a default message limit of 4MB. The batch processor can generate a bigger message, so the receiver will reject it and the whole batch can be dropped:

"msg": "Exporting failed. The error is not retryable. Dropping data.",
"kind": "exporter",
"data_type": "traces",
"name": "otlp",
"error": "Permanent error: rpc error: code = ResourceExhausted desc = grpc: received message after decompression larger than max (5297928 vs. 4194304)",
"dropped_items": 4725,

The current batchprocessor config options don't provide a way to prevent this situation, because they work only with span counts, not with the overall batch size. send_batch_max_size is also a count of spans.
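For reference, a rough sketch of the count-based knobs available today (values are illustrative, not defaults):

processors:
  batch:
    timeout: 10s
    send_batch_size: 8192       # number of spans/metric points/log records, not bytes
    send_batch_max_size: 10000  # hard upper limit, also a count of items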

Describe the solution you'd like
A new config option send_batch_max_size_bytes (maybe there is a better name), defaulting to the gRPC 4MB limit (4194304), which would ensure that a batch won't exceed this size.
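A rough sketch of how the proposed option could look (send_batch_max_size_bytes is hypothetical and does not exist yet):

processors:
  batch:
    send_batch_max_size: 10000           # existing count-based cap
    send_batch_max_size_bytes: 4194304   # hypothetical: 4 MiB, the default gRPC message limit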

Describe alternatives you've considered
At the moment the user can customize send_batch_size/send_batch_max_size, but in theory a few traces with huge spans (e.g. Java backtraces with logs) can still exceed the default 4MB gRPC message limit. Alternatively, the OTLP exporter could handle this message limitation.

@evandam

evandam commented Oct 13, 2022

👍 for this feature. We're currently doing some trial and error to figure out the right balance of send_batch_size and send_batch_max_size and hoping it stays under 4MB, but having a guarantee would definitely be preferred.

@jangaraj
Contributor Author

@evandam

evandam commented Oct 14, 2022

Nice link, thank you! It definitely still relies on some back-of-the-envelope math which is bound to be wrong sooner or later, and it would be great to have an easy way to do this at the exporter/collector level.
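To illustrate the kind of back-of-the-envelope math I mean (the per-span size is just a guess and varies a lot per workload):

processors:
  batch:
    # assuming an average encoded span of ~2 KiB:
    # 2048 spans x 2 KiB ≈ 4 MiB, i.e. exactly the default gRPC limit
    send_batch_size: 1024
    send_batch_max_size: 1536   # keep headroom (~3 MiB) below the 4 MiB ceiling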

@dmitryax
Member

dmitryax commented Apr 5, 2023

Size-based batching will only work if the processor is being used with the OTLP exporter; other exporters will have different batch sizes due to different encodings. I believe if we go with #4646, we should be able to provide this for any exporter.

@cwegener

Describe alternatives you've considered
At the moment the user can customize send_batch_size/send_batch_max_size, but in theory a few traces with huge spans (e.g. Java backtraces with logs) can still exceed the default 4MB gRPC message limit. Alternatively, the OTLP exporter could handle this message limitation.

For those willing to configure a different amount of memory to be allocated for each gRPC message on the downstream OTLP Collector's receiver config, there is also the max_recv_msg_size_mib option.

#1122 (comment)
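For context, a rough sketch of where max_recv_msg_size_mib lives on the downstream collector (it is a server-side/receiver setting, not an exporter one; values are illustrative):

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        max_recv_msg_size_mib: 16   # raise the server-side limit above the 4 MiB default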

@elysiumHL

For those willing to configure a different amount of memory to be allocated for each gRPC message on the downstream OTLP Collector's receiver config, there is also the max_recv_msg_size_mib option.

#1122 (comment)

  • this param did not work for the otlp exporter
  • this is my config:
exporters:
  debug:
    # verbosity: detailed
    verbosity: normal
  otlp/tempo:
    max_recv_msg_size_mib: 200
    endpoint: tempo:4317
    tls:
      insecure: true
    auth:
      authenticator: headers_setter
  • and this is the log output:
2023/10/31 09:43:11 collector server run finished with error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

* error decoding 'exporters': error reading configuration for "otlp/tempo": 1 error(s) decoding:

* '' has invalid keys: max_recv_msg_size_mib

@cwegener

this param did not work for the otlp exporter

No, it won't. The maximum receive message size is only for the gRPC server side.

On the gRPC client side, the client's max receive message size must be provided in the call options when the client makes a call to the gRPC server.

What is your OTEL collector use case where the exporter receives such large messages from the remote OTLP receiver though? I cannot think of a scenario where this would even be the case.

@lmnogues

Did you manage to solve this issue?

@ptodev

ptodev commented Jun 21, 2024

I think this is the issue which would resolve this eventually.

@smoke

smoke commented Jul 2, 2024

Is there a way to dump / debug spans causing that?

Update: I figured it out by configuring the otel collector this way, so it prints both the error message and all of the span details it sends:

...
config:
  exporters:
    otlp:
      endpoint: "otel-lb-collector:4317"
      tls:
        insecure: true
    debug: {}
    debug/detailed:
      verbosity: detailed
  extensions:
    health_check: {}
  processors:
    resourcedetection:
      detectors: [env, system]
    batch:
      send_batch_size: 1
      send_batch_max_size: 1
  ...
  service:
  ...
    pipelines:
    ...
      traces:
        exporters:
          - debug
          - debug/detailed
          - otlp
    ...

In my case the culprit was the Python Pymongo instrumentation with capture_statement enabled, so all of the content of an insert statement was captured.
It was sent to the otel-agent through otlp/http fine, and then the error happens when the otel-agent sends through otlp/grpc to the otel-gw.
