Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[exporter/elasticsearch] add support for batcher config #34238

Merged
merged 7 commits into from
Jul 30, 2024

Conversation

axw
Copy link
Contributor

@axw axw commented Jul 24, 2024

Description:

Add opt-in support for the experimental batch sender (open-telemetry/opentelemetry-collector#8122). When opting into this functionality the exporter's Consume* methods will make synchronous bulk requests to Elasticsearch, without additional batching/buffering in the exporter.

By default the exporter continues to use its own buffering, which supports thresholds for time, number of documents, and size of encoded documents in bytes. The batch sender does not currently support a bytes-based threshold, and is experimental, hence why we are not yet making it the default for the Elasticsearch exporter.

This PR is based on #32632, but made to be non-breaking.

Link to tracking Issue:

#32377

Testing:

Added unit and integration tests.

Manually tested with Elasticsearch, using the following config to highlight that client metadata can now flow through all the way:

receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
        include_metadata: true

exporters:
  elasticsearch:
    endpoint: "http://localhost:9200"
    auth:
      authenticator: headers_setter
    batcher:
      enabled: false

extensions:
  headers_setter:
    headers:
      - action: insert
        key: Authorization
        from_context: authorization

service:
  extensions: [headers_setter]
  pipelines:
    traces:
      receivers: [otlp]
      processors: []
      exporters: [elasticsearch]

I have Elasticsearch running locally, with an "admin" user with the password "changeme". Sending OTLP/HTTP to the collector with telemetrygen traces --otlp-http --otlp-insecure http://localhost:4318 --otlp-header "Authorization=\"Basic YWRtaW46Y2hhbmdlbWU=\"", I observe the following:

  • Without the batcher config, the exporter fails to index data into Elasticsearch due to an auth error. That's because the exporter is buffering and dropping the context with client metadata, so there's no Authorization header attached to the requests going out.
  • With batcher: {enabled: true}, same behaviour as above. Unlike the batch processor, the batch sender does not maintain client metadata.
  • With batcher: {enabled: false}, the exporter successfully indexes data into Elasticsearch.

Documentation:

Updated the README.

Add opt-in support for the experimental batcher helper -
open-telemetry/opentelemetry-collector#8122

When opting into this functionality the exporter's Consume*
methods will make synchronous bulk requests to Elasticsearch,
without additional batching/buffering in the exporter.
@axw axw marked this pull request as ready for review July 24, 2024 06:36
@axw axw requested review from a team and fatsheep9146 July 24, 2024 06:36
Copy link
Contributor

@carsonip carsonip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! looks great at a high level

exporter/elasticsearchexporter/README.md Outdated Show resolved Hide resolved
exporter/elasticsearchexporter/factory.go Show resolved Hide resolved
@axw axw requested a review from carsonip July 25, 2024 05:37
@axw
Copy link
Contributor Author

axw commented Jul 25, 2024

Apparently the backoff function is stateful, as can be seen from the data races in the tests. Working on a fix.

axw and others added 2 commits July 25, 2024 18:19
Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>
- Add flush timeout to sync bulk indexer, disable timeout_sender.
- Create a new backoff function for each call to Flush in the sync
  bulk indexer, since the function is stateful and not safe for
  concurrent calls to Flush.
@mx-psi mx-psi merged commit 91dce71 into open-telemetry:main Jul 30, 2024
154 checks passed
@github-actions github-actions bot added this to the next release milestone Jul 30, 2024
@axw axw deleted the elasticsearch-sync-bulkindexer branch July 30, 2024 09:41
dmitryax added a commit to open-telemetry/opentelemetry-collector that referenced this pull request Aug 17, 2024
#### Description

This PR adds opt-in support to oltp exporter for the experimental batch
sender
(#8122).
By default batch sender is not enabled.

Similar:
*
open-telemetry/opentelemetry-collector-contrib#34238
*
open-telemetry/opentelemetry-collector-contrib#32563

#### Link to tracking issue

Resolves
#10834

#### Testing

`exporter/otlpexporter/config_test.go`

#### Documentation

Updated the `oltpexporter` README to point to `exporterhelper` README
for batching.

---------

Co-authored-by: Dmitrii Anoshin <anoshindx@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants