Skip to content

Elasticsearch sanitization does not work for bulk queries #1868

@phillipuniverse

Description

@phillipuniverse

Describe your environment

Discovered in elasticsearch 5.5.3 and elasticsearch-dsl 5.4.0 and caused by moving to the default sanitization in #1758.

The issue is illustrated here where body comes in as a string, not as a dictionary:

image

This is caseud by the bulk flow specifically as the body gets translated to a string here:

image

which looks like this:

image

Steps to reproduce

I don't have a super straightforward way to reproduce other than to use the bulk API from elasticsearch.

What is the expected behavior?
What did you expect to see?

What is the actual behavior?

The below stacktrace:

  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/someenv/lib/python3.11/site-packages/elasticsearch/helpers/__init__.py", line 95, in _process_bulk_chunk
    resp = client.bulk('\n'.join(bulk_actions) + '\n', **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/someenv/lib/python3.11/site-packages/elasticsearch/client/utils.py", line 73, in _wrapped
    return func(*args, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/someenv/lib/python3.11/site-packages/elasticsearch/client/__init__.py", line 1173, in bulk
    return self.transport.perform_request('POST', _make_path(index,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/someenv/lib/python3.11/site-packages/opentelemetry/instrumentation/elasticsearch/__init__.py", line 224, in wrapper
    attributes[SpanAttributes.DB_STATEMENT] = sanitize_body(
                                              ^^^^^^^^^^^^^^
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/someenv/lib/python3.11/site-packages/opentelemetry/instrumentation/elasticsearch/utils.py", line 54, in sanitize_body
    flatten_body = _flatten_dict(body)
                   ^^^^^^^^^^^^^^^^^^^
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/someenv/lib/python3.11/site-packages/opentelemetry/instrumentation/elasticsearch/utils.py", line 30, in _flatten_dict
    for k, v in d.items():
                ^^^^^^^
AttributeError: 'str' object has no attribute 'items'

Additional context
Add any other context about the problem here.

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions