Elasticsearch.__init__() got an unexpected keyword argument 'use_ssl' #34099

Closed
1 of 2 tasks
cesar-vermeulen opened this issue Sep 5, 2023 · 10 comments · Fixed by #34119

@cesar-vermeulen

cesar-vermeulen commented Sep 5, 2023

Apache Airflow version

2.7.0

What happened

When upgrading apache-airflow-providers-elasticsearch to the newest version (5.0.1), Airflow is unable to start. The scheduler and airflow-migrations both crash with the following error:

....................
ERROR! Maximum number of retries (20) reached.

Last check result:
$ airflow db check
Unable to load the config, contains a configuration error.
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/logging/config.py", line 573, in configure
    handler = self.configure_handler(handlers[name])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/logging/config.py", line 758, in configure_handler
    result = factory(**kwargs)
             ^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/elasticsearch/log/es_task_handler.py", line 110, in __init__
    self.client = elasticsearch.Elasticsearch(host, **es_kwargs)  # type: ignore[attr-defined]
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Elasticsearch.__init__() got an unexpected keyword argument 'use_ssl'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 5, in <module>
    from airflow.__main__ import main
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/__init__.py", line 68, in <module>
    settings.initialize()
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/settings.py", line 524, in initialize
    LOGGING_CLASS_PATH = configure_logging()
                         ^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/logging_config.py", line 74, in configure_logging
    raise e
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/logging_config.py", line 69, in configure_logging
    dictConfig(logging_config)
  File "/usr/local/lib/python3.11/logging/config.py", line 823, in dictConfig
    dictConfigClass(config).configure()
  File "/usr/local/lib/python3.11/logging/config.py", line 580, in configure
    raise ValueError('Unable to configure handler '
ValueError: Unable to configure handler 'task'
Exception ignored in atexit callback: <function shutdown at 0x7fb5255f22a0>
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/logging/__init__.py", line 2193, in shutdown
    h.close()
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/elasticsearch/log/es_task_handler.py", line 396, in close
    if not self.mark_end_on_close or getattr(self, "ctx_task_deferred", None):
           ^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'ElasticsearchTaskHandler' object has no attribute 'mark_end_on_close'
2023-09-05T06:04:58.154163010Z

What you think should happen instead

Deploy Airflow without any crashing schedulers

How to reproduce

I'm currently running Airflow 2.7.0 on AKS and trying to upgrade apache-airflow-providers-elasticsearch to 5.0.1.

The following config is present for Elasticsearch:

    [elasticsearch]

    frontend = https://*redacted*.kb.westeurope.azure.elastic-cloud.com/app/discover#/?_a=(columns:!(message),filters:!(),index:'6551970b-aa72-4e2a-b255-d296a6fcdc09',interval:auto,query:(language:kuery,query:'log_id:"{log_id}"'),sort:!(log.offset,asc))&_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-1m,to:now))

    json_format = True

    log_id_template = {dag_id}_{task_id}_{execution_date}_{try_number}

    [elasticsearch_configs]

    max_retries = 3

    retry_timeout = True

    timeout = 30

Operating System

Debian GNU/Linux 11 (bullseye)

Versions of Apache Airflow Providers

apache-airflow-providers-cncf-kubernetes==7.4.2
apache-airflow-providers-docker==3.7.4
apache-airflow-providers-microsoft-azure==4.3.0
apache-airflow-providers-elasticsearch==5.0.1

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@cesar-vermeulen cesar-vermeulen added the area:core, kind:bug, and needs-triage labels Sep 5, 2023
@eladkal
Contributor

eladkal commented Sep 5, 2023

cc @Owen-CH-Leung

@Owen-CH-Leung
Contributor

I'll look at it when I return from vacation next week

@Taragolis
Contributor

I guess the issue is that Elasticsearch 8 doesn't support all the default parameters from the https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#elasticsearch-configs section, which were previously supported by one of the removed libraries.

@cesar-vermeulen I guess it is not a problem in core. If you switch to version 4.5.1 of the provider, I guess it would fix your logging.
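
The failure mode described here can be reproduced without a cluster. Below is a minimal sketch, assuming a hypothetical stand-in class `StrictClient` whose constructor accepts only a fixed set of keyword arguments. It is not the real Elasticsearch client, and the option names and values are illustrative.

```python
class StrictClient:
    """Hypothetical stand-in for a client with a fixed keyword signature."""

    def __init__(self, host, *, max_retries=3, retry_on_timeout=False, timeout=10):
        self.host = host
        self.options = {
            "max_retries": max_retries,
            "retry_on_timeout": retry_on_timeout,
            "timeout": timeout,
        }


# Options as an older configuration might supply them (illustrative values).
legacy_kwargs = {"use_ssl": False, "max_retries": 3, "timeout": 30}

try:
    # The constructor has no **kwargs catch-all, so the unknown key is rejected.
    StrictClient("http://localhost:9200", **legacy_kwargs)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Any key that the constructor does not name explicitly is enough to trigger the `TypeError`, even when every other option is valid.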

@potiuk
Member

potiuk commented Sep 5, 2023

I am quite positive that this should be fixed in 2.7.1 via #33135

There was a problem in 2.7.0: the elasticsearch configuration was still provided by Apache Airflow core (we had not removed the configuration from Airflow when the new configuration was added to the elasticsearch provider). That was the only "provider" configuration left in 2.7.0, and it has been removed in 2.7.1.
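
To illustrate how a key such as `use_ssl` can reach the handler even though the reporter never set it, here is a rough sketch using the standard library's `configparser`. The contents of `CORE_DEFAULTS` are an assumption made for illustration and are not copied from Airflow, and this is not how Airflow itself loads its configuration.

```python
from configparser import ConfigParser

# Assumed defaults shipped by core for this section (illustrative only).
CORE_DEFAULTS = """
[elasticsearch_configs]
use_ssl = False
verify_certs = True
"""

# The user's own settings, as listed in the report above.
USER_CONFIG = """
[elasticsearch_configs]
max_retries = 3
retry_timeout = True
timeout = 30
"""

parser = ConfigParser()
parser.read_string(CORE_DEFAULTS)  # defaults are loaded first
parser.read_string(USER_CONFIG)    # user values are layered on top

# The effective section is the union of both sources.
effective = dict(parser["elasticsearch_configs"])
print(sorted(effective))
```

The merged section still contains the default keys, so a handler that forwards the whole section to the client also forwards options the user never wrote.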

We are currently testing 2.7.1 rc2 - see #34065

@cesar-vermeulen, can you please check whether this problem is reproducible in 2.7.1rc2?

Closing for now - we can always reopen if not fixed.

@potiuk potiuk closed this as completed Sep 5, 2023
@Taragolis
Contributor

@potiuk I think it also needs to be fixed at the provider level, otherwise it would be broken in Airflow 2.4-2.7.0

@potiuk potiuk reopened this Sep 5, 2023
@potiuk
Member

potiuk commented Sep 5, 2023

Correct. We need to fix it in the provider. We will likely need to yank the current provider

@Taragolis
Contributor

Let me try to fix it, seems like it is a "20 minutes adventure"

@potiuk
Member

potiuk commented Sep 5, 2023

Let me try to fix it, seems like it is a "20 minutes adventure"

I have a better fix :)

@Taragolis
Contributor

I have a better fix :)

Airflow 3 🤔

@potiuk
Member

potiuk commented Sep 5, 2023

#34119

potiuk added a commit to potiuk/airflow that referenced this issue Sep 6, 2023
The elasticsearch handler got all configuration parameters
from the "elasticsearch_configs" section, which means that in
Airflow versions before 2.7 it could receive old config keys that
render the new provider unusable.

This PR filters the configuration parameters so that only valid
parameters are passed to the new handler.

Fixes: apache#34099
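
The filtering idea in this commit message can be sketched as follows. This is a minimal illustration and not necessarily the code in the PR: `StrictClient` and `filter_kwargs` are hypothetical names, and the set of accepted parameters is derived from the constructor signature with `inspect.signature`.

```python
import inspect


class StrictClient:
    """Hypothetical stand-in for the real client class."""

    def __init__(self, host, *, max_retries=3, retry_on_timeout=False, timeout=10):
        self.host = host
        self.options = {
            "max_retries": max_retries,
            "retry_on_timeout": retry_on_timeout,
            "timeout": timeout,
        }


def filter_kwargs(target, kwargs):
    """Keep only the entries of kwargs that target's signature names explicitly."""
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    accepted = {
        name
        for name, param in inspect.signature(target).parameters.items()
        if param.kind in keyword_kinds
    }
    return {key: value for key, value in kwargs.items() if key in accepted}


# A section that mixes old keys with still-valid ones (illustrative values).
raw_kwargs = {"use_ssl": False, "verify_certs": True, "max_retries": 3, "timeout": 30}

safe_kwargs = filter_kwargs(StrictClient, raw_kwargs)
client = StrictClient("http://localhost:9200", **safe_kwargs)
print(safe_kwargs)
```

Deriving the allowed names from the signature avoids maintaining a hand-written allowlist, at the cost of silently dropping misspelled options.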
potiuk added a commit that referenced this issue Sep 7, 2023
…34119)

The elasticsearch handler got all configuration parameters
from the "elasticsearch_configs" section, which means that in
Airflow versions before 2.7 it could receive old config keys that
render the new provider unusable.

This PR filters the configuration parameters so that only valid
parameters are passed to the new handler.

Fixes: #34099