Newly-created DAG is loaded in DB but existing DagBag instances are unable to get the DAG #10341

Closed
shivanshs9 opened this issue Aug 15, 2020 · 6 comments
Labels: kind:bug

Comments

shivanshs9 (Contributor) wrote:

Apache Airflow version: 1.10.11

Kubernetes version (if you are using kubernetes) (use kubectl version):

Environment:

  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

What happened:

I am using an Airflow plugin to generate dynamic DAGs, and Airflow successfully loads the new ORM DAG into the DB, so the DAG listing on the home page is updated. However, trying to refresh the DAG or open its graph view causes an error:

[2020-08-15 13:12:01,862] {app.py:1891} ERROR - Exception on /refresh [POST]
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/airflow/.local/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/www_rbac/decorators.py", line 121, in wrapper
    return f(self, *args, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/flask_appbuilder/security/decorators.py", line 109, in wraps
    return f(self, *args, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/www_rbac/decorators.py", line 56, in wrapper
    return f(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/db.py", line 74, in wrapper
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/www_rbac/views.py", line 1941, in refresh
    appbuilder.sm.sync_perm_for_dag(dag_id, dag.access_control)
AttributeError: 'NoneType' object has no attribute 'access_control'
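The underlying cause is that each gunicorn worker holds its own in-memory DagBag; DagBag.get_dag() returns None for a dag_id that worker has never parsed, and the refresh view then dereferences that None (the sync_perm_for_dag call in the traceback). A minimal, standalone illustration of the miss (assuming a default Airflow 1.10.x install; the dag_id is hypothetical):

from airflow.models import DagBag

# Parses the files under the configured DAGs folder for this process only.
dagbag = DagBag()

# A DAG created after this process built its DagBag is not in dagbag.dags yet.
dag = dagbag.get_dag("dag_created_after_this_worker_started")  # hypothetical dag_id
print(dag)  # None -> dag.access_control raises AttributeError, as in the traceback above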

What you expected to happen:

I expected refresh, trigger, and the other DAG views to work normally.

How to reproduce it:

  • Launch the Airflow webserver and scheduler as usual.
  • Create a new DAG at runtime using the dagen-airflow plugin: use the Dagen UI to create the DAG and approve it.
  • Go to the Airflow home page; the newly-created DAG is listed there.
  • Click the refresh link and the error above appears.

Anything else we need to know:

I understand that waiting for all Airflow web workers to be recycled (and tweaking the worker_refresh_interval config) would help here; after all, the issue is that the in-memory DagBag instances have not yet collected the new DAG.
While a restart helps, I propose a boolean configuration option such as attempt_refresh_dagbag (defaulting to False for backwards compatibility). When it is True and the DagBag does not have the DAG loaded (i.e. DagBag.get_dag() returns None), the webserver would attempt to load the DAG directly by processing the file recorded in the DagModel, as in the sketch below.
This would be a better option for those who do not want to wait for the new DAG to reach all workers. They could even increase worker_refresh_interval for performance and still work with new DAGs right away.
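
A rough sketch of what such a fallback could look like. This is illustrative only: attempt_refresh is a hypothetical flag (Airflow has no attempt_refresh_dagbag option) and get_dag_or_refresh is a made-up helper; it just shows a DagBag miss falling back to parsing the file recorded in the DagModel row.

# Hypothetical sketch of the proposed fallback; get_dag_or_refresh and the
# attempt_refresh flag are illustrative, not existing Airflow code.
from airflow.models import DagModel
from airflow.settings import Session


def get_dag_or_refresh(dagbag, dag_id, attempt_refresh=True):
    """Return the DAG from the DagBag, parsing its file on a miss."""
    dag = dagbag.get_dag(dag_id)
    if dag is None and attempt_refresh:
        session = Session()
        try:
            # The scheduler has already recorded the new DAG's file location in the DB.
            orm_dag = session.query(DagModel).filter(DagModel.dag_id == dag_id).first()
            if orm_dag is not None:
                # Parse that file so this worker's in-memory DagBag picks up the DAG.
                dagbag.process_file(orm_dag.fileloc, only_if_updated=False)
                dag = dagbag.get_dag(dag_id)
        finally:
            session.close()
    return dag

Wired into views such as refresh and graph (gated by the proposed config option), this would let a DAG that exists in the DB but not yet in a given worker's DagBag be loaded on demand instead of raising the AttributeError above.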

shivanshs9 added the kind:bug label Aug 15, 2020
boring-cyborg bot commented Aug 15, 2020

Thanks for opening your first issue here! Be sure to follow the issue template!

kaxil (Member) commented Aug 17, 2020

#10328 should provide you with an endpoint to force-refresh all the DAGs.

shivanshs9 (Contributor, Author) commented:

@kaxil that would refresh DAGs from the DB only for the process that received the POST request, right?
I think opening the DAG would still randomly fail even after the "refresh all" button is clicked.

It would still be a better solution, though, since it would let one attempt to refresh the in-memory DagBag from the DB in all the workers. 🤔

kaxil (Member) commented Aug 17, 2020

That will refresh DAGs from the DB if DAG Serialization is enabled; if not, it will refresh them from the DAG files.

shivanshs9 (Contributor, Author) commented:

That will refresh DAGs from the DB if DAG Serialization is enabled; if not, it will refresh them from the DAG files.

I understand that, but I'd like to confirm that it will only refresh the DAGs in the in-memory DagBag instance of the specific gunicorn worker process that received the request.
With more than one web worker, trying to open DAG details or trigger the DAG will still randomly fail, since the "refresh all" request may have been POSTed to a different worker.

To restate my proposal from the issue description: an optional, off-by-default boolean option such as attempt_refresh_dagbag that, when DagBag.get_dag() returns None, loads the DAG directly by processing the file recorded in the DagModel.

What I meant above is a catch-all handler (optional and off by default) to work around this randomness of the bug.

eladkal (Contributor) commented Oct 10, 2021

This issue is reported against an old version of Airflow which is end-of-life.
If the issue is still present in the latest Airflow version, please let us know.

eladkal closed this as completed Oct 10, 2021