TypeError raised on Celery exception #3280

Closed
jezdez opened this issue Jan 14, 2019 · 6 comments

@jezdez
Member

jezdez commented Jan 14, 2019

Issue Summary

In the API endpoint that returns the status of a job, we've seen an increased number of TypeErrors raised when the query for the job in question fails (for an unrelated reason) with an sqlalchemy.exc.IntegrityError exception. The cause of that exception is not clear (possibly fork related), but the purpose of this ticket is to investigate how errors are handled in general; the cause of the error is not relevant per se.

celery/celery#5057 describes the result pretty well, but here's our traceback:

TypeError: __init__() takes at least 4 arguments (2 given)
  File "flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "flask_restful/__init__.py", line 477, in wrapper
    resp = resource(*args, **kwargs)
  File "flask_login/utils.py", line 228, in decorated_view
    return func(*args, **kwargs)
  File "flask/views.py", line 84, in view
    return self.dispatch_request(*args, **kwargs)
  File "redash/handlers/base.py", line 31, in dispatch_request
    return super(BaseResource, self).dispatch_request(*args, **kwargs)
  File "flask_restful/__init__.py", line 587, in dispatch_request
    resp = meth(*args, **kwargs)
  File "redash/handlers/query_results.py", line 297, in get
    return {'job': job.to_dict()}
  File "redash/tasks/queries.py", line 159, in to_dict
    task_info = self._async_result._get_task_meta()
  File "celery/result.py", line 410, in _get_task_meta
    return self._maybe_set_cache(self.backend.get_task_meta(self.id))
  File "celery/backends/base.py", line 365, in get_task_meta
    meta = self._get_task_meta_for(task_id)
  File "celery/backends/base.py", line 680, in _get_task_meta_for
    return self.decode_result(meta)
  File "celery/backends/base.py", line 284, in decode_result
    return self.meta_from_decoded(self.decode(payload))
  File "celery/backends/base.py", line 280, in meta_from_decoded
    meta['result'] = self.exception_to_python(meta['result'])
  File "celery/backends/base.py", line 260, in exception_to_python
    exc = cls(*exc_msg if isinstance(exc_msg, tuple) else exc_msg)

Steps to Reproduce

  1. Have a job with a NULL value in the data column (which is not allowed AFAIK).
  2. Try to fetch status of job via /api/jobs/

Technical details:

  • Redash Version: 6.0.x
  • Browser/OS: Firefox
  • How did you install Redash: Docker
@tyburn117

tyburn117 commented Jan 23, 2019

I'm seeing the same error.

Environment

  • redash v6.0.0.b8537 (docker image)
  • browser : chrome
  • run on docker containers
  • Data source : druid

Error Message

[2019-01-23 01:23:40,355] ERROR in app: Exception on /api/jobs/27477790-1fc2-4583-8d39-4ab56c5130f1 [GET]
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/local/lib/python2.7/dist-packages/flask_restful/__init__.py", line 477, in wrapper
    resp = resource(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/flask_login/utils.py", line 228, in decorated_view
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/flask/views.py", line 84, in view
    return self.dispatch_request(*args, **kwargs)
  File "/app/redash/handlers/base.py", line 31, in dispatch_request
    return super(BaseResource, self).dispatch_request(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/flask_restful/__init__.py", line 587, in dispatch_request
    resp = meth(*args, **kwargs)
  File "/app/redash/handlers/query_results.py", line 270, in get
    return {'job': job.to_dict()}
  File "/app/redash/tasks/queries.py", line 159, in to_dict
    task_info = self._async_result._get_task_meta()
  File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 410, in _get_task_meta
    return self._maybe_set_cache(self.backend.get_task_meta(self.id))
  File "/usr/local/lib/python2.7/dist-packages/celery/backends/base.py", line 365, in get_task_meta
    meta = self._get_task_meta_for(task_id)
  File "/usr/local/lib/python2.7/dist-packages/celery/backends/base.py", line 680, in _get_task_meta_for
    return self.decode_result(meta)
  File "/usr/local/lib/python2.7/dist-packages/celery/backends/base.py", line 284, in decode_result
    return self.meta_from_decoded(self.decode(payload))
  File "/usr/local/lib/python2.7/dist-packages/celery/backends/base.py", line 280, in meta_from_decoded
    meta['result'] = self.exception_to_python(meta['result'])
  File "/usr/local/lib/python2.7/dist-packages/celery/backends/base.py", line 260, in exception_to_python
    exc = cls(*exc_msg if isinstance(exc_msg, tuple) else exc_msg)
TypeError: __init__() takes at least 4 arguments (2 given)

@boutibi

boutibi commented Feb 9, 2019

Same error here.

Environment:

  • redash v6 / docker image
  • browser : chrome
  • run on docker containers
  • Data source : Spark.

error:

[2019-02-09 03:25:22,738][PID:417572][ERROR][redash] Exception on /api/jobs/82d32a80-3223-4668-9c0c-68e4b2a78654 [GET]
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/python2.7/site-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 477, in wrapper
    resp = resource(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/flask_login/utils.py", line 228, in decorated_view
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/flask/views.py", line 84, in view
    return self.dispatch_request(*args, **kwargs)
  File "/opt/redash/redash/handlers/base.py", line 31, in dispatch_request
    return super(BaseResource, self).dispatch_request(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 587, in dispatch_request
    resp = meth(*args, **kwargs)
  File "/opt/redash/redash/handlers/query_results.py", line 317, in get
    return {'job': job.to_dict()}
  File "/opt/redash/redash/tasks/queries.py", line 159, in to_dict
    task_info = self._async_result._get_task_meta()
  File "/usr/lib/python2.7/site-packages/celery/result.py", line 410, in _get_task_meta
    return self._maybe_set_cache(self.backend.get_task_meta(self.id))
  File "/usr/lib/python2.7/site-packages/celery/backends/base.py", line 365, in get_task_meta
    meta = self._get_task_meta_for(task_id)
  File "/usr/lib/python2.7/site-packages/celery/backends/base.py", line 680, in _get_task_meta_for
    return self.decode_result(meta)
  File "/usr/lib/python2.7/site-packages/celery/backends/base.py", line 284, in decode_result
    return self.meta_from_decoded(self.decode(payload))
  File "/usr/lib/python2.7/site-packages/celery/backends/base.py", line 280, in meta_from_decoded
    meta['result'] = self.exception_to_python(meta['result'])
  File "/usr/lib/python2.7/site-packages/celery/backends/base.py", line 260, in exception_to_python
    exc = cls(*exc_msg if isinstance(exc_msg, tuple) else exc_msg)
TypeError: __init__() takes at least 4 arguments (2 given)

@washort

washort commented Feb 25, 2019

This appears intractable from our side. It results from Celery's assumptions about exception objects not matching how SQLAlchemy exceptions behave, and from Celery not having any way to query task state without trying to deserialize the stored exception object.
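
To illustrate the mismatch, here is a minimal sketch that uses only the standard library. `RichError`, `encode_exception`, and `decode_exception` are hypothetical stand-ins, not Celery or SQLAlchemy code. The assumption is that a JSON-serialized result keeps only the class name and the message, and that rebuilding the exception calls the class with that message alone, as the `exc = cls(*exc_msg ...)` line in the tracebacks above suggests.

```python
import json


class RichError(Exception):
    """Hypothetical stand-in for a rich exception such as sqlalchemy.exc.IntegrityError."""

    def __init__(self, statement, params, orig):
        # Only the formatted message ends up in the base Exception's args.
        super().__init__("(%s) %s" % (type(orig).__name__, orig))
        self.statement = statement
        self.params = params
        self.orig = orig


def encode_exception(exc):
    # Assumption: the JSON payload keeps the class name and the message only.
    return json.dumps({"exc_type": type(exc).__name__, "exc_message": [str(exc)]})


def decode_exception(payload, known_types):
    meta = json.loads(payload)
    cls = known_types[meta["exc_type"]]
    # Rebuild by calling the constructor with the message as its only argument.
    return cls(*meta["exc_message"])


payload = encode_exception(RichError("SELECT 1", {}, ValueError("null value in column")))

try:
    decode_exception(payload, {"RichError": RichError})
    decode_failed = False
    decode_error = ""
except TypeError as err:
    # The constructor needs statement, params and orig, but got only one string.
    decode_failed = True
    decode_error = str(err)
```

On Python 3 the wording of the `TypeError` differs from the Python 2.7 message in the tracebacks, but the cause is the same: the constructor requires more arguments than the stored message can supply.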

@jezdez
Member Author

jezdez commented Feb 26, 2019

Note, I've left some details about my investigation in the upstream ticket celery/celery#5057.

My understanding:

a) SQLAlchemy uses rich exception classes that Celery currently can't serialize with the JSON serializer (the default in 4.x) for the broker.

b) Redash switched from the pickle serializer to the JSON serializer with the upgrade from 3.x to 4.x, on the assumption that exceptions would continue to be serialized successfully.

Options to move forward:

  1. Switch to pickle for serialization again (easy, but not exciting)

  2. Wrap all Celery tasks in exception handling for SQLAlchemy exceptions, simplifying them into our own non-rich exception class that Celery can handle (harder, but future-proof)
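
A rough sketch of what option 2 could look like, using only the standard library. `SimpleTaskError`, `simplify_exceptions`, `RichError`, and `load_job` are hypothetical names for illustration and not existing Redash or Celery code. The idea is that the replacement exception takes a single string argument, so it can be rebuilt from its message alone.

```python
import functools


class SimpleTaskError(Exception):
    """Hypothetical plain exception: one string argument is enough to rebuild it."""


class RichError(Exception):
    """Hypothetical stand-in for a rich exception with required constructor arguments."""

    def __init__(self, statement, params, orig):
        super().__init__(str(orig))
        self.statement = statement
        self.params = params
        self.orig = orig


def simplify_exceptions(rich_types):
    """Return a decorator that re-raises any of rich_types as SimpleTaskError."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except rich_types as exc:
                # Keep the original class name and message as plain text.
                raise SimpleTaskError("%s: %s" % (type(exc).__name__, exc))

        return wrapper

    return decorator


@simplify_exceptions((RichError,))
def load_job(job_id):
    # Stand-in for a task body whose database call fails.
    raise RichError("SELECT 1", {"id": job_id}, ValueError("connection lost"))
```

In Redash the tuple passed to the decorator would presumably contain `sqlalchemy.exc.SQLAlchemyError`, and the decorator would be applied to the function before it is registered as a Celery task; both points are assumptions rather than a tested change.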

@arikfr What do you think?
@washort Does that make sense to you, too?

@washort

washort commented Feb 26, 2019

I believe I've solved the underlying issue. #3499

@rcoup

rcoup commented Dec 12, 2019

I'm still seeing this issue on 8.0.0+b32245 (a16f551e) running from the released Docker images:

exc_msg = ['(psycopg2.OperationalError) server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request.']
cls = <class 'sqlalchemy.exc.OperationalError'>

(I have a Sentry trace if you need more information)
