AWX left in an endless loop after upgrading #1869

Closed
seafoodbuffet opened this issue May 10, 2018 · 1 comment

@seafoodbuffet

ISSUE TYPE
  • Bug Report
COMPONENT NAME
  • Installer
SUMMARY

After upgrading to the latest AWX (1.0.6 Docker images), the system appears stuck in an endless loop: the UI spins continuously and says AWX is upgrading. We see these errors in the postgres container:

postgres_1   | ERROR:  current transaction is aborted, commands ignored until end of transaction block
postgres_1   | STATEMENT:  SELECT pg_advisory_unlock(1226251610)
postgres_1   | ERROR:  column main_unifiedjob.emitted_events does not exist at character 291
postgres_1   | STATEMENT:  SELECT "main_unifiedjob"."id", "main_unifiedjob"."polymorphic_ctype_id", "main_unifiedjob"."created", "main_unifiedjob"."modified", "main_unifiedjob"."description", "main_unifiedjob"."created_by_id", "main_unifiedjob"."modified_by_id", "main_unifiedjob"."name", "main_unifiedjob"."old_pk", "main_unifiedjob"."emitted_events", "main_unifiedjob"."unified_job_template_id", "main_unifiedjob"."launch_type", "main_unifiedjob"."schedule_id", "main_unifiedjob"."execution_node", "main_unifiedjob"."cancel_flag", "main_unifiedjob"."status", "main_unifiedjob"."failed", "main_unifiedjob"."started", "main_unifiedjob"."finished", "main_unifiedjob"."elapsed", "main_unifiedjob"."job_args", "main_unifiedjob"."job_cwd", "main_unifiedjob"."job_env", "main_unifiedjob"."job_explanation", "main_unifiedjob"."start_args", "main_unifiedjob"."result_traceback", "main_unifiedjob"."celery_task_id", "main_unifiedjob"."instance_group_id", "main_job"."unifiedjob_ptr_id", "main_job"."survey_passwords", "main_job"."diff_mode", "main_job"."job_type", "main_job"."inventory_id", "main_job"."project_id", "main_job"."playbook", "main_job"."forks", "main_job"."limit", "main_job"."verbosity", "main_job"."extra_vars", "main_job"."job_tags", "main_job"."force_handlers", "main_job"."skip_tags", "main_job"."start_at_task", "main_job"."become_enabled", "main_job"."allow_simultaneous", "main_job"."timeout", "main_job"."use_fact_cache", "main_job"."job_template_id", "main_job"."artifacts", "main_job"."scm_revision", "main_job"."project_update_id" FROM "main_job" INNER JOIN "main_unifiedjob" ON ("main_job"."unifiedjob_ptr_id" = "main_unifiedjob"."id") WHERE "main_unifiedjob"."status" IN ('waiting', 'running', 'pending') ORDER BY "main_job"."unifiedjob_ptr_id" ASC
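
When reading output like this, it can help to separate the first real failure from the follow-on noise: PostgreSQL reports "current transaction is aborted" for every command sent after an earlier error in the same transaction, so that message is usually a symptom rather than the cause. Below is a minimal Python sketch of that triage. The helper name `root_errors` and the regular expression are illustrative assumptions, not part of AWX or PostgreSQL; the two ERROR lines in the sample are copied from the log above, with the long STATEMENT line left out for brevity.

```python
import re

# Message PostgreSQL emits for every command sent after a failure
# inside a transaction that is still open.
CASCADE_MSG = "current transaction is aborted"

# Matches docker-compose prefixed lines such as "postgres_1   | ERROR:  ...".
LINE_RE = re.compile(r"^\S+\s*\|\s*(ERROR|STATEMENT):\s+(.*)$")


def root_errors(log_text):
    """Return ERROR messages that are not 'transaction is aborted' follow-ons."""
    roots = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        level, message = match.groups()
        if level == "ERROR" and not message.startswith(CASCADE_MSG):
            roots.append(message)
    return roots


# ERROR lines copied from the report; the long STATEMENT line is omitted.
sample = """\
postgres_1   | ERROR:  current transaction is aborted, commands ignored until end of transaction block
postgres_1   | STATEMENT:  SELECT pg_advisory_unlock(1226251610)
postgres_1   | ERROR:  column main_unifiedjob.emitted_events does not exist at character 291
"""

errors = root_errors(sample)
print(errors)
```

Applied to the lines above, the only error left after filtering is the missing `main_unifiedjob.emitted_events` column, which points at a database schema that does not match what the running code expects.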

and we see the following in the task container:

task_1       | 2018-05-10 22:50:40,560 DEBUG    awx.main.tasks Cluster node heartbeat task.
task_1       | 2018-05-10 22:50:40,572 DEBUG    awx.main.scheduler Running Tower task manager.
task_1       | 2018-05-10 22:50:40,579 DEBUG    awx.main.scheduler Starting Scheduler
task_1       | [2018-05-10 22:50:40,585: DEBUG/Worker-10] Start from server, version: 0.9, properties: {u'information': u'Licensed under the MPL.  See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'copyright': u'Copyright (C) 2007-2018 Pivotal Software, Inc.', u'capabilities': {u'exchange_exchange_bindings': True, u'connection.blocked': True, u'authentication_failure_close': True, u'direct_reply_to': True, u'basic.nack': True, u'per_consumer_qos': True, u'consumer_priorities': True, u'consumer_cancel_notify': True, u'publisher_confirms': True}, u'cluster_name': u'rabbit@54a0e93d7920', u'platform': u'Erlang/OTP 20.3.5', u'version': u'3.7.5'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US']
task_1       | [2018-05-10 22:50:40,587: DEBUG/Worker-10] Open OK!
task_1       | [2018-05-10 22:50:40,587: DEBUG/Worker-10] using channel_id: 1
task_1       | [2018-05-10 22:50:40,588: DEBUG/Worker-10] Channel open
task_1       | [2018-05-10 22:50:40,589: DEBUG/Worker-10] Closed channel #1
task_1       | [2018-05-10 22:50:40,589: INFO/MainProcess] Received task: awx.main.tasks.handle_ha_toplogy_changes[57ae82a6-dc30-4e2d-a66f-77b7946c9c3d]
task_1       | [2018-05-10 22:50:40,590: DEBUG/MainProcess] TaskPool: Apply <function _fast_trace_task at 0x1c5c320> (args:(u'awx.main.tasks.handle_ha_toplogy_changes', u'57ae82a6-dc30-4e2d-a66f-77b7946c9c3d', [], {}, {u'utc': True, u'is_eager': False, u'chord': None, u'group': None, u'args': [], u'retries': 0, u'delivery_info': {u'priority': None, u'redelivered': False, u'routing_key': u'', u'exchange': u'tower_broadcast_all'}, u'expires': None, u'hostname': 'celery@awx', u'task': u'awx.main.tasks.handle_ha_toplogy_changes', u'callbacks': None, u'correlation_id': u'57ae82a6-dc30-4e2d-a66f-77b7946c9c3d', u'errbacks': None, u'timelimit': [None, None], u'taskset': None, u'kwargs': {}, u'eta': None, u'reply_to': u'2a7d2cd9-9f76-3347-a3fc-dc2a2cb2828a', u'id': u'57ae82a6-dc30-4e2d-a66f-77b7946c9c3d', u'headers': {}}) kwargs:{})
task_1       | 2018-05-10 22:50:40,590 ERROR    awx.main.tasks Task awx.main.scheduler.tasks.run_task_manager encountered exception.
task_1       | Traceback (most recent call last):
task_1       |   File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
task_1       |     R = retval = fun(*args, **kwargs)
task_1       |   File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py", line 438, in __protected_call__
task_1       |     return self.run(*args, **kwargs)
task_1       |   File "/usr/lib/python2.7/site-packages/awx/main/scheduler/tasks.py", line 31, in run_task_manager
task_1       |     TaskManager().schedule()
task_1       |   File "/usr/lib/python2.7/site-packages/awx/main/scheduler/task_manager.py", line 641, in schedule
task_1       |     wfj.send_notification_templates('succeeded' if wfj.status == 'successful' else 'failed')
task_1       |   File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
task_1       |     self.gen.throw(type, value, traceback)
task_1       |   File "/usr/lib/python2.7/site-packages/awx/main/utils/pglock.py", line 14, in advisory_lock
task_1       |     yield internal_lock
task_1       |   File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
task_1       |     self.gen.throw(type, value, traceback)
task_1       |   File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django_pglocks/__init__.py", line 80, in advisory_lock
task_1       |     cursor.execute(command)
task_1       |   File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/backends/utils.py", line 64, in execute
task_1       |     return self.cursor.execute(sql, params)
task_1       |   File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/utils.py", line 94, in __exit__
task_1       |     six.reraise(dj_exc_type, dj_exc_value, traceback)
task_1       |   File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/backends/utils.py", line 62, in execute
task_1       |     return self.cursor.execute(sql)
task_1       | InternalError: current transaction is aborted, commands ignored until end of transaction block
task_1       | 
task_1       | [2018-05-10 22:50:40,592: DEBUG/MainProcess] Task accepted: awx.main.tasks.handle_ha_toplogy_changes[57ae82a6-dc30-4e2d-a66f-77b7946c9c3d] pid:234
task_1       | [2018-05-10 22:50:40,593: ERROR/MainProcess] Task awx.main.scheduler.tasks.run_task_manager[b1c5c54a-6ea9-4962-aa08-854f4b97c1d1] raised unexpected: InternalError('current transaction is aborted, commands ignored until end of transaction block\n',)
task_1       | Traceback (most recent call last):
task_1       |   File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
task_1       |     R = retval = fun(*args, **kwargs)
task_1       |   File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py", line 438, in __protected_call__
task_1       |     return self.run(*args, **kwargs)
task_1       |   File "/usr/lib/python2.7/site-packages/awx/main/scheduler/tasks.py", line 31, in run_task_manager
task_1       |     TaskManager().schedule()
task_1       |   File "/usr/lib/python2.7/site-packages/awx/main/scheduler/task_manager.py", line 641, in schedule
task_1       |     wfj.send_notification_templates('succeeded' if wfj.status == 'successful' else 'failed')
task_1       |   File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
task_1       |     self.gen.throw(type, value, traceback)
task_1       |   File "/usr/lib/python2.7/site-packages/awx/main/utils/pglock.py", line 14, in advisory_lock
task_1       |     yield internal_lock
task_1       |   File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
task_1       |     self.gen.throw(type, value, traceback)
task_1       |   File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django_pglocks/__init__.py", line 80, in advisory_lock
task_1       |     cursor.execute(command)
task_1       |   File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/backends/utils.py", line 64, in execute
task_1       |     return self.cursor.execute(sql, params)
task_1       |   File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/utils.py", line 94, in __exit__
task_1       |     six.reraise(dj_exc_type, dj_exc_value, traceback)
task_1       |   File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/django/db/backends/utils.py", line 62, in execute
task_1       |     return self.cursor.execute(sql)
task_1       | InternalError: current transaction is aborted, commands ignored until end of transaction block
ENVIRONMENT
  • AWX version: 1.0.6
  • AWX install method: docker for linux
  • Ansible version: 2.4.3
  • Operating System: RHEL 7.4
  • Web Browser: Firefox
STEPS TO REPRODUCE
  1. Stop existing working AWX using docker-compose stop
  2. Update containers using docker-compose pull
  3. Start everything using docker-compose up -d
  4. The UI spins on "Upgrading" and the errors above repeat roughly every minute
EXPECTED RESULTS

The upgrade should eventually finish and let us use AWX again.

ACTUAL RESULTS

Web UI spins forever on Upgrading.

We've tried stopping all containers and restarting them, to no avail.
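
A "column ... does not exist" error after an image update suggests the database schema may be behind the code, so one thing worth checking is whether any Django migrations are still unapplied. The sketch below is a hedged illustration, not a confirmed diagnosis for this issue: it assumes the listing can be obtained with something like `docker-compose exec task awx-manage showmigrations` (the `task` service name is inferred from the `task_1` log prefix, and the availability of that command in the container is an assumption). The helper `unapplied_migrations` and the sample text are hypothetical; only the `[X]` / `[ ]` listing format is standard Django.

```python
def unapplied_migrations(showmigrations_output):
    """Return (app, migration) pairs that Django lists as not yet applied."""
    pending = []
    app = None
    for raw in showmigrations_output.splitlines():
        line = raw.strip()
        # Skip blanks and notes such as "(no migrations)".
        if not line or line.startswith("("):
            continue
        if line.startswith("[ ]"):
            # Unchecked box: the migration has not been applied yet.
            pending.append((app, line[3:].strip()))
        elif line.startswith("["):
            # Checked box: already applied, nothing to report.
            continue
        else:
            # Any other non-empty line is an app label heading.
            app = line
    return pending


# Hypothetical sample in Django's list format; the migration names are
# made up for illustration and are not real AWX migrations.
sample_output = """\
main
 [X] 0001_initial
 [ ] 0002_hypothetical_change
sessions
 [X] 0001_initial
"""

pending = unapplied_migrations(sample_output)
print(pending)
```

An empty result would mean the listing shows every migration as applied, in which case the schema mismatch would have to come from somewhere else.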

@seafoodbuffet

On further inspection, I suspect that I'm running into #1817. I'm going to close this issue and try going back to 1.0.5.

tvo318 added a commit to oraNod/awx that referenced this issue Aug 23, 2023
oraNod pushed a commit to oraNod/awx that referenced this issue Aug 28, 2023
oraNod pushed a commit to oraNod/awx that referenced this issue Aug 29, 2023
oraNod pushed a commit to oraNod/awx that referenced this issue Aug 31, 2023
oraNod pushed a commit to oraNod/awx that referenced this issue Aug 31, 2023