Bionic inventory CKAN requests hang #1068

adborden · 2019-10-25T16:35:54Z

After putting Bionic into production for Inventory, some requests hang indefinitely. This shows up as Uptrends alerts because the hung requests time out.

What we know so far...

The behavior is somewhat intermittent. Sometimes it works, sometimes not.
This did not seem to be an issue for staging, only production.
This looks similar to what @adborden was seeing on Catalog when testing Bionic.
strace does not show a clear culprit.
The hang seems to happen within CKAN's plugin initialization phase.
We've verified connections to the databases (inventory and datastore), Solr, and S3.

How to reproduce

sudo -u www-data /usr/lib/ckan/bin/paster --plugin=ckan serve /etc/ckan/production.ini
curl -v -L -k localhost:5000/api/action/status_show

Expected behavior

Request returns 200, JSON response of the server status

Actual behavior

Request hangs, then times out after ~5 minutes.
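
Because the hang is intermittent, a single curl can be misleading. Below is a minimal Python 3 sketch that repeats the request with a short client-side timeout and tallies the outcomes. The names `probe` and `probe_many`, the 10-second timeout, and the attempt count are assumptions for illustration and are not part of the original report.

```python
import socket
import urllib.error
import urllib.request
from collections import Counter


def probe(url, timeout=10.0, fetch=urllib.request.urlopen):
    """Classify one request as 'ok', 'hang', or 'error:<detail>'."""
    try:
        with fetch(url, timeout=timeout) as resp:
            return "ok" if resp.status == 200 else f"error:{resp.status}"
    except (socket.timeout, TimeoutError):
        # Timed out while waiting for the response.
        return "hang"
    except urllib.error.HTTPError as exc:
        # The server answered, but with a non-2xx status.
        return f"error:{exc.code}"
    except urllib.error.URLError as exc:
        # A timeout during connect is wrapped in URLError.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "hang"
        return f"error:{exc.reason}"


def probe_many(url, attempts=20, timeout=10.0, fetch=urllib.request.urlopen):
    """Repeat the probe and count each outcome."""
    return Counter(probe(url, timeout, fetch) for _ in range(attempts))
```

Calling `probe_many("http://localhost:5000/api/action/status_show")` against the paster server from the steps above returns a count per outcome, which gives a rough hang rate for comparing hosts.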


adborden · 2019-10-25T16:36:31Z

We could try disabling all the plugins, and re-enable them one by one.
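
A rough sketch of that search in Python 3: start with nothing enabled, add one plugin at a time, and stop at the first one after which the hang appears. The `hangs` callback is hypothetical. It stands in for whatever sets `ckan.plugins` to the given list, restarts CKAN, and checks whether requests hang. Since the hang is intermittent, that check would likely need several attempts per step.

```python
def first_hanging_plugin(plugins, hangs):
    """Re-enable plugins one by one; return the first that triggers a hang.

    plugins: plugin names in the order to re-enable them.
    hangs:   hypothetical callback taking the enabled list and returning
             True if requests hang with exactly those plugins enabled.
    Returns None if no step triggers a hang.
    """
    enabled = []
    for name in plugins:
        enabled.append(name)
        # Pass a copy so the callback cannot mutate our working list.
        if hangs(list(enabled)):
            return name
    return None
```

This only identifies the plugin whose addition first triggers the hang. If the cause is an interaction between two plugins, the earlier one will not be flagged.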

mogul · 2019-10-25T19:07:45Z

I think what we're seeing is happening at the end of the plugin-initialization process. And again: Why do we only see this behavior on these two particular hosts, when the same plugins work on the staging hosts?

adborden · 2019-12-06T00:57:58Z

I think I'm seeing this behavior on staging Trusty now.

adborden · 2020-02-13T23:23:44Z

@FuhuXia has spent some time on this and hasn't been able to reproduce it in any environment. We are going to go ahead with the deploy, checking along the way in case it comes back.

adborden · 2020-03-24T21:05:49Z

Seeing this issue on production again.

This reverts commit 7e2d211. We've ran into #1068 again. Rolling back.

FuhuXia · 2020-03-24T21:23:13Z

When one Bionic instance hangs, it also brings down the Trusty instances. A Trusty instance takes a very long time to load /dataset and eventually returns a 500 error. The log shows this is related to the Postgres QueuePool limit.
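
For context on the QueuePool symptom, here is a toy Python 3 model of a bounded connection pool. It is not SQLAlchemy's implementation, and `TinyPool` is a made-up name. It only illustrates the general mechanism: if requests stall while holding connections, later checkouts wait and then time out. It does not explain why the requests stall in the first place.

```python
import queue


class TinyPool:
    """Toy bounded pool: checkouts block until a connection is returned."""

    def __init__(self, size):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"conn-{i}")

    def checkout(self, timeout):
        """Take a connection, or raise TimeoutError if none frees up in time."""
        try:
            return self._free.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("pool limit reached, checkout timed out") from None

    def checkin(self, conn):
        """Return a connection so another request can use it."""
        self._free.put(conn)
```

With a pool of size 2, two stalled requests that never call `checkin` are enough to make every later `checkout` fail after its timeout.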

adborden · 2020-03-27T23:14:17Z

Closing as a duplicate of #1375. We're at least talking about these as if they're the same issue.

adborden added the bug (Software defect or bug) and component/inventory (Inventory playbooks/roles) labels Oct 25, 2019

adborden self-assigned this Oct 25, 2019

adborden assigned FuhuXia Nov 18, 2019

adborden added this to the Sprint 20191129 milestone Nov 22, 2019

adborden added the skill/ckan label Dec 2, 2019

adborden modified the milestones: Sprint 20191129, Sprint 20191213 Dec 2, 2019

mogul mentioned this issue Dec 30, 2019

Ubuntu Bionic rollout for inventory-web host(s) #717

Closed

6 tasks

mogul removed this from the Sprint 20191213 milestone Dec 30, 2019

mogul unassigned adborden Jan 17, 2020

mogul added this to the Sprint 20200220 milestone Feb 20, 2020

mogul closed this as completed Feb 20, 2020

adborden reopened this Mar 24, 2020

adborden added a commit that referenced this issue Mar 24, 2020

Revert "[inventory] enable v2, disable v1"

4e434bc

This reverts commit 7e2d211. We've ran into #1068 again. Rolling back.

adborden mentioned this issue Mar 24, 2020

[hotfix][inventory] disable bionic #1493

Merged

FuhuXia mentioned this issue Mar 25, 2020

Truncated or oversized response headers received from daemon process 'ckan' #1375

Closed

adborden assigned adborden and unassigned FuhuXia Mar 26, 2020

adborden mentioned this issue Mar 26, 2020

[hotfix][inventory] Fix Bionic Inventory #1500

Closed

adborden closed this as completed Mar 27, 2020
