Root cause Bandersnatch hitting Stale Cached JSON API Requests #4892

Closed
cooperlees opened this issue Oct 18, 2018 · 10 comments

@cooperlees
Contributor

cooperlees commented Oct 18, 2018

Describe the bug
Summary: Bandersnatch installs are hitting stale JSON API data.
Full Details: pypa/bandersnatch#56
Purge Hack: pypa/bandersnatch#57

Expected behavior
Bandersnatch should not have to PURGE the cache.

To Reproduce
Unknown here, but I’d suggest we add logging and collect more data about purge calls made with the Bandersnatch user-agent.

My Platform
Bandersnatch with requests (EDIT: now aiohttp)

@igotcha

igotcha commented Oct 23, 2018

We have the same problem here:

2018-10-23 16:30:58,066 INFO: bandersnatch/3.0.0.dev0 (cpython 3.6.6-final0, Linux x86_64)
2018-10-23 16:30:58,123 INFO: Syncing with https://pypi.python.org.
2018-10-23 16:30:58,124 INFO: Current mirror serial: 4344762
2018-10-23 16:30:58,124 INFO: Resuming interrupted sync from local todo list.
2018-10-23 16:30:58,188 INFO: Trying to reach serial: 4346516
2018-10-23 16:30:58,188 INFO: 7 packages to sync.
2018-10-23 16:30:58,190 INFO: Syncing package: basetuyaapi (serial 4346273)
2018-10-23 16:30:58,191 INFO: Syncing package: disparitySMD (serial 4345765)
2018-10-23 16:30:58,192 INFO: Syncing package: nixcms (serial 4346344)
2018-10-23 16:31:00,187 INFO: disparitySMD no longer exists on PyPI
2018-10-23 16:31:00,187 INFO: Syncing package: pacifica-python-downloader (serial 4346407)
2018-10-23 16:31:00,260 INFO: basetuyaapi no longer exists on PyPI
2018-10-23 16:31:00,260 INFO: Syncing package: portus (serial 4345746)
2018-10-23 16:31:01,468 INFO: nixcms no longer exists on PyPI
2018-10-23 16:31:01,468 INFO: Syncing package: python-todict (serial 4346125)
2018-10-23 16:31:01,686 INFO: portus no longer exists on PyPI
2018-10-23 16:31:01,687 INFO: Syncing package: toil-ionox0 (serial 4346027)
2018-10-23 16:31:02,318 INFO: pacifica-python-downloader no longer exists on PyPI
2018-10-23 16:31:02,941 INFO: python-todict no longer exists on PyPI
2018-10-23 16:31:04,242 ERROR: Stale serial for package toil-ionox0 - Attempt 1
2018-10-23 16:31:05,243 INFO: Syncing package: toil-ionox0 (serial 4346027)
2018-10-23 16:31:08,199 ERROR: Stale serial for package toil-ionox0 - Attempt 2
2018-10-23 16:31:10,201 INFO: Syncing package: toil-ionox0 (serial 4346027)
2018-10-23 16:31:12,318 ERROR: Stale serial for package toil-ionox0 - Attempt 3
2018-10-23 16:31:12,318 ERROR: Stale serial for toil-ionox0 (4346027) not updating. Giving up.
2018-10-23 16:31:12,319 INFO: Generating global index page.
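The retry pattern visible in the log above (three attempts, with gaps of roughly one and then two seconds) can be sketched as follows. This is a minimal illustration and not Bandersnatch's actual code: `fetch_fresh_serial`, `StaleSerial`, and the injected `fetch_serial` and `sleep` callables are hypothetical names, and the delay schedule is only inferred from the timestamps.

```python
from typing import Callable


class StaleSerial(Exception):
    """Raised when every attempt returned metadata older than expected."""


def fetch_fresh_serial(
    package: str,
    expected_serial: int,
    fetch_serial: Callable[[str], int],
    sleep: Callable[[float], None],
    max_attempts: int = 3,
) -> int:
    """Return the served serial once it is at least expected_serial.

    fetch_serial is a hypothetical stand-in for "request the package's
    JSON metadata and read the serial the server reports for it".
    """
    for attempt in range(1, max_attempts + 1):
        served = fetch_serial(package)
        if served >= expected_serial:
            return served
        print(f"Stale serial for package {package} - Attempt {attempt}")
        if attempt < max_attempts:
            # Assumed schedule: wait 1s, then 2s, as the log timestamps suggest.
            sleep(attempt)
    raise StaleSerial(
        f"Stale serial for {package} ({expected_serial}) not updating. Giving up."
    )
```

Passing `time.sleep` as `sleep` gives real delays; injecting the two callables keeps the sketch testable without network access.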

@di
Member

di commented Nov 7, 2018

Another report of a stale cache: #5017.

@di
Member

di commented Jan 22, 2019

Yet another: #5323.

@brainwane
Contributor

@cooperlees checking in since it's been a few months. You said on the Bandersnatch issue:

"Can we somehow get more logging about how often Bandersnatch is hitting this and then think about how from this we will try and debug why requests is hitting this"

You mean logging from the Bandersnatch side or from the PyPI side?

@cooperlees
Contributor Author

From warehouse logs. We should be able to store where requests from the "bandersnatch" User-Agent hitting this endpoint are coming from and see if that helps work out why it's happening.

Bandersnatch logs stay on individuals' computers, so we cannot analyze them centrally the way we can with warehouse logs.
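A rough sketch of the server-side tally suggested above, assuming request logs are available as dictionaries. The field names `user_agent`, `path`, and `source` are hypothetical and do not reflect warehouse's real log schema; the `bandersnatch/` prefix matches the user-agent string shown in the log earlier in this thread.

```python
from collections import Counter
from typing import Iterable, Mapping


def tally_bandersnatch_sources(records: Iterable[Mapping[str, str]]) -> Counter:
    """Count JSON API requests made by Bandersnatch, grouped by source.

    Each record is assumed to carry 'user_agent', 'path', and 'source'
    keys; these names are illustrative only.
    """
    counts: Counter = Counter()
    for record in records:
        is_bandersnatch = record.get("user_agent", "").startswith("bandersnatch/")
        path = record.get("path", "")
        # The JSON API lives under /pypi/<project>/json.
        is_json_api = path.startswith("/pypi/") and path.endswith("/json")
        if is_bandersnatch and is_json_api:
            counts[record.get("source", "unknown")] += 1
    return counts
```

Grouping by source (for example a CDN location or client network) is one way to see whether the stale responses cluster somewhere.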

@cooperlees
Contributor Author

Does anyone have time to debug this? Do we want to remove the hack in pypa/bandersnatch#57, where Bandersnatch tries to clear stale caches?

@R1j1t

R1j1t commented May 10, 2020

Hi, I also have the same issue. I recently uploaded my first package to PyPI, but it is not appearing in search. I tried searching by name, with no luck. Then I filtered by license and sorted by date last updated, but I cannot find the package there either. I am assuming this is not expected behavior. Is there something I am missing, or is something wrong with my upload?

Here is the link https://pypi.org/project/contextualSpellCheck/.

@cooperlees
Contributor Author

@R1j1t - I don't think your issue and this CDN stale cache issue are related. This issue is about Bandersnatch forcing the CDN to refetch from origin.

I need to work out if this is still happening. I might try to add more verbose logging to Bandersnatch around this and see if people notice.
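For readers unfamiliar with the purge hack referenced in pypa/bandersnatch#57, here is a minimal sketch of what forcing a CDN refetch might look like, assuming the CDN accepts an HTTP PURGE request on the cached URL. The helper name `build_purge_request` is hypothetical, and the request is only constructed here, never sent.

```python
import urllib.request


def build_purge_request(
    package: str, base_url: str = "https://pypi.org"
) -> urllib.request.Request:
    """Build (but do not send) a PURGE request for a package's JSON metadata.

    Assumption: the CDN in front of base_url honors the PURGE method and
    drops its cached copy, so the next GET is served from origin.
    """
    url = f"{base_url}/pypi/{package}/json"
    return urllib.request.Request(url, method="PURGE")
```

Sending it would be a matter of passing the request to `urllib.request.urlopen`, which is left out so the sketch has no side effects.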

@dstufft
Member

dstufft commented May 31, 2024

With #13936 merged now (along with a few follow-up PRs to fix some deadlocks), I think the primary cause of this has been fixed.

The tl;dr is that our mirroring relied on the serial being a monotonically increasing integer, but due to the way PostgreSQL works, concurrent transactions could end up with serials being "out of order". #13936 changes that so that transactions that generate new serial numbers are serialized behind what is effectively a mutex.

I'm going to close this now, but if anyone sees any new reports of this happening after today, we can re-open this issue.
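The out-of-order serial problem described above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not warehouse's code or schema: `replay`, `changes_since`, and the package names are hypothetical, and the two "writers" are replayed in a fixed order instead of running as real concurrent transactions.

```python
from typing import List, Tuple

Entry = Tuple[int, str]  # (serial, package name)


def changes_since(journal: List[Entry], last_serial: int) -> List[Entry]:
    """What a mirror asks for: committed entries above its last seen serial."""
    return sorted(entry for entry in journal if entry[0] > last_serial)


def replay(serialize_writers: bool) -> List[str]:
    """Replay two overlapping writers and two mirror syncs.

    Writer A takes serial 1 first but commits slowly; writer B takes
    serial 2. Returns the packages the mirror ended up seeing.
    """
    journal: List[Entry] = []   # committed rows visible to the mirror
    mirrored: List[str] = []
    last_serial = 0

    def sync() -> None:
        nonlocal last_serial
        for serial, package in changes_since(journal, last_serial):
            mirrored.append(package)
            last_serial = max(last_serial, serial)

    writer_a = (1, "pkg-a")
    writer_b = (2, "pkg-b")

    if serialize_writers:
        # With a mutex around serial generation, B cannot take a serial
        # until A has committed, so commits land in serial order.
        journal.append(writer_a)
        journal.append(writer_b)
        sync()
    else:
        # Without it, B commits first; the mirror advances to serial 2.
        journal.append(writer_b)
        sync()
        # A commits late with the lower serial 1.
        journal.append(writer_a)
    sync()  # the mirror only asks for serials above what it already has
    return mirrored
```

In this model `replay(False)` never mirrors `pkg-a`, because its serial is already below the mirror's high-water mark by the time it becomes visible, while `replay(True)` mirrors both packages.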

dstufft closed this as completed May 31, 2024
@cooperlees
Contributor Author

Wow. Thanks, Donald. Will holler if anyone sees anything.

6 participants