Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-110481: Implement inter-thread queue for biased reference counting #114824

Merged
merged 11 commits into from
Feb 9, 2024

Conversation

colesbury
Copy link
Contributor

@colesbury colesbury commented Jan 31, 2024

Biased reference counting maintains two refcount fields in each object: ob_ref_local and ob_ref_shared. The true refcount is the sum of these two fields. In some cases, when refcounting operations are split across threads,
the ob_ref_shared field can be negative (although the total refcount must be at least zero). In this case, the thread that decremented the refcount requests that the owning thread give up ownership and merge the refcount fields.

@corona10
Copy link
Member

corona10 commented Feb 1, 2024

I will take a look by tomorrow :)

@ericsnowcurrently
Copy link
Member

FTR, biased refcounting was implemented in gh-110764.

Copy link
Member

@ericsnowcurrently ericsnowcurrently left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach generally makes sense. I'm pretty sure I was able to follow along. There were a few places where I'm concerned there may be some extra complexity with multiple interpreters but they may be non-issues. I've left a handful of other minor comments.

Include/internal/pycore_object_stack.h Outdated Show resolved Hide resolved
Python/brc.c Show resolved Hide resolved
Python/brc.c Show resolved Hide resolved
Python/brc.c Outdated Show resolved Hide resolved
Python/brc.c Show resolved Hide resolved
Include/internal/pycore_brc.h Show resolved Hide resolved
Include/internal/pycore_brc.h Show resolved Hide resolved
Include/internal/pycore_brc.h Outdated Show resolved Hide resolved
Include/internal/pycore_brc.h Outdated Show resolved Hide resolved
Python/brc.c Show resolved Hide resolved
@bedevere-app
Copy link

bedevere-app bot commented Feb 1, 2024

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be put in the comfy chair!

@ericsnowcurrently
Copy link
Member

FWIW, every once in a while I'm reminded about the extra complexity that no-gil brings to GC and refcounting. I still worry that the extra complexity will have a meaningful long-term impact on maintainability for CPython and community extension modules. Is there anything we can do to help contributors/maintainers understand how it all works under no-gil and especially how to track down leaks/crashes without breaking their brains? Is there any extra tooling that would help?

Copy link
Contributor Author

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there anything we can do to help contributors/maintainers understand how it all works under no-gil and especially how to track down leaks/crashes without breaking their brains? Is there any extra tooling that would help?

There are some updates to the devguide (for GC) under review. We'll want to do more along with user facing docs, but I think these are best addressed holistically once the integration is farther along.

On tooling, we already have a nogil refleaks buildbot. We'll have a thread sanitizer buildbot after we get to the point where we can actually disable the GIL.

Include/internal/pycore_brc.h Show resolved Hide resolved
Include/internal/pycore_object_stack.h Outdated Show resolved Hide resolved
Lib/test/test_code.py Show resolved Hide resolved
Python/brc.c Show resolved Hide resolved
Python/brc.c Show resolved Hide resolved
Python/brc.c Show resolved Hide resolved
Python/brc.c Outdated Show resolved Hide resolved
Python/gc_free_threading.c Show resolved Hide resolved
Python/gc_free_threading.c Show resolved Hide resolved
Python/brc.c Show resolved Hide resolved
@colesbury
Copy link
Contributor Author

The "Ubuntu (free-threading)" build crashes due to the problem described in #115035. I think that will need to be fixed before this.

@colesbury
Copy link
Contributor Author

Hi @ericsnowcurrently, I believe I've addressed the comments from your review. Would you please take another look?

Copy link
Member

@ericsnowcurrently ericsnowcurrently left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@colesbury
Copy link
Contributor Author

!buildbot nogil

@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @colesbury for commit 459638f 🤖

The command will test the builders whose names match following regular expression: nogil

The builders matched are:

  • x86-64 MacOS Intel NoGIL PR
  • AMD64 Ubuntu NoGIL Refleaks PR
  • ARM64 MacOS M1 Refleaks NoGIL PR
  • AMD64 Ubuntu NoGIL PR
  • AMD64 Windows Server 2022 NoGIL PR
  • x86-64 MacOS Intel ASAN NoGIL PR
  • ARM64 MacOS M1 NoGIL PR

Revert back to Py_DECREF() instead of the Py_SET_REFCNT(dict, 0) and fix
check that this thread owns the only reference to the dict. That ensures
that the final Py_DECREF call immediately frees the dict instead of
possibly enqueuing it to be merged.
@colesbury colesbury added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Feb 8, 2024
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @colesbury for commit 51c6625 🤖

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Feb 8, 2024
@colesbury colesbury added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Feb 8, 2024
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @colesbury for commit c631856 🤖

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Feb 8, 2024
@colesbury colesbury added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Feb 9, 2024
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @colesbury for commit 8311e18 🤖

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Feb 9, 2024
@colesbury colesbury added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Feb 9, 2024
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @colesbury for commit 97cf964 🤖

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Feb 9, 2024
@colesbury
Copy link
Contributor Author

There were some issues with refleak tracking. After some failed attempts at whack-a-mole, I addressed the refleak issues in a separate PR (#115188), and re-merged main back into this PR. It looks like the buildbots are finally passing 😅

@colesbury colesbury merged commit a3af3cb into python:main Feb 9, 2024
49 checks passed
@colesbury colesbury deleted the gh-110481-inter-thread-queue branch February 9, 2024 22:09
fsc-eriker pushed a commit to fsc-eriker/cpython that referenced this pull request Feb 14, 2024
…unting (python#114824)

Biased reference counting maintains two refcount fields in each object:
`ob_ref_local` and `ob_ref_shared`. The true refcount is the sum of these two
fields. In some cases, when refcounting operations are split across threads,
the ob_ref_shared field can be negative (although the total refcount must be
at least zero). In this case, the thread that decremented the refcount
requests that the owning thread give up ownership and merge the refcount
fields.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants