bpo-1635741: Fix refleaks of encodings module by removing the encodings._aliases #21896

Closed

shihai1991
Member

@shihai1991 shihai1991 commented Aug 16, 2020

Fix refleaks of encodings._aliases by using encodings.aliases directly in encodings.search_function.

Co-authored-by: Victor Stinner <vstinner@python.org>

https://bugs.python.org/issue1635741

@shihai1991 shihai1991 changed the title from "WIP: Move encodings._aliases to search_function" to "bpo-1635741: Move encodings._aliases to search_function" on Aug 16, 2020
@shihai1991
Member Author

I used the test case from https://bugs.python.org/issue1635741#msg355187 to check for refleaks in debug mode.

Before this PR:
sys.gettotalrefcount: 14288
sys.gettotalrefcount: 18042
sys.gettotalrefcount: 21796
sys.gettotalrefcount: 25550
sys.gettotalrefcount: 29304
sys.gettotalrefcount: 33058
sys.gettotalrefcount: 36812
sys.gettotalrefcount: 40566
sys.gettotalrefcount: 44320
sys.gettotalrefcount: 48074

After this PR:
sys.gettotalrefcount: 13641
sys.gettotalrefcount: 16748
sys.gettotalrefcount: 19855
sys.gettotalrefcount: 22962
sys.gettotalrefcount: 26069
sys.gettotalrefcount: 29176
sys.gettotalrefcount: 32283
sys.gettotalrefcount: 35390
sys.gettotalrefcount: 38497
sys.gettotalrefcount: 41604
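
The two series above are easier to compare as growth per iteration than as absolute totals. The following is a minimal sketch that only differences the numbers reported in this comment; the helper name `per_iteration_growth` is made up for illustration and is not part of the test case.

```python
# Totals copied from the comment above (debug build, sys.gettotalrefcount).
before = [14288, 18042, 21796, 25550, 29304, 33058, 36812, 40566, 44320, 48074]
after = [13641, 16748, 19855, 22962, 26069, 29176, 32283, 35390, 38497, 41604]


def per_iteration_growth(totals):
    """Return the set of differences between consecutive totals."""
    return {later - earlier for earlier, later in zip(totals, totals[1:])}


print(per_iteration_growth(before))  # {3754}
print(per_iteration_growth(after))   # {3107}
```

Read this way, the PR lowers the growth from 3754 to 3107 references per iteration, but the total still grows on every iteration.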

@shihai1991
Member Author

@vstinner Hi Victor, please take a look when you have free time. Thanks.

    @@ -69,6 +69,7 @@ def normalize_encoding(encoding):

     def search_function(encoding):

    +    _aliases = aliases.aliases

@ghost ghost Aug 16, 2020

I don't understand the problem.
This statement should be placed further down so that it does not affect the performance of the cache.

Member Author

@shihai1991 shihai1991 Aug 16, 2020

> I don't understand the problem.

Thanks for your comment. This change affects the encodings module's refcount at the C level and reduces the refleaks.

> This statement should be placed further down so that it does not affect the performance of the cache.

Maybe removing this line and using aliases.aliases in place of _aliases would be fine too :)

Member Author

See this comment: #21896 (comment)

@ghost ghost Aug 16, 2020

> See this comment: #21896 (comment)

Using aliases.aliases this way is perfectly normal; maybe the root of the problem lies elsewhere.
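
For reference, the two spellings discussed in this thread can be compared side by side. This is a minimal sketch, not the code from the PR: `resolve_with_global` and `resolve_direct` are hypothetical helpers, and the local `_aliases` name only mirrors the module-level binding under discussion.

```python
from encodings import aliases

# Variant A: bind the alias table to a module-level name once
# (this mirrors the `_aliases` global the PR wants to remove).
_aliases = aliases.aliases


def resolve_with_global(name):
    """Hypothetical helper: look up an alias through the module-level name."""
    return _aliases.get(name)


def resolve_direct(name):
    """Hypothetical helper: look up an alias through the attribute on each call."""
    return aliases.aliases.get(name)


# Both spellings reach the same dict object, so lookups agree.
same_object = _aliases is aliases.aliases
print(same_object)
print(resolve_with_global("utf8"), resolve_direct("utf8"))
```

Both variants read the same dictionary; they differ only in whether the module namespace holds an extra reference to it and in one attribute lookup per call.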

@shihai1991 shihai1991 changed the title from "bpo-1635741: Move encodings._aliases to search_function" to "bpo-1635741: Fix refleaks of encodings module by removing the encodings._aliases" on Aug 16, 2020
Member

@vstinner vstinner left a comment

I don't see how using encodings._aliases in search_function() creates a "reference leak". A leak is when calling a function multiple times leaks memory. Here, there is no leak.

Maybe you're talking about a "reference cycle".

I guess that you're trying to clear variables at exit.

You should try to trigger an explicit GC collection after calling PyInterpreterState_Clear(). In finalize_interp_clear(), try to replace:

    /* Trigger a GC collection on subinterpreters*/
    if (!is_main_interp) {
        _PyGC_CollectNoFail();
    }

with:

    // Last explicit GC collection
    _PyGC_CollectNoFail();

(without the change from this PR)

Does it fix your issue?

PyInterpreterState_Clear() clears the reference to the search function: Py_CLEAR(interp->codec_search_path).
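
As a hedged illustration of why dropping that reference may not free anything on its own: a Python function and the module namespace it was defined in refer to each other. The sketch below only checks this for `encodings.search_function`; whether this is the cycle that matters for the numbers in this thread is an assumption, not something the thread establishes.

```python
import encodings

func = encodings.search_function
module_globals = vars(encodings)

# The function points at the namespace it was defined in ...
function_references_globals = func.__globals__ is module_globals
# ... and that namespace points back at the function.
globals_reference_function = module_globals["search_function"] is func

print(function_references_globals, globals_reference_function)
```

When both checks hold, releasing one outside reference to the function cannot bring its refcount to zero, which is the kind of situation the cyclic garbage collector exists for.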

@shihai1991
Member Author

shihai1991 commented Aug 17, 2020

> Maybe you're talking about a "reference cycle".

Thanks, Victor. "Reference cycle" is the more accurate term. I will try your idea in my interpreter.
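
The distinction can be seen in pure Python. The sketch below is a generic illustration, unrelated to the encodings module: the `Node` class and `make_cycle` helper are invented for the example. Two objects that reference each other stay alive after the last outside reference is gone, and an explicit collection reclaims them.

```python
import gc
import weakref


class Node:
    """Illustrative object that can point at another Node."""


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a    # a -> b -> a
    return weakref.ref(a)      # observe `a` without keeping it alive


was_enabled = gc.isenabled()
gc.disable()                   # keep the automatic collector out of the picture
try:
    probe = make_cycle()
    alive_before = probe() is not None   # the cycle keeps both objects alive
    gc.collect()                         # an explicit collection reclaims the cycle
    alive_after = probe() is not None
finally:
    if was_enabled:
        gc.enable()

print(alive_before, alive_after)  # True False
```

Nothing is leaked in the strict sense: the memory is recoverable, but only once a collection actually runs.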

@shihai1991
Member Author

    /* Trigger a GC collection on subinterpreters*/
    if (!is_main_interp) {
        _PyGC_CollectNoFail();
    }

with:

    // Last explicit GC collection
    _PyGC_CollectNoFail();

Oh, amazing result: the total refcount now grows by only 3 per iteration.

sys.gettotalrefcount: 10537
sys.gettotalrefcount: 10540
sys.gettotalrefcount: 10543
sys.gettotalrefcount: 10546
sys.gettotalrefcount: 10549
sys.gettotalrefcount: 10552
sys.gettotalrefcount: 10555
sys.gettotalrefcount: 10558
sys.gettotalrefcount: 10561
sys.gettotalrefcount: 10564

The PR is here: #21902

@shihai1991
Member Author

shihai1991 commented Aug 17, 2020

Pablo created this PR (which does not call an explicit collection in the main interpreter): #17457
I am not sure whether I have missed some information.

@vstinner
Member

Since #17457 is merged, is this PR still relevant? If not, please close it.

@shihai1991
Member Author

> Since #17457 is merged, is this PR still relevant? If not, please close it.

#17457 works, thanks.
