Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Long running memory leak problem #14227

Merged
merged 3 commits into from
Dec 16, 2023

Conversation

kingljl
Copy link
Contributor

@kingljl kingljl commented Dec 6, 2023

Description

Problem: The memory will slowly increase with the drawing until restarting.
Observation: GC analysis shows that no occupation has occurred, so it is suspected to be a problem with the underlying allocator.
Reason: Under Linux, glibc is used to allocate memory. glibc uses brk and mmap to allocate memory, and the memory allocated by brk cannot be released until the high-address memory is released. That is to say, if you apply for two pieces of memory A and B through brk, it is impossible to release A before B is released, and it is still occupied by the process. Check the suspected "memory leak" through TOP.
So I replaced TCMalloc, but found that libtcmalloc_minimal could not find ptthread_Key_Create. After analysis, it was found that pthread was not entered during compilation.

As shown in the figure:
example

detail steps:
1、nm -D /lib64/libc.so.6 | grep -e pthread_key_create
2、nm -D /lib64/libpthread.so | grep -e pthread_key_create
3、ldd /usr/local/lib/libtcmalloc_minimal.so.4
4、 ldd /usr/local/lib/libtcmalloc.so.4
dependents:
1、gperftools-2.13 (gcc-c++: c++11)
2、libunwind-1.6.2

Screenshots/videos:

Checklist:

Problem: The memory will slowly increase with the drawing until restarting.
Observation: GC analysis shows that no occupation has occurred, so it is suspected to be a problem with the underlying allocator.
Reason: Under Linux, glibc is used to allocate memory. glibc uses brk and mmap to allocate memory, and the memory allocated by brk cannot be released until the high-address memory is released. That is to say, if you apply for two pieces of memory A and B through brk, it is impossible to release A before B is released, and it is still occupied by the process. Check the suspected "memory leak" through TOP.
So I replaced TCMalloc, but found that libtcmalloc_minimal could not find ptthread_Key_Create. After analysis, it was found that pthread was not entered during compilation.
@AUTOMATIC1111
Copy link
Owner

needs comments and review from other linux users

@kingljl
Copy link
Contributor Author

kingljl commented Dec 14, 2023

needs comments and review from other linux users

OK, thanks for the reply!!
Detailed description:
We encountered this problem when using it, and it is now very stable. There is tcmalloc logic in webui.sh, but that library does not compile the libpthread library, so I made a judgment and gave priority to libtcmalloc_minimal.so. If the libpthread library was not linked, I changed to libtcmalloc.so.

@AUTOMATIC1111
Copy link
Owner

I'll merge it in into dev and if it breaks things for people we'll have to revert or fix it.

@AUTOMATIC1111 AUTOMATIC1111 merged commit 4f5281a into AUTOMATIC1111:dev Dec 16, 2023
3 checks passed
@voyageur
Copy link

This break tcmalloc usage on any current Linux systems, as pthread was merged back in glibc since glibc 2.34:
https://developers.redhat.com/articles/2021/12/17/why-glibc-234-removed-libpthread#the_system_administrator_view
RHEL9/Centos9 also has "All threading APIs now merged into libc.so.6":
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/considerations_in_adopting_rhel_9/index#ref_changes-to-dotnet_assembly_compilers-and-development-tools

So at least on current glibc, this change should be skipped.
But I wonder also about the change in general, this adds logic in stable-diffusion-webui to catch potential errors from this specific system, where local gperftools was probably incorrectly built. IMHO not the job of programs depending on tcmalloc

@kingljl
Copy link
Contributor Author

kingljl commented Dec 18, 2023

This break tcmalloc usage on any current Linux systems, as pthread was merged back in glibc since glibc 2.34: https://developers.redhat.com/articles/2021/12/17/why-glibc-234-removed-libpthread#the_system_administrator_view RHEL9/Centos9 also has "All threading APIs now merged into libc.so.6": https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/considerations_in_adopting_rhel_9/index#ref_changes-to-dotnet_assembly_compilers-and-development-tools

So at least on current glibc, this change should be skipped. But I wonder also about the change in general, this adds logic in stable-diffusion-webui to catch potential errors from this specific system, where local gperftools was probably incorrectly built. IMHO not the job of programs depending on tcmalloc

I actually did two things without changing the underlying logic. It has the logic of tcmalloc in it. It only adds a layer to determine whether the current library contains the libpthread library. The original logic remains unchanged, adding a library and a judgment.
Specific modifications: webui.sh file

  1. ldconfig determines whether libtcmalloc_minimal and libtcmalloc exist respectively (libtcmalloc_minimal takes precedence)
  2. Based on step 1, use ldd to determine whether there is a libpthread library.

@voyageur
Copy link

I actually did two things without changing the underlying logic. It has the logic of tcmalloc in it. It only adds a layer to determine whether the current library contains the libpthread library. The original logic remains unchanged, adding a library and a judgment. Specific modifications: webui.sh file

  1. ldconfig determines whether libtcmalloc_minimal and libtcmalloc exist respectively (libtcmalloc_minimal takes precedence)
    Old code already did that, $(ldconfig -p | grep -Po "libtcmalloc(_minimal|).so.\d") returns first occurences of libtcmalloc_minimal,and then libctmalloc:
$ ldconfig -p | grep -Po "libtcmalloc(_minimal|)\.so\.\d"
libtcmalloc_minimal.so.4
libtcmalloc_minimal.so.4
libtcmalloc.so.4
libtcmalloc.so.4
  1. Based on step 1, use ldd to determine whether there is a libpthread library.
    Please read the links I put in previous comment, this is incorrect - on all systems with glibc >= 2.34, libraries should NOT link to libpthread. So the ldd check will always fail on current systems (CentOS 9, Ubuntu 22.04 LTS, and Fedora/Arch/...)

For example, with glibc-2.38 here:

$ nm -D /lib64/libc.so.6|grep pthread_key_create
000000000008e050 T __pthread_key_create@GLIBC_2.2.5
000000000008e050 T __pthread_key_create@@GLIBC_2.34
000000000008e050 T pthread_key_create@@GLIBC_2.34
000000000008e050 T pthread_key_create@GLIBC_2.2.5
$nm -D /usr/lib64/libpthread.so.0|grep -e pthread_key
[no entries, this is a placeholder leftover]

Also, I have not found any reports of issues with incorrectly linked libtcmalloc, this sounds like a local problem, or a bug in gperftools build system, which fix should rather be there than in user applications (https://github.com/gperftools/gperftools). Note it has some build logic to support pthread on several environments, so you should not force-add -lpthread and let the build system detect it (and most probably use "-pthread")

AUTOMATIC1111 added a commit that referenced this pull request Feb 11, 2024
fix prepare_tcmalloc (fix: #14227)(Fixed memory leak issue in Ubuntu 22.04 or modern linux environment)
@w-e-w w-e-w mentioned this pull request Feb 17, 2024
@pawel665j pawel665j mentioned this pull request Apr 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants