Errors under valgrind when openmp is enabled #8238

Closed
kropacf opened this issue Sep 9, 2022 · 2 comments

kropacf commented Sep 9, 2022

xgboost: 1.6.1
model: 1.1.1
ubuntu 22.04

I get an error when I run predictions under Valgrind. According to the Valgrind log, the error originates somewhere in OpenMP. When I build XGBoost without OpenMP (-DUSE_OPENMP=0), the error is gone. I know there are known false positives when running OpenMP under Valgrind (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36298), but I want to be sure these errors are caused by OpenMP and not by XGBoost.
xgboost_valgrind_example.zip

Without openMP:

==7== Memcheck, a memory error detector
==7== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==7== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==7== Command: ./xgboost_valgrind
==7== 
[08:25:42] WARNING: /xgboost/src/learner.cc:749: Found JSON model saved before XGBoost 1.6, please save the model using current version again. The support for old JSON model will be discontinued in XGBoost 2.3.
==7== 
==7== HEAP SUMMARY:
==7==     in use at exit: 0 bytes in 0 blocks
==7==   total heap usage: 262,933 allocs, 262,933 frees, 45,755,755 bytes allocated
==7== 
==7== All heap blocks were freed -- no leaks are possible
==7== 
==7== For lists of detected and suppressed errors, rerun with: -s
==7== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

With openMP:

==7== Memcheck, a memory error detector
==7== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==7== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==7== Command: ./xgboost_valgrind
==7== 
[08:23:52] WARNING: /xgboost/src/learner.cc:749: Found JSON model saved before XGBoost 1.6, please save the model using current version again. The support for old JSON model will be discontinued in XGBoost 2.3.
==7== 
==7== HEAP SUMMARY:
==7==     in use at exit: 7,856 bytes in 15 blocks
==7==   total heap usage: 262,957 allocs, 262,942 frees, 45,825,843 bytes allocated
==7== 
==7== 3,520 bytes in 11 blocks are possibly lost in loss record 4 of 5
==7==    at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==7==    by 0x40147D9: calloc (rtld-malloc.h:44)
==7==    by 0x40147D9: allocate_dtv (dl-tls.c:375)
==7==    by 0x40147D9: _dl_allocate_tls (dl-tls.c:634)
==7==    by 0x514A834: allocate_stack (allocatestack.c:430)
==7==    by 0x514A834: pthread_create@@GLIBC_2.34 (pthread_create.c:647)
==7==    by 0x52FD1EF: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==7==    by 0x52F3A10: GOMP_parallel (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==7==    by 0x4AE2B70: xgboost::gbm::GBTreeModel::LoadModel(xgboost::Json const&) (in /usr/local/lib/libxgboost.so)
==7==    by 0x4AB9DB1: xgboost::gbm::GBTree::LoadModel(xgboost::Json const&) (in /usr/local/lib/libxgboost.so)
==7==    by 0x4B02C57: xgboost::LearnerIO::LoadModel(xgboost::Json const&) (in /usr/local/lib/libxgboost.so)
==7==    by 0x4B0BB2F: xgboost::LearnerIO::LoadModel(dmlc::Stream*) (in /usr/local/lib/libxgboost.so)
==7==    by 0x10B581: main (main.cpp:13)
==7== 
==7== LEAK SUMMARY:
==7==    definitely lost: 0 bytes in 0 blocks
==7==    indirectly lost: 0 bytes in 0 blocks
==7==      possibly lost: 3,520 bytes in 11 blocks
==7==    still reachable: 4,336 bytes in 4 blocks
==7==         suppressed: 0 bytes in 0 blocks
==7== Reachable blocks (those to which a pointer was found) are not shown.
==7== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==7== 
==7== For lists of detected and suppressed errors, rerun with: -s
==7== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
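
One way to check whether the report depends on XGBoost at all is to run an OpenMP-only workload under Valgrind. The following is a minimal sketch, not part of the attached example: the helper name `parallel_sum` is hypothetical, and it assumes GCC with libgomp, where a `#pragma omp parallel` region is entered through `GOMP_parallel` (the frame shown in the log above).

```cpp
// Minimal OpenMP workload with no XGBoost dependency.
// `parallel_sum` is a hypothetical helper, not taken from the issue's example.
// Assumed build: g++ -g -fopenmp, with the function called from main(),
// then run the binary under: valgrind --leak-check=full
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sums the integers 0..n-1 inside an OpenMP parallel region.
// Without -fopenmp the pragma is ignored and the loop runs serially.
std::int64_t parallel_sum(std::int64_t n) {
  assert(n >= 0);
  std::vector<std::int64_t> values(static_cast<std::size_t>(n));
  for (std::int64_t i = 0; i < n; ++i) {
    values[static_cast<std::size_t>(i)] = i;
  }

  std::int64_t total = 0;
#pragma omp parallel for reduction(+ : total)
  for (std::int64_t i = 0; i < n; ++i) {
    total += values[static_cast<std::size_t>(i)];
  }
  return total;
}
```

If a program that only calls this function produces the same `pthread_create` / `_dl_allocate_tls` "possibly lost" record, that would point at the OpenMP runtime's worker threads rather than at XGBoost. If it does not, the XGBoost side deserves a closer look.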

I made a small example in Docker:

docker build . -t valgrind_example
docker run -it --rm valgrind_example
@trivialfis
Member

We run AddressSanitizer with LeakSanitizer on CI. I think it's a false positive from Valgrind.
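
If the record is accepted as a false positive, a Valgrind suppression can hide it in later runs. The entry below is a sketch derived only from the frames in the log above; the suppression name and the file name `libgomp.supp` are made up for illustration, and the patterns may need adjusting for other glibc or libgomp versions.

```
{
   libgomp_worker_thread_tls_possibly_lost
   Memcheck:Leak
   match-leak-kinds: possible
   fun:calloc
   ...
   fun:pthread_create*
   obj:*/libgomp.so*
   fun:GOMP_parallel
}
```

It would be passed to Valgrind as `valgrind --suppressions=libgomp.supp --leak-check=full ./xgboost_valgrind`. Because the entry stops at `GOMP_parallel`, it also hides the same record for any other caller of an OpenMP parallel region, not only `GBTreeModel::LoadModel`.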

@trivialfis
Member

Feel free to reopen if there's any sign of a real memory leak.
