Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

{lib}[GCCcore/10.3.0] tbb v2021.3.0 #13404

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

robert-mijakovic
Copy link
Contributor

(created using eb --new-pr)

@robert-mijakovic
Copy link
Contributor Author

@boegelbot please test @ generoso

@boegelbot
Copy link
Collaborator

@robert-mijakovic: Request for testing this PR well received on generoso

PR test command 'EB_PR=13404 EB_ARGS= /apps/slurm/default/bin/sbatch --job-name test_PR_13404 --ntasks=4 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 17759

Test results coming soon (I hope)...

- notification for comment with ID 877159197 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Copy link
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
generoso-c4-s-2 - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/d03d9c4c339956c59c07ce1aef9df382 for a full test report.

@smoors
Copy link
Contributor

smoors commented Jul 12, 2021

@robert-mijakovic you changed from the custom tbb easyblock to the generic CMakeMake,
but the custom easyblock seems to do a lot more than your easyconfig.
did you check that it can automatically install on the different architectures, and that it correctly sets LIBRARY_PATH, CPATH and TBBROOT?

can you also add a sanity_check_commands entry?

@robert-mijakovic
Copy link
Contributor Author

@smoors Ok, I will have a look and try to turn it back to the custom tbb easyblock.

@robert-mijakovic
Copy link
Contributor Author

@smoors The reason for not using the custom EB is that tbb no longer uses ConfigureMake but rather CMakeMake

@robert-mijakovic
Copy link
Contributor Author

robert-mijakovic commented Jul 13, 2021

@smoors The paths are properly set, CMAKE_PREFIX_PATH, CPATH, LD_LIBRARY_PATH, LIBRARY_PATH, PKG_CONFIG_PATH and XDG_DATA_DIRS, EBROOTTBB, EBVERSIONTBB, and EBDEVELTBB.

I haven't tried to build it on more than on different architectures (only on the system we have in the house).

I can add sanity_check_commands.

@robert-mijakovic
Copy link
Contributor Author

@smoors Basically, there are no binaries inside of the installation only the source code of tests that could be built with:

~/oneTBB-2021.3.0/examples/getting_started/sub_string_finder $ mkdir build && cd build && cmake ../ && cmake --build . && ./sub_string_finder_simple

I'm trying to figure out how to do it properly. @Flamefire told me that it can be done with test_step.

@smoors
Copy link
Contributor

smoors commented Jul 13, 2021

@robert-mijakovic you still need to add TBBROOT (using modextrapaths), which is used by Intel Advisor, and possibly other tools as well.
for the sanity check command, I guess that indeed will not be trivial as there are no binaries generated, unless you know another way than compiling an example program?
please also add:

configopts = '-DTBB_EXAMPLES=ON'

runtest = 'test'

Comment on lines 25 to 27
'files': ['include/%(namelower)s/%(namelower)s.h'] +
['lib64/lib%%(namelower)s.%s' % SHLIB_EXT],
'dirs': ['lib64', 'include']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
'files': ['include/%(namelower)s/%(namelower)s.h'] +
['lib64/lib%%(namelower)s.%s' % SHLIB_EXT],
'dirs': ['lib64', 'include']
'files': ['include/%(name)s/%(name)s.h'] +
['lib/lib%%(name)s.%s' % SHLIB_EXT],
'dirs': ['lib', 'include']

name is enough and the default dir is lib and the lib64 is created by EB. Also the lib folder is more important as e.g. embree explicitely searches that and that is what we ensure in the easyblock

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved

@Flamefire
Copy link
Contributor

did you check that it can automatically install on the different architectures, and that it correctly sets LIBRARY_PATH, CPATH and TBBROOT?

Their CMake files are A LOT better. So no arch specific stuff anymore, all is good as-is and those paths are correctly set (except for TBBROOT. BTW: Are you sure about TBBROOT? I've (only) seen TBB_ROOT being used so far. But the latter doesn't matter much and can be omitted.

+1 for the examples and test run.

I wouldn't really add a sanity check command as this is a library. What would you run there? And compiling stuff is already done
So I'd keep it simple with configopts and runtest :)

@smoors
Copy link
Contributor

smoors commented Jul 14, 2021

Are you sure about TBBROOT?

yeah, I may have misread the documentation. TBBROOT seems to be only used when installing Intel Advisor, so it's not strictly necessary.

I wouldn't really add a sanity check command as this is a library. What would you run there? And compiling stuff is already done
So I'd keep it simple with configopts and runtest :)

I agree, that's what I meant to say.

@smoors
Copy link
Contributor

smoors commented Jul 14, 2021

@boegelbot: please test @ generoso

@boegelbot
Copy link
Collaborator

@smoors: Request for testing this PR well received on generoso

PR test command 'EB_PR=13404 EB_ARGS= /apps/slurm/default/bin/sbatch --job-name test_PR_13404 --ntasks=4 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 17775

Test results coming soon (I hope)...

- notification for comment with ID 879904499 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Copy link
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
generoso-c1-s-3 - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/a11f623f159980f9facf641d99e5f5c9 for a full test report.

@robert-mijakovic
Copy link
Contributor Author

== 2021-07-14 15:50:20,767 build_log.py:169 ERROR EasyBuild crashed with an error (at easybuild/easybuild-framework/easybuild/base/exceptions.py:124 in __init__): Lock /users/boegelbot/CentOS8/haswell/software/.locks/_users_boegelbot_CentOS8_haswell_software_tbb_2021.3.0-GCCcore-10.3.0.lock already exists, aborting! (at easybuild/easybuild-framework/easybuild/tools/filetools.py:1873 in check_lock)
== 2021-07-14 15:50:20,767 easyblock.py:3706 WARNING build failed (first 300 chars): Lock /users/boegelbot/CentOS8/haswell/software/.locks/_users_boegelbot_CentOS8_haswell_software_tbb_2021.3.0-GCCcore-10.3.0.lock already exists, aborting!

@boegelbot
Copy link
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
generoso-c1-s-2 - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/5f2d5d5153545181a36d0c45d8543d04 for a full test report.

@robert-mijakovic
Copy link
Contributor Author

 44/131 Test  #44: test_limiter_node ........................***Exception: SegFault  0.71 sec
TBB Warning: The number of workers is currently limited to 3. The request for 4 workers is ignored. Further requests for more workers will be silently ignored until the limit changes.

�[0;36m[doctest] �[0mdoctest version is "2.3.5"
�[0;36m[doctest] �[0mrun with "--help" for options
�[0m�[0m�[0m�[0;33m===============================================================================
�[0;37m/tmp/boegelbot/tbb/2021.3.0/GCCcore-10.3.0/oneTBB-2021.3.0/test/tbb/test_limiter_node.cpp:522:
�[0mTEST CASE:  �[0mMessage is released if successor does not accept

�[0;37m/tmp/boegelbot/tbb/2021.3.0/GCCcore-10.3.0/oneTBB-2021.3.0/test/tbb/test_limiter_node.cpp:522: �[0;31mFATAL ERROR: �[0;31mtest case CRASHED: �[0;36mSIGSEGV - Segmentation violation signal

�[0m�[0m�[0;33m===============================================================================
�[0;36m[doctest] �[0mtest cases:      4 | �[0m     3 passed�[0m | �[0;31m     1 failed�[0m | �[0;33m     2 skipped�[0m
�[0;36m[doctest] �[0massertions: 217181 | �[0m217181 passed�[0m | �[0m     0 failed�[0m |
�[0;36m[doctest] �[0mStatus: �[0;31mFAILURE!�[0m

@Flamefire
Copy link
Contributor

Test report by @Flamefire
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
taurusi6606.taurus.hrsk.tu-dresden.de - Linux RHEL 7.9, x86_64, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (haswell), Python 2.7.5
See https://gist.github.com/34a7a972a0fc635dc0be174a6af770e3 for a full test report.

@robert-mijakovic
Copy link
Contributor Author

The same issue is with generoso.

@Flamefire
Copy link
Contributor

Yeah and a kinda-valid failure. Opened a bug report at uxlfoundation/oneTBB#489

Try to add a patch which changes the mentioned codeline to #if 1

@sassy-crick
Copy link
Collaborator

Chiming in here, as I and @SethosII would need the latest version of TBB for usher:
I tried building it using @robert-mijakovic EC by adding
configopts += '-DCMAKE_C_FLAGS=-fno-ipa-cp-clone '
to the EC and the tests crashed as above. The same happens when I am trying to patch it with this patch file, created from this open issue:

diff --git a/oneTBB-2021.3.0.orig/test/tbb/test_limiter_node.cpp b/oneTBB-2021.3.0/test/tbb/test_limiter_node.cpp
index 4d39008..585f9fb 100644
--- a/oneTBB-2021.3.0.orig/test/tbb/test_limiter_node.cpp
+++ b/oneTBB-2021.3.0/test/tbb/test_limiter_node.cpp
@@ -419,19 +419,38 @@ void test_decrementer() {
         CHECK_MESSAGE( actual == expected2[m++], "" );
     CHECK_MESSAGE( ( sizeof(expected2) / sizeof(expected2[0]) == m), "Not all messages have been processed." );
     g.wait_for_all();
+
+    const size_t threshold3 = 10;
+    tbb::flow::limiter_node<int, long long> limit3(g, threshold3);
+    make_edge(limit3, queue);
+    long long decrement_value3 = 3;
+    CHECK_MESSAGE( limit3.decrementer().try_put( -decrement_value3 ),
+                   "Limiter node decrementer's port does not accept message" );
+
+    m = 0;
+    while( limit3.try_put( m ) ){ m++; };
+    CHECK_MESSAGE( m == threshold3 - decrement_value3, "Not all messages have been accepted." );
+
+    actual = -1; m = 0;
+    while( queue.try_get(actual) ){
+        CHECK_MESSAGE( actual == m++, "Not all messages have been processed." );
+    }
+
+    g.wait_for_all();
+    CHECK_MESSAGE( m == threshold3 - decrement_value3, "Not all messages have been processed." );
 }
 
 void test_try_put_without_successors() {
     tbb::flow::graph g;
-    std::size_t try_put_num{3};
+    int try_put_num{3};
     tbb::flow::buffer_node<int> bn(g);
     tbb::flow::limiter_node<int> ln(g, try_put_num);
     tbb::flow::make_edge(bn, ln);
-    std::size_t i = 1;
+    int i = 1;
     for (; i <= try_put_num; i++)
         bn.try_put(i);
 
-    std::atomic<std::size_t> counter{0};
+    std::atomic<int> counter{0};
     tbb::flow::function_node<int, int> fn(g, tbb::flow::unlimited,
         [&](int input) {
             counter += input;

Any idea where I am going wrong here?

@Flamefire
Copy link
Contributor

Flamefire commented Aug 10, 2021

@sassy-crick You need to set this on CMAKE_CXX_FLAGS not CMAKE_C_FLAGS or create a patch as written in #13404 (comment)

Where is that patch from? It is not from the mentioned issue and doesn't solve any issue as far as I can tell

@sassy-crick
Copy link
Collaborator

@Flamefire Just my luck again, I picked the wrong flag. Thanks for pointing that out to me.
As for the patch: I had the 'clever' idea to download that file mentioned in the issue I quote being under the impression that file is the working one. Then I done the usual git diff ... command.
I will try it with the CMAKE_CXX_FLAGS flag and report back.

@sassy-crick
Copy link
Collaborator

I can confirm that configopts += '-DCMAKE_CXX_FLAGS=-fno-ipa-cp-clone ' added to the EC file works. Thanks for your prompt help @Flamefire

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants