nvc++: integer conversion resulted in a change of sign #229

marioroy · 2024-01-06T04:03:46Z

I tried nvc++ (NVIDIA HPC SDK) and thought to pass this along. I ran git pull and have the latest from master.

$ nvc++ -o llil4hmap -std=c++20 -fopenmp -Wall -O3 llil4hmap.cc -I./parallel-hashmap
"./parallel-hashmap/parallel_hashmap/phmap.h", line 434: warning: integer conversion resulted in a change of sign [integer_sign_change]
          auto special = _mm_set1_epi8(static_cast<uint8_t>(kSentinel));
                                       ^
Remark: individual warnings can be suppressed with "--diag_suppress <warning-name>"

"./parallel-hashmap/parallel_hashmap/phmap.h", line 442: warning: integer conversion resulted in a change of sign [integer_sign_change]
          auto special = _mm_set1_epi8(static_cast<uint8_t>(kSentinel));
                                       ^

The llil4hmap.cc demonstration is found in my gist repo.

https://gist.github.com/marioroy/3924c48e140f8330f25f67cd98a815ef

The text was updated successfully, but these errors were encountered:

marioroy · 2024-01-09T14:55:31Z

I reached out to NVIDIA. The parallel-hashmap usage in llil4hmap is not a typical pattern, but something I tried out of curiosity at the time. I replied to try llil4map.cc instead for investigating why nvc++ is noticeably slower compared to clang++.

https://forums.developer.nvidia.com/t/nvfortran-23-9-problem/270618/13?u=marioeroy

greg7mdp · 2024-01-09T15:34:34Z

Thank you @marioroy , I'm very busy right now, but I'll look into this when I can I promise.

easypickings · 2024-01-19T16:04:30Z

Have met the same warnings with nvcc. Seems simply including the header will trigger the warning:

#include <phmap.h>

int main() {}

Compiled with nvcc -Iinclude/parallel_hashmap test.cu -o test. Warnings:

include/parallel_hashmap/phmap.h(434): warning #68-D: integer conversion resulted in a change of sign

include/parallel_hashmap/phmap.h(442): warning #68-D: integer conversion resulted in a change of sign

include/parallel_hashmap/phmap.h(434): warning #68-D: integer conversion resulted in a change of sign

include/parallel_hashmap/phmap.h(442): warning #68-D: integer conversion resulted in a change of sign

marioroy · 2024-01-25T12:43:58Z

I tried the suggestion, emitted by nvc++. Individual warnings can be suppressed with --diag_suppress <warning-name>.

nvc++ --diag_suppress=integer_sign_change ...

I also tried nvc++ to build a Python extension, Accelerating Python on GPUs with nvc++ and Cython.

nvc++ --diag_suppress=declared_but_not_referenced,identifier_not_keyword,set_but_not_used ...

greg7mdp · 2024-02-04T15:20:37Z

Warnings are fixed with latest version.

marioroy · 2024-03-30T17:53:17Z

Hi, @greg7mdp

The NVIDIA nvc++ compiler is interesting. I fixed the three map demonstrations llil4map.cc, llil4hmap.cc, and llil4emh.cc. No more slowness using nvc++.

Changed std::mutex to included spinlock_mutex. Noticeably better performance. However, this resolves nvc++ taking greater than 40 seconds to compile llil4hmap.cc and llil4emh.cc.
Fixed out_properties. This ran poorly using nvc++, taking > 4 seconds. The program built with clang++ outputs in 0.7 seconds. It turns out that allocating a basic string inside a parallel loop causes nvc++ OpenMP to perform worst than non-parallel.

#ifdef MAX_STR_LEN_L
      // std::basic_string<char> s { it->first.data(), MAX_STR_LEN_L };
      // str.append(s.c_str());
         str.append(it->first.data());
#else
         str.append(it->first.data(), it->first.size());
#endif

Results:

llil4map consumes the least memory. The llil4hmap demonstration is similar to llil4map, but computes the hash_value one time per key. This was done out of curiosity. The llil4emh demonstration uses emhash7::HashMap for comparison.

$ NUM_THREADS=60 ./llil4map in/big* in/big* in/big* | cksum
llil4map (fixed string length=12) start
use OpenMP
use boost sort        g++            clang++        nvc++
get properties         7.864 secs     7.866 secs     7.973 secs
map to vector          1.140 secs     0.881 secs     0.876 secs
vector stable sort     1.681 secs     1.098 secs     1.105 secs
write stdout           0.624 secs     0.595 secs     0.590 secs
total time            11.311 secs    10.441 secs    10.546 secs
    count lines     970195200
    count unique    200483043
2057246516 1811140689

$ NUM_THREADS=60 ./llil4hmap in/big* in/big* in/big* | cksum
llil4hmap (fixed string length=12) start
use OpenMP
use boost sort        g++            clang++        nvc++
get properties         7.245 secs     7.197 secs     7.617 secs
map to vector          1.179 secs     0.841 secs     0.809 secs
vector stable sort     1.682 secs     1.103 secs     1.113 secs
write stdout           0.647 secs     0.573 secs     0.571 secs
total time            10.754 secs     9.715 secs    10.112 secs
    count lines     970195200
    count unique    200483043
2057246516 1811140689

$ NUM_THREADS=60 ./llil4emh in/big* in/big* in/big* | cksum
llil4emh (fixed string length=12) start
use OpenMP
use boost sort        g++            clang++        nvc++
get properties         6.150 secs     6.250 secs     6.392 secs
map to vector          1.016 secs     0.877 secs     0.832 secs
vector stable sort     1.681 secs     1.127 secs     1.094 secs
write stdout           0.608 secs     0.561 secs     0.595 secs
total time             9.457 secs     8.817 secs     8.914 secs
    count lines     970195200
    count unique    200483043
2057246516 1811140689

Thank you for the tip to reclaim memory.

MyMap().swap(map); // swap map with an empty temporary, which is immediately destroyed

greg7mdp · 2024-03-30T18:07:45Z

Thanks, interesting data, emhash seems to be pretty good. Personally, I'm not really focusing on benchmarks, but more on having the hash map perform well in most cases.

marioroy · 2024-03-30T19:18:23Z

Previously, the program ran poorly using nvc++, particularly "write stdout". The spinlock_mutex sped up "get properties".

$ ./llil4hmap /data1/input/big* | cksum
llil4hmap (fixed string length=12) start
use OpenMP
use boost sort        Before          After
get properties         3.804 secs      2.797 secs
hmap to vector         0.697 secs      0.792 secs
vector stable sort     1.104 secs      1.148 secs
write stdout           5.071 secs      0.564 secs
total time            10.678 secs      5.302 secs
    count lines     323398400
    count unique    200483043
701308064 1804347429

greg7mdp · 2024-03-30T22:57:38Z

That's great, thanks for doing that. Maybe I should add the spinlock_mutex to phmap so people can easily try it out?

greg7mdp added a commit that referenced this issue Feb 4, 2024

Fix nvcc warnings - issue #229

67c2461

greg7mdp closed this as completed Feb 4, 2024

marioroy changed the title ~~integer conversion resulted in a change of sign~~ nvc++: integer conversion resulted in a change of sign Mar 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

nvc++: integer conversion resulted in a change of sign #229

nvc++: integer conversion resulted in a change of sign #229

marioroy commented Jan 6, 2024

marioroy commented Jan 9, 2024

greg7mdp commented Jan 9, 2024

easypickings commented Jan 19, 2024

marioroy commented Jan 25, 2024 •

edited

Loading

greg7mdp commented Feb 4, 2024

marioroy commented Mar 30, 2024 •

edited

Loading

greg7mdp commented Mar 30, 2024

marioroy commented Mar 30, 2024 •

edited

Loading

greg7mdp commented Mar 30, 2024

nvc++: integer conversion resulted in a change of sign #229

nvc++: integer conversion resulted in a change of sign #229

Comments

marioroy commented Jan 6, 2024

marioroy commented Jan 9, 2024

greg7mdp commented Jan 9, 2024

easypickings commented Jan 19, 2024

marioroy commented Jan 25, 2024 • edited Loading

greg7mdp commented Feb 4, 2024

marioroy commented Mar 30, 2024 • edited Loading

greg7mdp commented Mar 30, 2024

marioroy commented Mar 30, 2024 • edited Loading

greg7mdp commented Mar 30, 2024

marioroy commented Jan 25, 2024 •

edited

Loading

marioroy commented Mar 30, 2024 •

edited

Loading

marioroy commented Mar 30, 2024 •

edited

Loading