Add NVTX C++ wrappers #2

jrhemstad · 2020-06-04T20:44:06Z

This PR adds C++ convenience wrappers for the NVTX C API.

Leverages C++ objects, RAII, and "initialize on first use" C++ paradigms to make it easier to add NVTX ranges to C++ code. See README.md and Doxygen for more information.

nvtx3.hpp is the single header the provides all of the C++ wrappers.

Figure out directory structure
Figure out integration of C++ doxygen into existing Doxygen of C APIs
Tests
Benchmarks

cpp/nvtx3.hpp

jrhemstad · 2020-06-04T20:52:49Z

cpp/nvtx3.hpp

+
+  color() = delete;
+  ~color() = default;
+  color(color const&) = default;


@harrism reports that clang complains here because the copy ctor is implicitly deleted since the members are const. Should explicitly mark this ctor as deleted or remove the const from the members.

…-cpp-bindings

… for dlopen/dlclose.

jcohen-nvidia

Perfect, thanks

jrhemstad

We should wrap as much of the C++ functionality as makes sense with #ifndef NVTX_DISABLE to eliminate any unnecessary extra work being done at the C++ layer.

harrism · 2020-06-18T06:48:08Z

cpp/benchmarks/CMakeLists.txt

+###################################################################################################
+# - compiler options ------------------------------------------------------------------------------
+set(CMAKE_CXX_STANDARD 14)
+set(CMAKE_C_COMPILER $ENV{tCC})


Is tCC intentional?

jcohen-nvidia · 2020-06-25T02:47:12Z

@jrhemstad I haven't forgotten about you! I still need to finish some tests on symbols with inline namespaces, and compare some of my perf-affecting changes to your benchmarks.

I got myself admin permissions finally, and I've created the new branches. We are going to avoid using "master" as a branch name since LLVM and a number of other projects are starting to switch away from that name. I made "dev" and "release-v3". All development work should be direct merges to or pull requests on "dev". Then when we get to states that are in good shape, we can fast-forward merge all the changes in "dev" to "release-v3" and make a new tag on "release-v3". So far I've just made a "v3.0.1" tag on that branch. I set the default branch of the repo to be "release-v3" because that's what people will see when first arriving at the repo page, and that's the thing we want them to get if they just hit Clone or Download. So now we'll have to remember when creating PRs to not go against the default but to switch it to "dev". I read various opinions and found a bunch of agreement that this tends to work well in practice, better than having "dev" be default.

So, can you switch this PR to the "dev" branch? I think putting the wrappers in cpp in the root is fine -- I moved the include directory in the root to be c/include. You will have to update the include paths for those headers in your tests. Also I made some C and C++ sample injections -- I think you should switch your tests to these instead of using CUPTI. Removing the deps on CUDA will make it easier to build.

Also, I am going to add you to have write access to the repo.

cpp/nvtx3.hpp

jcohen-nvidia · 2020-07-13T17:08:32Z

cpp/nvtx3.hpp

+ *
+ * @param r Handle to a range started by a prior call to `start_range`.
+ */
+void end_range(range_handle r)


Make inline since function is defined in header file

So this one does indeed need to be made inline. Unless we want to require a domain as part of the end_range. The NVTX C docs say this isn't necessary, but we can have stricter requirements on the C++ API if we ever want to change the underlying C behavior.

jcohen-nvidia · 2020-07-13T17:23:23Z

Since the MSVC compiler doesn't handle the ISO646 versions of logical operators (and/or/not instead of &&/||/!) without including an additional header <iso646.h>, I would prefer sticking with the &&/||/! spellings of those. I imagine there could be Windows code that breaks if they include that header and were using and/or/not for symbols, and this is an easy enough fix to avoid that breakage.

…ting the move/dtor operations.

germasch · 2020-08-14T00:57:15Z

cpp/CMakeLists.txt

+set(CMAKE_C_COMPILER $ENV{CC})
+set(CMAKE_CXX_COMPILER $ENV{CXX})


I've seen this used elsewhere (in rapids), but unless there's a specific reason for it, I'd recommend not do this. The default behavior of cmake is to use the environment variables CC and CXX as default for CMAKE_{C,CXX}_COMPILER, anyway. So setting them this is way is not necessary, and in fact I believe it makes cmake -DCMAKE_{C,CXX}_COMPILER=/path/whatever not working as intended.

That's probably because this file is straight up copy and pasted from RMM or cuDF :)

I'm far from a CMake expert, so I would be super appreciative of how to improve this cmake file.

germasch

cmake isn't my main thing either, but anyway, here are some more obvious cmake changes. More could be done in terms of using targets rather than setting global include / link directories, but that'd require some actual careful work and testing.

germasch · 2020-08-14T01:34:10Z

cpp/CMakeLists.txt

+
+    option(CMAKE_CXX11_ABI "Enable the GLIBCXX11 ABI" ON)
+    if(CMAKE_CXX11_ABI)
+        message(STATUS "NVTX3: Enabling the GLIBCXX11 ABI")
+    else()
+        message(STATUS "NVTX3: Disabling the GLIBCXX11 ABI")
+        set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -D_GLIBCXX_USE_CXX11_ABI=0")
+        set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -D_GLIBCXX_USE_CXX11_ABI=0")
+        set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -Xcompiler -D_GLIBCXX_USE_CXX11_ABI=0")
+    endif(CMAKE_CXX11_ABI)


It seems unlikely to me that you'd want to do mess with the GLIBCXX11 ABI here, ie, this probably could all be removed.

Suggested change

option(CMAKE_CXX11_ABI "Enable the GLIBCXX11 ABI" ON)

if(CMAKE_CXX11_ABI)

message(STATUS "NVTX3: Enabling the GLIBCXX11 ABI")

else()

message(STATUS "NVTX3: Disabling the GLIBCXX11 ABI")

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -D_GLIBCXX_USE_CXX11_ABI=0")

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -D_GLIBCXX_USE_CXX11_ABI=0")

set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -Xcompiler -D_GLIBCXX_USE_CXX11_ABI=0")

endif(CMAKE_CXX11_ABI)

After I made this change I get a segfault when I run the benchmarks. I suspect this has something to do with the way gbench is configured, but I don't know enough to triage the exact cause.

@kkraus14 helped me figure out that I just needed to delete all the C11 ABi stuff in the other cmake modules as well. Once I did that it worked: 48d9679

Well, removing it all is obviously what should be done, and goes to show that one (that is, me) shouldn't just make suggestions based on what I see in the PR, but rather look at the whole thing.

I guess I'm surprised that you got segfaults rather than link errors -- when I dealt with libstdc++ ABI changes before, they had put the new ABI into a std::__1:: namespace, so one would at least notice the incompatible ABIs, rather than have things silently go bad.

cpp/CMakeLists.txt

cpp/benchmarks/CMakeLists.txt

jcohen-nvidia · 2020-08-14T19:57:07Z

@jrhemstad Quick question -- did you update this to pick up the fact that the C API stuff moved from "include" to "c/include"? I noticed when I last checked out the PR that it still thought the C API's include directory was at the repo root. If that's good to go, I'll merge the PR and then do normal pushes to make the versioning changes, so we don't have to do it all through review comments and such

jrhemstad · 2020-08-14T19:59:32Z

@jrhemstad Quick question -- did you update this to pick up the fact that the C API stuff moved from "include" to "c/include"? I noticed when I last checked out the PR that it still thought the C API's include directory was at the repo root. If that's good to go, I'll merge the PR and then do normal pushes to make the versioning changes, so we don't have to do it all through review comments and such

This is the include in the nvtx3.hpp header: #include <nvtx3/nvToolsExt.h>

Did you want it to be something other than that?

jcohen-nvidia · 2020-08-14T20:12:01Z

@jrhemstad No no, that #include <nvtx3/nvToolsExt.h> is perfect -- I was just thinking about your cmake scripts for building tests and things. If they are in the /cpp directory and setting include search paths like ../include, they'll need to change to ../c/include to handle the current state of the dev branch.

jrhemstad · 2020-08-14T21:07:23Z

If they are in the /cpp directory and setting include search paths like ../include, they'll need to change to ../c/include to handle the current state of the dev branch.

Ah, the cmake config doesn't have anything special to look first for the NVTX headers from the c/include in this repo. For most people building tests/benchmarks, it will just pick up the NVTX headers from the users CUDA toolkit.

Co-authored-by: Kai Germaschewski <kai.germaschewski@gmail.com>

This reverts commit 52e43a3.

kkraus14 · 2020-08-15T02:46:33Z

cpp/cmake/Modules/ConfigureGoogleBenchmark.cmake

@@ -0,0 +1,49 @@
+set(GBENCH_ROOT "${CMAKE_BINARY_DIR}/googlebenchmark")


I know you just copied this file from cudf / rmm, but this file is kinda janky because we wanted to force GBench to be built at configure time instead of at built time, and in hindsight that was probably a mistake. We should probably use an ExternalProject_Add command instead and then use the benchmark target that is produced from that command for the rest of the cmake.

After digging some more, turns out this actually isn't as bad practice as previously thought. Grabbing / building a package during configuration is the only way to be able to use cmake exports from that package, which in this case we'd like to take advantage of.

kkraus14 · 2020-08-15T02:47:20Z

cpp/cmake/Modules/ConfigureGoogleTest.cmake

@@ -0,0 +1,49 @@
+set(GTEST_ROOT "${CMAKE_BINARY_DIR}/googletest")


Similar to google benchmark above, this is a hack to build at configuration time which isn't a good idea. Should similarly make this a ExternalProject_add.

Initial commit.

b1d8ea8

jrhemstad commented Jun 4, 2020

View reviewed changes

jrhemstad added 11 commits June 4, 2020 16:04

Update CMakeLists.txt

8aa5ce4

Add .clang-format.

c644e3e

Rename registered_message to registered_string.

a6021df

Merge branch 'add-cpp-bindings' of github.com:jrhemstad/NVTX into add…

6cd7d64

…-cpp-bindings

Add benchmark infrastructure.

0086b6f

Rename to example benchmark.

42f1fbf

Add benchmark comparing RAII against raw C calls.

93698a1

Add linking against libdl for tests to resolve undefined symbol error…

b6d0441

… for dlopen/dlclose.

Consistently use nvtx3 headers.

4f75b6f

Default to building tests.

e2e17c2

Initialize reserved portion of event attributes struct.

db90682

jcohen-nvidia approved these changes Jun 15, 2020

View reviewed changes

jrhemstad added 4 commits June 15, 2020 14:31

Format.

140782f

Rename C++ benchmark.

72d2c43

Add more benchmarks.

5ce9414

Benchmark func range.

d4aedca

jrhemstad commented Jun 16, 2020

View reviewed changes

harrism reviewed Jun 18, 2020

View reviewed changes

jrhemstad added 2 commits June 19, 2020 13:25

Add initial include guards for NVTX_DISABLE.

2aae81f

Delete operator new.

f507556

jrhemstad changed the base branch from master to dev June 25, 2020 12:18