Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add NVTX C++ wrappers #2

Merged
merged 36 commits into from
Aug 17, 2020
Merged

Conversation

jrhemstad
Copy link
Contributor

@jrhemstad jrhemstad commented Jun 4, 2020

This PR adds C++ convenience wrappers for the NVTX C API.

Leverages C++ objects, RAII, and "initialize on first use" C++ paradigms to make it easier to add NVTX ranges to C++ code. See README.md and Doxygen for more information.

nvtx3.hpp is the single header the provides all of the C++ wrappers.

  • Figure out directory structure

  • Figure out integration of C++ doxygen into existing Doxygen of C APIs

  • Tests

  • Benchmarks

cpp/nvtx3.hpp Outdated Show resolved Hide resolved

color() = delete;
~color() = default;
color(color const&) = default;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harrism reports that clang complains here because the copy ctor is implicitly deleted since the members are const. Should explicitly mark this ctor as deleted or remove the const from the members.

Copy link
Contributor

@jcohen-nvidia jcohen-nvidia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, thanks

Copy link
Contributor Author

@jrhemstad jrhemstad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should wrap as much of the C++ functionality as makes sense with #ifndef NVTX_DISABLE to eliminate any unnecessary extra work being done at the C++ layer.

###################################################################################################
# - compiler options ------------------------------------------------------------------------------
set(CMAKE_CXX_STANDARD 14)
set(CMAKE_C_COMPILER $ENV{tCC})
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is tCC intentional?

@jcohen-nvidia
Copy link
Contributor

@jrhemstad I haven't forgotten about you! I still need to finish some tests on symbols with inline namespaces, and compare some of my perf-affecting changes to your benchmarks.

I got myself admin permissions finally, and I've created the new branches. We are going to avoid using "master" as a branch name since LLVM and a number of other projects are starting to switch away from that name. I made "dev" and "release-v3". All development work should be direct merges to or pull requests on "dev". Then when we get to states that are in good shape, we can fast-forward merge all the changes in "dev" to "release-v3" and make a new tag on "release-v3". So far I've just made a "v3.0.1" tag on that branch. I set the default branch of the repo to be "release-v3" because that's what people will see when first arriving at the repo page, and that's the thing we want them to get if they just hit Clone or Download. So now we'll have to remember when creating PRs to not go against the default but to switch it to "dev". I read various opinions and found a bunch of agreement that this tends to work well in practice, better than having "dev" be default.

So, can you switch this PR to the "dev" branch? I think putting the wrappers in cpp in the root is fine -- I moved the include directory in the root to be c/include. You will have to update the include paths for those headers in your tests. Also I made some C and C++ sample injections -- I think you should switch your tests to these instead of using CUPTI. Removing the deps on CUDA will make it easier to build.

Also, I am going to add you to have write access to the repo.

@jrhemstad jrhemstad changed the base branch from master to dev June 25, 2020 12:18
cpp/nvtx3.hpp Outdated Show resolved Hide resolved
cpp/nvtx3.hpp Outdated
*
* @param r Handle to a range started by a prior call to `start_range`.
*/
void end_range(range_handle r)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make inline since function is defined in header file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this one does indeed need to be made inline. Unless we want to require a domain as part of the end_range. The NVTX C docs say this isn't necessary, but we can have stricter requirements on the C++ API if we ever want to change the underlying C behavior.

@jcohen-nvidia
Copy link
Contributor

Since the MSVC compiler doesn't handle the ISO646 versions of logical operators (and/or/not instead of &&/||/!) without including an additional header <iso646.h>, I would prefer sticking with the &&/||/! spellings of those. I imagine there could be Windows code that breaks if they include that header and were using and/or/not for symbols, and this is an easy enough fix to avoid that breakage.

Comment on lines 42 to 43
set(CMAKE_C_COMPILER $ENV{CC})
set(CMAKE_CXX_COMPILER $ENV{CXX})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen this used elsewhere (in rapids), but unless there's a specific reason for it, I'd recommend not do this. The default behavior of cmake is to use the environment variables CC and CXX as default for CMAKE_{C,CXX}_COMPILER, anyway. So setting them this is way is not necessary, and in fact I believe it makes cmake -DCMAKE_{C,CXX}_COMPILER=/path/whatever not working as intended.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's probably because this file is straight up copy and pasted from RMM or cuDF :)

I'm far from a CMake expert, so I would be super appreciative of how to improve this cmake file.

Copy link
Contributor

@germasch germasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmake isn't my main thing either, but anyway, here are some more obvious cmake changes. More could be done in terms of using targets rather than setting global include / link directories, but that'd require some actual careful work and testing.

Comment on lines 47 to 56

option(CMAKE_CXX11_ABI "Enable the GLIBCXX11 ABI" ON)
if(CMAKE_CXX11_ABI)
message(STATUS "NVTX3: Enabling the GLIBCXX11 ABI")
else()
message(STATUS "NVTX3: Disabling the GLIBCXX11 ABI")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -D_GLIBCXX_USE_CXX11_ABI=0")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -D_GLIBCXX_USE_CXX11_ABI=0")
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -Xcompiler -D_GLIBCXX_USE_CXX11_ABI=0")
endif(CMAKE_CXX11_ABI)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems unlikely to me that you'd want to do mess with the GLIBCXX11 ABI here, ie, this probably could all be removed.

Suggested change
option(CMAKE_CXX11_ABI "Enable the GLIBCXX11 ABI" ON)
if(CMAKE_CXX11_ABI)
message(STATUS "NVTX3: Enabling the GLIBCXX11 ABI")
else()
message(STATUS "NVTX3: Disabling the GLIBCXX11 ABI")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -D_GLIBCXX_USE_CXX11_ABI=0")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -D_GLIBCXX_USE_CXX11_ABI=0")
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -Xcompiler -D_GLIBCXX_USE_CXX11_ABI=0")
endif(CMAKE_CXX11_ABI)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After I made this change I get a segfault when I run the benchmarks. I suspect this has something to do with the way gbench is configured, but I don't know enough to triage the exact cause.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kkraus14 helped me figure out that I just needed to delete all the C11 ABi stuff in the other cmake modules as well. Once I did that it worked: 48d9679

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, removing it all is obviously what should be done, and goes to show that one (that is, me) shouldn't just make suggestions based on what I see in the PR, but rather look at the whole thing.

I guess I'm surprised that you got segfaults rather than link errors -- when I dealt with libstdc++ ABI changes before, they had put the new ABI into a std::__1:: namespace, so one would at least notice the incompatible ABIs, rather than have things silently go bad.

cpp/CMakeLists.txt Show resolved Hide resolved
cpp/benchmarks/CMakeLists.txt Outdated Show resolved Hide resolved
@jcohen-nvidia
Copy link
Contributor

@jrhemstad Quick question -- did you update this to pick up the fact that the C API stuff moved from "include" to "c/include"? I noticed when I last checked out the PR that it still thought the C API's include directory was at the repo root. If that's good to go, I'll merge the PR and then do normal pushes to make the versioning changes, so we don't have to do it all through review comments and such

@jrhemstad
Copy link
Contributor Author

@jrhemstad Quick question -- did you update this to pick up the fact that the C API stuff moved from "include" to "c/include"? I noticed when I last checked out the PR that it still thought the C API's include directory was at the repo root. If that's good to go, I'll merge the PR and then do normal pushes to make the versioning changes, so we don't have to do it all through review comments and such

This is the include in the nvtx3.hpp header: #include <nvtx3/nvToolsExt.h>

Did you want it to be something other than that?

@jcohen-nvidia
Copy link
Contributor

@jrhemstad No no, that #include <nvtx3/nvToolsExt.h> is perfect -- I was just thinking about your cmake scripts for building tests and things. If they are in the /cpp directory and setting include search paths like ../include, they'll need to change to ../c/include to handle the current state of the dev branch.

@jrhemstad
Copy link
Contributor Author

If they are in the /cpp directory and setting include search paths like ../include, they'll need to change to ../c/include to handle the current state of the dev branch.

Ah, the cmake config doesn't have anything special to look first for the NVTX headers from the c/include in this repo. For most people building tests/benchmarks, it will just pick up the NVTX headers from the users CUDA toolkit.

@@ -0,0 +1,49 @@
set(GBENCH_ROOT "${CMAKE_BINARY_DIR}/googlebenchmark")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you just copied this file from cudf / rmm, but this file is kinda janky because we wanted to force GBench to be built at configure time instead of at built time, and in hindsight that was probably a mistake. We should probably use an ExternalProject_Add command instead and then use the benchmark target that is produced from that command for the rest of the cmake.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After digging some more, turns out this actually isn't as bad practice as previously thought. Grabbing / building a package during configuration is the only way to be able to use cmake exports from that package, which in this case we'd like to take advantage of.

@@ -0,0 +1,49 @@
set(GTEST_ROOT "${CMAKE_BINARY_DIR}/googletest")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to google benchmark above, this is a hack to build at configuration time which isn't a good idea. Should similarly make this a ExternalProject_add.

@jcohen-nvidia jcohen-nvidia merged commit 1d72ece into NVIDIA:dev Aug 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants