Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move RMM_LOGGING_ASSERT into separate header #1241

Merged
merged 5 commits into from
Apr 20, 2023

Conversation

ahendriksen
Copy link
Contributor

@ahendriksen ahendriksen commented Apr 4, 2023

The inclusion of rmm/logger.hpp can increase compile times significantly. Before this PR, rmm/logger.hpp was included in rmm/error.hpp which by necessity was included almost everywhere. This PR moves RMM_LOGGING_ASSERT into its own header, thereby drastically cutting down on the headers which transitively include rmm/logger.hpp.

This PR will make it possible to reduce compile times of RAFT by ~10 seconds per translation unit.

Description

Related to issue #1222 and also PR #1232. Compared to #1232, this PR might make it able to also have fast builds without precompiling spdlog.

I include a table below showing which headers transitively include rmm/logger.hpp before and after PR (in debug and release builds). These are the rmm headers used by RAFT.

Header Before After
rmm/cuda_device.hpp debug release
rmm/cuda_stream.hpp debug release debug
rmm/cuda_stream_pool.hpp debug release debug
rmm/cuda_stream_view.hpp debug release
rmm/device_buffer.hpp debug release
rmm/device_scalar.hpp debug release
rmm/device_uvector.hpp debug release
rmm/device_vector.hpp debug release
rmm/exec_policy.hpp debug release
rmm/logger.hpp debug release debug release
rmm/mr/device/aligned_resource_adaptor.hpp debug release
rmm/mr/device/arena_memory_resource.hpp debug release debug release
rmm/mr/device/binning_memory_resource.hpp debug release debug release
rmm/mr/device/callback_memory_resource.hpp debug release
rmm/mr/device/cuda_async_memory_resource.hpp debug release
rmm/mr/device/cuda_async_view_memory_resource.hpp debug release
rmm/mr/device/cuda_memory_resource.hpp debug release
rmm/mr/device/device_memory_resource.hpp debug release
rmm/mr/device/failure_callback_resource_adaptor.hpp debug release
rmm/mr/device/fixed_size_memory_resource.hpp debug release debug release
rmm/mr/device/limiting_resource_adaptor.hpp debug release
rmm/mr/device/logging_resource_adaptor.hpp debug release debug release
rmm/mr/device/managed_memory_resource.hpp debug release
rmm/mr/device/owning_wrapper.hpp debug release
rmm/mr/device/per_device_resource.hpp debug release
rmm/mr/device/polymorphic_allocator.hpp debug release
rmm/mr/device/pool_memory_resource.hpp debug release debug release
rmm/mr/device/statistics_resource_adaptor.hpp debug release
rmm/mr/device/thread_safe_resource_adaptor.hpp debug release
rmm/mr/device/thrust_allocator_adaptor.hpp debug release
rmm/mr/device/tracking_resource_adaptor.hpp debug release debug release
rmm/mr/host/host_memory_resource.hpp
rmm/mr/host/new_delete_resource.hpp
rmm/mr/host/pinned_memory_resource.hpp debug release

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Avoiding the inclusion of rmm/logger.hpp can reduce compile times
significantly.
@ahendriksen ahendriksen requested a review from a team as a code owner April 4, 2023 14:39
@github-actions github-actions bot added the cpp Pertains to C++ code label Apr 4, 2023
@ahendriksen
Copy link
Contributor Author

Since this is moving an internal macro from one detail header to another detail header, it should be considered a non-breaking change. In practice, it might break downstream projects that are expecting rmm/logger.hpp to be included. I had to fix up some files in the rmm tree for this reason. I am not sure how common it is for downstream projects to depend on rmm/logger.hpp to be included.

@harrism
Copy link
Member

harrism commented Apr 4, 2023

I include a table below showing which headers transitively include rmm/logger.hpp before and after PR (in debug and release builds). These are the rmm headers used by RAFT.

Does this PR only update the RMM headers used by RAFT? It would be better to update all headers, I think.

@harrism harrism added non-breaking Non-breaking change improvement Improvement / enhancement to an existing function labels Apr 4, 2023
@harrism
Copy link
Member

harrism commented Apr 4, 2023

So does this replace #1232?

Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Mark, we should update all headers in one PR. Afterwards, we should test that at least the core RAPIDS libraries continue to build successfully against this PR before merging.

If building cudf/cuml/cugraph is difficult for you to do locally, you can also do it fairly easily in CI by opening PRs to each repo that use the build artifacts from this PR rather than pulling the latest librmm nightly. There have been a few examples of this lately, including this commit in this raft PR. I would recommend following that pattern for at least cudf/cuml/cugraph to make sure that they compile successfully. Let me know if you need help with that.

@harrism
Copy link
Member

harrism commented Apr 5, 2023

Building locally should be easy with rapids-compose, if not devcontainers. Not sure if all repos have the devcontainers yet, but the cuspatial devcontainer can be used to build RMM, cuDF, and cuSpatial, for example.

@ahendriksen
Copy link
Contributor Author

ahendriksen commented Apr 5, 2023

Does this PR only update the RMM headers used by RAFT? It would be better to update all headers, I think.

No, this PR updates all headers and makes sure all tests compile and run. I have included the table to show that the impact on one particular downstream library is meaningful. If you want to know for other headers what the impact is, feel free to edit and run this bash snippet:

export FILES=( rmm/cuda_stream.hpp rmm/cuda_stream_pool.hpp rmm/cuda_stream_view.hpp rmm/device_buffer.hpp rmm/device_scalar.hpp rmm/device_uvector.hpp rmm/device_vector.hpp rmm/exec_policy.hpp rmm/mr/device/device_memory_resource.hpp rmm/mr/device/limiting_resource_adaptor.hpp rmm/mr/device/managed_memory_resource.hpp rmm/mr/device/per_device_resource.hpp rmm/mr/device/pool_memory_resource.hpp rmm/mr/host/new_delete_resource.hpp rmm/mr/host/pinned_memory_resource.hpp rmm/mr/device/per_device_resource.hpp)

for f in ${FILES[@]}; do
    echo -n ${f};
    g++ -x c++  -H -I./include/ -I./build/_deps/spdlog-src/include -I./build/_deps/fmt-src/include -c <(echo "#include <${f}>") 2>&1| grep -q 'spdlog' && echo -n ' debug '
    g++ -x c++ -DNDEBUG -H -I./include/ -I./build/_deps/spdlog-src/include -I./build/_deps/fmt-src/include  -c <(echo "#include <${f}>") 2>&1| grep -q 'spdlog' && echo -n ' release '
    echo ""
done

So does this replace #1232?

That really depends on the direction that RMM sets for using logging. If RMM_LOGGING_ASSERT starts to be used in core include files, like cuda_stream_view, device_uvector, etc, then #1232 will be essential to preserve compile times for downstream libraries. On the other hand, if logging is used conservatively in relatively constrained places (like memory pool), then it is easy to work around in downstream libraries and PR #1232 might be more hassle than it is worth.

I agree with Mark, we should update all headers in one PR. Afterwards, we should test that at least the core RAPIDS libraries continue to build successfully against this PR before merging.

This PR updates all headers. I will open PRs on cudf/cuml/cugraph and report back.

If building cudf/cuml/cugraph is difficult for you to do locally, you can also do it fairly easily in CI by opening PRs to each repo that use the build artifacts from this PR rather than pulling the latest librmm nightly.

Thanks for the example! I have opened

As a general question, if moving a single macro in the detail headers requires rerunning CI in downstream libraries, would it make sense to automate it?

@harrism
Copy link
Member

harrism commented Apr 5, 2023

As a general question, if moving a single macro in the detail headers requires rerunning CI in downstream libraries, would it make sense to automate it?

Testing all of RAPIDS in every push to a RMM PR would be a bit of a burden both on our CI systems and costs, and on developer productivity.

However I believe this is automated in nightly tests.

@ahendriksen
Copy link
Contributor Author

Status update:

  • the cuml cpp build + tests have succeeded.
  • the cugraph cpp build + tests have succeeded.
  • the cudf cpp build has failed because tests/utilities/identify_stream_usage.cpp uses strcmp but does not include string:
/cpp/tests/utilities/identify_stream_usage.cpp:162:35: error: 'strcmp' was not declared in this scope
2023-04-05T09:38:17.2091899Z   162 |     if (env_stream_error_mode && !strcmp(env_stream_error_mode, "print")) {
2023-04-05T09:38:17.2092304Z       |                                   ^~~~~~
2023-04-05T09:38:17.2092842Z $SRC_DIR/cpp/tests/utilities/identify_stream_usage.cpp:26:1: note: 'strcmp' is defined in header '<cstring>'; did you forget to '#include <cstring>'?

I had a typo in the python channel, so all python builds have failed.

How do you suggest I proceed?

@ahendriksen
Copy link
Contributor Author

ahendriksen commented Apr 5, 2023

I have filed rapidsai/cudf#13066 to fix the missing include and pushed to the cuml, cudf, and cugraph PR a fix to the typo in the Python RMM channel.

@ahendriksen
Copy link
Contributor Author

All test PRs are passing CI in cuml, cugraph, and cudf. I fixed two includes in cudf. One for algorithm and one for cstring.

I think we should be good to go. 👍

@vyasr
Copy link
Contributor

vyasr commented Apr 8, 2023

As a general question, if moving a single macro in the detail headers requires rerunning CI in downstream libraries, would it make sense to automate it?

I mainly requested this because in the specific case of macros we have had many issues before where dependencies were using macros that were never intended to be public. In general, I don't think this level of testing is required.

That said, there are many cases where we do want certain changes to a library to trigger tests of the dependencies. I know that you see this all the time with raft->cuml/cugraph, and any rmm or rapids-cmake change will affect all of RAPIDS. We are interested in automating at least the dispatch process, but there are many tasks ahead of that in the queue. Running all of RAPIDS CI on every rmm PR is definitely overkill though.

@ahendriksen
Copy link
Contributor Author

I mainly requested this because in the specific case of macros we have had many issues before where dependencies were using macros that were never intended to be public.

I agree that it is good to test the change on downstream libraries and I am happy that this was able to catch some missing includes in cudf.

That said, there are many cases where we do want certain changes to a library to trigger tests of the dependencies. [..] We are interested in automating at least the dispatch process, but there are many tasks ahead of that in the queue.

Great! Good to know that automating the dispatch process is in the queue.

I have tested this PR on a current branch of RAFT that I am working on. It reduced build times of libraft + tests by 10% (29 min to 26 min) and reduced build times per translation unit by 11s (median) and 8.5s (mean). There are still some big 'chunks' in the RAFT process that limit the overall impact. For incremental compilation, the impact can be quite large: most test files go from 20s to 10s compile time.

At the suggestion of @teju85, I have also checked with github code search for uses of RMM_LOGGING_ASSERT, but found none in other libraries (except for vendored copies of RMM in their directory structure).

Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a nice improvement.

@ahendriksen
Copy link
Contributor Author

@vyasr : is there anything holding up this PR?

@vyasr
Copy link
Contributor

vyasr commented Apr 19, 2023

@vyasr : is there anything holding up this PR?

Nope sorry I've just been on vacation and hadn't had a chance to review again. Looking now.

Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, could you update the table in the PR description? I'm not sure that it is up-to-date, there are a number of files that now include both logging_assert.hpp and logger.hpp. Was the goal of the table to only capture transitive includes, or to generally list all files where the state of logger.hpp inclusion has changed? Examples like fixed_size_memory_resource.hpp or pool_memory_resource.hpp added an include, but I suppose the end result is the same in terms of what symbols end up defined, so perhaps that's what you were aiming for?

Aside from that the change looks great, thanks!

@ahendriksen
Copy link
Contributor Author

ahendriksen commented Apr 20, 2023

I have updated the table. It now contains all non-detail headers. Indeed it lists per header if it transitively includes spdlog. It was not the intention to list all files where the inclusion of logger.hpp has changed.

@vyasr
Copy link
Contributor

vyasr commented Apr 20, 2023

Awesome thanks for that @ahendriksen, looks great!

@vyasr
Copy link
Contributor

vyasr commented Apr 20, 2023

/merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cpp Pertains to C++ code improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants