CAGRA binary size too large. #1459

Open
Tracked by #1392
tfeher opened this issue Apr 24, 2023 · 1 comment

@tfeher
Contributor

tfeher commented Apr 24, 2023

Note that RAPIDS is compiled for SM 60, 70, 75, 80, 86, and 90. Compiling for float only, with a single team size specialization, adds 125 MiB to libraft.so. With four data types and multiple team size specializations, the size increase goes over 1 GiB. We need to find out what is essential for good performance and what size increase is acceptable for RAFT.
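
As a rough illustration of why the size grows so quickly: the number of compiled kernel variants is the product of the number of target architectures, data types, and team sizes. The sketch below is only a back-of-the-envelope model. It assumes that binary size scales linearly with the variant count, which this issue does not state, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Number of kernel variants the compiler has to emit: one per combination of
// target architecture, input data type, and team size.
constexpr std::size_t kernel_variants(std::size_t n_archs,
                                      std::size_t n_dtypes,
                                      std::size_t n_team_sizes)
{
  return n_archs * n_dtypes * n_team_sizes;
}

// ASSUMPTION: size grows linearly with the variant count, so a measured
// baseline can be extrapolated. Real binaries will deviate from this.
inline double estimated_size_mib(double baseline_mib,
                                 std::size_t baseline_variants,
                                 std::size_t variants)
{
  assert(baseline_variants > 0);
  return baseline_mib * static_cast<double>(variants) /
         static_cast<double>(baseline_variants);
}
```

Under this model, taking the 125 MiB measured for float with one team size on six architectures as the baseline, four data types with (hypothetically) three team sizes would give 12 times as many variants, which is consistent with the increase of over 1 GiB reported above.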

@tfeher tfeher mentioned this issue Apr 24, 2023
16 tasks
@tfeher tfeher changed the title Binary size too large. Note that rapids is compiled for sm 60, 70, 75, 80, 86, 90. Compiling for float only, with one team specilization per team adds 125 MiB to libraft.so. Four data types and multiple team size specialization goes over 1 GiB size increase. Find out what is essential for good perf, and what is acceptable size increase for RAFT. CAGRA binary size too large. Apr 24, 2023
@tfeher
Contributor Author

tfeher commented Apr 25, 2023

See #1428 (comment) for details on the binary size (when compiled only for the float input type).

If we could make the block size a runtime parameter (instead of a template parameter), that would lead to a significant reduction in binary size:
https://github.com/rapidsai/raft/blob/branch-23.06/cpp/include/raft/neighbors/detail/cagra/search_single_cta.cuh#L827-L841
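
To illustrate the trade-off, here is a minimal host-side C++ sketch. It is not the RAFT kernel code: the function names and the set of supported block sizes (64, 128, 256) are hypothetical. The templated version is compiled once per supported block size and needs a dispatcher, while the runtime version is compiled once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Template version: every distinct BLOCK_SIZE is a separate instantiation,
// so each supported value adds another copy of the code to the binary.
template <std::size_t BLOCK_SIZE>
float block_sum_templated(const std::vector<float>& data, std::size_t block_id)
{
  const std::size_t begin = block_id * BLOCK_SIZE;
  float acc = 0.0f;
  for (std::size_t i = begin; i < begin + BLOCK_SIZE && i < data.size(); ++i) {
    acc += data[i];
  }
  return acc;
}

// Dispatcher required by the template version: one case per supported size.
inline float block_sum_dispatch(const std::vector<float>& data,
                                std::size_t block_id,
                                std::size_t block_size)
{
  switch (block_size) {
    case 64: return block_sum_templated<64>(data, block_id);
    case 128: return block_sum_templated<128>(data, block_id);
    case 256: return block_sum_templated<256>(data, block_id);
    default:
      assert(false && "unsupported block size");
      return 0.0f;
  }
}

// Runtime version: a single compiled function handles any block size.
inline float block_sum_runtime(const std::vector<float>& data,
                               std::size_t block_id,
                               std::size_t block_size)
{
  const std::size_t begin = block_id * block_size;
  float acc = 0.0f;
  for (std::size_t i = begin; i < begin + block_size && i < data.size(); ++i) {
    acc += data[i];
  }
  return acc;
}
```

In a CUDA kernel the runtime value would typically be read from `blockDim.x`. The cost of the runtime version is that the compiler can no longer specialize the code (for example, unroll loops) for a known block size, so the performance impact would need to be measured.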

@tfeher tfeher assigned tfeher and enp1s0 and unassigned tfeher Apr 25, 2023
rapids-bot bot pushed a commit that referenced this issue Jun 9, 2023
This PR adds padding to the dataset (if necessary) to make reading any of its rows compatible with 128-bit vectorized loads. This change also enables handling an arbitrary number of input features (before this PR, each row had to be at least 64-bit aligned, which constrained the acceptable number of input features).

Fixes #1458.

With this change, it is sufficient to keep a single "load type" specialization for the search kernels, which should cut the binary size in half (#1459).

Authors:
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - tsuki (https://github.com/enp1s0)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1505
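
The padding idea from the commit above can be sketched on the host side as follows. This is an illustration, not the code from the PR: the helper names are hypothetical, and padding with zeros is an assumption made here for simplicity. Each row is extended so that its length in bytes is a multiple of 16 (128 bits).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kVectorLoadBytes = 16;  // one 128-bit vectorized load

// Smallest row length (in elements) >= dim whose byte size is a multiple of
// 16 bytes.
template <typename T>
std::size_t padded_dim(std::size_t dim)
{
  static_assert(kVectorLoadBytes % sizeof(T) == 0,
                "element size must divide the vector load width");
  const std::size_t elems_per_load = kVectorLoadBytes / sizeof(T);
  return (dim + elems_per_load - 1) / elems_per_load * elems_per_load;
}

// Copy a row-major n_rows x dim matrix into a buffer whose rows have the
// padded length. ASSUMPTION: the extra elements are filled with zeros.
template <typename T>
std::vector<T> pad_rows(const std::vector<T>& src, std::size_t n_rows, std::size_t dim)
{
  assert(src.size() == n_rows * dim);
  const std::size_t ld = padded_dim<T>(dim);  // leading dimension after padding
  std::vector<T> dst(n_rows * ld, T{0});
  for (std::size_t r = 0; r < n_rows; ++r) {
    std::copy_n(src.data() + r * dim, dim, dst.data() + r * ld);
  }
  return dst;
}
```

Every row then starts at a multiple of 16 bytes from the start of the buffer, so rows are 16-byte aligned whenever the buffer itself is. That is what would allow a single load width to serve any number of input features.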
rapids-bot bot pushed a commit that referenced this issue Sep 26, 2023
This PR removes block size template parameters from CAGRA search kernel functions to reduce the library size and build time.

rel: #1459

Authors:
  - tsuki (https://github.com/enp1s0)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)

URL: #1740