Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Small optimization for block hist matrix. #4501

Closed
wants to merge 15 commits into from
Closed

Conversation

trivialfis
Copy link
Member

@trivialfis trivialfis commented May 25, 2019

This is an early WIP. Please ignore copied and pasted code. Initial benchmark with HIGGS:

  • Platform : Ubuntu 18.10
  • CPU : AMD Ryzen 7 2700 Eight-Core Processor

Results:

  • Time spent in GHistIndexBlockMatrix::Init
Before Now
7.56465 s 4.09427 s (-45.87%)
  • High water marked (peak) memory usage with GHistIndexBlockMatrix::Init:
Before Now
Child high-water RSS 12138724 KiB 8359052 KiB (-31.13%)
Recursive and acc. high-water RSS+CACHE 19922500 KiB 8371424 KiB (-57.98%)

This is mostly for addressing memory usage issue. The benchmark is carried out by tests/benchmark/block_hist_init.cc. Hopefully with more effort we can bring back feature grouping, which might be beneficial to GPU Hist too.

One particular thing worth noting is that, I tried injecting OMP into many places, it ends up single thread version is the fastest one.

@trivialfis
Copy link
Member Author

Sadly this is not even close to fixing #2326 .

@trivialfis trivialfis changed the title [WIP] Small optimization for block hist matrix. Small optimization for block hist matrix. May 27, 2019
@trivialfis trivialfis requested a review from hcho3 May 27, 2019 04:40
@trivialfis
Copy link
Member Author

@hcho3 The benchmarking code is not necessary, but very useful for #4503 . You can decide whether we should keep it here.

@hcho3
Copy link
Collaborator

hcho3 commented May 28, 2019

@trivialfis I'll review this PR later in detail. I think we should keep the benchmark code. Later, we can expand it as part of a regular performance benchmark.

@trivialfis
Copy link
Member Author

@hcho3 Take your time. This is part of memory optimization for hist and is not the most critical one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants