
Improve initial setup time and memory consumption in fast histogram #2543

Closed
hcho3 wants to merge 1 commit

Conversation

@hcho3 (Collaborator) commented Jul 24, 2017

This is a response to issue #2326.

  • Modified the approximate quantile sketch implementation to reduce memory consumption
  • Parallel construction of the columnar access structure -- a transposed copy of the input matrix that speeds up column-wise access
  • New option (use_columnar_access=0) to disable the columnar access structure entirely, further reducing initial setup time and memory usage; column access may be slower with this option
  • Slightly improved feature grouping algorithm
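The columnar access structure described above is essentially a column-major (CSC-like) copy of the row-major input matrix, so that per-feature scans don't have to walk every row. A minimal sketch of the idea in Python, not the actual xgboost C++ code; all names here are illustrative:

```python
from collections import defaultdict

def build_columnar_access(rows):
    """Transpose a row-major sparse matrix into per-column lists.

    rows: list of rows, each row a list of (feature_index, bin_id) pairs.
    Returns a dict mapping feature_index -> [(row_index, bin_id), ...],
    so all entries of one feature are contiguous and cheap to scan.
    """
    columns = defaultdict(list)
    for row_index, row in enumerate(rows):
        for feature_index, bin_id in row:
            columns[feature_index].append((row_index, bin_id))
    return dict(columns)

# Three rows, sparse over features 0 and 2:
rows = [[(0, 3), (2, 1)], [(0, 5)], [(2, 2)]]
cols = build_columnar_access(rows)
# cols[0] == [(0, 3), (1, 5)]; cols[2] == [(0, 1), (2, 2)]
```

Disabling the structure (use_columnar_access=0) skips this transpose entirely, which is why it saves both setup time and memory at the cost of slower column access later.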

@hcho3 (Collaborator, Author) commented Jul 24, 2017

Here are some results on the URL dataset (see here for description), using a c3.8xlarge instance (60 GB RAM, 32 cores):

| Configuration | Setup time | Peak memory usage | Full log |
|---|---|---|---|
| enable_feature_grouping=1, use_columnar_access=1 | 190 sec | 30.5 GB | log.txt |
| enable_feature_grouping=0, use_columnar_access=1 | 69 sec | 12.4 GB | log2.txt |
| enable_feature_grouping=0, use_columnar_access=0 | 40 sec | 12.4 GB | log3.txt |

@codecov-io commented Jul 24, 2017

Codecov Report

Merging #2543 into master will decrease coverage by 0.25%.
The diff coverage is 2.85%.


@@            Coverage Diff             @@
##           master    #2543      +/-   ##
==========================================
- Coverage   35.11%   34.86%   -0.26%     
==========================================
  Files          79       80       +1     
  Lines        6971     7062      +91     
  Branches      680      695      +15     
==========================================
+ Hits         2448     2462      +14     
- Misses       4422     4512      +90     
+ Partials      101       88      -13
| Impacted Files | Coverage Δ |
|---|---|
| src/common/hist_util.h | 0% <ø> (ø) ⬆️ |
| src/common/hist_util.cc | 0% <0%> (ø) ⬆️ |
| src/common/column_matrix.h | 0% <0%> (ø) ⬆️ |
| src/tree/updater_fast_hist.cc | 1.08% <0%> (-0.15%) ⬇️ |
| src/common/memory.h | 0% <0%> (ø) |
| src/tree/fast_hist_param.h | 100% <100%> (ø) ⬆️ |
| src/c_api/c_api.cc | 17.77% <0%> (-0.19%) ⬇️ |

... and 5 more

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0e06d18...bea26c9.

@tqchen (Member) commented Jul 28, 2017

@hcho3 let me know if it is ready to merge

@Laurae2 (Contributor) commented Jul 28, 2017

@hcho3 Does this PR change the training speed, aside from the setup time? (So I know whether I have to redo my benchmarks based on this PR.)

Laurae2 added a commit to Laurae2/ez_xgb that referenced this pull request Aug 1, 2017
@Laurae2 (Contributor) commented Aug 1, 2017

@hcho3 It crashes on my custom reputation dataset (2,250,000 observations x 23,636 features) when it reaches the feature grouping part. If I disable feature grouping, it works (after 18 minutes of columnar access structure generation).


Working without feature grouping:

> model <- xgb.train(params = list(nthread = 40,
+                                  #max_depth = 3,
+                                  num_leaves = 127,
+                                  tree_method = "hist",
+                                  grow_policy = "depthwise",
+                                  eta = 0.25,
+                                  max_bin = 255,
+                                  eval_metric = "auc",
+                                  debug_verbose = 2,
+                                  enable_feature_grouping = 0),
+                    data = train,
+                    nrounds = 10,
+                    watchlist = list(test = test),
+                    verbose = 2,
+                    early_stopping_rounds = 50)
[13:33:22] Tree method is selected to be 'hist', which uses a single updater grow_fast_histmaker.
[13:33:22] amalgamation/../src/common/hist_util.cc:37: Generating sketches...
[13:33:29] amalgamation/../src/common/hist_util.cc:75: Computing quantiles for features [0, 23636)...
[13:35:08] amalgamation/../src/tree/updater_fast_hist.cc:70: Quantizing data matrix entries into quantile indices...
[13:35:20] amalgamation/../src/tree/updater_fast_hist.cc:75: Generating columnar access structure...
[13:53:24] amalgamation/../src/tree/updater_fast_hist.cc:92: Done initializing training: 1201.81 sec
[13:54:20] amalgamation/../src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 98 extra nodes, 0 pruned nodes, max_depth=6
[13:54:20] amalgamation/../src/tree/updater_fast_hist.cc:273: 
InitData:          0.0497 ( 0.09%)
InitNewNode:       0.0312 ( 0.06%)
BuildHist:         51.9788 (92.85%)
EvaluateSplit:     3.8466 ( 6.87%)
ApplySplit:        0.0780 ( 0.14%)
========================================
Total:             55.9844
[13:54:20] amalgamation/../src/gbm/gbtree.cc:274: CommitModel(): 0.34489 sec
[13:54:20] amalgamation/../src/learner.cc:373: EvalOneIter(): 0 sec
[1]	test-auc:0.500636 
Will train until test_auc hasn't improved in 50 rounds.

[13:54:58] amalgamation/../src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 90 extra nodes, 0 pruned nodes, max_depth=6
[13:54:58] amalgamation/../src/tree/updater_fast_hist.cc:273: 
InitData:          0.0094 ( 0.02%)
InitNewNode:       0.0467 ( 0.12%)
BuildHist:         34.3774 (90.31%)
EvaluateSplit:     3.5781 ( 9.40%)
ApplySplit:        0.0546 ( 0.14%)
========================================
Total:             38.0662
[13:54:58] amalgamation/../src/gbm/gbtree.cc:274: CommitModel(): 0.0157399 sec
[13:54:58] amalgamation/../src/learner.cc:373: EvalOneIter(): 0.00942326 sec

I also took Bosch to test it; it takes a long time to start, while it used to start nearly instantly (it now takes 8 min 30 s).

@hcho3 (Collaborator, Author) commented Aug 1, 2017

@Laurae2 Sorry for the late reply. I was out of town for a few days. This PR was solely concerned with improving setup time. Thanks again for your contribution. I do have a few questions for you.

  1. When you say "custom reputation dataset," are you referring to the same dataset you used in issue #2326 (Fast histogram algorithm exhibits long start-up time and high memory usage)? I had run tests using the dataset given in #2326 with feature grouping enabled, without any crashes. Let me take another look into this.

  2. Are you still using this machine?

  • Quad E7-88xx v4 (72 cores), but 3 CPUs were disabled to get rid of NUMA
  • 1TB RAM 1866MHz RAM (256GB RAM available to get rid of NUMA)
  • Not virtualized

I'm trying to guess whether feature grouping crashes due to lack of memory.

  3. Do you have the full log for the Bosch run (with debug_verbose=1)?

@tqchen I do need to take another look at this pull request. While I'm at it, I'm also inclined to completely rewrite the feature grouping logic to make it more parallel. (The logic is inherently sequential right now, as it has to inspect one feature at a time.) Let me get back to you on this.

@Laurae2 (Contributor) commented Aug 1, 2017

@hcho3 The issue is probably here: https://github.com/hcho3/xgboost/blob/14a33f6cdf2b388f64293b0ac0718d5ab945b37f/src/common/column_matrix.h#L157-L189, or in a part where OpenMP is used on all cores (the threads are probably sharing a common variable, leading to negative scalability).

  1. It is not exactly the same dataset. My custom reputation dataset can be produced with the following preprocessing steps, applied in order:

It leads to the creation of a very large RDS file which contains the dataset (see #2326 (comment) for the training script).

  2. Yes, still using that machine, but with all cores/RAM. I had over 950 GB of free RAM before it crashed. I noticed CPU usage was around 5%; xgboost was trying to leverage all cores but couldn't do so efficiently.

  3. Unfortunately my server is not available, so I tried to reproduce the log on my laptop. It seems I cannot reproduce it there (I hit RAM swap at feature grouping, so it is normal that feature grouping is slow).

However, when it took 8 min 30 s on Bosch I had the same issue as in 2., i.e. low CPU usage. The very slow part was when xgboost was trying to use all available cores.

Log for my laptop, i7-4600U:

> model <- xgb.train(params = list(nthread = 4,
+                                  max_depth = 6,
+                                  num_leaves = 63,
+                                  tree_method = "hist",
+                                  grow_policy = "depthwise",
+                                  eta = 0.05,
+                                  max_bin = 255,
+                                  eval_metric = "auc",
+                                  debug_verbose = 2,
+                                  enable_feature_grouping = 1),
+                    data = train,
+                    nrounds = 1,
+                    watchlist = list(test = test),
+                    verbose = 2,
+                    early_stopping_rounds = 50)
[21:05:28] Tree method is selected to be 'hist', which uses a single updater grow_fast_histmaker.
[21:05:29] amalgamation/../src/common/hist_util.cc:37: Generating sketches...
[21:05:33] amalgamation/../src/common/hist_util.cc:75: Computing quantiles for features [0, 969)...
[21:05:33] amalgamation/../src/tree/updater_fast_hist.cc:70: Quantizing data matrix entries into quantile indices...
[21:05:37] amalgamation/../src/tree/updater_fast_hist.cc:75: Generating columnar access structure...
[21:05:49] amalgamation/../src/tree/updater_fast_hist.cc:82: Grouping features together...
[21:08:17] amalgamation/../src/tree/updater_fast_hist.cc:92: Done initializing training: 167.956 sec
[21:08:24] amalgamation/../src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 82 extra nodes, 0 pruned nodes, max_depth=6
[21:08:24] amalgamation/../src/tree/updater_fast_hist.cc:273: 
InitData:          0.0625 ( 0.83%)
InitNewNode:       0.0000 ( 0.00%)
BuildHist:         7.2841 (96.83%)
EvaluateSplit:     0.0499 ( 0.66%)
ApplySplit:        0.1260 ( 1.67%)
========================================
Total:             7.5224
[21:08:29] amalgamation/../src/gbm/gbtree.cc:274: CommitModel(): 4.39619 sec
[21:08:29] amalgamation/../src/learner.cc:373: EvalOneIter(): 0.0469153 sec
[1]	test-auc:0.606967 
Will train until test_auc hasn't improved in 50 rounds.

@khotilov (Member) commented Aug 2, 2017

@hcho3 parallelizing this part would be useful. I've also noticed that during the initialization lots of time was spent with just a single thread working.

hcho3 reopened this Aug 2, 2017
@hcho3 (Collaborator, Author) commented Aug 2, 2017

I had accidentally closed the pull request. Sorry for any confusion caused.

@hcho3 (Collaborator, Author) commented Aug 2, 2017

@khotilov I was originally planning to postpone the rewrite until after this pull request, but I changed my mind. Let me go ahead and fix the feature grouping logic.

@hcho3 (Collaborator, Author) commented Aug 2, 2017

@Laurae2 Thanks! I will promptly investigate the issue and get back to you.

@Laurae2 (Contributor) commented Aug 7, 2017

@hcho3 You can reproduce the issue on the Bosch dataset (or any other large dataset) by doing the following on any machine, even one without 40 threads:

model <- xgb.train(params = list(nthread = 40,
                                 max_depth = 6,
                                 num_leaves = 63,
                                 tree_method = "hist",
                                 grow_policy = "depthwise",
                                 eta = 0.05,
                                 max_bin = 255,
                                 eval_metric = "auc",
                                 debug_verbose = 2,
                                 enable_feature_grouping = 1),
                   data = train,
                   nrounds = 2,
                   watchlist = list(test = test),
                   verbose = 2,
                   early_stopping_rounds = 50)

Log of the i7-7700K, which is roughly 2.5x faster per thread than my 72-core server (3 min here vs 8 min there):

[12:52:52] Tree method is selected to be 'hist', which uses a single updater grow_fast_histmaker.
[12:52:52] amalgamation/../src/common/hist_util.cc:37: Generating sketches...
[12:52:55] amalgamation/../src/common/hist_util.cc:75: Computing quantiles for features [0, 969)...
[12:52:55] amalgamation/../src/tree/updater_fast_hist.cc:70: Quantizing data matrix entries into quantile indices...
[12:52:56] amalgamation/../src/tree/updater_fast_hist.cc:75: Generating columnar access structure...
[12:56:04] amalgamation/../src/tree/updater_fast_hist.cc:82: Grouping features together...
[12:56:18] amalgamation/../src/tree/updater_fast_hist.cc:92: Done initializing training: 205.547 sec
[12:56:19] amalgamation/../src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 90 extra nodes, 0 pruned nodes, max_depth=6
[12:56:19] amalgamation/../src/tree/updater_fast_hist.cc:273: 
InitData:          0.0029 ( 0.24%)
InitNewNode:       0.0624 ( 5.17%)
BuildHist:         1.1251 (93.29%)
EvaluateSplit:     0.0157 ( 1.30%)
ApplySplit:        0.0000 ( 0.00%)
========================================
Total:             1.2060
[12:56:19] amalgamation/../src/gbm/gbtree.cc:274: CommitModel(): 0.0781817 sec
[12:56:19] amalgamation/../src/learner.cc:373: EvalOneIter(): 0.0336745 sec
[1]	test-auc:0.606966 
Will train until test_auc hasn't improved in 50 rounds.

[12:56:20] amalgamation/../src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 86 extra nodes, 0 pruned nodes, max_depth=6
[12:56:20] amalgamation/../src/tree/updater_fast_hist.cc:273: 
InitData:          0.0000 ( 0.00%)
InitNewNode:       0.0000 ( 0.00%)
BuildHist:         0.7504 (85.57%)
EvaluateSplit:     0.0953 (10.86%)
ApplySplit:        0.0313 ( 3.57%)
========================================
Total:             0.8770
[12:56:20] amalgamation/../src/gbm/gbtree.cc:274: CommitModel(): 0.0135319 sec
[12:56:20] amalgamation/../src/learner.cc:373: EvalOneIter(): 0.0468612 sec
[2]	test-auc:0.607018 

@hcho3 (Collaborator, Author) commented Aug 7, 2017

@Laurae2 Sorry for the delay. I've been spending most of my time working on the first release of dmlc/tree-lite. Let me look at it this week for sure. Thanks!

Laurae2 added a commit to Laurae2/ez_xgb that referenced this pull request Aug 8, 2017
Updating to dmlc/xgboost#2543 (02/08/2017)
@Laurae2 (Contributor) commented Aug 8, 2017

@hcho3 I changed this (https://github.com/Laurae2/ez_xgb/blob/devel/src/common/column_matrix.h#L157) and it's way faster now. Only feature grouping is still causing me issues (crashes).

Old:

[01:47:42] Tree method is selected to be 'hist', which uses a single updater grow_fast_histmaker.
[01:47:42] amalgamation/../src/common/hist_util.cc:37: Generating sketches...
[01:47:44] amalgamation/../src/common/hist_util.cc:75: Computing quantiles for features [0, 969)...
[01:47:45] amalgamation/../src/tree/updater_fast_hist.cc:70: Quantizing data matrix entries into quantile indices...
[01:47:45] amalgamation/../src/tree/updater_fast_hist.cc:75: Generating columnar access structure...
[01:55:08] amalgamation/../src/tree/updater_fast_hist.cc:82: Grouping features together...
[01:56:07] amalgamation/../src/tree/updater_fast_hist.cc:92: Done initializing training: 504.929 sec
[01:56:08] amalgamation/../src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 82 extra nodes, 0 pruned nodes, max_depth=6
[01:56:08] amalgamation/../src/tree/updater_fast_hist.cc:273: 
InitData:          0.0156 ( 1.70%)
InitNewNode:       0.0157 ( 1.70%)
BuildHist:         0.7186 (77.95%)
EvaluateSplit:     0.1563 (16.96%)
ApplySplit:        0.0157 ( 1.70%)
========================================
Total:             0.9219
[01:56:08] amalgamation/../src/gbm/gbtree.cc:274: CommitModel(): 0.0940418 sec
[01:56:08] amalgamation/../src/learner.cc:373: EvalOneIter(): 0.0105848 sec
[1]	test-auc:0.606967

New:

[11:10:58] Tree method is selected to be 'hist', which uses a single updater grow_fast_histmaker.
[11:10:58] amalgamation/../src/common/hist_util.cc:37: Generating sketches...
[11:11:00] amalgamation/../src/common/hist_util.cc:75: Computing quantiles for features [0, 969)...
[11:11:01] amalgamation/../src/tree/updater_fast_hist.cc:70: Quantizing data matrix entries into quantile indices...
[11:11:01] amalgamation/../src/tree/updater_fast_hist.cc:75: Generating columnar access structure...
[11:11:09] amalgamation/../src/tree/updater_fast_hist.cc:82: Grouping features together...
[11:12:07] amalgamation/../src/tree/updater_fast_hist.cc:92: Done initializing training: 69.2763 sec
[11:12:08] amalgamation/../src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 82 extra nodes, 0 pruned nodes, max_depth=6
[11:12:08] amalgamation/../src/tree/updater_fast_hist.cc:273: 
InitData:          0.0157 ( 1.71%)
InitNewNode:       0.0000 ( 0.00%)
BuildHist:         0.6918 (75.03%)
EvaluateSplit:     0.1832 (19.87%)
ApplySplit:        0.0312 ( 3.39%)
========================================
Total:             0.9220
[11:12:09] amalgamation/../src/gbm/gbtree.cc:205: CommitModel(): 0.0937798 sec
[11:12:09] amalgamation/../src/learner.cc:373: EvalOneIter(): 0.0104179 sec
[1]	test-auc:0.606967

It improved the columnar access structure generation, and it is approximately the same speed as LightGBM (also clocked at approximately 4 minutes) when used on my custom reputation dataset.


@hcho3 (Collaborator, Author) commented Aug 16, 2017

@Laurae2 With some difficulty, I've managed to reproduce the crash. The crash is indeed due to runaway memory usage of GHistIndexBlockMatrix. The xgboost process got killed by the OS as it attempted to gobble up all available memory -- even 244 GB was not enough! I'm working on a memory-efficient version of this.

Lines 157-189 of column_matrix.h are not related to the crash. I tried to reproduce the negative scaling in this section but could not. With your custom reputation dataset, I get 11 sec with 32 threads and 73 sec with 1 thread. What was your value of nthread? My guess is that this section cannot make good use of many threads due to load imbalance. Since it is not a huge bottleneck, we could simply cap the number of threads at 16.
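The two mitigations mentioned above (capping the thread count, and balancing per-thread work) can be sketched together. This is an illustrative Python sketch, not the code in this PR; the greedy balancing below is my own illustration of how columns of very different sizes could be spread evenly across a capped number of threads:

```python
def balanced_chunks(col_sizes, nthread, max_threads=16):
    """Partition column indices into at most max_threads chunks,
    greedily balancing total entry counts across chunks.

    col_sizes: number of entries per column (a proxy for its cost).
    """
    nthread = min(nthread, max_threads)  # cap, per the suggestion above
    chunks = [[] for _ in range(nthread)]
    loads = [0] * nthread
    # Assign the largest columns first, each to the least-loaded chunk.
    for col in sorted(range(len(col_sizes)), key=lambda c: -col_sizes[c]):
        target = loads.index(min(loads))
        chunks[target].append(col)
        loads[target] += col_sizes[col]
    return chunks

# Two huge columns and four tiny ones: a naive even split by index
# would give one thread almost all the work; greedy balancing doesn't.
sizes = [100, 1, 1, 1, 97, 2]
chunks = balanced_chunks(sizes, nthread=2)
```

With a static even split by index, the chunk holding columns 0 and 4 would carry nearly all the entries; the greedy assignment leaves both chunks with equal load, which is the kind of imbalance a fixed OpenMP static schedule over features can suffer from.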

@Laurae2 (Contributor) commented Aug 16, 2017

My value of nthread was 40.

I don't have access to my 1TB server, but I have some extra results below.

i7-7700, 1 thread, baremetal server:

[12:39:49] Tree method is selected to be 'hist', which uses a single updater grow_fast_histmaker.
[12:39:50] amalgamation/../src/common/hist_util.cc:37: Generating sketches...
[12:40:29] amalgamation/../src/common/hist_util.cc:75: Computing quantiles for features [0, 23636)...
[12:41:26] amalgamation/../src/tree/updater_fast_hist.cc:70: Quantizing data matrix entries into quantile indices...
[12:42:48] amalgamation/../src/tree/updater_fast_hist.cc:75: Generating columnar access structure...
[12:43:51] amalgamation/../src/tree/updater_fast_hist.cc:92: Done initializing training: 241.215 sec

i7-7700, 8 threads, baremetal server:

[12:31:22] Tree method is selected to be 'hist', which uses a single updater grow_fast_histmaker.
[12:31:22] amalgamation/../src/common/hist_util.cc:37: Generating sketches...
[12:31:29] amalgamation/../src/common/hist_util.cc:75: Computing quantiles for features [0, 23636)...
[12:32:33] amalgamation/../src/tree/updater_fast_hist.cc:70: Quantizing data matrix entries into quantile indices...
[12:32:57] amalgamation/../src/tree/updater_fast_hist.cc:75: Generating columnar access structure...
[12:33:28] amalgamation/../src/tree/updater_fast_hist.cc:92: Done initializing training: 126.432 sec

20 cores Ivy Bridge, 1 thread, virtualized:

[13:05:39] Tree method is selected to be 'hist', which uses a single updater grow_fast_histmaker.
[13:05:42] amalgamation/../src/common/hist_util.cc:37: Generating sketches...
[13:07:19] amalgamation/../src/common/hist_util.cc:75: Computing quantiles for features [0, 23636)...
[13:08:55] amalgamation/../src/tree/updater_fast_hist.cc:70: Quantizing data matrix entries into quantile indices...
[13:10:59] amalgamation/../src/tree/updater_fast_hist.cc:75: Generating columnar access structure...
[13:13:09] amalgamation/../src/tree/updater_fast_hist.cc:92: Done initializing training: 447.5 sec

20 cores Ivy Bridge, 10 threads, virtualized:

[13:27:20] Tree method is selected to be 'hist', which uses a single updater grow_fast_histmaker.
[13:27:20] amalgamation/../src/common/hist_util.cc:37: Generating sketches...
[13:27:28] amalgamation/../src/common/hist_util.cc:75: Computing quantiles for features [0, 23636)...
[13:29:06] amalgamation/../src/tree/updater_fast_hist.cc:70: Quantizing data matrix entries into quantile indices...
[13:29:23] amalgamation/../src/tree/updater_fast_hist.cc:75: Generating columnar access structure...
[13:33:30] amalgamation/../src/tree/updater_fast_hist.cc:92: Done initializing training: 370.016 sec

20 cores Ivy Bridge, 20 threads, virtualized:

[13:14:48] Tree method is selected to be 'hist', which uses a single updater grow_fast_histmaker.
[13:14:49] amalgamation/../src/common/hist_util.cc:37: Generating sketches...
[13:14:57] amalgamation/../src/common/hist_util.cc:75: Computing quantiles for features [0, 23636)...
[13:16:34] amalgamation/../src/tree/updater_fast_hist.cc:70: Quantizing data matrix entries into quantile indices...
[13:16:48] amalgamation/../src/tree/updater_fast_hist.cc:75: Generating columnar access structure...
[13:25:36] amalgamation/../src/tree/updater_fast_hist.cc:92: Done initializing training: 647.326 sec

20 cores Ivy Bridge, 40 threads, virtualized:

[12:43:17] Tree method is selected to be 'hist', which uses a single updater grow_fast_histmaker.
[12:43:18] amalgamation/../src/common/hist_util.cc:37: Generating sketches...
[12:43:25] amalgamation/../src/common/hist_util.cc:75: Computing quantiles for features [0, 23636)...
[12:45:07] amalgamation/../src/tree/updater_fast_hist.cc:70: Quantizing data matrix entries into quantile indices...
[12:45:15] amalgamation/../src/tree/updater_fast_hist.cc:75: Generating columnar access structure...
[13:03:55] amalgamation/../src/tree/updater_fast_hist.cc:92: Done initializing training: 1237.75 sec

@hcho3 (Collaborator, Author) commented Aug 16, 2017

With a larger instance, I've reproduced the negative scaling:

[11:54:52] src/tree/../common/column_matrix.h:193: nthread = 64, time = 62.0626 sec
[11:55:03] src/tree/../common/column_matrix.h:193: nthread = 32, time = 10.8312 sec
[11:55:12] src/tree/../common/column_matrix.h:193: nthread = 16, time = 8.80852 sec
[11:55:24] src/tree/../common/column_matrix.h:193: nthread = 8, time = 12.4482 sec
[11:55:44] src/tree/../common/column_matrix.h:193: nthread = 4, time = 19.9162 sec
[11:56:23] src/tree/../common/column_matrix.h:193: nthread = 2, time = 39.0732 sec
[11:57:39] src/tree/../common/column_matrix.h:193: nthread = 1, time = 75.285 sec

(I wrapped lines 157-189 in a loop so as to run the section with different numbers of threads.)

But wow, the negative scaling is really bad on your virtualized Ivy Bridge machine. I will try to modify this loop to eliminate all data sharing; if that doesn't work out, I'd be happy to make this section sequential.
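The "eliminate all data sharing" plan can be sketched as follows: each worker fills only its own private buffer over a disjoint slice of the input, and the partial results are merged once at the end, so no two workers ever write to shared state. This is an illustrative Python sketch of the structure, not the PR's C++/OpenMP code (and Python threads won't actually run CPU-bound work in parallel; only the sharing pattern is the point):

```python
from concurrent.futures import ThreadPoolExecutor

def process_slice(data_slice):
    # Stand-in for the per-column work; returns a private partial result.
    return [x * x for x in data_slice]

def parallel_map(data, nthread):
    """Split data into disjoint slices, process each on its own worker
    with its own output buffer, then merge the buffers sequentially."""
    size = (len(data) + nthread - 1) // nthread  # ceil division
    slices = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=nthread) as pool:
        partials = list(pool.map(process_slice, slices))
    merged = []
    for part in partials:  # single-threaded merge, order preserved
        merged.extend(part)
    return merged
```

Because every worker owns its buffer, there is no shared mutable variable to contend on (and, in the C++ case, no false sharing between threads writing adjacent memory), at the cost of one sequential merge pass.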

@tqchen (Member) commented Nov 29, 2017

Closing due to stale PR, @hcho3

tqchen closed this Nov 29, 2017
@hcho3 (Collaborator, Author) commented Nov 29, 2017

@tqchen Sorry, I haven't gotten around to working on it for a while. I will send a new PR when it's ready.

lock bot locked as resolved and limited conversation to collaborators Jan 18, 2019