
Support column split in approx tree method #8847

Merged (3 commits, Mar 1, 2023)

Conversation

@rongou (Contributor) commented Feb 25, 2023

Since each worker has a distinct set of columns/features, and we already build histograms locally (#8811) and partition rows collaboratively (#8828), the only remaining change is to handle split finding correctly. We first find the local best splits using the existing approach, then perform a round of allgather to collect the best splits from all workers, and update each node's split to the globally best one.
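The selection step described above can be sketched roughly as follows. This is a hedged illustration, not XGBoost's actual internals: the `allgather` stand-in, the `Split` dataclass, and the gain-based tie-breaking rule are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Split:
    feature: int      # global feature index (hypothetical representation)
    threshold: float  # split threshold
    gain: float       # loss reduction from this split


def allgather(local_values: List[Split]) -> List[List[Split]]:
    # Stand-in for a collective allgather: every worker receives the
    # contribution from every other worker. Here "workers" are just
    # entries in a list, so each one gets a copy of the full list.
    return [list(local_values) for _ in local_values]


def find_global_best(local_bests: List[Split]) -> Split:
    # After allgather, every worker applies the same deterministic
    # reduction, so all workers agree on the winning split without
    # further communication. Ties on gain break toward the lower
    # feature index (an assumption for determinism in this sketch).
    return max(local_bests, key=lambda s: (s.gain, -s.feature))


# Each worker holds a distinct column subset and proposes its local best.
local_bests = [
    Split(feature=0, threshold=0.5, gain=1.2),  # worker 0
    Split(feature=7, threshold=3.1, gain=2.4),  # worker 1
    Split(feature=4, threshold=1.0, gain=0.9),  # worker 2
]

gathered = allgather(local_bests)
global_bests = [find_global_best(bests) for bests in gathered]

# Every worker independently arrives at the same globally best split.
assert all(b == global_bests[0] for b in global_bests)
print(global_bests[0])
```

The key property is that the reduction is deterministic and applied identically on every worker, so a single allgather round suffices and no broadcast of the winner is needed.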

@rongou (Contributor, Author) commented Feb 25, 2023

@trivialfis @hcho3

@trivialfis (Member) commented:

> the only change remaining is to handle split finding correctly.

That's exciting! Do you have a complete example that I can run? Would love to try it out.

@rongou (Contributor, Author) commented Feb 26, 2023

At least on a small dataset, it seems to produce results identical to row split:

```python
import filecmp
import multiprocessing

import xgboost as xgb
from xgboost import RabitTracker


def train(split, rank):
    # data_split_mode: 0 = row split, 1 = column split
    dtrain = xgb.DMatrix('demo/data/agaricus.txt.train', data_split_mode=split)
    dtest = xgb.DMatrix('demo/data/agaricus.txt.test', data_split_mode=split)
    param = {"max_depth": 2, "eta": 1, "objective": "binary:logistic"}
    watchlist = [(dtest, "eval"), (dtrain, "train")]
    num_round = 2
    bst = xgb.train(param, dtrain, num_boost_round=num_round, evals=watchlist)
    if rank == 0:
        bst.save_model(f'agaricus.model.{split}.json')


def run_worker(rabit_env, rank):
    with xgb.collective.CommunicatorContext(**rabit_env):
        print("Training with row split")
        train(0, rank)
        print("Training with column split")
        train(1, rank)


def main():
    world_size = 2
    tracker = RabitTracker(host_ip='127.0.0.1', n_workers=world_size)
    tracker.start(world_size)

    workers = []
    for rank in range(world_size):
        worker = multiprocessing.Process(target=run_worker, args=(tracker.worker_envs(), rank))
        workers.append(worker)
        worker.start()
    for worker in workers:
        worker.join()
        assert worker.exitcode == 0

    # Compare the models produced by the two split modes byte for byte.
    result = filecmp.cmp('agaricus.model.0.json', 'agaricus.model.1.json', shallow=False)
    print(f'Two models are equal: {result}')


if __name__ == "__main__":
    main()
```
