
Gpugraph #41

Merged: 102 commits, Jun 13, 2022
Commits
3faa1d8
enable graph-engine to return all id
seemingwang Apr 27, 2022
3a70c42
change vector's dimension
seemingwang Apr 28, 2022
09455a6
change vector's dimension
seemingwang Apr 28, 2022
acb8ac0
enlarge returned ids dimensions
seemingwang Apr 28, 2022
ff5fa32
add actual_val
DesmonDay Apr 29, 2022
7950086
change vlog
DesmonDay Apr 29, 2022
24cb259
fix bug
DesmonDay Apr 29, 2022
7798771
bug fix
DesmonDay Apr 29, 2022
641fcac
bug fix
DesmonDay Apr 29, 2022
7762561
fix display test
DesmonDay Apr 29, 2022
7cfe661
Merge pull request #26 from DesmonDay/gpu_graph_engine2
seemingwang May 1, 2022
918bb56
singleton of gpu_graph_wrapper
seemingwang May 1, 2022
2e44d09
Merge branch 'gpu_graph_engine2' of https://github.com/seemingwang/Pa…
seemingwang May 1, 2022
b63e2e2
change sample result's structure to fit training
seemingwang May 1, 2022
cfe8d4a
recover sample code
seemingwang May 1, 2022
0d18f57
fix
seemingwang May 1, 2022
0ad3e46
secondary sample
seemingwang May 1, 2022
2606895
add graph partition
seemingwang May 4, 2022
3cfac05
add graph partition
seemingwang May 4, 2022
38f7b15
fix pybind
seemingwang May 4, 2022
d1a74f2
optimize buffer allocation
seemingwang May 10, 2022
2e2dd2a
resolve conflicts
seemingwang May 10, 2022
3c33403
fix node transfer problem
seemingwang May 11, 2022
080e5c9
remove log
seemingwang May 11, 2022
b00a116
support 32G+ graph on single gpu
seemingwang May 12, 2022
97c3f0c
remove logs
seemingwang May 12, 2022
aaa137e
fix
seemingwang May 12, 2022
08a301f
fix
seemingwang May 12, 2022
12168b0
fix cpu query
seemingwang May 12, 2022
ec89107
display info
seemingwang May 14, 2022
c221167
remove log
seemingwang May 15, 2022
ab005da
remove empyt file
seemingwang May 15, 2022
e5c8a41
graph generator
Thunderbrook May 16, 2022
715ed06
format
Thunderbrook May 16, 2022
79a6cd5
log
Thunderbrook May 17, 2022
c805cc0
distribute labeled data evenly in graph engine
seemingwang May 18, 2022
32a4341
merge
seemingwang May 18, 2022
b522ca5
OneDNN md-in-tensor refactoring part 3: Changes in quantize and dequa…
jakpiase May 19, 2022
313f5d0
Fix PD_INFER_DECL redefine (#42731)
KernelErr May 19, 2022
e726960
[MLU] add lookup_table_v2 and unstack op (#42847)
May 19, 2022
3f3185e
GraphInsGenerator
Thunderbrook May 23, 2022
676a92c
Merge pull request #1 from Thunderbrook/gpugraph_0523
xuewujiao May 23, 2022
8b0b194
split graph_table
seemingwang May 24, 2022
91f5c32
reuse clear graph
seemingwang May 24, 2022
6672b82
optimize vector allocation
seemingwang May 24, 2022
112cf61
fix
seemingwang May 24, 2022
88f13ec
deepwalk
Thunderbrook May 24, 2022
ed569b4
format
May 24, 2022
63e501d
Merge pull request #5 from Thunderbrook/gpugraph_deepwalk
Thunderbrook May 24, 2022
dc8dbf2
rename variables
seemingwang May 25, 2022
f43d085
remove log
seemingwang May 25, 2022
a9b5445
merge
seemingwang May 25, 2022
fc28b23
Merge pull request #3 from seemingwang/gpu_graph_engine2
xuewujiao May 25, 2022
f24db08
remove len_per_row and log
May 26, 2022
8daf1a2
config window
May 26, 2022
308a394
format
May 26, 2022
47b82ac
Merge pull request #6 from Thunderbrook/gpugraph_deepwalk
Thunderbrook May 26, 2022
ee7f0e8
fix compile error
May 26, 2022
319a201
Merge pull request #7 from huwei02/gpugraph
huwei02 May 26, 2022
c313938
support slot_feature, storage and loading
May 27, 2022
0fba9f2
fix log
May 28, 2022
1d95522
Merge branch 'xuewujiao:gpugraph' into gpugraph
huwei02 May 28, 2022
9f8d4c2
Merge pull request #8 from huwei02/gpugraph
huwei02 May 28, 2022
b6a4a0e
change sample interface
May 29, 2022
94fd1e3
merge dymf branch (#42714)
yaoxuefeng6 May 19, 2022
716f908
add dymf accessor support (#42881)
yaoxuefeng6 May 20, 2022
81644ce
Acc name (#42906)
yaoxuefeng6 May 23, 2022
484ca4c
[GPUPS]fix gpups pscore (#42967)
danleifeng May 25, 2022
e14d1f0
[GPUPS]fix dymf gpups pscore (#42991)
danleifeng May 26, 2022
453ddf5
changing int64 key to uint64 for graph engine
seemingwang May 31, 2022
6225444
delete old sample interface
DesmonDay May 31, 2022
1c2d8d4
Merge branch 'gpu_graph_engine3' into gpugraph
DesmonDay May 31, 2022
c51530a
run pre-commit
DesmonDay May 31, 2022
61a9cfa
adapt for multi table
May 31, 2022
2a16ca7
Merge pull request #9 from Thunderbrook/gpugraph_deepwalk
Thunderbrook May 31, 2022
8253f4e
Add gpu_graph_utils.h
lxsbupt May 31, 2022
f24c3e9
Merge pull request #13 from lxsbupt/gpugraph
lxsbupt May 31, 2022
78a0ed3
Merge branch 'gpugraph' of https://github.com/xuewujiao/Paddle into m…
danleifeng May 31, 2022
283a430
metapath
May 31, 2022
e6cfbde
adapt dymf for graph;test=develop
danleifeng May 31, 2022
1d313a8
Merge pull request #28 from DesmonDay/gpugraph
seemingwang Jun 1, 2022
0c4104d
resolve conflicts
seemingwang Jun 1, 2022
a9e5973
resolve conflicts
seemingwang Jun 1, 2022
b01e178
Merge pull request #10 from danleifeng/mul_dims
Thunderbrook Jun 1, 2022
eaa58a3
decrease warnings
seemingwang Jun 1, 2022
eae32ae
fix pybind int64 problem
seemingwang Jun 1, 2022
2457680
Merge pull request #12 from seemingwang/gpu_graph_engine3
seemingwang Jun 1, 2022
440da15
metapath
Jun 1, 2022
a11d1e3
meatapath
Jun 2, 2022
b624aec
merge develop
Jun 2, 2022
d01a280
Merge pull request #15 from Thunderbrook/gpugraph_deepwalk
Thunderbrook Jun 2, 2022
4fb5c13
adapt uint64 and iterate edge table
Thunderbrook Jun 2, 2022
20a1aa3
Merge pull request #17 from Thunderbrook/gpugraph_0602
Thunderbrook Jun 2, 2022
d7eae03
Optimize memory overhead for gpugraph. (#18)
lxsbupt Jun 6, 2022
750e343
[GpuGraph] same first_node_type share start (#19)
Thunderbrook Jun 8, 2022
1816fc2
search and fill slot_feature (#20)
huwei02 Jun 8, 2022
8c65693
remove debug code (#21)
huwei02 Jun 8, 2022
ccccb5f
remove debug code (#22)
huwei02 Jun 8, 2022
3bde5e4
Gpugraph (#24)
miaoli06 Jun 10, 2022
68c2cc4
dedup feature, opt parse feature (#25)
huwei02 Jun 11, 2022
ba5d709
solve int overflow (#26)
Thunderbrook Jun 12, 2022
77b007e
[GpuGraph] add enforce in alloc and free (#27)
Thunderbrook Jun 13, 2022
3 changes: 3 additions & 0 deletions cmake/configure.cmake
@@ -227,3 +227,6 @@ endif(WITH_CRYPTO)
if(WITH_CUSTOM_DEVICE AND NOT WIN32)
add_definitions(-DPADDLE_WITH_CUSTOM_DEVICE)
endif()
if(WITH_GPU_GRAPH)
add_definitions(-DPADDLE_WITH_GPU_GRAPH)
endif()
73 changes: 23 additions & 50 deletions paddle/fluid/distributed/ps/service/graph_brpc_server.cc
@@ -143,10 +143,8 @@ int32_t GraphBrpcService::add_graph_node(Table *table,

int idx_ = *(int *)(request.params(0).c_str());
size_t node_num = request.params(1).size() / sizeof(int64_t);
int64_t *node_data = (int64_t *)(request.params(1).c_str());
// size_t node_num = request.params(0).size() / sizeof(int64_t);
// int64_t *node_data = (int64_t *)(request.params(0).c_str());
std::vector<int64_t> node_ids(node_data, node_data + node_num);
uint64_t *node_data = (uint64_t *)(request.params(1).c_str());
std::vector<uint64_t> node_ids(node_data, node_data + node_num);
std::vector<bool> is_weighted_list;
if (request.params_size() == 3) {
size_t weight_list_size = request.params(2).size() / sizeof(bool);
@@ -177,11 +175,9 @@ int32_t GraphBrpcService::remove_graph_node(Table *table,
return 0;
}
int idx_ = *(int *)(request.params(0).c_str());
size_t node_num = request.params(1).size() / sizeof(int64_t);
int64_t *node_data = (int64_t *)(request.params(1).c_str());
// size_t node_num = request.params(0).size() / sizeof(int64_t);
// int64_t *node_data = (int64_t *)(request.params(0).c_str());
std::vector<int64_t> node_ids(node_data, node_data + node_num);
size_t node_num = request.params(1).size() / sizeof(uint64_t);
uint64_t *node_data = (uint64_t *)(request.params(1).c_str());
std::vector<uint64_t> node_ids(node_data, node_data + node_num);

((GraphTable *)table)->remove_graph_node(idx_, node_ids);
return 0;
@@ -215,11 +211,6 @@ int32_t GraphBrpcService::Initialize() {
&GraphBrpcService::graph_set_node_feat;
_service_handler_map[PS_GRAPH_SAMPLE_NODES_FROM_ONE_SERVER] =
&GraphBrpcService::sample_neighbors_across_multi_servers;
// _service_handler_map[PS_GRAPH_USE_NEIGHBORS_SAMPLE_CACHE] =
// &GraphBrpcService::use_neighbors_sample_cache;
// _service_handler_map[PS_GRAPH_LOAD_GRAPH_SPLIT_CONFIG] =
// &GraphBrpcService::load_graph_split_config;
// Shard initialization: the shard info of server_list can only be obtained from env after the server starts
InitializeShardInfo();

return 0;
@@ -384,9 +375,6 @@ int32_t GraphBrpcService::pull_graph_list(Table *table,
int start = *(int *)(request.params(2).c_str());
int size = *(int *)(request.params(3).c_str());
int step = *(int *)(request.params(4).c_str());
// int start = *(int *)(request.params(0).c_str());
// int size = *(int *)(request.params(1).c_str());
// int step = *(int *)(request.params(2).c_str());
std::unique_ptr<char[]> buffer;
int actual_size;
((GraphTable *)table)
@@ -406,14 +394,10 @@ int32_t GraphBrpcService::graph_random_sample_neighbors(
return 0;
}
int idx_ = *(int *)(request.params(0).c_str());
size_t node_num = request.params(1).size() / sizeof(int64_t);
int64_t *node_data = (int64_t *)(request.params(1).c_str());
int sample_size = *(int64_t *)(request.params(2).c_str());
size_t node_num = request.params(1).size() / sizeof(uint64_t);
uint64_t *node_data = (uint64_t *)(request.params(1).c_str());
int sample_size = *(int *)(request.params(2).c_str());
bool need_weight = *(bool *)(request.params(3).c_str());
// size_t node_num = request.params(0).size() / sizeof(int64_t);
// int64_t *node_data = (int64_t *)(request.params(0).c_str());
// int sample_size = *(int64_t *)(request.params(1).c_str());
// bool need_weight = *(bool *)(request.params(2).c_str());
std::vector<std::shared_ptr<char>> buffers(node_num);
std::vector<int> actual_sizes(node_num, 0);
((GraphTable *)table)
@@ -433,7 +417,7 @@ int32_t GraphBrpcService::graph_random_sample_nodes(
brpc::Controller *cntl) {
int type_id = *(int *)(request.params(0).c_str());
int idx_ = *(int *)(request.params(1).c_str());
size_t size = *(int64_t *)(request.params(2).c_str());
size_t size = *(uint64_t *)(request.params(2).c_str());
// size_t size = *(int64_t *)(request.params(0).c_str());
std::unique_ptr<char[]> buffer;
int actual_size;
@@ -459,11 +443,9 @@ int32_t GraphBrpcService::graph_get_node_feat(Table *table,
return 0;
}
int idx_ = *(int *)(request.params(0).c_str());
size_t node_num = request.params(1).size() / sizeof(int64_t);
int64_t *node_data = (int64_t *)(request.params(1).c_str());
// size_t node_num = request.params(0).size() / sizeof(int64_t);
// int64_t *node_data = (int64_t *)(request.params(0).c_str());
std::vector<int64_t> node_ids(node_data, node_data + node_num);
size_t node_num = request.params(1).size() / sizeof(uint64_t);
uint64_t *node_data = (uint64_t *)(request.params(1).c_str());
std::vector<uint64_t> node_ids(node_data, node_data + node_num);

std::vector<std::string> feature_names =
paddle::string::split_string<std::string>(request.params(2), "\t");
@@ -497,22 +479,15 @@ int32_t GraphBrpcService::sample_neighbors_across_multi_servers(
}

int idx_ = *(int *)(request.params(0).c_str());
size_t node_num = request.params(1).size() / sizeof(int64_t),
size_t node_num = request.params(1).size() / sizeof(uint64_t),
size_of_size_t = sizeof(size_t);
int64_t *node_data = (int64_t *)(request.params(1).c_str());
int sample_size = *(int64_t *)(request.params(2).c_str());
bool need_weight = *(int64_t *)(request.params(3).c_str());

// size_t node_num = request.params(0).size() / sizeof(int64_t),
// size_of_size_t = sizeof(size_t);
// int64_t *node_data = (int64_t *)(request.params(0).c_str());
// int sample_size = *(int64_t *)(request.params(1).c_str());
// bool need_weight = *(int64_t *)(request.params(2).c_str());
// std::vector<int64_t> res = ((GraphTable
// *)table).filter_out_non_exist_nodes(node_data, sample_size);
uint64_t *node_data = (uint64_t *)(request.params(1).c_str());
int sample_size = *(int *)(request.params(2).c_str());
bool need_weight = *(bool *)(request.params(3).c_str());

std::vector<int> request2server;
std::vector<int> server2request(server_size, -1);
std::vector<int64_t> local_id;
std::vector<uint64_t> local_id;
std::vector<int> local_query_idx;
size_t rank = GetRank();
for (int query_idx = 0; query_idx < node_num; ++query_idx) {
@@ -535,7 +510,7 @@
std::vector<std::shared_ptr<char>> local_buffers;
std::vector<int> local_actual_sizes;
std::vector<size_t> seq;
std::vector<std::vector<int64_t>> node_id_buckets(request_call_num);
std::vector<std::vector<uint64_t>> node_id_buckets(request_call_num);
std::vector<std::vector<int>> query_idx_buckets(request_call_num);
for (int query_idx = 0; query_idx < node_num; ++query_idx) {
int server_index =
@@ -624,7 +599,7 @@ int32_t GraphBrpcService::sample_neighbors_across_multi_servers(

closure->request(request_idx)
->add_params((char *)node_id_buckets[request_idx].data(),
sizeof(int64_t) * node_num);
sizeof(uint64_t) * node_num);
closure->request(request_idx)
->add_params((char *)&sample_size, sizeof(int));
closure->request(request_idx)
@@ -661,11 +636,9 @@ int32_t GraphBrpcService::graph_set_node_feat(Table *table,
}
int idx_ = *(int *)(request.params(0).c_str());

// size_t node_num = request.params(0).size() / sizeof(int64_t);
// int64_t *node_data = (int64_t *)(request.params(0).c_str());
size_t node_num = request.params(1).size() / sizeof(int64_t);
int64_t *node_data = (int64_t *)(request.params(1).c_str());
std::vector<int64_t> node_ids(node_data, node_data + node_num);
size_t node_num = request.params(1).size() / sizeof(uint64_t);
uint64_t *node_data = (uint64_t *)(request.params(1).c_str());
std::vector<uint64_t> node_ids(node_data, node_data + node_num);

// std::vector<std::string> feature_names =
// paddle::string::split_string<std::string>(request.params(1), "\t");
@@ -81,14 +81,14 @@ class GraphPyService {

graph_proto->set_table_name("cpu_graph_table");
graph_proto->set_use_cache(false);
for (int i = 0; i < id_to_edge.size(); i++)
for (int i = 0; i < (int)id_to_edge.size(); i++)
graph_proto->add_edge_types(id_to_edge[i]);
for (int i = 0; i < id_to_feature.size(); i++) {
for (int i = 0; i < (int)id_to_feature.size(); i++) {
graph_proto->add_node_types(id_to_feature[i]);
auto feat_node = id_to_feature[i];
::paddle::distributed::GraphFeature* g_f =
graph_proto->add_graph_feature();
for (int x = 0; x < table_feat_conf_feat_name[i].size(); x++) {
for (int x = 0; x < (int)table_feat_conf_feat_name[i].size(); x++) {
g_f->add_name(table_feat_conf_feat_name[i][x]);
g_f->add_dtype(table_feat_conf_feat_dtype[i][x]);
g_f->add_shape(table_feat_conf_feat_shape[i][x]);