[BUG] Always Persist Dask DataFrames in cuGraph-DGL Graph Storage (#4296)

Always persist the Dask DataFrames held by the cuGraph-DGL graph storage object. This resolves a bug where a DataFrame was not persisted, causing `'TypeError("Could not construct DataFrame from <class \'tuple\'>")'` to be raised when it was accessed after being saved as a dataset.

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Rick Ratzel (https://github.com/rlratzel)

URL: #4296
alexbarghi-nv authored Apr 3, 2024
1 parent 528910e commit a96e933
Showing 1 changed file with 8 additions and 1 deletion.
9 changes: 8 additions & 1 deletion python/cugraph-dgl/cugraph_dgl/cugraph_storage.py
@@ -1,4 +1,4 @@
-# Copyright (c) 2022-2023, NVIDIA CORPORATION.
+# Copyright (c) 2022-2024, NVIDIA CORPORATION.
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
@@ -170,6 +170,13 @@ def __init__(
         self._edges_dict = add_node_offset_to_edges_dict(
             _edges_dict, self._ntype_offset_d
         )
+
+        # Persist the dataframes so they can be retrieved later
+        # for a multi-GPU workflow.
+        if not single_gpu:
+            for k in list(self._edges_dict.keys()):
+                self._edges_dict[k] = self._edges_dict[k].persist()
+
         self._etype_id_dict = {
             etype: etype_id for etype_id, etype in enumerate(self.canonical_etypes)
         }
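
The loop in this change assigns the result of `persist()` back into the dictionary. With Dask collections, `persist()` returns a new collection and leaves the one it was called on unchanged, so the reassignment is what makes the persisted version visible to later code. The sketch below illustrates only that pattern. `LazyFrame` and `persist_all` are hypothetical stand-ins written for this example; they are not part of Dask or cuGraph-DGL, and the real storage class does considerably more.

```python
# Hypothetical stand-in for a lazy collection. This is NOT the Dask API;
# it only mimics one contract: persist() returns a new, materialized
# object and leaves the original untouched.
class LazyFrame:
    def __init__(self, compute_fn, data=None):
        self._compute_fn = compute_fn
        self._data = data

    @property
    def is_persisted(self):
        return self._data is not None

    def persist(self):
        # Build a new object that holds the computed result.
        return LazyFrame(self._compute_fn, data=self._compute_fn())


def persist_all(frames, single_gpu):
    """Replace every value with its persisted counterpart (multi-GPU only)."""
    if not single_gpu:
        # Snapshot the keys first, mirroring the commit. Reassigning
        # existing keys does not resize the dict, so this is defensive.
        for key in list(frames.keys()):
            frames[key] = frames[key].persist()
    return frames


# Keys shaped like canonical edge types, purely for illustration.
edges = {
    ("user", "follows", "user"): LazyFrame(lambda: [(0, 1), (1, 2)]),
    ("user", "buys", "item"): LazyFrame(lambda: [(0, 7)]),
}
persist_all(edges, single_gpu=False)
print(all(frame.is_persisted for frame in edges.values()))
```

Calling `frames[key].persist()` without storing the result would leave the dictionary pointing at the original, unpersisted objects, which is the situation the commit message describes.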