[BUG] Skip Empty Partitions in Bulk Sample Writing (#3600)
Currently, when the bulk sampler writes its samples, it does not check whether the partition is empty. As a result, all of the empty partitions collide and attempt to write to the same (empty) file at once. No write should occur for an empty partition.

This PR adds a check that skips writing if the partition is empty.

This should resolve the nightly test failures.

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Brad Rees (https://github.com/BradReesWork)

URL: #3600
alexbarghi-nv authored May 24, 2023
1 parent 64690fe commit 28dc2eb
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion python/cugraph/cugraph/gnn/data_loading/bulk_sampler_io.py
@@ -43,7 +43,7 @@ def _write_samples_to_parquet(
     """

     # Required by dask; need to skip dummy partitions.
-    if partition_info is None:
+    if partition_info is None or len(results) == 0:
         return
     if partition_info != "sg" and (not isinstance(partition_info, dict)):
         raise ValueError("Invalid value of partition_info")
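
For illustration, here is a minimal, dependency-free sketch of the guard this commit adds. It is not the cuGraph implementation: `write_partition`, the row dictionaries, the `batch=<min>-<max>.csv` naming scheme, and the `{"number": i}` partition descriptors are all hypothetical stand-ins. Only the two `if` checks mirror the diff. The point is that empty partitions return before any file name is computed, so they cannot all target the same file.

```python
import os
import tempfile
from typing import Optional, Union


def write_partition(
    rows: list,
    output_dir: str,
    partition_info: Optional[Union[dict, str]] = None,
) -> Optional[str]:
    """Hypothetical per-partition writer; returns the path written, or None."""
    # Same guard as the diff: skip dummy partitions and empty partitions.
    if partition_info is None or len(rows) == 0:
        return None
    if partition_info != "sg" and (not isinstance(partition_info, dict)):
        raise ValueError("Invalid value of partition_info")

    # Assumed naming scheme: derive the file name from the batch IDs present.
    batch_ids = [row["batch_id"] for row in rows]
    path = os.path.join(output_dir, f"batch={min(batch_ids)}-{max(batch_ids)}.csv")
    with open(path, "w") as f:
        for row in rows:
            f.write(f"{row['batch_id']},{row['node']}\n")
    return path


with tempfile.TemporaryDirectory() as tmp:
    partitions = [
        [{"batch_id": 0, "node": 5}, {"batch_id": 1, "node": 7}],
        [],  # empty partition: skipped, nothing written
        [],  # second empty partition: also skipped, so no shared target file
    ]
    written = [
        write_partition(rows, tmp, {"number": i})
        for i, rows in enumerate(partitions)
    ]
    files_on_disk = sorted(os.listdir(tmp))
```

In this sketch only the first partition produces a file; both empty partitions return `None` without touching the output directory.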