Refactor k-core #2731

Merged
merged 102 commits into from
Nov 29, 2022
Changes from 89 commits
Commits
0264144
Define k-core API and tests
ChuckHastings Sep 21, 2022
cdf5f0a
fix clang-format issues
ChuckHastings Sep 21, 2022
650d017
add mechanism to create result from core_number for case where python…
ChuckHastings Sep 23, 2022
3d06e3d
address PR comments
ChuckHastings Sep 23, 2022
bed80ad
update raft import
jnke2016 Sep 24, 2022
d4e8c61
reset changes to the yml files
jnke2016 Sep 24, 2022
101f35b
fix typo
jnke2016 Sep 24, 2022
d322c20
add k-core to the cmake list
jnke2016 Sep 24, 2022
11a9965
define and implement k-core in pylibcugraph
jnke2016 Sep 24, 2022
c93feb7
define SG and MG k_core in the python API
jnke2016 Sep 24, 2022
90a1afe
Merge remote-tracking branch 'upstream/fea_k_core_api' into fea_k_cor…
jnke2016 Sep 24, 2022
ab94ba6
fix typo
jnke2016 Sep 24, 2022
21e006f
implement 'k_core' mg tests
jnke2016 Sep 24, 2022
63512ea
update gc collect
jnke2016 Sep 24, 2022
e7e69a0
remove legacy k_core
jnke2016 Sep 24, 2022
c345897
fix style
jnke2016 Sep 24, 2022
46f7508
update branch
jnke2016 Sep 24, 2022
72e547e
add fixme
jnke2016 Sep 24, 2022
2c00479
Add a bunch of missing things
ChuckHastings Sep 25, 2022
1fec14f
merge latest changes from the k_core_c_api
jnke2016 Sep 25, 2022
3891803
Merge remote-tracking branch 'upstream/branch-22.10_fix-raft-import' …
jnke2016 Sep 25, 2022
acbb67e
update the python and pylibcugraph API with respect to the C API cha…
jnke2016 Sep 25, 2022
228ee16
add FIXMEs
jnke2016 Sep 25, 2022
06cb604
fix merge conflict
jnke2016 Nov 7, 2022
e6f7605
fix merge conflicts
jnke2016 Nov 7, 2022
a1f1450
remove legacy k_core from CMakeList
jnke2016 Nov 7, 2022
38186eb
update k_core python implementation and function definition
jnke2016 Nov 7, 2022
fb2f661
remove outdated comments
jnke2016 Nov 7, 2022
71f916d
update PLC implementation of k_core
jnke2016 Nov 7, 2022
8c2ecc7
add function to create a core_result
jnke2016 Nov 7, 2022
fdcb270
fix style
jnke2016 Nov 7, 2022
29b4fa9
Merge remote-tracking branch 'upstream/branch-22.12' into fea_k_core_api
jnke2016 Nov 17, 2022
8926d3b
Merge remote-tracking branch 'upstream/implement_c++_k-core' into fea…
jnke2016 Nov 17, 2022
af5d4ce
add plc implementation of k_core
jnke2016 Nov 17, 2022
5dc3d7d
update tests
jnke2016 Nov 17, 2022
79224d4
fix typo
jnke2016 Nov 17, 2022
4b26634
rename column
jnke2016 Nov 17, 2022
23fcd26
remove flag
jnke2016 Nov 18, 2022
1043949
add FIXME
jnke2016 Nov 18, 2022
6be8b30
remove outdated comments
jnke2016 Nov 18, 2022
4c99a84
fix typo
jnke2016 Nov 18, 2022
f6935c4
update MG implementation of k_core
jnke2016 Nov 18, 2022
9318bde
update k_core MG tests
jnke2016 Nov 18, 2022
47db56b
update docstrings, support 'degree_type'
jnke2016 Nov 18, 2022
92e9923
support 'degree_type' and update docstrings
jnke2016 Nov 18, 2022
1caebd0
support 'degree_type' and update docstrings
jnke2016 Nov 18, 2022
e24289b
update docstrings
jnke2016 Nov 18, 2022
d4a70bc
update docstrings, remove outdated warning
jnke2016 Nov 18, 2022
17cd77f
fix typo
jnke2016 Nov 18, 2022
d4c352e
add FIXMEs
jnke2016 Nov 18, 2022
2f3168f
remove outdated import
jnke2016 Nov 18, 2022
29a4fe7
raise error for invalid 'degree_type'
jnke2016 Nov 18, 2022
da9275c
remove outdate warning, add tests for 'bidirectional' degree_type
jnke2016 Nov 18, 2022
0b0e469
remove outdated warning tests, tests invalid degree_type
jnke2016 Nov 18, 2022
0fdc3cd
fix style
jnke2016 Nov 18, 2022
2b08ee3
remove unused variable
jnke2016 Nov 18, 2022
0924311
fix merge conflicts
jnke2016 Nov 18, 2022
6694446
remove 'core' from the python CMakeList
jnke2016 Nov 18, 2022
aad2b23
pass 'degree_type' as input
jnke2016 Nov 18, 2022
5c04033
fix merge conflict
jnke2016 Nov 21, 2022
d7e3e8c
add tests for invalid input
jnke2016 Nov 21, 2022
eabce08
remove outdated FIXMEs, add tests for 'degree_type'
jnke2016 Nov 21, 2022
7df897e
update docstrings
jnke2016 Nov 21, 2022
1a382f8
fix typo
jnke2016 Nov 21, 2022
52c0ef6
fix typo, remove outdated fixme
jnke2016 Nov 21, 2022
c67647f
raise appropriate error
jnke2016 Nov 21, 2022
aa7f36f
reset changes to artificial weight column
jnke2016 Nov 21, 2022
25ff6d4
reset changes to artificial weight column
jnke2016 Nov 21, 2022
c052720
fix typo
jnke2016 Nov 21, 2022
fca2a38
fix typo
jnke2016 Nov 21, 2022
d57e337
fix style
jnke2016 Nov 21, 2022
5529401
fix style
jnke2016 Nov 21, 2022
51fe544
remove unused function call
jnke2016 Nov 21, 2022
95368ab
fix typo
jnke2016 Nov 21, 2022
8fac872
Merge remote-tracking branch 'upstream/implement_c++_k-core' into fea…
jnke2016 Nov 22, 2022
61f9198
raise appropriate exception when passed invalid input
jnke2016 Nov 23, 2022
844bd86
revert changes to louvain as it supports unweighted graphs
jnke2016 Nov 23, 2022
2219959
remove end of file
jnke2016 Nov 23, 2022
2439273
raise a deprecation warning if the user did not pass weights when req…
jnke2016 Nov 23, 2022
d174365
raise a deprecation warning if the user did not pass weights when req…
jnke2016 Nov 23, 2022
c0f3cd2
fix typo, raise appropriate warning
jnke2016 Nov 23, 2022
ba30f0e
fix style, remove unused import
jnke2016 Nov 23, 2022
1c82823
Merge remote-tracking branch 'upstream/branch-22.12' into fea_k_core_api
jnke2016 Nov 23, 2022
f6b558a
change comparison method
jnke2016 Nov 23, 2022
2a305d5
remove expensive check
jnke2016 Nov 23, 2022
8fa6c79
remove warning as 'subgraph_extraction' supports weights
jnke2016 Nov 23, 2022
32dfbf2
remove extra line at end of file
jnke2016 Nov 23, 2022
a642629
raise appropriate error
jnke2016 Nov 23, 2022
858981e
fix style
jnke2016 Nov 23, 2022
1ddb66a
remove warnings
jnke2016 Nov 28, 2022
bfa53a9
fix style
jnke2016 Nov 28, 2022
f11d4bf
update docstrings
jnke2016 Nov 29, 2022
7304c3f
update docstrings
jnke2016 Nov 29, 2022
bd395ed
update exception raised message
jnke2016 Nov 29, 2022
22fcd29
add comments
jnke2016 Nov 29, 2022
34f9f15
remove unused import
jnke2016 Nov 29, 2022
7963eba
update docstrings
jnke2016 Nov 29, 2022
a6981ee
remove unsued import
jnke2016 Nov 29, 2022
dbd7911
remove unsued import
jnke2016 Nov 29, 2022
d458a5f
fix style
jnke2016 Nov 29, 2022
8896390
fix typo in the docstrings
jnke2016 Nov 29, 2022
3d91d52
update docstrings
jnke2016 Nov 29, 2022
1 change: 0 additions & 1 deletion python/cugraph/CMakeLists.txt
@@ -93,7 +93,6 @@ rapids_cython_init()
 add_subdirectory(cugraph/centrality)
 add_subdirectory(cugraph/community)
 add_subdirectory(cugraph/components)
-add_subdirectory(cugraph/cores)
 add_subdirectory(cugraph/dask/comms)
 add_subdirectory(cugraph/dask/structure)
 add_subdirectory(cugraph/generators)
7 changes: 7 additions & 0 deletions python/cugraph/cugraph/community/egonet.py
@@ -95,6 +95,13 @@ def ego_graph(G, n, radius=1, center=True, undirected=None, distance=None):

     result_graph = type(G)(directed=G.is_directed())

+    if not G.edgelist.weights:
+        warning_msg = (
+            "'Ego_graph' requires the input graph to be weighted: Unweighted "
+            "graphs will not be supported in the next release."
+        )
+        warnings.warn(warning_msg, PendingDeprecationWarning)
+
     if undirected is not None:
         warning_msg = (
             "The parameter 'undirected' is deprecated and "

Review thread on the added line `if not G.edgelist.weights:`

Collaborator:
C++ egonet does not require weights. Not sure we should require that in Python.

jnke2016 (Contributor Author), Nov 23, 2022:
I get a segmentation fault when running the C API tests without weights.

rlratzel marked this conversation as resolved.
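
Note on the warning pattern in this hunk: the sketch below shows, under stated assumptions, how a `PendingDeprecationWarning` like the one added here is emitted and how a caller or test could observe it. `warn_if_unweighted` and `collect_warnings` are hypothetical helpers written for illustration; they are not part of cuGraph. Only the standard `warnings` module is used.

```python
import warnings


def warn_if_unweighted(has_weights, algo_name):
    """Hypothetical helper: emit the kind of warning this diff adds."""
    if not has_weights:
        warning_msg = (
            f"'{algo_name}' requires the input graph to be weighted: "
            "Unweighted graphs will not be supported in the next release."
        )
        warnings.warn(warning_msg, PendingDeprecationWarning)


def collect_warnings(has_weights, algo_name):
    """Run the check and return the categories of any warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        # Python's default filters ignore PendingDeprecationWarning,
        # so enable every warning inside this block.
        warnings.simplefilter("always")
        warn_if_unweighted(has_weights, algo_name)
    return [w.category for w in caught]


unweighted = collect_warnings(False, "Ego_graph")
weighted = collect_warnings(True, "Ego_graph")
```

Because Python's default warning filters ignore `PendingDeprecationWarning`, a user running with default settings would likely not see this message unless warnings are enabled, for example with `python -W always`.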
7 changes: 6 additions & 1 deletion python/cugraph/cugraph/community/leiden.py
@@ -16,6 +16,7 @@
     ensure_cugraph_obj_for_nx,
     df_score_to_dictionary,
 )
+import warnings


 def leiden(G, max_iter=100, resolution=1.0):
@@ -75,7 +76,11 @@ def leiden(G, max_iter=100, resolution=1.0):
     G, isNx = ensure_cugraph_obj_for_nx(G)

     if not G.edgelist.weights:
-        raise RuntimeError("input graph must be weighted")
+        warning_msg = (
+            "'Leiden' requires the input graph to be weighted: Unweighted "
+            "graphs will not be supported in the next release."
+        )
+        warnings.warn(warning_msg, PendingDeprecationWarning)

     if G.is_directed():
         raise ValueError("input graph must be undirected")
7 changes: 6 additions & 1 deletion python/cugraph/cugraph/community/louvain.py
@@ -19,6 +19,7 @@

 from pylibcugraph import louvain as pylibcugraph_louvain
 from pylibcugraph import ResourceHandle
+import warnings


 def louvain(G, max_iter=100, resolution=1.0):
@@ -78,7 +79,11 @@ def louvain(G, max_iter=100, resolution=1.0):
     G, isNx = ensure_cugraph_obj_for_nx(G)

     if not G.edgelist.weights:
-        raise RuntimeError("input graph must be weighted")
+        warning_msg = (
+            "'Louvain' requires the input graph to be weighted: Unweighted "
+            "graphs will not be supported in the next release."
+        )
+        warnings.warn(warning_msg, PendingDeprecationWarning)

     if G.is_directed():
         raise ValueError("input graph must be undirected")
3 changes: 0 additions & 3 deletions python/cugraph/cugraph/community/subgraph_extraction.py
@@ -58,9 +58,6 @@ def subgraph(G, vertices):

     G, isNx = ensure_cugraph_obj_for_nx(G)

-    if not G.edgelist.weights:
-        raise RuntimeError("input graph must be weighted")
-
     if G.renumbered:
         if isinstance(vertices, cudf.DataFrame):
             vertices = G.lookup_internal_vertex_id(vertices, vertices.columns)
22 changes: 0 additions & 22 deletions python/cugraph/cugraph/cores/CMakeLists.txt

This file was deleted.

22 changes: 8 additions & 14 deletions python/cugraph/cugraph/cores/core_number.py
@@ -16,12 +16,11 @@
     df_score_to_dictionary,
 )
 import cudf
-import warnings

 from pylibcugraph import core_number as pylibcugraph_core_number, ResourceHandle


-def core_number(G, degree_type=None):
+def core_number(G, degree_type="bidirectional"):
     """
     Compute the core numbers for the nodes of the graph G. A k-core of a graph
     is a maximal subgraph that contains nodes of degree k or more.
@@ -36,13 +35,12 @@ def core_number(G, degree_type=None):
         represented as directed edges in both directions. While this graph
         can contain edge weights, they don't participate in the calculation
         of the core numbers.
+        The current implementation only supports undirected graphs.

-    degree_type: str
+    degree_type: str, (default="bidirectional")
         This option determines if the core number computation should be based
         on input, output, or both directed edges, with valid values being
         "incoming", "outgoing", and "bidirectional" respectively.
-        This option is currently ignored in this release, and setting it will
-        result in a warning.

     Returns
     -------
@@ -65,19 +63,15 @@ def core_number(G, degree_type=None):

     G, isNx = ensure_cugraph_obj_for_nx(G)

-    if degree_type is not None:
-        warning_msg = "The 'degree_type' parameter is ignored in this release."
-        warnings.warn(warning_msg, Warning)
-
     if G.is_directed():
         raise ValueError("input graph must be undirected")

-    # FIXME: enable this check once 'degree_type' is supported
-    """
     if degree_type not in ["incoming", "outgoing", "bidirectional"]:
-        raise ValueError(f"'degree_type' must be either incoming, "
-                         f"outgoing or bidirectional, got: {degree_type}")
-    """
+        raise ValueError(
+            f"'degree_type' must be either incoming, "
+            f"outgoing or bidirectional, got: {degree_type}"
+        )
+
     vertex, core_number = pylibcugraph_core_number(
         resource_handle=ResourceHandle(),
         graph=G._plc_graph,
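
Note on the `degree_type` check enabled in this hunk: the snippet below is a minimal standalone sketch of the same validation, so its behavior can be tried without a GPU. `check_degree_type` and `VALID_DEGREE_TYPES` are hypothetical names used only for illustration; the diff performs the check inline.

```python
# Hypothetical constant; the diff uses an inline list with the same values.
VALID_DEGREE_TYPES = ("incoming", "outgoing", "bidirectional")


def check_degree_type(degree_type="bidirectional"):
    """Return degree_type unchanged, or raise ValueError if it is invalid."""
    if degree_type not in VALID_DEGREE_TYPES:
        raise ValueError(
            "'degree_type' must be either incoming, "
            f"outgoing or bidirectional, got: {degree_type}"
        )
    return degree_type


default_value = check_degree_type()
```

One consequence of changing the default from `None` to `"bidirectional"` is that a caller who still passes `degree_type=None` explicitly now gets a `ValueError`, since `None` is not in the list of valid values.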
28 changes: 0 additions & 28 deletions python/cugraph/cugraph/cores/k_core.pxd

This file was deleted.

59 changes: 47 additions & 12 deletions python/cugraph/cugraph/cores/k_core.py
@@ -11,20 +11,25 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-from cugraph.cores import k_core_wrapper
+import cudf
-from pylibcugraph import core_number as pylibcugraph_core_number, ResourceHandle
+
+from pylibcugraph import (
+    core_number as pylibcugraph_core_number,
+    k_core as pylibcugraph_k_core,
+    ResourceHandle,
+)

 from cugraph.utilities import (
     ensure_cugraph_obj_for_nx,
     cugraph_to_nx,
 )


-def _call_plc_core_number(G):
+def _call_plc_core_number(G, degree_type):
     vertex, core_number = pylibcugraph_core_number(
         resource_handle=ResourceHandle(),
         graph=G._plc_graph,
-        degree_type=None,
+        degree_type=degree_type,
         do_expensive_check=False,
     )

@@ -34,7 +39,7 @@ def _call_plc_core_number(G):
     return df


-def k_core(G, k=None, core_number=None):
+def k_core(G, k=None, core_number=None, degree_type="bidirectional"):
     """
     Compute the k-core of the graph G based on the out degree of its nodes. A
     k-core of a graph is a maximal subgraph that contains nodes of degree k or
@@ -48,11 +53,17 @@ def k_core(G, k=None, core_number=None):
         should contain undirected edges where undirected edges are represented
         as directed edges in both directions. While this graph can contain edge
         weights, they don't participate in the calculation of the k-core.
+        The current implementation only supports undirected graphs.

     k : int, optional (default=None)
         Order of the core. This value must not be negative. If set to None, the
         main core is returned.

+    degree_type: str, (default="bidirectional")
+        This option determines if the core number computation should be based
+        on input, output, or both directed edges, with valid values being
+        "incoming", "outgoing", and "bidirectional" respectively.
+
     core_number : cudf.DataFrame, optional (default=None)
         Precomputed core number of the nodes of the graph G containing two
         cudf.Series of size V: the vertex identifiers and the corresponding
@@ -79,34 +90,58 @@ def k_core(G, k=None, core_number=None):

     G, isNx = ensure_cugraph_obj_for_nx(G)

+    if degree_type not in ["incoming", "outgoing", "bidirectional"]:
+        raise ValueError(
+            f"'degree_type' must be either incoming, "
+            f"outgoing or bidirectional, got: {degree_type}"
+        )
+
     mytype = type(G)

     KCoreGraph = mytype()

     if G.is_directed():
         raise ValueError("G must be an undirected Graph instance")

-    if core_number is not None:
-        if G.renumbered is True:
+    if core_number is None:
+        core_number = _call_plc_core_number(G, degree_type=degree_type)
+    else:
+        if G.renumbered:
             if len(G.renumber_map.implementation.col_names) > 1:
                 cols = core_number.columns[:-1].to_list()
             else:
                 cols = "vertex"
-            core_number = G.add_internal_vertex_id(core_number, "vertex", cols)

-    else:
-        core_number = _call_plc_core_number(G)
-        core_number = core_number.rename(columns={"core_number": "values"}, copy=False)
+            core_number = G.add_internal_vertex_id(core_number, "vertex", cols)
+
+    core_number = core_number.rename(columns={"core_number": "values"})
     if k is None:
         k = core_number["values"].max()

-    k_core_df = k_core_wrapper.k_core(G, k, core_number)
+    src_vertices, dst_vertices, weights = pylibcugraph_k_core(
+        resource_handle=ResourceHandle(),
+        graph=G._plc_graph,
+        degree_type=degree_type,
+        k=k,
+        core_result=core_number,
+        do_expensive_check=False,
+    )
+
+    k_core_df = cudf.DataFrame()
+    k_core_df["src"] = src_vertices
+    k_core_df["dst"] = dst_vertices
+    k_core_df["weight"] = weights
+
     if G.renumbered:
         k_core_df, src_names = G.unrenumber(k_core_df, "src", get_column_names=True)
         k_core_df, dst_names = G.unrenumber(k_core_df, "dst", get_column_names=True)
+
+    else:
+        src_names = k_core_df.columns[0]
+        dst_names = k_core_df.columns[1]

     if G.edgelist.weights:

         KCoreGraph.from_cudf_edgelist(
             k_core_df, source=src_names, destination=dst_names, edge_attr="weight"
         )
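
Note on what `k_core` computes: the sketch below is a small pure-Python illustration of core numbers and k-core extraction on an undirected edge list, including the `k=None` default that selects the main core (the largest core number), as the docstring describes. It uses the textbook peeling approach and is an assumption-laden teaching aid, not the pylibcugraph implementation. `core_numbers` and `k_core_edges` are hypothetical names.

```python
from collections import defaultdict


def core_numbers(edges):
    """Core number of every vertex of an undirected graph, by peeling."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # self-loops are ignored in this sketch
            adj[u].add(v)
            adj[v].add(u)
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Repeatedly remove a vertex of smallest remaining degree.
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core


def k_core_edges(edges, k=None):
    """Edges whose endpoints both have core number >= k (main core if k is None)."""
    core = core_numbers(edges)
    if not core:
        return []
    if k is None:
        k = max(core.values())
    return [
        (u, v) for u, v in edges if u != v and core[u] >= k and core[v] >= k
    ]


# A triangle (0, 1, 2) with one pendant vertex (3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
main_core = k_core_edges(edges)
```

In this example the pendant vertex 3 has core number 1 and the triangle vertices have core number 2, so the main core keeps only the three triangle edges.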
59 changes: 0 additions & 59 deletions python/cugraph/cugraph/cores/k_core_wrapper.pyx

This file was deleted.

1 change: 1 addition & 0 deletions python/cugraph/cugraph/dask/__init__.py
@@ -26,6 +26,7 @@
 from .sampling.random_walks import random_walks
 from .centrality.eigenvector_centrality import eigenvector_centrality
 from .cores.core_number import core_number
+from .cores.k_core import k_core
 from .link_prediction.jaccard import jaccard
 from .link_prediction.sorensen import sorensen
 from .link_prediction.overlap import overlap
10 changes: 10 additions & 0 deletions python/cugraph/cugraph/dask/community/egonet.py
@@ -19,6 +19,7 @@
 import dask_cudf
 import cudf
 from cugraph.dask.common.input_utils import get_distributed_data
+import warnings

 from pylibcugraph import ResourceHandle, ego_graph as pylibcugraph_ego_graph

@@ -110,6 +111,15 @@ def ego_graph(input_graph, n, radius=1, center=True):
     # Initialize dask client
     client = input_graph._client

+    # FIXME: Implement a better way to check if the graph is weighted similar
+    # to 'simpleGraph'
+    if len(input_graph.edgelist.edgelist_df.columns) != 3:
+        warning_msg = (
+            "'Ego_graph' requires the input graph to be weighted: Unweighted "
+            "graphs will not be supported in the next release."
+        )
+        warnings.warn(warning_msg, PendingDeprecationWarning)
+
     if isinstance(n, (int, list)):
         n = cudf.Series(n)
     elif not isinstance(

Review thread on the added line `if len(input_graph.edgelist.edgelist_df.columns) != 3:`

Collaborator:
C++ egonet does not require weights. Not sure we should require that in Python.

jnke2016 (Contributor Author), Nov 23, 2022:
I get a segmentation fault when running the C API tests without weights.

rlratzel marked this conversation as resolved.
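
Note on the column-count check in the MG `ego_graph` hunk: the FIXME asks for a better way to detect a weighted graph. The sketch below mirrors the heuristic in plain Python to show where it could misfire. `looks_weighted` is a hypothetical helper, and the four-column layout is an invented example; whether such a layout can actually reach this code path is an assumption, not something the diff states.

```python
def looks_weighted(column_names):
    """Hypothetical helper mirroring the diff's heuristic: an edge list with
    exactly three columns (source, destination, weight) counts as weighted."""
    return len(column_names) == 3


cases = {
    "unweighted": ["src", "dst"],
    "weighted": ["src", "dst", "value"],
    # Invented layout with an extra per-edge column, for illustration only.
    "weighted_plus_extra": ["src", "dst", "value", "edge_id"],
}
results = {name: looks_weighted(cols) for name, cols in cases.items()}
```

Under this heuristic, an edge list carrying weights plus any additional column would be reported as unweighted, which is one reason an explicit weight flag like the single-GPU `G.edgelist.weights` check may be preferable.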