Error arised when running the program #32

Open
jingcaiguo opened this issue Mar 17, 2020 · 7 comments

Comments

@jingcaiguo

python: src/lib/msg_pass.cpp:20: void n2n_construct(GraphStruct*, long long int*, Dtype*): Assertion `nnz == (int)graph->num_edges' failed.

Could you help me check this problem?

@muhanzhang
Owner

This seems to be related to graph construction. Which data are you using? Can you check whether your graph is in the correct format? For example, check that each node line starts with t and m, followed by m neighbor indices, and that the number of neighbor indices on the line is exactly the m you specified.
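
A minimal sketch of that per-line check, assuming a node line contains only whitespace-separated tokens `t m n1 ... nm` with nothing after the neighbor indices. The function name `check_node_line` is hypothetical and not part of this repository; if your files append extra per-node values, the comparison would need adjusting.

```python
def check_node_line(line):
    """Return (ok, message) for one node line of the assumed form: t m n1 ... nm."""
    tokens = line.split()
    if len(tokens) < 2:
        return False, "expected at least a tag t and a neighbor count m"
    try:
        m = int(tokens[1])
    except ValueError:
        return False, "neighbor count m is not an integer"
    neighbors = tokens[2:]  # assumption: everything after m is a neighbor index
    if len(neighbors) != m:
        return False, f"declared m={m} but found {len(neighbors)} neighbor indices"
    return True, "ok"


print(check_node_line("0 3 1 2 3"))  # (True, 'ok')
print(check_node_line("0 3 1 2"))    # (False, 'declared m=3 but found 2 neighbor indices')
```

Running this over every node line of a data file should point at the first line whose declared m disagrees with what is actually listed.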

@mmpust

mmpust commented Dec 8, 2022

I am getting the same error message. Could you elaborate on what nnz and (int)graph->num_edges in the error message refer to? What requirement is the n2n_construct() function testing here?

I am able to generate my test/training graphs of type <class '__main__.GNNGraph'> with the following attributes:

print(len(train_graphs[0].node_features))  # 254, [[1.14e+02 1.60e+00]
                                           #       [9.33e+02 3.35e+00]
                                           #       [4.74e+02 8.00e-01]
                                           #       [6.26e+02 3.52e+00] ...]
print(len(train_graphs[0].degs))           # 254, [4, 6, 8, 6, 1, 7, ...]
print(train_graphs[0].num_nodes)           # 254
print(train_graphs[0].label)               # 0
print(len(train_graphs[0].edge_pairs))     # 2172, [114 933 ... 381 794]
print(train_graphs[0].edge_features)       # None
print(train_graphs[0].num_edges)           # 1086

The error message is as above:

python: src/lib/msg_pass.cpp:20: void n2n_construct(GraphStruct*, long long int*, Dtype*): Assertion `nnz == (int)graph->num_edges' failed.
Aborted (core dumped)

Thanks for your support!

@mmpust

mmpust commented Dec 9, 2022

So, in my case I can narrow down the error message to the following part in gnn_lib.py.

    def PrepareSparseMatrices(self, graph_list, is_directed=0):
        assert not is_directed
        total_num_nodes, total_num_edges = self._prepare_graph(graph_list, is_directed)

        n2n_idxes = torch.LongTensor(2, total_num_edges * 2)
        n2n_vals = torch.FloatTensor(total_num_edges * 2)

        e2n_idxes = torch.LongTensor(2, total_num_edges * 2)
        e2n_vals = torch.FloatTensor(total_num_edges * 2)

        subg_idxes = torch.LongTensor(2, total_num_nodes)
        subg_vals = torch.FloatTensor(total_num_nodes)

        idx_list = (ctypes.c_void_p * 3)()
        idx_list[0] = n2n_idxes.numpy().ctypes.data
        idx_list[1] = e2n_idxes.numpy().ctypes.data
        idx_list[2] = subg_idxes.numpy().ctypes.data

        val_list = (ctypes.c_void_p * 3)()
        val_list[0] = n2n_vals.numpy().ctypes.data
        val_list[1] = e2n_vals.numpy().ctypes.data
        val_list[2] = subg_vals.numpy().ctypes.data

        ##########################
        print(total_num_nodes)  # 960
        print(total_num_edges)  # 5317
        print(len(n2n_idxes))   # 2
        print(len(n2n_vals))    # 10634

        print(len(e2n_idxes))   # 2
        print(len(e2n_vals))    # 10634
        print(len(subg_idxes))  # 2
        print(len(subg_vals))   # 960
        print(len(graph_list))  # 10
        ##########################

        print('Prepare Sparse Matrix start')
        self.lib.PrepareSparseMatrices(self.batch_graph_handle,
                                ctypes.cast(idx_list, ctypes.c_void_p),
                                ctypes.cast(val_list, ctypes.c_void_p))
        print('Prepare Sparse Matrix end')

With the output:

Initiate Sparse Matrix
960
5317
2
10634
2
10634
2
960
10
Prepare Sparse Matrix start
python: src/lib/msg_pass.cpp:20: void n2n_construct(GraphStruct*, long long int*, Dtype*): Assertion `nnz == (int)graph->num_edges' failed.
Aborted (core dumped)

Any suggestions as to what is going on here?
Thanks!

@muhanzhang
Owner

Hi! Sorry for the late reply. nnz is the number of nonzeros in the constructed sparse adjacency matrix, and (int)graph->num_edges is the number of edges in the graph object you constructed in Python. This function converts the Python graph into a C++ sparse matrix, and the assert is a sanity check that the two numbers match.

My suggestion is to check whether your input graph defines duplicated edges or contains self-loops. Sometimes nx.Graph() handles them differently from the C++ library here. Please remove duplicated edges and self-loops from your data and try again.
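
To illustrate why duplicated edges and self-loops could make the two numbers disagree, here is a small pure-Python sketch. It is not the C++ library's counting logic, which is not shown in this thread; `count_nonzeros` and `clean_undirected_edges` are hypothetical helpers, and the sketch assumes each undirected edge is expected to contribute two nonzeros to a symmetric adjacency matrix.

```python
def clean_undirected_edges(edges):
    """Drop self-loops and duplicate undirected edges, keeping first-seen order."""
    seen = set()
    cleaned = []
    for u, v in edges:
        if u == v:
            continue  # self-loop
        key = (u, v) if u <= v else (v, u)  # (u, v) and (v, u) are the same edge
        if key in seen:
            continue  # duplicate
        seen.add(key)
        cleaned.append((u, v))
    return cleaned


def count_nonzeros(edges):
    """Count distinct nonzero (row, col) positions of a symmetric adjacency matrix."""
    positions = set()
    for u, v in edges:
        positions.add((u, v))
        positions.add((v, u))
    return len(positions)


edges = [(0, 1), (1, 0), (1, 2), (2, 2)]  # one duplicate, one self-loop
print(len(edges) * 2, count_nonzeros(edges))      # 8 5
cleaned = clean_undirected_edges(edges)
print(len(cleaned) * 2, count_nonzeros(cleaned))  # 4 4
```

With the raw list, the expected count (twice the number of edge entries) and the number of distinct nonzeros differ; after cleaning they agree.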

@mmpust

mmpust commented Dec 20, 2022

Hey Muhan, good thinking, thanks, and I really appreciate the explanation! Unfortunately, removing duplicated edges and self-loops did not solve the problem. I can get the pipeline running without problems if I convert my networkx graphs into the Matlab text file and then import the text file again. That seems quite inefficient for my task, though: I already have networkx graphs, and your code also just converts the Matlab text file into networkx graph objects before running them through the GNNGraph function.

Here are the graph characteristics I print when I run my networkx graph through your GNNGraph function directly, without the Matlab text file conversion:

import numpy as np


class GNNGraph(object):
    def __init__(self, g, node_feat):
        self.node_tags = list(g.nodes)
        self.num_nodes = len(self.node_tags)
        self.label = g.y
        self.node_features = node_feat
        self.degs = list(dict(g.degree).values())
        self.edge_features = None

        if len(g.edges()) != 0:
            x, y = zip(*g.edges())
            self.num_edges = len(x)
            self.edge_pairs = np.ndarray(shape=(self.num_edges, 2), dtype=np.int32)
            self.edge_pairs[:, 0] = x
            self.edge_pairs[:, 1] = y
            self.edge_pairs = self.edge_pairs.flatten()
        else:
            self.num_edges = 0
            self.edge_pairs = np.array([])

        print('New graph')
        print('Number of nodes: {0}'.format(self.num_nodes))
        print('Number of node tags: {0}'.format(len(self.node_tags)))
        print('Node tags: {0}'.format(self.node_tags))
        print('Node tags type: {0}'.format(type(self.node_tags)))
        print('Node feature length: {0}'.format(len(self.node_features)))
        print('Node feature type: {0}'.format(type(self.node_features)))
        print('Node features: {0}'.format(self.node_features))
        print('Number of edges: {0}'.format(self.num_edges))
        print('Length edge pairs: {0}'.format(len(self.edge_pairs)))
        print('Type of edge pairs: {0}'.format(type(self.edge_pairs)))
        print('Edge pairs: {0}'.format(self.edge_pairs))
        print('Label: {0}'.format(self.label))

The output is

New graph
Number of nodes: 81
Number of node tags: 81
Node tags: [79, 47, 52, 26, 73, 77, 81, 74, 2, 5, 70, 24, 63, 1, 18, 39, 48, 72, 71, 61, 31, 40, 46, 42, 29, 58, 69, 19, 78, 27, 6, 11, 45, 32, 12, 60, 14, 57, 25, 66, 3, 49, 7, 59, 53, 16, 67, 21, 50, 55, 54, 44, 13, 4, 36, 43, 35, 10, 22, 75, 38, 23, 20, 37, 15, 65, 33, 51, 68, 8, 41, 76, 80, 28, 34, 62, 64, 30, 17, 56, 9]
Node tags type: <class 'list'>
Node feature length: 81
Node feature type: <class 'numpy.ndarray'>
Node features: [[1.26860e+01]
 [1.22515e+01]
 [1.25422e+01]
 [1.79270e+00]
 [1.17031e+01]
 [1.10693e+01]
 [4.40550e+00]
 [3.46920e+00]
 [3.66020e+00]
 [5.21040e+00]
 [2.03350e+00]
 [1.22147e+01]
 [9.26360e+00]
 [2.36480e+00]
 [1.93570e+00]
 [9.59690e+00]
 [4.97320e+00]
 [4.40630e+00]
 [1.92700e+00]
 [6.50100e-01]
 [2.47380e+00]
 [2.20550e+00]
 [1.49770e+00]
 [2.31440e+00]
 [4.71200e+00]
 [6.14120e+00]
 [5.54830e+00]
 [8.61950e+00]
 [4.68700e+00]
 [8.90920e+00]
 [8.98640e+00]
 [9.44690e+00]
 [3.28840e+00]
 [3.29900e+00]
 [1.00760e+01]
 [1.50000e-03]
 [6.10610e+00]
 [6.93300e-01]
 [4.26350e+00]
 [5.66050e+00]
 [5.05700e+00]
 [9.93870e+00]
 [5.32800e-01]
 [7.72990e+00]
 [8.56790e+00]
 [7.73990e+00]
 [1.04963e+01]
 [6.41460e+00]
 [1.04960e+01]
 [3.33070e+00]
 [1.15685e+01]
 [1.14723e+01]
 [1.13877e+01]
 [6.97700e-01]
 [1.13092e+01]
 [2.45940e+00]
 [8.22590e+00]
 [8.36690e+00]
 [3.82700e-01]
 [1.05975e+01]
 [3.12800e+00]
 [6.11730e+00]
 [4.79870e+00]
 [5.59120e+00]
 [8.12950e+00]
 [6.39330e+00]
 [4.42680e+00]
 [9.30840e+00]
 [7.43170e+00]
 [4.50030e+00]
 [2.49620e+00]
 [9.21130e+00]
 [7.30900e+00]
 [2.71750e+00]
 [1.11623e+01]
 [8.10230e+00]
 [6.44780e+00]
 [1.17065e+01]
 [1.52510e+00]
 [2.94680e+00]
 [6.02380e+00]]
Number of edges: 1645
Length edge pairs: 3290
Type of edge pairs: <class 'numpy.ndarray'>
Edge pairs: [79 47 79 ...  9 17  9]
Label: 1

Is it a problem that the node tags, and hence the edge pairs, are not sequential starting from 1? My graphs have string node labels, so I build a dictionary that stores the strings and replaces each string with an integer. Some nodes occur only in some graphs and are absent from others, which is why those node tags are missing.
Thanks again for your support!
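
A minimal sketch of remapping arbitrary node tags to contiguous indices 0..n-1 within a single graph, in case that is what the downstream code expects. Whether the library actually requires this is not established in this thread, and `relabel_contiguous` is a hypothetical helper, not part of the repository.

```python
def relabel_contiguous(node_tags, edge_pairs):
    """Map arbitrary, unique node tags to 0..n-1 and rewrite the edges accordingly."""
    index_of = {tag: i for i, tag in enumerate(node_tags)}
    if len(index_of) != len(node_tags):
        raise ValueError("node tags are not unique")
    # A KeyError here means an edge refers to a node that is not in node_tags.
    new_edges = [(index_of[u], index_of[v]) for u, v in edge_pairs]
    return index_of, new_edges


index_of, new_edges = relabel_contiguous([79, 47, 52], [(79, 47), (79, 52)])
print(index_of)   # {79: 0, 47: 1, 52: 2}
print(new_edges)  # [(0, 1), (0, 2)]
```

For graphs that are already networkx objects, `nx.convert_node_labels_to_integers(g)` offers a similar relabeling on the graph itself.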

@mmpust

mmpust commented Dec 20, 2022

The non-sequential labelling is not causing the problem: the error persists even when I reorder the graphs so that they look like this after passing through the GNNGraph function:

New graph
Number of nodes: 81
Number of node tags: 81
Node tags: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80]
Node tags type: <class 'list'>
Node feature length: 81
Node feature type: <class 'numpy.ndarray'>
Node features: [[ 1.7591]
 [ 8.2386]
 [ 2.5039]
 [ 1.5928] and so on]
Number of edges: 1645
Length edge pairs: 3290
Type of edge pairs: <class 'numpy.ndarray'>
Edge pairs: [ 0  1  0 ... 63 80 64]
Label: 1

If I remove the assert statements to track the real error message afterwards, I get:

n2n_sp = torch.sparse.FloatTensor(n2n_idxes, n2n_vals, torch.Size([total_num_nodes, total_num_nodes]))
RuntimeError: size is inconsistent with indices: for dim 0, size is 49781 but found index 94661596694512

I found the following thread but I am working with the most recent PyTorch version:
https://discuss.pytorch.org/t/runtimeerror-sizes-is-inconsistent-with-indices-pytorch-0-4-1/94181
Many thanks for your help, Muhan!
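
The RuntimeError above says that an index stored in n2n_idxes is larger than the matrix dimension. One possible explanation, an assumption not confirmed in this thread, is that part of the index buffer was never written: torch.LongTensor(2, n) allocates memory without initializing it, so unfilled slots can hold arbitrary values. A small pure-Python sketch for locating such entries before building the sparse tensor (`find_out_of_range` is a hypothetical helper):

```python
def find_out_of_range(indices, size):
    """Return (position, value) pairs for indices outside the valid range [0, size)."""
    return [(pos, idx) for pos, idx in enumerate(indices) if not 0 <= idx < size]


bad = find_out_of_range([0, 3, 94661596694512, -1], 49781)
print(bad)  # [(2, 94661596694512), (3, -1)]
```

Applied to each row of the index tensor (for example via `n2n_idxes[0].tolist()`), the positions of the bad entries may hint at which graph in the batch was not filled in.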

@muhanzhang
Owner

If the txt way works, can you check what differs between a networkx graph built directly and a networkx graph loaded from the txt format? For example, pick the same graph and compare node tags, feature lengths, number types, etc. My guess is that some subtle discrepancy between the two ways is causing this.
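
One way to run that comparison is to reduce each version of the graph to a small summary and print only the fields that differ. This is a sketch under the assumption that node tags and edge pairs can be extracted as plain Python lists (for example `list(g.nodes)` and `list(g.edges())`); `summarize_graph` and `diff_summaries` are hypothetical names.

```python
def summarize_graph(node_tags, edge_pairs):
    """Collect properties that might differ between two constructions of one graph."""
    tags = list(node_tags)
    tag_set = set(tags)
    undirected = [tuple(sorted(e)) for e in edge_pairs]
    return {
        "num_nodes": len(tags),
        "num_edges": len(undirected),
        "tag_types": sorted({type(t).__name__ for t in tags}),
        "tags_are_0_to_n_minus_1": tag_set == set(range(len(tags))),
        "tags_in_sorted_order": tags == sorted(tags),
        "self_loops": sum(1 for u, v in undirected if u == v),
        "duplicate_edges": len(undirected) - len(set(undirected)),
        "dangling_endpoints": sum(1 for e in undirected for x in e if x not in tag_set),
    }


def diff_summaries(a, b):
    """Return only the fields whose values differ, as (a_value, b_value) pairs."""
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


direct = summarize_graph([79, 47, 52], [(79, 47), (79, 52)])
from_txt = summarize_graph([0, 1, 2], [(0, 1), (0, 2)])
print(diff_summaries(direct, from_txt))
```

In this toy example only the two tag-ordering fields differ; on real data, a difference in `tag_types` (such as a NumPy integer type versus `int`) or in node order would be worth a closer look.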
