Bug in `construct_W.py` #58

mary-design-testing · 2020-04-12T10:12:51Z

Hi there,

I'm trying to construct the W weight matrix to work with lap_score on the following simple dataset: employes-region.txt. I've tried the following code, which is provided as an example in file test_lap_score.py:

    kwargs_W = {"metric": "euclidean", "neighbor_mode": "knn", "weight_mode": "heat_kernel", "k": 5, 't': 1}
    W = construct_W.construct_W(X, **kwargs_W)

Unfortunately, it fails with the following exception at line 152 of file construct_W.py:

    could not broadcast input array from shape (25) into shape (30)

I've gone through the code, and I think the problem is that the dimensions of `G` are wrong. This is the piece of code involved in the exception:

            t = kwargs['t']
            # compute pairwise euclidean distances
            D = pairwise_distances(X)
            D **= 2
            # sort the distance matrix D in ascending order
            dump = np.sort(D, axis=1)
            idx = np.argsort(D, axis=1)  #  *** 1
            idx_new = idx[:, 0:k+1]  #  *** 2
            dump_new = dump[:, 0:k+1] #  *** 2
            # compute the pairwise heat kernel distances
            dump_heat_kernel = np.exp(-dump_new/(2*t*t))
            G = np.zeros((n_samples*(k+1), 3)) #  *** 2
            G[:, 0] = np.tile(np.arange(n_samples), (k+1, 1)).reshape(-1) #  *** 2
            G[:, 1] = np.ravel(idx_new, order='F') # *** EXCEPTION HERE!!
            G[:, 2] = np.ravel(dump_heat_kernel, order='F')
            # build the sparse affinity matrix W
            W = csc_matrix((G[:, 2], (G[:, 0], G[:, 1])), shape=(n_samples, n_samples))
            bigger = np.transpose(W) > W
            W = W - W.multiply(bigger) + np.transpose(W).multiply(bigger)
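The shapes in the message are consistent with a dataset of 5 samples: with k = 5, G gets n_samples*(k+1) = 30 rows, while idx has only 5 columns, so idx[:, 0:k+1] holds 25 values. Below is a minimal sketch that reproduces the mismatch with NumPy only. The 5-sample random X is an assumption standing in for employes-region.txt, and the broadcasting expression replaces pairwise_distances just to keep the snippet self-contained.

```python
import numpy as np

# Assumption: 5 random samples stand in for the real dataset.
rng = np.random.default_rng(0)
X = rng.random((5, 3))
n_samples, k = X.shape[0], 5

# Squared euclidean distances (stand-in for pairwise_distances(X) ** 2).
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
idx = np.argsort(D, axis=1)

# Slicing past the last column truncates silently: only 5 columns exist.
idx_new = idx[:, 0:k + 1]
G = np.zeros((n_samples * (k + 1), 3))

print(idx_new.shape)  # (5, 5)
print(G.shape)        # (30, 3)

try:
    # 25 values cannot fill a column of 30 rows.
    G[:, 1] = np.ravel(idx_new, order='F')
    error_message = None
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The exact wording of the ValueError may differ between NumPy versions.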

I think that there's a problem at line *** 1. Should it compute `idx` using `dump`? I mean:

            idx = np.argsort(dump, axis=1)  #  *** 1
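To compare the two variants of line *** 1, here is a small check on a made-up 3x3 matrix of squared distances (the values are purely illustrative). Since every row of dump is already sorted, argsort on dump returns the column positions 0, 1, 2 for each row, whereas argsort on D returns the indices of the neighboring samples. Which of the two ends up in G[:, 1] therefore changes which entries of W get filled.

```python
import numpy as np

# Made-up squared distances between 3 samples (zero diagonal).
D = np.array([[0.0, 4.0, 1.0],
              [4.0, 0.0, 9.0],
              [1.0, 9.0, 0.0]])

dump = np.sort(D, axis=1)

# Variant in the current code: sample indices ordered by distance.
idx_from_D = np.argsort(D, axis=1)
# Variant proposed above: positions within the already sorted rows.
idx_from_dump = np.argsort(dump, axis=1)

print(idx_from_D)
# [[0 2 1]
#  [1 0 2]
#  [2 0 1]]
print(idx_from_dump)
# [[0 1 2]
#  [0 1 2]
#  [0 1 2]]
```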

The other problem is at the lines marked *** 2. Shouldn't they use `k` instead of `k+1`, both as the slice bound and as the multiplier? That is:

            idx_new = idx[:, 0:k]  #  *** 2
            dump_new = dump[:, 0:k] #  *** 2
            # compute the pairwise heat kernel distances
            dump_heat_kernel = np.exp(-dump_new/(2*t*t))
            G = np.zeros((n_samples*(k), 3)) #  *** 2
            G[:, 0] = np.tile(np.arange(n_samples), (k, 1)).reshape(-1) #  *** 2
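As a quick shape check of the k-based lines, here is a sketch that runs them on 5 made-up samples with k = 5 and t = 1. The random X and the NumPy-only distance computation are assumptions for illustration and are not part of construct_W.py. The last line also prints whether the first column of idx is each sample itself (its distance to itself is 0), in which case the slice 0:k holds the sample plus its k-1 nearest neighbors.

```python
import numpy as np

# Assumption: 5 random, distinct samples stand in for the real dataset.
rng = np.random.default_rng(0)
X = rng.random((5, 3))
n_samples, k, t = X.shape[0], 5, 1

# Squared euclidean distances (stand-in for pairwise_distances(X) ** 2).
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
dump = np.sort(D, axis=1)
idx = np.argsort(D, axis=1)

# Patched lines: k instead of k+1.
idx_new = idx[:, 0:k]
dump_new = dump[:, 0:k]
dump_heat_kernel = np.exp(-dump_new / (2 * t * t))

G = np.zeros((n_samples * k, 3))
G[:, 0] = np.tile(np.arange(n_samples), (k, 1)).reshape(-1)
G[:, 1] = np.ravel(idx_new, order='F')
G[:, 2] = np.ravel(dump_heat_kernel, order='F')

print(G.shape)  # (25, 3)
# Is column 0 of idx the sample itself?
self_first = np.array_equal(idx[:, 0], np.arange(n_samples))
print(self_first)  # True
```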

I've fixed my local installation using this patch and run it on a large collection of 200+ datasets. It works correctly now.

I've seen that there are many other lines where a similar patch might apply, but I haven't tried the other configuration options.

Thanks! Regards
