Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect results? #10

Closed
ben-eysenbach opened this issue Apr 4, 2018 · 6 comments
Closed

Incorrect results? #10

ben-eysenbach opened this issue Apr 4, 2018 · 6 comments

Comments

@ben-eysenbach
Copy link
Contributor

It appears that this library sometimes returns incorrect results. The example below shows two failure cases. For both problem instances, this library incorrectly assigns both rows to the same column. For reference, I have also included the (presumably correct) results returned by the scipy implementation.

import scipy.optimize
import lap
import numpy as np

C1 = np.array([[8., 9.],
               [1., 4.]])
C2 = np.array([[5., 1.],
               [8., 9.]])

for C in [C1, C2]:
    print('\n%s' % C)
    print('LAPJV result: %s, %s' % lap.lapjv(C)[1:])
    print('Scipy result: %s, %s' % scipy.optimize.linear_sum_assignment(C))

Result

[[8. 9.]
 [1. 4.]]
LAPJV result: [1 0], [1 0]
Scipy result: [0 1], [1 0]

[[5. 1.]
 [8. 9.]]
LAPJV result: [1 0], [1 0]
Scipy result: [0 1], [1 0]

Please let me know if I'm using the library incorrectly, or if there are certain types of inputs to avoid.

@dribnet
Copy link
Contributor

dribnet commented Apr 4, 2018

@gatagat - fyi, there is also related discussion in src-d/lapjv#3.

@gatagat
Copy link
Owner

gatagat commented Apr 5, 2018

@ben-eysenbach I think you are not interpreting the results correctly. The returned values are different because of the different representation of assignments.

SciPy's row, col = linear_sum_assignment(cost) should be interpreted as row[i] <-> col[i].

lap's x, y = lapjv(cost) should be interpreted as j <-> x[j] (or y[i] <-> i)

If you have an idea how to make this clearer in the docs, let me know. Now it reads:

  x: vector of columns assigned to rows
  y: vector of rows assigned to columns

@ben-eysenbach
Copy link
Contributor Author

@gatagat - Thanks for the clarification!

Here's one alternative way this could be explained in the function docstring:

Args:
    C (array): an (N x M) array specifying assignment costs.
        Entry C[i, j] is the cost of assigning row i to column j.
Outputs:
    x (array): a size-N array specifying to which column is row is assigned.
    y (array): a size-M array specifying to which row each column is assigned.

Here's some discussion that might be helpful in the documentation/readme:

The function lapjv(C) returns two arrays, x, y. If cost matrix C has shape N x M, then x is a size-N array specifying to which column is row is assigned, and y is a size-M array specifying to which row each column is assigned. For example, an output of x = [1, 0] indicates that row 0 is assigned to column 1 and row 1 is assigned to column 0. Similarly, an output of x = [2, 1, 0] indicates that row 0 is assigned to column 2, row 1 is assigned to column 1, and row 2 is assigned to column 0.

Note that this function does not return the assignment matrix (as done by scipy's linear_sum_assignment and lapsolver's solve dense). The assignment matrix can be constructed from x as follows:

A = np.zeros((N, M))
for i in range(N):
    A[i, x[i]] = 1

Equivalently, we could construct the assignment matrix from y:

A = np.zeros((N, M))
for j in range(M):
    A[y[j], j] = 1

Finally, note that the outputs are redundant: we can construct x from y, and vise versa:

x = [np.where(y == i)[0][0] for i in range(N)]
y = [np.where(x == j)[0][0] for j in range(M)]

@gatagat
Copy link
Owner

gatagat commented Apr 9, 2018

@ben-eysenbach Cool, can you put this into a PR?

@ben-eysenbach
Copy link
Contributor Author

ben-eysenbach commented Apr 10, 2018

Sure!
#11

@gatagat
Copy link
Owner

gatagat commented Apr 10, 2018

Thanks!

@gatagat gatagat closed this as completed Apr 10, 2018
gatagat pushed a commit that referenced this issue Nov 29, 2024
chore: strict upper bound for numpy 2.0
gatagat pushed a commit that referenced this issue Nov 29, 2024
This reverts commit 16f4833, reversing
changes made to 0d49997.
gatagat pushed a commit that referenced this issue Nov 29, 2024
This reverts commit 16f4833, reversing
changes made to 0d49997.
gatagat pushed a commit that referenced this issue Nov 29, 2024
This reverts commit 16f4833, reversing
changes made to 0d49997.
gatagat pushed a commit that referenced this issue Nov 29, 2024
This reverts commit 16f4833, reversing
changes made to 0d49997.
gatagat pushed a commit that referenced this issue Nov 29, 2024
This reverts commit 16f4833, reversing
changes made to 0d49997.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants