Dawid-Sebastiani Score #75
base: main
Conversation
I'd be curious to see the performance benefit here compared to plain numpy. Usually numba doesn't optimize much when using a lot of numpy operations.
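Something like this could be used to measure it (a rough sketch of mine, not code from this PR; the loop-based numba variant just illustrates the explicit-loop style numba tends to optimize well):

```python
import timeit

import numpy as np
from numba import njit


def cov_numpy(x):
    # plain numpy: batched covariance via matrix multiplication
    centered = x - x.mean(axis=-2, keepdims=True)
    return (centered.swapaxes(-1, -2) @ centered) / (x.shape[-2] - 1)


@njit
def cov_numba(x):
    # explicit loops instead of vectorized numpy calls
    b, n, d = x.shape
    out = np.empty((b, d, d))
    for i in range(b):
        centered = x[i] - x[i].sum(axis=0) / n
        # numba's 2-D matmul wants contiguous inputs
        ct = np.ascontiguousarray(centered.T)
        out[i] = ct @ centered / (n - 1)
    return out


x = np.random.randn(200, 50, 10)
cov_numba(x)  # warm-up call so JIT compilation isn't timed

print("numpy:", timeit.timeit(lambda: cov_numpy(x), number=100))
print("numba:", timeit.timeit(lambda: cov_numba(x), number=100))
```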
Yes, and the DSS is also far less computationally intensive than e.g. ES or VS, so even with a good relative speed improvement the user might not notice it.
I suppose this would look more or less the same for the other backends?
Pretty much, I guess. I don't have much experience with tf/torch though, so I'm not sure yet about the exact implementation.
The det and inv functions should be easy/similar to define in the other backends.
For cov:
- jax should be almost exactly the same as numpy.
- torch would need "transpose" replaced with "permute".
- tensorflow also uses "transpose" but does not have a "cov" function, so this would need to be adapted (see the sketch below).
Although, as mentioned in a separate comment, the transpose function currently assumes we only have 3 dimensions.
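For illustration, an untested sketch of what the torch and tensorflow variants might look like (function names are placeholders, not the PR's API; torch's two-axis transpose avoids spelling out the full permutation):

```python
import torch
import tensorflow as tf


def _cov_torch(x: torch.Tensor) -> torch.Tensor:
    # x: (..., n_samples, n_vars); covariance over the samples axis
    centered = x - x.mean(dim=-2, keepdim=True)
    return (centered.transpose(-1, -2) @ centered) / (x.shape[-2] - 1)


def _cov_tf(x: tf.Tensor) -> tf.Tensor:
    # tensorflow has no cov function, so build it explicitly
    centered = x - tf.reduce_mean(x, axis=-2, keepdims=True)
    n = tf.cast(tf.shape(x)[-2], x.dtype)
    return tf.matmul(centered, centered, transpose_a=True) / (n - 1.0)
```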
```python
        return np.cov(x)
    else:
        centered = x - x.mean(axis=-2, keepdims=True)
        return (centered.transpose(0, 2, 1) @ centered) / (x.shape[-2] - 1)
```
centered.transpose(0, 2, 1) assumes the array is 3D. Is there a way to generalise this to calculate the covariance matrix for higher dimensions?
Merry Christmas :)
If we assume that we always have the square matrix in the last two dimensions (like numpy does for the batched version of e.g. np.linalg.inv()), something like centered.swapaxes(-1, -2) should do the trick. Is that convention fine for you @sallen12?
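i.e. something like this (a sketch only; _batched_cov is an illustrative name):

```python
import numpy as np


def _batched_cov(x: np.ndarray) -> np.ndarray:
    # x: (..., n_samples, n_vars) -> (..., n_vars, n_vars)
    centered = x - x.mean(axis=-2, keepdims=True)
    return (centered.swapaxes(-1, -2) @ centered) / (x.shape[-2] - 1)


# works for any number of leading batch dimensions:
x = np.random.randn(4, 3, 100, 5)
assert _batched_cov(x).shape == (4, 3, 5, 5)
```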
Hi @frazane, as discussed, please have a look.
Compared to #32, I've calculated the batched covariance here as a matrix multiplication, which I feel is the cleaner way, though it's a bit trickier to figure out the correct axes to multiply along.
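For reference, a quick sanity check (not part of the PR) that the matmul formulation matches np.cov applied slice by slice:

```python
import numpy as np

x = np.random.randn(8, 100, 5)  # (batch, samples, variables)

centered = x - x.mean(axis=-2, keepdims=True)
batched = centered.swapaxes(-1, -2) @ centered / (x.shape[-2] - 1)

# np.cov treats rows as variables by default, hence rowvar=False here
looped = np.stack([np.cov(xi, rowvar=False) for xi in x])

np.testing.assert_allclose(batched, looped)
```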
To-Do: