Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[r] add row/col max stat functions #103

Merged
merged 5 commits into from
Jul 9, 2024
Merged

[r] add row/col max stat functions #103

merged 5 commits into from
Jul 9, 2024

Conversation

immanuelazn
Copy link
Collaborator

@immanuelazn immanuelazn commented Jul 5, 2024

Description

Added row max and col max standalone rudimentary functions for iterable matrices. Can handle transposed matrices.

Tests

Tested against passing the non-iterable matrix version through MatrixStats and asserting equality

  • matrices with no zeroes
  • matrices with some, or all zeroes
  • transposed matrices

Thoughts

Would it make more sense to make this a method within IterableMatrix rather than a standalone function? I see some functions in Matrix.R that aren't methods for their specific matrix class, but instead assert typing. Not sure if that is the correct route for me to take though.

@immanuelazn immanuelazn requested a review from bnprks July 5, 2024 19:37
r/src/matrix_utils.cpp Outdated Show resolved Hide resolved
r/R/matrix.R Outdated
#' Get the max of each row in an iterable matrix
#' @param x IterableMatrix object/dgCMatrix object
#' @export
row_max <- function(x) {
Copy link
Owner

@bnprks bnprks Jul 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For naming clarity, I think we'll want to do a similar approach as rowVars()/colVars() below, namely:

  1. Define a rowMaxs generic function with the same arguments as matrixStats::rowMaxs
  2. Define rowMaxs.default to call out to matrixStats or matrixGenerics if present
  3. Put our actual implementaiton in rowMaxs.IterableMatrix
  4. Call rlang::on_load to register the generic at load time if the other libraries are present

This will make it so our users can use rowMaxs() mostly without worry. (If they call matrixStats::rowMaxs() directly it won't work for BPCells objects, but any other copy of rowMaxs they call will support the full range of matrix types)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes interfacing with BPCells a lot more elegant, however for the future I wonder if doing this for each function will end up creating a lot of bloat. Maybe we could create a function to automatically perform this strategy, given the default libraries with the same function names?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good idea! I'm not quite sure about how to programmatically handle defining functions with a given set of arguments, but it's probably something we could figure out with the Advanced R book.

r/R/matrix.R Outdated Show resolved Hide resolved
r/NEWS.md Outdated Show resolved Hide resolved
r/R/matrix.R Outdated Show resolved Hide resolved
r/NEWS.md Outdated Show resolved Hide resolved
r/R/matrix.R Outdated Show resolved Hide resolved
bnprks added 2 commits July 9, 2024 12:14
- Moved code to sit next to the colVars functions
- Added an rlang::on_load wrap and updated the docstring for each function
  - Also deleted the header line
  `Get the max of each row in an iterable matrix`
  as that rendered badlly in the docs
- Ran `devtools::document()` to re-generate the docs files
(these live in the `man` folder and power the built-in R docs system
e.g. running `?BPCells::rowMaxs` in R terminal)
  - This also updates the `NAMESPACE` file so the functions will be
  exported properly
- Reduce to just the comprehensive test
  - Switch from exect_equal to expect_identical (more stringent since we
   don't have any precision issues to worry about)
  - Make the sparse matrix test include both positive + negative, and
  expand size so we still get good coverage with the new possibilities
  - Add a 4th transpose case to cover all possibilities of
  transpose before (yes/no) and transpose after (yes/no)
- Fix a typo in NEWS and add a shoutout for your first contribution!
@immanuelazn immanuelazn merged commit 383690a into main Jul 9, 2024
@bnprks bnprks deleted the max-by-row-col branch July 9, 2024 20:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants