Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cpp] Improve stats calculation in row/col reorder #180

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bnprks
Copy link
Owner

@bnprks bnprks commented Dec 23, 2024

Motivation: The subset assignment operator results in a concatenation layer followed by a re-ordering of the rows/cols. We don't want this outer re-order layer to prevent any parallelism or other optimizations available on the inner layers, so it's time to have all the stats specialized for row/col re-ordering.

This just adds specializations for computeMatrixStats, rowSums, and colSums along with tests

Motivation: The subset assignment operator results in a concatenation
layer followed by a re-ordering of the rows/cols. We don't want this
outer re-order layer to prevent any parallelism or other optimizations
available on the inner layers, so it's time to have all the stats
specialized for row/col re-ordering.
Copy link
Collaborator

@immanuelazn immanuelazn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Small general comments, but those lay on the styling side.

@@ -58,6 +58,27 @@ subset_eigen_cols(const Eigen::Matrix<double, Rows, Cols> m, const std::vector<u
return ret;
}

// Array variants
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use Eigen::ArrayXXd here rather than use templating? Similar can probably done with MatrixXd for the function definitions at the top of this file.

ie:

// Array variants
Eigen::ArrayXXd
subset_eigen_rows(const Eigen::ArrayXXd &m, const std::vector<uint32_t> &indices)
{
    Eigen::ArrayXXd ret(indices.size(), m.cols());
    return ret;
    for (size_t i = 0; i < indices.size(); i++) {
        ret.row(i) = m.row(indices[i]);
    }
    return ret;
}

Eigen::ArrayXXd
subset_eigen_cols(const Eigen::ArrayXXd &m, const std::vector<uint32_t> &indices)
{
    Eigen::ArrayXXd ret(m.rows(), indices.size());

    for (size_t i = 0; i < indices.size(); i++) {
         ret.col(i) = m.col(indices[i]);
    }
    return ret;
}

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the matrix variants were probably templated so they would cover both MatrixXd and VectorXd arguments. Though you're right we don't really need that flexibility in the Array case.


template <int Rows, int Cols>
Eigen::Array<double, Rows, Cols>
subset_eigen_cols(const Eigen::Array<double, Rows, Cols> m, const std::vector<uint32_t> &indices) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a more general question, shouldn't these be placed in a cpp file, with only the function headers in the .h file? Or does that not matter as much as these are smaller functions?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these functions are templated, they kind of need to live in a header file unless we pre-commit to a specific set of explicitly instantiated template options. And for something so small and specific I didn't want to split out into another file

At some point, it might make sense to refactor many of the header-only IterableMatrix subclasses like this one to have most of the implementations in cpp files with explicit template instantiations for supported types

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants