Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Columns support for matrix class #229

Merged
merged 4 commits into from
Aug 19, 2021
Merged

Conversation

alyst
Copy link
Contributor

@alyst alyst commented Aug 11, 2021

This draft PR adds matrix::column class in full analogy to matrix::row class and updates the matrix API so that both rows and columns could be accessed. I just added it because my use-case required column-wise access.
I'm not sure about the proposed matrix API, it was just the easiest to implement.
Alternatively one could think about matrix::row_vector proxy class and matrix::rows() method that would provide row access (begin/end/[] etc) and matrix::column_vector that would provide symmetric columns access.

I also added the required types to matrix::row::iterator so that it actually conforms to the iterator concept.

@alyst
Copy link
Contributor Author

alyst commented Aug 17, 2021

The tests are failing because the matrix::row-related API has changed. I will fix the tests once there is a decision how this API should actually look like.

@jimhester
Copy link
Member

An alternative interface would be to let the user to supply if they want to use column or row based indexing as part of the matrix constructor, either as a template parameter or an argument.

I think generally users only want to iterator over rows or columns, and very rarely do both with the same matrix. If they do want to use both in the same application they could create two matrix objects pointing at the same underlying data.

@alyst
Copy link
Contributor Author

alyst commented Aug 17, 2021

@jimhester I've tried to implement your suggestion.

Now the matrix class has the 3rd template parameter, S (how the matrix is sliced), which could be either by_row or by_column.
So all fully specialized matrix types are duplicated: there is doubles_row_matrix and doubles_column_matrix.
The matrix::row class is replaced by matrix::slice, which uses matrix::slice_xxx() methods to define how it accesses the matrix elements.
matrix<V,T,S> is inherited from matrix_slices<S>, where the slice_xxx methods are actually defined (there are matrix_slices<by_row> and matrix_slices<by_column> specializations).

It's been a while since I wrote something in C++, I hope I have not overcomplicated these specialization things.

If this approach looks good, I can start updating the tests.

@alyst alyst force-pushed the matrix_column branch 2 times, most recently from 80dfe43 to a23f116 Compare August 18, 2021 08:54
@alyst
Copy link
Contributor Author

alyst commented Aug 18, 2021

@jimhester the tests pass now

using integers_matrix = matrix<r_vector<int>, int>;
using logicals_matrix = matrix<r_vector<r_bool>, r_bool>;
using strings_matrix = matrix<r_vector<r_string>, r_string>;
template <typename S = by_column>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the default access pattern would need to be by row, as this is what cpp11 currently supplies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change that. I just have thought that cpp11 is in active development and have not reached 1.0.0 yet, so changes like this are still possible. The reason is - column slices have continuous memory layout, so it would be nice to select more efficient alternative by default. And the fact that there were some issues with matrix iterators suggests that so far there are not so many matrix users whose code might be broken.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true, let me ask around and see what people might prefer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright since R generally does column major ordering by default I think having cpp11 match that makes the most sense. So I think leaving it as you have it now is OK

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Thanks for double-checking.
I've applied some cosmetics and squashed the commits to clean-up the log.

@alyst alyst force-pushed the matrix_column branch 2 times, most recently from 5f14f52 to e8ac032 Compare August 18, 2021 19:10
@jimhester
Copy link
Member

Can you please add a bullet to NEWS? It should briefly describe the change and end with (@yourname, #issuenumber).

Otherwise I think this looks great, thanks for working on it!

alyst added 4 commits August 19, 2021 17:17
- add the 3rd template parameter S, which could be either
  by_row or by_column
- matrix::row is replaced by matrix::slice
- make matrix::slice::iterator conforming to the iterator concept
- separate matrix::slice from matrix::slice_iterator
since it matches the column-major memory layout of R matrices
add tests for by_column slicing
@alyst
Copy link
Contributor Author

alyst commented Aug 19, 2021

@jimhester done

@jimhester jimhester merged commit 138bc27 into r-lib:master Aug 19, 2021
@jimhester
Copy link
Member

Thanks again for working on this!

Thanks a bunch!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants