Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

binary matrices utils in rust #12456

Merged
merged 49 commits into from
Jun 28, 2024
Merged

binary matrices utils in rust #12456

merged 49 commits into from
Jun 28, 2024

Conversation

ShellyGarion
Copy link
Member

@ShellyGarion ShellyGarion commented May 23, 2024

Summary

close #12330

Transfer to rust binary matrices utils from:
https://github.com/Qiskit/qiskit/blob/main/qiskit/synthesis/linear/linear_matrix_utils.py.

joint with @alexanderivrii

Details and comments

@ShellyGarion ShellyGarion added performance synthesis Rust This PR or issue is related to Rust code in the repository labels May 23, 2024
@ShellyGarion ShellyGarion added this to the 1.2.0 milestone May 23, 2024
@ShellyGarion ShellyGarion requested a review from a team as a code owner May 23, 2024 15:05
@qiskit-bot
Copy link
Collaborator

One or more of the following people are relevant to this code:

@coveralls
Copy link

coveralls commented May 23, 2024

Pull Request Test Coverage Report for Build 9697320760

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 249 of 265 (93.96%) changed or added relevant lines in 10 files are covered.
  • 138 unchanged lines in 7 files lost coverage.
  • Overall coverage increased (+0.03%) to 89.754%

Changes Missing Coverage Covered Lines Changed/Added Lines %
crates/accelerate/src/synthesis/linear/mod.rs 86 92 93.48%
crates/accelerate/src/synthesis/linear/utils.rs 139 149 93.29%
Files with Coverage Reduction New Missed Lines %
qiskit/circuit/library/standard_gates/r.py 1 97.73%
qiskit/synthesis/clifford/clifford_decompose_layers.py 1 93.9%
crates/qasm2/src/lex.rs 3 92.62%
qiskit/synthesis/discrete_basis/solovay_kitaev.py 4 94.74%
qiskit/providers/fake_provider/generic_backend_v2.py 10 94.71%
qiskit/circuit/quantumcircuit.py 17 95.69%
crates/circuit/src/operations.rs 102 72.45%
Totals Coverage Status
Change from base Build 9661630219: 0.03%
Covered Lines: 63985
Relevant Lines: 71289

💛 - Coveralls

@ShellyGarion ShellyGarion self-assigned this May 30, 2024
@sbrandhsn
Copy link
Contributor

I'm excited for this PR! Can you add a graph highlighting the runtime speedup of your Rust code compared to the old Python code? I don't think this graph would need to be comprehensive but it would be nice to get a clearer picture for some relevant procedures. :-)

Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this, it's a great start. I left a couple inline comments for some suggestions. The larger comment shows how you can leverage rayon for multithreading part of the implementation. But we should benchmark it to make sure that multithreading actually speeds it up in this case. I could see the benefits from it not being huge here. In that suggestion you can just remove the .into_par_iter() line and that will make the iterator chain serially execute instead of doing it in parallel.

crates/accelerate/src/linear_matrix.rs Outdated Show resolved Hide resolved
crates/accelerate/src/linear_matrix.rs Outdated Show resolved Hide resolved
@alexanderivrii
Copy link
Contributor

alexanderivrii commented Jun 4, 2024

Shelly and I were experimenting with this code trying to understand how to modify matrix elements in-place (rather than copying the matrix from Python space to Rust space and back), and this is what we currently have.

The wrapper function looks as follows:

#[pyfunction]
#[pyo3(signature = (mat, ncols=None, full_elim=false))]
fn _gauss_elimination(
    _py: Python,
    mat: &PyArray2<bool>,
    ncols: Option<usize>,
    full_elim: Option<bool>,
) {
    unsafe {
        let view = mat.as_array_mut();
        gauss_elimination(view, ncols, full_elim);
    }
}

This seems to work, producing the expected results, yet there are a few points we are not sure about.

  1. We are using &PyArray2 rather than PyReadwriteArray2 that was suggested elsewhere. How can we be absolutely certain that the memory is modified in-place rather than copied?

  2. cargo build complains that &PyArray2 is deprecated:

warning: use of deprecated method `pyo3::deprecations::GilRefs::<T>::function_arg`: use `&Bound<'_, T>` instead for this function argument

however replacing &PyArray2<bool> by something like mat: &Bound<PyArray2<bool>> or mat: &Bound<'_PyArray2<bool>> does not compile, with the error

no method named `as_array_mut` found for reference `&pyo3::Bound<'_, PyArray<bool, Dim<[usize; 2]>>>` in the current scope 
  1. The code from the python space must be careful to use the right type for the entries, e.g. this works:
mat = rng.integers(2, size=(num_qubits, num_qubits)).astype(bool)
_gauss_elimination(mat)

but this does not:

mat = rng.integers(2, size=(num_qubits, num_qubits))
_gauss_elimination(mat)

giving the error

TypeError: argument 'mat': type mismatch:
 from=int64, to=bool

which is probably to be expected.

@mtreinish
Copy link
Member

If you want to make the rust function mutate the input numpy array this isn't too difficult. You need to change the types on things and that might require input normalization in the python side. I did a quick test locally and you would need to change the PR like:

diff --git a/crates/accelerate/src/linear_matrix.rs b/crates/accelerate/src/linear_matrix.rs
index d708e71bd..197c13c3e 100644
--- a/crates/accelerate/src/linear_matrix.rs
+++ b/crates/accelerate/src/linear_matrix.rs
@@ -10,19 +10,19 @@
 // copyright notice, and modified files need to carry a notice indicating
 // that they have been altered from the originals.
 
-use ndarray::{s, Array2};
-use numpy::{AllowTypeChange, IntoPyArray, PyArray2, PyArrayLike2};
+use ndarray::{s, ArrayViewMut2};
+use numpy::PyReadwriteArray2;
 use pyo3::prelude::*;
 
 // Perform ROW operation on a matrix mat
-fn _row_op(mat: &mut Array2<i8>, ctrl: usize, trgt: usize) {
+fn _row_op(mat: &mut ArrayViewMut2<i8>, ctrl: usize, trgt: usize) {
     let row0 = mat.row(ctrl).to_owned();
     let mut row1 = mat.row_mut(trgt);
     row1.zip_mut_with(&row0, |x, &y| *x ^= y);
 }
 
 // Perform COL operation on a matrix mat
-fn _col_op(mat: &mut Array2<i8>, ctrl: usize, trgt: usize) {
+fn _col_op(mat: &mut ArrayViewMut2<i8>, ctrl: usize, trgt: usize) {
     let col0 = mat.column(ctrl).to_owned();
     let mut col1 = mat.column_mut(trgt);
     col1.zip_mut_with(&col0, |x, &y| *x ^= y);
@@ -31,11 +31,7 @@ fn _col_op(mat: &mut Array2<i8>, ctrl: usize, trgt: usize) {
 // Gauss elimination of a matrix mat with m rows and n columns.
 // If full_elim = True, it allows full elimination of mat[:, 0 : ncols]
 // Returns the matrix mat.
-fn gauss_elimination(
-    mut mat: Array2<i8>,
-    ncols: Option<usize>,
-    full_elim: Option<bool>,
-) -> Array2<i8> {
+fn gauss_elimination(mut mat: ArrayViewMut2<i8>, ncols: Option<usize>, full_elim: Option<bool>) {
     let (m, mut n) = (mat.nrows(), mat.ncols()); // no. of rows and columns
     if let Some(ncols_val) = ncols {
         n = usize::min(n, ncols_val); // no. of active columns
@@ -62,7 +58,7 @@ fn gauss_elimination(
             }
         }
         if !is_non_zero {
-            return mat; // A is in the canonical form
+            return; // A is in the canonical form
         }
 
         if new_r != r {
@@ -87,21 +83,17 @@ fn gauss_elimination(
         }
         r += 1;
     }
-    mat
 }
 
 #[pyfunction]
 #[pyo3(signature = (mat, ncols=None, full_elim=false))]
 fn _gauss_elimination(
-    py: Python,
-    mat: PyArrayLike2<i8, AllowTypeChange>,
+    mut mat: PyReadwriteArray2<i8>,
     ncols: Option<usize>,
     full_elim: Option<bool>,
-) -> PyResult<Py<PyArray2<i8>>> {
-    let view = mat.as_array().to_owned();
-    Ok(gauss_elimination(view, ncols, full_elim)
-        .into_pyarray_bound(py)
-        .unbind())
+) {
+    let view = mat.as_array_mut();
+    gauss_elimination(view, ncols, full_elim);
 }
 
 #[pymodule]

The tradeoff here is that you'll probably need to do a np.asarray(dtype=np.int8) call in python to ensure the input array is typed with i8 as it will error if it's not typed correctly.

@alexanderivrii
Copy link
Contributor

Thanks, @mtreinish, your code using PyReadwriteArray2 works perfectly! As far as I can tell, the difference between this and what we tried together with Shelly is that you have also removed the py: Python argument. I don't really understand when this argument should or should not be present.

Regarding np.asarray(dtype=np.int8), doesn't this copy the array?

@mtreinish
Copy link
Member

Regarding np.asarray(dtype=np.int8), doesn't this copy the array?

It will make a copy if the dtype doesn't match the input array's type: https://numpy.org/doc/stable/reference/generated/numpy.asarray.html.

The copy is needed to convert the elements to the specified dtype and would be unavoidable in this case. The previous PyArrayLike2<> usage was actually doing this for you because you had the AllowTypeChange argument: https://docs.rs/numpy/latest/numpy/struct.PyArrayLike.html

@alexanderivrii
Copy link
Contributor

A very quick experiment on my laptop:

for num_qubits in [50, 100, 250, 500, 1000, 2500]:
    rng = np.random.default_rng(0)
    mats = [rng.integers(2, size=(num_qubits, num_qubits)).astype(bool) for _ in range(10)]

    time_start = time.time()

    for mat in mats:
        _gauss_elimination(mat)

    time_end = time.time()
    print(f"{num_qubits = }, time = {time_end-time_start:.2f}")

Old version (python/numpy):

num_qubits = 50, time = 0.02
num_qubits = 100, time = 0.05
num_qubits = 250, time = 0.39
num_qubits = 500, time = 1.89
num_qubits = 1000, time = 7.56
num_qubits = 2500, time = 66.32

New code (ref. 0cf7502) but without using parallel iterators:

num_qubits = 50, time = 0.00
num_qubits = 100, time = 0.01
num_qubits = 250, time = 0.08
num_qubits = 500, time = 0.63
num_qubits = 1000, time = 5.11
num_qubits = 2500, time = 35.89

New code (ref. 0cf7502) with using parallel iterators:

num_qubits = 50, time = 0.01
num_qubits = 100, time = 0.02
num_qubits = 250, time = 0.15
num_qubits = 500, time = 0.69
num_qubits = 1000, time = 2.99
num_qubits = 2500, time = 34.27

@alexanderivrii
Copy link
Contributor

The Rust code is compiled in release mode (without it it's significantly slower).

@alexanderivrii
Copy link
Contributor

alexanderivrii commented Jun 5, 2024

The copy is needed to convert the elements to the specified dtype and would be unavoidable in this case. The previous PyArrayLike2<> usage was actually doing this for you because you had the AllowTypeChange argument: https://docs.rs/numpy/latest/numpy/struct.PyArrayLike.html

Thank you! In terms of the preferred API, would it be better to be more general on the Rust side (i.e. restore AllowTypeChange) or more restrictive on the Rust side (i.e. only allow arrays of type bool as per the latest version)?

@mtreinish
Copy link
Member

I typically prefer doing the type normalization in Python directly. I think in this case that's preferable because you avoid a copy if the input is the correct dtype. While in the previous implementation you always needed to copy to get a mutable array.

Comment on lines 74 to 90
if full_elim == Some(true) {
mat.axis_iter_mut(Axis(0))
.into_par_iter()
.enumerate()
.filter(|(i, row)| (*i < r) && row[new_k])
.for_each(|(_i, mut row)| {
row.zip_mut_with(&row0, |x, &y| *x ^= y);
});
}

mat.axis_iter_mut(Axis(0))
.into_par_iter()
.enumerate()
.filter(|(i, row)| (*i > r) && (*i < m) && row[new_k])
.for_each(|(_i, mut row)| {
row.zip_mut_with(&row0, |x, &y| *x ^= y);
});
Copy link
Member

@mtreinish mtreinish Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be simplified to a single iterator with something like (double check that I got the filter condition correct:

Suggested change
if full_elim == Some(true) {
mat.axis_iter_mut(Axis(0))
.into_par_iter()
.enumerate()
.filter(|(i, row)| (*i < r) && row[new_k])
.for_each(|(_i, mut row)| {
row.zip_mut_with(&row0, |x, &y| *x ^= y);
});
}
mat.axis_iter_mut(Axis(0))
.into_par_iter()
.enumerate()
.filter(|(i, row)| (*i > r) && (*i < m) && row[new_k])
.for_each(|(_i, mut row)| {
row.zip_mut_with(&row0, |x, &y| *x ^= y);
});
mat.axis_iter_mut(Axis(0))
.into_par_iter()
.enumerate()
.filter(|(i, row)| (full_elim == Some(true) && (*i < r) && row[new_k]) || (*i > r && *i < m))
.for_each(|(_i, mut row)| {
row.zip_mut_with(&row0, |x, &y| *x ^= y);
});

You can also move the full_elim condition into the for_each block and just change the filter to be i < m if that's easier.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other thing I was thinking about is you can probably get a slice view of the rows you care about and iterate over that slice instead of the full matrix. It's might be worth investigating that instead of doing a single big iterator like this. TBH, I'm not sure what is going to be the most efficient here we need to benchmark all the different possible ways to write this and figure out what ends up being the best performing.

@ShellyGarion ShellyGarion changed the title [WIP] binary matrices utils in rust binary matrices utils in rust Jun 25, 2024
Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only took a quick look at the numpy usage in this PR. I feel like there are a lot of extra copies happening which we can mostly avoid I think. Although I haven't reviewed the rust code in depth I was leaving that for @kevinhartman (as I think he volunteered to review it). As we're accelerating this code extra copies will have a real noticeable overhead so we should be intentional around when we copy.

@@ -53,7 +53,8 @@ def random_cnotdihedral(num_qubits, seed=None):
random_invertible_binary_matrix,
)

linear = random_invertible_binary_matrix(num_qubits, seed=rng)
seed = rng.integers(100000, size=1, dtype=np.uint64)[0]
linear = random_invertible_binary_matrix(num_qubits, seed=seed).astype(int)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will always result in a copy. You could use .astype(int, copy=False) but it still would need to copy. Also I think the dtype is incorrect shouldn't this be int8 because that's what cnot dihedral is using: https://github.com/qiskit/qiskit/blob/main/qiskit/quantum_info/operators/dihedral/dihedral.py#L129.

That being said wouldn't it just make more sense to change all the binary matrices in rust to use Array2<i8> to match this? It'd be the same size as Array2<bool> but then you could do a no-copy return to cnotdihedral and anywhere else using these functions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original motivation for rust binary matrices to be of type Array2<bool> is that python classes Clifford and LinearFunction internally store a matrix of booleans, and we wanted to avoid copying there. IMHO, the dihedral class is less important then either of the above (at least right now). However, I do agree that the python synthesize size needs to be cleaned up.

Copy link
Member Author

@ShellyGarion ShellyGarion Jun 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @alexanderivrii . As follows form the function name, the aim is only to handle binary matrices (e.g. bool).
In the future, the plan is to port the synthesis code to rust, and so we can handle the other classes later.

qiskit/synthesis/clifford/clifford_decompose_layers.py Outdated Show resolved Hide resolved
qiskit/synthesis/linear/linear_depth_lnn.py Outdated Show resolved Hide resolved
qiskit/synthesis/linear/linear_depth_lnn.py Outdated Show resolved Hide resolved
qiskit/synthesis/linear_phase/cx_cz_depth_lnn.py Outdated Show resolved Hide resolved
qiskit/synthesis/stabilizer/stabilizer_decompose.py Outdated Show resolved Hide resolved
qiskit/transpiler/passes/synthesis/high_level_synthesis.py Outdated Show resolved Hide resolved
qiskit/transpiler/passes/synthesis/high_level_synthesis.py Outdated Show resolved Hide resolved
qiskit/synthesis/clifford/clifford_decompose_layers.py Outdated Show resolved Hide resolved
qiskit/synthesis/clifford/clifford_decompose_layers.py Outdated Show resolved Hide resolved
Copy link
Contributor

@Cryoris Cryoris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good, and I'm eager to merge it! Below are some questions and comments, my main concern being on whether we handle the functions as API-documented (meaning we keep the signature the same) or whether we can change them as if they were private 🙂

crates/accelerate/src/synthesis/linear/mod.rs Outdated Show resolved Hide resolved
crates/accelerate/src/synthesis/linear/mod.rs Outdated Show resolved Hide resolved
Comment on lines +145 to +151
if verify {
let mat2 = binary_matmul_inner(mat, (&invmat).into())?;
let identity_matrix: Array2<bool> = Array2::from_shape_fn((n, n), |(i, j)| i == j);
if mat2.ne(&identity_matrix) {
return Err("The inverse matrix is not correct.".to_string());
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that this was here before, but it seems very strange to essentially have a runtime test of our code here... if we provide the function then we should also be sure it is correct and not need to verify the result 😅 I'm fine keeping it because it existed before, but I would prefer removing this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a parameter "verify" in several parts of the synthesis code, which is mainly used for tests, since the algorithms are quite complicated (the default value is False)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can leave it for this PR 👍🏻 but a common subroutine like matrix inversion seems like something stable enough that we don't need to bake in a verification and instead just have some tests in the test suite (fwiw other packages like NumPy/SciPy also don't have any arguments to check that 🙂).

Copy link
Contributor

@Cryoris Cryoris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks for the effort! 🦀

@Cryoris Cryoris added this pull request to the merge queue Jun 28, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 28, 2024
@Cryoris Cryoris added this pull request to the merge queue Jun 28, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 28, 2024
@mtreinish mtreinish added this pull request to the merge queue Jun 28, 2024
Merged via the queue into Qiskit:main with commit e9208a6 Jun 28, 2024
15 checks passed
@ElePT ElePT added the Changelog: New Feature Include in the "Added" section of the changelog label Jul 30, 2024
Procatv pushed a commit to Procatv/qiskit-terra-catherines that referenced this pull request Aug 1, 2024
* gaussian elimination in rust

* handle lint errors

* replace python function by rust function for gauss elimination

* change matrix elements type from bool to i8

* add parallelization in row operations

* update matrices in place

* change matrix type in rust code to bool

* handle type in python code

* update filter following review

* remove parallelization using rayon

* move _gauss_elimination_with_perm to rust

* fix fmt error

* simplify _gauss_elimination function

* update _compute_rank_after_gauss_elim to rust

* update _row_op and _col_op

* transfer _row_op and _col_op from python to rust

* fix code due to failing tests

* minor update of types

* move calc_inverse_matrix to rust, add _binary_matmul in rust

* fix failing tests, by changing mat type from int to bool

* update rust docstrings

* add function _add_row_or_col to rust code

* improve binary_matmul

* proper error handling

* unified format of function names

* move compute_rank from python to rust, update errors

* update type of mat in compute_rank

* move random_invertible_binary_matrix and check_invertible_binary_matrix to rust

* Updating HighLevelSynthesis tests that depend on the specific random number

* Updating LinearSynthesis tests to pass seeds

* Updating tests in test_linear_function

* changing the matrix type in random dyhedral to be a matrix of ints rather than bools

* updating cx_cz synthesis tests

* updating clifford tests

* remove unused imports

* add option seed=None

* enhance rust docs

* add release notes

* remove unnecessary copy in python

* another copy fix

* another copy fix

* update rust docstrings

* update release notes

---------

Co-authored-by: AlexanderIvrii <alexi@il.ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: New Feature Include in the "Added" section of the changelog performance Rust This PR or issue is related to Rust code in the repository synthesis
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move binary linear matrices utils to rust
8 participants