Hadar/vecops #639
Merged
Commits
f651c59
initial edits
64d4414
vector_sum issue
04351fb
for Miki
f3086d4
debugged reduction ops
2ab4488
added offset/stride to reduce ops
89e998a
implemented strides ops
9aaf944
vec_ops batch added
ShanieWinitz 1488732
vec_ops - added: config.batch, parallel transpose, tests
ShanieWinitz de1fcbf
vecops with batch - documentation
ShanieWinitz 3a943a5
formating
ShanieWinitz a013f46
Merge branch 'main' into hadar/vecops
HadarIngonyama 98ca917
vectorVectorOps passes
HadarIngonyama 0c6bc9a
mont + scalars passing
HadarIngonyama 32e262b
bitrev passes
HadarIngonyama e8e1799
slice passes
HadarIngonyama 1d1f84e
slice passes
HadarIngonyama 0c609bf
reduction passes
HadarIngonyama dca2e5b
fix scalar columns batch
HadarIngonyama 0728a06
remove same scalar bool
HadarIngonyama 2590df0
fix API
HadarIngonyama 2fd1fac
fix API
HadarIngonyama 1bd7c05
non zero passes
HadarIngonyama 0016149
slice and poly_dev apis deprecated use new ones with warning
yshekel 916618c
poly eval WIP
HadarIngonyama 6176a79
Merge remote-tracking branch 'refs/remotes/origin/hadar/vecops' into …
HadarIngonyama f033bdb
poly eval passes
HadarIngonyama 35d2e23
fix types +
HadarIngonyama ecc054d
tidy up
HadarIngonyama d9a0b5f
Merge remote-tracking branch 'origin/main' into hadar/vecops
HadarIngonyama 9798073
formatting and spelling
HadarIngonyama 32bd780
ntt test
HadarIngonyama 5291608
debug eval bug
HadarIngonyama b7b26ec
eval bug solved
HadarIngonyama baf3eb2
removed vec-ops example - doesn't compile and very similar to other e…
yshekel 2ed4369
updated poly-div test and poly-eval fix for column mode
yshekel b7d62c8
updated for poly div
yshekel b361b0f
vector div for extension field and test fix for missing ext field apis
yshekel fd208f4
remove wrong file
yshekel 4de758f
revert api headers
yshekel c9788e9
minor cleanup
yshekel 0562f85
Merge remote-tracking branch 'origin/main' into hadar/vecops
yshekel fdc7a5c
update go vec-ops config struct
yshekel 198d196
fix C++ example
yshekel 8f827d6
vec_ops rust binding and tests (#642)
emirsoyturk 0c25f75
formatting rust
yshekel dd6833b
extension field vec ops
yshekel fbb9f55
release script build v3.1
yshekel
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,6 +16,8 @@ The `VecOpsConfig` struct is a configuration object used to specify parameters f | |
- **`is_b_on_device: bool`**: Indicates whether the second input vector (`b`) is already on the device. If `false`, the vector will be copied from the host to the device. This field is optional. | ||
- **`is_result_on_device: bool`**: Indicates whether the result should be stored on the device. If `false`, the result will be transferred back to the host. | ||
- **`is_async: bool`**: Specifies whether the vector operation should be performed asynchronously. When `true`, the operation will not block the CPU, allowing other operations to proceed concurrently. Asynchronous execution requires careful synchronization to ensure data integrity. | ||
- **`batch_size: int`**: Number of vectors (or operations) to process in a batch. Each vector operation will be performed independently on each batch element. | ||
Review comment: the assumption is that all the vectors are concatenated to 1 vector |
||
- **`columns_batch: bool`**: True if the batched vectors are stored as columns in a 2D array (i.e., the vectors are strided in memory as columns of a matrix). If false, the batched vectors are stored contiguously in memory (e.g., as rows or in a flat array). | ||
- **`ext: ConfigExtension*`**: Backend-specific extensions. | ||
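To make the two batch layouts concrete, here is a minimal sketch of how a flat index could be derived from `batch_size` and `columns_batch`. The helper `batch_index` is hypothetical and not part of the library; it assumes that, in columns mode, vector `i` is column `i` of a row-major matrix with `size` rows and `batch_size` columns, and that otherwise the vectors are simply concatenated.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a library function): flat index of element `j`
// of batch vector `i`, for `batch_size` vectors of `size` elements each.
inline uint64_t batch_index(uint64_t i, uint64_t j, uint64_t size, uint64_t batch_size, bool columns_batch)
{
  assert(i < batch_size && j < size);
  // columns_batch == true : vector i is column i of a (size x batch_size) row-major matrix.
  // columns_batch == false: vector i occupies the contiguous range [i*size, (i+1)*size).
  return columns_batch ? j * batch_size + i : i * size + j;
}
```

With `size = 4` and `batch_size = 3`, element 2 of vector 1 sits at index 6 in the contiguous layout and at index 7 in the columns layout.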
|
||
#### Default Configuration | ||
|
@@ -28,14 +30,17 @@ static VecOpsConfig default_vec_ops_config() { | |
false, // is_b_on_device | ||
false, // is_result_on_device | ||
false, // is_async | ||
1, // batch_size | ||
false, // columns_batch | ||
nullptr // ext | ||
}; | ||
return config; | ||
} | ||
``` | ||
|
||
### Element-wise Operations | ||
|
||
These functions perform element-wise operations on two input vectors `a` and `b`, producing an output vector. | ||
These functions perform element-wise operations on two input vectors `a` and `b`. If `VecOpsConfig` specifies a `batch_size` greater than one, the operations are performed on multiple pairs of vectors simultaneously, producing corresponding output vectors. | ||
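As a rough host-side illustration of the batched case, the sketch below treats `a` and `b` as `batch_size` vectors of `size` elements concatenated into one buffer, following the reviewer's note on `batch_size`. The function `batched_add_ref` is a hypothetical reference, not the library implementation, and it uses plain integers instead of field elements.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference, not the library implementation.
// `a` and `b` each hold batch_size vectors of `size` elements in one buffer.
std::vector<int64_t> batched_add_ref(
  const std::vector<int64_t>& a, const std::vector<int64_t>& b, uint64_t size, uint64_t batch_size)
{
  assert(a.size() == size * batch_size && b.size() == a.size());
  std::vector<int64_t> out(a.size());
  // Element-wise operations pair up elements at the same flat index, so the
  // result is the same whichever layout (rows or columns) the batch uses.
  for (uint64_t k = 0; k < out.size(); ++k) {
    out[k] = a[k] + b[k];
  }
  return out;
}
```

Because each output element depends only on the inputs at the same index, the layout matters for element-wise operations only when inputs and outputs disagree on it.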
|
||
#### `vector_add` | ||
|
||
|
@@ -90,9 +95,31 @@ template <typename T> | |
eIcicleError convert_montgomery(const T* input, uint64_t size, bool is_into, const VecOpsConfig& config, T* output); | ||
``` | ||
|
||
### Reduction Operations | ||
|
||
These functions perform reduction operations on vectors. If `VecOpsConfig` specifies a `batch_size` greater than one, the operations are performed on multiple vectors simultaneously, producing corresponding output values. The storage arrangement of batched vectors is determined by the `columns_batch` field in the `VecOpsConfig`. | ||
|
||
#### `vector_sum` | ||
|
||
Computes the sum of all elements in each vector in a batch. | ||
|
||
```cpp | ||
template <typename T> | ||
eIcicleError vector_sum(const T* vec_a, uint64_t size, const VecOpsConfig& config, T* output); | ||
``` | ||
|
||
#### `vector_product` | ||
|
||
Computes the product of all elements in each vector in a batch. | ||
|
||
```cpp | ||
template <typename T> | ||
eIcicleError vector_product(const T* vec_a, uint64_t size, const VecOpsConfig& config, T* output); | ||
``` | ||
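The effect of `columns_batch` on a reduction can be sketched on the host as follows. `batched_sum_ref` is a hypothetical reference written with plain integers; it assumes one output value per batch vector and the same row/column indexing convention described for `columns_batch`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference, not the library implementation.
// Returns one sum per batch vector.
std::vector<int64_t> batched_sum_ref(
  const std::vector<int64_t>& in, uint64_t size, uint64_t batch_size, bool columns_batch)
{
  assert(in.size() == size * batch_size);
  std::vector<int64_t> out(batch_size, 0);
  for (uint64_t i = 0; i < batch_size; ++i) {
    for (uint64_t j = 0; j < size; ++j) {
      // Columns mode strides by batch_size; otherwise vectors are contiguous.
      const uint64_t idx = columns_batch ? j * batch_size + i : i * size + j;
      out[i] += in[idx];
    }
  }
  return out;
}
```

For the buffer `{1, 2, 3, 4, 5, 6}` with `size = 3` and `batch_size = 2`, the contiguous layout yields `{6, 15}` and the columns layout yields `{9, 12}`. A product reduction would follow the same indexing with multiplication and an initial value of one.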
|
||
### Scalar-Vector Operations | ||
|
||
These functions apply a scalar operation to each element of a vector. | ||
These functions apply a scalar operation to each element of a vector. If `VecOpsConfig` specifies a `batch_size` greater than one, the operations are performed on multiple vector-scalar pairs simultaneously, producing corresponding output vectors. | ||
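A possible reading of "vector-scalar pairs" is one scalar per batch vector. The sketch below follows that reading; it is an assumption, and `batched_scalar_add_ref` is a hypothetical host-side reference using plain integers rather than the library's field types.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference, not the library implementation.
// Assumes scalars[i] is paired with batch vector i.
std::vector<int64_t> batched_scalar_add_ref(
  const std::vector<int64_t>& scalars, const std::vector<int64_t>& vec, uint64_t size, bool columns_batch)
{
  const uint64_t batch_size = scalars.size();
  assert(vec.size() == size * batch_size);
  std::vector<int64_t> out(vec.size());
  for (uint64_t i = 0; i < batch_size; ++i) {
    for (uint64_t j = 0; j < size; ++j) {
      const uint64_t idx = columns_batch ? j * batch_size + i : i * size + j;
      out[idx] = scalars[i] + vec[idx]; // output keeps the input layout
    }
  }
  return out;
}
```

With scalars `{10, 100}` and vector buffer `{1, 2, 3, 4}` (`size = 2`), the contiguous layout gives `{11, 12, 103, 104}` while the columns layout gives `{11, 102, 13, 104}`.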
|
||
#### `scalar_add_vec / scalar_sub_vec` | ||
|
||
|
@@ -123,7 +150,7 @@ eIcicleError scalar_mul_vec(const T* scalar_a, const T* vec_b, uint64_t size, co | |
|
||
### Matrix Operations | ||
|
||
These functions perform operations on matrices. | ||
These functions perform operations on matrices. If `VecOpsConfig` specifies a `batch_size` greater than one, the operations are performed on multiple matrices simultaneously, producing corresponding output matrices. | ||
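As an illustration of a batched transpose, the sketch below assumes row-major matrices stored back to back in one buffer. Both that storage assumption and the name `batched_transpose_ref` are hypothetical; this is a host-side reference, not the library's kernel.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference, not the library implementation.
// Assumes batch_size row-major (nof_rows x nof_cols) matrices, back to back.
std::vector<int64_t> batched_transpose_ref(
  const std::vector<int64_t>& in, uint32_t nof_rows, uint32_t nof_cols, uint64_t batch_size)
{
  const uint64_t mat_size = static_cast<uint64_t>(nof_rows) * nof_cols;
  assert(in.size() == mat_size * batch_size);
  std::vector<int64_t> out(in.size());
  for (uint64_t b = 0; b < batch_size; ++b) {
    for (uint64_t r = 0; r < nof_rows; ++r) {
      for (uint64_t c = 0; c < nof_cols; ++c) {
        // Element (r, c) of the input becomes element (c, r) of the output.
        out[b * mat_size + c * nof_rows + r] = in[b * mat_size + r * nof_cols + c];
      }
    }
  }
  return out;
}
```

A single 2x3 matrix `{1, 2, 3, 4, 5, 6}` becomes the 3x2 matrix `{1, 4, 2, 5, 3, 6}`.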
|
||
#### `matrix_transpose` | ||
|
||
|
@@ -138,7 +165,7 @@ eIcicleError matrix_transpose(const T* mat_in, uint32_t nof_rows, uint32_t nof_c | |
|
||
#### `bit_reverse` | ||
|
||
Reorders the vector elements based on a bit-reversal pattern. | ||
Reorders the vector elements based on a bit-reversal pattern. If `VecOpsConfig` specifies a `batch_size` greater than one, the operation is performed on multiple vectors simultaneously. | ||
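For readers unfamiliar with the pattern, a minimal host-side sketch of bit-reversal reordering for a single vector is shown below. `bit_reverse_ref` is a hypothetical reference; it assumes the vector length is a power of two and moves the element at index `i` to the index whose `log2(size)`-bit representation is `i` reversed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference, not the library implementation.
std::vector<int64_t> bit_reverse_ref(const std::vector<int64_t>& in)
{
  const uint64_t n = in.size();
  assert(n > 0 && (n & (n - 1)) == 0); // length must be a power of two
  unsigned log_n = 0;
  while ((uint64_t{1} << log_n) < n) {
    ++log_n;
  }
  std::vector<int64_t> out(n);
  for (uint64_t i = 0; i < n; ++i) {
    uint64_t rev = 0;
    // Mirror the low log_n bits of i.
    for (unsigned bit = 0; bit < log_n; ++bit) {
      if (i & (uint64_t{1} << bit)) { rev |= uint64_t{1} << (log_n - 1 - bit); }
    }
    out[rev] = in[i];
  }
  return out;
}
```

For eight elements `{0, 1, ..., 7}` the result is `{0, 4, 2, 6, 1, 5, 3, 7}`. In a batch, the same permutation would be applied to each vector independently.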
|
||
```cpp | ||
template <typename T> | ||
|
@@ -147,16 +174,16 @@ eIcicleError bit_reverse(const T* vec_in, uint64_t size, const VecOpsConfig& con | |
|
||
#### `slice` | ||
|
||
Extracts a slice from a vector. | ||
Extracts a slice from a vector. If `VecOpsConfig` specifies a `batch_size` greater than one, the operation is performed on multiple vectors simultaneously, producing corresponding output vectors. | ||
|
||
```cpp | ||
template <typename T> | ||
eIcicleError slice(const T* vec_in, uint64_t offset, uint64_t stride, uint64_t size, const VecOpsConfig& config, T* vec_out); | ||
eIcicleError slice(const T* vec_in, uint64_t offset, uint64_t stride, uint64_t size_in, uint64_t size_out, const VecOpsConfig& config, T* vec_out); | ||
``` | ||
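One plausible interpretation of the `offset`, `stride`, `size_in` and `size_out` parameters is `out[k] = in[offset + k * stride]` for `k < size_out`, with `size_in` bounding the reads. The sketch below encodes that interpretation for a single vector; it is an assumption, and `slice_ref` is a hypothetical host-side reference in which `in.size()` plays the role of `size_in`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference, not the library implementation.
std::vector<int64_t> slice_ref(
  const std::vector<int64_t>& in, uint64_t offset, uint64_t stride, uint64_t size_out)
{
  // The last element read must still lie inside the input vector.
  assert(size_out == 0 || offset + (size_out - 1) * stride < in.size());
  std::vector<int64_t> out(size_out);
  for (uint64_t k = 0; k < size_out; ++k) {
    out[k] = in[offset + k * stride];
  }
  return out;
}
```

Taking `offset = 1`, `stride = 2` and `size_out = 3` from `{0, 1, ..., 7}` gives `{1, 3, 5}`.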
|
||
#### `highest_non_zero_idx` | ||
|
||
Finds the highest non-zero index in a vector. | ||
Finds the highest non-zero index in a vector. If `VecOpsConfig` specifies a `batch_size` greater than one, the operation is performed on multiple vectors simultaneously. | ||
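The single-vector behaviour can be sketched as a backwards scan. `highest_non_zero_idx_ref` is a hypothetical host-side reference; returning `-1` for an all-zero vector is an assumption made for this sketch, since the text above does not say what the library reports in that case.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference, not the library implementation.
int64_t highest_non_zero_idx_ref(const std::vector<int64_t>& in)
{
  assert(in.size() <= static_cast<uint64_t>(INT64_MAX));
  // Scan from the end so the first hit is the highest non-zero index.
  for (uint64_t i = in.size(); i-- > 0;) {
    if (in[i] != 0) { return static_cast<int64_t>(i); }
  }
  return -1; // assumption: sentinel for an all-zero (or empty) vector
}
```

When the vector holds polynomial coefficients in ascending order, this index is the polynomial's degree.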
|
||
```cpp | ||
template <typename T> | ||
|
@@ -165,7 +192,7 @@ eIcicleError highest_non_zero_idx(const T* vec_in, uint64_t size, const VecOpsCo | |
|
||
#### `polynomial_eval` | ||
|
||
Evaluates a polynomial at given domain points. | ||
Evaluates a polynomial at given domain points. If `VecOpsConfig` specifies a `batch_size` greater than one, the operation is performed on multiple vectors simultaneously. | ||
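Evaluation at several points can be illustrated with Horner's rule. The sketch below is a hypothetical host-side reference (`polynomial_eval_ref`) over plain integers; it assumes coefficients are stored in ascending order of degree, which the text above does not confirm.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference, not the library implementation.
// Assumes coeffs[k] is the coefficient of x^k.
std::vector<int64_t> polynomial_eval_ref(
  const std::vector<int64_t>& coeffs, const std::vector<int64_t>& domain)
{
  assert(!coeffs.empty());
  std::vector<int64_t> evals;
  evals.reserve(domain.size());
  for (const int64_t x : domain) {
    int64_t acc = 0;
    // Horner's rule: fold from the highest coefficient down to the constant.
    for (uint64_t k = coeffs.size(); k-- > 0;) {
      acc = acc * x + coeffs[k];
    }
    evals.push_back(acc);
  }
  return evals;
}
```

For `1 + 2x + 3x^2` on the domain `{0, 1, 2}` this returns `{1, 6, 17}`.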
|
||
```cpp | ||
template <typename T> | ||
|
@@ -174,7 +201,7 @@ eIcicleError polynomial_eval(const T* coeffs, uint64_t coeffs_size, const T* dom | |
|
||
#### `polynomial_division` | ||
|
||
Divides two polynomials. | ||
Divides two polynomials. If `VecOpsConfig` specifies a `batch_size` greater than one, the operation is performed on multiple vectors simultaneously. | ||
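Polynomial division produces a quotient and a remainder. The sketch below shows schoolbook long division in a deliberately simplified setting: plain integers and a monic denominator, so no field inverse is needed. The library operates on field elements, so this is only an illustration of the idea; `poly_divide_monic_ref` and the ascending coefficient order are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical host-side reference, not the library implementation.
// Coefficients are in ascending order; `den` must be monic (leading coefficient 1).
// Returns {quotient, remainder}.
std::pair<std::vector<int64_t>, std::vector<int64_t>>
poly_divide_monic_ref(std::vector<int64_t> num, const std::vector<int64_t>& den)
{
  assert(!den.empty() && den.back() == 1);
  if (num.size() < den.size()) { return {std::vector<int64_t>{}, num}; } // quotient is zero
  const uint64_t q_deg = num.size() - den.size();
  std::vector<int64_t> quotient(q_deg + 1, 0);
  for (uint64_t k = q_deg + 1; k-- > 0;) {
    // The current leading coefficient of num is the next quotient coefficient.
    quotient[k] = num[k + den.size() - 1];
    for (uint64_t j = 0; j < den.size(); ++j) {
      num[k + j] -= quotient[k] * den[j];
    }
  }
  num.resize(den.size() - 1); // what is left below deg(den) is the remainder
  return {quotient, num};
}
```

Dividing `x^2 + 1` by `x - 1` yields quotient `x + 1` and remainder `2`, since `(x - 1)(x + 1) + 2 = x^2 + 1`.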
|
||
```cpp | ||
template <typename T> | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,10 @@ | ||
# Vector Operations API | ||
|
||
Our vector operations API includes fundamental methods for addition, subtraction, and multiplication of vectors, with support for both host and device memory. | ||
Our vector operations API includes fundamental methods for addition, subtraction, and multiplication of vectors, with support for both host and device memory, as well as batched operations. | ||
|
||
## Vector Operations Configuration | ||
|
||
The `VecOpsConfig` struct encapsulates the settings for vector operations, including device context and operation modes. | ||
The `VecOpsConfig` struct encapsulates the settings for vector operations, including device context, operation modes, and batching parameters. | ||
Review comment: remove , before the "and" |
||
|
||
### `VecOpsConfig` | ||
|
||
|
@@ -17,6 +17,8 @@ pub struct VecOpsConfig { | |
pub is_b_on_device: bool, | ||
pub is_result_on_device: bool, | ||
pub is_async: bool, | ||
pub batch_size: usize, | ||
pub columns_batch: bool, | ||
pub ext: ConfigExtension, | ||
} | ||
``` | ||
|
@@ -28,6 +30,9 @@ pub struct VecOpsConfig { | |
- **`is_b_on_device: bool`**: Indicates whether the input b data has been preloaded on the device memory. If `false`, inputs will be copied from host to device. | ||
- **`is_result_on_device: bool`**: Indicates whether the output data should remain in device memory. If `false`, outputs will be copied from device to host. | ||
- **`is_async: bool`**: Specifies whether the vector operation should be performed asynchronously. | ||
- **`batch_size: usize`**: Number of vector operations to process in a single batch. Each operation will be performed independently on each batch element. | ||
- **`columns_batch: bool`**: `true` if the batched vectors are stored as columns in a 2D array (i.e., the vectors are strided in memory as columns of a matrix). If `false`, the batched vectors are stored contiguously in memory (e.g., as rows or in a flat array). | ||
|
||
- **`ext: ConfigExtension`**: extended configuration for backend. | ||
|
||
### Default Configuration | ||
|
@@ -40,11 +45,11 @@ let cfg = VecOpsConfig::default(); | |
|
||
## Vector Operations | ||
|
||
Vector operations are implemented through the `VecOps` trait, providing methods for addition, subtraction, and multiplication of vectors. | ||
Vector operations are implemented through the `VecOps` trait, providing methods for addition, subtraction, and multiplication of vectors. These methods support both single and batched operations based on the `batch_size` and `columns_batch` configurations. | ||
|
||
### Methods | ||
|
||
All operations are element-wise operations, and the results placed into the `result` param. These operations are not in place. | ||
All operations are element-wise, and the results are placed into the `result` param. These operations are not in place, except for `accumulate`. | ||
|
||
- **`add`**: Computes the element-wise sum of two vectors. | ||
- **`accumulate`**: Adds input `b` to `a` in place. | ||
|
Review comment: what about the rest of the operations