Aviad/blake2s #576

Open

aviadingo wants to merge 25 commits into V2 from aviad/blake2s

Contributor

aviadingo commented Aug 11, 2024

Describe the changes

Adds Blake2s CUDA capability.

aviadingo added 14 commits

June 9, 2024 16:40


          init blake2s

dcf70e6


          added tests

75614f2


          fixed import

9c2d9ab


          added makefile

081218f


          Merge remote-tracking branch 'origin/main' into aviad/blake2s

1ff9942


          added namespace blake2s

256f8fa


          blake2s working for single hash using Hasher


          temp test tree code

076f82f


          added merkle tree test

4229b9d


          moved initialization to device

6ce4a6c


          added sequential test

d203c1e


          fixed deq test

050c70d


          batched test working

30301f2


          all tests pass

0678a2f

aviadingo requested review from ChickenLover, mickeyasa and LeonHibnik

August 11, 2024 10:12


          ran clang-format

340f962

ChickenLover reviewed

Contributor

ChickenLover left a comment

Good job with the PR. I left a bunch of style-related comments. In theory they can be ignored, since we are merging this into V2.

icicle/include/hash/blake2s/blake2s.cuh Outdated

+               */
+              #pragma once
+              typedef unsigned char BYTE;

Contributor

ChickenLover Sep 5, 2024

Please move these inside the blake2s namespace.

Contributor Author

aviadingo Nov 4, 2024

done

icicle/src/merkle-tree/merkle.cu Outdated

-                    THROW_ICICLE_ERR(
-                      IcicleError_t::InvalidArgument,
-                      "Hash max preimage length does not match merkle tree arity multiplied by digest elements");
+                  // if (compression.preimage_max_length < tree_config.arity * tree_config.digest_elements)

Contributor

ChickenLover Sep 5, 2024

You can just delete those at this point

Contributor Author

aviadingo Nov 4, 2024

done

icicle/src/hash/blake2s/blake2s.cu Outdated

+                __device__ __forceinline__ void cuda_blake2s_init_state(cuda_blake2s_ctx_t* ctx)
+                {
+                  memcpy(ctx->state, ctx->chain, BLAKE2S_CHAIN_LENGTH);
+                  // ctx->state[8] = ctx->t0;

Contributor

ChickenLover Sep 6, 2024

Why are these commented out?

Contributor Author

aviadingo Nov 4, 2024

it's from an old version. deleted

icicle/src/hash/blake2s/blake2s.cu Outdated

+                  return a;
+                }
+                __device__ uint32_t cuda_blake2s_ROTR32(uint32_t a, uint8_t b) { return (a >> b) | (a << (32 - b)); }

Contributor

ChickenLover Sep 6, 2024

It may be worth adding __inline__ here.
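A minimal sketch of what the inlined rotate could look like. The macro name `BLAKE2S_INLINE` and the function name `rotr32` are hypothetical and not taken from the PR; the macro only exists so the same code builds both under nvcc and as plain C++. The assertion reflects the fact that shifting a 32-bit value by 32 bits is undefined, so a rotation amount of 0 is not valid for this formula. Blake2s itself only rotates by 16, 12, 8 and 7.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper macro: under nvcc, mark the function as an inlined
// device function; otherwise fall back to plain `inline` so the sketch also
// compiles as ordinary C++.
#ifdef __CUDACC__
#define BLAKE2S_INLINE __device__ __forceinline__
#else
#define BLAKE2S_INLINE inline
#endif

// Rotate a 32-bit word right by b bits.
// Precondition: 0 < b < 32, because `a << 32` is undefined behaviour.
BLAKE2S_INLINE std::uint32_t rotr32(std::uint32_t a, std::uint8_t b)
{
  assert(b > 0 && b < 32);
  return (a >> b) | (a << (32 - b));
}
```

Since the helper is called several times per mixing step, letting the compiler inline it avoids a function call on each rotation.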

icicle/src/hash/blake2s/blake2s.cu Outdated

+                  cudaMalloc(&cuda_outdata, BLAKE2S_BLOCK_SIZE * n_batch);
+                  assert(keylen <= 32);
+                  // CUDA_BLAKE2S_CTX ctx;

Contributor

ChickenLover Sep 6, 2024

These should be removed

Contributor Author

aviadingo Nov 4, 2024

done

icicle/src/hash/blake2s/blake2s.cu

+                  WORD block = (n_batch + thread - 1) / thread;
+                  kernel_blake2s_hash<<<block, thread>>>(cuda_indata, inlen, cuda_outdata, n_batch, BLAKE2S_BLOCK_SIZE);
+                  cudaMemcpy(out, cuda_outdata, BLAKE2S_BLOCK_SIZE * n_batch, cudaMemcpyDeviceToHost);
+                  cudaDeviceSynchronize();

Contributor

ChickenLover Sep 6, 2024

The current implementation does not support async, while all of our other primitives do. It may be worth adding HashConfig as an input and changing all the functions to their async alternatives.

Contributor Author

aviadingo Nov 4, 2024

mcm_cuda_blake2s_hash_batch is not a part of the API, it's only for unit tests.
run_hash_many_kernel() is used in the same way as our other implementations
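For readers of the snippet under review, the block count passed to the kernel launch is a ceiling division: it rounds up so that every one of the n_batch inputs gets a thread. A small sketch of that calculation follows. The function name `blocks_for` is hypothetical, and `WORD` is assumed here to be a 32-bit unsigned integer. This form gives the same result as `(n_batch + thread - 1) / thread` whenever that sum does not overflow.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint32_t WORD; // assumption: WORD is a 32-bit unsigned integer

// Number of thread blocks needed so that blocks * threads_per_block >= n_batch.
// Written as quotient plus remainder check, so it cannot overflow for large n_batch.
inline WORD blocks_for(WORD n_batch, WORD threads_per_block)
{
  assert(threads_per_block > 0);
  return n_batch / threads_per_block + (n_batch % threads_per_block != 0 ? 1 : 0);
}
```

Because the last block may have more threads than remaining inputs, the kernel still needs a bounds check against n_batch, which is presumably why n_batch is passed to it.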

icicle/src/hash/blake2s/blake2s.cu Outdated

+                  kernel_blake2s_hash<<<block, thread>>>(cuda_indata, inlen, cuda_outdata, n_batch, BLAKE2S_BLOCK_SIZE);
+                  cudaMemcpy(out, cuda_outdata, BLAKE2S_BLOCK_SIZE * n_batch, cudaMemcpyDeviceToHost);
+                  cudaDeviceSynchronize();
+                  cudaError_t error = cudaGetLastError();

Contributor

ChickenLover Sep 6, 2024

Please use our error-management functions (you can find an example in any of our primitives)

Contributor Author

aviadingo Nov 4, 2024

done

emirsoyturk reviewed

icicle/src/hash/blake2s/extern.cu

Comment on lines +12 to +14

+                extern "C" cudaError_t blake2s_cuda(
+                  BYTE* input, BYTE* output, WORD number_of_blocks, WORD input_block_size, WORD output_block_size, HashConfig& config)
+                {

Contributor

emirsoyturk Sep 22, 2024

I noticed that the parameter order in blake2s_cuda is a bit different from keccak256_cuda (and other keccak functions).

keccak256_cuda(uint8_t* input, int input_block_size, int number_of_blocks, uint8_t* output, HashConfig& config)

To keep things consistent, my suggestion is something like this:

blake2s_cuda(BYTE* input, WORD input_block_size, WORD number_of_blocks, WORD output_block_size, BYTE* output, HashConfig& config)

Contributor Author

aviadingo Nov 4, 2024

You are right. I'm not sure it is worth changing, since this is usually not exposed to the user and it would require updates to the Rust signatures as well.
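One possible middle ground, sketched below under stated assumptions, is a thin wrapper that offers the keccak-style parameter order while leaving the existing entry point and the Rust bindings untouched. Everything here is a stand-in so the sketch compiles without CUDA or ICICLE headers: `status_t` replaces `cudaError_t`, `HashConfig` is an empty placeholder, `WORD` is assumed to be 32-bit, and `blake2s_existing` and `blake2s_keccak_order` are hypothetical names. The stub only records its arguments to show the forwarding.

```cpp
#include <cassert>
#include <cstdint>

typedef unsigned char BYTE;  // as declared in blake2s.cuh
typedef std::uint32_t WORD;  // assumption: 32-bit unsigned
typedef int status_t;        // stand-in for cudaError_t
struct HashConfig {};        // placeholder for the real config struct

// Records the arguments of the most recent call, for illustration only.
struct CallRecord {
  const BYTE* input;
  BYTE* output;
  WORD number_of_blocks, input_block_size, output_block_size;
};
static CallRecord last_call;

// Stub with the parameter order currently used in the PR.
status_t blake2s_existing(
  BYTE* input, BYTE* output, WORD number_of_blocks, WORD input_block_size, WORD output_block_size, HashConfig& config)
{
  (void)config;
  last_call = {input, output, number_of_blocks, input_block_size, output_block_size};
  return 0;
}

// Hypothetical wrapper using the keccak-style order suggested in the review.
status_t blake2s_keccak_order(
  BYTE* input, WORD input_block_size, WORD number_of_blocks, WORD output_block_size, BYTE* output, HashConfig& config)
{
  assert(input != nullptr && output != nullptr);
  return blake2s_existing(input, output, number_of_blocks, input_block_size, output_block_size, config);
}
```

The trade-off is an extra symbol to maintain, in exchange for not having to change the existing signature or its callers.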

emirsoyturk and others added 9 commits

November 4, 2024 09:12


          blake2s cpp and rust API (#616)

e9fa409

## Changes

This PR updates the existing Blake2s implementation by integrating a C++
API and a Rust wrapper API.


          Merge remote-tracking branch 'origin/V2' into aviad/blake2s

eb10560


          clang-fmt

166c615


          Revert "clang-fmt"

caece25

This reverts commit 166c615.


          cargo-fmt

a52c6fd


          styling fixes

d40e485


          updated standalone function

ba1bd4f


          clang-fmt

07262ba


          updated api signature

247cdef

aviadingo requested a review from yshekel

November 4, 2024 08:55

jeremyfelder approved these changes

emirsoyturk approved these changes


          compare with stwo result

edb0f85

jeremyfelder self-requested a review

November 5, 2024 11:27

Reviewers

ChickenLover left review comments

emirsoyturk approved these changes

mickeyasa: awaiting requested review

LeonHibnik: awaiting requested review

yshekel: awaiting requested review

jeremyfelder: awaiting requested review
