Commit

use smaller block size on cuda
pratikvn committed Aug 16, 2024
1 parent c689cf3 commit e5b261f
Showing 1 changed file with 5 additions and 4 deletions.
9 changes: 5 additions & 4 deletions cuda/solver/batch_bicgstab_kernels.cu
@@ -144,10 +144,11 @@ public:
const int shmem_per_blk =
get_max_dynamic_shared_memory<StopType, PrecType, LogType,
BatchMatrixType, value_type>(exec_);
-const int block_size =
-get_num_threads_per_block<StopType, PrecType, LogType,
-BatchMatrixType, value_type>(
-exec_, mat.num_rows);
+// TODO
+const int block_size = 256;
+// get_num_threads_per_block<StopType, PrecType, LogType,
+// BatchMatrixType, value_type>(
+// exec_, mat.num_rows);
GKO_ASSERT(block_size >= 2 * config::warp_size);
const size_t prec_size = PrecType::dynamic_work_size(
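
Note: the change above pins block_size to 256 regardless of mat.num_rows, which still satisfies the GKO_ASSERT for a warp size of 32. One possible middle ground between the fixed value and the disabled heuristic is to scale the block size with the row count while keeping 256 as an upper bound. The following is only a sketch of that idea; pick_block_size, warp_size and max_block_size are hypothetical names used for illustration and are not part of Ginkgo, and the warp size of 32 is an assumption.

```cpp
#include <algorithm>

// Hypothetical stand-ins for illustration only; not Ginkgo definitions.
constexpr int warp_size = 32;        // assumed CUDA warp size
constexpr int max_block_size = 256;  // the value hard-coded in this commit

// Round num_rows up to a whole number of warps, then clamp the result to
// [2 * warp_size, max_block_size] so that the check
// block_size >= 2 * warp_size from the diff still holds.
inline int pick_block_size(int num_rows)
{
    const int rounded = ((num_rows + warp_size - 1) / warp_size) * warp_size;
    return std::min(std::max(rounded, 2 * warp_size), max_block_size);
}
```

With these assumptions, a batch entry with 10 rows would get 64 threads, one with 100 rows would get 128, and anything from 225 rows upward would get 256. Whether this beats the fixed value depends on shared-memory and register pressure in the kernel, which this sketch does not model.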