Gpu kernel break #13

Merged 34 commits on Mar 14, 2024

xenshinu

Still causes a page fault when the long seg count is not reset every batch.

gpu/plscore.cu Outdated
Comment on lines 499 to 501
// cudaMemsetAsync(dev_mem->d_total_n_long, 0, sizeof(size_t),
// *stream);

Member

Commented this out. Working fine.

Author

xenshinu left a comment

add comment

@@ -490,14 +490,14 @@ void plscore_async_short_mid_forward_dp(deviceMemPtr* dev_mem, cudaStream_t* str
dim3 shortDimGrid(score_kernel_config.short_griddim, 1, 1);
dim3 midDimGrid(score_kernel_config.mid_griddim, 1, 1);

// Run kernel
// printf("Grid Dim, %d\n", DimGrid.x);
cudaMemsetAsync(dev_mem->d_long_seg_count, 0, sizeof(unsigned int),
Author

If these lines are commented out, the run aborts with:

[Debug] plchain_cal_score_async (gpu/plchain.cu:572) batch_n 31853545, read_start 26, read_end 50
[Info] plrange_async_range_selection (gpu/plrange.cu:258): Batch total_n 31853545, Range Kernel Launched, grid 984 cut 62204
[Info] plscore_async_short_mid_forward_dp (gpu/plscore.cu:537) short mid score generation success
[Debug] plchain_cal_score_async (gpu/plchain.cu:572) batch_n 32180703, read_start 50, read_end 82
[Info] plrange_async_range_selection (gpu/plrange.cu:258): Batch total_n 32180703, Range Kernel Launched, grid 999 cut 62838
[Info] plscore_async_short_mid_forward_dp (gpu/plscore.cu:537) short mid score generation success
[Debug] plchain_cal_score_async (gpu/plchain.cu:572) batch_n 30397856, read_start 82, read_end 111
[Info] plrange_async_range_selection (gpu/plrange.cu:258): Batch total_n 30397856, Range Kernel Launched, grid 943 cut 59356
[Info] plscore_async_short_mid_forward_dp (gpu/plscore.cu:537) short mid score generation success
[Info] plscore_async_long_forward_dp (gpu/plscore.cu:563) long score generation success
Memory access fault by GPU node-2 (Agent handle: 0x825d1dc0) on address 0x7fbe88e09000. Reason: Unknown.
/var/spool/slurm-llnl/job59127/slurm_script: line 30: 4091078 Aborted                 (core dumped) ./minimap2 -t 1 --max-chain-skip=2147483647 --gpu-chain /shareddata/umich_folder/data/ONT/hg38.mmi /shareddata/umich_folder/data/ONT/random_500MBases_200kto300k.fa

gpu/plchain.cu Outdated
#endif

// DEBUG: debug analysis that involves synchronizing the whole device
#if defined(DEBUG_CHECK)
plchain_debug_analysis(stream_setup.streams[stream_id]);
#endif // DEBUG_CHECK
// reset values
cudaMemsetAsync(stream_setup.streams[stream_id].dev_mem.d_long_seg_count, 0, sizeof(unsigned int),
Author

Reset the counters after each large batch.

gpu/plchain.cu Outdated
// NOTE: this is the number of long segs till this microbatch
size_t long_segs_num = stream_setup.streams[stream_id].host_mems[uid].long_segs_num;
for (; long_seg_idx < long_segs_num; long_seg_idx++) {
// TODO: write long_segs + long_seg_idx to f/p
Author

TODO: copy p/f_long back

gpu/plchain.cu Outdated
stream_setup.streams[t].reads, misc, km);
// TODO: backtrack multiple pending batches
// reset values
cudaMemsetAsync(stream_setup.streams[t].dev_mem.d_long_seg_count, 0, sizeof(unsigned int),
Author

Reset the counters once a large stream finishes.

max_range_grid, long_seg_buffer_size);
}
// one stream has one long mem and one device mem
plmem_malloc_long_mem(&stream_setup.streams[i].long_mem, long_seg_buffer_size);
Author

malloc long mem

gpu/plmem.cu Outdated
cudaMemsetAsync(&stream_setup.streams[i].dev_mem.d_mid_seg_count, 0, sizeof(unsigned int),
stream_setup.streams[i].cudastream);
cudaMemsetAsync(&stream_setup.streams[i].dev_mem.d_total_n_long, 0, sizeof(size_t),
stream_setup.streams[i].cudastream);
Author

Reset the counters at first initialization.

@@ -23,10 +24,8 @@ typedef struct {
uint16_t *p; // predecessor

// array size: number of cuts in the batch / long_seg_cut
seg_t *long_segs;
// total long segs number till this batch
Author

I still keep the cumulative value, since we are running asynchronously.

fprintf(stderr, "max_range_grid %d griddim %d, total_n %lu n_read %d\n",
stream_setup.max_range_grid, griddim, total_n, n_read);
}
stream_setup.streams[stream_id].reads = reads;
Member

joydddd, Oct 30, 2023

Move cudaMemsetAsync here. Reset long_seg_count and total_n_long before launching async kernels into the stream.

gpu/plchain.cu Outdated
stream_setup.streams[t].cudastream);
cudaMemsetAsync(stream_setup.streams[t].dev_mem.d_total_n_long, 0, sizeof(size_t),
stream_setup.streams[t].cudastream);
seg_t* long_segs = stream_setup.streams[t].long_mem.long_segs;
Member

No need to reset the long seg count here: it is reset every time before launching any kernels.
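
To illustrate why the reset has to happen before each launch, here is a minimal host-side C++ sketch. It is not the project's code: LongSegBuffer, push, and run_batches are hypothetical names, and it assumes (the thread does not confirm this) that the fault comes from a stale counter indexing past the end of the long-segment buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side model of one stream's long-segment buffer.
struct LongSegBuffer {
    std::vector<int> segs;        // stand-in for the device long_segs array
    unsigned long_seg_count = 0;  // stand-in for *d_long_seg_count

    explicit LongSegBuffer(std::size_t capacity) : segs(capacity) {}

    // Stand-in for the memset issued before a batch's kernels are launched.
    void reset() { long_seg_count = 0; }

    // Append one segment; returns false instead of writing out of bounds.
    bool push(int seg) {
        if (long_seg_count >= segs.size()) return false;
        segs[long_seg_count++] = seg;
        return true;
    }
};

// Run n_batches batches of segs_per_batch segments each.
// Returns the number of appends that would have landed out of bounds.
std::size_t run_batches(LongSegBuffer& buf, int n_batches, int segs_per_batch,
                        bool reset_each_batch) {
    assert(n_batches >= 0 && segs_per_batch >= 0);
    std::size_t rejected = 0;
    for (int b = 0; b < n_batches; ++b) {
        if (reset_each_batch) buf.reset();  // reset before "launching" the batch
        for (int s = 0; s < segs_per_batch; ++s) {
            if (!buf.push(s)) ++rejected;
        }
    }
    return rejected;
}
```

With a capacity of 4 and three batches of 3 segments, resetting before each batch rejects nothing, while skipping the reset lets the counter run past the buffer from the second batch on.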

Member

joydddd left a comment

aac_config.json: manual config for fitting one batch + long buffer in aac gpu memory

Batch memory usage: 51.32 GB

Long buffer memory usage: 8.98 GB

Author

xenshinu commented Nov 1, 2023

This line of code limits max_n to INT32_MAX.
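
A small sketch of the limit being described, assuming the count is held in a 32-bit signed integer somewhere along the path. fits_in_int32 and to_int32_checked are hypothetical helpers, not functions from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// True if n can be stored in a 32-bit signed int without overflow.
bool fits_in_int32(std::size_t n) {
    return n <= static_cast<std::size_t>(
                    std::numeric_limits<std::int32_t>::max());
}

// Narrow a count to int32_t, asserting that no information is lost.
std::int32_t to_int32_checked(std::size_t n) {
    assert(fits_in_int32(n));
    return static_cast<std::int32_t>(n);
}
```

The batch sizes in the log above (for example batch_n 31853545) are well under this bound, so the limit only matters if max_n is configured close to or above 2147483647.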

Author

xenshinu commented Nov 6, 2023

When building the program, use

make MICRO_BATCH=4 GPU_CONFIG=aac_config.json SHORT_BLOCK_SIZE=64 LONG_BLOCK_SIZE=1024 MID_BLOCK_SIZE=512 MID_CUT=1 LONG_CUT=100

where MICRO_BATCH=4 specifies the number of micro-batches to use. We'll add it to GPU_CONFIG in the next commit.

# rocprof --stats -o rocprof_output/long_seg.${SLURM_JOB_ID}.csv ./minimap2 -t 1 --max-chain-skip=2147483647 --gpu-chain /shareddata/umich_folder/data/ONT/hg38.mmi /shareddata/umich_folder/data/ONT/short_seg_reads_from_1kto10k_distri.fa
# rocprof --stats -o rocprof_output/long_seg.${SLURM_JOB_ID}.csv ./minimap2 -t 1 --max-chain-skip=2147483647 --gpu-chain /shareddata/umich_folder/data/ONT/hg38.mmi /shareddata/umich_folder/data/ONT/reads_4f452f4a-d82a-4580-981b-32d14b997217.fa
# omnitrace-sample -PTDH -E all -I rocm-smi -I roctracer -I rocprofiler -I roctx -o omni_output -- ./minimap2 -K 2000000000 -t 1 --max-chain-skip=2147483647 --gpu-chain /shareddata/umich_folder/data/ONT/hg38.mmi /shareddata/umich_folder/data/ONT/random_500MBases_200kto300k.fa
omnitrace-sample -PTDH -E all -I rocm-smi -I roctracer -I rocprofiler -I roctx -o omni_output -- ./minimap2 -K 2000000000 -t 1 --max-chain-skip=2147483647 --gpu-chain /shareddata/umich_folder/data/ONT/hg38.mmi /shareddata/umich_folder/data/ONT/random_2GBases_10kto300k.fa
Author

This line runs the program under omnitrace.

Author

xenshinu left a comment

Please follow scripts/aac_omnitrace.slurm to profile the program with omnitrace.

gpu/plscore.cu Outdated
// init the first batch as the size of the grid
curr_long_segid = 0;
}
auto start = clock64();
Author

Use clock64() with runtime prints to see the time taken by each block.

gpu/plscore.cu Outdated
curr_long_segid = 0;
}
auto start = clock64();
unsigned int segid = atomicAdd(&curr_long_segid, 1);
Author

I tried starting from bid and taking the atomic value in the following iterations:

unsigned segid = bid;
while () {
    ...
    segid = atomicAdd(&curr_long_segid, 1);
}

But this gives much worse performance.
I suppose the blocks are not launched in order (i.e. block #0 might be launched after block #100).
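
For reference, the claim-by-atomic-counter pattern under discussion can be modeled on the host as below. This is a single-threaded C++ sketch with hypothetical names (SegQueue, run_blocks); std::atomic stands in for the device-side atomicAdd on curr_long_segid, and the round-robin loop stands in for blocks taking turns. It only shows that every segment is claimed exactly once no matter which block gets there first; it says nothing about performance.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the device-side work counter.
struct SegQueue {
    std::atomic<unsigned> next{0};  // plays the role of curr_long_segid
    unsigned count;                 // plays the role of *long_seg_count

    explicit SegQueue(unsigned n) : count(n) {}

    // Claim the next segment id; returns false once the queue is drained.
    bool claim(unsigned& segid) {
        segid = next.fetch_add(1);
        return segid < count;
    }
};

// Simulate n_blocks blocks taking turns; on each turn a block claims one
// segment. Returns how many times each segment was processed.
std::vector<int> run_blocks(unsigned n_segs, unsigned n_blocks) {
    assert(n_blocks > 0);
    SegQueue queue(n_segs);
    std::vector<int> hits(n_segs, 0);
    std::vector<bool> done(n_blocks, false);
    unsigned active = n_blocks;
    while (active > 0) {
        for (unsigned b = 0; b < n_blocks; ++b) {
            if (done[b]) continue;
            unsigned segid = 0;
            if (queue.claim(segid)) {
                ++hits[segid];   // "process" the claimed segment
            } else {
                done[b] = true;  // nothing left for this block
                --active;
            }
        }
    }
    return hits;
}
```

Because ids are handed out in claim order rather than by block index, the assignment does not depend on the order in which blocks start.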

#endif
unsigned int segid = atomicAdd(&curr_long_segid, 1);
while (segid < *long_seg_count) {
seg_t seg = long_seg[map[segid]]; // sorted
Member

Segfault here if compiled with -ggdb flag when using hipcc.

Comment on lines +834 to 837
// FIXME: temporary solution for microbatching
if (read_start < n_read) {
fprintf(stderr, "[WARNING] Unable to fit reads %d - %d into a microbatch. Fall back to cpu chaining\n", read_start, n_read-1);
}
Member

Temporary fix: sometimes the total number of anchors is less than MICRO_BATCH * batch_size, but the reads cannot be divided into microbatches without cutting a read in two. If one or two reads are left behind because of this, we use the CPU kernel in map.c. In the future, different threads will create different microbatches and will be responsible for making sure this does not happen.
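
A minimal sketch of how reads can be left over, assuming a greedy split that never cuts a read. plan_microbatches and MicrobatchPlan are hypothetical names; the actual splitting logic in this PR may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct MicrobatchPlan {
    std::vector<std::pair<int, int>> ranges;  // [read_start, read_end) per microbatch
    int read_start;                           // first read that did not fit anywhere
};

// Greedily pack whole reads into micro_batch microbatches of at most
// batch_size anchors each. Reads are never split across microbatches.
MicrobatchPlan plan_microbatches(const std::vector<std::size_t>& n_anchors,
                                 std::size_t batch_size, int micro_batch) {
    assert(micro_batch > 0);
    MicrobatchPlan plan{{}, 0};
    const int n_read = static_cast<int>(n_anchors.size());
    for (int m = 0; m < micro_batch; ++m) {
        std::size_t used = 0;
        const int begin = plan.read_start;
        // used never exceeds batch_size, so the subtraction cannot wrap.
        while (plan.read_start < n_read &&
               n_anchors[plan.read_start] <= batch_size - used) {
            used += n_anchors[plan.read_start];
            ++plan.read_start;
        }
        plan.ranges.emplace_back(begin, plan.read_start);
    }
    return plan;  // plan.read_start < n_read means reads are left for the CPU
}
```

With anchor counts {6, 5, 5, 4}, batch_size 10 and 2 microbatches, the total of 20 anchors does not exceed MICRO_BATCH * batch_size, yet the plan is [0,1) and [1,3), and read 3 is left over for the CPU path.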

Comment on lines +1077 to +1082
// FIXME: temporary solution for reads fail to fit in microbatch
// cpu kernel
for (kernel_batch.count; kernel_batch.count<tr->launched_batch.count; kernel_batch.count++) {
fprintf(stderr, "[WARNING] Run CPU kernel for read %d\n", iread);
mm_map_chain(s->p->mi, s->p->opt, &kernel_batch.reads[kernel_batch.count], b, kernel_batch.km);
}
Member

Temporary fix for batches that cannot be divided evenly into microbatches.

Comment on lines +1034 to +1039
// FIXME: temporary solution for reads fail to fit in microbatch
// cpu kernel
for (kernel_batch.count; kernel_batch.count<tr->launched_batch.count; kernel_batch.count++) {
fprintf(stderr, "[WARNING] Run CPU kernel for read %d\n", kernel_batch.count);
mm_map_chain(s->p->mi, s->p->opt, &kernel_batch.reads[kernel_batch.count], b, tr->launched_batch.km);
}
Member

Temporary fix for batches that cannot be evenly divided into microbatches.

Member

joydddd left a comment

Make sure to enable backtracking after performance testing! (in plchain.cu, search for FIXME)

joydddd merged commit ec8a340 into gpu_kernel on Mar 14, 2024
joydddd added a commit that referenced this pull request Mar 18, 2024
Squashed commit of the following:

commit 938a45f
Author: joydddd <joydong@umich.edu>
Date:   Mon Mar 18 13:26:25 2024 -0400

    Add planalyze.cu(h)

commit fee935f
Merge: ec8a340 41062d0
Author: joydddd <joydong@umich.edu>
Date:   Mon Mar 18 13:18:42 2024 -0400

    Merge branch 'gpu_kernel-break' into gpu_kernel

commit 41062d0
Author: joydddd <joydong@umich.edu>
Date:   Mon Mar 18 13:17:15 2024 -0400

    Clean up compile options. Move config to cmd option / gpu_config.json

commit ec8a340
Merge: 31dff82 3218873
Author: Joy Juechu Dong <joydong@umich.edu>
Date:   Wed Mar 13 23:37:55 2024 -0400

    Merge pull request #13 from Minimap2onGPU:gpu_kernel-break

    Add long segment aggregation. Cleaned up code for sysbio submission. TODO: Add README

commit 3218873
Author: joydddd <joydong@umich.edu>
Date:   Wed Mar 13 23:31:16 2024 -0400

    Update print compile time config

commit ad70b6a
Author: joydddd <joydong@umich.edu>
Date:   Wed Mar 13 23:09:51 2024 -0400

    cleanup gpu code for open source. TODO: Add README

commit 4d2459c
Merge: 13b2eab 5c742d9
Author: Joy Dong <joydong@umich.edu>
Date:   Wed Mar 13 19:34:56 2024 -0500

    Merge branch 'gpu_kernel-break' of github.com:Minimap2onGPU/minimap2 into gpu_kernel-break

commit 13b2eab
Author: Joy Dong <joydong@umich.edu>
Date:   Wed Mar 13 19:34:05 2024 -0500

    Add range distribution analysis

commit f7ecde1
Author: Joy Dong <joydong@umich.edu>
Date:   Wed Mar 6 12:14:26 2024 -0600

    Add data analysis script

commit 5c742d9
Author: Joy Dong <joydong@umich.edu>
Date:   Wed Mar 6 12:14:26 2024 -0600

    Add data analysis script

commit 12c3fc0
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Mar 1 22:50:35 2024 -0600

    config aac that maximize memory usage

commit eed2640
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Mar 1 00:24:31 2024 -0600

    comment in kernel print to maximize tp

commit c982bb5
Merge: 94f333a d8ba447
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Feb 28 11:14:20 2024 -0600

    Merge branch 'gpu_kernel-break' of github.com:Minimap2onGPU/minimap2 into gpu_kernel-break

commit 94f333a
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Feb 28 11:14:16 2024 -0600

    update plscore

commit d8ba447
Author: joydddd <joydong@umich.edu>
Date:   Mon Feb 26 12:40:55 2024 -0500

    Remove skip backtracking in GPU implementation. Outputs are correct

commit 6836cb7
Merge: 907748f 3239112
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Feb 23 19:13:51 2024 -0600

    Merge branch 'gpu_kernel-break' of github.com:Minimap2onGPU/minimap2 into gpu_kernel-break

commit 907748f
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Feb 23 19:13:46 2024 -0600

    add seg count

commit 3239112
Author: joydddd <joydong@umich.edu>
Date:   Thu Feb 22 23:26:47 2024 -0500

    Add put long segs back to original reads, but output seems to be wrong??

commit 474e746
Author: joydddd <joydong@umich.edu>
Date:   Thu Feb 22 22:37:27 2024 -0500

    Temporal Fix microbacthing error (Use CPU kernel)

commit 66e2e54
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Thu Feb 22 11:15:52 2024 -0600

    fix atomic add in long seg, only first thread in block add the atomic

commit c40bf9f
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Sat Feb 17 00:38:19 2024 -0600

    fix atomicadd -> atomicsub, TODO: add more cudaCheck

commit 9678399
Author: Joy Dong <joydong@umich.edu>
Date:   Thu Feb 15 23:45:36 2024 -0600

    Fix throughput analysis

commit 39f758b
Author: Joy Dong <joydong@umich.edu>
Date:   Thu Feb 15 22:50:52 2024 -0500

    Edit throughput calculation. JIT Compilat error on cuda, push to try on HIP

commit b21c5f9
Author: Joy Dong <joydong@umich.edu>
Date:   Tue Feb 13 15:46:20 2024 -0500

    Update debug analysis

commit cff2e27
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Feb 9 10:12:52 2024 -0600

    debug info control

commit accea33
Merge: d9e7396 e1248fb
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Feb 9 09:44:56 2024 -0600

    Merge branch 'gpu_kernel-break' of github.com:Minimap2onGPU/minimap2 into gpu_kernel-break

commit d9e7396
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Feb 9 09:44:52 2024 -0600

    add atomic runtime balancing

commit e1248fb
Author: Joy Dong <joydong@umich.edu>
Date:   Thu Feb 8 20:01:02 2024 -0500

    Update throughput calculation, a6000 config

commit 4aeacfd
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Thu Feb 8 12:48:15 2024 -0600

    add sorting technique

commit 8ee89c1
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Feb 6 15:09:15 2024 -0600

    update scripts

commit 216f2b2
Author: Joy Dong <joydong@umich.edu>
Date:   Fri Jan 26 12:01:24 2024 -0600

    Add kernel throughput calculatation

commit 15d0003
Author: Joy Dong <joydong@umich.edu>
Date:   Wed Jan 10 21:11:34 2024 -0600

    change script path

commit aa62643
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Jan 10 14:37:39 2024 -0600

    use hostmalloc to avoid step1 delay

commit 85f1cbb
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Nov 3 19:26:00 2023 -0500

    finish microbatch design, TODO: add batch number to config, and use hostmalloc

commit 1a585ab
Author: Joy Dong <joydong@umich.edu>
Date:   Mon Oct 30 16:23:50 2023 -0500

    Add acc_config. FIX seg fault for long_seg_count reset

commit cb1a30e
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Oct 27 13:56:12 2023 -0500

    no reset long seg on each micro batch cause fault

commit 5e1abe8
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Oct 27 11:34:49 2023 -0500

    finish minibatch, parameter is still hardcoded, debug function need fix, f_long is not copied back

commit 5598718
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Thu Oct 26 09:46:48 2023 -0500

    add omnitrace scripts

commit 31dff82
Author: Joy Juechu Dong <joydong@umich.edu>
Date:   Thu Oct 19 23:01:03 2023 -0400

    Add aggregate long segs (#12)

commit cd225fa
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Thu Oct 12 09:49:48 2023 -0500

    add new api translation, enable analysis print

commit 247a17f
Merge: 1d5c636 1e8781a
Author: joydddd <joydong@umich.edu>
Date:   Tue Oct 10 22:00:31 2023 -0400

    Merge commit '1e8781a' into gpu_kernel

commit 1e8781a
Author: joydddd <joydong@umich.edu>
Date:   Tue Oct 10 21:56:18 2023 -0400

    Fix debug functions

commit 1d5c636
Merge: d334c9e ccddf5c
Author: joydddd <joydong@umich.edu>
Date:   Wed Sep 27 11:22:11 2023 -0400

    Delete gpu/.depend

commit d334c9e
Author: Joy Juechu Dong <joydong@umich.edu>
Date:   Tue Sep 26 15:30:12 2023 -0400

    merge with aac code including CUDA timer (#9)

commit ccddf5c
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Sep 26 16:55:27 2023 -0500

    add sample slurm

commit 7b65403
Merge: cabe321 6897137
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Sep 26 16:41:38 2023 -0500

    Merge branch 'gpu_kernel' of github.com:Minimap2onGPU/minimap2 into gpu_kernel

commit cabe321
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Sep 26 16:39:58 2023 -0500

    add profiling scripts, please check scripts/*.slurm

commit 6897137
Author: Joy Juechu Dong <joydong@umich.edu>
Date:   Tue Sep 26 15:30:12 2023 -0400

    merge with aac code including CUDA timer (#9)

commit e144d2d
Merge: 4044b9f 1d5d7ab
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Sep 5 19:50:37 2023 -0400

    merge with aac code including CUDA timer

commit 4044b9f
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Sep 5 19:47:31 2023 -0400

    remove const keyword

commit 1d5d7ab
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Sep 5 17:52:39 2023 -0500

    add timer by event recorder

commit 10e7197
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Jul 12 10:49:27 2023 -0400

    update parseing profile scripts

commit ae3ecd3
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Jul 12 10:46:53 2023 -0400

    add profile scripts

commit cff2019
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Jul 12 10:46:41 2023 -0400

    add profile scripts

commit 26a4044
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Jul 12 10:45:37 2023 -0400

    add profile scripts

commit ced508a
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Jul 11 22:47:19 2023 -0400

    add cudacheck and uncomment syncthreads

commit a01dd39
Merge: ea9c438 10a5a8b
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Jul 11 16:03:45 2023 -0400

    merge with independent mid kernel

commit ea9c438
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Jul 11 15:19:51 2023 -0400

    add global config

commit 10a5a8b
Merge: 7018ece 18bc4ac
Author: joydddd <joydong@umich.edu>
Date:   Wed Jul 5 11:06:17 2023 -0400

    Merge branch 'gpu_kernel' of github.com:Minimap2onGPU/minimap2 into gpu_kernel

commit 7018ece
Author: joydddd <joydong@umich.edu>
Date:   Wed Jul 5 10:58:35 2023 -0400

    Add anchor compression

commit 18bc4ac
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Jun 28 19:21:06 2023 -0400

    merge makefile with nvcc version

commit 5a12d5b
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Jun 28 19:19:25 2023 -0400

    add template to mid kernel launch

commit 4318078
Author: joydddd <joydong@umich.edu>
Date:   Wed Jun 28 15:28:53 2023 -0400

    Add num segs printout

commit 5906e29
Merge: 9149e17 e6f016c
Author: joydddd <joydong@umich.edu>
Date:   Wed Jun 28 13:17:41 2023 -0400

    Merge branch 'gpu_kernel' of github.com:Minimap2onGPU/minimap2 into gpu_kernel

commit 9149e17
Author: joydddd <joydong@umich.edu>
Date:   Wed Jun 28 13:11:40 2023 -0400

    Add short-mid-long kernel

commit e6f016c
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Sun Jun 25 12:21:50 2023 -0400

    add compile time launch bound using template

commit c601518
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Sun Jun 25 12:15:49 2023 -0400

    add compile time launch bound

commit be01974
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Sat Jun 10 16:16:23 2023 -0400

    disable new offset feature

commit 2d46499
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Sat Jun 10 16:05:53 2023 -0400

    merge with cuda
joydddd added a commit that referenced this pull request Mar 19, 2024
* Merge gpu-kernel. Squashed commit of the following:

* Revert "Merge gpu-kernel. Squashed commit of the following:"

This reverts commit 481f3a5.

* Merge gpu_kernel
Squashed commit of the following:

Date:   Wed Feb 28 11:14:16 2024 -0600

    update plscore

commit d8ba447
Author: joydddd <joydong@umich.edu>
Date:   Mon Feb 26 12:40:55 2024 -0500

    Remove skip backtracking in GPU implementation. Outputs are correct

commit 6836cb7
Merge: 907748f 3239112
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Feb 23 19:13:51 2024 -0600

    Merge branch 'gpu_kernel-break' of github.com:Minimap2onGPU/minimap2 into gpu_kernel-break

commit 907748f
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Feb 23 19:13:46 2024 -0600

    add seg count

commit 3239112
Author: joydddd <joydong@umich.edu>
Date:   Thu Feb 22 23:26:47 2024 -0500

    Add put long segs back to original reads, but output seems to be wrong??

commit 474e746
Author: joydddd <joydong@umich.edu>
Date:   Thu Feb 22 22:37:27 2024 -0500

    Temporal Fix microbacthing error (Use CPU kernel)

commit 66e2e54
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Thu Feb 22 11:15:52 2024 -0600

    fix atomic add in long seg, only first thread in block add the atomic

commit c40bf9f
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Sat Feb 17 00:38:19 2024 -0600

    fix atomicadd -> atomicsub, TODO: add more cudaCheck

commit 9678399
Author: Joy Dong <joydong@umich.edu>
Date:   Thu Feb 15 23:45:36 2024 -0600

    Fix throughput analysis

commit 39f758b
Author: Joy Dong <joydong@umich.edu>
Date:   Thu Feb 15 22:50:52 2024 -0500

    Edit throughput calculation. JIT Compilat error on cuda, push to try on HIP

commit b21c5f9
Author: Joy Dong <joydong@umich.edu>
Date:   Tue Feb 13 15:46:20 2024 -0500

    Update debug analysis

commit cff2e27
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Feb 9 10:12:52 2024 -0600

    debug info control

commit accea33
Merge: d9e7396 e1248fb
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Feb 9 09:44:56 2024 -0600

    Merge branch 'gpu_kernel-break' of github.com:Minimap2onGPU/minimap2 into gpu_kernel-break

commit d9e7396
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Feb 9 09:44:52 2024 -0600

    add atomic runtime balancing

commit e1248fb
Author: Joy Dong <joydong@umich.edu>
Date:   Thu Feb 8 20:01:02 2024 -0500

    Update throughput calculation, a6000 config

commit 4aeacfd
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Thu Feb 8 12:48:15 2024 -0600

    add sorting technique

commit 8ee89c1
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Feb 6 15:09:15 2024 -0600

    update scripts

commit 216f2b2
Author: Joy Dong <joydong@umich.edu>
Date:   Fri Jan 26 12:01:24 2024 -0600

    Add kernel throughput calculatation

commit 15d0003
Author: Joy Dong <joydong@umich.edu>
Date:   Wed Jan 10 21:11:34 2024 -0600

    change script path

commit aa62643
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Jan 10 14:37:39 2024 -0600

    use hostmalloc to avoid step1 delay

commit 85f1cbb
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Nov 3 19:26:00 2023 -0500

    finish microbatch design, TODO: add batch number to config, and use hostmalloc

commit 1a585ab
Author: Joy Dong <joydong@umich.edu>
Date:   Mon Oct 30 16:23:50 2023 -0500

    Add acc_config. FIX seg fault for long_seg_count reset

commit cb1a30e
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Oct 27 13:56:12 2023 -0500

    no reset long seg on each micro batch cause fault

commit 5e1abe8
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Fri Oct 27 11:34:49 2023 -0500

    finish minibatch, parameter is still hardcoded, debug function need fix, f_long is not copied back

commit 5598718
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Thu Oct 26 09:46:48 2023 -0500

    add omnitrace scripts

commit 31dff82
Author: Joy Juechu Dong <joydong@umich.edu>
Date:   Thu Oct 19 23:01:03 2023 -0400

    Add aggregate long segs (#12)

commit cd225fa
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Thu Oct 12 09:49:48 2023 -0500

    add new api translation, enable analysis print

commit 247a17f
Merge: 1d5c636 1e8781a
Author: joydddd <joydong@umich.edu>
Date:   Tue Oct 10 22:00:31 2023 -0400

    Merge commit '1e8781a' into gpu_kernel

commit 1e8781a
Author: joydddd <joydong@umich.edu>
Date:   Tue Oct 10 21:56:18 2023 -0400

    Fix debug functions

commit 1d5c636
Merge: d334c9e ccddf5c
Author: joydddd <joydong@umich.edu>
Date:   Wed Sep 27 11:22:11 2023 -0400

    Delete gpu/.depend

commit d334c9e
Author: Joy Juechu Dong <joydong@umich.edu>
Date:   Tue Sep 26 15:30:12 2023 -0400

    merge with aac code including CUDA timer (#9)

commit ccddf5c
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Sep 26 16:55:27 2023 -0500

    add sample slurm

commit 7b65403
Merge: cabe321 6897137
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Sep 26 16:41:38 2023 -0500

    Merge branch 'gpu_kernel' of github.com:Minimap2onGPU/minimap2 into gpu_kernel

commit cabe321
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Sep 26 16:39:58 2023 -0500

    add profiling scripts, please check scripts/*.slurm

commit 6897137
Author: Joy Juechu Dong <joydong@umich.edu>
Date:   Tue Sep 26 15:30:12 2023 -0400

    merge with aac code including CUDA timer (#9)

commit e144d2d
Merge: 4044b9f 1d5d7ab
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Sep 5 19:50:37 2023 -0400

    merge with aac code including CUDA timer

commit 4044b9f
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Sep 5 19:47:31 2023 -0400

    remove const keyword

commit 1d5d7ab
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Sep 5 17:52:39 2023 -0500

    add timer by event recorder

commit 10e7197
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Jul 12 10:49:27 2023 -0400

    update parseing profile scripts

commit ae3ecd3
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Jul 12 10:46:53 2023 -0400

    add profile scripts

commit cff2019
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Jul 12 10:46:41 2023 -0400

    add profile scripts

commit 26a4044
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Jul 12 10:45:37 2023 -0400

    add profile scripts

commit ced508a
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Jul 11 22:47:19 2023 -0400

    add cudacheck and uncomment syncthreads

commit a01dd39
Merge: ea9c438 10a5a8b
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Jul 11 16:03:45 2023 -0400

    merge with independent mid kernel

commit ea9c438
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Tue Jul 11 15:19:51 2023 -0400

    add global config

commit 10a5a8b
Merge: 7018ece 18bc4ac
Author: joydddd <joydong@umich.edu>
Date:   Wed Jul 5 11:06:17 2023 -0400

    Merge branch 'gpu_kernel' of github.com:Minimap2onGPU/minimap2 into gpu_kernel

commit 7018ece
Author: joydddd <joydong@umich.edu>
Date:   Wed Jul 5 10:58:35 2023 -0400

    Add anchor compression

commit 18bc4ac
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Jun 28 19:21:06 2023 -0400

    merge makefile with nvcc version

commit 5a12d5b
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Wed Jun 28 19:19:25 2023 -0400

    add template to mid kernel launch

commit 4318078
Author: joydddd <joydong@umich.edu>
Date:   Wed Jun 28 15:28:53 2023 -0400

    Add num segs printout

commit 5906e29
Merge: 9149e17 e6f016c
Author: joydddd <joydong@umich.edu>
Date:   Wed Jun 28 13:17:41 2023 -0400

    Merge branch 'gpu_kernel' of github.com:Minimap2onGPU/minimap2 into gpu_kernel

commit 9149e17
Author: joydddd <joydong@umich.edu>
Date:   Wed Jun 28 13:11:40 2023 -0400

    Add short-mid-long kernel

commit e6f016c
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Sun Jun 25 12:21:50 2023 -0400

    add compile time launch bound using template

commit c601518
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Sun Jun 25 12:15:49 2023 -0400

    add compile time launch bound

commit be01974
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Sat Jun 10 16:16:23 2023 -0400

    disable new offset feature

commit 2d46499
Author: Xueshen Liu <liuxs@umich.edu>
Date:   Sat Jun 10 16:05:53 2023 -0400

    merge with cuda

* FIXME: Warp Illegal Address at __shfl_sync()

* FIXME: seg start_idx end_idx out of range

* Fix cuda score generation kernel

* Fix merge error