Flash v2 #137

Draft: hypnopump wants to merge 92 commits into main from hypnopump/flashattn2

Conversation

hypnopump

Uses Flash Attention v2. Works for any sequence length.

  • Flash Attention v2 from PyTorch (torch.nn.functional.scaled_dot_product_attention) for masked attention; see the sketch below.
  • Custom Flash Attention v2 kernel for biased attention (Triton-based; block sizes are currently chosen by the autotuner, which may need improvements to avoid recompilation).
    • TF32 acceleration is disabled by default; we might need to check the compute architecture and enable it selectively.
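
For reference, a minimal sketch of the masked-attention path through PyTorch's fused attention; the shapes, dtypes, and boolean-mask convention below are illustrative assumptions, not the exact module code:

import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Illustrative shapes (assumed): batch, heads, sequence length, head dimension.
b, h, n, d = 2, 8, 256, 32
q = torch.randn(b, h, n, d, device=device, dtype=dtype)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Boolean mask, broadcast over heads: True = attend, False = masked out (assumed convention).
mask = torch.ones(b, 1, n, n, dtype=torch.bool, device=device)

# PyTorch dispatches to a fused kernel (Flash Attention v2 or memory-efficient attention)
# when dtype, mask, and hardware allow it, and falls back to the math path otherwise.
o = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
print(o.shape)  # torch.Size([2, 8, 256, 32])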

WARNING! Please install the custom fork of Trident provided below as a zip file. You might need to install triton-nightly afterwards (and possibly nvtx) to get it working (see the Triton repo for instructions). Verify the install in IPython with `import trident as td`.

trident.zip
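
As a quick sanity check after installing the fork (the exact pip commands depend on your environment, so they are omitted here), the import below should succeed without errors:

# Minimal smoke test for the Trident install; run inside IPython or a plain Python shell.
import trident as td

print(td.__name__)  # prints "trident" if the fork is importable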

ZiyaoLi and others added 30 commits August 1, 2022 14:35
* readme

* add contributing and dco
* rename scripts

* fix bug

* refine scripts

* fix renaming bug

* keep 10 ckps

* dump outputs

* readme

* hierarchy

* title

* introduction

* rm converter readme

* intro

* log interval

* README

* intro
…rp#7)

* add the convert script and alphafold original configs

* add the convert script for alphafold and modify the README

* add the convert script for alphafold

* merge from main
* add script for benchmarking

* code clean

* add benchmark in memory cost

* remove use_lma

* add option for LMA
* add evaluation results

* par

* compress img

* figure layout

* white bg

* png -> jpg

* Revert "figure layout"

* revert to green
* Update README.md

* Update README.md
* optional msa col attn mask

* chunk attention

* fix chunk_size

* code clean

Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
* add the colab version of Uni-Fold

* add the description of Uni-Fold Colab at README
BaozCWJ and others added 25 commits September 16, 2022 10:46
* refine notebook

* rephrase

* rephrase

* rephrase

* rephrase

* rm output

* rephrase

* rephrase

Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
change default model name to avoid confusion
* fix uf-symmetry wget url

* fix url
* do not download af2 params

* do not download af params
* refactor the notebook

* change branch

* add init

* fix auto chunk size

* optim plot & fix pae

* fix save file bug

* merge cell

* finalize
* fix wrong version

* change from google drive to github release
* Update unifold.ipynb

* Update unifold.ipynb

* Update unifold.ipynb
* update get assembly from mmcif

* add comment
* add dataset download via Volcengine

* rephrase

---------

Co-authored-by: Ziyao Li <36321246+ZiyaoLi@users.noreply.github.com>
* add symmetry utils to process input structures

* fix import
…-corp#119)

* tolerate inconsistencies between pdb_assembly and multi_label

* Update dataset.py
@@ -2,15 +2,16 @@
[ -z "${n_gpu}" ] && n_gpu=$(nvidia-smi -L | wc -l)
export NCCL_ASYNC_ERROR_HANDLING=1
export OMP_NUM_THREADS=1
mkdir -p $1
mkdir -p temp
Author (hypnopump):

Once this proposal is validated, the modifications to this file can be dropped; they were just for easy testing of the code.

k = k.view(*k.shape[:-1], self.num_heads, self.head_dim).transpose(-2, -3).contiguous()
v = v.view(*v.shape[:-1], self.num_heads, self.head_dim).transpose(-2, -3)
# (b, n, h, i, dim_head), (b, n, h, j, dim_head) -> (b, n, h, i, dim_head)
o = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask)
Member:

How does its speed/memory compare with the version that uses softmax_dropout?
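
For reference, a rough way to compare peak memory on a CUDA device; the naive explicit-softmax attention below is only a stand-in for the softmax_dropout path, and the shapes are illustrative assumptions:

import torch
import torch.nn.functional as F

def naive_attention(q, k, v, mask):
    # Explicit-softmax attention used as a stand-in for the softmax_dropout baseline (assumption).
    logits = torch.matmul(q, k.transpose(-1, -2)) / q.shape[-1] ** 0.5
    logits = logits.masked_fill(~mask, float("-inf"))
    return torch.matmul(torch.softmax(logits, dim=-1), v)

def peak_mem_mib(fn, *args):
    # Reports peak allocated CUDA memory for one forward call, in MiB.
    torch.cuda.reset_peak_memory_stats()
    fn(*args)
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() / 2**20

b, h, n, d = 1, 8, 1024, 32
q, k, v = (torch.randn(b, h, n, d, device="cuda", dtype=torch.float16) for _ in range(3))
mask = torch.ones(b, 1, n, n, dtype=torch.bool, device="cuda")

print("fused SDPA:", peak_mem_mib(F.scaled_dot_product_attention, q, k, v, mask), "MiB")
print("naive     :", peak_mem_mib(naive_attention, q, k, v, mask), "MiB")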

hypnopump force-pushed the hypnopump/flashattn2 branch from 27d1b4e to 189443e on December 21, 2023 at 22:08.