[Op] Add attention and bias_gelu ops #41

Merged: comaniac merged 7 commits into awslabs:main from op on Feb 7, 2023

Conversation

comaniac (Contributor) commented on Feb 6, 2023

Description

This is a prerequisite PR for adding HF GPT-2 schedule.

  • Implement attention ops that use flash-attention and xformers (hedged sketches below).
  • Implement bias_gelu ops that use TorchScript or the torch compiler.
  • Change the GPT-Neo schedule to use these ops, so the GPT-Neo schedule no longer depends on epoi. We will later update the schedules of the other example models accordingly.
  • [Test] Add -rxXs so that pytest prints the reasons for skipped tests.
  • [Docker] Update the flash-attention commit hash, which improves kernel performance by ~12%. The CI image is not updated because this change does not affect functionality.
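
For reference, here are minimal sketches of what such ops could look like. They are illustrative assumptions (the function names, tensor layouts, and backend dispatch are not taken from the PR diff), not the code added in this PR.

A hypothetical attention op backed by xformers' memory-efficient attention:

```python
import torch
from xformers.ops import memory_efficient_attention

def attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
              dropout_p: float = 0.0) -> torch.Tensor:
    # Assumed layout: q, k, v are [batch, seq_len, num_heads, head_dim].
    # A real op could dispatch to flash-attention instead when it is available.
    return memory_efficient_attention(q, k, v, p=dropout_p)
```

A hypothetical fused bias_gelu op compiled with TorchScript:

```python
import torch
import torch.nn.functional as F

@torch.jit.script
def bias_gelu(bias: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    # Fuse the bias add and the GELU activation into one scripted function;
    # torch.compile could be used here instead of TorchScript.
    return F.gelu(x + bias)
```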

Checklist

  • PR's title starts with a category (e.g. [Bugfix], [Model], [Tutorial], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

@szhengac @chhzh123

examples/gpt/schedule.py (review thread resolved)
comaniac merged commit d2dbaeb into awslabs:main on Feb 7, 2023
comaniac (Contributor, Author) commented on Feb 7, 2023

Thanks @chhzh123

comaniac deleted the op branch on February 7, 2023 at 20:58