DiweiSun committed May 30, 2024
2 parents 70c1561 + 592aabf commit 5e91ea9
Showing 77 changed files with 3,599 additions and 781 deletions.
31 changes: 31 additions & 0 deletions .lintrunner.toml
@@ -0,0 +1,31 @@
[[linter]]
code = 'FLAKE8'
include_patterns = ['*.py']
exclude_patterns = [
'.git/**',
]
command = [
'python3',
'scripts/tools/setup/flake8.py',
'--',
'@{{PATHSFILE}}'
]

init_command = [
'python',
'-m',
'lintrunner_adapters',
'run',
'pip_init',
'--dry-run={{DRYRUN}}',
'flake8==3.8.2',
'flake8-bugbear==20.1.4',
'flake8-comprehensions==3.3.0',
'flake8-executable==2.0.4',
# 'git+https://github.com/malfet/flake8-coding.git',
'flake8-pyi==20.5.0',
'mccabe==0.6.1',
'pycodestyle==2.6.0',
'pyflakes==2.2.0',
'black==24.3.0',
]
4 changes: 3 additions & 1 deletion CONTRIBUTING.md
@@ -180,7 +180,9 @@ For example, if you wanted to run the test `MayContainAlias`, which is part of t
### Python Code
Python code style utilities are in the `scripts/tools/setup` folder. Please install the related Python module dependencies:
```bash
pip install -r scripts/tools/setup/requirements-flake8.txt
pip install lintrunner
pip install lintrunner-adapters
lintrunner init
```
Please run flake8.py to auto-format Python code and check the Python code style. The script will report issues; please modify the code manually according to the output until it passes.
31 changes: 21 additions & 10 deletions README.md
@@ -5,7 +5,7 @@ Intel® Extension for PyTorch\*

</div>

**CPU** [💻main branch](https://github.com/intel/intel-extension-for-pytorch/tree/main)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[🌱Quick Start](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/getting_started.html)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[📖Documentations](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[🏃Installation](https://intel.github.io/intel-extension-for-pytorch/index.html#installation?platform=cpu&version=v2.2.0%2Bcpu)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[💻LLM Example](https://github.com/intel/intel-extension-for-pytorch/tree/main/examples/cpu/inference/python/llm) <br>
**CPU** [💻main branch](https://github.com/intel/intel-extension-for-pytorch/tree/main)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[🌱Quick Start](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/getting_started.html)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[📖Documentations](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[🏃Installation](https://intel.github.io/intel-extension-for-pytorch/index.html#installation?platform=cpu&version=v2.3.0%2Bcpu)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[💻LLM Example](https://github.com/intel/intel-extension-for-pytorch/tree/main/examples/cpu/inference/python/llm) <br>
**GPU** [💻main branch](https://github.com/intel/intel-extension-for-pytorch/tree/xpu-main)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[🌱Quick Start](https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/getting_started.html)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[📖Documentations](https://intel.github.io/intel-extension-for-pytorch/xpu/latest/)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[🏃Installation](https://intel.github.io/intel-extension-for-pytorch/index.html#installation?platform=gpu)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[💻LLM Example](https://github.com/intel/intel-extension-for-pytorch/tree/xpu-main/examples/gpu/inference/python/llm)<br>

Intel® Extension for PyTorch\* extends PyTorch\* with up-to-date features and optimizations for an extra performance boost on Intel hardware. Optimizations take advantage of Intel® Advanced Vector Extensions 512 (Intel® AVX-512) Vector Neural Network Instructions (VNNI) and Intel® Advanced Matrix Extensions (Intel® AMX) on Intel CPUs as well as Intel X<sup>e</sup> Matrix Extensions (XMX) AI engines on Intel discrete GPUs. Moreover, Intel® Extension for PyTorch* provides easy GPU acceleration for Intel discrete GPUs through the PyTorch* xpu device.
@@ -19,28 +19,35 @@ In the current technological landscape, Generative AI (GenAI) workloads and mode
| MODEL FAMILY | MODEL NAME (Huggingface hub) | FP32 | BF16 | Static quantization INT8 | Weight only quantization INT8 | Weight only quantization INT4 |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
|LLAMA| meta-llama/Llama-2-7b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟨 |
|LLAMA| meta-llama/Llama-2-13b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟨 |
|LLAMA| meta-llama/Llama-2-70b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟨 |
|LLAMA| meta-llama/Llama-2-13b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
|LLAMA| meta-llama/Llama-2-70b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
|LLAMA| meta-llama/Meta-Llama-3-8B | 🟩 | 🟩 | 🟨 | 🟩 | |
|LLAMA| meta-llama/Meta-Llama-3-70B | 🟩 | 🟩 | 🟨 | 🟩 | 🟨 |
|GPT-J| EleutherAI/gpt-j-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
|GPT-NEOX| EleutherAI/gpt-neox-20b | 🟩 | 🟨 | 🟨 | 🟩 | 🟨 |
|DOLLY| databricks/dolly-v2-12b | 🟩 | 🟨 | 🟨 | 🟩 | 🟨 |
|FALCON| tiiuae/falcon-7b | 🟩 | 🟩 | 🟩 | 🟩 | |
|FALCON| tiiuae/falcon-40b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
|OPT| facebook/opt-30b | 🟩 | 🟩 | 🟩 | 🟩 | 🟨 |
|OPT| facebook/opt-1.3b | 🟩 | 🟩 | 🟩 | 🟩 | 🟨 |
|Bloom| bigscience/bloom-1b7 | 🟩 | 🟨 | 🟩 | 🟩 | 🟨 |
|CodeGen| Salesforce/codegen-2B-multi | 🟩 | 🟩 | 🟨 | 🟩 | 🟩 |
|CodeGen| Salesforce/codegen-2B-multi | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
|Baichuan| baichuan-inc/Baichuan2-7B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | |
|Baichuan| baichuan-inc/Baichuan2-13B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | |
|Baichuan| baichuan-inc/Baichuan2-13B-Chat | 🟩 | 🟩 | 🟨 | 🟩 | |
|Baichuan| baichuan-inc/Baichuan-13B-Chat | 🟩 | 🟨 | 🟩 | 🟩 | |
|ChatGLM| THUDM/chatglm3-6b | 🟩 | 🟩 | 🟨 | 🟩 | |
|ChatGLM| THUDM/chatglm2-6b | 🟩 | 🟩 | 🟨 | 🟩 | |
|GPTBigCode| bigcode/starcoder | 🟩 | 🟩 | 🟨 | 🟩 | 🟨 |
|T5| google/flan-t5-xl | 🟩 | 🟩 | 🟨 | 🟩 | |
|T5| google/flan-t5-xl | 🟩 | 🟩 | | 🟩 | |
|MPT| mosaicml/mpt-7b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
|Mistral| mistralai/Mistral-7B-v0.1 | 🟩 | 🟩 | 🟨 | 🟩 | 🟨 |
|MPT| mosaicml/mpt-7b | 🟩 | 🟩 | 🟨 | 🟩 | 🟩 |
|Mixtral| mistralai/Mixtral-8x7B-v0.1 | 🟩 | 🟩 | | 🟩 | |
|Stablelm| stabilityai/stablelm-2-1_6b | 🟩 | 🟩 | | 🟨 | |
|Qwen| Qwen/Qwen-7B-Chat | 🟩 | 🟩 | | 🟩 | |
|Mixtral| mistralai/Mixtral-8x7B-v0.1 | 🟩 | 🟩 | | 🟩 | 🟨 |
|Stablelm| stabilityai/stablelm-2-1_6b | 🟩 | 🟩 | 🟨 | 🟩 | 🟨 |
|Qwen| Qwen/Qwen-7B-Chat | 🟩 | 🟩 | 🟨 | 🟩 | |
|LLaVA| liuhaotian/llava-v1.5-7b | 🟩 | 🟩 | | 🟩 | |
|GIT| microsoft/git-base | 🟩 | 🟩 | | 🟩 | |
|Yuan| IEITYuan/Yuan2-102B-hf | 🟩 | 🟩 | | 🟨 | |
|Phi| microsoft/phi-2 | 🟩 | 🟩 | 🟩 | 🟩 | 🟨 |

- 🟩 signifies that the model performs well with good accuracy (<1% difference compared with FP32).

@@ -49,6 +56,10 @@ In the current technological landscape, Generative AI (GenAI) workloads and mode
*Note*: The verified models above (including other models in the same model family, like "codellama/CodeLlama-7b-hf" from the LLAMA family) are well supported with all optimizations like indirect access KV cache, fused ROPE, and prepacked TPP Linear (fp32/bf16).
We are actively working to better support the models in the table with various data types. In addition, more models will be optimized in the future.
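
For illustration, enabling one of the verified models typically looks like the following minimal sketch, assuming the `ipex.llm.optimize` API used by the LLM examples linked above (model choice, prompt, and generation arguments are illustrative only):

```python
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForCausalLM, AutoTokenizer

# A 🟩 model from the table above, loaded in bf16.
model_id = "EleutherAI/gpt-j-6b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

# Apply the LLM-specific optimizations (indirect access KV cache,
# fused ROPE, prepacked TPP Linear) for the chosen dtype.
model = ipex.llm.optimize(model, dtype=torch.bfloat16, inplace=True)

with torch.no_grad(), torch.cpu.amp.autocast(dtype=torch.bfloat16):
    inputs = tokenizer("What is AI?", return_tensors="pt")
    output = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```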

In addition, since release 2.3.0 Intel® Extension for PyTorch* provides module-level optimization APIs (prototype feature).
The feature provides optimized alternatives for several commonly used LLM modules and functionalities, so that niche or customized LLMs can be optimized as well; a sketch of the idea follows below.
Please read [**LLM module level optimization practice**](./examples/cpu/inference/python/llm-modeling) to better understand how to optimize your own LLM and achieve better performance.
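
As a rough sketch of that idea, a hand-written norm in a custom decoder could be swapped for an optimized module along these lines (`ipex.llm.modules.RMSNorm` and its signature are assumptions based on the practice guide above; the decoder fragment and shapes are hypothetical):

```python
import torch
import intel_extension_for_pytorch as ipex

class MyDecoderNorm(torch.nn.Module):
    """Hypothetical fragment of a custom LLM decoder layer."""

    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        # Optimized drop-in alternative to a hand-written RMSNorm
        # (module name and signature assumed from the practice guide).
        self.norm = ipex.llm.modules.RMSNorm(hidden_size, eps=eps)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return self.norm(hidden_states)

layer = MyDecoderNorm(hidden_size=4096)
out = layer(torch.randn(1, 32, 4096))
```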

## Support

The team tracks bugs and enhancement requests using [GitHub issues](https://github.com/intel/intel-extension-for-pytorch/issues/). Before submitting a suggestion or bug report, search the existing GitHub issues to see if your issue has already been reported.
8 changes: 0 additions & 8 deletions csrc/cpu/CMakeLists.txt
@@ -251,14 +251,6 @@ if(BUILD_STRIPPED_BIN)
set_target_properties(${PLUGIN_NAME_CPU} PROPERTIES LINK_FLAGS_RELEASE -s)
endif()

find_package(PythonLibs)
if(${PYTHONLIBS_FOUND})
target_link_libraries(${PLUGIN_NAME_CPU} PUBLIC ${PYTHON_LIBRARIES})
endif()

find_library(TORCH_PYTHON_LIBRARY torch_python PATH "${TORCH_INSTALL_PREFIX}/lib")
target_link_libraries(${PLUGIN_NAME_CPU} PRIVATE ${TORCH_LIBRARIES} ${TORCH_PYTHON_LIBRARY})

install(TARGETS ${PLUGIN_NAME_CPU}
ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR}
LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
29 changes: 28 additions & 1 deletion csrc/cpu/aten/AddLayerNorm.cpp
@@ -4,7 +4,7 @@
// https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/layer_norm.cpp

#include "AddLayerNorm.h"

#include <torch/all.h>
#include <torch/csrc/autograd/function.h>

namespace torch_ipex {
@@ -57,5 +57,32 @@ at::Tensor dil_add_layernorm(
return at::layer_norm(add_res, normalized_shape, weight_opt, bias_opt, eps);
}
}

// register as a python op
at::Tensor add_layernorm(
const at::Tensor& a,
const at::Tensor& b,
int64_t alpha,
at::IntArrayRef normalized_shape,
const c10::optional<at::Tensor>& weight_opt,
const c10::optional<at::Tensor>& bias_opt,
double eps) {
RECORD_FUNCTION("add_layernorm", c10::ArrayRef<c10::IValue>({}));
return dil_add_layernorm(
a, b, alpha, normalized_shape, weight_opt, bias_opt, eps, false);
}

} // namespace cpu
} // namespace torch_ipex

namespace {

TORCH_LIBRARY_FRAGMENT(torch_ipex, m) {
m.def(
"add_layernorm(Tensor a, Tensor b, int alpha, int[] normalized_shape, Tensor ? weight_opt, \
Tensor ? bias_opt, float eps) -> Tensor");
m.impl(
"add_layernorm", c10::DispatchKey::CPU, torch_ipex::cpu::add_layernorm);
}

} // namespace
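
With this fragment registered, the fused op becomes callable from Python through `torch.ops` once the extension is imported. A minimal sketch, assuming the fused op matches the eager composition `layer_norm(a + alpha * b)` implied by the fallback path in `dil_add_layernorm` (shapes are illustrative):

```python
import torch
import intel_extension_for_pytorch  # noqa: F401  (loads the torch_ipex ops)

a = torch.randn(4, 64)
b = torch.randn(4, 64)
weight = torch.ones(64)
bias = torch.zeros(64)

# Call the fused op through the schema registered above.
out = torch.ops.torch_ipex.add_layernorm(a, b, 1, [64], weight, bias, 1e-5)

# Reference eager computation: layer_norm(a + alpha * b) with alpha = 1.
ref = torch.nn.functional.layer_norm(a + b, [64], weight, bias, 1e-5)
print(torch.allclose(out, ref, atol=1e-4))
```
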
10 changes: 10 additions & 0 deletions csrc/cpu/aten/AddLayerNorm.h
@@ -81,6 +81,16 @@ at::Tensor dil_add_layernorm(
float eps,
bool cuda_enable);

// register as a python op
at::Tensor add_layernorm(
const at::Tensor& a,
const at::Tensor& b,
int64_t alpha,
at::IntArrayRef normalized_shape,
const c10::optional<at::Tensor>& weight_opt,
const c10::optional<at::Tensor>& bias_opt,
double eps);

namespace {

at::Tensor add_layer_norm_kernel_impl(