MultiheadAttention module resource collection #2

Closed · 14 tasks
ccssu opened this issue Apr 4, 2023 · 0 comments
ccssu (Collaborator) commented Apr 4, 2023

Summary

AttributeError: module 'oneflow.nn' has no attribute 'MultiheadAttention'

The MultiheadAttention module currently needs quite a few supporting ops to be implemented, so the plan is to work around it on the Python side first.
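As a stop-gap, the attention computation can be composed from existing OneFlow primitives. Below is a minimal sketch, assuming oneflow.nn.Linear, oneflow.matmul and oneflow.softmax behave like their PyTorch counterparts; the class name NaiveMultiheadAttention is ours for illustration and not part of any API.

```python
import math
import oneflow as flow
import oneflow.nn as nn

class NaiveMultiheadAttention(nn.Module):
    """Stop-gap multi-head self-attention built from existing OneFlow ops
    (batch_first layout), until oneflow.nn.MultiheadAttention lands."""

    def __init__(self, embed_dim, num_heads):
        super().__init__()
        assert embed_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.qkv_proj = nn.Linear(embed_dim, 3 * embed_dim)  # packed q/k/v projection
        self.out_proj = nn.Linear(embed_dim, embed_dim)

    def forward(self, x):
        # x: (batch_size, seq_len, embed_dim)
        b, s, e = x.shape
        qkv = self.qkv_proj(x)
        q, k, v = qkv[:, :, :e], qkv[:, :, e:2 * e], qkv[:, :, 2 * e:]
        # split into heads: (batch_size, num_heads, seq_len, head_dim)
        q = q.reshape(b, s, self.num_heads, self.head_dim).permute(0, 2, 1, 3)
        k = k.reshape(b, s, self.num_heads, self.head_dim).permute(0, 2, 1, 3)
        v = v.reshape(b, s, self.num_heads, self.head_dim).permute(0, 2, 1, 3)
        # scaled dot-product attention per head
        scores = flow.matmul(q, k.permute(0, 1, 3, 2)) / math.sqrt(self.head_dim)
        attn = flow.softmax(scores, dim=-1)
        out = flow.matmul(attn, v)                       # (b, h, s, head_dim)
        out = out.permute(0, 2, 1, 3).reshape(b, s, e)   # merge heads
        return self.out_proj(out)
```

Quick shape check: `NaiveMultiheadAttention(256, 8)(flow.randn(4, 10, 256)).shape` should give `(4, 10, 256)`.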

Introduction to MultiheadAttention

MultiheadAttention is a PyTorch module that implements the multi-head attention mechanism used in transformer architectures¹. With batch_first=True it takes inputs of shape (batch_size, seq_len, hidden_dim) and returns an output tensor of the same shape; the default layout is (seq_len, batch_size, hidden_dim).

The multi-head attention mechanism computes attention scores between different positions of the input sequence. It does this by running several attention heads in parallel, each with its own set of projection parameters¹.

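For reference, a short example of the PyTorch module that the OneFlow version should mirror; note that the (batch_size, seq_len, embed_dim) layout quoted above requires batch_first=True.

```python
import torch
import torch.nn as nn

# batch_first=True gives the (batch_size, seq_len, embed_dim) layout described above;
# with the default batch_first=False the expected layout is (seq_len, batch_size, embed_dim).
mha = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)

x = torch.randn(4, 10, 256)        # (batch_size, seq_len, embed_dim)
out, attn_weights = mha(x, x, x)   # self-attention: query = key = value
print(out.shape)                   # torch.Size([4, 10, 256])
print(attn_weights.shape)          # torch.Size([4, 10, 10]), averaged over heads
```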

Source: conversation with Bing, 2023/4/4.
(1) MultiHeadAttention实现详解 - 知乎. https://zhuanlan.zhihu.com/p/358206572 (accessed 2023/4/4).
(2) MultiHeadAttention实现详解 | Finisky Garden. https://finisky.github.io/2020/05/25/multiheadattention/ (accessed 2023/4/4).
(3) マルチヘッドアテンション (Multi-head Attention) [Transformerの部品]. https://cvml-expertguide.net/terms/dl/seq2seq-translation/transformer/multi-head-attention/ (accessed 2023/4/4).
(4) MultiheadAttention — PyTorch 2.0 documentation. https://pytorch.org/docs/stable/generated/torch.nn.MultiheadAttention.html (accessed 2023/4/4).
(5) tf.keras.layers.MultiHeadAttention | TensorFlow v2.12.0. https://www.tensorflow.org/api_docs/python/tf/keras/layers/MultiHeadAttention (accessed 2023/4/4).
(6) MultiHeadAttention layer - Keras. https://keras.io/api/layers/attention_layers/multi_head_attention/ (accessed 2023/4/4).

PyTorch

  • _native_multi_head_attention op (the op called inside the MultiheadAttention module)
  • scaled_dot_product_attention (public entry point; see the usage sketch after this list)
  • _scaled_dot_product_attention
  • _scaled_dot_product_attention_math
  • _scaled_dot_product_flash_attention
  • _scaled_dot_product_flash_attention_backward
  • _scaled_dot_product_efficient_attention
  • _scaled_dot_product_efficient_attention_backward
  • _flash_attention_forward
  • _flash_attention_backward
  • _efficient_attention_forward
  • _efficient_attention_backward
  • multi_head_attention_forward
  • ....
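Most of the names above are private kernels behind the public torch.nn.functional.scaled_dot_product_attention entry point, which dispatches to the flash, memory-efficient, or math implementations. A small usage sketch (PyTorch ≥ 2.0):

```python
import torch
import torch.nn.functional as F

# q / k / v in the (batch, num_heads, seq_len, head_dim) layout the kernel expects
q = torch.randn(2, 8, 16, 64)
k = torch.randn(2, 8, 16, 64)
v = torch.randn(2, 8, 16, 64)

# PyTorch picks one of the private backends listed above (flash, memory-efficient,
# or the math fallback) based on device, dtype and input shapes.
out = F.scaled_dot_product_attention(q, k, v, attn_mask=None, dropout_p=0.0, is_causal=False)
print(out.shape)  # torch.Size([2, 8, 16, 64])
```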
_native_multi_head_attention

Declaration

# aten/src/ATen/native/native_functions.yaml

- func: _native_multi_head_attention(Tensor query, Tensor key, Tensor value, int embed_dim, int num_head, Tensor qkv_weight, Tensor qkv_bias, Tensor proj_weight, Tensor proj_bias, Tensor? mask=None, bool need_weights=True, bool average_attn_weights=True, int? mask_type=None) -> (Tensor, Tensor)
  variants: function
  dispatch:
    CPU, NestedTensorCPU: native_multi_head_attention_cpu
    CUDA, NestedTensorCUDA: native_multi_head_attention_cuda
  autogen: _native_multi_head_attention.out
  • CPU implementation: aten/src/ATen/native/transformers/attention.cpp
  • CUDA implementation: aten/src/ATen/native/transformers/cuda/attention.cu
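For illustration only, the private op can be called from Python roughly the way the nn.MultiheadAttention fast path does. This is a hedged sketch mirroring the yaml signature above; availability and argument order may differ between PyTorch versions.

```python
import torch

# Random weights just to exercise the op; nn.MultiheadAttention passes its
# packed in-projection weights and out-projection weights here.
embed_dim, num_heads, bsz, seq_len = 256, 8, 4, 10
query = key = value = torch.randn(bsz, seq_len, embed_dim)
qkv_weight = torch.randn(3 * embed_dim, embed_dim)   # packed q/k/v projection
qkv_bias = torch.zeros(3 * embed_dim)
proj_weight = torch.randn(embed_dim, embed_dim)      # output projection
proj_bias = torch.zeros(embed_dim)

out, attn_weights = torch._native_multi_head_attention(
    query, key, value, embed_dim, num_heads,
    qkv_weight, qkv_bias, proj_weight, proj_bias,
    None,    # mask
    False,   # need_weights
    True,    # average_attn_weights
    None,    # mask_type
)
print(out.shape)  # torch.Size([4, 10, 256])
```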

References

  • torch.nn.MultiheadAttention: link
  • has_torch_function: link
  • handle_torch_function: link
  • torch._C._nn.scaled_dot_product_attention: link
ccssu closed this as completed Apr 10, 2023