PyTorch implementations of a variety of attention mechanisms used in network design for computer vision, along with a collection of plug-and-play modules. Because of limited time and expertise, many modules may not be included yet.
If you have suggestions or improvements, you are welcome to submit an issue or PR.
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, ICLR 2021, ViT
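You are welcome to open an issue suggesting additional papers and links to their code.
Thanks to @dedekinds for pointing out the problems in the DIANet description.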
https://programmathically.com/understanding-padding-and-stride-in-convolutional-neural-networks/