title: Revisiting Kernel Attention with Correlated Gaussian Process Representation
abstract: Transformers have increasingly become the de facto method to model sequential data with state-of-the-art performance. Due to their widespread use, being able to estimate and calibrate their modeling uncertainty is important to understand and design robust transformer models. To achieve this, previous works have used Gaussian processes (GPs) to perform uncertainty calibration for the attention units of transformers and attained notable successes. However, such approaches have to confine the transformers to the space of symmetric attention to satisfy the symmetry requirement of their GP’s kernel specification, which reduces the representation capacity of the model. To mitigate this restriction, we propose the Correlated Gaussian Process Transformer (CGPT), a new class of transformers whose self-attention units are modeled as the cross-covariance between two correlated GPs (CGPs). This allows asymmetries in attention and can enhance the representation capacity of GP-based transformers. We also derive a sparse approximation for CGP to make it scale better. Our empirical studies show that both CGP-based and sparse CGP-based transformers achieve better performance than state-of-the-art GP-based transformers on a variety of benchmark tasks.
openreview: xlIK0vu3MW
section: Papers
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: bui24a
month: 0
tex_title: Revisiting Kernel Attention with Correlated Gaussian Process Representation
firstpage: 450
lastpage: 470
page: 450-470
order: 450
cycles: false
bibtex_author: Bui, Long Minh and Tran Huu, Tho and Dinh, Duy and Nguyen, Tan Minh and Hoang, Trong Nghia
author:
- given: Long Minh
  family: Bui
- given: Tho
  family: Tran Huu
- given: Duy
  family: Dinh
- given: Tan Minh
  family: Nguyen
- given: Trong Nghia
  family: Hoang
date: 2024-09-12
address:
container-title: Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
volume: 244
genre: inproceedings
issued:
  date-parts:
  - 2024
  - 9
  - 12
pdf: https://raw.githubusercontent.com/mlresearch/v244/main/assets/bui24a/bui24a.pdf
extras:

2024-09-12-bui24a.md
