
# NAS for Transformers

Transformers have emerged as the leading architecture in deep learning for a wide range of applications, particularly in natural language processing (NLP) and computer vision (CV). Despite their success, designing effective Transformer models remains a complex and resource-intensive task due to their intricate architecture and the substantial computational demands of training and optimization. Neural Architecture Search (NAS) offers a promising solution to these challenges by automating the search for optimal Transformer architectures. In this report, I examine the key concepts related to NAS and Transformers, present notable results achieved in the NAS for Transformers field, and discuss existing limitations as well as potential future directions.
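To make the idea concrete, below is a minimal, illustrative sketch of the simplest form of NAS: random search over a small space of Transformer encoder hyperparameters. The search space, budget, and parameter-count proxy used as a fitness score are all assumptions chosen for brevity; real NAS methods (evolutionary search, reinforcement learning, differentiable NAS, weight sharing) replace both the sampler and the evaluation with far more sophisticated machinery.

```python
import random
import torch.nn as nn

# Hypothetical search space over Transformer encoder hyperparameters
# (illustrative only; real search spaces are much larger and richer).
SEARCH_SPACE = {
    "d_model": [128, 256, 512],
    "nhead": [4, 8],
    "num_layers": [2, 4, 6],
    "dim_feedforward": [256, 512, 1024],
}

def sample_architecture():
    """Draw one candidate configuration uniformly at random."""
    return {name: random.choice(choices) for name, choices in SEARCH_SPACE.items()}

def build_model(config):
    """Instantiate a Transformer encoder from a sampled configuration."""
    layer = nn.TransformerEncoderLayer(
        d_model=config["d_model"],
        nhead=config["nhead"],
        dim_feedforward=config["dim_feedforward"],
        batch_first=True,
    )
    return nn.TransformerEncoder(layer, num_layers=config["num_layers"])

def evaluate(model):
    """Placeholder fitness: a real NAS loop would train the candidate (or use
    a cheap proxy) and return validation performance. Here we simply reward
    smaller models by negating the parameter count."""
    return -sum(p.numel() for p in model.parameters())

best_config, best_score = None, float("-inf")
for _ in range(10):  # search budget: 10 candidate evaluations
    config = sample_architecture()
    score = evaluate(build_model(config))
    if score > best_score:
        best_config, best_score = config, score

print("Best configuration found:", best_config)
```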

For a more detailed analysis, please refer to `report_nas_for_transformers.pdf`.