
# NAS for Transformers

Transformers have emerged as the leading architecture in deep learning for a wide range of applications, particularly in natural language processing (NLP) and computer vision (CV). Despite their success, designing effective Transformer models remains a complex and resource-intensive task due to their intricate architecture and the substantial computational demands of training and optimization. Neural Architecture Search (NAS) offers a promising solution to these challenges by automating the search for optimal Transformer architectures. In this report, I examine the key concepts related to NAS and Transformers, present notable results achieved in the NAS for Transformers field, and discuss existing limitations as well as potential future directions.
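To make the idea concrete, below is a minimal, illustrative sketch of the simplest form of NAS: random search over a small space of Transformer encoder hyperparameters. The search space, budget, and parameter-count proxy used as a fitness score are all assumptions chosen for brevity; real NAS methods (evolutionary search, reinforcement learning, differentiable NAS, weight sharing) replace both the sampler and the evaluation with far more sophisticated machinery.

```python
import random
import torch.nn as nn

# Hypothetical search space over Transformer encoder hyperparameters
# (illustrative only; real search spaces are much larger and richer).
SEARCH_SPACE = {
    "d_model": [128, 256, 512],
    "nhead": [4, 8],
    "num_layers": [2, 4, 6],
    "dim_feedforward": [256, 512, 1024],
}

def sample_architecture():
    """Draw one candidate configuration uniformly at random."""
    return {name: random.choice(choices) for name, choices in SEARCH_SPACE.items()}

def build_model(config):
    """Instantiate a Transformer encoder from a sampled configuration."""
    layer = nn.TransformerEncoderLayer(
        d_model=config["d_model"],
        nhead=config["nhead"],
        dim_feedforward=config["dim_feedforward"],
        batch_first=True,
    )
    return nn.TransformerEncoder(layer, num_layers=config["num_layers"])

def evaluate(model):
    """Placeholder fitness: a real NAS loop would train the candidate (or use
    a cheap proxy) and return validation performance. Here we simply reward
    smaller models by negating the parameter count."""
    return -sum(p.numel() for p in model.parameters())

best_config, best_score = None, float("-inf")
for _ in range(10):  # search budget: 10 candidate evaluations
    config = sample_architecture()
    score = evaluate(build_model(config))
    if score > best_score:
        best_config, best_score = config, score

print("Best configuration found:", best_config)
```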

For a more detailed analysis, please refer to `report_nas_for_transformers.pdf`.