Submitting the Ray job to the cluster installs the whole environment in a repeatable, cacheable way.
Another nice touch is to pin flash_attn to a direct wheel link from GitHub releases. This might be done with uv run --with=https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu12torch2.8cxx11abiTRUE-cp312-cp312-linux_x86_64.whl (or the equivalent wheel for 2.8.1). Installing the prebuilt wheel avoids a lengthy build from source.
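A rough sketch of how the wheel pin could be scripted, assuming the filename pattern of the URL quoted above. The variable names and the RUN_WITH_UV switch are made up for illustration, and the final uv invocation is only a smoke test, not the actual job submission command. Whether a wheel with the same tags exists for another release (e.g. 2.8.1) needs to be checked on the releases page.

```shell
set -eu

# Wheel tags copied from the URL quoted above; adjust to your CUDA/torch/Python combo.
FLASH_ATTN_VERSION="2.8.3"
CUDA_TAG="cu12"
TORCH_TAG="torch2.8"
ABI_TAG="cxx11abiTRUE"
PY_TAG="cp312"
PLATFORM_TAG="linux_x86_64"

# Assemble the wheel filename and the GitHub release download URL.
WHEEL_NAME="flash_attn-${FLASH_ATTN_VERSION}+${CUDA_TAG}${TORCH_TAG}${ABI_TAG}-${PY_TAG}-${PY_TAG}-${PLATFORM_TAG}.whl"
WHEEL_URL="https://github.com/Dao-AILab/flash-attention/releases/download/v${FLASH_ATTN_VERSION}/${WHEEL_NAME}"

echo "${WHEEL_URL}"

# Hypothetical opt-in switch: only touch the network when explicitly asked.
# The python one-liner is a smoke test that the prebuilt wheel imports.
if [ "${RUN_WITH_UV:-0}" = "1" ]; then
  uv run --with "${WHEEL_URL}" python -c "import flash_attn; print(flash_attn.__version__)"
fi
```

Running it with RUN_WITH_UV=1 should pull the prebuilt wheel instead of compiling flash_attn, provided the tags match the Python, torch and CUDA versions in the environment.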