Commit
Fix P-tuning for Llama based models (#9300)
* Fix P-tuning for Llama based models (#9297)

  * Added the BOS token for Llama, Mistral and Mixtral.

    Signed-off-by: Alexey Panteleev <alpanteleev@nvidia.com>

  * Don't load an existing TRT-LLM model before export to speed up the export process and avoid possible contamination from previous runs.

    Signed-off-by: Alexey Panteleev <alpanteleev@nvidia.com>

  * Apply isort and black reformatting

    Signed-off-by: apanteleev <apanteleev@users.noreply.github.com>

  ---------

  Signed-off-by: Alexey Panteleev <alpanteleev@nvidia.com>
  Signed-off-by: apanteleev <apanteleev@users.noreply.github.com>
  Co-authored-by: apanteleev <apanteleev@users.noreply.github.com>
  Co-authored-by: Onur Yilmaz <35306097+oyilmaz-nvidia@users.noreply.github.com>

* Fix the export test

---------

Signed-off-by: Alexey Panteleev <alpanteleev@nvidia.com>
Signed-off-by: apanteleev <apanteleev@users.noreply.github.com>
Signed-off-by: Onur Yilmaz <35306097+oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: Alexey Panteleev <alpanteleev@nvidia.com>
Co-authored-by: apanteleev <apanteleev@users.noreply.github.com>
Co-authored-by: Onur Yilmaz <35306097+oyilmaz-nvidia@users.noreply.github.com>
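
The commit message mentions adding the BOS token for Llama, Mistral and Mixtral but does not show the change itself. As a rough illustration only, the idea might be sketched as below. The names `MODELS_NEEDING_BOS` and `prepend_bos` are hypothetical and are not taken from the repository, and the check against a doubled BOS is an assumption rather than something the commit states.

```python
from typing import List

# Hypothetical set of model families that expect a leading BOS token
# (the three families named in the commit message).
MODELS_NEEDING_BOS = {"llama", "mistral", "mixtral"}


def prepend_bos(token_ids: List[int], model_type: str, bos_id: int) -> List[int]:
    """Return a copy of token_ids with bos_id in front when the family expects one."""
    if model_type.lower() not in MODELS_NEEDING_BOS:
        # Other model families are passed through unchanged.
        return list(token_ids)
    # Assumption: skip the insert if the tokenizer already emitted a BOS token.
    if token_ids and token_ids[0] == bos_id:
        return list(token_ids)
    return [bos_id] + list(token_ids)
```

For example, `prepend_bos([5, 6], "llama", bos_id=1)` yields a list that starts with the BOS id, while the same call with an unlisted model type returns the ids unchanged.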