tried to train #1
Hi, I found that it seems you are using TinyLlama for training instead of our model...

```python
# Specify the model name (replace '' with your actual model name)
model_name = 'TinyLlama/TinyLlama-1.1B-Chat-v1.0'
```
Yeah, I don't understand. Am I supposed to use literally '' or what is the model name?
My understanding is that you show how to specify the model using the config, but not any tokenizer or model name.
The instructions have me guessing that I was supposed to pick a tokenizer (similar to Mamba).
Can you provide a complete working example, or tell me what would work in place of '' (or is it just '')?
I re-reviewed the README and revised the code, but the kernel bombs out.
Hi, does it work for inference instead of training? And do you use an NVIDIA GPU for training?
That's a great troubleshooting step. The first error I made was using AutoModel; I changed it to the custom class definition, then tried inference, which results in the kernel quitting (no console output). NVIDIA, yes (CUDA 12.2, Python 3.10).
Hmmm, did you install triton==2.2?
Yes, I updated the first post with my env.
Thanks! I see. I checked your envs and found that your compute capability is 6.0, indicating you are using the Pascal arch. Pascal may not be well supported by Triton, which I guess may lead to this problem... I used our A100 and H100 to test our code, and it works well...
triton 2.2.0
torch 2.2.0
einops 0.7.0
compute capability 6.0
rocky linux 9
cuda 12.2
python 3.10
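Since the diagnosis hinges on compute capability, a quick way to check it is sketched below. The `>= 7.0` threshold is an assumption based on Triton primarily targeting Volta and newer architectures; Pascal is the 6.x family:

```python
# Hypothetical helper: decide whether a GPU arch is likely to work with
# Triton, based on its CUDA compute capability. In practice the capability
# comes from torch.cuda.get_device_capability(0).
def triton_supported(major: int, minor: int) -> bool:
    # Assumption: Triton targets Volta (7.0) and newer; Pascal is 6.x.
    return (major, minor) >= (7, 0)

print(triton_supported(6, 0))  # Pascal, as reported in this issue -> False
print(triton_supported(8, 0))  # Ampere (e.g. A100) -> True
```

On a machine with PyTorch installed, one would call `torch.cuda.get_device_capability(0)` to get the `(major, minor)` pair for the first GPU.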