training gpu hours #5
Hi, great work! Can you share how many GPUs were used and the total training time? Thanks!

Comments
Hi @Li-Jicheng, here are some statistics:

Stage-1: 12 nodes for 24 hours. Each node has 16 NPUs with 64 GB of memory.

For more details, you may refer to our training logs:
https://huggingface.co/VITA-MLLM/Long-VITA-16K/raw/main/log_node11.txt
https://huggingface.co/VITA-MLLM/Long-VITA-128K/raw/main/log_node31.txt
https://huggingface.co/VITA-MLLM/Long-VITA-1M/raw/main/log_node31.txt
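A quick back-of-the-envelope conversion of the reported Stage-1 figures into total device-hours (this arithmetic is not a number stated in the thread, just a restatement of the figures above):

```python
# Stage-1 figures from the reply above: 12 nodes for 24 hours,
# 16 NPUs per node (64 GB memory each).
nodes = 12
npus_per_node = 16
wall_clock_hours = 24

total_device_hours = nodes * npus_per_node * wall_clock_hours
print(total_device_hours)  # 4608 NPU-hours for Stage-1
```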
Thank you for your prompt response. I have a couple of follow-up questions, if you're willing to assist:

1. Long-VITA's long image-context capability is impressive. In my scenario, I'll be working with prompts that include lengthy text instructions alongside a few images. Do you think Long-VITA is well suited for this? Are there other models you'd recommend exploring for such tasks?
2. I have access to 4 nodes, each equipped with 8 A100 GPUs. (I can potentially scale to 8 nodes, but only for short-term experiments, up to a week at most.) Do you believe this hardware setup is sufficient to replicate results similar to Long-VITA's?

Appreciate your insights!
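As a rough sanity check on question 2, here is a minimal sketch comparing raw device-hours of the 4-node A100 setup (assuming it runs continuously for the stated one-week budget, which is a hypothetical) against the Stage-1 figures above. It deliberately ignores per-device throughput and memory differences between A100s and the 64 GB NPUs, so it is only a ballpark comparison:

```python
# Hypothetical: 4 nodes x 8 A100s running continuously for one week
# (the short-term budget mentioned above).
nodes, gpus_per_node = 4, 8
week_hours = 24 * 7

available_gpu_hours = nodes * gpus_per_node * week_hours
stage1_device_hours = 12 * 16 * 24  # NPU-hours from the reply above

print(available_gpu_hours)  # 5376 A100-hours per week
print(round(available_gpu_hours / stage1_device_hours, 2))  # 1.17
```

On raw device-hours alone, the one-week budget slightly exceeds Stage-1's, but whether it suffices in practice depends on per-device throughput, interconnect, and memory headroom for long sequences, none of which this comparison captures.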