How to fine-tune a Flan-T5 model on a single GPU for our dataset #2

Rami-Ismael opened this issue Feb 17, 2023 · 0 comments

Baseline dataset

  1. We don't yet have a dataset that covers all of physics. Instead, we will use ScienceQA (2022) [1] as our benchmark for training, validation, and testing.
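Flan-T5 is a text-to-text model, so each multiple-choice question has to be turned into a source string and a target string before training. The snippet below is a minimal sketch of one way to do that. The field names (`question`, `choices`, `answer`, `hint`) are assumed to follow the ScienceQA release, and the prompt layout is only an illustration, not a format taken from [1].

```python
LETTERS = "ABCDEFGHIJ"


def to_text_pair(example):
    """Turn one multiple-choice record into a (source, target) text pair.

    Assumes `example` has `question` (str), `choices` (list of str),
    `answer` (index into `choices`) and an optional `hint` (str).
    """
    options = " ".join(
        f"({LETTERS[i]}) {choice}" for i, choice in enumerate(example["choices"])
    )
    parts = [f"Question: {example['question']}"]
    hint = example.get("hint")
    if hint:  # skip empty or missing context
        parts.append(f"Context: {hint}")
    parts.append(f"Options: {options}")
    parts.append("Answer:")
    source = "\n".join(parts)

    idx = example["answer"]
    target = f"({LETTERS[idx]}) {example['choices'][idx]}"
    return source, target


if __name__ == "__main__":
    # Hypothetical record, made up for illustration only.
    record = {"question": "Which is a solid?", "choices": ["water", "rock"], "answer": 1}
    src, tgt = to_text_pair(record)
    print(src)
    print(tgt)
```

The same function can be mapped over the train, validation, and test splits so that all three use an identical prompt format.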

We are using

  1. Hugging Face Libraries
  2. DeepSpeed
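As a starting point for combining the two, here is a sketch of a DeepSpeed config for a single GPU. It assumes ZeRO stage 2 with the optimizer states offloaded to CPU memory. That is one common setup for fitting a larger model on one card, not a setting decided in this issue. The `"auto"` values are placeholders that the Hugging Face `Trainer` integration fills in from its own arguments, and the file name `ds_config.json` is arbitrary.

```python
import json

# Assumed single-GPU setup: ZeRO stage 2 with optimizer states kept on the CPU.
ds_config = {
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
    },
    # bf16 needs hardware support; "auto" defers the choice to the Trainer arguments.
    "bf16": {"enabled": "auto"},
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
}


def write_config(path="ds_config.json"):
    """Write the config to disk and return the path."""
    with open(path, "w") as f:
        json.dump(ds_config, f, indent=2)
    return path


if __name__ == "__main__":
    print(write_config())
```

The resulting file can then be passed to the trainer through the `deepspeed` argument of `TrainingArguments`, for example `deepspeed="ds_config.json"`.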

Optimizer

  1. The choice of optimizer will be determined by the available compute resources
    1. Performance and time
      1. AdamW gives good performance. However, it needs 8 bytes per parameter for its optimizer states.
      2. 8-bit AdamW also gives good performance, but it will be slower. This method uses dynamic quantization for the optimizer states.
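To put rough numbers on this trade-off, the sketch below estimates the memory taken by optimizer states alone. It assumes 8 bytes per parameter for AdamW (two fp32 moment estimates) and about 2 bytes per parameter for 8-bit AdamW (two 8-bit moment estimates, ignoring the small overhead of the quantization constants). The parameter counts are approximate figures for the public Flan-T5 checkpoints. Weights, gradients, and activations are not included.

```python
ADAMW_BYTES_PER_PARAM = 8       # two fp32 moment estimates
ADAMW_8BIT_BYTES_PER_PARAM = 2  # two 8-bit moment estimates (overhead ignored)

# Approximate parameter counts, for illustration only.
APPROX_PARAMS = {
    "flan-t5-small": 80e6,
    "flan-t5-base": 250e6,
    "flan-t5-large": 780e6,
    "flan-t5-xl": 3e9,
}


def optimizer_state_gib(num_params, bytes_per_param):
    """Memory used by optimizer states, in GiB."""
    return num_params * bytes_per_param / 2**30


if __name__ == "__main__":
    for name, n in APPROX_PARAMS.items():
        full = optimizer_state_gib(n, ADAMW_BYTES_PER_PARAM)
        small = optimizer_state_gib(n, ADAMW_8BIT_BYTES_PER_PARAM)
        print(f"{name}: AdamW {full:.2f} GiB, 8-bit AdamW {small:.2f} GiB")
```

Comparing these numbers against the memory of the GPU in use should indicate which optimizer is realistic for each model size.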

What GPU are you using?

Current Parameters

Read this

References

[1]

@inproceedings{lu2022learn,
    title={Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering},
    author={Lu, Pan and Mishra, Swaroop and Xia, Tony and Qiu, Liang and Chang, Kai-Wei and Zhu, Song-Chun and Tafjord, Oyvind and Clark, Peter and Kalyan, Ashwin},
    booktitle={The 36th Conference on Neural Information Processing Systems (NeurIPS)},
    year={2022}
}