How to fine-tune a Flan-T5 model on a single GPU for our dataset #2

Rami-Ismael · 2023-02-17T18:31:16Z

Dataset will be our baseline

We don't yet have a dataset that covers all of physics. Instead, we will use ScienceQA (2022) [1] as our benchmark to train, validate, and test.
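Since Flan-T5 is a text-to-text model, each multiple-choice question has to be turned into an input string and a target string before training. Below is a minimal sketch of one way to do that. The field names (`question`, `choices`, `answer`, `hint`) are assumed to match the ScienceQA release we end up using, and the prompt layout and the `format_example` helper are hypothetical, not something fixed by the benchmark.

```python
from typing import Dict, Tuple

LETTERS = "ABCDEFGHIJ"  # option labels; assumes at most 10 choices per question


def format_example(record: Dict) -> Tuple[str, str]:
    """Turn one multiple-choice record into (input_text, target_text)."""
    # Render the choices as "(A) ... (B) ..." so the model can answer with a letter.
    options = " ".join(
        f"({LETTERS[i]}) {choice}" for i, choice in enumerate(record["choices"])
    )
    parts = [f"Question: {record['question']}"]
    hint = record.get("hint") or ""
    if hint:  # only include the context line when a hint is present
        parts.append(f"Context: {hint}")
    parts.append(f"Options: {options}")
    parts.append("Answer:")
    input_text = "\n".join(parts)
    # Assumption: `answer` is the zero-based index of the correct choice.
    target_text = LETTERS[record["answer"]]
    return input_text, target_text


# Made-up record, for illustration only (not taken from the dataset).
sample = {
    "question": "Which of these is a unit of force?",
    "choices": ["joule", "newton", "watt"],
    "answer": 1,
    "hint": "",
}
source, target = format_example(sample)
print(source)
print(target)
```

The same function can be mapped over the train, validation, and test splits so all three use an identical prompt format.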

We are using

Hugging Face Libraries
DeepSpeed
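As a starting point for combining the two, here is a rough sketch of a DeepSpeed configuration built as a Python dict. The specific choices (ZeRO stage 2, CPU offload of the optimizer state, bf16) are assumptions for a memory-constrained single GPU, not settings we have decided on, and `build_ds_config` is a hypothetical helper. The `"auto"` values rely on the Hugging Face Trainer integration filling them in from its own training arguments.

```python
import json


def build_ds_config(offload_optimizer: bool = True) -> dict:
    """Build a small DeepSpeed config dict (a sketch, not a tuned setup)."""
    zero = {"stage": 2}  # ZeRO stage 2: partition optimizer state and gradients
    if offload_optimizer:
        # Assumption: move optimizer state to CPU RAM to free GPU memory.
        zero["offload_optimizer"] = {"device": "cpu", "pin_memory": True}
    return {
        # Assumption: the GPU supports bf16; "auto" defers to the Trainer flags.
        "bf16": {"enabled": "auto"},
        "zero_optimization": zero,
        "gradient_accumulation_steps": "auto",
        "train_micro_batch_size_per_gpu": "auto",
        "gradient_clipping": "auto",
    }


ds_config = build_ds_config()
ds_config_json = json.dumps(ds_config, indent=2)
print(ds_config_json)
```

The resulting dict (or the JSON saved to a file) can then be handed to the Trainer through the `deepspeed` argument of `TrainingArguments`.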

Optimizer

The choice of optimizer will be determined by the available compute resources.
1. Performance and Time
  1. AdamW gives good performance. However, its optimizer state needs 8 bytes times the number of parameters (two fp32 moment estimates per parameter).
  2. 8-bit AdamW also gives good performance, but it will be slower. This method uses dynamic quantization for the optimizer state.
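To make the memory trade-off concrete, here is a back-of-the-envelope sketch of the optimizer-state footprint. It assumes standard AdamW keeps two fp32 moment estimates per parameter (8 bytes) and 8-bit AdamW keeps two 8-bit estimates (2 bytes), and it ignores the small overhead of quantization constants. The parameter count is a hypothetical example, not a measurement of our model, and the numbers cover optimizer state only, not weights, gradients, or activations.

```python
ADAMW_BYTES_PER_PARAM = 8      # assumption: two fp32 moments per parameter
ADAMW_8BIT_BYTES_PER_PARAM = 2  # assumption: two 8-bit moments per parameter


def optimizer_state_gib(num_params: int, bytes_per_param: int) -> float:
    """Approximate optimizer-state memory in GiB."""
    return num_params * bytes_per_param / 1024**3


# Hypothetical model size, roughly in the range of a "base" sized checkpoint.
num_params = 250_000_000

adamw_gib = optimizer_state_gib(num_params, ADAMW_BYTES_PER_PARAM)
adamw_8bit_gib = optimizer_state_gib(num_params, ADAMW_8BIT_BYTES_PER_PARAM)

print(f"AdamW optimizer state:       {adamw_gib:.2f} GiB")
print(f"8-bit AdamW optimizer state: {adamw_8bit_gib:.2f} GiB")
```

Under these assumptions the 8-bit variant needs a quarter of the optimizer-state memory, which is the main reason to accept its slower step time on a small GPU.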

What GPU are you using?

Current Parameters

Read this

References

[1]

@inproceedings{lu2022learn,
    title={Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering},
    author={Lu, Pan and Mishra, Swaroop and Xia, Tony and Qiu, Liang and Chang, Kai-Wei and Zhu, Song-Chun and Tafjord, Oyvind and Clark, Peter and Kalyan, Ashwin},
    booktitle={The 36th Conference on Neural Information Processing Systems (NeurIPS)},
    year={2022}
}
