Fix retriever only training #25
Conversation
SachiraKuruppu commented Sep 16, 2023
- Use bitsandbytes to reduce the model's memory footprint.
- Move the PEFT config inside the model, to be consistent with the e2e RAG.
- Remove hardcoded values and add command-line arguments so the script is generic and can train on different datasets.
- Reduce the tokenizer max length to 128, since our current data is under 100 words. (Maybe this should also be configurable via the command line.) A sketch of these changes follows below.
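A minimal sketch of how these changes might fit together, assuming a standard transformers/peft setup; the argument names, LoRA hyperparameters, and helper structure below are illustrative, not the actual values from this repo:

```python
import argparse

from peft import LoraConfig, get_peft_model
from transformers import AutoModel


def parse_args():
    # Command-line arguments replace the previously hardcoded values,
    # so the same script can train on different datasets.
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name", required=True)
    parser.add_argument("--dataset_path", required=True)
    parser.add_argument("--max_length", type=int, default=128)  # our data is under 100 words
    return parser.parse_args()


def build_model(model_name: str):
    # bitsandbytes 8-bit loading reduces the model's memory footprint.
    model = AutoModel.from_pretrained(model_name, load_in_8bit=True, device_map={"": 0})
    # The PEFT config is created inside the model-building code, consistent
    # with the e2e RAG setup. LoRA hyperparameters here are placeholders;
    # some architectures also need target_modules set explicitly.
    peft_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05)
    return get_peft_model(model, peft_config)
```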
@rsachira could you also add the issue with `padding=True` for future reference? It would be super helpful if you could add a two-sentence summary of our convo. :)
@shamanez This is what I remember from our conversation; correct me if I'm wrong. Setting `padding=True` in the tokenizer lets it dynamically select the token size and pad to the longest sequence in the batch. However, this is not a good approach for us when it comes to in-batch negative contrastive learning, because it may order the input IDs according to their length.
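A short sketch of the padding behavior discussed above; the model name and example sentences are illustrative:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # illustrative model

batch = ["a short passage", "a considerably longer passage about retriever training"]

# padding=True (same as padding="longest") pads only to the longest sequence
# in this batch, so tensor shapes vary from batch to batch.
dynamic = tokenizer(batch, padding=True, return_tensors="pt")

# padding="max_length" with a fixed max_length gives every batch the same
# shape, at the cost of extra pad tokens.
fixed = tokenizer(
    batch, padding="max_length", max_length=128, truncation=True, return_tensors="pt"
)

print(dynamic["input_ids"].shape)  # e.g. torch.Size([2, 10]) -- depends on the batch
print(fixed["input_ids"].shape)    # torch.Size([2, 128])
```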
```python
self.model = AutoModel.from_pretrained(model_name, load_in_8bit=True, device_map={"": 0})
self.model = AutoModel.from_pretrained(
    model_name,
    load_in_8bit=True,
```
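The reformatted call in the diff is cut off after `load_in_8bit=True`; judging from the single-line version it replaces, it presumably continues along these lines (the `device_map` argument is an assumption carried over from the old line):

```python
self.model = AutoModel.from_pretrained(
    model_name,
    load_in_8bit=True,
    device_map={"": 0},  # assumed carried over from the single-line version
)
```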
Remove this line. You don't need this.
Done