Feat/cross encoder trainer lambdaloss #4
base: feat/cross_encoder_trainer
Conversation
Hi @tomaarsen, I updated … I've trained the model using the same parameters you used for ListNetLoss, and I'm very pleased with the results. Notably, there's an improvement over ListNetLoss on evaluation, and training for just 1 epoch produced better results than models trained for 20 epochs on the old implementation of this loss.

I would kindly ask you to review these changes and let me know whether you're satisfied with them. Additionally, I plan to remove the cache from the example training script once we finalize the implementation.

I'm also considering removing this argument entirely: `reduction: Literal["sum", "mean"] = "sum"`. When "sum" is set, the loss is in the hundreds (approximately ~190); with "mean" I get something more reasonable (~1). What do you think? I'm not sure this argument adds any value, but maybe I'm missing something.

Here's the model from my test: https://huggingface.co/Studeni/reranker-msmarco-v1.1-MiniLM-L12-H384-uncased-lambdaloss

Lastly, I added …
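To illustrate the scale question around `reduction`, here is a minimal, hypothetical sketch (plain Python, not the PR's implementation) of a pairwise logistic ranking loss: with n valid pairs, the "sum" reduction is exactly n times the "mean" reduction, which is why the summed loss lands in the hundreds on realistic batch sizes while the mean stays near 1.

```python
import math

def pairwise_logistic_loss(scores, labels, reduction="sum"):
    """Hypothetical sketch: pairwise logistic loss over all (more-relevant,
    less-relevant) document pairs, showing how reduction changes the scale."""
    per_pair = []
    for s_i, y_i in zip(scores, labels):
        for s_j, y_j in zip(scores, labels):
            if y_i > y_j:  # document i should rank above document j
                # log(1 + exp(-(s_i - s_j))): small when s_i >> s_j
                per_pair.append(math.log1p(math.exp(-(s_i - s_j))))
    total = sum(per_pair)
    if reduction == "sum":
        return total
    return total / max(len(per_pair), 1)  # "mean": normalize by pair count

scores = [2.0, 0.5, -1.0]   # predicted scores for one query's documents
labels = [2, 1, 0]          # graded relevance labels
loss_sum = pairwise_logistic_loss(scores, labels, "sum")
loss_mean = pairwise_logistic_loss(scores, labels, "mean")
# 3 valid pairs here, so loss_sum == 3 * loss_mean
```

Since the pair count grows with list length, "mean" also makes losses comparable across queries with different numbers of documents, which may be the more useful default.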
Hi @tomaarsen, I was testing an idea where I wanted to see whether we could get better results when expanding the … Here is the trained model: https://huggingface.co/Studeni/reranker-msmarco-v1.1-MiniLM-L12-H384-uncased-lambdaloss-hard-neg
I think exclusively having …
I agree, I think this is fine.
I like it! Sounds good to me. I'll review this later today, and try and get it merged as well!
I made some changes here and there, and trained a few models:
I think this is just about ready to go!
LambdaLoss Implementation for Cross Encoder Trainer
This PR adds LambdaLoss functionality to the Cross Encoder Trainer feature.
Changes
Implementation Details
LambdaLoss is a ranking loss that weights pairwise comparisons by their impact on a listwise ranking metric such as NDCG. It's particularly useful for information retrieval tasks where the relative order of results matters more than absolute scores.
The implementation includes flexible weighing schemes that allow for different prioritization strategies when training the model.
Reference: The LambdaLoss Framework for Ranking Metric Optimization (CIKM 2018): https://marc.najork.org/papers/cikm2018.pdf
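For readers unfamiliar with the loss, the core idea can be sketched as follows. This is an illustrative, simplified sketch of metric-driven pair weighting (not the code added in this PR): each pairwise logistic term is scaled by |ΔNDCG|, the change in NDCG caused by swapping the two documents in the current ranking, so pairs that matter more to the metric receive larger gradients.

```python
import math

def gain(rel):
    return 2 ** rel - 1

def discount(rank):
    return 1.0 / math.log2(rank + 2)  # rank is 0-based

def lambda_weighted_loss(scores, labels):
    """Simplified sketch (not the PR's implementation): pairwise logistic
    loss where each pair is weighted by |delta NDCG|, the NDCG change from
    swapping the two documents in the ranking induced by current scores."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    rank = {doc: r for r, doc in enumerate(order)}
    ideal = sorted(labels, reverse=True)
    idcg = sum(gain(g) * discount(r) for r, g in enumerate(ideal)) or 1.0
    loss = 0.0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:  # i should rank above j
                delta_ndcg = abs(
                    (gain(labels[i]) - gain(labels[j]))
                    * (discount(rank[i]) - discount(rank[j]))
                ) / idcg
                loss += delta_ndcg * math.log1p(math.exp(-(scores[i] - scores[j])))
    return loss

# A well-ordered ranking yields a much smaller loss than an inverted one.
good = lambda_weighted_loss([3.0, 2.0, 1.0], [2, 1, 0])
bad = lambda_weighted_loss([1.0, 2.0, 3.0], [2, 1, 0])
```

The weighing schemes mentioned above generalize this |ΔNDCG| factor; the CIKM 2018 paper derives several variants with different metric-bounding properties.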