
SentenceTransformerTrainer compute_metrics #2888

Closed
Samoed opened this issue Aug 14, 2024 · 5 comments · Fixed by #3002

Comments

@Samoed
Contributor

Samoed commented Aug 14, 2024

Hi! I tried to train my model and evaluate it using compute_metrics, but I didn't get any metrics. Is this a bug, or is it not supposed to work?

Code for test:
https://colab.research.google.com/drive/11sml4nfhkVVoZy0fTsll6BLgpHYz-BWD

@ir2718
Contributor

ir2718 commented Aug 14, 2024

Hi,

I think this is the expected behaviour, but I'm not 100% sure. The compute_metrics function usually takes in a list of predictions, which you can then use to calculate the metrics. With triplet training, the difference is in the evaluation procedure: the evaluators take in a SentenceTransformer model, use it to calculate the anchor, positive, and negative embeddings, and then use those embeddings to calculate the metrics.

Besides, there is no obvious way to define false positives and false negatives with triplet information. The only thing you can say when you've got triplet data is whether the positive is closer to the anchor than the negative, hence only accuracy is calculated.
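
As a rough illustration (a sketch, not the library's internals; the model name and triplets below are just placeholders), that accuracy boils down to something like:

from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer("all-MiniLM-L6-v2")

# Placeholder triplets
anchors = ["What is the capital of France?"]
positives = ["Paris is the capital of France."]
negatives = ["Berlin is the capital of Germany."]

anchor_emb = model.encode(anchors)
positive_emb = model.encode(positives)
negative_emb = model.encode(negatives)

# A triplet counts as correct when the positive is closer to the anchor than the negative
correct = sum(
    cos_sim(a, p).item() > cos_sim(a, n).item()
    for a, p, n in zip(anchor_emb, positive_emb, negative_emb)
)
print("accuracy:", correct / len(anchors))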

@Samoed
Contributor Author

Samoed commented Aug 14, 2024

There's no documentation saying whether this is expected behavior or not. Maybe printing a warning when the function is passed would be good.

@jacklanda

+1

@tomaarsen
Collaborator

Hello!

The compute_metrics argument is inherited from the transformers Trainer superclass, where it is primarily used here: https://github.com/huggingface/transformers/blob/b54109c7466f6e680156fbd30fa929e2e222d730/src/transformers/trainer.py#L4184-L4192

This happens during every evaluation, but Sentence Transformer models don't usually train with single inputs & single outputs, so the logits and labels are both None here. To be precise, they are None because the prediction_step calls compute_loss, which is implemented in the SentenceTransformerTrainer:

if return_outputs:
    # During prediction/evaluation, `compute_loss` will be called with `return_outputs=True`.
    # However, Sentence Transformer losses do not return outputs, so we return an empty dictionary.
    # This does not result in any problems, as the SentenceTransformerTrainingArguments sets
    # `prediction_loss_only=True` which means that the output is not used.
    return loss, {}
return loss

As a result, the compute_metrics function is never called over the EvalPrediction.


In Sentence Transformers, if you want to compute evaluations, it is recommended to use one of the Evaluators. There's some more information about them here.
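
For triplet data, for instance, the built-in TripletEvaluator covers this. A minimal sketch with placeholder data (depending on the version, calling it returns either a single accuracy float or a dict of metrics):

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("all-MiniLM-L6-v2")

# Placeholder triplets; in practice these come from your eval split
dev_evaluator = TripletEvaluator(
    anchors=["What is the capital of France?"],
    positives=["Paris is the capital of France."],
    negatives=["Berlin is the capital of Germany."],
    name="dev-triplets",
)
print(dev_evaluator(model))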

If you want to create your own evaluator, you can subclass the SentenceEvaluator class. You can pass an evaluator to the STTrainer, or even a list if you have multiple.
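
A minimal custom evaluator could look like the sketch below (the class name, metric key, and example data are made up for illustration; recent versions expect a dict of metrics back, older ones a single float):

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import SentenceEvaluator
from sentence_transformers.util import cos_sim

class MyTripletAccuracyEvaluator(SentenceEvaluator):
    # Toy evaluator: fraction of triplets where the positive is closer to the anchor than the negative
    def __init__(self, anchors, positives, negatives, name="my_triplets"):
        super().__init__()
        self.anchors = anchors
        self.positives = positives
        self.negatives = negatives
        self.name = name

    def __call__(self, model, output_path=None, epoch=-1, steps=-1):
        anchor_emb = model.encode(self.anchors)
        positive_emb = model.encode(self.positives)
        negative_emb = model.encode(self.negatives)
        correct = sum(
            cos_sim(a, p).item() > cos_sim(a, n).item()
            for a, p, n in zip(anchor_emb, positive_emb, negative_emb)
        )
        return {f"{self.name}_accuracy": correct / len(self.anchors)}

evaluator = MyTripletAccuracyEvaluator(
    anchors=["What is the capital of France?"],
    positives=["Paris is the capital of France."],
    negatives=["Berlin is the capital of Germany."],
)
print(evaluator(SentenceTransformer("all-MiniLM-L6-v2")))

An instance like this (or a list of evaluators) can then be passed via the evaluator argument of SentenceTransformerTrainer, and its metrics show up in the evaluation logs.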

  • Tom Aarsen

@Samoed
Contributor Author

Samoed commented Oct 18, 2024

Thank you very much! Maybe add a warning about this? This is a bit unexpected behavior.
