llama : add RobertaForSequenceClassification reranker support #13875
Conversation
this will likely need improvements to work as intended with more than 1 label
Could you clarify what would be needed for multiple labels?
I haven't looked too deeply into it, but it has relevance for the loss function at least:
I think the user code should also somehow query the number of labels, correct?
Btw, you can trigger
Ahhh, that's useful!
Probably, which equals the length of the label array in metadata.
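For illustration, the label count (and the labels themselves) could be read straight from the GGUF metadata with the gguf API; a minimal sketch, assuming the converter stores them under a key like "roberta.classifier.output_labels" (the exact key name is an assumption):

```cpp
// standalone sketch: dump the classifier labels stored in a GGUF file
#include <cstdio>
#include "gguf.h"

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s model.gguf\n", argv[0]);
        return 1;
    }

    struct gguf_init_params params = { /*.no_alloc =*/ true, /*.ctx =*/ nullptr };
    struct gguf_context * ctx = gguf_init_from_file(argv[1], params);
    if (ctx == nullptr) {
        fprintf(stderr, "failed to load %s\n", argv[1]);
        return 1;
    }

    // assumed metadata key - adjust to whatever the converter actually writes
    const int64_t kid = gguf_find_key(ctx, "roberta.classifier.output_labels");
    if (kid >= 0) {
        const size_t n_labels = gguf_get_arr_n(ctx, kid); // == n_cls_out
        printf("n_cls_out = %zu\n", n_labels);
        for (size_t i = 0; i < n_labels; ++i) {
            printf("label %zu: %s\n", i, gguf_get_arr_str(ctx, kid, i));
        }
    }

    gguf_free(ctx);
    return 0;
}
```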
@ggerganov Funny, the test still fails, but I can't find anything in the logs?
Did you check the
Ah, that's weird, still the same error...
Oh, great,
The
Yep, figured out a workaround, making a PR.
@ggerganov I looked into it a little more, and as far as I can tell all that is needed is to extract 1 score per label per sequence. I'm not sure how to do that, though; I can see that 1 score is extracted from the beginning of each sequence in
We have to introduce some notion of the number of classification labels. Once we have that, we just need to adjust the constant here: llama.cpp/src/llama-context.cpp, lines 814 to 821 (at 291f2b6).
And update the comment in the public header: lines 913 to 917 (at 291f2b6).
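For context, the referenced lines are presumably the doc comment above llama_get_embeddings_seq(); a sketch of how that comment might read once multiple labels are supported (wording is mine, not the actual header):

```cpp
// Get the embeddings for a sequence id.
// Returns NULL if pooling_type is LLAMA_POOLING_TYPE_NONE.
// When pooling_type == LLAMA_POOLING_TYPE_RANK, returns the classification
// score(s) for the sequence: n_cls_out floats, one per label
// (previously documented as a single float, i.e. one label).
// Otherwise, returns a float[n_embd] vector of the sequence embedding.
LLAMA_API float * llama_get_embeddings_seq(struct llama_context * ctx, llama_seq_id seq_id);
```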
For example, we can add
I already added
Yes, I forgot we already have it. So it should be simple to add this functionality. Maybe as simple as:

```cpp
// extract `n_cls_out` scores:
embd_seq_out[seq_id].resize(n_cls_out);
ggml_backend_tensor_get_async(backend_embd, t_embd, embd_seq_out[seq_id].data(), (seq_id*n_cls_out)*sizeof(float), sizeof(float)*n_cls_out);
```
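If that change is made, the caller side would presumably keep the same API and simply get more floats back per sequence. A minimal usage sketch, assuming ctx, seq_id and n_cls_out are already available to the caller (e.g. n_cls_out taken from the label array in the model metadata):

```cpp
// after llama_decode() with pooling_type == LLAMA_POOLING_TYPE_RANK,
// each sequence would yield n_cls_out scores instead of a single one
const float * scores = llama_get_embeddings_seq(ctx, seq_id);
if (scores != nullptr) {
    for (uint32_t i = 0; i < n_cls_out; ++i) {
        printf("seq %d, label %u: score = %.4f\n", seq_id, i, scores[i]);
    }
}
```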
Heh, that worked (I could have sworn I already tried this but only got 0)! I'll make a PR... :)
@ggerganov I got it working, but I need a place to store the labels so I can provide them to the user. Unfortunately
Edit: I found a workaround. :)
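For illustration, one way to surface the stored labels to user code would be a small accessor pair on the model; the names below are hypothetical, just to sketch the shape of such an API:

```cpp
// hypothetical API sketch - not necessarily the actual llama.cpp interface
// returns the number of classifier output labels (n_cls_out)
LLAMA_API uint32_t     llama_model_n_cls_out(const struct llama_model * model);

// returns the i-th classifier label, or NULL if i is out of range
LLAMA_API const char * llama_model_cls_label(const struct llama_model * model, uint32_t i);

// usage:
//   const uint32_t n = llama_model_n_cls_out(model);
//   for (uint32_t i = 0; i < n; ++i) {
//       printf("label %u: %s\n", i, llama_model_cls_label(model, i));
//   }
```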
Adds support for RoBERTa reranker.
Added variable cls_out shape support to be able to load the model (and yes, this will likely need improvements to work as intended with more than 1 label; I'll leave that for a free-for-all follow-up PR).
Also fixes a JinaBert conversion error from #13858.
cc/ @huydt84