Adds DebertaV2/V3 #2743
Conversation
```rust
pub type Label2Id = HashMap<String, u32>;

#[derive(Debug, Clone, PartialEq, Deserialize)]
pub struct Config {
```
This looks a lot like the normal BERT configuration, but in going through the Python Transformers code I realized there were a lot of other configuration tidbits that didn't exist in other BERT models. So, I just started over with a new one just for Deberta.
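To make that concrete, here is a minimal sketch of what a Deberta-specific config might look like, keeping the `Label2Id` alias from the diff above. The field names are taken from the Hugging Face Python DebertaV2 config as I understand it, and the particular extra knobs shown (`relative_attention`, `position_buckets`, `pos_att_type`) are assumptions chosen for illustration, not the PR's actual code.

```rust
use std::collections::HashMap;

// Sketch only: field names follow the Hugging Face DebertaV2 Python config
// as I understand it; exact types and which fields matter are assumptions.
pub type Id2Label = HashMap<u32, String>;
pub type Label2Id = HashMap<String, u32>;

#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    // Fields shared with a plain BERT-style config.
    pub hidden_size: usize,
    pub num_attention_heads: usize,
    // Deberta-specific knobs that have no BERT counterpart.
    pub relative_attention: bool,
    pub position_buckets: isize,
    pub pos_att_type: Option<Vec<String>>,
    // Optional label maps used for token/sequence classification.
    pub id2label: Option<Id2Label>,
    pub label2id: Option<Label2Id>,
}
```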
Using a simple Python example, such as:

```python
from transformers import pipeline

pipe = pipeline("token-classification", model="Clinical-AI-Apollo/Medical-NER")
result = pipe('45 year old woman diagnosed with CAD')
print(f"{result}")
```

This produces the following results: This is in comparison with using:

```shell
cargo run --example debertav2 --release --features=cuda -- \
  --model-id=Clinical-AI-Apollo/Medical-NER --revision=main \
  --sentence='45 year old woman diagnosed with CAD'
```

which results in: There's a tiny amount of precision difference between the Python and Rust versions, but from my understanding it's so insignificant that it does not make a difference in accuracy or performance.
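That "tiny amount of precision difference" can be checked mechanically. Here is a minimal sketch of such a check; the helper name and tolerance are mine, and the score values used below are made up for illustration:

```rust
// Returns true when two score vectors agree element-wise within `eps`.
// A loose tolerance such as 1e-3 is usually plenty when comparing f32
// scores produced by two different runtimes (Python vs. Rust here).
fn scores_match(a: &[f32], b: &[f32], eps: f32) -> bool {
    a.len() == b.len() && a.iter().zip(b.iter()).all(|(x, y)| (x - y).abs() <= eps)
}
```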
LaurentMazare left a comment
Looks pretty good. I've put some mostly cosmetic comments inline; it would be great if you could apply them to the whole file, as I didn't bother repeating them. Mostly we should avoid everything that can panic, i.e. return actual errors rather than unwrap, use bail! to shorten error generation, etc.
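For anyone following along, the kind of change being asked for looks roughly like this. The `bail!` below is a std-only stand-in for the macro candle actually provides (which produces a proper error type), and the lookup function and label map are hypothetical:

```rust
use std::collections::HashMap;

// Std-only stand-in for a `bail!`-style macro: formats a message and
// returns early with an Err instead of panicking. (candle's real macro
// builds its own error type; this sketch just uses String.)
macro_rules! bail {
    ($($arg:tt)*) => { return Err(format!($($arg)*)) };
}

// Instead of `id2label[&id]` or `.unwrap()`, which panic on a missing id,
// return an actual error the caller can handle.
fn label_for(id: u32, id2label: &HashMap<u32, String>) -> Result<String, String> {
    match id2label.get(&id) {
        Some(label) => Ok(label.clone()),
        None => bail!("no label found for id {id}"),
    }
}
```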
No problem, I can do that later today. Thanks for reviewing it, and if you see any other things that should change, please let me know.

Ok, I pushed up some updates. Feel free to review it at your leisure and let me know about anything else you find!

Thanks for the PR! I've made some small tweaks to avoid some cases with

@LaurentMazare Thanks! I appreciate the merge. This was a fairly complicated model, but it seems like there could be use for it in other projects.
Deberta v3 large is the best model out there for text classification; it's top-rated in Kaggle competitions.
This adds Microsoft's Deberta V2 and V3 models into the candle ecosystem. It also includes an example file demonstrating how to use it, as well as a README for more information.

At the time of this commit, the model can only do Named Entity Recognition and Text Classification. There are other modes, such as Question Answering, Multiple Choice, and Masked Input, that could still be developed at a later point in time.