Fine-Tuned BERT-base on SQuAD v1. #47

Closed
Maaarcocr opened this issue Nov 20, 2018 · 2 comments

Comments

@Maaarcocr

I have fine-tuned the TF model on SQuAD v1 and I've made the weights available at: https://s3.eu-west-2.amazonaws.com/nlpfiles/squad_bert_base.tgz

I get 88.5 F1 using these weights on the SQuAD dev set (and, if I recall correctly, roughly 82 EM).

I think it would be beneficial to host these weights here, so that people can experiment with SQuAD and BERT without having to fine-tune the model themselves, which requires a reasonably powerful setup. Let me know what you think!
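
For anyone who wants to try the shared checkpoint, here is a minimal fetch-and-unpack sketch. The URL comes from the comment above; the archive's internal layout (bert_config.json, vocab.txt, model.ckpt.*) is an assumption based on the standard Google BERT checkpoint format, not something confirmed in this thread.

```python
# Minimal sketch: download and extract the fine-tuned TF checkpoint.
# The expected contents (bert_config.json, vocab.txt, model.ckpt.*)
# are an assumption based on the usual Google BERT checkpoint layout.
import tarfile
import urllib.request

URL = "https://s3.eu-west-2.amazonaws.com/nlpfiles/squad_bert_base.tgz"

urllib.request.urlretrieve(URL, "squad_bert_base.tgz")
with tarfile.open("squad_bert_base.tgz", "r:gz") as archive:
    archive.extractall("squad_bert_base")
```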

@thomwolf
Member

Thanks for the details.
This PyTorch repo is starting to be used by a larger community, so we would have to be more precise than rough numbers if we want to include such pre-trained weights.
If you want to add your weights to the repo, you should convert them to the PyTorch model format and report evaluation results on SQuAD obtained with the PyTorch model, so that everybody knows exactly what they are using. Otherwise, I think it's better that people do their own training and know the capabilities of the fine-tuned model they are using.
Feel free to come back and re-open the issue if this is something you would like to do.
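
A hedged sketch of the TF-to-PyTorch conversion thomwolf describes, written against the current transformers API rather than the conversion script the 2018-era repo shipped; the checkpoint paths are assumptions matching the extraction sketch above, and TensorFlow must be installed for the TF loading path to work.

```python
# Sketch: load the original TF checkpoint into the PyTorch model class,
# then save PyTorch weights that can be shared and evaluated.
# Requires TensorFlow to be installed; paths are assumptions.
from transformers import BertConfig, BertForQuestionAnswering

config = BertConfig.from_json_file("squad_bert_base/bert_config.json")
model = BertForQuestionAnswering.from_pretrained(
    "squad_bert_base/model.ckpt.index",  # original TF checkpoint index file
    from_tf=True,
    config=config,
)
model.save_pretrained("squad_bert_base_pytorch")  # PyTorch weights for sharing
```

With the weights converted, the PyTorch model can be evaluated on the SQuAD dev set to produce the "clean" numbers thomwolf asks for.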

@wasiahmad

wasiahmad commented Apr 18, 2019

@thomwolf On SQuAD v1.1, BERT (single) scored 85.083 EM and 91.835 F1 as reported in the paper, but when I fine-tuned BERT using run_squad.py I got {"exact_match": 81.0975, "f1": 88.7005}. Why is there a difference? What am I missing?
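
For context: the 85.083 EM / 91.835 F1 entry on the SQuAD v1.1 leaderboard is BERT-large (single model) on the hidden test set, whereas the paper reports about 80.8 EM / 88.5 F1 for BERT-base on the dev set, which is close to the run_squad.py numbers quoted above. Independently of which reference is used, a predictions file can be re-scored consistently; the sketch below mirrors the core EM/F1 logic of the official evaluate-v1.1.py script (a re-implementation for illustration, not the script itself).

```python
# Re-implementation of the SQuAD v1.1 EM/F1 core, mirroring the logic of
# the official evaluate-v1.1.py script, for re-scoring predictions.
import collections
import re
import string


def normalize_answer(s: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())


def exact_match_score(prediction: str, ground_truth: str) -> float:
    return float(normalize_answer(prediction) == normalize_answer(ground_truth))


def f1_score(prediction: str, ground_truth: str) -> float:
    pred_tokens = normalize_answer(prediction).split()
    gt_tokens = normalize_answer(ground_truth).split()
    common = collections.Counter(pred_tokens) & collections.Counter(gt_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gt_tokens)
    return 2 * precision * recall / (precision + recall)
```

A remaining gap against the paper often comes down to hyperparameters: the paper fine-tunes with an effective batch size of 32, which on a single 12 GB GPU typically requires gradient accumulation in run_squad.py.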

stevezheng23 added a commit to stevezheng23/transformers that referenced this issue Mar 24, 2020
add mat-coqa runner with multitask + adversarial training support (co…
younesbelkada pushed a commit to younesbelkada/transformers that referenced this issue Nov 30, 2022
xloem pushed a commit to xloem/transformers that referenced this issue Apr 9, 2023
* Update trainer and model flows to accommodate sparseml

Disable FP16 on QAT start (huggingface#12)

* Override LRScheduler when using LRModifiers

* Disable FP16 on QAT start

* keep wrapped scaler object for training after disabling

Using QATMatMul in DistilBERT model class (huggingface#41)

Removed double quantization of output of context layer. (huggingface#45)

Fix DataParallel validation forward signatures (huggingface#47)

* Fix: DataParallel validation forward signatures

* Update: generalize forward_fn selection

Best model after epoch (huggingface#46)

fix scaler check for non fp16 mode in trainer (huggingface#38)

Mobilebert QAT (huggingface#55)

* Remove duplicate quantization of vocabulary.

enable a QATWrapper for non-parameterized matmuls in BERT self attention (huggingface#9)

* Utils and auxiliary changes

update Zoo stub loading for SparseZoo 1.1 refactor (huggingface#54)

add flag to signal NM integration is active (huggingface#32)

Add recipe_name to file names

* Fix errors introduced in manual cherry-pick upgrade

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
jameshennessytempus pushed a commit to jameshennessytempus/transformers that referenced this issue Jun 1, 2023
ocavue pushed a commit to ocavue/transformers that referenced this issue Sep 13, 2023
ZYC-ModelCloud pushed a commit to ZYC-ModelCloud/transformers that referenced this issue Nov 14, 2024
* Update model list

* Update README.md

---------

Co-authored-by: Qubitium-modelcloud <qubitium@modelcloud.ai>
ZYC-ModelCloud pushed a commit to ZYC-ModelCloud/transformers that referenced this issue Nov 14, 2024
…ngface#47) (huggingface#49)

* fix cannot pickle 'module' object for 8 bit

* remove unused import

* remove print

* check with tuple

* revert to len check

* add test for 8bit

* set same QuantizeConfig

* check if it's 4 bit

* fix grammar

* remove params

* it's not a list

* set gptqmodel_cuda back

* check is tuple

* format

* set desc_act=True

* set desc_act=True

* format

* format

* Refactor fix

* desc_act=True

---------

Co-authored-by: Qubitium <Qubitium@modelcloud.ai>