couldn't replicate scores on en_conll2003 #16
Comments
@zhaoxf4 I've checked your data; it is identical to mine, and I also tested my best model on it and got the same results reported in the paper. The log is attached below. I got at least 93 over 5 runs, so 92.1 is way too low :) I can see three differences:
@juntaoy
@wangxinyu0922 I did 3 runs with a BiLSTM+CRF model on CoNLL 03 English; the best I can get is 91.9, a bit lower than the BERT paper's 92.8. They might have gotten lucky with that number :)
Thank you. Could you give me some details of your CRF model so that I can do more experiments on this topic?
Yes, it is the same as in my paper, except that window_size is 512 rather than 128, so make sure you download the latest extract_bert_features.sh.
Can you share your extracted BERT embeddings? I used the latest version of extract_bert_features.sh.
Apart from the embeddings, I also found a difference between the logs:
Well, I successfully reproduced the F1 after using the jsonlines file provided by @zhaoxf4.
@wangxinyu0922 use_lee_lstm will not affect the results :) The final version does use the custom LSTM from the Lee et al. 2018 system. It was just an experiment I did to see whether we could get the same results with the default TensorFlow LSTM; the answer is yes, the improvement from the custom LSTM is minimal (0.1-0.2).
@juntaoy OK, I think I have approximately replicated it on eng_conll2003.
Evaluating your ACL 2020 best model gives:
Are there any sources of randomness without a fixed seed? I want to understand the variation across the six runs, and I'm checking the code.
I didn't set a seed for the final version. I tried some time ago, and for an unknown reason, even after fixing both the Python seed and the TensorFlow seed I still got different results; it somehow doesn't work for my code. I'm not sure whether this is because of the multi-threading I use.
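For reference, a minimal sketch of the usual seeding steps (`set_seeds` is a hypothetical helper, not part of this repo). As noted above, even with all of these fixed, a multi-threaded TensorFlow graph may still produce nondeterministic results:

```python
import os
import random

import numpy as np

def set_seeds(seed=42):
    """Fix the common sources of randomness. A hypothetical helper:
    even with these set, multi-threaded TF ops can stay nondeterministic."""
    os.environ["PYTHONHASHSEED"] = str(seed)  # hash randomization
    random.seed(seed)                         # Python RNG
    np.random.seed(seed)                      # NumPy RNG
    try:
        import tensorflow as tf
        tf.compat.v1.set_random_seed(seed)    # TF graph-level seed
    except ImportError:
        pass  # keep the sketch runnable without TF installed

# Re-seeding should at least make the Python-level draws repeatable:
set_seeds(7)
a = random.random()
set_seeds(7)
b = random.random()
```

This pins the Python, NumPy, and TF seeds; what it cannot pin is the op scheduling order across threads, which is one plausible reason fixing the seeds did not help here.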
Hello, I reproduced the eng_conll2003 experiment with the dataset and model settings you provided, but I ran into a problem with the model save path: I don't know how to change the directory where the model is saved.
I'm sorry to bother you, but I couldn't replicate the same scores on the en_conll2003 dataset.
I only reached 92.12, 1.4 lower than yours.
I checked my dataset and made sure the labels match the official CoNLL-2003 annotations, so I think my en_conll2003 is complete. The dataset was downloaded from here.
Then I wrote a script named "data_process.py" (Python 3) to convert the CoNLL-X format into your dict format, using '-DOCSTART- -X- -X- O' to split documents. I believe the script is fine because extract_feature.sh and evaluate.py run without errors, and I also checked the processed dataset, so I think the difference between your dataset and mine is very small.
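The document-splitting step above can be sketched as follows. This is a hypothetical reader (`read_conll2003` is my name, not the script's), and it stops at grouped (token, NER-tag) sentences; the exact dict/jsonlines schema the repo expects is not shown here because I'm not certain of it:

```python
def read_conll2003(lines):
    """Split CoNLL-2003 lines into documents at '-DOCSTART-' markers.
    Returns a list of documents; each document is a list of sentences;
    each sentence is a list of (word, ner_tag) pairs (NER tag is the
    last column in the standard CoNLL-2003 layout)."""
    docs, sentences, tokens = [], [], []
    for line in lines:
        line = line.strip()
        if line.startswith("-DOCSTART-"):
            if sentences:               # flush the previous document
                docs.append(sentences)
            sentences, tokens = [], []
        elif not line:                  # blank line ends a sentence
            if tokens:
                sentences.append(tokens)
                tokens = []
        else:
            cols = line.split()
            tokens.append((cols[0], cols[-1]))
    if tokens:                          # flush trailing sentence/document
        sentences.append(tokens)
    if sentences:
        docs.append(sentences)
    return docs
```

A small sanity check on two sentences of the well-known "EU rejects German call" example should yield one document with two sentences, with `("EU", "B-ORG")` as the first token.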
environment:
The only difference is that I replaced fastText with GloVe 6B, because I couldn't find fasttext/cc.en.300.vec.filtered; I only found a vector file named "cc.en.300.vec" from fastText. I got an error when I tried "cc.en.300.vec"; this is the first time I've used fastText and I don't know what happened. The error log is as follows:
I then fell back to GloVe 6B, since I assumed the embeddings shouldn't have a huge impact.
This is the log of the pre-trained en_conll2003 model from your "acl2020 best models" link:
This is the log of the model I trained myself:
The raw_en_conll2003 data, the en_conll2003 I processed, and data_process.py (Python 3) are in sharelink1 or sharelink2, along with the extract_feature.sh and experiments.conf I used. You can check them.
Do you know what went wrong?
Is it the dataset, the fastText embeddings, or the hyperparameters?
Can you help me? Thank you.