
MultiRC task: v1.0 results are better than v1.3 (unknown reason, after reviewing changes) #1081

Open
jeswan opened this issue Sep 17, 2020 · 1 comment

Comments

jeswan (Contributor)

jeswan commented Sep 17, 2020

Issue by varunchaudharycs
Wednesday Apr 29, 2020 at 06:22 GMT
Originally opened as nyu-mll/jiant#1081


Hi,
I'm trying out the MultiRC task (Khashabi et al., 2018). The two jiant versions gave the following F1 scores (using BERT-base):
jiant v1.3 = ~58
jiant v1.0 = ~65
with both using the same configuration. I reviewed the changes between versions and found one difference:
v1.0 gives (para + ques + ans option) as BERT input
v1.3 gives ([CLS] para [SEP] ques + ans option [SEP]) as BERT input
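The difference between the two formats can be sketched as follows. This is a minimal illustration with hypothetical helper names and plain strings, not actual jiant code, and it omits tokenization details:

```python
# Sketch of the two MultiRC input formats described above.
# These helpers are hypothetical, for illustration only; real jiant code
# builds token sequences via a BERT tokenizer, not string concatenation.

def build_input_v1_0(para: str, ques: str, ans: str) -> str:
    # v1.0: paragraph, question, and answer option concatenated directly,
    # with no BERT special tokens separating the segments.
    return f"{para} {ques} {ans}"

def build_input_v1_3(para: str, ques: str, ans: str) -> str:
    # v1.3: standard BERT sentence-pair format, with the paragraph as
    # segment A and (question + answer option) as segment B.
    return f"[CLS] {para} [SEP] {ques} {ans} [SEP]"

if __name__ == "__main__":
    para = "Susan went to the market on Monday."
    ques = "When did Susan go to the market?"
    ans = "On Monday."
    print(build_input_v1_0(para, ques, ans))
    print(build_input_v1_3(para, ques, ans))
```

In the v1.3 format, the `[SEP]` token between the paragraph and the question/answer pair also implies different segment (token type) IDs for the two parts, which changes what the model sees beyond the surface tokens.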

I tried replicating the v1.0 input format in v1.3 but observed no improvement in F1 (in fact, F1 decreased).

Kindly let me know if there are other changes between the two versions that I may have missed. Such a drastic drop in performance after a version upgrade is hard to understand.

jeswan (Contributor, Author)

jeswan commented Sep 17, 2020

Comment by W4ngatang
Tuesday May 05, 2020 at 02:45 GMT


Hey Varun,

Sorry to hear that. So that we have more info, can you share your config file? Also, has the drop been consistent across random seeds both before and after the version change?
