
MS MARCO Document Retrieval Replication #137

Merged (2 commits) on Jan 27, 2021

Conversation

Dahlia-Chehata
Contributor

Re-Ranking with monoT5

Some values are not exactly the same as documented, but no other issues were found in the replication.

First Half

precision@1 0.24
recall@3    0.44
recall@50   0.84
recall@1000 0.88
mrr         0.39115
mrr@10      0.38205

Second Half

precision@1 0.24
recall@3    0.36
recall@50   0.76
recall@1000 0.88
mrr         0.32717
mrr@10      0.30778
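For reference, a minimal sketch of how an mrr@10 figure like the ones above is computed (hypothetical doc ids; the actual values come from pygaggle's evaluate_document_ranker over the dev queries):

```python
def mrr_at_k(ranked_doc_ids, relevant_ids, k=10):
    """Reciprocal rank of the first relevant doc within the top k, else 0."""
    for rank, doc_id in enumerate(ranked_doc_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

# Average over queries: each entry is (ranked doc ids, relevant doc ids).
runs = [(['D2', 'D7', 'D1'], {'D1'}),
        (['D5', 'D3'], {'D3'})]
print(sum(mrr_at_k(r, rel) for r, rel in runs) / len(runs))  # ~0.4167
```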

@ronakice
Member

ronakice commented Jan 5, 2021

Weird that this is showing such large differences. Were you able to isolate which PR could have caused this? I'll look into it in a bit, although it might take a while with some paper deadlines around. @richard3983 or @MXueguang, perhaps one of you can check this out since you added this functionality.

@MXueguang
Member

The results seem inconsistent. Here is my Colab link:

https://colab.research.google.com/drive/1g_cmhSnPUZmatu1eQezMLgRBZzLV71LR?usp=sharing

First Half

2021-01-07 20:08:00 [INFO] evaluate_document_ranker: precision@1 0.2
2021-01-07 20:08:00 [INFO] evaluate_document_ranker: recall@3    0.56
2021-01-07 20:08:00 [INFO] evaluate_document_ranker: recall@50   0.84
2021-01-07 20:08:00 [INFO] evaluate_document_ranker: recall@1000 0.88
2021-01-07 20:08:00 [INFO] evaluate_document_ranker: mrr         0.38882
2021-01-07 20:08:00 [INFO] evaluate_document_ranker: mrr@10      0.38271
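For context, the step this log evaluates reranks candidates with pygaggle's MonoT5 wrapper. A minimal sketch of that API with a made-up query and documents (the real run reranks first-stage candidates from the MS MARCO doc index):

```python
from pygaggle.rerank.base import Query, Text
from pygaggle.rerank.transformer import MonoT5

# Defaults to the castorini/monot5-base-msmarco checkpoint.
reranker = MonoT5()

query = Query('who proposed the geocentric theory')
texts = [  # doc ids and contents are made up for illustration
    Text('Ptolemy proposed the geocentric model of the universe.',
         {'docid': 'D1'}, 0),
    Text('Copernicus argued for a heliocentric model.',
         {'docid': 'D2'}, 0),
]

for result in reranker.rerank(query, texts):
    print(result.metadata['docid'], result.score)
```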

@yuxuan-ji
Contributor

yuxuan-ji commented Jan 8, 2021

I think this might be related to #132? It seems like the weights aren't properly loaded here either. Not sure if that's expected.

From the colab link above:

[screenshot from the Colab: weight-initialization warning when loading the model]

EDIT:
This happens even when I manually import the model (not using MonoT5.get_model) on transformers-4.0.0
[screenshot: the same warning when importing the model manually on transformers-4.0.0]

EDIT:
From looking at an old replication I did, it seems like this didn't happen before:
[screenshot: log from the earlier replication, without the warning]

EDIT:
Seems like it's not an issue?
huggingface/transformers#8933

EDIT:
From what I understand from huggingface/transformers#8518, it seems like it could cause very small differences in outputs.

The last replication was done on commit e815051, which used transformers==2.10.0, so I'd check whether reverting transformers fixes this.
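A minimal sketch of reproducing that warning by loading the checkpoint directly, bypassing MonoT5.get_model (per huggingface/transformers#8933 the message should be benign for T5's tied weights):

```python
from transformers import T5ForConditionalGeneration

# Under transformers 4.x this emits the weight-initialization warning
# shown in the screenshots above; per huggingface/transformers#8933
# it is expected for T5 checkpoints.
model = T5ForConditionalGeneration.from_pretrained(
    'castorini/monot5-base-msmarco')
```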

@ronakice
Member

ronakice commented Jan 8, 2021

Reverting probably will, but we should not revert, because the T5 implementation has since been "fixed" in Hugging Face (earlier there were more inconsistencies). The weird bit here is that our MS MARCO Passage results are consistent; is it the case that the tokenization differences have somehow not been fixed for Document Ranking, @MXueguang?
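One way to check would be to compare the token ids the two transformers versions produce for the same monoT5 input. A sketch of the 4.x side (the query and segment strings are placeholders; the template is monoT5's standard one):

```python
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained('t5-base')

# monoT5 scores each (query, text) pair through this template; for the
# document task the text is a document segment rather than a passage.
query = 'who proposed the geocentric theory'     # placeholder
segment = 'Ptolemy proposed the geocentric ...'  # placeholder
ids = tokenizer.encode(f'Query: {query} Document: {segment} Relevant:',
                       truncation=True, max_length=512)
print(ids[:10])  # compare against the ids produced under transformers 2.x
```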

@MXueguang
Member

I am replicating on different commits; I need to confirm whether the issue comes from the 4.0.0 upgrade, or else locate the commit where this issue starts.

The T5 model is fixed, and based on huggingface/transformers#8933 and huggingface/transformers#8518, that warning should not be a big issue. If the issue is not the 4.0.0 upgrade, then since the Passage results never changed, the inconsistencies may come from the tokenization steps for documents in our implementation.

I think we need to locate the exact commit that causes this issue first. I reverted back to the commit of the 4.0.0 upgrade and it has the issue; now I am running on the commit right before the upgrade to see if that one is good.

@MXueguang
Member

MXueguang commented Jan 10, 2021

The inconsistency seems to be caused by the transformers upgrade.
I compared the first 10 passages' next_token_logits from the decoder (for the first MS MARCO doc):

with transformers v4:

tensor([[-29.3127, -18.0408, -19.5026,  ..., -49.6962, -49.7317, -49.7931],
        [-29.6830, -18.3973, -19.9881,  ..., -49.9085, -49.9592, -50.0153],
        [-30.4825, -18.3005, -20.1569,  ..., -51.0060, -51.0359, -51.1103],
        ...,
        [-64.3875, -35.6707, -40.9387,  ..., -84.8546, -84.7865, -85.1493],
        [-65.1458, -36.1620, -41.0997,  ..., -85.3089, -85.2801, -85.6151],
        [-54.9698, -31.8845, -34.6949,  ..., -77.4517, -77.4881, -77.6714]],
       device='cuda:0')

with transformers v2:

tensor([[-29.3127, -18.0408, -19.5026,  ..., -49.6963, -49.7317, -49.7931],
        [-29.6830, -18.3973, -19.9881,  ..., -49.9085, -49.9592, -50.0154],
        [-30.4825, -18.3005, -20.1569,  ..., -51.0060, -51.0359, -51.1103],
        ...,
        [-64.3875, -35.6707, -40.9387,  ..., -84.8546, -84.7865, -85.1493],
        [-65.1458, -36.1620, -41.0997,  ..., -85.3089, -85.2801, -85.6150],
        [-54.9698, -31.8845, -34.6948,  ..., -77.4517, -77.4880, -77.6713]],
       device='cuda:0')

Seems to be caused by upstream? @ronakice
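For anyone reproducing the comparison, a sketch of pulling those next_token_logits out for a single pair (placeholder input text; the v2 vs v4 API difference is noted inline):

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained('castorini/monot5-base-msmarco')
model = T5ForConditionalGeneration.from_pretrained(
    'castorini/monot5-base-msmarco').eval()

text = 'Query: ... Document: ... Relevant:'  # placeholder (query, segment) pair
inputs = tokenizer(text, return_tensors='pt', truncation=True, max_length=512)

# One decoder step from the start token; the logits of that step are the
# next_token_logits compared above.
decoder_input_ids = torch.full(
    (1, 1), model.config.decoder_start_token_id, dtype=torch.long)
with torch.no_grad():
    outputs = model(**inputs, decoder_input_ids=decoder_input_ids)

next_token_logits = outputs.logits[:, -1, :]  # under v2: outputs[0][:, -1, :]
```

Note that the diffs in the tensors above sit in the last shown decimal place, which is consistent with a small upstream numerical change rather than a tokenization mismatch.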

@KaiSun314
Collaborator

I tried replicating PyGaggle: Baselines on MS MARCO Document Retrieval but got different results from those documented. Here is what I got:

First half:
[screenshot: Screenshot from 2021-01-13 13-06-36]

Second half:
[screenshot: Screenshot from 2021-01-14 12-02-39]

@MXueguang
Member

My replication result is identical to @KaiSun314's.
@Dahlia-Chehata, did you replicate from the current HEAD of master, and install pygaggle in a new venv?

@Dahlia-Chehata
Contributor Author

@MXueguang yes, let me try to reproduce the results one more time though.

@ronakice
Member

Thanks @Dahlia-Chehata! @KaiSun314, if you can make a separate PR now saying you've matched the new results, that would be great :)

@ronakice ronakice merged commit 46bb71d into castorini:master Jan 27, 2021
@KaiSun314
Collaborator

Yep, the PR has been created.
