reproduced results and updated pygaggle/docs/experiments-msmarco-passage-subset.md #309

Open: wants to merge 5 commits into base: master
176 changes: 175 additions & 1 deletion docs/experiments-msmarco-passage-subset.md
@@ -4,7 +4,7 @@ This page contains instructions for running various neural reranking baselines o
Note that there is also a separate [MS MARCO *document* ranking task](https://github.com/castorini/anserini/blob/master/docs/experiments-msmarco-doc.md).

Prior to running this, we suggest looking at our first-stage [BM25 ranking instructions](https://github.com/castorini/anserini/blob/master/docs/experiments-msmarco-passage.md).
We rerank the BM25 run files that contain ~1000 passages per query using both monoBERT and monoT5.
monoBERT and monoT5 are pointwise rerankers. This means that each document is scored independently using either BERT or T5 respectively.
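
To make "pointwise" concrete, here is a minimal sketch of the idea: every (query, passage) pair is scored on its own, and the candidates are then sorted by score. The names `pointwise_rerank` and `term_overlap` are hypothetical and not part of PyGaggle; the toy term-overlap scorer only stands in for a neural model such as monoBERT or monoT5.

```python
from typing import Callable, List, Tuple


def pointwise_rerank(
    query: str,
    passages: List[Tuple[str, str]],        # (docid, text) pairs from first-stage retrieval
    score_fn: Callable[[str, str], float],  # hypothetical stand-in for monoBERT / monoT5
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair independently, then sort by score."""
    scored = [(docid, score_fn(query, text)) for docid, text in passages]
    # Higher score means more relevant; the sort is stable, so ties keep input order
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def term_overlap(query: str, text: str) -> float:
    """Toy scorer (NOT a neural model): fraction of query terms found in the passage."""
    q_terms = set(query.lower().split())
    t_terms = set(text.lower().split())
    return len(q_terms & t_terms) / len(q_terms) if q_terms else 0.0


candidates = [
    ("d1", "exercise and medication can help"),
    ("d2", "foods and supplements that lower blood sugar"),
]
reranked = pointwise_rerank("foods to lower blood sugar", candidates, term_overlap)
print([docid for docid, _ in reranked])
```

Because no passage's score depends on any other passage, the candidates can be scored in batches of any size without changing the final ranking.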

Since it can take many hours to run these models on all of the 6980 queries from the MS MARCO dev set, we will instead use a subset of 105 queries randomly sampled from the dev set.
@@ -25,6 +25,12 @@ Then install PyGaggle using:
pip install pygaggle/
```

Lastly, install `faiss` using:

```
pip install faiss-cpu
```

## Models

+ monoBERT-Large: Passage Re-ranking with BERT [(Nogueira et al., 2019)](https://arxiv.org/pdf/1901.04085.pdf)
@@ -46,6 +52,24 @@ Next, we extract the contents into `data`.
unzip data/msmarco_ans_small.zip -d data
```

We should now have these files in `data/msmarco_ans_small/`:
```
ls data/msmarco_ans_small -1
qrels.dev.small.tsv
queries.dev.small.tsv
run.dev.small.tsv
scores
```

Let's also download the MS MARCO passage collection so that we can look at the actual passage text after re-ranking.
> Member: Are we only downloading it to visualize it?
>
> Member Author: Yes, since the data/msmarco_ans_small itself does not include the passages. Any suggestion on how we can only download the passages corresponding to data/msmarco_ans_small?
>
> Member: How were we reranking without this though?

```
mkdir -p collections/msmarco-passage

wget https://msmarco.blob.core.windows.net/msmarcoranking/collectionandqueries.tar.gz -P collections/msmarco-passage

tar xvfz collections/msmarco-passage/collectionandqueries.tar.gz -C collections/msmarco-passage
```

As a sanity check, we can evaluate the first-stage retrieved documents using the official MS MARCO evaluation script.

```
@@ -61,6 +85,48 @@ QueriesRanked: 105
#####################
```
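For intuition, the MRR@10 metric reported by the evaluation script can be sketched in a few lines of Python. This is a simplified illustration, not the official script: the function name `mrr_at_k` and the in-memory dictionaries are assumptions made for this example, and the official script's input validation and edge-case handling may differ. The docids below are the top five BM25 hits and the judged-relevant passages for `qid` 188714 from this subset.

```python
from typing import Dict, List, Set


def mrr_at_k(run: Dict[str, List[str]], qrels: Dict[str, Set[str]], k: int = 10) -> float:
    """Mean reciprocal rank of the first relevant passage within the top k hits."""
    if not qrels:
        return 0.0
    total = 0.0
    for qid, relevant in qrels.items():
        # Queries missing from the run contribute a reciprocal rank of 0
        for rank, docid in enumerate(run.get(qid, [])[:k], start=1):
            if docid in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(qrels)


qrels = {"188714": {"8003843", "4321745", "8003849"}}
bm25_run = {"188714": ["2133570", "4321742", "4321745", "8523352", "3573129"]}
print(mrr_at_k(bm25_run, qrels))  # first relevant passage is at rank 3
```

For this single query the first relevant passage sits at rank 3, so its reciprocal rank is 1/3; the reported MRR@10 is the mean of such values over all queries.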

<details>
<summary>What's going on here?</summary>

If you peek inside the `data/msmarco_ans_small/run.dev.small.tsv` file:
```
head -5 data/msmarco_ans_small/run.dev.small.tsv
188714 2133570 1
188714 4321742 2
188714 4321745 3
188714 8523352 4
188714 3573129 5
```

You will notice that the first column is the `qid`, which corresponds to a query in `data/msmarco_ans_small/queries.dev.small.tsv`; the second column is the `docid` of the retrieved result (i.e., the hit); and the third column is the rank position. That is, in a search interface, for `qid` 188714, `docid` 2133570 would be shown in the top position, `docid` 4321742 in the second position, and so on.
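
The same three-column layout is easy to load programmatically. The sketch below groups docids by `qid` in rank order; `parse_run` is a hypothetical helper written for this illustration (not a PyGaggle function), and it assumes whitespace-separated columns as shown above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def parse_run(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group docids by qid, ordered by rank. Each line holds: qid, docid, rank."""
    by_qid = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        qid, docid, rank = line.split()
        by_qid[qid].append((int(rank), docid))
    # Sort each query's hits by rank and keep only the docids
    return {qid: [docid for _, docid in sorted(pairs)] for qid, pairs in by_qid.items()}


sample = """188714\t2133570\t1
188714\t4321742\t2
188714\t4321745\t3
188714\t8523352\t4
188714\t3573129\t5"""

run = parse_run(sample.splitlines())
print(run["188714"][0])  # docid shown in the top position
```

To load the real file, pass an open file handle instead of the sample lines, e.g. `parse_run(open("data/msmarco_ans_small/run.dev.small.tsv"))`.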

Now, let's see the actual query with `qid` 188714:
```
grep 188714 data/msmarco_ans_small/queries.dev.small.tsv
188714 foods and supplements to lower blood sugar
```


Let's also see the passage text of the first hit by grepping `docid` 2133570:
```
grep 2133570 collections/msmarco-passage/collection.tsv
2133570 A healthy diet is essential to reversing prediabetes. There are no foods, herbs, drinks, or supplements that lower blood sugar. Only medication and exercise can. But there are things you can eat and drink that are low on the glycemic index (GI). This means these foods won’t raise your blood sugar and may help you avoid a blood sugar spike.
```
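Note that `grep` matches the number anywhere on a line, so it could in principle also return passages whose id or text merely contains `2133570`. A minimal sketch of an exact-match lookup follows, assuming `collection.tsv` stores one passage per line as `docid<TAB>text`; `find_passage` is a hypothetical helper, and the in-memory rows are made up for illustration rather than taken from the collection.

```python
import io
from typing import Iterable, Optional


def find_passage(lines: Iterable[str], docid: str) -> Optional[str]:
    """Return the text of the passage whose first tab-separated field equals docid."""
    for line in lines:
        pid, _, text = line.rstrip("\n").partition("\t")
        if pid == docid:
            return text
    return None  # docid not present


# Tiny in-memory stand-in for collection.tsv (illustrative rows, not real passages)
fake_collection = io.StringIO(
    "213357\tshorter id that is a prefix\n"
    "9\tpassage that mentions 2133570 in its text\n"
    "2133570\ttarget passage\n"
)
print(find_passage(fake_collection, "2133570"))
```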
Let's verify whether `docid` 2133570 is actually relevant to our query (`qid` 188714) by checking `data/msmarco_ans_small/qrels.dev.small.tsv`, which contains relevance judgments from human annotators:

```
grep 188714 data/msmarco_ans_small/qrels.dev.small.tsv
188714 0 8003843 1
188714 0 4321745 1
188714 0 8003849 1
```

Recall that in a `qrels` file, the first column is the `qid` of a query, the third column is the `docid` of a passage, and the last column is the relevance label for that pair (`1` means relevant and `0` means not relevant). In this case, `docid` 2133570 does not appear in the third column for `qid` 188714, so it was not judged relevant and should not be shown to the user, especially not in the top position!
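
The same check can be done in Python. This is a small sketch under the assumption that each qrels line has four whitespace-separated columns as shown above; `parse_qrels` is a hypothetical helper name, not part of PyGaggle.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def parse_qrels(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each qid to the set of docids judged relevant (label > 0)."""
    relevant = defaultdict(set)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        qid, _, docid, label = line.split()  # second column is unused here
        if int(label) > 0:
            relevant[qid].add(docid)
    return dict(relevant)


qrels = parse_qrels([
    "188714 0 8003843 1",
    "188714 0 4321745 1",
    "188714 0 8003849 1",
])
print("2133570" in qrels["188714"])  # top BM25 hit: False
print("4321745" in qrels["188714"])  # third BM25 hit: True
```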

We will later see whether re-ranking with monoBERT and monoT5 improves the ranking for this query.
</details>
</br>

Let's download and extract the pre-built MS MARCO index into `indexes`:

```
@@ -102,6 +168,60 @@ In this case, assigning a batch size (using option `--batch-size`) which is smal

The re-ranked run file `run.monobert.ans_small.dev.tsv` will also be available in the `runs` directory upon completion.

<details>
<summary>What's going on here?</summary>

If you peek inside the generated `runs/run.monobert.ans_small.dev.tsv`:
```
head -5 runs/run.monobert.ans_small.dev.tsv
188714 4321745 1
188714 6301923 2
188714 6442308 3
188714 1051360 4
188714 4816868 5
```
You will notice that the first column is the `qid`, which corresponds to a query in `data/msmarco_ans_small/queries.dev.small.tsv`; the second column is the `docid` of the retrieved result (i.e., the hit); and the third column is the rank position. That is, in a search interface, for `qid` 188714, `docid` 4321745 would be shown in the top position, `docid` 6301923 in the second position, and so on.

Now, let's see the actual query with `qid` 188714:
```
grep 188714 data/msmarco_ans_small/queries.dev.small.tsv
188714 foods and supplements to lower blood sugar
```

Let's also see the passage text of the first hit by grepping `docid` 4321745:
```
grep 4321745 collections/msmarco-passage/collection.tsv
4321745 Food And Supplements That Lower Blood Sugar Levels. Cinnamon: Researchers are finding that cinnamon reduces blood sugar levels naturally when taken daily. If you absolutely love cinnamon you can sprinkle the recommended six grams of cinnamon on your food throughout the day to achieve the desired effect.
```

In this case, the passage seems relevant to the query. Let's now compare it with the top hit from the original `data/msmarco_ans_small/run.dev.small.tsv` run file. Grep the first hit for `qid` 188714:
```
grep 188714 data/msmarco_ans_small/run.dev.small.tsv | head -1
188714 2133570 1
```

Now, let's grep the passage with `docid` 2133570:
```
grep 2133570 collections/msmarco-passage/collection.tsv
2133570 A healthy diet is essential to reversing prediabetes. There are no foods, herbs, drinks, or supplements that lower blood sugar. Only medication and exercise can. But there are things you can eat and drink that are low on the glycemic index (GI). This means these foods won’t raise your blood sugar and may help you avoid a blood sugar spike.
```

Notice that the top hit from the monoBERT re-ranked run file (`docid` 4321745) seems more relevant to the query with `qid` 188714 than the top hit from the original run file (`docid` 2133570).

Let's verify whether `docid` 4321745 is actually relevant to our query (`qid` 188714) by checking `data/msmarco_ans_small/qrels.dev.small.tsv`, which contains relevance judgments from human annotators:
```
grep 188714 data/msmarco_ans_small/qrels.dev.small.tsv
188714 0 8003843 1
188714 0 4321745 1
188714 0 8003849 1
```

Recall that in a `qrels` file, the first column is the `qid` of a query, the third column is the `docid` of a passage, and the last column is the relevance label for that pair (`1` means relevant and `0` means not relevant). In this case, `docid` 4321745 does appear in the third column for `qid` 188714, so it is a relevant passage that should be shown to the user. By contrast, `docid` 2133570 (the top hit from the original run file) does not appear at all among the relevant passages for `qid` 188714.
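
The before-and-after comparison can be summarized by the rank of the first relevant passage in each run. The sketch below uses the top-five lists and relevance judgments shown above; `first_relevant_rank` is a hypothetical helper introduced only for this illustration.

```python
from typing import List, Optional, Set


def first_relevant_rank(ranked_docids: List[str], relevant: Set[str]) -> Optional[int]:
    """Return the 1-based rank of the first relevant docid, or None if there is none."""
    for rank, docid in enumerate(ranked_docids, start=1):
        if docid in relevant:
            return rank
    return None


relevant = {"8003843", "4321745", "8003849"}  # judged relevant for qid 188714
bm25_top5 = ["2133570", "4321742", "4321745", "8523352", "3573129"]
monobert_top5 = ["4321745", "6301923", "6442308", "1051360", "4816868"]

print(first_relevant_rank(bm25_top5, relevant))      # 3
print(first_relevant_rank(monobert_top5, relevant))  # 1
```

For this query, the relevant passage `docid` 4321745 moves from rank 3 in the BM25 run to rank 1 after re-ranking.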

Thus, for this query, re-ranking with monoBERT moved a relevant passage into the top position.
</details>
</br>

We can use the official MS MARCO evaluation script to verify the MRR@10:

```
@@ -142,6 +262,60 @@ It is worth noting again that you might need to modify the batch size to best fi

Upon completion, the re-ranked run file `run.monot5.ans_small.dev.tsv` will be available in the `runs` directory.

<details>
<summary>What's going on here?</summary>

If you peek inside the generated `runs/run.monot5.ans_small.dev.tsv`:
```
head -5 runs/run.monot5.ans_small.dev.tsv
188714 4321745 1
188714 1051360 2
188714 6442308 3
188714 5499899 4
188714 1022485 5
```
You will notice that the first column is the `qid`, which corresponds to a query in `data/msmarco_ans_small/queries.dev.small.tsv`; the second column is the `docid` of the retrieved result (i.e., the hit); and the third column is the rank position. That is, in a search interface, for `qid` 188714, `docid` 4321745 would be shown in the top position, `docid` 1051360 in the second position, and so on.
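
It can also be instructive to see how the two rerankers order the same candidates. The sketch below compares the monoBERT and monoT5 top-five lists shown on this page for `qid` 188714; `rank_changes` is a hypothetical helper, and only the five hits printed by `head -5` are considered.

```python
from typing import Dict, List, Tuple


def rank_changes(before: List[str], after: List[str]) -> Dict[str, Tuple[int, int]]:
    """For docids present in both lists, map docid -> (rank in before, rank in after)."""
    pos_before = {docid: rank for rank, docid in enumerate(before, start=1)}
    return {
        docid: (pos_before[docid], rank)
        for rank, docid in enumerate(after, start=1)
        if docid in pos_before
    }


monobert_top5 = ["4321745", "6301923", "6442308", "1051360", "4816868"]
monot5_top5 = ["4321745", "1051360", "6442308", "5499899", "1022485"]

shared = rank_changes(monobert_top5, monot5_top5)
for docid, (bert_rank, t5_rank) in shared.items():
    print(f"{docid}: monoBERT rank {bert_rank}, monoT5 rank {t5_rank}")
```

Three of the five passages appear in both lists, and both models place `docid` 4321745 first.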

Now, let's see the actual query with `qid` 188714:
```
grep 188714 data/msmarco_ans_small/queries.dev.small.tsv
188714 foods and supplements to lower blood sugar
```

Let's also see the passage text of the first hit by grepping `docid` 4321745:
```
grep 4321745 collections/msmarco-passage/collection.tsv
4321745 Food And Supplements That Lower Blood Sugar Levels. Cinnamon: Researchers are finding that cinnamon reduces blood sugar levels naturally when taken daily. If you absolutely love cinnamon you can sprinkle the recommended six grams of cinnamon on your food throughout the day to achieve the desired effect.
```

In this case, the passage seems relevant to the query. Let's now compare it with the top hit from the original `data/msmarco_ans_small/run.dev.small.tsv` run file. Grep the top hit for `qid` 188714:
```
grep 188714 data/msmarco_ans_small/run.dev.small.tsv | head -1
188714 2133570 1
```

Now, let's grep the passage with `docid` 2133570:
```
grep 2133570 collections/msmarco-passage/collection.tsv
2133570 A healthy diet is essential to reversing prediabetes. There are no foods, herbs, drinks, or supplements that lower blood sugar. Only medication and exercise can. But there are things you can eat and drink that are low on the glycemic index (GI). This means these foods won’t raise your blood sugar and may help you avoid a blood sugar spike.
```

Notice that the top hit from the monoT5 re-ranked run file (`docid` 4321745) seems more relevant to the query with `qid` 188714 than the top hit from the original run file (`docid` 2133570).

Let's verify whether `docid` 4321745 is actually relevant to our query (`qid` 188714) by checking `data/msmarco_ans_small/qrels.dev.small.tsv`, which contains relevance judgments from human annotators:
```
grep 188714 data/msmarco_ans_small/qrels.dev.small.tsv
188714 0 8003843 1
188714 0 4321745 1
188714 0 8003849 1
```

Recall that in a `qrels` file, the first column is the `qid` of a query, the third column is the `docid` of a passage, and the last column is the relevance label for that pair (`1` means relevant and `0` means not relevant). In this case, `docid` 4321745 does appear in the third column for `qid` 188714, so it is a relevant passage that should be shown to the user. By contrast, `docid` 2133570 (the top hit from the original run file) does not appear at all among the relevant passages for `qid` 188714.

Thus, for this query, re-ranking with monoT5 moved a relevant passage into the top position.
</details>
</br>

We can use the official MS MARCO evaluation script to verify the MRR@10:

```