Update irst result table #1118

Merged · 64 commits · Apr 18, 2022

Commits
05504fd
change query convert in ltr
stephaniewhoo Mar 24, 2022
ff3d502
add query changes to irst
stephaniewhoo Mar 24, 2022
4576594
fix irst ranking bugs
stephaniewhoo Mar 24, 2022
24a553c
bug fix
stephaniewhoo Mar 24, 2022
62e88f7
remove debug info
stephaniewhoo Mar 24, 2022
7d5f992
final fix in irst
stephaniewhoo Mar 24, 2022
dd0f725
fix unit test bug
stephaniewhoo Mar 25, 2022
d75cffb
fix bug and change parameter name
stephaniewhoo Mar 25, 2022
46940d8
fix bug
stephaniewhoo Mar 25, 2022
8cefe8f
fix bug in topic and data parameter
stephaniewhoo Mar 25, 2022
45b8545
update doc
stephaniewhoo Mar 25, 2022
6c24a28
fix bug
stephaniewhoo Mar 25, 2022
d224524
edits based on comments
stephaniewhoo Mar 26, 2022
eb54199
fix bug
stephaniewhoo Mar 26, 2022
514778b
change irst param
stephaniewhoo Mar 26, 2022
76dabc8
fix typo
stephaniewhoo Mar 26, 2022
a3948ff
fix literal bug
stephaniewhoo Mar 27, 2022
7447cb1
change dash
stephaniewhoo Mar 27, 2022
5097711
update irst instruction and remove unecessary code
Mar 29, 2022
06fc989
clean parameter for irst instruction
Mar 29, 2022
43980cb
fix exception bug and remove unnecessary lines
Mar 30, 2022
33907ca
remove qrel parameter in irst unittest
Mar 30, 2022
699c41b
Merge branch 'master' of https://github.com/castorini/pyserini into l…
stephaniewhoo Mar 31, 2022
f402d3b
Merge branch 'ltr-refac' of https://github.com/stephaniewhoo/pyserini…
stephaniewhoo Mar 31, 2022
71e81e4
add token on-the-fly
stephaniewhoo Apr 3, 2022
5cfc224
add tf
stephaniewhoo Apr 3, 2022
96fb470
move bert tokenizer to class built-in object
Apr 3, 2022
d79d6cc
fix a bug on calculating collection frequency
Apr 4, 2022
a78f691
update score for irst msmarco doc
Apr 5, 2022
8237c15
Merge branch 'ltr-refac' of https://github.com/stephaniewhoo/pyserini…
stephaniewhoo Apr 5, 2022
d4b47fc
add generatemaxP function in irst search
Apr 5, 2022
88699f7
Merge branch 'ltr-refac' of https://github.com/stephaniewhoo/pyserini…
stephaniewhoo Apr 5, 2022
cf098af
fix typo
stephaniewhoo Apr 5, 2022
8e86dc5
add more description for IBM Model 1 and reactor the code with pep8 f…
Apr 6, 2022
4692164
refactor code for new official index
Apr 6, 2022
dcf98be
fix bug
Apr 6, 2022
d68c225
change bm25search from query to text_bert_tok query
Apr 7, 2022
8ed8cf3
first draft
stephaniewhoo Apr 8, 2022
d9c8756
add stat
stephaniewhoo Apr 8, 2022
f60b76b
resolve conflict
stephaniewhoo Apr 8, 2022
4899467
Merge branch 'master' of https://github.com/castorini/pyserini into w…
stephaniewhoo Apr 8, 2022
2ae0723
tweak
stephaniewhoo Apr 8, 2022
0f5f641
minor edits
stephaniewhoo Apr 9, 2022
cf46e4a
modify rerank
stephaniewhoo Apr 9, 2022
c517923
fix bm25 search
stephaniewhoo Apr 9, 2022
c2eeb73
finish wp stat option
stephaniewhoo Apr 9, 2022
6221245
add doc full
stephaniewhoo Apr 9, 2022
936e680
fix bug
stephaniewhoo Apr 10, 2022
877eb3b
fix typo + add truncation
stephaniewhoo Apr 11, 2022
58534ed
little fix on arg
stephaniewhoo Apr 11, 2022
3a216b7
add passage seg results
stephaniewhoo Apr 11, 2022
fdfa5e1
minor tweaks according to comments
stephaniewhoo Apr 11, 2022
6367030
add results
stephaniewhoo Apr 11, 2022
801762c
add results
stephaniewhoo Apr 12, 2022
8afd958
add score for full doc
Apr 13, 2022
69cf7b3
update result
stephaniewhoo Apr 14, 2022
6a79229
update typo in irst documentation, add k1 b two parameter for bm25 se…
Apr 14, 2022
7936341
fix typo on argparsing
Apr 14, 2022
204c645
update wp stat file
stephaniewhoo Apr 15, 2022
76a2b90
Merge branch 'master' of https://github.com/castorini/pyserini into w…
stephaniewhoo Apr 16, 2022
10d8287
update results
stephaniewhoo Apr 18, 2022
2f5de9f
update results
stephaniewhoo Apr 18, 2022
b7d9704
update doc-full dev and doc-seg dev score
Apr 18, 2022
43e3aaa
change output into trec format
Apr 18, 2022
48 changes: 24 additions & 24 deletions docs/experiments-msmarco-irst.md
@@ -40,7 +40,7 @@ python -m pyserini.search.lucene.irst \
     --topics topics \
     --translation-model irst_test/ibm_model_1_bert_tok_20211117/ \
     --index msmarco-v1-passage \
-    --output irst_test/regression_test_sum.irst_topics.txt \
+    --output irst_test/regression_test_sum.irst_topics.trec \
     --alpha 0.1 \
     --wp-stat irst_test/bert_wp_term_freq.msmarco-passage.20220411.pickle
 ```
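One theme of this PR (see the final commit, 43e3aaa) is writing runs in TREC format, hence the `.txt` → `.trec` renames throughout the diff. A TREC run file has one whitespace-separated line per retrieved document: `qid Q0 docid rank score tag`. As a quick sanity check on an output file, a minimal sketch of a reader, assuming that standard six-column layout, might be:

```python
# Minimal sketch: load a TREC-format run file into {qid: [(rank, docid, score), ...]}.
# Assumes the standard six columns: qid Q0 docid rank score tag.
from collections import defaultdict

def read_trec_run(path):
    run = defaultdict(list)
    with open(path) as f:
        for line in f:
            qid, _q0, docid, rank, score, _tag = line.split()
            run[qid].append((int(rank), docid, float(score)))
    return run

# Example usage (path from the command above):
# run = read_trec_run('irst_test/regression_test_sum.irst_topics.trec')
```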
@@ -51,7 +51,7 @@ python -m pyserini.search.lucene.irst \
     --topics topics \
     --translation-model irst_test/ibm_model_1_bert_tok_20211117/ \
     --index msmarco-v1-passage \
-    --output irst_test/regression_test_max.irst_topics.txt \
+    --output irst_test/regression_test_max.irst_topics.trec \
     --alpha 0.3 \
     --max-sim \
     --wp-stat irst_test/bert_wp_term_freq.msmarco-passage.20220411.pickle
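The two commands differ only in `--max-sim`, which switches aggregation from Sum to Max scoring (the `IRST(Sum)` and `IRST(Max)` rows in the results tables below). As a rough illustration only, not the actual pyserini implementation, the two modes over a hypothetical IBM Model 1 translation table `t` might be contrasted as:

```python
# Rough, hypothetical sketch contrasting Sum vs Max aggregation of IBM Model 1
# translation probabilities; NOT the exact pyserini scoring function.
import math

def irst_score(query_toks, doc_toks, t, alpha=0.3, max_sim=False):
    # t[(q_tok, d_tok)]: probability of document token d_tok translating to q_tok.
    score = 0.0
    for q in query_toks:
        probs = [t.get((q, d), 0.0) for d in doc_toks]
        # Sum mode pools translation evidence over all document tokens;
        # Max mode keeps only the best-matching token.
        agg = max(probs, default=0.0) if max_sim else sum(probs) / max(len(doc_toks), 1)
        # alpha used here as a smoothing floor; its exact role is hypothetical.
        score += math.log(alpha + (1 - alpha) * agg + 1e-9)
    return score
```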
@@ -76,17 +76,17 @@ After the run finishes, we can also evaluate the results using the official MS M
 For TREC DL 2019, use this command to evaluate your run file:
 
 ```bash
-python -m pyserini.eval.trec_eval -c -m map -m ndcg_cut.10 -l 2 dl19-passage irst_test/regression_test_sum.dl19-passage.txt
+python -m pyserini.eval.trec_eval -c -m map -m ndcg_cut.10 -l 2 dl19-passage irst_test/regression_test_sum.dl19-passage.trec
 ```
 
 Similarly for TREC DL 2020,
 ```bash
-python -m pyserini.eval.trec_eval -c -m map -m ndcg_cut.10 -l 2 dl20-passage irst_test/regression_test_sum.dl20.txt
+python -m pyserini.eval.trec_eval -c -m map -m ndcg_cut.10 -l 2 dl20-passage irst_test/regression_test_sum.dl20.trec
 ```
 
 For MS MARCO Passage V1, no need to use -l 2 option:
 ```bash
-python -m pyserini.eval.trec_eval -c -M 10 -m ndcg_cut.10 -m map -m recip_rank msmarco-passage-dev-subset irst_test/regression_test_sum.msmarco-passage-dev-subset.txt
+python -m pyserini.eval.trec_eval -c -M 10 -m ndcg_cut.10 -m map -m recip_rank msmarco-passage-dev-subset irst_test/regression_test_sum.msmarco-passage-dev-subset.trec
 ```

## Document Reranking
@@ -117,7 +117,7 @@ python -m pyserini.search.lucene.irst \
     --translation-model irst_test/ibm_model_1_bert_tok_20211117/ \
     --topics topics \
     --index msmarco-v1-doc \
-    --output irst_test/regression_test_sum.irst_topics.txt \
+    --output irst_test/regression_test_sum.irst_topics.trec \
     --alpha 0.3 \
     --hits 1000 \
     --wp-stat irst_test/bert_wp_term_freq.msmarco-doc.20220411.pickle
@@ -129,7 +129,7 @@ python -m pyserini.search.lucene.irst \
     --translation-model irst_test/ibm_model_1_bert_tok_20211117/ \
     --topics topics \
     --index msmarco-v1-doc \
-    --output irst_test/regression_test_max.irst_topics.txt \
+    --output irst_test/regression_test_max.irst_topics.trec \
     --alpha 0.3 \
     --hits 1000 \
     --max-sim \
@@ -155,17 +155,17 @@ We can use the official TREC evaluation tool, trec_eval, to compute other metric
 For TREC DL 2019, use this command to evaluate your run file:
 
 ```bash
-python -m pyserini.eval.trec_eval -c -m map -m ndcg_cut.10 -M 100 dl19-doc irst_test/regression_test_sum.dl19-doc-full.txt
+python -m pyserini.eval.trec_eval -c -m map -m ndcg_cut.10 -M 100 dl19-doc irst_test/regression_test_sum.dl19-doc-full.trec
 ```
 
 Similarly for TREC DL 2020
 ```bash
-python -m pyserini.eval.trec_eval -c -m map -m ndcg_cut.10 -M 100 dl20-doc irst_test/regression_test_sum.dl20-doc-full.txt
+python -m pyserini.eval.trec_eval -c -m map -m ndcg_cut.10 -M 100 dl20-doc irst_test/regression_test_sum.dl20-doc-full.trec
 ```
 
 For MS MARCO Doc V1
 ```bash
-python -m pyserini.eval.trec_eval -c -M 100 -m ndcg_cut.10 -m map -m recip_rank msmarco-doc-dev irst_test/regression_test_sum.msmarco-doc-full.txt
+python -m pyserini.eval.trec_eval -c -M 100 -m ndcg_cut.10 -m map -m recip_rank msmarco-doc-dev irst_test/regression_test_sum.msmarco-doc-full.trec
 ```


@@ -200,7 +200,7 @@ python -m pyserini.search.lucene.irst \
     --translation-model irst_test/ibm_model_1_bert_tok_20211117/ \
     --topics topics \
     --index msmarco-v1-doc-segmented \
-    --output irst_test/regression_test_sum.irst_topics.txt \
+    --output irst_test/regression_test_sum.irst_topics.trec \
     --alpha 0.3 \
     --segments \
     --hits 10000 \
@@ -213,7 +213,7 @@ python -m pyserini.search.lucene.irst \
     --translation-model irst_test/ibm_model_1_bert_tok_20211117/ \
     --topics topics \
     --index msmarco-v1-doc-segmented \
-    --output irst_test/regression_test_max.irst_topics.txt \
+    --output irst_test/regression_test_max.irst_topics.trec \
     --alpha 0.3 \
     --hits 10000 \
     --segments \
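Per the `add generatemaxP function in irst search` commit above, segment-level runs (`--segments`, retrieved at the deeper `--hits 10000`) are presumably collapsed back to document-level results MaxP-style, i.e. a document is scored by its best segment. A minimal sketch of that aggregation, assuming MS MARCO-style segment ids of the form `docid#segno`, is:

```python
# Minimal sketch of MaxP aggregation: score each document by its best segment.
# Assumes segment ids of the form 'docid#segno' (e.g., 'D12345#3').
def max_p(segment_hits):
    # segment_hits: list of (segment_id, score) pairs for one query.
    doc_scores = {}
    for seg_id, score in segment_hits:
        docid, _, _segno = seg_id.partition('#')
        doc_scores[docid] = max(score, doc_scores.get(docid, float('-inf')))
    return sorted(doc_scores.items(), key=lambda kv: kv[1], reverse=True)
```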
@@ -240,17 +240,17 @@ We can use the official TREC evaluation tool, trec_eval, to compute other metric
 For TREC DL 2019, use this command to evaluate your run file:
 
 ```bash
-python -m pyserini.eval.trec_eval -c -m map -m ndcg_cut.10 -M 100 dl19-doc irst_test/regression_test_sum.dl19-doc-seg.txt
+python -m pyserini.eval.trec_eval -c -m map -m ndcg_cut.10 -M 100 dl19-doc irst_test/regression_test_sum.dl19-doc-seg.trec
 ```
 
 Similarly for TREC DL 2020, no need to use -l 2 option:
 ```bash
-python -m pyserini.eval.trec_eval -c -m map -m ndcg_cut.10 -M 100 dl20-doc irst_test/regression_test_max.dl20-doc-seg.txt
+python -m pyserini.eval.trec_eval -c -m map -m ndcg_cut.10 -M 100 dl20-doc irst_test/regression_test_sum.dl20-doc-seg.trec
 ```
 
 For MS MARCO Doc V1, no need to use -l 2 option:
 ```bash
-python -m pyserini.eval.trec_eval -c -M 100 -m ndcg_cut.10 -m map -m recip_rank msmarco-doc-dev irst_test/regression_test_sum.msmarco-doc-seg.txt
+python -m pyserini.eval.trec_eval -c -M 100 -m ndcg_cut.10 -m map -m recip_rank msmarco-doc-dev irst_test/regression_test_sum.msmarco-doc-seg.trec
 ```

## Results
@@ -259,9 +259,9 @@ python -m pyserini.eval.trec_eval -c -M 100 -m ndcg_cut.10 -m map -m recip_rank
 | Topics | Method | MRR@10 | nDCG@10 | Map |
 |:-------------------------|:------------------------|:------:|:--------:|:-----------:|
 | DL19 | IRST(Sum) | - | 0.526 | 0.328 |
-| DL19 | IRST(Max) | - | 0.537 | 0.328 |
+| DL19 | IRST(Max) | - | 0.537 | 0.329 |
 | DL20 | IRST(Sum) | -| 0.558 | 0.352 |
-| DL20 | IRST(Max) | -| 0.546 | 0.337 |
+| DL20 | IRST(Max) | -| 0.547 | 0.336 |
 | MS MARCO Dev | IRST(Sum) | 0.221| - | - |
 | MS MARCO Dev | IRST(Max) | 0.215| - | - |

@@ -270,20 +270,20 @@ python -m pyserini.eval.trec_eval -c -M 100 -m ndcg_cut.10 -m map -m recip_rank
 
 | Topics | Method | MRR@100 | nDCG@10 | Map |
 |:-------------------------|:------------------------|:------:|:--------:|:-----------:|
-| DL19 | IRST(Sum) | - | 0.551 | 0.253 |
-| DL19 | IRST(Max) | - | 0.491 | 0.221 |
+| DL19 | IRST(Sum) | - | 0.549 | 0.252 |
+| DL19 | IRST(Max) | - | 0.491 | 0.220 |
 | DL20 | IRST(Sum) | - | 0.556 | 0.383 |
 | DL20 | IRST(Max) | - | 0.502 | 0.337 |
-| MS MARCO Dev | IRST(Sum) |0.303 | - | - |
-| MS MARCO Dev | IRST(Max) |0.253 | - | - |
+| MS MARCO Dev | IRST(Sum) |0.302 | - | - |
+| MS MARCO Dev | IRST(Max) |0.252 | - | - |
 
 ### Document Segment Ranking Datasets
 
 | Topics | Method | MRR@100 | nDCG@10 | Map |
 |:-------------------------|:------------------------|:------:|:--------:|:-----------:|
 | DL19 | IRST(Sum) | - | 0.560 | 0.271 |
 | DL19 | IRST(Max) | - | 0.520 | 0.243 |
-| DL20 | IRST(Sum) | - | 0.536 | 0.376 |
-| DL20 | IRST(Max) | - | 0.510 | 0.350 |
+| DL20 | IRST(Sum) | - | 0.534 | 0.376 |
+| DL20 | IRST(Max) | - | 0.509 | 0.350 |
 | MS MARCO Dev | IRST(Sum) |0.296 | - | - |
-| MS MARCO Dev | IRST(Max) |0.260 | - | - |
+| MS MARCO Dev | IRST(Max) |0.259 | - | - |