Add regressions for quantized BM25 dl-19 and dl-20 passage ranking (#…

…1889) Add regressions for dl-19 and dl-20 passage ranking with quantized BM25 weights.
castorini · May 26, 2022 · fc542b5
1 parent dceae06
commit fc542b5
Showing 7 changed files with 407 additions and 1 deletion.
diff --git a/README.md b/README.md
@@ -57,7 +57,7 @@ See individual pages for details!
 |--|:---:|:----:|:----:|
 | **Unsupervised Lexical** |
 | BoW baselines | [+](docs/regressions-msmarco-passage.md) | [+](docs/regressions-dl19-passage.md) | [+](docs/regressions-dl20-passage.md) |
-| Quantized BM25 | [+](docs/regressions-msmarco-passage-bm25-b8.md)
+| Quantized BM25 | [+](docs/regressions-msmarco-passage-bm25-b8.md) | [+](docs/regressions-dl19-passage-bm25-b8.md) | [+](docs/regressions-dl20-passage-bm25-b8.md) |
 | WP baselines | [+](docs/regressions-msmarco-passage-wp.md) | [+](docs/regressions-dl19-passage-wp.md) | [+](docs/regressions-dl20-passage-wp.md) |
 | doc2query | [+](docs/regressions-msmarco-passage-doc2query.md)
 | doc2query-T5 | [+](docs/regressions-msmarco-passage-docTTTTTquery.md) | [+](docs/regressions-dl19-passage-docTTTTTquery.md) | [+](docs/regressions-dl20-passage-docTTTTTquery.md) |

diff --git a/docs/regressions-dl19-passage-bm25-b8.md b/docs/regressions-dl19-passage-bm25-b8.md
@@ -0,0 +1,84 @@
+# Anserini Regressions: TREC 2019 Deep Learning Track (Passage)
+
+**Models**: BM25 with quantized weights (8 bits)
+
+This page describes baseline experiments, integrated into Anserini's regression testing framework, on the [TREC 2019 Deep Learning Track passage ranking task](https://trec.nist.gov/data/deep2019.html).
+
+Note that the NIST relevance judgments provide far more relevant passages per topic than the "sparse" judgments provided by Microsoft; the NIST judgments are sometimes called "dense" judgments to emphasize this contrast.
+For additional instructions on working with the MS MARCO passage collection, refer to [this page](experiments-msmarco-passage.md).
+
+The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl19-passage-bm25-b8.yaml).
+Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl19-passage-bm25-b8.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression dl19-passage-bm25-b8
+```
+
+## Indexing
+
+Typical indexing command:
+
+```
+target/appassembler/bin/IndexCollection \
+  -collection JsonVectorCollection \
+  -input /path/to/msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage-bm25-b8/ \
+  -generator DefaultLuceneDocumentGenerator \
+  -threads 9 -impact -pretokenized \
+  >& logs/log.msmarco-passage &
+```
+
+The directory `/path/to/msmarco-passage/` should contain `jsonl` files with the quantized BM25 vectors for every document.
+
+For additional details, see the explanation of [common indexing options](common-indexing-options.md).
+
+## Retrieval
+
+Topics and qrels are stored in [`src/main/resources/topics-and-qrels/`](../src/main/resources/topics-and-qrels/).
+The regression experiments here evaluate on the 43 topics for which NIST has provided judgments as part of the TREC 2019 Deep Learning Track.
+The original data can be found [here](https://trec.nist.gov/data/deep2019.html).
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+target/appassembler/bin/SearchCollection \
+  -index indexes/lucene-index.msmarco-passage-bm25-b8/ \
+  -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt \
+  -topicreader TsvInt \
+  -output runs/run.msmarco-passage.bm25-b8.topics.dl19-passage.txt \
+  -impact &
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+tools/eval/trec_eval.9.0.4/trec_eval -m map -c -l 2 src/main/resources/topics-and-qrels/qrels.dl19-passage.txt runs/run.msmarco-passage.bm25-b8.topics.dl19-passage.txt
+tools/eval/trec_eval.9.0.4/trec_eval -m ndcg_cut.10 -c src/main/resources/topics-and-qrels/qrels.dl19-passage.txt runs/run.msmarco-passage.bm25-b8.topics.dl19-passage.txt
+tools/eval/trec_eval.9.0.4/trec_eval -m recall.100 -c -l 2 src/main/resources/topics-and-qrels/qrels.dl19-passage.txt runs/run.msmarco-passage.bm25-b8.topics.dl19-passage.txt
+tools/eval/trec_eval.9.0.4/trec_eval -m recall.1000 -c -l 2 src/main/resources/topics-and-qrels/qrels.dl19-passage.txt runs/run.msmarco-passage.bm25-b8.topics.dl19-passage.txt
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+| AP@1000                                                                                                      | BM25 (default parameters, quantized 8 bits)|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| [DL19 (Passage)](https://trec.nist.gov/data/deep2019.html)                                                   | 0.3046    |
+
+
+| nDCG@10                                                                                                      | BM25 (default parameters, quantized 8 bits)|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| [DL19 (Passage)](https://trec.nist.gov/data/deep2019.html)                                                   | 0.4993    |
+
+
+| R@100                                                                                                        | BM25 (default parameters, quantized 8 bits)|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| [DL19 (Passage)](https://trec.nist.gov/data/deep2019.html)                                                   | 0.4949    |
+
+
+| R@1000                                                                                                       | BM25 (default parameters, quantized 8 bits)|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| [DL19 (Passage)](https://trec.nist.gov/data/deep2019.html)                                                   | 0.7639    |
diff --git a/docs/regressions-dl20-passage-bm25-b8.md b/docs/regressions-dl20-passage-bm25-b8.md
@@ -0,0 +1,84 @@
+# Anserini Regressions: TREC 2020 Deep Learning Track (Passage)
+
+**Models**: BM25 with quantized weights (8 bits)
+
+This page describes baseline experiments, integrated into Anserini's regression testing framework, on the [TREC 2020 Deep Learning Track passage ranking task](https://trec.nist.gov/data/deep2020.html).
+
+Note that the NIST relevance judgments provide far more relevant passages per topic than the "sparse" judgments provided by Microsoft; the NIST judgments are sometimes called "dense" judgments to emphasize this contrast.
+For additional instructions on working with the MS MARCO passage collection, refer to [this page](experiments-msmarco-passage.md).
+
+The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl20-passage-bm25-b8.yaml).
+Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl20-passage-bm25-b8.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression dl20-passage-bm25-b8
+```
+
+## Indexing
+
+Typical indexing command:
+
+```
+target/appassembler/bin/IndexCollection \
+  -collection JsonVectorCollection \
+  -input /path/to/msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage-bm25-b8/ \
+  -generator DefaultLuceneDocumentGenerator \
+  -threads 9 -impact -pretokenized \
+  >& logs/log.msmarco-passage &
+```
+
+The directory `/path/to/msmarco-passage/` should contain `jsonl` files with the quantized BM25 vectors for every document.
+
+For additional details, see the explanation of [common indexing options](common-indexing-options.md).
+
+## Retrieval
+
+Topics and qrels are stored in [`src/main/resources/topics-and-qrels/`](../src/main/resources/topics-and-qrels/).
+The regression experiments here evaluate on the 54 topics for which NIST has provided judgments as part of the TREC 2020 Deep Learning Track.
+The original data can be found [here](https://trec.nist.gov/data/deep2020.html).
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+target/appassembler/bin/SearchCollection \
+  -index indexes/lucene-index.msmarco-passage-bm25-b8/ \
+  -topics src/main/resources/topics-and-qrels/topics.dl20.txt \
+  -topicreader TsvInt \
+  -output runs/run.msmarco-passage.bm25-b8.topics.dl20.txt \
+  -impact &
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+tools/eval/trec_eval.9.0.4/trec_eval -m map -c -l 2 src/main/resources/topics-and-qrels/qrels.dl20-passage.txt runs/run.msmarco-passage.bm25-b8.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -m ndcg_cut.10 -c src/main/resources/topics-and-qrels/qrels.dl20-passage.txt runs/run.msmarco-passage.bm25-b8.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -m recall.100 -c -l 2 src/main/resources/topics-and-qrels/qrels.dl20-passage.txt runs/run.msmarco-passage.bm25-b8.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -m recall.1000 -c -l 2 src/main/resources/topics-and-qrels/qrels.dl20-passage.txt runs/run.msmarco-passage.bm25-b8.topics.dl20.txt
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+| AP@1000                                                                                                      | BM25 (default parameters, quantized 8 bits)|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| [DL20 (Passage)](https://trec.nist.gov/data/deep2020.html)                                                   | 0.2911    |
+
+
+| nDCG@10                                                                                                      | BM25 (default parameters, quantized 8 bits)|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| [DL20 (Passage)](https://trec.nist.gov/data/deep2020.html)                                                   | 0.4852    |
+
+
+| R@100                                                                                                        | BM25 (default parameters, quantized 8 bits)|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| [DL20 (Passage)](https://trec.nist.gov/data/deep2020.html)                                                   | 0.5673    |
+
+
+| R@1000                                                                                                       | BM25 (default parameters, quantized 8 bits)|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| [DL20 (Passage)](https://trec.nist.gov/data/deep2020.html)                                                   | 0.8119    |
diff --git a/src/main/resources/docgen/templates/dl19-passage-bm25-b8.template b/src/main/resources/docgen/templates/dl19-passage-bm25-b8.template
@@ -0,0 +1,53 @@
+# Anserini Regressions: TREC 2019 Deep Learning Track (Passage)
+
+**Models**: BM25 with quantized weights (8 bits)
+
+This page describes baseline experiments, integrated into Anserini's regression testing framework, on the [TREC 2019 Deep Learning Track passage ranking task](https://trec.nist.gov/data/deep2019.html).
+
+Note that the NIST relevance judgments provide far more relevant passages per topic than the "sparse" judgments provided by Microsoft; the NIST judgments are sometimes called "dense" judgments to emphasize this contrast.
+For additional instructions on working with the MS MARCO passage collection, refer to [this page](experiments-msmarco-passage.md).
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+## Indexing
+
+Typical indexing command:
+
+```
+${index_cmds}
+```
+
+The directory `/path/to/msmarco-passage/` should contain `jsonl` files with the quantized BM25 vectors for every document.
+
+For additional details, see the explanation of [common indexing options](common-indexing-options.md).
+
+## Retrieval
+
+Topics and qrels are stored in [`src/main/resources/topics-and-qrels/`](../src/main/resources/topics-and-qrels/).
+The regression experiments here evaluate on the 43 topics for which NIST has provided judgments as part of the TREC 2019 Deep Learning Track.
+The original data can be found [here](https://trec.nist.gov/data/deep2019.html).
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
diff --git a/src/main/resources/docgen/templates/dl20-passage-bm25-b8.template b/src/main/resources/docgen/templates/dl20-passage-bm25-b8.template
@@ -0,0 +1,53 @@
+# Anserini Regressions: TREC 2020 Deep Learning Track (Passage)
+
+**Models**: BM25 with quantized weights (8 bits)
+
+This page describes baseline experiments, integrated into Anserini's regression testing framework, on the [TREC 2020 Deep Learning Track passage ranking task](https://trec.nist.gov/data/deep2020.html).
+
+Note that the NIST relevance judgments provide far more relevant passages per topic than the "sparse" judgments provided by Microsoft; the NIST judgments are sometimes called "dense" judgments to emphasize this contrast.
+For additional instructions on working with the MS MARCO passage collection, refer to [this page](experiments-msmarco-passage.md).
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+## Indexing
+
+Typical indexing command:
+
+```
+${index_cmds}
+```
+
+The directory `/path/to/msmarco-passage/` should contain `jsonl` files with the quantized BM25 vectors for every document.
+
+For additional details, see the explanation of [common indexing options](common-indexing-options.md).
+
+## Retrieval
+
+Topics and qrels are stored in [`src/main/resources/topics-and-qrels/`](../src/main/resources/topics-and-qrels/).
+The regression experiments here evaluate on the 54 topics for which NIST has provided judgments as part of the TREC 2020 Deep Learning Track.
+The original data can be found [here](https://trec.nist.gov/data/deep2020.html).
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
diff --git a/src/main/resources/regression/dl19-passage-bm25-b8.yaml b/src/main/resources/regression/dl19-passage-bm25-b8.yaml
@@ -0,0 +1,66 @@
+---
+corpus: msmarco-passage
+corpus_path: collections/msmarco/msmarco-passage-bm25-b8/
+
+index_path: indexes/lucene-index.msmarco-passage-bm25-b8/
+collection_class: JsonVectorCollection
+generator_class: DefaultLuceneDocumentGenerator
+index_threads: 9
+index_options: -impact -pretokenized
+index_stats:
+  documents: 8841823
+  documents (non-empty): 8841823
+  total terms: 11778323673
+
+metrics:
+  - metric: AP@1000
+    command: tools/eval/trec_eval.9.0.4/trec_eval
+    params: -m map -c -l 2
+    separator: "\t"
+    parse_index: 2
+    metric_precision: 4
+    can_combine: false
+  - metric: nDCG@10
+    command: tools/eval/trec_eval.9.0.4/trec_eval
+    params: -m ndcg_cut.10 -c
+    separator: "\t"
+    parse_index: 2
+    metric_precision: 4
+    can_combine: false
+  - metric: R@100
+    command: tools/eval/trec_eval.9.0.4/trec_eval
+    params: -m recall.100 -c -l 2
+    separator: "\t"
+    parse_index: 2
+    metric_precision: 4
+    can_combine: false
+  - metric: R@1000
+    command: tools/eval/trec_eval.9.0.4/trec_eval
+    params: -m recall.1000 -c -l 2
+    separator: "\t"
+    parse_index: 2
+    metric_precision: 4
+    can_combine: false
+
+topic_reader: TsvInt
+topic_root: src/main/resources/topics-and-qrels/
+qrels_root: src/main/resources/topics-and-qrels/
+topics:
+  - name: "[DL19 (Passage)](https://trec.nist.gov/data/deep2019.html)"
+    id: dl19
+    path: topics.dl19-passage.txt
+    qrel: qrels.dl19-passage.txt
+
+models:
+  - name: bm25-b8
+    display: BM25 (default parameters, quantized 8 bits)
+    params: -impact
+    results:
+      AP@1000:
+        - 0.3046
+      nDCG@10:
+        - 0.4993
+      R@100:
+        - 0.4949
+      R@1000:
+        - 0.7639
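
Reviewer note: the sketch below shows how the fields of the dl19 YAML above appear to map onto the `trec_eval` command lines in the generated page. It is an illustration only, not the code in `run_regression.py`. The helpers `build_eval_commands` and `parse_metric` are hypothetical, the run-file naming pattern is inferred from the generated commands, and the reading of `separator`, `parse_index`, and `metric_precision` (split on tab, take field 2, round to 4 places) is an assumption.

```python
# Values copied from dl19-passage-bm25-b8.yaml; the helper functions are hypothetical.
TREC_EVAL = "tools/eval/trec_eval.9.0.4/trec_eval"

config = {
    "corpus": "msmarco-passage",
    "qrels_root": "src/main/resources/topics-and-qrels/",
    "metrics": [
        {"metric": "AP@1000", "command": TREC_EVAL, "params": "-m map -c -l 2"},
        {"metric": "nDCG@10", "command": TREC_EVAL, "params": "-m ndcg_cut.10 -c"},
        {"metric": "R@100", "command": TREC_EVAL, "params": "-m recall.100 -c -l 2"},
        {"metric": "R@1000", "command": TREC_EVAL, "params": "-m recall.1000 -c -l 2"},
    ],
    "topics": [
        {"id": "dl19", "path": "topics.dl19-passage.txt", "qrel": "qrels.dl19-passage.txt"},
    ],
    "models": [{"name": "bm25-b8", "params": "-impact"}],
}


def build_eval_commands(cfg):
    """Assemble one evaluation command per (model, topic set, metric)."""
    commands = []
    for model in cfg["models"]:
        for topic in cfg["topics"]:
            # Assumed naming pattern: runs/run.<corpus>.<model name>.<topic file>
            run_file = f"runs/run.{cfg['corpus']}.{model['name']}.{topic['path']}"
            qrels = cfg["qrels_root"] + topic["qrel"]
            for metric in cfg["metrics"]:
                commands.append(
                    f"{metric['command']} {metric['params']} {qrels} {run_file}"
                )
    return commands


def parse_metric(output_line, separator="\t", parse_index=2, precision=4):
    """Pull the score out of one line of trec_eval output (assumed layout)."""
    fields = output_line.strip().split(separator)
    return round(float(fields[parse_index]), precision)


commands = build_eval_commands(config)
for command in commands:
    print(command)
```

Under these assumptions, the four printed commands match the `trec_eval` invocations listed in `docs/regressions-dl19-passage-bm25-b8.md`, and `parse_metric` applied to a tab-separated result line yields the value that would be compared against the `results` entries in the YAML.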