diff --git a/README.md b/README.md
index df7ef962b4..977842632a 100644
--- a/README.md
+++ b/README.md
@@ -105,7 +105,7 @@ See individual pages for details!
  + Climate-FEVER: ["flat" baseline](docs/regressions-beir-v1.0.0-climate-fever-flat.md), ["multifield" baseline](docs/regressions-beir-v1.0.0-climate-fever-multifield.md), [SPLADE-distill CoCodenser-medium](docs/regressions-beir-v1.0.0-climate-fever-splade-distil-cocodenser-medium.md)
  + DBPedia: ["flat" baseline](docs/regressions-beir-v1.0.0-dbpedia-entity-flat.md), [SPLADE-distill CoCodenser-medium](docs/regressions-beir-v1.0.0-dbpedia-entity-splade-distil-cocodenser-medium.md)
  + FEVER: ["flat" baseline](docs/regressions-beir-v1.0.0-fever-flat.md), [SPLADE-distill CoCodenser-medium](docs/regressions-beir-v1.0.0-fever-splade-distil-cocodenser-medium.md)
- + FiQA-2018: ["flat" baseline](docs/regressions-beir-v1.0.0-fiqa-flat.md), [SPLADE-distill CoCodenser-medium](docs/regressions-beir-v1.0.0-fiqa-splade-distil-cocodenser-medium.md)
+ + FiQA-2018: ["flat" baseline](docs/regressions-beir-v1.0.0-fiqa-flat.md), ["multifield" baseline](docs/regressions-beir-v1.0.0-fiqa-multifield.md), [SPLADE-distill CoCodenser-medium](docs/regressions-beir-v1.0.0-fiqa-splade-distil-cocodenser-medium.md)
  + HotpotQA: ["flat" baseline](docs/regressions-beir-v1.0.0-hotpotqa-flat.md), [SPLADE-distill CoCodenser-medium](docs/regressions-beir-v1.0.0-hotpotqa-splade-distil-cocodenser-medium.md)
  + NFCorpus: ["flat" baseline](docs/regressions-beir-v1.0.0-nfcorpus-flat.md), [SPLADE-distill CoCodenser-medium](docs/regressions-beir-v1.0.0-nfcorpus-splade-distil-cocodenser-medium.md)
  + NQ: ["flat" baseline](docs/regressions-beir-v1.0.0-nq-flat.md), [SPLADE-distill CoCodenser-medium](docs/regressions-beir-v1.0.0-nq-splade-distil-cocodenser-medium.md)
diff --git a/docs/regressions-beir-v1.0.0-fiqa-multifield.md b/docs/regressions-beir-v1.0.0-fiqa-multifield.md
new file mode 100644
index 0000000000..803ef72dc3
--- /dev/null
+++ b/docs/regressions-beir-v1.0.0-fiqa-multifield.md
@@ -0,0 +1,69 @@
+# Anserini Regressions: BEIR (v1.0.0) — fiqa
+
+This page documents BM25 regression experiments for [BEIR (v1.0.0) — fiqa](http://beir.ai/).
+These experiments index the "title" and "text" fields in corpus separately.
+At retrieval time, a query is issued across both fields (equally weighted).
+
+The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/beir-v1.0.0-fiqa-multifield.yaml).
+Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/beir-v1.0.0-fiqa-multifield.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa-multifield
+```
+
+## Indexing
+
+Typical indexing command:
+
+```
+target/appassembler/bin/IndexCollection \
+  -collection BeirMultifieldCollection \
+  -input /path/to/beir-v1.0.0-fiqa-multifield \
+  -index indexes/lucene-index.beir-v1.0.0-fiqa-multifield/ \
+  -generator DefaultLuceneDocumentGenerator \
+  -threads 1 -storePositions -storeDocvectors -storeRaw -fields title \
+  >& logs/log.beir-v1.0.0-fiqa-multifield &
+```
+
+For additional details, see explanation of [common indexing options](common-indexing-options.md).
+
+## Retrieval
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+target/appassembler/bin/SearchCollection \
+  -index indexes/lucene-index.beir-v1.0.0-fiqa-multifield/ \
+  -topics src/main/resources/topics-and-qrels/topics.beir-v1.0.0-fiqa.test.tsv.gz \
+  -topicreader TsvString \
+  -output runs/run.beir-v1.0.0-fiqa-multifield.bm25.topics.beir-v1.0.0-fiqa.test.txt \
+  -bm25 -removeQuery -hits 1000 -fields contents=1.0 title=1.0 &
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+tools/eval/trec_eval.9.0.4/trec_eval -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa-multifield.bm25.topics.beir-v1.0.0-fiqa.test.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.100 src/main/resources/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa-multifield.bm25.topics.beir-v1.0.0-fiqa.test.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.beir-v1.0.0-fiqa.test.txt runs/run.beir-v1.0.0-fiqa-multifield.bm25.topics.beir-v1.0.0-fiqa.test.txt
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+| nDCG@10 | BM25 |
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| BEIR (v1.0.0): fiqa | 0.2361 |
+
+
+| R@100 | BM25 |
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| BEIR (v1.0.0): fiqa | 0.5395 |
+
+
+| R@1000 | BM25 |
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| BEIR (v1.0.0): fiqa | 0.7393 |
diff --git a/src/main/resources/docgen/templates/beir-v1.0.0-fiqa-multifield.template b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa-multifield.template
new file mode 100644
index 0000000000..eee370059d
--- /dev/null
+++ b/src/main/resources/docgen/templates/beir-v1.0.0-fiqa-multifield.template
@@ -0,0 +1,44 @@
+# Anserini Regressions: BEIR (v1.0.0) — fiqa
+
+This page documents BM25 regression experiments for [BEIR (v1.0.0) — fiqa](http://beir.ai/).
+These experiments index the "title" and "text" fields in corpus separately.
+At retrieval time, a query is issued across both fields (equally weighted).
+
+The exact configurations for these regressions are stored in [this YAML file](${yaml}).
+Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ${test_name}
+```
+
+## Indexing
+
+Typical indexing command:
+
+```
+${index_cmds}
+```
+
+For additional details, see explanation of [common indexing options](common-indexing-options.md).
+
+## Retrieval
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+${ranking_cmds}
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+${eval_cmds}
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+${effectiveness}
diff --git a/src/main/resources/regression/beir-v1.0.0-fiqa-multifield.yaml b/src/main/resources/regression/beir-v1.0.0-fiqa-multifield.yaml
new file mode 100644
index 0000000000..37effc5abf
--- /dev/null
+++ b/src/main/resources/regression/beir-v1.0.0-fiqa-multifield.yaml
@@ -0,0 +1,57 @@
+---
+corpus: beir-v1.0.0-fiqa-multifield
+corpus_path: collections/beir-v1.0.0/corpus/fiqa/
+
+index_path: indexes/lucene-index.beir-v1.0.0-fiqa-multifield/
+collection_class: BeirMultifieldCollection
+generator_class: DefaultLuceneDocumentGenerator
+index_threads: 1
+index_options: -storePositions -storeDocvectors -storeRaw -fields title
+index_stats:
+  documents: 57600
+  documents (non-empty): 57600
+  total terms: 5288635
+
+metrics:
+  - metric: nDCG@10
+    command: tools/eval/trec_eval.9.0.4/trec_eval
+    params: -c -m ndcg_cut.10
+    separator: "\t"
+    parse_index: 2
+    metric_precision: 4
+    can_combine: false
+  - metric: R@100
+    command: tools/eval/trec_eval.9.0.4/trec_eval
+    params: -c -m recall.100
+    separator: "\t"
+    parse_index: 2
+    metric_precision: 4
+    can_combine: false
+  - metric: R@1000
+    command: tools/eval/trec_eval.9.0.4/trec_eval
+    params: -c -m recall.1000
+    separator: "\t"
+    parse_index: 2
+    metric_precision: 4
+    can_combine: false
+
+topic_reader: TsvString
+topic_root: src/main/resources/topics-and-qrels/
+qrels_root: src/main/resources/topics-and-qrels/
+topics:
+  - name: "BEIR (v1.0.0): fiqa"
+    id: test
+    path: topics.beir-v1.0.0-fiqa.test.tsv.gz
+    qrel: qrels.beir-v1.0.0-fiqa.test.txt
+
+models:
+  - name: bm25
+    display: BM25
+    params: -bm25 -removeQuery -hits 1000 -fields contents=1.0 title=1.0
+    results:
+      nDCG@10:
+        - 0.2361
+      R@100:
+        - 0.5395
+      R@1000:
+        - 0.7393
\ No newline at end of file