Update HC4 Regressions with Test Topics (#1924)
* Add HC4 Test Regressions

* Fix HC4 Persian and Chinese documentation

* Reference HC4 paper in documentation

* fix paper reference in documentation
ToluClassics authored Jul 13, 2022
1 parent 8010d5c commit 72f85aa
Showing 18 changed files with 8,331 additions and 10 deletions.
20 changes: 18 additions & 2 deletions docs/regressions-hc4-v1.0-fa.md
@@ -1,6 +1,6 @@
# Anserini Regressions: HC4 (v1.0) — Persian

This page documents BM25 regression experiments for [HC4 (v1.0) — Persian](https://arxiv.org/pdf/2201.09992.pdf).
This page documents BM25 regression experiments for [HC4 (v1.0) — Persian](https://github.com/hltcoe/HC4), ([paper](https://arxiv.org/pdf/2201.09992.pdf)).

The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/hc4-v1.0-fa.yaml).
Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/hc4-v1.0-fa.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
@@ -58,13 +58,27 @@ target/appassembler/bin/SearchCollection \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-fa.bm25.topics.hc4-v1.0-fa.dev.desc.txt \
-bm25 -hits 100 -language fa &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.hc4-v1.0-persian/ \
-topics src/main/resources/topics-and-qrels/topics.hc4-v1.0-fa.test.title.tsv.gz \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-fa.bm25.topics.hc4-v1.0-fa.test.title.txt \
-bm25 -hits 100 -language fa &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.hc4-v1.0-persian/ \
-topics src/main/resources/topics-and-qrels/topics.hc4-v1.0-fa.test.desc.tsv.gz \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-fa.bm25.topics.hc4-v1.0-fa.test.desc.txt \
-bm25 -hits 100 -language fa &
```

Evaluation can be performed using `trec_eval`:

```
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-fa.dev.txt runs/run.hc4-v1.0-fa.bm25.topics.hc4-v1.0-fa.dev.title.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-fa.dev.txt runs/run.hc4-v1.0-fa.bm25.topics.hc4-v1.0-fa.dev.desc.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-fa.test.txt runs/run.hc4-v1.0-fa.bm25.topics.hc4-v1.0-fa.test.title.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-fa.test.txt runs/run.hc4-v1.0-fa.bm25.topics.hc4-v1.0-fa.test.desc.txt
```

## Effectiveness
@@ -75,5 +89,7 @@ With the above commands, you should be able to reproduce the following results:
|:-------------------------------------------------------------------------------------------------------------|-----------|
| [HC4 (Persian): dev-topic title](https://github.com/hltcoe/HC4) | 0.2919 |
| [HC4 (Persian): dev-topic description](https://github.com/hltcoe/HC4) | 0.3188 |
| [HC4 (Persian): test-topic title](https://github.com/hltcoe/HC4) | 0.2837 |
| [HC4 (Persian): test-topic description](https://github.com/hltcoe/HC4) | 0.2882 |

The Above results are reproduction of the BM25 title queries run in [table 7 of this paper](https://arxiv.org/pdf/2201.08471.pdf)
The Above results are reproduction of the BM25 title queries run in [table 2 of this paper](https://arxiv.org/pdf/2201.08471.pdf)
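
The search and evaluation commands in this file follow one naming pattern across the four Persian topic sets. The following is a minimal Python sketch that rebuilds the evaluation commands from that pattern, assuming the pattern visible in this diff holds for every split and field. The helper names `run_path` and `trec_eval_command` are hypothetical and are not part of Anserini.

```python
from itertools import product

# Paths copied from the commands shown in the diff.
TOPICS_DIR = "src/main/resources/topics-and-qrels"
TREC_EVAL = "tools/eval/trec_eval.9.0.4/trec_eval"


def run_path(lang: str, split: str, field: str) -> str:
    """Run file name as it appears in the SearchCollection -output flag."""
    return f"runs/run.hc4-v1.0-{lang}.bm25.topics.hc4-v1.0-{lang}.{split}.{field}.txt"


def trec_eval_command(lang: str, split: str, field: str) -> str:
    """Evaluation command; the title and desc runs of a split share one qrels file."""
    qrels = f"{TOPICS_DIR}/qrels.hc4-v1.0-{lang}.{split}.txt"
    return f"{TREC_EVAL} -c -M 100 -m map {qrels} {run_path(lang, split, field)}"


if __name__ == "__main__":
    # Prints the four commands in the order used in the documentation.
    for split, field in product(["dev", "test"], ["title", "desc"]):
        print(trec_eval_command("fa", split, field))
```

Passing `"ru"` or `"zh"` instead of `"fa"` should give the corresponding commands for the Russian and Chinese pages, since those diffs use the same layout.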
18 changes: 17 additions & 1 deletion docs/regressions-hc4-v1.0-ru.md
@@ -1,6 +1,6 @@
# Anserini Regressions: HC4 (v1.0) — Russian

This page documents BM25 regression experiments for [HC4 (v1.0) — Russian](https://github.com/hltcoe/HC4).
This page documents BM25 regression experiments for [HC4 (v1.0) — Russian](https://github.com/hltcoe/HC4), ([paper](https://arxiv.org/pdf/2201.09992.pdf)).

The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/hc4-v1.0-ru.yaml).
Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/hc4-v1.0-ru.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
@@ -59,13 +59,27 @@ target/appassembler/bin/SearchCollection \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-ru.bm25.topics.hc4-v1.0-ru.dev.desc.txt \
-bm25 -hits 100 -language ru &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.hc4-v1.0-russian/ \
-topics src/main/resources/topics-and-qrels/topics.hc4-v1.0-ru.test.title.tsv.gz \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-ru.bm25.topics.hc4-v1.0-ru.test.title.txt \
-bm25 -hits 100 -language ru &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.hc4-v1.0-russian/ \
-topics src/main/resources/topics-and-qrels/topics.hc4-v1.0-ru.test.desc.tsv.gz \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-ru.bm25.topics.hc4-v1.0-ru.test.desc.txt \
-bm25 -hits 100 -language ru &
```

Evaluation can be performed using `trec_eval`:

```
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-ru.dev.txt runs/run.hc4-v1.0-ru.bm25.topics.hc4-v1.0-ru.dev.title.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-ru.dev.txt runs/run.hc4-v1.0-ru.bm25.topics.hc4-v1.0-ru.dev.desc.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-ru.test.txt runs/run.hc4-v1.0-ru.bm25.topics.hc4-v1.0-ru.test.title.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-ru.test.txt runs/run.hc4-v1.0-ru.bm25.topics.hc4-v1.0-ru.test.desc.txt
```

## Effectiveness
@@ -76,3 +90,5 @@ With the above commands, you should be able to reproduce the following results:
|:-------------------------------------------------------------------------------------------------------------|-----------|
| [HC4 (Russian): dev-topic title](https://github.com/hltcoe/HC4) | 0.2767 |
| [HC4 (Russian): dev-topic description](https://github.com/hltcoe/HC4) | 0.2321 |
| [HC4 (Russian): test-topic title](https://github.com/hltcoe/HC4) | 0.2105 |
| [HC4 (Russian): test-topic description](https://github.com/hltcoe/HC4) | 0.1779 |
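
A reproduced run can be checked against the table above by reading the `map` value from the `trec_eval` output. The sketch below assumes the usual three-column `trec_eval` layout (measure, topic id or `all`, value) and compares at four decimal places, which is an assumption about how closely the scores should match. `SAMPLE_OUTPUT` is an illustrative line built from the table value, not captured output, and the function names are hypothetical.

```python
# Expected MAP values for the Russian regressions, copied from the table above.
EXPECTED_MAP = {
    "dev.title": 0.2767,
    "dev.desc": 0.2321,
    "test.title": 0.2105,
    "test.desc": 0.1779,
}

# Illustrative line in the layout trec_eval prints; not real captured output.
SAMPLE_OUTPUT = "map                   \tall\t0.2105\n"


def parse_map(trec_eval_output: str) -> float:
    """Return the aggregate MAP value from trec_eval text output."""
    for line in trec_eval_output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == "map" and fields[1] == "all":
            return float(fields[2])
    raise ValueError("no 'map all' line found in trec_eval output")


def matches_expected(topic_set: str, trec_eval_output: str) -> bool:
    """Compare the parsed score with the documented one at four decimals."""
    actual = parse_map(trec_eval_output)
    return round(actual, 4) == round(EXPECTED_MAP[topic_set], 4)


if __name__ == "__main__":
    print(matches_expected("test.title", SAMPLE_OUTPUT))
```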
20 changes: 18 additions & 2 deletions docs/regressions-hc4-v1.0-zh.md
@@ -1,6 +1,6 @@
# Anserini Regressions: HC4 (v1.0) — Chinese

This page documents BM25 regression experiments for [HC4 (v1.0) — Chinese](https://github.com/hltcoe/HC4).
This page documents BM25 regression experiments for [HC4 (v1.0) — Chinese](https://github.com/hltcoe/HC4), ([paper](https://arxiv.org/pdf/2201.09992.pdf)).

The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/hc4-v1.0-zh.yaml).
Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/hc4-v1.0-zh.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
@@ -58,13 +58,27 @@ target/appassembler/bin/SearchCollection \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-zh.bm25.topics.hc4-v1.0-zh.dev.desc.txt \
-bm25 -hits 100 -language zh &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.hc4-v1.0-chinese/ \
-topics src/main/resources/topics-and-qrels/topics.hc4-v1.0-zh.test.title.tsv.gz \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-zh.bm25.topics.hc4-v1.0-zh.test.title.txt \
-bm25 -hits 100 -language zh &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.hc4-v1.0-chinese/ \
-topics src/main/resources/topics-and-qrels/topics.hc4-v1.0-zh.test.desc.tsv.gz \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-zh.bm25.topics.hc4-v1.0-zh.test.desc.txt \
-bm25 -hits 100 -language zh &
```

Evaluation can be performed using `trec_eval`:

```
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-zh.dev.txt runs/run.hc4-v1.0-zh.bm25.topics.hc4-v1.0-zh.dev.title.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-zh.dev.txt runs/run.hc4-v1.0-zh.bm25.topics.hc4-v1.0-zh.dev.desc.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-zh.test.txt runs/run.hc4-v1.0-zh.bm25.topics.hc4-v1.0-zh.test.title.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-zh.test.txt runs/run.hc4-v1.0-zh.bm25.topics.hc4-v1.0-zh.test.desc.txt
```

## Effectiveness
@@ -75,5 +89,7 @@ With the above commands, you should be able to reproduce the following results:
|:-------------------------------------------------------------------------------------------------------------|-----------|
| [HC4 (Chinese): dev-topic title](https://github.com/hltcoe/HC4) | 0.2914 |
| [HC4 (Chinese): dev-topic description](https://github.com/hltcoe/HC4) | 0.1983 |
| [HC4 (Chinese): test-topic title](https://github.com/hltcoe/HC4) | 0.1749 |
| [HC4 (Chinese): test-topic description](https://github.com/hltcoe/HC4) | 0.1404 |

The Above results are reproduction of the BM25 title queries run in [table 7 of this paper](https://arxiv.org/pdf/2201.08471.pdf)
The Above results are reproduction of the BM25 title queries run in [table 2 of this paper](https://arxiv.org/pdf/2201.08471.pdf)
4 changes: 2 additions & 2 deletions src/main/resources/docgen/templates/hc4-v1.0-fa.template
@@ -1,6 +1,6 @@
# Anserini Regressions: HC4 (v1.0) — Persian

This page documents BM25 regression experiments for [HC4 (v1.0) — Persian](https://arxiv.org/pdf/2201.09992.pdf).
This page documents BM25 regression experiments for [HC4 (v1.0) — Persian](https://github.com/hltcoe/HC4), ([paper](https://arxiv.org/pdf/2201.09992.pdf)).

The exact configurations for these regressions are stored in [this YAML file](${yaml}).
Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
@@ -55,4 +55,4 @@ With the above commands, you should be able to reproduce the following results:

${effectiveness}

The Above results are reproduction of the BM25 title queries run in [table 7 of this paper](https://arxiv.org/pdf/2201.08471.pdf)
The Above results are reproduction of the BM25 title queries run in [table 2 of this paper](https://arxiv.org/pdf/2201.08471.pdf)
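
The template uses `${yaml}`, `${template}` and `${effectiveness}` placeholders that are filled in when the documentation page is generated. Anserini's own generator is not shown in this diff, so the sketch below only illustrates the substitution idea using Python's `string.Template`, which happens to share the `${name}` syntax. `TEMPLATE_TEXT` is a shortened excerpt and `render` is a hypothetical helper.

```python
from string import Template

# Shortened excerpt of the template text; the real template is longer.
TEMPLATE_TEXT = (
    "The exact configurations for these regressions are stored in "
    "[this YAML file](${yaml}).\n"
    "\n"
    "${effectiveness}\n"
)


def render(values: dict) -> str:
    """Fill every placeholder; substitute() raises KeyError if one is missing."""
    return Template(TEMPLATE_TEXT).substitute(values)


if __name__ == "__main__":
    page = render(
        {
            "yaml": "../src/main/resources/regression/hc4-v1.0-fa.yaml",
            "effectiveness": "| [HC4 (Persian): test-topic title](https://github.com/hltcoe/HC4) | 0.2837 |",
        }
    )
    print(page)
```

This is also why the generated pages under `docs/` are not edited by hand: the link wording and the trailing table reference live in the template, and the generated page inherits them.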
2 changes: 1 addition & 1 deletion src/main/resources/docgen/templates/hc4-v1.0-ru.template
@@ -1,6 +1,6 @@
# Anserini Regressions: HC4 (v1.0) — Russian

This page documents BM25 regression experiments for [HC4 (v1.0) — Russian](https://github.com/hltcoe/HC4).
This page documents BM25 regression experiments for [HC4 (v1.0) — Russian](https://github.com/hltcoe/HC4), ([paper](https://arxiv.org/pdf/2201.09992.pdf)).

The exact configurations for these regressions are stored in [this YAML file](${yaml}).
Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
4 changes: 2 additions & 2 deletions src/main/resources/docgen/templates/hc4-v1.0-zh.template
@@ -1,6 +1,6 @@
# Anserini Regressions: HC4 (v1.0) — Chinese

This page documents BM25 regression experiments for [HC4 (v1.0) — Chinese](https://github.com/hltcoe/HC4).
This page documents BM25 regression experiments for [HC4 (v1.0) — Chinese](https://github.com/hltcoe/HC4), ([paper](https://arxiv.org/pdf/2201.09992.pdf)).

The exact configurations for these regressions are stored in [this YAML file](${yaml}).
Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
@@ -55,4 +55,4 @@ With the above commands, you should be able to reproduce the following results:

${effectiveness}

The Above results are reproduction of the BM25 title queries run in [table 7 of this paper](https://arxiv.org/pdf/2201.08471.pdf)
The Above results are reproduction of the BM25 title queries run in [table 2 of this paper](https://arxiv.org/pdf/2201.08471.pdf)
10 changes: 10 additions & 0 deletions src/main/resources/regression/hc4-v1.0-fa.yaml
@@ -32,6 +32,14 @@ topics:
id: dev_description
path: topics.hc4-v1.0-fa.dev.desc.tsv.gz
qrel: qrels.hc4-v1.0-fa.dev.txt
- name: "[HC4 (Persian): test-topic title](https://github.com/hltcoe/HC4)"
id: test_title
path: topics.hc4-v1.0-fa.test.title.tsv.gz
qrel: qrels.hc4-v1.0-fa.test.txt
- name: "[HC4 (Persian): test-topic description](https://github.com/hltcoe/HC4)"
id: test_description
path: topics.hc4-v1.0-fa.test.desc.tsv.gz
qrel: qrels.hc4-v1.0-fa.test.txt


models:
@@ -42,4 +50,6 @@ models:
MAP:
- 0.2919
- 0.3188
- 0.2837
- 0.2882
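
The two new `MAP` values are appended in the same order as the two new topic entries, and the generated effectiveness table lists them in that order too, which suggests scores are matched to topics by position. A minimal sketch of that pairing, assuming positional matching is indeed how the regression config is read; `pair_scores` is a hypothetical helper.

```python
# Topic names and MAP values for Persian, in the order they appear in the docs table.
TOPIC_NAMES = [
    "HC4 (Persian): dev-topic title",
    "HC4 (Persian): dev-topic description",
    "HC4 (Persian): test-topic title",
    "HC4 (Persian): test-topic description",
]
MAP_SCORES = [0.2919, 0.3188, 0.2837, 0.2882]


def pair_scores(names: list, scores: list) -> dict:
    """Match each topic set to its expected score by list position."""
    if len(names) != len(scores):
        # A topic added without a score (or the reverse) would misalign every row.
        raise ValueError(f"{len(names)} topic sets but {len(scores)} scores")
    return dict(zip(names, scores))


if __name__ == "__main__":
    for name, score in pair_scores(TOPIC_NAMES, MAP_SCORES).items():
        print(f"{name}: {score:.4f}")
```

Under this assumption, adding a topic entry without a matching score would shift or break the pairing, which is consistent with the commit updating both lists together.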

10 changes: 10 additions & 0 deletions src/main/resources/regression/hc4-v1.0-ru.yaml
@@ -32,6 +32,14 @@ topics:
id: dev_description
path: topics.hc4-v1.0-ru.dev.desc.tsv.gz
qrel: qrels.hc4-v1.0-ru.dev.txt
- name: "[HC4 (Russian): test-topic title](https://github.com/hltcoe/HC4)"
id: test_title
path: topics.hc4-v1.0-ru.test.title.tsv.gz
qrel: qrels.hc4-v1.0-ru.test.txt
- name: "[HC4 (Russian): test-topic description](https://github.com/hltcoe/HC4)"
id: test_description
path: topics.hc4-v1.0-ru.test.desc.tsv.gz
qrel: qrels.hc4-v1.0-ru.test.txt


models:
@@ -42,4 +50,6 @@ models:
MAP:
- 0.2767
- 0.2321
- 0.2105
- 0.1779

10 changes: 10 additions & 0 deletions src/main/resources/regression/hc4-v1.0-zh.yaml
@@ -32,6 +32,14 @@ topics:
id: dev_description
path: topics.hc4-v1.0-zh.dev.desc.tsv.gz
qrel: qrels.hc4-v1.0-zh.dev.txt
- name: "[HC4 (Chinese): test-topic title](https://github.com/hltcoe/HC4)"
id: test_title
path: topics.hc4-v1.0-zh.test.title.tsv.gz
qrel: qrels.hc4-v1.0-zh.test.txt
- name: "[HC4 (Chinese): test-topic description](https://github.com/hltcoe/HC4)"
id: test_description
path: topics.hc4-v1.0-zh.test.desc.tsv.gz
qrel: qrels.hc4-v1.0-zh.test.txt


models:
@@ -42,4 +50,6 @@ models:
MAP:
- 0.2914
- 0.1983
- 0.1749
- 0.1404
