Update HC4 Regressions with Test Topics #1924

Merged 4 commits on Jul 13, 2022

20 changes: 18 additions & 2 deletions docs/regressions-hc4-v1.0-fa.md
@@ -1,6 +1,6 @@
# Anserini Regressions: HC4 (v1.0) — Persian

This page documents BM25 regression experiments for [HC4 (v1.0) — Persian](https://arxiv.org/pdf/2201.09992.pdf).
This page documents BM25 regression experiments for [HC4 (v1.0) — Persian](https://github.com/hltcoe/HC4), ([paper](https://arxiv.org/pdf/2201.09992.pdf)).

The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/hc4-v1.0-fa.yaml).
Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/hc4-v1.0-fa.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
@@ -58,13 +58,27 @@ target/appassembler/bin/SearchCollection \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-fa.bm25.topics.hc4-v1.0-fa.dev.desc.txt \
-bm25 -hits 100 -language fa &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.hc4-v1.0-persian/ \
-topics src/main/resources/topics-and-qrels/topics.hc4-v1.0-fa.test.title.tsv.gz \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-fa.bm25.topics.hc4-v1.0-fa.test.title.txt \
-bm25 -hits 100 -language fa &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.hc4-v1.0-persian/ \
-topics src/main/resources/topics-and-qrels/topics.hc4-v1.0-fa.test.desc.tsv.gz \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-fa.bm25.topics.hc4-v1.0-fa.test.desc.txt \
-bm25 -hits 100 -language fa &
```

Evaluation can be performed using `trec_eval`:

```
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-fa.dev.txt runs/run.hc4-v1.0-fa.bm25.topics.hc4-v1.0-fa.dev.title.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-fa.dev.txt runs/run.hc4-v1.0-fa.bm25.topics.hc4-v1.0-fa.dev.desc.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-fa.test.txt runs/run.hc4-v1.0-fa.bm25.topics.hc4-v1.0-fa.test.title.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-fa.test.txt runs/run.hc4-v1.0-fa.bm25.topics.hc4-v1.0-fa.test.desc.txt
```
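
The four evaluation commands above differ only in the topic split (`dev` or `test`) and the topic field (`title` or `desc`). As a minimal sketch, assuming the same paths and flags as above, a loop can generate them. The function name `print_eval_commands` and the variable names are illustrative and not part of Anserini; the script only prints the commands (a dry run) rather than running them.

```bash
#!/usr/bin/env bash
# Dry-run sketch: print the trec_eval command for each HC4 Persian topic set.
# Paths and flags are copied from the commands above; the loop is illustrative.
set -eu

lang_code="fa"
trec_eval_bin="tools/eval/trec_eval.9.0.4/trec_eval"
qrels_dir="src/main/resources/topics-and-qrels"

print_eval_commands() {
  local split field
  for split in dev test; do
    for field in title desc; do
      # echo joins its arguments with single spaces, giving one command per line.
      echo "${trec_eval_bin} -c -M 100 -m map" \
        "${qrels_dir}/qrels.hc4-v1.0-${lang_code}.${split}.txt" \
        "runs/run.hc4-v1.0-${lang_code}.bm25.topics.hc4-v1.0-${lang_code}.${split}.${field}.txt"
    done
  done
}

print_eval_commands
```

Changing `lang_code` to `ru` or `zh` should yield the corresponding commands for the other two collections, since their file names follow the same pattern.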

## Effectiveness
@@ -75,5 +89,7 @@ With the above commands, you should be able to reproduce the following results:
|:-------------------------------------------------------------------------------------------------------------|-----------|
| [HC4 (Persian): dev-topic title](https://github.com/hltcoe/HC4) | 0.2919 |
| [HC4 (Persian): dev-topic description](https://github.com/hltcoe/HC4) | 0.3188 |
| [HC4 (Persian): test-topic title](https://github.com/hltcoe/HC4) | 0.2837 |
| [HC4 (Persian): test-topic description](https://github.com/hltcoe/HC4) | 0.2882 |

The above results reproduce the BM25 title-query runs in [table 7 of this paper](https://arxiv.org/pdf/2201.08471.pdf).
The above results reproduce the BM25 title-query runs in [table 2 of this paper](https://arxiv.org/pdf/2201.08471.pdf).
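
To check a run against the table, the MAP value has to be pulled out of the `trec_eval` output and compared with the expected number. The sketch below assumes `trec_eval`'s usual three-column output (measure, `all`, value). The helper names `extract_map` and `check_map` are hypothetical, and the sample line is a canned stand-in for a real run, not actual output.

```bash
#!/usr/bin/env bash
# Sketch: compare the MAP reported by trec_eval with an expected value.
# Assumes the usual three-column output: measure name, "all", value.
set -eu

# Read trec_eval output on stdin and print the summary MAP value.
extract_map() {
  awk '$1 == "map" && $2 == "all" { print $3 }'
}

# Succeed when both values agree after rounding to four decimals.
check_map() {
  local expected="$1" observed="$2"
  awk -v e="$expected" -v o="$observed" \
    'BEGIN { exit ((sprintf("%.4f", e) == sprintf("%.4f", o)) ? 0 : 1) }'
}

# Canned line standing in for real trec_eval output (illustrative only).
sample_output=$'map                   \tall\t0.2837'
observed="$(printf '%s\n' "$sample_output" | extract_map)"

if check_map "0.2837" "$observed"; then
  echo "test-topic title: MAP ${observed} matches the table"
else
  echo "test-topic title: MAP ${observed} differs from 0.2837"
fi
```

In practice the canned line would be replaced by piping a real `trec_eval` invocation into `extract_map`.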
18 changes: 17 additions & 1 deletion docs/regressions-hc4-v1.0-ru.md
@@ -1,6 +1,6 @@
# Anserini Regressions: HC4 (v1.0) — Russian

This page documents BM25 regression experiments for [HC4 (v1.0) — Russian](https://github.com/hltcoe/HC4).
This page documents BM25 regression experiments for [HC4 (v1.0) — Russian](https://github.com/hltcoe/HC4), ([paper](https://arxiv.org/pdf/2201.09992.pdf)).

The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/hc4-v1.0-ru.yaml).
Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/hc4-v1.0-ru.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
@@ -59,13 +59,27 @@ target/appassembler/bin/SearchCollection \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-ru.bm25.topics.hc4-v1.0-ru.dev.desc.txt \
-bm25 -hits 100 -language ru &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.hc4-v1.0-russian/ \
-topics src/main/resources/topics-and-qrels/topics.hc4-v1.0-ru.test.title.tsv.gz \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-ru.bm25.topics.hc4-v1.0-ru.test.title.txt \
-bm25 -hits 100 -language ru &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.hc4-v1.0-russian/ \
-topics src/main/resources/topics-and-qrels/topics.hc4-v1.0-ru.test.desc.tsv.gz \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-ru.bm25.topics.hc4-v1.0-ru.test.desc.txt \
-bm25 -hits 100 -language ru &
```

Evaluation can be performed using `trec_eval`:

```
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-ru.dev.txt runs/run.hc4-v1.0-ru.bm25.topics.hc4-v1.0-ru.dev.title.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-ru.dev.txt runs/run.hc4-v1.0-ru.bm25.topics.hc4-v1.0-ru.dev.desc.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-ru.test.txt runs/run.hc4-v1.0-ru.bm25.topics.hc4-v1.0-ru.test.title.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-ru.test.txt runs/run.hc4-v1.0-ru.bm25.topics.hc4-v1.0-ru.test.desc.txt
```

## Effectiveness
@@ -76,3 +90,5 @@ With the above commands, you should be able to reproduce the following results:
|:-------------------------------------------------------------------------------------------------------------|-----------|
| [HC4 (Russian): dev-topic title](https://github.com/hltcoe/HC4) | 0.2767 |
| [HC4 (Russian): dev-topic description](https://github.com/hltcoe/HC4) | 0.2321 |
| [HC4 (Russian): test-topic title](https://github.com/hltcoe/HC4) | 0.2105 |
| [HC4 (Russian): test-topic description](https://github.com/hltcoe/HC4) | 0.1779 |
20 changes: 18 additions & 2 deletions docs/regressions-hc4-v1.0-zh.md
@@ -1,6 +1,6 @@
# Anserini Regressions: HC4 (v1.0) — Chinese

This page documents BM25 regression experiments for [HC4 (v1.0) — Chinese](https://github.com/hltcoe/HC4).
This page documents BM25 regression experiments for [HC4 (v1.0) — Chinese](https://github.com/hltcoe/HC4), ([paper](https://arxiv.org/pdf/2201.09992.pdf)).

The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/hc4-v1.0-zh.yaml).
Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/hc4-v1.0-zh.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
@@ -58,13 +58,27 @@ target/appassembler/bin/SearchCollection \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-zh.bm25.topics.hc4-v1.0-zh.dev.desc.txt \
-bm25 -hits 100 -language zh &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.hc4-v1.0-chinese/ \
-topics src/main/resources/topics-and-qrels/topics.hc4-v1.0-zh.test.title.tsv.gz \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-zh.bm25.topics.hc4-v1.0-zh.test.title.txt \
-bm25 -hits 100 -language zh &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.hc4-v1.0-chinese/ \
-topics src/main/resources/topics-and-qrels/topics.hc4-v1.0-zh.test.desc.tsv.gz \
-topicreader TsvInt \
-output runs/run.hc4-v1.0-zh.bm25.topics.hc4-v1.0-zh.test.desc.txt \
-bm25 -hits 100 -language zh &
```

Evaluation can be performed using `trec_eval`:

```
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-zh.dev.txt runs/run.hc4-v1.0-zh.bm25.topics.hc4-v1.0-zh.dev.title.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-zh.dev.txt runs/run.hc4-v1.0-zh.bm25.topics.hc4-v1.0-zh.dev.desc.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-zh.test.txt runs/run.hc4-v1.0-zh.bm25.topics.hc4-v1.0-zh.test.title.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.hc4-v1.0-zh.test.txt runs/run.hc4-v1.0-zh.bm25.topics.hc4-v1.0-zh.test.desc.txt
```

## Effectiveness
@@ -75,5 +89,7 @@ With the above commands, you should be able to reproduce the following results:
|:-------------------------------------------------------------------------------------------------------------|-----------|
| [HC4 (Chinese): dev-topic title](https://github.com/hltcoe/HC4) | 0.2914 |
| [HC4 (Chinese): dev-topic description](https://github.com/hltcoe/HC4) | 0.1983 |
| [HC4 (Chinese): test-topic title](https://github.com/hltcoe/HC4) | 0.1749 |
| [HC4 (Chinese): test-topic description](https://github.com/hltcoe/HC4) | 0.1404 |

The above results reproduce the BM25 title-query runs in [table 7 of this paper](https://arxiv.org/pdf/2201.08471.pdf).
The above results reproduce the BM25 title-query runs in [table 2 of this paper](https://arxiv.org/pdf/2201.08471.pdf).
4 changes: 2 additions & 2 deletions src/main/resources/docgen/templates/hc4-v1.0-fa.template
@@ -1,6 +1,6 @@
# Anserini Regressions: HC4 (v1.0) — Persian

This page documents BM25 regression experiments for [HC4 (v1.0) — Persian](https://arxiv.org/pdf/2201.09992.pdf).
This page documents BM25 regression experiments for [HC4 (v1.0) — Persian](https://github.com/hltcoe/HC4), ([paper](https://arxiv.org/pdf/2201.09992.pdf)).

The exact configurations for these regressions are stored in [this YAML file](${yaml}).
Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
@@ -55,4 +55,4 @@ With the above commands, you should be able to reproduce the following results:

${effectiveness}

The above results reproduce the BM25 title-query runs in [table 7 of this paper](https://arxiv.org/pdf/2201.08471.pdf).
The above results reproduce the BM25 title-query runs in [table 2 of this paper](https://arxiv.org/pdf/2201.08471.pdf).
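
The `${yaml}` and `${template}` placeholders in this template are filled in when the documentation page is generated. As a rough illustration only (this is not Anserini's generator, and `render_template` is a hypothetical helper), plain `sed` substitution of the two path placeholders could look like the following. The paths are the ones that appear in the generated Persian page.

```bash
#!/usr/bin/env bash
# Sketch: fill the ${yaml} and ${template} placeholders with sed.
# Not Anserini's actual generator; values containing "|" or "&" would need escaping.
set -eu

# Read template text on stdin and write the rendered text to stdout.
render_template() {
  local yaml_path="$1" template_path="$2"
  # \$ keeps the placeholder literal for sed; the *_path variables are expanded.
  sed -e "s|\${yaml}|${yaml_path}|g" \
      -e "s|\${template}|${template_path}|g"
}

printf '%s\n' 'Configurations: [this YAML file](${yaml}); source: [this template](${template}).' \
  | render_template "../src/main/resources/regression/hc4-v1.0-fa.yaml" \
                    "../src/main/resources/docgen/templates/hc4-v1.0-fa.template"
```

The multi-line `${effectiveness}` table is left out of this sketch, since line-oriented `sed` replacement is a poor fit for multi-line values.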
2 changes: 1 addition & 1 deletion src/main/resources/docgen/templates/hc4-v1.0-ru.template
@@ -1,6 +1,6 @@
# Anserini Regressions: HC4 (v1.0) — Russian

This page documents BM25 regression experiments for [HC4 (v1.0) — Russian](https://github.com/hltcoe/HC4).
This page documents BM25 regression experiments for [HC4 (v1.0) — Russian](https://github.com/hltcoe/HC4), ([paper](https://arxiv.org/pdf/2201.09992.pdf)).

The exact configurations for these regressions are stored in [this YAML file](${yaml}).
Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
4 changes: 2 additions & 2 deletions src/main/resources/docgen/templates/hc4-v1.0-zh.template
@@ -1,6 +1,6 @@
# Anserini Regressions: HC4 (v1.0) — Chinese

This page documents BM25 regression experiments for [HC4 (v1.0) — Chinese](https://github.com/hltcoe/HC4).
This page documents BM25 regression experiments for [HC4 (v1.0) — Chinese](https://github.com/hltcoe/HC4), ([paper](https://arxiv.org/pdf/2201.09992.pdf)).

The exact configurations for these regressions are stored in [this YAML file](${yaml}).
Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
@@ -55,4 +55,4 @@ With the above commands, you should be able to reproduce the following results:

${effectiveness}

The above results reproduce the BM25 title-query runs in [table 7 of this paper](https://arxiv.org/pdf/2201.08471.pdf).
The above results reproduce the BM25 title-query runs in [table 2 of this paper](https://arxiv.org/pdf/2201.08471.pdf).
10 changes: 10 additions & 0 deletions src/main/resources/regression/hc4-v1.0-fa.yaml
@@ -32,6 +32,14 @@ topics:
    id: dev_description
    path: topics.hc4-v1.0-fa.dev.desc.tsv.gz
    qrel: qrels.hc4-v1.0-fa.dev.txt
  - name: "[HC4 (Persian): test-topic title](https://github.com/hltcoe/HC4)"
    id: test_title
    path: topics.hc4-v1.0-fa.test.title.tsv.gz
    qrel: qrels.hc4-v1.0-fa.test.txt
  - name: "[HC4 (Persian): test-topic description](https://github.com/hltcoe/HC4)"
    id: test_description
    path: topics.hc4-v1.0-fa.test.desc.tsv.gz
    qrel: qrels.hc4-v1.0-fa.test.txt


models:
@@ -42,4 +50,6 @@ models:
MAP:
- 0.2919
- 0.3188
- 0.2837
- 0.2882
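
The `MAP` list carries no topic names, so each value appears to be matched to a topic set by position: the order here agrees with the order of the rows in the effectiveness table. The sketch below makes that pairing explicit for the Persian values. It assumes positional matching, and the `dev_title` id is inferred from the naming pattern because it is not visible in this hunk. The function name `print_expected` is illustrative.

```bash
#!/usr/bin/env bash
# Sketch: pair each topic id with its expected MAP by list position.
# Assumption: the n-th MAP entry belongs to the n-th entry under `topics:`.
set -eu

# dev_title is inferred from the naming pattern; the other ids appear in the YAML.
topic_ids=(dev_title dev_description test_title test_description)
expected_map=(0.2919 0.3188 0.2837 0.2882)   # Persian values from the YAML above

print_expected() {
  local i
  for i in "${!topic_ids[@]}"; do
    printf '%-18s %s\n' "${topic_ids[$i]}" "${expected_map[$i]}"
  done
}

print_expected
```

Under this reading, a topic added to `topics:` without a matching `MAP` entry (or the reverse) would leave the two lists misaligned, which is why this change extends both together.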

10 changes: 10 additions & 0 deletions src/main/resources/regression/hc4-v1.0-ru.yaml
@@ -32,6 +32,14 @@ topics:
    id: dev_description
    path: topics.hc4-v1.0-ru.dev.desc.tsv.gz
    qrel: qrels.hc4-v1.0-ru.dev.txt
  - name: "[HC4 (Russian): test-topic title](https://github.com/hltcoe/HC4)"
    id: test_title
    path: topics.hc4-v1.0-ru.test.title.tsv.gz
    qrel: qrels.hc4-v1.0-ru.test.txt
  - name: "[HC4 (Russian): test-topic description](https://github.com/hltcoe/HC4)"
    id: test_description
    path: topics.hc4-v1.0-ru.test.desc.tsv.gz
    qrel: qrels.hc4-v1.0-ru.test.txt


models:
@@ -42,4 +50,6 @@ models:
MAP:
- 0.2767
- 0.2321
- 0.2105
- 0.1779

10 changes: 10 additions & 0 deletions src/main/resources/regression/hc4-v1.0-zh.yaml
@@ -32,6 +32,14 @@ topics:
    id: dev_description
    path: topics.hc4-v1.0-zh.dev.desc.tsv.gz
    qrel: qrels.hc4-v1.0-zh.dev.txt
  - name: "[HC4 (Chinese): test-topic title](https://github.com/hltcoe/HC4)"
    id: test_title
    path: topics.hc4-v1.0-zh.test.title.tsv.gz
    qrel: qrels.hc4-v1.0-zh.test.txt
  - name: "[HC4 (Chinese): test-topic description](https://github.com/hltcoe/HC4)"
    id: test_description
    path: topics.hc4-v1.0-zh.test.desc.tsv.gz
    qrel: qrels.hc4-v1.0-zh.test.txt


models:
@@ -42,4 +50,6 @@ models:
MAP:
- 0.2914
- 0.1983
- 0.1749
- 0.1404
