Regression Test for Car17 #379
map:
- 0.1354
recip_rank:
- 0.1861
Why not `BM25+AX` and `QL+AX`?
Will do it!
@Peilin-Yang @lintool The axiom results have been added. Could you review this PR now?
${ranking_cmds}
```

Evaluation can be performed using `trec_eval` and `gdeval.pl`:
It seems `gdeval.pl` is not actually used here.
threads: 40
index_options:
- -storePositions
- -storeDocvectors
We've decided to enable `-storeRawDocs` for regression tests. Could you please add it?
Sure.
- name: bm25+ax
params:
- -bm25
- -axiom
To make the axiomatic reranking deterministic, you will need to add:
- -rerankCutoff 20
- -axiom.deterministic
Please see the other yaml files for examples.
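A minimal sketch of what the `bm25+ax` entry might look like once both flags are added, assuming they are simply appended to the existing `params` list. The indentation is a guess, since the diff hunk above lost it; the other yaml files in the repository remain the authoritative examples.

```yaml
# Sketch only: the layout is assumed; the flags are the ones named in the review comment.
- name: bm25+ax
  params:
    - -bm25
    - -axiom
    - -rerankCutoff 20        # cutoff value suggested in the review
    - -axiom.deterministic    # makes the axiomatic reranking deterministic
```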
Thanks. I will add it.
- name: ql+ax
params:
- -bm25
- -axiom
- -rerankCutoff 20
- -axiom.deterministic
Topics and qrels are stored in `src/main/resources/topics-and-qrels/`, downloaded from NIST:

+ `topics.car17.test200.txt`: [Topics for the test200 subset (TREC 2017 Complex Answer Retrieval Track)](http://trec-car.cs.unh.edu/datareleases/v1.5/test200-v1.5.tar.xz)
+ `qrel: qrels.car17.test200.hierarchical.txt`: [adhoc qrels (TREC 2017 Complex Answer Retrieval Track)](http://trec-car.cs.unh.edu/datareleases/v1.5/test200-v1.5.tar.xz)
Why is this file called `qrels.car17.test200.hierarchical`, which is not compatible with the topics file? I think `topics.car17.txt` and `qrels.car17.txt` are better.
In car17, there are two test sets: `test200` and `benchmark_test`. I follow the benchmark paper and use `test200` as the evaluation set in Anserini.
For each test set, three kinds of `qrels` files are provided: `article`, `toplevel`, and `hierarchical`. Car17 uses `hierarchical` for the final evaluation.
That's where `test200` and `hierarchical` come from. To make clear which test set and `qrels` file I use, I added them to the file name. Is that good?
Yea, this makes more sense to me!
generator: LuceneDocumentGenerator
threads: 40
index_options:
- -storeRawDocs
Put `-storeRawDocs` right before `-optimize`.
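A sketch of the ordering being requested, assuming `-optimize` is the last entry in `index_options` (it is not visible in the diff hunk above) and that `-storePositions` and `-storeDocvectors` keep the order shown earlier in this review:

```yaml
# Sketch only: the position of -optimize is an assumption based on the review comment.
index_options:
  - -storePositions
  - -storeDocvectors
  - -storeRawDocs   # moved here, directly before -optimize
  - -optimize
```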
lgtm
Round figures to 4 digits
@Peilin-Yang @lintool Could you review this PR?