Add (new) pre-built indexes for MS MARCO (V1) #972

Closed
lintool opened this issue Jan 27, 2022 · 2 comments · Fixed by #977

@lintool
Member

lintool commented Jan 27, 2022

Following the resolution of castorini/anserini#1721 and castorini/anserini#1730, we need to build new pre-built indexes.

@lintool
Member Author

lintool commented Jan 27, 2022

@ronakice @MXueguang how would you like me to name them?

Currently, they are:

  • msmarco-passage
  • msmarco-passage-expanded
  • msmarco-doc
  • msmarco-doc-per-passage
  • msmarco-doc-expanded-per-doc
  • msmarco-doc-expanded-per-passage

Should we keep the existing ids and just replace the underlying tarballs? That would be a silent change.

Alternatively, we could use brand new ids, like msmarco-v1... and make the names parallel to V2.

Thoughts?

@ronakice
Member

I think perhaps the msmarco-v1... path is better, as it also helps disambiguate for the future. As to whether these other ones are required, I don't actually have a strong opinion but I'm leaning toward removing them as they might confuse people.
