How to generate sparse vectors (SPLADE model) #2333
I am looking for an efficient way, using the Anserini and Pyserini repositories, to generate collections of sparse vectors produced by the SPLADE model. The input is a large dataset stored as text files, with one record per line. I tried extracting the relevant source code from the Pyserini repository, but vector retrieval on the GPU is very slow and inefficient. How do you generate sparse vectors?
Replies: 1 comment 2 replies
Is your question about generating the sparse vectors, i.e., encoding the corpus, or searching the sparse vectors? For the latter, one of our reproduction docs would be a good starting point:
Encoding cost scales linearly with the number of documents, so the only solutions are to procure more GPUs or better GPUs...
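To make the linear-scaling point concrete, here is a minimal sketch of splitting a one-record-per-line corpus across several GPUs so that each encoding process handles a disjoint slice. It is not code from Pyserini or Anserini: the helper names (`shard_lines`, `batched`, `estimated_hours`) and the throughput figure you would plug in are hypothetical, and the SPLADE encoder call itself is deliberately left out.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def shard_lines(lines: Iterable[str], num_shards: int, shard_id: int) -> Iterator[str]:
    """Round-robin split: yield lines shard_id, shard_id + num_shards, ...

    Hypothetical helper; run one process per GPU, each with its own shard_id.
    """
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must be in [0, num_shards)")
    return islice(lines, shard_id, None, num_shards)


def batched(lines: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Group records into lists of at most batch_size for the encoder."""
    it = iter(lines)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def estimated_hours(num_docs: int, docs_per_second_per_gpu: float, num_gpus: int) -> float:
    """Wall-clock estimate, assuming throughput adds up linearly across GPUs."""
    return num_docs / (docs_per_second_per_gpu * num_gpus) / 3600.0
```

Each process would open the same text file, iterate over `batched(shard_lines(f, num_gpus, gpu_id), batch_size)`, and pass every batch to whatever encoder it uses. Measuring documents per second on a small sample and feeding that number to `estimated_hours` gives a rough idea of how many GPUs a given corpus needs.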