Upgrading to Pyserini 0.24 means .raw option not available. #1762

Closed
bevankoopman opened this issue Jan 8, 2024 · 1 comment

@bevankoopman

Upgrading to pyserini 0.24 causes the following error:

File "/Users/.../service.py", line 44, in search
    hybrid_hits, sparse_raws = self.hybrid(dense_hits, sparse_hits, 0.7, k)
File "/Users/../service.py", line 85, in hybrid
    sparse_raws[hit.docid] = hit.raw
AttributeError: 'io.anserini.search.ScoredDoc' object has no attribute 'raw'

This works fine with Pyserini 0.22.

It seems the API has changed so that the raw attribute no longer exists. What should we use instead? (I also note that a lot of tutorials and code examples still reference this attribute.)

@lintool
Member

lintool commented Jan 10, 2024

See the discussion in #1758 for alternatives.

I made this API change because, previously, the Result object would load the raw doc content eagerly. Now it loads the content lazily, so you fetch it from the underlying Lucene document when you need it.

Change to something like this?

hits[0].lucene_document.get('raw')
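
For code like the hybrid function in the traceback, which builds a dict of raw contents keyed by docid, a small helper along these lines might make the upgrade less painful. This is only a sketch: get_raw and collect_raws are hypothetical names (not part of Pyserini), the fallback to .raw assumes the older behaviour reported for 0.22, and the stand-in hits at the bottom exist only so the snippet runs without Pyserini or an index.

```python
from types import SimpleNamespace


def get_raw(hit):
    """Return the stored raw document text for a search hit, or None.

    Hypothetical helper, not part of Pyserini.
    """
    # Suggested approach for 0.24: read the stored 'raw' field from the
    # hit's Lucene document.
    doc = getattr(hit, "lucene_document", None)
    if doc is not None:
        return doc.get("raw")
    # Assumed fallback for older hit objects that expose .raw directly.
    return getattr(hit, "raw", None)


def collect_raws(hits):
    """Map docid -> raw text, mirroring the sparse_raws dict in the traceback."""
    return {hit.docid: get_raw(hit) for hit in hits}


# Stand-in hits (assumptions for illustration only). A plain dict plays the
# role of the Lucene document because it also offers .get(name).
new_style_hit = SimpleNamespace(docid="d1", lucene_document={"raw": '{"id": "d1"}'})
old_style_hit = SimpleNamespace(docid="d2", raw='{"id": "d2"}')
sparse_raws = collect_raws([new_style_hit, old_style_hit])
```

With real search results, the loop in hybrid would then become something like sparse_raws[hit.docid] = get_raw(hit), so the same code path should work whichever style of hit object the installed version returns.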

Closing, but please reopen if you have any other issues.
