This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Data from retriever is not working with reader #229

Open
UmerTariq1 opened this issue Aug 25, 2022 · 1 comment

@UmerTariq1

Background:
I am working on an idea that extends DPR. I used the DPR retriever to get some retrieval results, and I want to use them with the DPR reader.

The retriever output (which is the reader input) is a list, and each item in the list has this structure:

{
  "question": <str>,
  "answers": [ <str> ],
  "ctxs": [ { "id": <str>, "title": <str>, "text": <str>, "score": <str>, "has_answer": true/false } ]
}
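
Before trying a conversion, it may help to confirm that every item in the file really has the structure above. Here is a minimal sketch of such a check. The function name check_retriever_records and the constant REQUIRED_CTX_KEYS are hypothetical names made up for this illustration and are not part of DPR; the key names are taken from the structure shown above.

```python
import json

# Keys each entry of "ctxs" is expected to carry, per the structure above.
REQUIRED_CTX_KEYS = {"id", "title", "text", "score", "has_answer"}


def check_retriever_records(records):
    """Return a list of problems found; an empty list means the shape matches.

    Hypothetical helper for illustration only, not a DPR function.
    """
    problems = []
    for i, rec in enumerate(records):
        if not isinstance(rec.get("question"), str):
            problems.append(f"item {i}: 'question' is missing or not a string")
        if not isinstance(rec.get("answers"), list):
            problems.append(f"item {i}: 'answers' is missing or not a list")
        ctxs = rec.get("ctxs")
        if not isinstance(ctxs, list):
            problems.append(f"item {i}: 'ctxs' is missing or not a list")
            continue
        for j, ctx in enumerate(ctxs):
            missing = REQUIRED_CTX_KEYS - set(ctx)
            if missing:
                problems.append(f"item {i}, ctx {j}: missing keys {sorted(missing)}")
    return problems


def check_retriever_file(path):
    """Load a retriever output JSON file and check every item in it."""
    with open(path, encoding="utf-8") as f:
        return check_retriever_records(json.load(f))
```

Calling check_retriever_file on the retriever output and printing the returned list should show quickly whether any item deviates from the structure.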

But I am unable to use this file as input to the reader. I think the reason is that its structure differs from what the DPR reader expects.
I successfully ran the DPR reader on the nq-single dataset, but that data is in the ReaderSample format (question, answers, positive_passages, ...).

My question
Is there any way to convert my JSON file to the format the reader requires?
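
To make the question concrete, here is a minimal sketch of the kind of conversion I have in mind, assuming the reader-side fields are question, answers and positive_passages as listed above. The function name to_reader_style, the top_k parameter and the negative_passages field are assumptions made up for this illustration. It produces plain dictionaries, not DPR's actual ReaderSample objects, and it does not compute answer spans or token ids.

```python
def to_reader_style(record, top_k=None):
    """Group one retriever item's passages by whether they contain an answer.

    Hypothetical helper for illustration only, not a DPR function.
    """
    # Optionally keep only the first top_k retrieved passages.
    ctxs = record["ctxs"] if top_k is None else record["ctxs"][:top_k]
    passages = [
        {
            "id": c["id"],
            "title": c["title"],
            "text": c["text"],
            # The score is stored as a string in the file, so convert it.
            "score": float(c["score"]),
            "has_answer": bool(c["has_answer"]),
        }
        for c in ctxs
    ]
    return {
        "question": record["question"],
        "answers": list(record["answers"]),
        "positive_passages": [p for p in passages if p["has_answer"]],
        # "negative_passages" is an assumed field name, not confirmed by DPR.
        "negative_passages": [p for p in passages if not p["has_answer"]],
    }
```

Applying to_reader_style to every item of the loaded list gives one grouped dictionary per question. Whether the reader accepts such dictionaries, or needs further preprocessing, is exactly what I am unsure about.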

Relevant Issue
I found issue #73 to be somewhat related, but I think the code has changed since then, because I cannot find the file preprocess_reader_data.py.

What I am expecting
I am trying to use this JSON file (whose structure is described in the Background section above) as the input to the DPR reader (train_extractive_reader.py).

tkabir1 commented Feb 3, 2023

Have you been able to solve this issue? Any help will be appreciated.
