Can I use document retriever component only? #4

serenayj · 2022-06-23T19:42:59Z

Hi,

Congrats on finishing such nice work! I would like to test my encoder (document reader) and want to use only the IR document retriever component. Could you tell me where I can find this part of the code and how to run it? Thank you in advance!

jind11 · 2022-06-29T07:13:15Z

I am sorry for the late reply. Thanks for reaching out to me! This code base provides the Elasticsearch-based IR baseline, and you can follow the README file to set it up. Specifically, for text (sentence or paragraph) retrieval, you can refer to this file: https://github.com/jind11/MedQA/blob/master/IR/aristomini/solvers/textsearch.py
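For readers who only want the retrieval step, here is a minimal sketch of the kind of request body an Elasticsearch text search uses. It is not taken from textsearch.py: the function name `build_match_query`, the field name `text`, and the default `top_k` are assumptions for illustration, and the actual index layout in the repository may differ.

```python
from typing import Any, Dict


def build_match_query(query_text: str, top_k: int = 5, field: str = "text") -> Dict[str, Any]:
    """Build an Elasticsearch Query DSL body for a full-text `match` search.

    `field` is a hypothetical name for the indexed sentence/paragraph field;
    check the index mapping you created from the README before using it.
    """
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    return {
        "size": top_k,  # number of hits to return (the top-K passages)
        "query": {"match": {field: query_text}},
    }


if __name__ == "__main__":
    # The question (optionally joined with one answer option) is the query text.
    print(build_match_query("persistent cough and night sweats", top_k=3))
```

The returned dictionary can be sent to the `_search` endpoint of whichever index holds the textbook passages; the hits come back ranked by relevance score, so the first `top_k` hits serve as the retrieved context.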

serenayj · 2022-07-12T17:23:57Z

Hi,

Thanks for answering my question!

A follow-up question: in your paper, where you describe fine-tuning the pre-trained BERT models, you mention that:
"Specifically, we construct the input sequence by concatenating [CLS], tokens in c, [SEP], tokens in qai, [SEP], where [CLS] and [SEP] are the classifier token and sentence separator in a pre-trained language model, respectively"
My understanding is that the context c is a concatenation of all the textbooks. Wouldn't that exceed the BERT token limit if you concatenate the question, the answer, and the context c?

jind11 · 2022-07-12T23:11:05Z

The c here should be the top-K retrieved sentences/paragraphs from the textbooks, so we do not need to concatenate all the textbooks.
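A rough sketch of how such an input could be assembled is shown below. It is an illustration only, not the code used in the paper: whitespace splitting stands in for the real WordPiece tokenizer, the 512-token limit is the usual BERT maximum, and the choice to truncate the context (rather than the question and answer) is an assumption. The name `build_input` is hypothetical.

```python
from typing import List

CLS, SEP = "[CLS]", "[SEP]"


def build_input(passages: List[str], question_answer: str, max_len: int = 512) -> List[str]:
    """Return [CLS] c [SEP] qa [SEP], trimming the context c to fit max_len.

    `passages` are the top-K retrieved sentences/paragraphs, assumed to be
    ordered from most to least relevant, so trimming drops the weakest ones.
    """
    qa_tokens = question_answer.split()  # stand-in for a subword tokenizer
    budget = max_len - len(qa_tokens) - 3  # reserve room for [CLS] and two [SEP]
    if budget < 0:
        raise ValueError("question and answer alone exceed max_len")

    context_tokens: List[str] = []
    for passage in passages:
        context_tokens.extend(passage.split())
    context_tokens = context_tokens[:budget]  # keep only what fits

    return [CLS] + context_tokens + [SEP] + qa_tokens + [SEP]


if __name__ == "__main__":
    seq = build_input(["retrieved passage one", "retrieved passage two"], "question text option A")
    print(seq)
```

With a real tokenizer, the same budget logic applies to subword tokens instead of whitespace-separated words.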
