[BB2] FAQ #4172
You'll need to do two things:

1. Set `--knowledge-access-method search_only`
2. Set `--query-generator-model-file zoo:sea/bart_sq_gen/model`
Is this one necessary? Isn't that the default atm?
No, this is not the default. The default uses `--knowledge-access-method classify`, with the query generator from BB2 (trained to either search or retrieve from memory).
Sorry, I was referring to `--query-generator-model-file zoo:sea/bart_sq_gen/model`. That can be default, right?
At the moment it's not; the default is `zoo:blenderbot2/query_generator/model`, which determines whether to search or retrieve from memory.
Oh, yeah, I remember now.
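To make the difference discussed in this thread concrete, here is a minimal sketch that places the default values next to the search-only overrides and assembles a command line from them. The flag names and values are the ones quoted above; the `build_command` helper, the `parlai interactive` base command, and the `path/to/your/bb2/model` placeholder are assumptions added for illustration and are not part of this PR.

```python
import shlex

# Values quoted in the review thread above.
DEFAULT_OPTS = {
    "--knowledge-access-method": "classify",
    "--query-generator-model-file": "zoo:blenderbot2/query_generator/model",
}
SEARCH_ONLY_OPTS = {
    "--knowledge-access-method": "search_only",
    "--query-generator-model-file": "zoo:sea/bart_sq_gen/model",
}


def build_command(base_command, overrides):
    """Append flag/value pairs to a base command and return a shell-safe string.

    Hypothetical helper for illustration only; it does not call ParlAI.
    """
    args = list(base_command)
    for flag, value in overrides.items():
        args.extend([flag, value])
    return shlex.join(args)


if __name__ == "__main__":
    # ASSUMPTION: the base command and model path are placeholders.
    base = ["parlai", "interactive", "--model-file", "path/to/your/bb2/model"]
    print(build_command(base, SEARCH_ONLY_OPTS))
```

Both flags need to be passed explicitly because, as noted above, neither search-only value is the default at the moment.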
1. Set `--knowledge-access-method search_only`
2. Set `--query-generator-model-file zoo:sea/bart_sq_gen/model`

### How can I train with gold documents provided?
Should we add a note here on what we mean by gold docs? Maybe to keep it general enough we could say something like this:
Gold documents are any set of documents that you need your retriever to surface. We add them to make sure the generator model sees (is conditioned on) a certain set of documents. Example use cases are as follows:
1- Adding documents that you are certain contain useful knowledge to the mix of retrieved documents, so that the model can use them when generating a response.
2- Ensuring reproducibility between experiments for retrievers that have randomized responses. For example, if you wanted your model to see the same documents that were shown to the crowdsourcing agent (e.g. the `WizIntGoldDocRetrieverFiDAgent` model with the wizard of internet dataset).
Great idea, I'll add that note.
Thanks for adding this. Reading it was useful for me too.
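As a rough illustration of the first use case in the note above (adding gold documents to the mix of retrieved ones), the idea can be sketched as below. This is not how `WizIntGoldDocRetrieverFiDAgent` or any other ParlAI agent is implemented; `merge_gold_documents` and its parameters are hypothetical names, and plain strings stand in for documents.

```python
from typing import List


def merge_gold_documents(retrieved: List[str], gold: List[str], n_docs: int) -> List[str]:
    """Return at most n_docs documents, gold documents first, without duplicates.

    Hypothetical helper: it only illustrates the idea that the generator is
    guaranteed to be conditioned on the gold documents.
    """
    merged: List[str] = []
    seen = set()
    # Gold documents come first so that truncation never drops them
    # before it drops ordinary retrieved documents.
    for doc in list(gold) + list(retrieved):
        if doc not in seen:
            seen.add(doc)
            merged.append(doc)
    return merged[:n_docs]


if __name__ == "__main__":
    retrieved_docs = ["doc A", "doc B", "doc C"]
    gold_docs = ["gold 1", "doc B"]
    print(merge_gold_documents(retrieved_docs, gold_docs, n_docs=4))
    # → ['gold 1', 'doc B', 'doc A', 'doc C']
```

Because the gold list is fixed, repeated runs see the same leading documents even if the retriever itself returns different results, which is the reproducibility point made in the second use case.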
address mojtaba comments
Patch description
I've included a README for BB2 with FAQ that I've seen in our issues. Hopefully we can direct people towards this if it has the answer...