Conversation
KnowledgeAccessMethod(opt['knowledge_access_method'])
    is KnowledgeAccessMethod.ALL
):
    self.n_docs *= 2
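A minimal sketch of the logic in this hunk, for context. Only `KnowledgeAccessMethod.ALL`, `opt['knowledge_access_method']`, and `n_docs` come from the diff; the other enum members and the helper function are assumptions for illustration:

```python
from enum import Enum

class KnowledgeAccessMethod(Enum):
    # ALL appears in the diff; the other members are hypothetical.
    ALL = 'all'
    SEARCH_ONLY = 'search_only'
    MEMORY_ONLY = 'memory_only'

def effective_n_docs(opt):
    """Return the document count the model must account for.

    When both knowledge sources are active (ALL), search and memory
    each contribute n_docs documents, so the total doubles.
    """
    n_docs = opt['n_docs']
    if (
        KnowledgeAccessMethod(opt['knowledge_access_method'])
        is KnowledgeAccessMethod.ALL
    ):
        n_docs *= 2
    return n_docs

print(effective_n_docs({'n_docs': 5, 'knowledge_access_method': 'all'}))
print(effective_n_docs({'n_docs': 5, 'knowledge_access_method': 'search_only'}))
```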
Question: is this because we need n_docs for search and n_docs for memory? Could you add a small comment on it?
yes, good call, will add comment
Another question: if this was added in this PR, how was it handled so far?
We did not require a BlenderBot2Fid-specific model interface prior to this PR; this n_docs is used for incremental decoding, however.
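One reason n_docs matters for incremental decoding: FiD flattens the per-document encoder states into `batch * n_docs` rows, so when beam search reorders batch items, each batch-level index must expand to the n_docs rows belonging to it. The sketch below is illustrative only; the function name and pure-Python indices are assumptions, not the ParlAI implementation:

```python
def expand_indices_for_docs(batch_indices, n_docs):
    """Map batch-level beam-reordering indices to encoder-state rows.

    Hypothetical helper: if encoder states for (batch, n_docs) were
    flattened row-major into batch * n_docs rows, reordering batch item
    b selects the contiguous rows [b * n_docs, (b + 1) * n_docs).
    """
    return [b * n_docs + d for b in batch_indices for d in range(n_docs)]

# Swap batch items 1 and 0, each backed by 3 documents.
print(expand_indices_for_docs([1, 0], n_docs=3))
```

Because the doubling of n_docs under `KnowledgeAccessMethod.ALL` changes this row layout, the incremental-decoding cache reordering needs the doubled value to stay aligned with the encoder states.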
LGTM
Patch description
I've implemented incremental decoding for FiD and RAG (and, by extension, BlenderBot2).
CC #4073
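For readers unfamiliar with the technique: incremental decoding avoids re-running the decoder over the full generated prefix at every step by carrying cached state forward and processing only the newest token. The toy below illustrates the idea with a running-average stand-in for attention over the prefix; it is a sketch of the general concept, not the FiD/RAG implementation in this patch:

```python
def full_recompute_decode(tokens):
    """Baseline: recompute a prefix summary from scratch at every step.

    Work grows quadratically with sequence length, like re-encoding the
    whole prefix each generation step.
    """
    outs = []
    for i in range(len(tokens)):
        prefix = tokens[:i + 1]
        outs.append(sum(prefix) / len(prefix))  # stand-in for attention over prefix
    return outs

def incremental_decode(tokens):
    """Incremental: carry a running cache (a sum and a count here,
    standing in for cached key/value states) so each step is O(1)."""
    total, count, outs = 0.0, 0, []
    for t in tokens:
        total += t
        count += 1
        outs.append(total / count)
    return outs

toks = [1, 2, 3, 4]
assert full_recompute_decode(toks) == incremental_decode(toks)
```

Both functions produce identical outputs; only the per-step cost differs, which is where the speedups reported below come from.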
Testing steps
CI
Existing CI, both for RAG and BB2. Local runs:
Empirical Evaluations
Empirical speed tests indicate the following speedups (batch size 16, generation parameters from the hallucination project zoo models). Two observations: