As stated here, the query generator model for BB2 is just a regular BART model trained on the query generation task for Wizard of the Internet (and multitasked with the MSC tasks to predict when to access memory).
Looking in here, I couldn't find the MSC tasks that predict when to access memory.
Furthermore, I have some questions about memory:
1. As far as I understood, it is the query generator that decides when to access the memory and/or search the internet based on the dialogue context after each dialogue turn. Is this mechanism just trained as in the previous statement, or are there some hyperparameters (in the case of --knowledge_access_method "classify")?
2. In the case of retrieval augmentation (without summarization): is the model storing the context after each dialogue turn? In which form, raw or encoded by DPR?
3. How was the memory decoder trained to make it learn what knowledge to store?
Thanks in advance for the help.
> multitasked with the MSC tasks to predict when to access memory
I actually used a custom --mutator to achieve this (see here for some instructions on how to use mutators). PR #3966 adds this mutator to ParlAI.
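For concreteness, such a mutator might look roughly like the sketch below. This is a hypothetical illustration, not the actual mutator from PR #3966: the registered name and the memory-access token string are made up, and the exact token BB2 generates may differ.

```python
from parlai.core.message import Message
from parlai.core.mutators import register_mutator, MessageMutator

# Placeholder token -- the exact string BB2 generates to signal a
# long-term-memory lookup may differ; this is only for illustration.
ACCESS_MEMORY_TOKEN = '__access-long-term-memory__'


@register_mutator('memory_access_label')  # hypothetical name
class MemoryAccessLabelMutator(MessageMutator):
    """
    Replace each example's label with a token indicating that the model
    should access long-term memory, so a query generator can be
    multitasked on "when to access memory" alongside query generation.
    """

    def message_mutation(self, message: Message) -> Message:
        new_message = message.copy()
        # force_set is required because Message disallows silently
        # overwriting an existing field.
        new_message.force_set('labels', [ACCESS_MEMORY_TOKEN])
        return new_message
```

It could then be applied to a teacher with, e.g., --mutators memory_access_label.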
As for your other questions:
1. The mechanism is trained as in the statement above: multitasked to either generate a search query or generate a token indicating access to long-term memory.
2. Without summarization, the model encodes "memories" (extracted from the context) and stores them in a pseudo-DPR index; a conceptual sketch follows below. (I'm not sure I fully understand this question; does this make sense?)
3. The memory decoder was trained on --task msc:PersonaSummary; see the project page for more details. An illustrative training invocation is also sketched below.
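On question 2, here is a conceptual sketch of the "encode and store in a pseudo-DPR index" idea, just to make the mechanics concrete. It is not BB2's actual implementation: the class, the toy hashed bag-of-words encoder (a stand-in for a real DPR context encoder), and plain inner-product retrieval are all illustrative. The point is that memories are stored encoded, with the raw string kept only so a retrieved result is readable.

```python
import torch


class PseudoDPRMemory:
    """Toy long-term memory: write encoded strings, read by inner product."""

    def __init__(self, encoder):
        self.encoder = encoder  # maps str -> 1-D torch.FloatTensor
        self.vectors = []       # encoded memories (the "index")
        self.texts = []         # raw strings, kept only for readable output

    def write(self, memory: str) -> None:
        # Memories are stored in encoded form, not raw.
        self.vectors.append(self.encoder(memory))
        self.texts.append(memory)

    def read(self, query: str, k: int = 2):
        q = self.encoder(query)
        index = torch.stack(self.vectors)  # [n_memories, dim]
        scores = index @ q                 # inner-product scores
        top = torch.topk(scores, k=min(k, len(self.texts)))
        return [
            (self.texts[i], s)
            for i, s in zip(top.indices.tolist(), top.values.tolist())
        ]


def toy_encoder(text: str, dim: int = 128) -> torch.Tensor:
    # Hashed bag-of-words: a deterministic stand-in for a DPR encoder.
    v = torch.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    return v / torch.clamp(v.norm(), min=1e-8)


mem = PseudoDPRMemory(toy_encoder)
mem.write("I have two dogs named Rex and Fido.")
mem.write("My favorite food is sushi.")
print(mem.read("what pets do I have?"))  # the dog memory should score highest
```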
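On question 3, a training run for such a memory decoder could be kicked off roughly as below. Only the task name comes from the answer above; the BART model choice and the remaining flags (model_file, batchsize, num_epochs) are illustrative placeholders, not BB2's actual setup or hyperparameters.

```python
# Hypothetical sketch: fine-tune a BART model on the persona
# summarization task mentioned above. All flags besides the task are
# illustrative placeholders.
from parlai.scripts.train_model import TrainModel

TrainModel.main(
    task='msc:PersonaSummary',
    model='bart',
    model_file='/tmp/memory_decoder_sketch',  # placeholder path
    batchsize=8,    # placeholder
    num_epochs=1,   # placeholder
)
```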
Regarding point 2: are the memories always extracted after each dialogue turn, or is there some kind of mechanism that decides whether the model should store new memories? (Still in the case without summarization.)
When not considering summarization, there are a couple of heuristic controls for when the model extracts memories from the dialogue context:
- You can set the --memory-extractor-phrase, which essentially tells the model to only extract memories from context lines containing said phrase (during training, for example, this might be --memory-extractor-phrase persona:); a sketch of this heuristic follows the list.
- If you are using a custom dataset, you can specify the --memory-key, which is the key in the dataset's example dict that contains the memories you want the model to write.
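To illustrate the first heuristic, here is a minimal sketch of phrase-based extraction, assuming a simple substring match per context line. The function and variable names are made up for illustration; this is not ParlAI's actual extraction code.

```python
from typing import List, Optional


def extract_memories(
    context: str, extractor_phrase: Optional[str] = 'persona:'
) -> List[str]:
    """Return only the context lines that qualify as memories to write."""
    lines = context.split('\n')
    if extractor_phrase is None:
        return lines  # no filter: every context line is a candidate
    return [line for line in lines if extractor_phrase in line]


context = (
    "your persona: I love hiking.\n"
    "hello! how are you today?\n"
    "your persona: I have a cat."
)
print(extract_memories(context))  # only the two persona lines are kept
```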