
Clarifications about BlenderBot 2.0 memory management #3963

Closed
gianlucabusatta opened this issue Aug 20, 2021 · 3 comments

gianlucabusatta commented Aug 20, 2021

As stated here, the query generator model for BB2 is just a regular BART model trained on the query generation task for Wizard of the Internet (and multitasked with the MSC tasks to predict when to access memory).

Looking here, I couldn't find the MSC tasks used to predict when to access memory.

Furthermore, I have some questions about memory:

  1. As far as I understand, it is the query generator that decides when to access memory and/or search the internet, based on the dialogue context after each turn. Is this mechanism trained just as in the statement above, or are there some hyperparameters involved? (in the case of --knowledge_access_method "classify")
  2. In the case of retrieval augmentation (without summarization): does the model store the context after each dialogue turn? In which form, raw or encoded by DPR?
  3. How was the memory decoder trained so that it learns what knowledge to store?

Thanks in advance for the help.

@klshuster klshuster self-assigned this Aug 20, 2021
klshuster (Contributor) commented

> multitasked with the MSC tasks to predict when to access memory

I actually used a custom mutator to achieve this (see here for some instructions on how to use mutators). PR #3966 adds this mutator to ParlAI.
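For reference, a ParlAI mutator is just a registered class that rewrites examples on the fly. Here's a minimal sketch of the idea; the class name and the special token below are my own illustration, not the actual mutator from PR #3966:

```python
from parlai.core.message import Message
from parlai.core.mutators import MessageMutator, register_mutator


@register_mutator("memory_access_label")
class MemoryAccessLabelMutator(MessageMutator):
    """
    Illustrative only: replaces each example's label with a special token so
    that, when multitasked with search-query generation, the model learns to
    emit the token instead of a search query on memory-access turns.
    """

    def message_mutation(self, message: Message) -> Message:
        if 'labels' in message:
            # Message forbids overwriting an existing key in place,
            # so pop it before re-setting.
            message.pop('labels')
            # Hypothetical token; the real mutator from PR #3966 uses
            # whatever token BB2's query generator expects.
            message['labels'] = ['__access-long-term-memory__']
        return message
```

You could then mix it into training with something like --task msc --mutators memory_access_label.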

As for your other questions:

  1. The mechanism is trained as in the statement; the model is multitasked to either generate a search query or generate a token indicating access to long-term memory.
  2. Without summarization, the model encodes "memories" (extracted from the context) and stores them in a pseudo-DPR index (I'm not sure I fully understand this question; does this make sense?)
  3. The memory decoder was trained on --task msc:PersonaSummary; see the project page for more details (and the rough training sketch below)
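For point 3, here is a minimal sketch of what that fine-tuning could look like through ParlAI's Python API; the batch size, epoch count, and model file path are placeholder assumptions of mine, not the exact BB2 recipe:

```python
from parlai.scripts.train_model import TrainModel

# Minimal sketch: fine-tune BART on the persona-summary task the memory
# decoder was trained on. All hyperparameters and paths below are
# illustrative placeholders, not the original BB2 configuration.
TrainModel.main(
    task='msc:PersonaSummary',
    model='bart',
    model_file='/tmp/memory_decoder/model',  # placeholder path
    batchsize=16,    # assumed value
    fp16=True,
    num_epochs=1,    # assumed value
)
```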

Hope that helps


gianlucabusatta commented Aug 22, 2021

Thank you!

Regarding point 2: are the memories always extracted after each dialogue turn, or is there some kind of mechanism to decide whether the model should store new memories? (still in the case without summarization)

klshuster (Contributor) commented

When not considering summarization, there are a couple of heuristic controls for how the model extracts memories from the dialogue context:

  1. You can set --memory-extractor-phrase, which essentially tells the model to only extract memories from context lines containing said phrase (during training, for example, this might be --memory-extractor-phrase persona:)
  2. If you are using a custom dataset, you can specify --memory-key, which is the key in the dataset's example dict that contains the memories you want the model to write (see the sketch below for both options in use)
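Putting both options together, a hedged example via ParlAI's Python API; the zoo path and option values are illustrative, and depending on the knowledge access method you may also need additional options (e.g. a search server):

```python
from parlai.scripts.interactive import Interactive

# Illustrative invocation: only context lines containing "persona:" are
# written to memory, read from the "personas" key of each example.
# The zoo path and values are examples, not a prescribed configuration.
Interactive.main(
    model_file='zoo:blenderbot2/blenderbot2_400M/model',
    memory_extractor_phrase='persona:',  # only store lines containing this
    memory_key='personas',               # dataset key holding the memories
)
```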
