This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

SeeKeR #4447

Merged
merged 15 commits into from
Mar 25, 2022

Conversation

klshuster
Contributor

Patch description

Project code for SeeKeR. This PR includes the following:

SeeKeR Agents

SeeKeR Dialogue

  • projects.seeker.agents.seeker:ComboFidAgent: This agent is used to train SeeKeR models, as it is the core agent that handles all of the various functionalities simultaneously.
  • projects.seeker.agents.seeker:SeekerAgent: The SeeKeR agent itself. See the provided opt preset, gen/seeker_dialogue, for an example invocation.

GPT2 SeeKeR

  • projects.seeker.agents.gpt2_seeker:GPT2WithRetrieverAgent: This agent packs the retrieved documents and input context into one big "prompt" for the language model. It utilizes FidAgent for the retriever components.
  • projects.seeker.agents.gpt2_seeker:GPT2ComboAgent: Agent that can both retrieve for some contexts, and not retrieve for others. This agent is used to train GPT2 SeeKeR models.
  • projects.seeker.agents.gpt2_seeker:GPT2SeekerAgent: The full SeeKeR agent, which handles search query, knowledge, and response generation.
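The prompt-packing idea behind GPT2WithRetrieverAgent can be sketched roughly as follows. This is a minimal illustration with hypothetical names, not the PR's actual code; the real agent works on token tensors and uses the model's tokenizer:

```python
def pack_prompt(docs, context, max_tokens, tokenize=str.split):
    """Concatenate retrieved docs and the dialogue context into one prompt,
    truncating from the left so the dialogue context is always kept."""
    prompt = "\n".join(docs) + "\n" + context
    tokens = tokenize(prompt)
    # Keep only the last max_tokens tokens (oldest doc tokens drop first).
    return " ".join(tokens[-max_tokens:])

packed = pack_prompt(
    docs=["doc one text", "doc two text"],
    context="hello there",
    max_tokens=4,
)
```

Left-truncation matters for a decoder-only model: if the packed sequence exceeds the context window, it is the retrieved documents, not the conversation, that should be cut.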

Tasks

  • I've included manually constructed teachers for all 30 task variants used to train the SeeKeR dialogue model. They are found in projects.seeker.agents.tasks.<dialogue/search_query/search_decision/knowledge>

Other

Scripts

  • generate_lm_data.py: Script used for generating the LM data for seeker_lm models.

Opt Presets

  • arch/r2c2_base_3B: architecture for R2C2 3B model
  • arch/r2c2_base_400M: architecture for R2C2 400M model
  • gen/seeker_dialogue: generation parameters for a SeeKeR Dialogue model
  • gen/seeker_lm: generation parameters for a SeeKeR (GPT2) LM

General Task Additions/Fixes

  • Fix the ConvAI2 Normalized teacher to allow the no_cands option
  • Add NaturalQuestionsOpenTeacher

Testing Steps

Included CI to test the most important functionality.

Contributor

@stephenroller stephenroller left a comment


incredible

projects/seeker/agents/gpt2_seeker.py (review comments outdated; resolved)
projects/seeker/scripts/generate_lm_data.py (review comments outdated; resolved)
)
try:
    self.generate_data()
except:
Contributor


nit lint
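The lint nit presumably refers to the bare `except:` in the hunk above. A common fix (a sketch with hypothetical helper names, not the PR's actual change) is to catch `Exception` so that `KeyboardInterrupt` and `SystemExit` still propagate, and to re-raise after any cleanup:

```python
def run_safely(generate_data, cleanup):
    """Run generate_data(); on failure, run cleanup and re-raise with the
    original traceback. Catching Exception (rather than a bare except)
    lets KeyboardInterrupt and SystemExit propagate untouched."""
    try:
        return generate_data()
    except Exception:
        cleanup()
        raise
```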

@jaseweston jaseweston merged commit 7e45300 into main Mar 25, 2022
@jaseweston jaseweston deleted the seeker branch March 25, 2022 13:21
    self.label_vec[batch_id, :-1]
)  # type: ignore
for i, doc in enumerate(search_results):
    url = doc['url']
Contributor

@mojtaba-komeili mojtaba-komeili Mar 25, 2022


nit: thinking about the variations of the search module used, they may often not return `url` or `title`, and the KeyError raised here may not be very informative for someone who uses it without knowing the internals. How about checking that these fields exist and asserting with a message that tells the user what was expected, or logging a warning and replacing the missing entries with a default?
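The reviewer's suggested defaulting could look something like this sketch (the field set beyond `url` is an assumption; only `url` appears in the snippet above):

```python
import warnings

# Assumed search-result fields and fallback defaults; only 'url' is
# confirmed by the code under review, the rest are illustrative.
EXPECTED_FIELDS = {"url": "", "title": "<no title>", "content": ""}

def normalize_doc(doc):
    """Return a copy of doc with any missing expected field filled in,
    emitting a warning instead of letting a KeyError surface later."""
    out = dict(doc)
    for key, default in EXPECTED_FIELDS.items():
        if key not in out:
            warnings.warn(f"search result missing {key!r}; using {default!r}")
            out[key] = default
    return out
```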

Contributor Author


That's not a bad idea. I took this function verbatim from the one in parlai/agents/rag/retrievers.py; the only change is lines 104 to 106. If we want to change that, we should go directly to the function there.

#######################################


class IdentityLayer(torch.nn.Module):
Contributor


I remember Spencer added an IdentityLayer to ParlAI recently. Is it possible to use that one instead of the custom one here?

Contributor Author


Perhaps, but that PR (#4329) is not actually merged yet. As I noted in the docstring, we do have an IdentityLayer in `parlai.utils.torch.IdentityLayer`, but I required custom output here.

# if no padding, docs are assumed to be min doc length long
doc_lens[doc_lens.le(0)] = self.min_doc_len
new_enc_out = enc_out.clone()
# BEFORE:
Contributor


Question: we had this BEFORE and AFTER for FiD because we needed to encode the doc with the input. In this module, which only does string operations, is there still a reason for doing it this way, other than reusing the pre-existing methods for making masks, etc.?

Contributor Author


Well, we're actually still doing operations on tensors, which is why it's a bit confusing. This mess of logic is the result of me trying to fit a GPT2 decoder-style model into a FiD setup, rather than the other way around.

and self.search_decision is SearchDecision.COMPUTE
):
self.search_decision_agent.self_observe(self_message)
self.search_decision_agent.history.reset()
Contributor


Does the decision agent need to `self_observe` if the history is getting reset right after?

Contributor Author


I'm calling history.reset directly, so the agent still needs a self_observe.

This may be old code, though; I've since realized I can just call self.search_decision_agent.reset(), which will reset the invariants.

best_doc = None
best_doc_idx = None
best_f1 = None
for f1, ind in zip(f1s, inds):
Contributor


The function name says best doc. Why does it look like it picks the first doc that is above the threshold?

Contributor Author


Ahh, because we assume the docs are retrieved in order of their quality, so presumably the first doc above the threshold is in fact the best doc.

Contributor


They are, but the metric used by the search engine might be different from the F1 metric we use here.

Contributor Author


Agreed; we're also placing more trust in the search engine than in a heuristic word-overlap metric.
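The behavior discussed in this thread can be sketched as follows (hypothetical names; the real implementation is in the PR). Retrieval order is treated as the quality ranking, so the first document whose word-overlap F1 with the target clears the threshold wins:

```python
def f1_overlap(a, b):
    """Set-based unigram F1 between two whitespace-tokenized strings."""
    ta, tb = set(a.split()), set(b.split())
    common = len(ta & tb)
    if common == 0:
        return 0.0
    precision, recall = common / len(ta), common / len(tb)
    return 2 * precision * recall / (precision + recall)

def pick_best_doc(docs, target, threshold=0.3):
    """Return (index, doc) for the first doc in retrieval order whose F1
    with the target exceeds the threshold, trusting the search engine's
    ranking; return (None, None) if nothing clears it."""
    for i, doc in enumerate(docs):
        if f1_overlap(doc, target) > threshold:
            return i, doc
    return None, None
```

As the thread notes, this trades exhaustive scoring for the engine's own ranking: a later doc could have a higher F1, but scanning stops at the first one above the threshold.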
