Conversation
incredible
```python
)
try:
    self.generate_data()
except:
```
nit: lint flags the bare `except:` here; catch a specific exception (or at least `Exception`) instead.
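A minimal sketch of the usual fix for this lint nit, assuming the intent is simply "don't crash on generation failure" (the wrapper name is hypothetical, not from this PR):

```python
import logging


def safe_generate(generate_fn):
    """Wrap a data-generation call; catch Exception rather than using a
    bare except, so KeyboardInterrupt and SystemExit still propagate."""
    try:
        return generate_fn()
    except Exception as exc:
        logging.warning("data generation failed: %s", exc)
        return None
```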
```python
    self.label_vec[batch_id, :-1]
)  # type: ignore
for i, doc in enumerate(search_results):
    url = doc['url']
```
nit: thinking about the variation in the search module used, it often may not return fields like `url` or `title`. The `KeyError` generated here may not be very informative for someone who uses it but doesn't know it's internal. How about checking that all these fields exist and adding an assert that tells the user what was expected, or logging a warning and replacing the missing entries with a default?
That's not a bad idea; I took this function literally verbatim from the one in `parlai/agents/rag/retrievers.py`; the only change is lines 104 to 106. If we want to change that, we should go directly to the function there.
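A sketch of the reviewer's suggestion: validate the search-result fields up front and fall back to defaults with a warning. The field names and default values here are assumptions, not the actual retriever contract:

```python
import logging

# Assumed expected fields and fallbacks; the real retriever may differ.
EXPECTED_FIELDS = {'url': '', 'title': '<unknown title>', 'content': ''}


def normalize_doc(doc):
    """Fill missing search-result fields with defaults and warn,
    instead of letting an uninformative KeyError escape."""
    for key, default in EXPECTED_FIELDS.items():
        if key not in doc:
            logging.warning("search result missing '%s'; using default", key)
            doc[key] = default
    return doc
```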
```python
#######################################


class IdentityLayer(torch.nn.Module):
```
I remember Spencer added an IdentityLayer to ParlAI recently. Is it possible to use that one instead of the custom one here?
Perhaps, but that PR is not actually merged yet (#4329). As I noted in the docstring, we do have an IdentityLayer in `parlai.utils.torch.IdentityLayer`, but I required custom output here.
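For reference, the pass-through part of such a layer is tiny; here is a minimal sketch (the actual ParlAI version and the custom one in this PR differ in how they shape their output):

```python
import torch


class IdentityLayer(torch.nn.Module):
    """Minimal identity module: returns its input unchanged."""

    def forward(self, x, *args, **kwargs):
        return x
```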
```python
# if no padding, docs are assumed to be min doc length long
doc_lens[doc_lens.le(0)] = self.min_doc_len
new_enc_out = enc_out.clone()
# BEFORE:
```
Question: we had this before/after split for FiD because we needed to encode the doc with the input. In this module, which only does string operations, is there still a reason for doing it this way, other than reusing pre-existing methods for making masks, etc.?
well we're actually still doing operations on tensors, which is why it's a bit confusing. This mess of logic is a result of me trying to format a GPT2 decoder-style model into a FiD setup, rather than the other way around
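The tensor operation quoted above (clamping non-positive doc lengths up to the minimum) can be illustrated in isolation; the values below are made up:

```python
import torch

# Made-up lengths; a value <= 0 means no padding token was found.
doc_lens = torch.tensor([4, 0, -1, 7])
min_doc_len = 2  # stand-in for self.min_doc_len

# Clamp non-positive lengths up to the minimum doc length.
doc_lens[doc_lens.le(0)] = min_doc_len
```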
```python
    and self.search_decision is SearchDecision.COMPUTE
):
    self.search_decision_agent.self_observe(self_message)
    self.search_decision_agent.history.reset()
```
Does the decision agent need to `self_observe` with the history getting reset right after?
I'm calling `history.reset()` directly, so the agent still needs a self-observe. This may be old code though; what I've realized is that now I can just call `self.search_decision_agent.reset()`, which will reset the invariants.
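A toy illustration of why a single `reset()` on the agent can subsume the `self_observe` + `history.reset()` pair: the agent-level reset clears both the history and any per-turn state. These stub classes are hypothetical stand-ins, not ParlAI's actual API:

```python
class HistoryStub:
    """Hypothetical stand-in for an agent's conversation history."""

    def __init__(self):
        self.turns = []

    def reset(self):
        self.turns.clear()


class DecisionAgentStub:
    """Hypothetical stand-in for the search-decision agent: reset()
    clears both the history and the last self-observed message."""

    def __init__(self):
        self.history = HistoryStub()
        self.last_self_message = None

    def self_observe(self, message):
        self.last_self_message = message
        self.history.turns.append(message)

    def reset(self):
        self.history.reset()
        self.last_self_message = None
```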
```python
best_doc = None
best_doc_idx = None
best_f1 = None
for f1, ind in zip(f1s, inds):
```
The function name says best doc. Why does it look like it picks the first doc that is above the threshold?
Ahh, because we assume that the docs are retrieved in order of their quality, so presumably the first doc that is above the threshold is in fact the best doc.
They are, but the metric used by the search engine might differ from the F1 metric we use here.
Agreed; we're also placing our trust more in the search engine than in a heuristic word-overlap metric.
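The behavior discussed above can be sketched as follows; the function name is hypothetical, but the loop mirrors the quoted snippet:

```python
def first_doc_over_threshold(f1s, inds, docs, threshold):
    """Return (doc, index, f1) for the first doc whose F1 beats the
    threshold. Docs are assumed to be pre-ranked by the search engine,
    so the first qualifying doc is treated as the best one."""
    for f1, ind in zip(f1s, inds):
        if f1 > threshold:
            return docs[ind], ind, f1
    return None, None, None
```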
Patch description

Project code for SeeKeR. This PR includes the following:

SeeKeR Agents

SeeKeR Dialogue
- `projects.seeker.agents.seeker:ComboFidAgent`: This agent is used to train SeeKeR models, as it is the core agent that handles all of the various functionalities simultaneously.
- `projects.seeker.agents.seeker:SeekerAgent`: The SeeKeR agent itself. See the provided opt preset, `gen/seeker_dialogue`, for an example invocation.

GPT2 SeeKeR
- `projects.seeker.agents.gpt2_seeker:GPT2WithRetrieverAgent`: This agent packs the retrieved documents and the input context into one big "prompt" to the language model. Utilizes `FidAgent` for retriever components.
- `projects.seeker.agents.gpt2_seeker:GPT2ComboAgent`: Agent that can both retrieve for some contexts and not retrieve for others. This agent is used to train GPT2 SeeKeR models.
- `projects.seeker.agents.gpt2_seeker:GPT2SeekerAgent`: Full SeeKeR agent that handles search query, knowledge, and response generation.

Tasks
- `projects.seeker.agents.tasks.<dialogue/search_query/search_decision/knowledge>`

Other

Scripts
- `generate_lm_data.py`: Script used for generating the LM data for `seeker_lm` models.

Opt Presets
- `arch/r2c2_base_3B`: architecture for the R2C2 3B model
- `arch/r2c2_base_400M`: architecture for the R2C2 400M model
- `gen/seeker_dialogue`: generation parameters for a SeeKeR Dialogue model
- `gen/seeker_lm`: generation parameters for a SeeKeR (GPT2) LM

General Task Additions/Fixes
- `no_cands` option for `NaturalQuestionsOpenTeacher`

Testing Steps

Included CI to test the most important functionality.