[RAD] OSS RAG & FiD #3611
Conversation
```python
if self.use_codes:
    ctxt_rep, ctxt_rep_mask, _ = self.model(**self._model_context_input(batch))
else:
    model = self.model.module if hasattr(self.model, 'module') else self.model
```
nit: add a comment mentioning DistributedDataParallel
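A sketch of what that comment might look like on the unwrap line from the diff above (the wording is a suggestion, not the committed code):

```python
# self.model may be wrapped in DistributedDataParallel (or DataParallel),
# which exposes the underlying module via .module; unwrap it so we can
# call methods that the wrapper does not forward.
model = self.model.module if hasattr(self.model, 'module') else self.model
```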
`parlai/agents/rag/model_types.py` (Outdated)
```python
    torch.Tensor,
]:
    """
    Reorder the encoder states, for bean search.
```
lol bean search
i'm tempted to leave this in haha
```python
    return dec_inputs  # type: ignore


class RagModelInterface(ABC):
```
Love this!
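For context, a minimal sketch (class and method names hypothetical, not ParlAI's actual interface) of the pattern being praised: an abstract base class lets each RAG model type supply its own hooks while the agent stays generic:

```python
from abc import ABC, abstractmethod
import torch

class ModelTypeInterface(ABC):
    """Hypothetical sketch: each RAG model type (token/sequence/turn)
    implements these hooks, and the agent delegates to whichever is active."""

    @abstractmethod
    def get_initial_decoder_input(self, input: torch.LongTensor) -> torch.LongTensor:
        """Return the first decoder input, expanded per retrieved document."""

    @abstractmethod
    def marginalize(self, out_probs: torch.Tensor, doc_probs: torch.Tensor) -> torch.Tensor:
        """Combine per-document generator scores with retrieval scores."""
```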
`parlai/agents/fid/fid.py` (Outdated)
```python
def reorder_encoder_states(
    self,
    encoder_states: Tuple[torch.Tensor, ...],
    indices: Union[List[int], torch.LongTensor],
) -> Tuple[torch.Tensor, torch.Tensor, List[List[Document]], torch.Tensor]:
    """
    Reorder the encoder states.

    Override TGM.reorder_encoder_states to make sure we only pass enc, mask.

    See ``TorchGeneratorModel.reorder_encoder_states`` for a description.
    """
    enc, mask, *_ = encoder_states
    return TransformerGeneratorModel.reorder_encoder_states(
        self, (enc, mask), indices
    )
```
This may be out of the scope of this PR. But I feel like we could eliminate having to override this if `encoder_states` were a dict or `**kwargs` instead of a tuple. In the scope of this PR, you could remove the need for this function by modifying `TransformerGeneratorModel.reorder_encoder_states` to do `enc, mask = encoder_states[:2]` (see the sketch below).
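A sketch of that suggested modification (the body paraphrases the usual TorchGeneratorModel reordering logic; treat the details as approximate rather than the actual ParlAI source):

```python
def reorder_encoder_states(self, encoder_states, indices):
    # Take only the first two elements, so subclasses whose encoder states
    # carry extra entries (e.g. retrieved documents) need not override this.
    enc, mask = encoder_states[:2]
    if not torch.is_tensor(indices):
        indices = torch.LongTensor(indices).to(enc.device)
    enc = torch.index_select(enc, 0, indices)
    mask = torch.index_select(mask, 0, indices)
    return enc, mask
```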
I actually find the forcing nice, as it causes people to realize they need to handle reordering states. kwargs could hide something that isn't being properly shuffled
i should probably clarify this docstring, as we're overriding `RagModel.reorder_encoder_states`, not TGM
"[Fid]: I love Elvis Presley! He is my favorite singer, songwriter, actor, and producer." | ||
), | ||
"example2": ( | ||
"parlai eval_model -mf zoo:hallucination/bart_fid_rag_dpr_poly/model -t wizard_of_wikipedia --num-examples 100" |
gotta remove this and the following `--num-examples 100`, as these are numbers from the full valid set
"example": ( | ||
"parlai eval_model -mf zoo:hallucination/bart_rag_token/model --indexer-type exact --path-to-index zoo:hallucination/wow_passages/exact --path-to-dpr-passages zoo:hallucination/wow_passages/wow_articles.paragraphs.tsv -ne 100" | ||
), | ||
"result": ("TODO"), |
also gotta fill in these results
```sh
--batchsize 16 --fp16 True --gradient-clip 0.1 --label-truncate 128 \
--log-every-n-secs 30 --lr-scheduler reduceonplateau --lr-scheduler-patience 1 \
--model-parallel True --optimizer adam --text-truncate 512 --truncate 512 \
-lr 1e-05 -vmm min -veps 0.25 -vme 1000 -vmt ppl -vp 5 \
```
maybe add a note here to open an issue if other options are desired?
Option prefixes are available now...
i know i added one for BART. i wasn't sure what to call these though... `opt/rag`? since these are optimization/training parameters
you can put them in the project folder and specify it with `-o projects/hallucination/very_long_name.opt`
wow, wasn't aware of that, i'll try it out
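For reference, ParlAI opt presets are JSON files of flag/value pairs; a minimal sketch of the suggested workflow (the filename and values here are illustrative, echoing the flags in the diff above):

```sh
# Hypothetical preset at projects/hallucination/bart_rag.opt containing, e.g.:
#   {"batchsize": 16, "fp16": true, "optimizer": "adam", "learningrate": 1e-05}
# Reference it instead of repeating the flags on every command:
parlai train_model -m rag -t wizard_of_wikipedia -o projects/hallucination/bart_rag.opt
```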
This is great! Unless @stephenroller or @moyapchen have any concerns, it feels ready to merge.
```python
encoder=opt['t5'].get_encoder(),
encoder_class=ParlaiT5Encoder,
```
This is pretty funky. It should be made better by the "swappable subcomponents" change.
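For readers unfamiliar with that planned change, a minimal sketch (class and argument names hypothetical) of the "swappable subcomponents" idea: construction accepts either a prebuilt module or a class to instantiate, so call sites like the diff above stop reaching into opt:

```python
import torch.nn as nn

class GeneratorModel(nn.Module):
    """Hypothetical sketch: pass a prebuilt encoder (e.g. a pretrained
    T5 encoder) or a class from which to build the default one."""

    def __init__(self, opt, encoder=None, encoder_class=None):
        super().__init__()
        # reuse the supplied module if given; otherwise build the swappable default
        self.encoder = encoder if encoder is not None else encoder_class(opt)
```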
Looks like tests still don't pass (maybe merge master into this?) but I defer.
i'll try getting tests to pass before i merge this
Patch description
RAG and FiD, in ParlAI. There's too much to describe in the PR description, so I would direct readers to the included READMEs for more detailed instructions. I'll give a quick list here of what's encompassed by these changes:

- `--model rag`, `--model fid`
- `DropoutPolyencoder`, the base model of the PolyFAISS method described in [this paper](https://arxiv.org/abs/2104.07567)
- an `opt_preset` file for BART-Large

The following RAG options are implemented:
Testing steps
CI