Breaking transformer modules.py into smaller files #3466

spencerp · 2021-02-24T00:00:05Z

Patch description
This is one small step in a larger effort to refactor the transformer code to make it easier to modify and extend. There shouldn't be any functional changes as part of this diff, just organizational.

Breaking modules.py into smaller files is motivated by a couple things:

It's currently 1490 lines long. With smaller files, I hope to make each component easier to read, and make it more obvious the ways in which they are independent from or dependent of each other.
It contains a mixture of composable components that are used throughout the code (TransformerEncoder, MultiHeadAttention, etc) and complete models that usually stand on their own (TransformerMemNetModel and TransformerGeneratorModel). By making the independence of these subcomponents more explicit, I hope to encourage their reuse.

If this seems more confusing instead of less confusing to anyone, please let me know!

Testing steps
Circle CI.
Sanity check with a canonical example:

parlai train_model --task twitter --model-file /tmp/tr_twitter --model transformer/ranker --batchsize 16 --validation-every-n-secs 3600 --candidates batch --eval-candidates batch --data-parallel True

klshuster

i'm certainly not opposed to this, as it seems to be a cosmetically pleasing change!

would just hope that nothing breaks of course. looks like tests currently failing but seemed flaky so i re-ran them

parlai/agents/image_seq2seq/modules.py

parlai/agents/transformer/modules/__init__.py

jaseweston · 2021-02-26T00:11:03Z

parlai/agents/transformer/modules/encoder.py

+    LayerNorm,
+    normalize,
+    MultiHeadAttention,
+    TransformerFFN,


what we really need is though, if you wanted to say change the FFN but not copy all the code this shouldnt be a hard import in here, otherwise when trying out new architectures you still have to copy this file just to change this one line.. so i'm not sure separating into separate files is actually helping that much

@stephenroller ?

Ah yeah this isn't the whole refactor. How I'm thinking about it is from two angles:

(the simpler change that this PR is moving toward) It should be easier to reuse/reorder components of the transformer. So if you want to build a new architecture you could mostly build it from components instead of copying and pasting all the code into an empty file.

(the harder change that you're referring to) It should be easier to swap out a specific component while keeping the rest of the architecture intact.

For 2, I've been leaning towards a builder + protocol pattern. But I'm hoping to get feedback on that part in more detail in our next ParlAI meeting.

Honestly we could accomplish 1 without this change, but as I was playing with the code this reorganization felt easier to understand. And I'm hoping that keeping components in separate files encourages the code to stay as clean as it is now. But if this feels more confusing or cumbersome for any reason we don't need to do this.

got it.

hmm maybe the laziest way of doing (2) with very little code change would be to import the modules from modules.py which is a file in the same directory.. and that file points to the actual implementations such as modules/ffn.py (but doesn't have any other code)
This way in a new agent directory you only have to have that small file pointing to your reimplementations (+ the top level model.py which inherits transformer)

would that work?

Interesting! It sounds similar to what I'm thinking, but exploits the directory structure more which has some advantages (one big one being less code like you mentioned).

With that pattern, how would you tell the TransformerGeneratorModel that it should look in .modules relative to the agent directory? Are you thinking we'd just look for a standard class name (like how we look for a DefaultTeacher)?

I'm curious how unittest.mock.patch works. Maybe your pattern could whatever that does. I think it might just reassign the module attribute after importing.

One thing I'd like to move away from is inheritance (though I'm sure people will have opinions about this). Personally I've found that trying to follow the movement of data through the code gets very difficult when I have to keep moving between classes. And it can get particularly hairy when subclass logic modifies properties used by superclass logic. So moving towards protocols as part of the protocol + builder pattern I mentioned is kind of accomplishing multiple things from my perspective. But if it turns out that I'm the only one and everyone else loves inheritance then that pattern becomes less enticing.

something like:

from .ffn import TransformerFFN

already looks in the current agent's directory...
if you want to use the classic you use
from parlai.agents.transformer.modules.fnn import TransformerFFN
i guess?

the good thing about this it's a pretty small code change from where we are now, so we could just try it out...

i will +1 the confusion that comes with inheritance; I'm curious to see what this protocol/builder pattern is

You maybe could do Jason's proposal with something like:

try: from .local_modules import Whatever except ImportError: from parlai.agents.transformer.modules.whatever import Whatever

I'm not reaaaaally sure i like that tho.

Hm, so does this proposal still require copying TransformerGeneratorModel into your new agent?

The way I'm interpreting what you guys are saying is that you'd do something like:

$ cd ~/ParlAI/parlai $ cp agents/transformer/generator.py agents/my_custom_transformer/my_custom_transformer.py $ touch agents/my_custom_transformer/modules.py

Then you'd implement your own TransformerFFN in agents/my_custom_transformer/modules.py.

This would require that

try: from .modules import TransformerFFN except ImportError: from parlai.agents.transformer.modules import TransformerFFN

and all other component imports live in whichever file you've copied into your agent directory (in this example, generator.py). Which I think brings us back to a monolithic file?

A way to accomplish this without all the imports living in a single file could be monkey patching. I think it would something like this in agents/my_custom_transformer/my_custom_transformer.py:

import parlai.agents.transformer.generator as transformer_generator from parlai.core.torch_generator_agent import TorchGeneratorAgent class MyCustomMHA(nn.Module): ... transformer_generator.modules.MultiHeadAttention = MyCustomMHA class MyCustomTransformerAgent(TorchGeneratorAgent): def build_model(self, states=None): return transformer_generator.TransformerGeneratorModel(self.opt, self.dict)

This provides a lot of flexibility and doesn't require much code, but comes with a lot of dangers (the wikipedia article has a nice list under "Pitfalls": https://en.wikipedia.org/wiki/Monkey_patch)

Hmm, my intuition probably would have been either to do monkey patching or registering subcomponents, but I admit the limitations with monkey patching so I'll defer to others who have thought more about this. But personally I'll say that I've never minded inheritance-based solutions - I use an IDE (PyCharm) that tracks the value of all methods/attributes in arbitrary subclasses. But that's just me...

This whole debate just sounds like a poor man's protocol pattern to me...

EricMichaelSmith

I think this is good to have! I'm curious about your builder + protocol pattern - I think I'd need to visualize more what the immediate usecases will be of swapping out certain components for others

parlai/agents/transformer/modules/__init__.py

EricMichaelSmith · 2021-02-26T22:33:01Z

parlai/agents/transformer/modules/encoder.py

+    LayerNorm,
+    normalize,
+    MultiHeadAttention,
+    TransformerFFN,


Hmm, my intuition probably would have been either to do monkey patching or registering subcomponents, but I admit the limitations with monkey patching so I'll defer to others who have thought more about this. But personally I'll say that I've never minded inheritance-based solutions - I use an IDE (PyCharm) that tracks the value of all methods/attributes in arbitrary subclasses. But that's just me...

spencerp added 4 commits February 23, 2021 08:00

Separating out components of transformer into their own files

cdd1fda

move the rest of transformer modules out of modules.py

0ac9a18

remove transformer modules.py

e6da0f6

resolving imports

acd8ea0

spencerp requested review from stephenroller, EricMichaelSmith, klshuster, jaseweston and emilydinan February 24, 2021 00:00

facebook-github-bot added the CLA Signed label Feb 24, 2021

klshuster reviewed Feb 26, 2021

View reviewed changes

parlai/agents/image_seq2seq/modules.py Outdated Show resolved Hide resolved

parlai/agents/transformer/modules/__init__.py Show resolved Hide resolved

jaseweston reviewed Feb 26, 2021

View reviewed changes

EricMichaelSmith reviewed Feb 26, 2021

View reviewed changes

spencerp added 2 commits March 2, 2021 21:39

Merge branch 'master' into tf-components

2861b64

move functions into modules, fix imports

998e37d

spencerp mentioned this pull request Mar 4, 2021

Pass Opt to TransformerEncoder/Decoder #3486

Merged

spencerp added 3 commits March 23, 2021 08:24

Merge branch 'master' into tf-components

ff28367

make a constants.py

3d4ff2c

noqa comment placement 🙄

05d4dd6

spencerp changed the title ~~[RFC] Breaking transformer modules.py into smaller files~~ Breaking transformer modules.py into smaller files Mar 23, 2021

klshuster approved these changes Mar 23, 2021

View reviewed changes

EricMichaelSmith approved these changes Mar 23, 2021

View reviewed changes

spencerp merged commit 7171d7c into master Mar 23, 2021

spencerp deleted the tf-components branch March 23, 2021 18:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Breaking transformer modules.py into smaller files #3466

Breaking transformer modules.py into smaller files #3466

spencerp commented Feb 24, 2021 •

edited

Loading

klshuster left a comment

jaseweston Feb 26, 2021

spencerp Feb 26, 2021

jaseweston Feb 26, 2021 •

edited

Loading

spencerp Feb 26, 2021 •

edited

Loading

jaseweston Feb 26, 2021

klshuster Feb 26, 2021

stephenroller Feb 26, 2021

spencerp Feb 26, 2021 •

edited

Loading

EricMichaelSmith Feb 26, 2021

stephenroller Feb 27, 2021

EricMichaelSmith left a comment

EricMichaelSmith Feb 26, 2021

Breaking transformer modules.py into smaller files #3466

Breaking transformer modules.py into smaller files #3466

Conversation

spencerp commented Feb 24, 2021 • edited Loading

klshuster left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jaseweston Feb 26, 2021 • edited Loading

Choose a reason for hiding this comment

spencerp Feb 26, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

spencerp Feb 26, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

EricMichaelSmith left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

spencerp commented Feb 24, 2021 •

edited

Loading

jaseweston Feb 26, 2021 •

edited

Loading

spencerp Feb 26, 2021 •

edited

Loading

spencerp Feb 26, 2021 •

edited

Loading