
Add XLNetBackbone #928

Closed
wants to merge 6 commits

Conversation

shivance
Collaborator

@shivance shivance commented Mar 26, 2023

@shivance shivance mentioned this pull request Mar 26, 2023
@shivance
Collaborator Author

shivance commented Mar 26, 2023

Still WIP.

@shivance shivance marked this pull request as draft March 27, 2023 14:34
@shivance
Collaborator Author

Hey @mattdangerw & @chenmoneygithub, it turns out that adding XLNet requires MultiHeadRelativeAttention and TwoStreamRelativeAttention, in addition to the relative positional encoding and Transformer-XL layers.

Should I break this PR up into multiple ones? I'm not able to make the entire model work correctly just yet.
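For context, the Transformer-XL style relative attention mentioned above relies on a "relative shift" trick to realign position-indexed attention scores into relative-position scores. A minimal NumPy sketch of the idea (the function name and shapes are illustrative, not code from this PR):

```python
import numpy as np

def rel_shift(x):
    """Realign a [qlen, klen] matrix of raw position scores so that
    each row is shifted into relative-position alignment, as in
    Transformer-XL style attention.

    Positions shifted past the boundary become zero here; a real
    implementation masks those entries.
    """
    qlen, klen = x.shape
    # Pad one zero column, reshape so rows slide by one step each,
    # drop the first row, and reshape back.
    padded = np.concatenate([np.zeros((qlen, 1)), x], axis=1)  # [qlen, klen+1]
    shifted = padded.reshape(klen + 1, qlen)[1:].reshape(qlen, klen)
    return shifted

# Worked example:
x = np.arange(1.0, 5.0).reshape(2, 2)  # [[1, 2], [3, 4]]
rel_shift(x)                           # → [[2, 0], [3, 4]]
```

This is the same pad-reshape-slice pattern used by most Transformer-XL/XLNet implementations to avoid materializing a full relative-position score tensor.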

@mattdangerw
Member

@shivance timely question! We actually have a contribution guide for models that's about to land here: #820

Could you try following that and give us feedback?

Re the overall structure, let's keep everything in a models/xlnet directory for now (don't contribute directly to keras_nlp.layers just yet, though we can follow up on that), and while looking at Hugging Face or Model Garden for inspiration is totally fine, try to make sure we stick to the "local" style of KerasNLP wherever possible when writing layers, models, etc.

We should also make sure the full forward pass matches correctly before we get too deep into review on style, etc. For starters, you can usually make a Colab that aligns the forward pass with a reference implementation and share it here!
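The forward-pass alignment described above usually boils down to running identical inputs through both implementations with the same weights and comparing outputs numerically. A framework-free toy sketch of the idea (the two "implementations" below are stand-ins, not real XLNet code):

```python
import numpy as np

# Stand-in "reference" and "ported" implementations of the same layer.
# In practice these would be, e.g., a Hugging Face model and the
# KerasNLP reimplementation with the weights copied over.
def reference_forward(x, w, b):
    return np.tanh(x @ w + b)

def ported_forward(x, w, b):
    # Written differently on purpose: a correct port should still
    # produce numerically matching output given the same weights.
    return np.tanh(np.matmul(x, w) + b)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4))
w = rng.normal(size=(4, 3))
b = rng.normal(size=(3,))

ref = reference_forward(x, w, b)
port = ported_forward(x, w, b)

# The check used when validating a port: outputs must agree to
# within floating-point tolerance.
np.testing.assert_allclose(ref, port, atol=1e-6)
```

With real models the same pattern applies, just with tokenized inputs and weight-mapping code in between.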

@abheesht17
Collaborator

@shivance, let me know when this is ready for review!

@shivance
Collaborator Author

shivance commented Apr 6, 2023

> @shivance, let me know when this is ready for review!

Really sorry for slacking off here; I haven't been able to find time because of personal reasons. Will follow up soon.

@abheesht17
Collaborator

No hurry, take your time! Take care!

@shivance shivance closed this Jun 4, 2023
@susnato
Contributor

susnato commented Jun 4, 2023

Hi @shivance, can I please continue this PR?

@shivance
Collaborator Author

shivance commented Jun 4, 2023

@susnato I have already done the basic groundwork, but the PR is bug-prone. I'm not planning to work on it anytime soon, as I've been busy with #1052. So you could fork my branch and continue from the commits I've made so far. What do you say?

@susnato
Contributor

susnato commented Jun 4, 2023

Hi @shivance, thanks for the quick reply. I would love to continue your work. How about I fork your branch and push commits to your repo (to shivance/keras-nlp_branch...), and you merge them? That way we both share credit. Or, if you'd prefer, I can copy the existing code and open a new PR. What do you say?

@shivance
Collaborator Author

shivance commented Jun 4, 2023 via email

@susnato
Contributor

susnato commented Jun 5, 2023

Then please add me as a collaborator, @shivance.

@shivance shivance reopened this Jun 6, 2023
@shivance
Collaborator Author

shivance commented Jun 6, 2023

@susnato just gave you access!

@susnato
Contributor

susnato commented Jun 10, 2023

Hi @mattdangerw, I am currently working on this integration. I have one small doubt. As you said here:

> "We also should make sure we have the full forward pass matching correctly before we get too deep into review on style"

I am using the Hugging Face implementation and weights as the reference for the forward pass, since it is TensorFlow 2. Is that OK, or should I switch to another implementation?

@susnato
Contributor

susnato commented Jun 16, 2023

Hi @mattdangerw @abheesht17, the XLNet implementation has some optional arguments, such as `mems` and `perm_mask`, which are mostly used during pretraining; during inference we usually leave them as `None`. But in KerasNLP we don't define a `call` method for backbones (as I noticed in all models); instead we define all the `Input` layers in `__init__`. That complicates things: if we define an `Input` layer for `mems`, we can't leave it as `None` during inference, otherwise it will raise an error.

Should I define a `call` method in the backbone (which would make this backbone an exception), or should I use a workaround (e.g., passing `-1e+9` instead of `None`)?
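To make the trade-off concrete, here is a plain-Python sketch of the `call`-method option (this is not the actual KerasNLP `Backbone` API; the class and argument names are hypothetical). A functional-style graph must declare every input up front, so an optional `mems` input cannot simply be omitted, whereas an explicit `call` can substitute a harmless default:

```python
import numpy as np

class ToyXLNetBlock:
    """Toy stand-in for an XLNet block with an optional memory input."""

    def __init__(self, hidden_dim):
        self.hidden_dim = hidden_dim

    def call(self, hidden_states, mems=None):
        # Workaround sketched here: treat a missing memory as a
        # zero-length memory, instead of forcing callers to pass a
        # placeholder tensor for an Input layer.
        if mems is None:
            mems = np.zeros((0, self.hidden_dim))
        # Transformer-XL style: attend over [mems; hidden_states].
        context = np.concatenate([mems, hidden_states], axis=0)
        return context

block = ToyXLNetBlock(hidden_dim=8)
h = np.ones((4, 8))

out_no_mems = block.call(h)                          # inference: mems omitted
out_with_mems = block.call(h, mems=np.ones((2, 8)))  # pretraining: mems given
```

The same zero-length-default idea also works without overriding `call` (by concatenating a constant empty tensor in the functional graph), which is one way to keep the backbone consistent with the other models.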

@susnato
Contributor

susnato commented Jun 17, 2023

Hi @shivance, I'm sorry, but I think it's better to open a new PR. Since you created this one, I would need to tag you every time to change the description, to move it from draft to ready for review, or to request reviews from maintainers, and you might get bothered by being tagged all the time. So I'm making a new PR; please close this one. And please don't worry, I'm going to continue from your commits.

@susnato susnato mentioned this pull request Jun 17, 2023
@shivance shivance closed this Jun 27, 2023
@shivance shivance deleted the xlnet-backbone branch June 27, 2023 08:52
Successfully merging this pull request may close these issues.

Add XLNetBackbone to models/
4 participants