The idea is really cool (an outline is really important for long generations).
However, based on my own experience, you can get far better results if the prompt includes the outline, a summary of the story up to this point (generated with recursive summarization), and the previous paragraph, with the model's goal being to predict the next paragraph.
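To make that concrete, here is a rough sketch of the prompt assembly I have in mind. It is not from this repo; `generate(prompt)` stands in for whatever completion call you already use (e.g. the J1 API) and returns a string, and the section headers in the prompt are just illustrative.

```python
def recursive_summary(generate, text: str, max_chars: int = 2000) -> str:
    """Naive recursive summarization: summarize chunks, then summarize the summaries."""
    if len(text) <= max_chars:
        return text
    chunks = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    partials = "\n".join(
        generate(f"Summarize the following passage in a few sentences:\n{c}\n\nSummary:")
        for c in chunks
    )
    return recursive_summary(generate, partials, max_chars)


def next_paragraph(generate, outline: str, story_so_far: list) -> str:
    """Build the prompt from the outline, a running summary, and the previous paragraph."""
    summary = recursive_summary(generate, "\n\n".join(story_so_far)) if story_so_far else ""
    previous = story_so_far[-1] if story_so_far else ""
    prompt = (
        f"Outline:\n{outline}\n\n"
        f"Story so far (summary):\n{summary}\n\n"
        f"Previous paragraph:\n{previous}\n\n"
        "Next paragraph:\n"
    )
    return generate(prompt)


# Usage: append paragraphs one at a time until the outline is exhausted.
# story = []
# for _ in range(num_paragraphs):
#     story.append(next_paragraph(generate, outline, story))
```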
It would be interesting to see the results of your approach, since I have not tried this with the J1 models.
Do you have any results you are willing to share about this?
(BTW, the best way I have found to generalize to lengths longer than the training prompt is relative positional attention (as in Transformer-XL), but I assume that is off the table with an already pre-trained 178B model.)
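For reference, this is roughly what I mean by relative attention, as a simplified single-head sketch with a learned bias per clipped relative offset (closer to a Shaw/T5-style bias than the full Transformer-XL decomposition, but it shows why it extrapolates: positions beyond the trained range just reuse the clipped bias):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RelativeBiasSelfAttention(nn.Module):
    """Single-head self-attention with a learned bias per relative offset,
    clipped to +/- max_rel, so it can run on sequences longer than those
    seen in training. Simplified illustration, not the exact Transformer-XL scheme."""

    def __init__(self, dim: int, max_rel: int = 128):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        self.max_rel = max_rel
        self.rel_bias = nn.Parameter(torch.zeros(2 * max_rel + 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, dim)
        b, n, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) / d ** 0.5            # (b, n, n)
        # Relative offsets j - i, clipped to the range seen during training.
        offsets = torch.arange(n, device=x.device)
        rel = (offsets[None, :] - offsets[:, None]).clamp(-self.max_rel, self.max_rel)
        scores = scores + self.rel_bias[rel + self.max_rel]     # broadcast over batch
        attn = F.softmax(scores, dim=-1)
        return self.out(attn @ v)
```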