In-context Learning task #4
Comments
Thanks for taking an interest in the code! I'm not immediately sure what the issue could be, but some things to try are:
Note that you likely need to try a number of random seeds to get a model that successfully learns the task. To save time, we also used a "patience" of 25 (this is a possible argument to the
Could you also share any more details about what you observe? Do you get "loss is NaN" right away, or only after some training?
Hi @danfriedman0, I also had an issue replicating the induction experiment. The command was as suggested above and is copied below. I used a modified file, "experiment_run_n.py", which takes an additional "patience" argument and moves on to the next seed whenever a training run exhausts its patience. Training returned a constant loss=5.81e+29 for every seed from 0 all the way to 100. By the way, some other experiments, such as "sort" and "reverse", seemed to work.
CUDA_VISIBLE_DEVICES=0 python experiment_run_n.py
Also, it would be great if you could share the configurations for replicating all experiments in the paper, like those for "sort" and "conll_ner" in the README.md. Thanks!
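For readers following along, the seed-and-patience loop described in the comments above can be sketched roughly as follows. This is only an illustration under stated assumptions: `train_one_seed`, `sweep_seeds`, and `make_toy_epoch` are hypothetical names, the toy loss function stands in for a real training epoch, and none of it is taken from the repository or from "experiment_run_n.py".

```python
import math


def train_one_seed(train_epoch, seed, patience=25, max_epochs=250):
    """Run one seed; stop on a non-finite loss or after `patience`
    consecutive epochs without improvement."""
    best, stale = math.inf, 0
    for epoch in range(max_epochs):
        loss = train_epoch(epoch)
        if not math.isfinite(loss):  # catches NaN and +/-inf
            return {"seed": seed, "best_loss": best,
                    "epochs": epoch + 1, "status": "non-finite"}
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return {"seed": seed, "best_loss": best,
                        "epochs": epoch + 1, "status": "out of patience"}
    return {"seed": seed, "best_loss": best,
            "epochs": max_epochs, "status": "finished"}


def sweep_seeds(make_train_epoch, seeds, target_loss, **kwargs):
    """Try seeds in order and stop at the first one that reaches `target_loss`."""
    results = []
    for seed in seeds:
        result = train_one_seed(make_train_epoch(seed), seed, **kwargs)
        results.append(result)
        if result["best_loss"] <= target_loss:
            break
    return results


def make_toy_epoch(seed):
    """Hypothetical stand-in for real training: only some seeds 'learn'."""
    def train_epoch(epoch):
        return 1.0 / (epoch + 1) if seed % 3 == 2 else 5.0
    return train_epoch


results = sweep_seeds(make_toy_epoch, seeds=range(10), target_loss=0.05, patience=25)
for r in results:
    print(r)
```

Checking for a non-finite loss separately from patience lets a run that hits "loss is NaN" fail fast, while a run stuck at a constant loss is abandoned only once the patience budget is used up.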
Add an example training script to reproduce the in-context learning experiment from the paper (see issue #4). An important detail is to set `--unembed_mask 0` (otherwise the model will be prevented from predicting the `unk` token, which is used for this task). You may need to run the script with multiple seeds (e.g. 10) to get an initialization that learns to solve the task.
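To see why an output mask on `unk` can make this task unlearnable, here is a minimal sketch. It assumes the mask works by replacing the masked token's logit with a very large negative number; that is an assumption made for illustration, and the repository's actual implementation may differ. The vocabulary, `MASK_VALUE`, and `apply_unembed_mask` are hypothetical.

```python
import math

VOCAB = ["pad", "unk", "a", "b"]  # hypothetical toy vocabulary
UNK = VOCAB.index("unk")
MASK_VALUE = -1e30  # assumed stand-in for "never predict this token"


def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def apply_unembed_mask(logits, masked_ids):
    """Overwrite the logits of masked token ids with MASK_VALUE."""
    return [MASK_VALUE if i in masked_ids else x for i, x in enumerate(logits)]


logits = [0.0, 2.0, 0.5, 0.5]  # the model prefers `unk`, the correct target here
loss_unmasked = cross_entropy(logits, UNK)
loss_masked = cross_entropy(apply_unembed_mask(logits, {UNK}), UNK)

print(f"unmasked loss: {loss_unmasked:.3f}")
print(f"masked loss:   {loss_masked:.3e}")
```

Under this assumption, whenever the target is the masked token the loss is dominated by the mask constant no matter what the model outputs, so no seed can reduce it. That is at least consistent with a huge, constant loss across seeds, though the thread does not confirm the exact mechanism.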
Hi all, sorry for the trouble, and thanks for the additional detail. I think I found the main problem: you need to set
@Wangcheng-Xu: The scripts directory contains configurations used for the other experiments in the paper. Please let me know if you have any more questions.
Thank you! I have tested the fixed configuration for the induction task, which works for me.
Thank you to everyone involved for identifying and resolving the issue.
Hello,
Thanks for your work. I attempted the in-context learning training command from the experiment details, but encountered a 'loss is NaN' error. Could you share the command you used? Appreciate it.