This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Training: Step 2 #16

Open
affromero opened this issue Aug 15, 2017 · 9 comments

Comments

@affromero

Hello,

Regarding the training procedure on step 2:
python scripts/train_model.py --model_type EE --program_generator_start_from data/program_generator.py --num_iterations 100000 --checkpoint_path data/execution_engine.pt

I do not know if I have missed something, but program_generator_start_from is only used inside get_program_generator, and only for the 'PG+EE' and 'PG' model types, so it appears to have no effect with --model_type EE.
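To make the point concrete, here is a minimal sketch of the behaviour described above. It is not the code from train_model.py: build_parser and checkpoints_consumed are hypothetical stand-ins that only assume the flag is read for the 'PG' and 'PG+EE' model types.

```python
import argparse


def build_parser():
    # Hypothetical, cut-down parser: only the two flags discussed here.
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_type", default="PG",
                        choices=["PG", "EE", "PG+EE"])
    parser.add_argument("--program_generator_start_from", default=None)
    return parser


def checkpoints_consumed(args):
    # Assumption taken from the description above: the program generator
    # checkpoint is only read when the model type includes the generator.
    consumed = []
    if args.model_type in ("PG", "PG+EE") and args.program_generator_start_from:
        consumed.append(args.program_generator_start_from)
    return consumed


# The Step 2 command passes the flag together with --model_type EE.
step2_args = build_parser().parse_args([
    "--model_type", "EE",
    "--program_generator_start_from", "data/program_generator.py",
])
print(checkpoints_consumed(step2_args))  # prints []
```

Under that assumption, the flag is accepted by the parser but nothing is loaded from it in an EE-only run.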

Thank you.

@rizar

rizar commented Mar 9, 2018

Same question here. According to TRAINING.md, in Step 2 "we train the execution engine, using programs predicted from the program generator in the previous step". However, the comment at line 238 of train_model.py says "train execution engine with ground-truth programs". Can you please explain this discrepancy?

In any case, when I train with --model_type=EE without the Step 1 pretraining, learning doesn't really progress (still at ~50% accuracy after 100000 iterations).

@liuweide01

Have you solved the problem?

@rizar

rizar commented Apr 10, 2018

Yes, it was the learning rate. It should be decreased to 1e-5, and then Step 2 works. Note that Step 2 does indeed use the ground-truth programs.
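As a rough sketch of what the adjusted Step 2 invocation could look like, the snippet below appends a learning-rate option to the command quoted at the top of this issue. The flag name --learning_rate is an assumption here and should be checked against the argument parser in train_model.py. The helper with_learning_rate is hypothetical.

```python
import shlex

# Step 2 command exactly as quoted in the original question.
STEP2_COMMAND = (
    "python scripts/train_model.py --model_type EE "
    "--program_generator_start_from data/program_generator.py "
    "--num_iterations 100000 --checkpoint_path data/execution_engine.pt"
)


def with_learning_rate(command, lr="1e-5", flag="--learning_rate"):
    # `flag` is an assumed option name, not confirmed by this thread.
    argv = shlex.split(command)
    if flag in argv:
        raise ValueError(f"{flag} is already set in the command")
    return argv + [flag, lr]


argv = with_learning_rate(STEP2_COMMAND)
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run(argv) from the repository root, or the printed line can be pasted into a shell.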

@liuweide01

But what about the validation accuracy?

@rizar

rizar commented Apr 14, 2018

I get something like 95-96%, which is what is reported in the paper.

@ankursikarwar

@rizar, can you please tell me how you got 95-96% accuracy by directly training the execution engine on the ground-truth programs (as in Step 2)? My accuracy is oscillating around 0.47 even after 5000 iterations with lr = 1e-5.

@rizar

rizar commented Aug 18, 2020

Can you try training longer?

@ankursikarwar

> Can you try training longer?

Thanks. I trained the execution engine for 100000 iterations with lr=1e-5 and got around 89% accuracy. Accuracy increased quite slowly at first, and then rose steeply between 40k and 60k iterations.
[Image: accuracy graph]

@smdp2000

What are the minimum PC specs you are all using to train this model? Can anyone suggest a configuration?
