In this project, a sentence generator is developed using LSTM modules. Of the several aspects of the final model that can be optimized, four were studied: hidden dimension, number of layers, embedding dimension, and learning rate. In each case study, all other parameters of the model were held fixed to ensure a fair comparison. The configurations are listed in the table below.
| Model name        | H10  | H100 | S1   | S5   | E50  | E200 | LR0.1 | LR0.01 |
|-------------------|------|------|------|------|------|------|-------|--------|
| Embedding size    | 50   | 50   | 50   | 50   | 50   | 200  | 50    | 50     |
| Hidden layer size | 10   | 100  | 10   | 10   | 10   | 10   | 10    | 10     |
| Number of layers  | 2    | 2    | 1    | 5    | 2    | 2    | 2     | 2      |
| Batch size        | 256  | 256  | 256  | 256  | 256  | 256  | 256   | 256    |
| Epochs            | 50   | 50   | 50   | 50   | 50   | 50   | 50    | 50     |
| Learning rate     | 0.10 | 0.10 | 0.10 | 0.10 | 0.10 | 0.10 | 0.10  | 0.01   |
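The report does not show the model code, so the following is only a minimal sketch of how a word-level LSTM language model with these hyperparameters could be assembled. PyTorch and the class name `SentenceLSTM` are assumptions, not the project's actual implementation:

```python
import torch.nn as nn

class SentenceLSTM(nn.Module):
    """Illustrative word-level LSTM language model; names are hypothetical."""
    def __init__(self, vocab_size, embedding_size=50, hidden_size=10, num_layers=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_size)
        self.lstm = nn.LSTM(embedding_size, hidden_size,
                            num_layers=num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)  # logits over the vocabulary

    def forward(self, tokens, state=None):
        emb = self.embedding(tokens)        # (batch, seq) -> (batch, seq, emb)
        out, state = self.lstm(emb, state)  # (batch, seq, hidden)
        return self.fc(out), state          # per-step vocabulary logits

# e.g. the H100 configuration from the table:
# model = SentenceLSTM(vocab_size, embedding_size=50, hidden_size=100, num_layers=2)
```

Each column of the table corresponds to one such instantiation, differing from the baseline in a single argument.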
Hidden Dimension
- The rate of perplexity reduction is the same for both models (perplexity is computed as sketched after this list).
- H10 converges sooner than H100.
- H100 performs better on the test split than H10.
- H100 generates more accurate sentences than H10.
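The case studies compare models by perplexity. The evaluation code is not shown in the report; a minimal sketch, assuming the standard definition of perplexity as the exponential of the mean token-level cross-entropy, could look like this (the helper name `perplexity` is hypothetical):

```python
import math

import torch.nn.functional as F

def perplexity(logits, targets):
    """Perplexity = exp(mean token-level cross-entropy); illustrative helper."""
    # logits: (batch, seq, vocab) from the model; targets: (batch, seq) token ids
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           targets.reshape(-1))
    return math.exp(loss.item())
```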
Embedding Dimension
- The rate of perplexity reduction is NOT the same for the two models.
- E50 has better convergence properties than E200.
- E200 performs better on the test split than E50.
- E50 generates more accurate sentences than E200 (sentence generation is sketched after this list).
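Several findings concern the accuracy of generated sentences. The report does not include the generation procedure; a minimal sketch of token-by-token sampling from the `SentenceLSTM` sketched above, with all names and the temperature parameter being illustrative assumptions:

```python
import torch

@torch.no_grad()
def generate(model, start_token, end_token, max_len=30, temperature=1.0):
    """Sample one sentence token-by-token; illustrative, not the project's code."""
    model.eval()
    tokens, state = [start_token], None
    inp = torch.tensor([[start_token]])       # shape (batch=1, seq=1)
    for _ in range(max_len):
        logits, state = model(inp, state)     # carry the LSTM state between steps
        probs = torch.softmax(logits[0, -1] / temperature, dim=-1)
        next_token = torch.multinomial(probs, 1).item()
        if next_token == end_token:
            break
        tokens.append(next_token)
        inp = torch.tensor([[next_token]])
    return tokens
```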
Number of Layers
- The rate of perplexity reduction is the same for both models.
- Convergence speed is the same for both models.
- S1 performs better on the test split than S5.
- S1 generates more accurate sentences than S5.
Learning Rate
- Perplexity decreases faster for LR0.1 than for LR0.01 (the learning rate enters training as sketched after this list).
- LR0.1 has better convergence properties than LR0.01.
- LR0.1 performs better on the test split than LR0.01.
- LR0.1 generates more accurate sentences than LR0.01.
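The training procedure is not shown in the report. The sketch below assumes plain SGD (the actual optimizer is not stated) and reuses the hypothetical `SentenceLSTM` model; LR0.1 and LR0.01 would then differ only in a single argument:

```python
import math

import torch
import torch.nn.functional as F

def train(model, loader, learning_rate=0.1, epochs=50):
    """Illustrative training loop; the choice of SGD is an assumption."""
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
    model.train()
    for epoch in range(epochs):
        total_loss, batches = 0.0, 0
        for tokens, targets in loader:  # batches of token ids, e.g. batch size 256
            optimizer.zero_grad()
            logits, _ = model(tokens)
            loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                                   targets.reshape(-1))
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
            batches += 1
        print(f"epoch {epoch}: ppl = {math.exp(total_loss / batches):.2f}")

# The two learning-rate configurations differ only here:
# train(model, loader, learning_rate=0.1)   # LR0.1
# train(model, loader, learning_rate=0.01)  # LR0.01
```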