The strategy of searching hyperparameters for C&S #13

Open
skepsun opened this issue Feb 17, 2022 · 0 comments

Hi, thanks for your excellent work. I tried to tune the hyperparameters of MLP+C&S on arxiv. The base MLP model performs as follows:

Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012

With the default hyperparameters from your example script run_experiments.py, I can reproduce a similar result:

Valid acc: 0.7401±0.0016 | Test acc: 0.7310±0.0015

However, when I searched over alpha1, alpha2, adj1, and adj2 (using autoscale) and selected by validation accuracy, I found it easy to reach a higher validation accuracy but a lower test accuracy than with the defaults. For example, here is the tail of the log after 200 trials with Optuna (in each pair of result lines, the first is the base MLP and the second is MLP+C&S):

[I 2022-02-17 11:13:19,941] Trial 171 finished with value: 0.7397664351152723 and parameters: {'alpha1': 0.9998697337619668, 'adj1': 'AD', 'alpha2': 0.5793203196953342, 'adj2': 'DAD'}. Best is trial 106 with value: 0.741508104298802.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7414±0.0009 | Test acc: 0.7262±0.0014
[I 2022-02-17 11:13:23,649] Trial 172 finished with value: 0.74142085304876 and parameters: {'alpha1': 0.980621104544987, 'adj1': 'AD', 'alpha2': 0.6102579143062772, 'adj2': 'DAD'}. Best is trial 106 with value: 0.741508104298802.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7415±0.0009 | Test acc: 0.7260±0.0013
[I 2022-02-17 11:13:27,362] Trial 173 finished with value: 0.7414510554045438 and parameters: {'alpha1': 0.9845767019485405, 'adj1': 'AD', 'alpha2': 0.5881524805875032, 'adj2': 'DAD'}. Best is trial 106 with value: 0.741508104298802.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7406±0.0011 | Test acc: 0.7249±0.0014
[I 2022-02-17 11:13:31,084] Trial 174 finished with value: 0.7406355917983825 and parameters: {'alpha1': 0.9442333700288734, 'adj1': 'AD', 'alpha2': 0.563141874204418, 'adj2': 'DAD'}. Best is trial 106 with value: 0.741508104298802.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7416±0.0009 | Test acc: 0.7260±0.0013
[I 2022-02-17 11:13:34,819] Trial 175 finished with value: 0.7415550857411323 and parameters: {'alpha1': 0.9879853998605097, 'adj1': 'AD', 'alpha2': 0.5882664075898522, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7392±0.0010 | Test acc: 0.7250±0.0013
[I 2022-02-17 11:13:38,523] Trial 176 finished with value: 0.7392362159804021 and parameters: {'alpha1': 0.9995818269286275, 'adj1': 'AD', 'alpha2': 0.48803181308787474, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7397±0.0011 | Test acc: 0.7261±0.0013
[I 2022-02-17 11:13:42,242] Trial 177 finished with value: 0.7397362327594885 and parameters: {'alpha1': 0.9999159900091027, 'adj1': 'AD', 'alpha2': 0.5940887446932098, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7408±0.0009 | Test acc: 0.7250±0.0014
[I 2022-02-17 11:13:45,950] Trial 178 finished with value: 0.7407798919426826 and parameters: {'alpha1': 0.9543255302847481, 'adj1': 'AD', 'alpha2': 0.5497164972534224, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7414±0.0008 | Test acc: 0.7259±0.0013
[I 2022-02-17 11:13:49,662] Trial 179 finished with value: 0.7413805832410484 and parameters: {'alpha1': 0.983210322945432, 'adj1': 'AD', 'alpha2': 0.5881338038548104, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7405±0.0010 | Test acc: 0.7251±0.0013
[I 2022-02-17 11:13:53,379] Trial 180 finished with value: 0.7404980032887009 and parameters: {'alpha1': 0.9275816290037779, 'adj1': 'AD', 'alpha2': 0.6223246918695156, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7413±0.0009 | Test acc: 0.7258±0.0013
[I 2022-02-17 11:13:57,086] Trial 181 finished with value: 0.7413369576160274 and parameters: {'alpha1': 0.9833652973767343, 'adj1': 'AD', 'alpha2': 0.5736522033930277, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7412±0.0011 | Test acc: 0.7256±0.0014
[I 2022-02-17 11:14:00,802] Trial 182 finished with value: 0.741222859827511 and parameters: {'alpha1': 0.9610791967043985, 'adj1': 'AD', 'alpha2': 0.6077306751373425, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7414±0.0009 | Test acc: 0.7264±0.0013
[I 2022-02-17 11:14:04,526] Trial 183 finished with value: 0.7414040739622135 and parameters: {'alpha1': 0.9801855084637857, 'adj1': 'AD', 'alpha2': 0.6318577630605813, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7409±0.0010 | Test acc: 0.7256±0.0015
[I 2022-02-17 11:14:08,244] Trial 184 finished with value: 0.7408704990100339 and parameters: {'alpha1': 0.9430149061785248, 'adj1': 'AD', 'alpha2': 0.6348513831949518, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7413±0.0008 | Test acc: 0.7263±0.0014
[I 2022-02-17 11:14:11,950] Trial 185 finished with value: 0.7413268901640995 and parameters: {'alpha1': 0.9955916811039495, 'adj1': 'AD', 'alpha2': 0.5967686115392363, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7409±0.0009 | Test acc: 0.7264±0.0014
[I 2022-02-17 11:14:15,659] Trial 186 finished with value: 0.7409040571831269 and parameters: {'alpha1': 0.9986618985811345, 'adj1': 'AD', 'alpha2': 0.6233606296775285, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7413±0.0010 | Test acc: 0.7261±0.0013
[I 2022-02-17 11:14:19,369] Trial 187 finished with value: 0.7412899761736971 and parameters: {'alpha1': 0.9635071926679348, 'adj1': 'AD', 'alpha2': 0.6482700143571917, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7413±0.0009 | Test acc: 0.7258±0.0013
[I 2022-02-17 11:14:23,079] Trial 188 finished with value: 0.7412698412698413 and parameters: {'alpha1': 0.9807221398927547, 'adj1': 'AD', 'alpha2': 0.5715132614425644, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7405±0.0010 | Test acc: 0.7263±0.0014
[I 2022-02-17 11:14:26,796] Trial 189 finished with value: 0.7405382730964126 and parameters: {'alpha1': 0.9992832309685767, 'adj1': 'AD', 'alpha2': 0.6098670231892414, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7361±0.0009 | Test acc: 0.7218±0.0011
[I 2022-02-17 11:14:30,469] Trial 190 finished with value: 0.736108594248129 and parameters: {'alpha1': 0.948309908753022, 'adj1': 'AD', 'alpha2': 0.5250033002666431, 'adj2': 'DA'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7413±0.0010 | Test acc: 0.7267±0.0012
[I 2022-02-17 11:14:34,183] Trial 191 finished with value: 0.7412597738179134 and parameters: {'alpha1': 0.973154541171484, 'adj1': 'AD', 'alpha2': 0.6893967352205037, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7414±0.0009 | Test acc: 0.7265±0.0013
[I 2022-02-17 11:14:37,895] Trial 192 finished with value: 0.7413738716064298 and parameters: {'alpha1': 0.9807824762755216, 'adj1': 'AD', 'alpha2': 0.6396159416388099, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7413±0.0011 | Test acc: 0.7264±0.0012
[I 2022-02-17 11:14:41,608] Trial 193 finished with value: 0.7412597738179134 and parameters: {'alpha1': 0.9612390597009817, 'adj1': 'AD', 'alpha2': 0.6839165981115675, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7408±0.0010 | Test acc: 0.7264±0.0014
[I 2022-02-17 11:14:45,332] Trial 194 finished with value: 0.7407798919426827 and parameters: {'alpha1': 0.999002486052839, 'adj1': 'AD', 'alpha2': 0.61920737978392, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7406±0.0009 | Test acc: 0.7249±0.0013
[I 2022-02-17 11:14:49,049] Trial 195 finished with value: 0.7406020336252894 and parameters: {'alpha1': 0.9337309403205698, 'adj1': 'AD', 'alpha2': 0.5892864918403496, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7413±0.0011 | Test acc: 0.7262±0.0013
[I 2022-02-17 11:14:52,765] Trial 196 finished with value: 0.7413168227121715 and parameters: {'alpha1': 0.9668120293198653, 'adj1': 'AD', 'alpha2': 0.6508764478982865, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7401±0.0010 | Test acc: 0.7257±0.0014
[I 2022-02-17 11:14:56,481] Trial 197 finished with value: 0.7400953052115843 and parameters: {'alpha1': 0.9991146402719119, 'adj1': 'AD', 'alpha2': 0.5498901233953032, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7412±0.0009 | Test acc: 0.7264±0.0013
[I 2022-02-17 11:15:00,205] Trial 198 finished with value: 0.7412295714621296 and parameters: {'alpha1': 0.9766324963371931, 'adj1': 'AD', 'alpha2': 0.6295796864898812, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
Valid acc: 0.7289±0.0008 | Test acc: 0.7150±0.0012
Valid acc: 0.7390±0.0017 | Test acc: 0.7293±0.0015
[I 2022-02-17 11:15:03,918] Trial 199 finished with value: 0.7389811738648947 and parameters: {'alpha1': 0.9529192480940544, 'adj1': 'DA', 'alpha2': 0.6504111365505655, 'adj2': 'DAD'}. Best is trial 175 with value: 0.7415550857411323.
FrozenTrial(number=175, values=[0.7415550857411323], datetime_start=datetime.datetime(2022, 2, 17, 11, 13, 31, 101794), datetime_complete=datetime.datetime(2022, 2, 17, 11, 13, 34, 819114), params={'alpha1': 0.9879853998605097, 'adj1': 'AD', 'alpha2': 0.5882664075898522, 'adj2': 'DAD'}, distributions={'alpha1': UniformDistribution(high=1, low=0), 'adj1': CategoricalDistribution(choices=('DA', 'AD', 'DAD')), 'alpha2': UniformDistribution(high=1, low=0), 'adj2': CategoricalDistribution(choices=('DA', 'AD', 'DAD'))}, user_attrs={}, system_attrs={}, intermediate_values={}, trial_id=175, state=TrialState.COMPLETE, value=None)
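
To make the pattern in the log easier to see, here is a minimal sketch that parses a few of the C&S result lines above and checks which trial wins on validation and which on test. The `parse_line` helper and the hand-copied `LOG` dictionary are only my illustration and are not part of the repository; each result line is paired with the Optuna line printed right after it.

```python
import re

# C&S result lines copied from the log above, keyed by trial number.
LOG = {
    175: "Valid acc: 0.7416±0.0009 | Test acc: 0.7260±0.0013",
    190: "Valid acc: 0.7361±0.0009 | Test acc: 0.7218±0.0011",
    191: "Valid acc: 0.7413±0.0010 | Test acc: 0.7267±0.0012",
    199: "Valid acc: 0.7390±0.0017 | Test acc: 0.7293±0.0015",
}

PATTERN = re.compile(
    r"Valid acc: ([\d.]+)±([\d.]+) \| Test acc: ([\d.]+)±([\d.]+)"
)


def parse_line(line):
    """Return (valid_mean, valid_std, test_mean, test_std) as floats."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognized result line: {line!r}")
    return tuple(float(group) for group in match.groups())


results = {trial: parse_line(line) for trial, line in LOG.items()}

# Index 0 is the validation mean, index 2 is the test mean.
best_by_valid = max(results, key=lambda trial: results[trial][0])
best_by_test = max(results, key=lambda trial: results[trial][2])

print("best by validation:", best_by_valid, results[best_by_valid])
print("best by test:      ", best_by_test, results[best_by_test])
```

Among these four trials, the one selected by validation accuracy is not the one with the best test accuracy, which is the mismatch I am asking about.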

The hyperparameter setting with the highest validation accuracy (trial 175) gives:

Valid acc: 0.7416±0.0009 | Test acc: 0.7260±0.0013
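
To put the tuned result next to the default one, here is a rough sketch that computes the change in mean accuracy and compares it with the larger of the two reported standard deviations. This is only a back-of-the-envelope check on the numbers quoted in this issue, not a significance test, and the `delta` helper is a name I made up for illustration.

```python
# (mean, std) as reported in this issue for MLP+C&S.
default = {"valid": (0.7401, 0.0016), "test": (0.7310, 0.0015)}
tuned = {"valid": (0.7416, 0.0009), "test": (0.7260, 0.0013)}


def delta(split):
    """Return (tuned mean - default mean, larger of the two stds) for a split."""
    (mean_default, std_default) = default[split]
    (mean_tuned, std_tuned) = tuned[split]
    return round(mean_tuned - mean_default, 4), max(std_default, std_tuned)


for split in ("valid", "test"):
    change, std = delta(split)
    print(f"{split}: change {change:+.4f}, largest std {std:.4f}, "
          f"|change|/std {abs(change) / std:.1f}")
```

With these numbers the validation gain is about 0.0015, which is within one reported standard deviation, while the test accuracy drops by about 0.0050, which is several standard deviations.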

Would you mind sharing your strategy for searching hyperparameters?
Thanks again!
