forked from lessw2020/transformer_framework
food101.txt
******* loading model args.model='vitsmart'
******* loading model args.model='vitsmart'
******* loading model args.model='vitsmart'
******* loading model args.model='vitsmart'
--> World Size = 4
--> Device_count = 4
--> running with these defaults train_config(seed=2022, verbose=True, total_steps_to_run=None, print_memory_summary=False, num_epochs=2, model_weights_bf16=False, use_mixed_precision=False, use_low_precision_gradient_policy=False, use_tf32=False, optimizer='AnyPrecision', ap_use_kahan_summation=False, sharding_strategy=<ShardingStrategy.FULL_SHARD: 1>, print_sharding_plan=False, run_profiler=False, profile_folder='fsdp/profile_tracing', log_every=1, num_workers_dataloader=2, batch_size_training=68, fsdp_activation_checkpointing=False, run_validation=True, memory_report=True, nccl_debug_handler=True, distributed_debug=True, use_non_recursive_wrapping=False, use_tp=False, image_size=224, use_synthetic_data=False, use_pokemon_dataset=False, use_beans_dataset=False, save_model_checkpoint=False, load_model_checkpoint=False, checkpoint_max_save_count=2, save_optimizer=False, load_optimizer=False, optimizer_checkpoint_file='Adam-vit--1.pt', checkpoint_model_filename='vit--1.pt')
clearing gpu cache for all ranks
--> running with torch dist debug set to detail
--> total memory per gpu (GB) = 22.0626
policy is None
--> Prepping vit_relpos_base_patch16_rpn_224 model ...
stats is ready....? _stats=defaultdict(<class 'list'>, {}), local_rank=0, rank=0
--> vit_relpos_base_patch16_rpn_224 built.
built model with 85.722869M params
--> Warning - bf16 support not available. Using fp32
backward prefetch set to None
sharding set to ShardingStrategy.FULL_SHARD
--> Batch Size = 68
local rank 0 init time = 1.5327604980000018
memory stats reset, ready to track
Running with AnyPrecision Optimizer, momo=torch.float32, var = torch.float32, kahan summation = False
Epoch: 1 starting...
step: 1: time taken for the last 1 steps is 6.224457420999997, loss is 4.556710243225098
step: 2: time taken for the last 1 steps is 0.25443180500001006, loss is 4.726100444793701
step: 3: time taken for the last 1 steps is 0.26126896999998905, loss is 4.740593910217285
step: 4: time taken for the last 1 steps is 0.25712853800000346, loss is 4.724956512451172
step: 5: time taken for the last 1 steps is 0.26038229899999976, loss is 4.608603477478027
step: 6: time taken for the last 1 steps is 0.2569102459999897, loss is 4.639562606811523
step: 7: time taken for the last 1 steps is 0.2609058249999947, loss is 4.620149612426758
step: 8: time taken for the last 1 steps is 0.2560444850000039, loss is 4.645939826965332
step: 9: time taken for the last 1 steps is 0.2588406399999883, loss is 4.646860599517822
step: 10: time taken for the last 1 steps is 0.2588266599999969, loss is 4.529321193695068
step: 11: time taken for the last 1 steps is 0.2554669780000012, loss is 4.625144958496094
step: 12: time taken for the last 1 steps is 0.25876852000000383, loss is 4.520711898803711
step: 13: time taken for the last 1 steps is 0.25901369199999635, loss is 4.676128387451172
step: 14: time taken for the last 1 steps is 0.255653690999992, loss is 4.4814324378967285
step: 15: time taken for the last 1 steps is 0.2611768480000052, loss is 4.547496795654297
step: 16: time taken for the last 1 steps is 0.2544821259999992, loss is 4.657937526702881
step: 17: time taken for the last 1 steps is 0.2574680120000039, loss is 4.442592620849609
step: 18: time taken for the last 1 steps is 0.26373780000000124, loss is 4.521785736083984
step: 19: time taken for the last 1 steps is 0.2592381829999937, loss is 4.626273155212402
step: 20: time taken for the last 1 steps is 0.2570713380000029, loss is 4.532219409942627
step: 21: time taken for the last 1 steps is 0.25521710499999983, loss is 4.586917877197266
step: 22: time taken for the last 1 steps is 0.2537483569999921, loss is 4.631921291351318
step: 23: time taken for the last 1 steps is 0.25214174799999967, loss is 4.854366302490234
step: 24: time taken for the last 1 steps is 0.24851453299999093, loss is 4.5460991859436035
step: 25: time taken for the last 1 steps is 0.2523288790000038, loss is 4.5151166915893555
step: 26: time taken for the last 1 steps is 0.259327356, loss is 4.580170631408691
step: 27: time taken for the last 1 steps is 0.25792064799999537, loss is 4.560575485229492
step: 28: time taken for the last 1 steps is 0.25086665200001335, loss is 4.456768035888672
step: 29: time taken for the last 1 steps is 0.26047701900000675, loss is 4.499762535095215
step: 30: time taken for the last 1 steps is 0.25528731700001117, loss is 4.523378849029541
step: 31: time taken for the last 1 steps is 0.26271464800001354, loss is 4.533198833465576
step: 32: time taken for the last 1 steps is 0.25917369399999757, loss is 4.610552787780762
step: 33: time taken for the last 1 steps is 0.25706605700000296, loss is 4.555049896240234
step: 34: time taken for the last 1 steps is 0.26019912700002124, loss is 4.644559383392334
step: 35: time taken for the last 1 steps is 0.2515043689999743, loss is 4.4100236892700195
step: 36: time taken for the last 1 steps is 0.25164398100000085, loss is 4.489560604095459
step: 37: time taken for the last 1 steps is 0.2521807489999901, loss is 4.469703674316406
step: 38: time taken for the last 1 steps is 0.2548025700000096, loss is 4.491578102111816
step: 39: time taken for the last 1 steps is 0.26133294999999634, loss is 4.50607967376709
step: 40: time taken for the last 1 steps is 0.2578487179999911, loss is 4.486103057861328
step: 41: time taken for the last 1 steps is 0.26081843400001503, loss is 4.702195644378662
step: 42: time taken for the last 1 steps is 0.2574591220000002, loss is 4.504721641540527
step: 43: time taken for the last 1 steps is 0.2620693590000087, loss is 4.42141056060791
step: 44: time taken for the last 1 steps is 0.2550191830000017, loss is 4.46730899810791
step: 45: time taken for the last 1 steps is 0.25533965600001807, loss is 4.464085102081299
step: 46: time taken for the last 1 steps is 0.2582529330000227, loss is 4.499287128448486
step: 47: time taken for the last 1 steps is 0.26063145099999474, loss is 4.5415873527526855
step: 48: time taken for the last 1 steps is 0.25658790299999623, loss is 4.539662837982178
step: 49: time taken for the last 1 steps is 0.2536832860000118, loss is 4.5748209953308105
step: 50: time taken for the last 1 steps is 0.260121995999981, loss is 4.369707107543945
step: 51: time taken for the last 1 steps is 0.26197083800002474, loss is 4.47456693649292
step: 52: time taken for the last 1 steps is 0.2603090570000006, loss is 4.582790374755859
step: 53: time taken for the last 1 steps is 0.2510197630000164, loss is 4.347904682159424
step: 54: time taken for the last 1 steps is 0.2516130710000084, loss is 4.510412216186523
step: 55: time taken for the last 1 steps is 0.2569418369999994, loss is 4.485291004180908
step: 56: time taken for the last 1 steps is 0.26207493000001136, loss is 4.45244026184082
step: 57: time taken for the last 1 steps is 0.25972696099998416, loss is 4.43674373626709
step: 58: time taken for the last 1 steps is 0.2622236219999934, loss is 4.463484764099121
step: 59: time taken for the last 1 steps is 0.2598912809999945, loss is 4.4032464027404785
step: 60: time taken for the last 1 steps is 0.25398738499998785, loss is 4.397443771362305
step: 61: time taken for the last 1 steps is 0.2520642179999868, loss is 4.250542640686035
step: 62: time taken for the last 1 steps is 0.259571655000002, loss is 4.447912216186523
step: 63: time taken for the last 1 steps is 0.2555896060000009, loss is 4.437596797943115
step: 64: time taken for the last 1 steps is 0.26227825800000915, loss is 4.372616767883301
step: 65: time taken for the last 1 steps is 0.25532826200000613, loss is 4.282342910766602
step: 66: time taken for the last 1 steps is 0.2614504619999991, loss is 4.4875030517578125
step: 67: time taken for the last 1 steps is 0.26115508600000226, loss is 4.505937099456787
step: 68: time taken for the last 1 steps is 0.25712286699999254, loss is 4.456249237060547
step: 69: time taken for the last 1 steps is 0.2519989760000101, loss is 4.468084335327148
step: 70: time taken for the last 1 steps is 0.2551300880000156, loss is 4.32630729675293
step: 71: time taken for the last 1 steps is 0.253717139999992, loss is 4.329172611236572
step: 72: time taken for the last 1 steps is 0.2555635260000031, loss is 4.283868312835693
step: 73: time taken for the last 1 steps is 0.26223194700000363, loss is 4.537152290344238
step: 74: time taken for the last 1 steps is 0.26280980699999645, loss is 4.518974781036377
step: 75: time taken for the last 1 steps is 0.25795741299998554, loss is 4.479353427886963
step: 76: time taken for the last 1 steps is 0.2609946819999891, loss is 4.544875144958496
step: 77: time taken for the last 1 steps is 0.2483915960000047, loss is 4.347236633300781
step: 78: time taken for the last 1 steps is 0.2609416909999993, loss is 4.527989864349365
step: 79: time taken for the last 1 steps is 0.25855089499998485, loss is 4.4142255783081055
step: 80: time taken for the last 1 steps is 0.2526250989999994, loss is 4.46660852432251
step: 81: time taken for the last 1 steps is 0.25598162399998614, loss is 4.463648319244385
step: 82: time taken for the last 1 steps is 0.25559938700001794, loss is 4.502534866333008
step: 83: time taken for the last 1 steps is 0.25871039800000517, loss is 4.649286270141602
step: 84: time taken for the last 1 steps is 0.25609726700000124, loss is 4.508419036865234
step: 85: time taken for the last 1 steps is 0.25164947999999754, loss is 4.464930534362793
step: 86: time taken for the last 1 steps is 0.25470685999999887, loss is 4.450496196746826
step: 87: time taken for the last 1 steps is 0.2588861419999944, loss is 4.545038223266602
step: 88: time taken for the last 1 steps is 0.25199988700001086, loss is 4.434495449066162
step: 89: time taken for the last 1 steps is 0.26075993800000674, loss is 4.518768787384033
step: 90: time taken for the last 1 steps is 0.2590612739999756, loss is 4.492814540863037
step: 91: time taken for the last 1 steps is 0.2586902680000094, loss is 4.455684661865234
step: 92: time taken for the last 1 steps is 0.2565992460000075, loss is 4.374605178833008
step: 93: time taken for the last 1 steps is 0.2528125130000092, loss is 4.327826023101807
step: 94: time taken for the last 1 steps is 0.2583574709999823, loss is 4.3085408210754395
step: 95: time taken for the last 1 steps is 0.2593954710000048, loss is 4.34464168548584
step: 96: time taken for the last 1 steps is 0.252541817000008, loss is 4.4452714920043945
step: 97: time taken for the last 1 steps is 0.25419546899999546, loss is 4.565735340118408
step: 98: time taken for the last 1 steps is 0.2565571060000025, loss is 4.539857864379883
step: 99: time taken for the last 1 steps is 0.2563246910000032, loss is 4.644400119781494
step: 100: time taken for the last 1 steps is 0.2632913180000003, loss is 4.381629467010498
step: 101: time taken for the last 1 steps is 0.2628080879999857, loss is 4.4490156173706055
step: 102: time taken for the last 1 steps is 0.2571874879999996, loss is 4.361649036407471
step: 103: time taken for the last 1 steps is 0.2600162940000246, loss is 4.6384782791137695
step: 104: time taken for the last 1 steps is 0.2600821550000205, loss is 4.630778789520264
step: 105: time taken for the last 1 steps is 0.2540322469999978, loss is 4.514150619506836
step: 106: time taken for the last 1 steps is 0.25883444999999483, loss is 4.39536714553833
step: 107: time taken for the last 1 steps is 0.2584115820000079, loss is 4.485714435577393
step: 108: time taken for the last 1 steps is 0.26006182400001876, loss is 4.468821048736572
step: 109: time taken for the last 1 steps is 0.25974721899999054, loss is 4.567099094390869
step: 110: time taken for the last 1 steps is 0.25267529999999283, loss is 4.386360168457031
step: 111: time taken for the last 1 steps is 0.26327380699999026, loss is 4.325007915496826
step: 112: time taken for the last 1 steps is 0.26067661700000144, loss is 4.396040439605713
step: 113: time taken for the last 1 steps is 0.25966461699999854, loss is 4.383045196533203
step: 114: time taken for the last 1 steps is 0.2625774740000111, loss is 4.5528082847595215
step: 115: time taken for the last 1 steps is 0.2596532260000117, loss is 4.544152736663818
step: 116: time taken for the last 1 steps is 0.2446388630000058, loss is 4.470468997955322
step: 117: time taken for the last 1 steps is 0.25887863099998754, loss is 4.473755836486816
step: 118: time taken for the last 1 steps is 0.2514864469999907, loss is 4.466084003448486
step: 119: time taken for the last 1 steps is 0.2595187439999904, loss is 4.458154678344727
step: 120: time taken for the last 1 steps is 0.25970338700000184, loss is 4.572976589202881
step: 121: time taken for the last 1 steps is 0.2571176170000058, loss is 4.582334518432617
step: 122: time taken for the last 1 steps is 0.25083667399999854, loss is 4.496822357177734
step: 123: time taken for the last 1 steps is 0.24888235599999575, loss is 4.464864253997803
step: 124: time taken for the last 1 steps is 0.2501425800000163, loss is 4.56265926361084
step: 125: time taken for the last 1 steps is 0.2562360389999867, loss is 4.4036335945129395
step: 126: time taken for the last 1 steps is 0.25835999099999185, loss is 4.376188278198242
step: 127: time taken for the last 1 steps is 0.25613184799999544, loss is 4.47650671005249
step: 128: time taken for the last 1 steps is 0.25296842599999536, loss is 4.513776779174805
step: 129: time taken for the last 1 steps is 0.259235958000005, loss is 4.417741775512695
step: 130: time taken for the last 1 steps is 0.25202945700002033, loss is 4.447234630584717
step: 131: time taken for the last 1 steps is 0.2531651790000069, loss is 4.208217620849609
step: 132: time taken for the last 1 steps is 0.25711992600000144, loss is 4.3844733238220215
step: 133: time taken for the last 1 steps is 0.2544991860000039, loss is 4.48995304107666
step: 134: time taken for the last 1 steps is 0.2609652120000021, loss is 4.435860633850098
step: 135: time taken for the last 1 steps is 0.252543916999997, loss is 4.402706623077393
step: 136: time taken for the last 1 steps is 0.258415862000021, loss is 4.3937554359436035
step: 137: time taken for the last 1 steps is 0.2603776600000174, loss is 4.373472213745117
step: 138: time taken for the last 1 steps is 0.2575525449999816, loss is 4.507457733154297
step: 139: time taken for the last 1 steps is 0.2555811770000105, loss is 4.530117511749268
step: 140: time taken for the last 1 steps is 0.2552334000000087, loss is 4.448877811431885
step: 141: time taken for the last 1 steps is 0.2534439799999859, loss is 4.451408386230469
step: 142: time taken for the last 1 steps is 0.2559829719999982, loss is 4.331961631774902
step: 143: time taken for the last 1 steps is 0.2625455049999914, loss is 4.558147430419922
step: 144: time taken for the last 1 steps is 0.25998349399998233, loss is 4.4368486404418945
step: 145: time taken for the last 1 steps is 0.25940637200000083, loss is 4.276172161102295
step: 146: time taken for the last 1 steps is 0.25792453099998625, loss is 4.553081512451172
step: 147: time taken for the last 1 steps is 0.2584908629999916, loss is 4.38977575302124
step: 148: time taken for the last 1 steps is 0.2631084880000003, loss is 4.492251396179199
step: 149: time taken for the last 1 steps is 0.2568482789999962, loss is 4.420027732849121
step: 150: time taken for the last 1 steps is 0.2574529120000193, loss is 4.313426971435547
step: 151: time taken for the last 1 steps is 0.256303067999994, loss is 4.543499946594238
step: 152: time taken for the last 1 steps is 0.2547805269999799, loss is 4.285205841064453
step: 153: time taken for the last 1 steps is 0.2508665259999816, loss is 4.411813735961914
step: 154: time taken for the last 1 steps is 0.25922828800000275, loss is 4.2857441902160645
step: 155: time taken for the last 1 steps is 0.2597400989999983, loss is 4.444000244140625
step: 156: time taken for the last 1 steps is 0.2532377250000195, loss is 4.442224979400635
step: 157: time taken for the last 1 steps is 0.257684936000004, loss is 4.592156887054443
step: 158: time taken for the last 1 steps is 0.25736490100001674, loss is 4.472250938415527
step: 159: time taken for the last 1 steps is 0.2531340939999893, loss is 4.346929550170898
step: 160: time taken for the last 1 steps is 0.2568183190000184, loss is 4.5280914306640625
step: 161: time taken for the last 1 steps is 0.2600725150000187, loss is 4.414948463439941
step: 162: time taken for the last 1 steps is 0.26214661900002056, loss is 4.439939498901367
step: 163: time taken for the last 1 steps is 0.26022669799999676, loss is 4.286535263061523
step: 164: time taken for the last 1 steps is 0.2507720549999988, loss is 4.532018661499023
step: 165: time taken for the last 1 steps is 0.25475988700000585, loss is 4.342003345489502
step: 166: time taken for the last 1 steps is 0.2607287280000037, loss is 4.426819324493408
step: 167: time taken for the last 1 steps is 0.2584712130000071, loss is 4.397274017333984
step: 168: time taken for the last 1 steps is 0.2588808910000182, loss is 4.305877208709717
step: 169: time taken for the last 1 steps is 0.25874874800001635, loss is 4.4019880294799805
step: 170: time taken for the last 1 steps is 0.25662401400001045, loss is 4.406705856323242
step: 171: time taken for the last 1 steps is 0.262720929000011, loss is 4.458943843841553
step: 172: time taken for the last 1 steps is 0.2637851009999963, loss is 4.4185333251953125
step: 173: time taken for the last 1 steps is 0.2594104310000205, loss is 4.379662990570068
step: 174: time taken for the last 1 steps is 0.258292689000001, loss is 4.401475429534912
step: 175: time taken for the last 1 steps is 0.252365387999987, loss is 4.25689697265625
step: 176: time taken for the last 1 steps is 0.2542289060000087, loss is 4.24041223526001
step: 177: time taken for the last 1 steps is 0.25809521500002575, loss is 4.1480817794799805
step: 178: time taken for the last 1 steps is 0.26435420300001056, loss is 4.138119697570801
step: 179: time taken for the last 1 steps is 0.25605084299999703, loss is 4.343449592590332
step: 180: time taken for the last 1 steps is 0.2604353130000163, loss is 4.479880332946777
step: 181: time taken for the last 1 steps is 0.2586327859999926, loss is 4.525196075439453
step: 182: time taken for the last 1 steps is 0.2607958609999912, loss is 4.313002586364746
step: 183: time taken for the last 1 steps is 0.2533237970000073, loss is 4.448480129241943
step: 184: time taken for the last 1 steps is 0.25938049199999114, loss is 4.382685661315918
step: 185: time taken for the last 1 steps is 0.25864964600000917, loss is 4.193593502044678
step: 186: time taken for the last 1 steps is 0.25800690300002316, loss is 4.353750705718994
step: 187: time taken for the last 1 steps is 0.2638172919999988, loss is 4.274607181549072
step: 188: time taken for the last 1 steps is 0.25893050199999834, loss is 4.38017463684082
step: 189: time taken for the last 1 steps is 0.26112918699999454, loss is 4.4130964279174805
step: 190: time taken for the last 1 steps is 0.2585761850000381, loss is 4.518107891082764
step: 191: time taken for the last 1 steps is 0.257553683000026, loss is 4.45909309387207
step: 192: time taken for the last 1 steps is 0.2611592580000206, loss is 4.433404445648193
step: 193: time taken for the last 1 steps is 0.24741916699997546, loss is 4.434021472930908
step: 194: time taken for the last 1 steps is 0.26111826700002894, loss is 4.334896087646484
step: 195: time taken for the last 1 steps is 0.25549793199996884, loss is 4.103496551513672
step: 196: time taken for the last 1 steps is 0.2583862700000168, loss is 4.123104095458984
step: 197: time taken for the last 1 steps is 0.25070317300003353, loss is 4.374506950378418
step: 198: time taken for the last 1 steps is 0.26513947900002677, loss is 4.312832355499268
step: 199: time taken for the last 1 steps is 0.257064583999977, loss is 4.364517688751221
step: 200: time taken for the last 1 steps is 0.25475813700001027, loss is 4.276547908782959
step: 201: time taken for the last 1 steps is 0.2602753290000237, loss is 4.392114162445068
step: 202: time taken for the last 1 steps is 0.2628212509999912, loss is 4.169902801513672
step: 203: time taken for the last 1 steps is 0.26160398700000087, loss is 4.260707855224609
step: 204: time taken for the last 1 steps is 0.2558927890000291, loss is 4.295074939727783
step: 205: time taken for the last 1 steps is 0.25681487800000014, loss is 4.263187885284424
step: 206: time taken for the last 1 steps is 0.260273159999997, loss is 4.278339385986328
step: 207: time taken for the last 1 steps is 0.26173398899999256, loss is 4.581784248352051
step: 208: time taken for the last 1 steps is 0.2598755809999602, loss is 4.12169075012207
step: 209: time taken for the last 1 steps is 0.25582867899998973, loss is 4.376415252685547
step: 210: time taken for the last 1 steps is 0.2619955050000158, loss is 4.300075531005859
step: 211: time taken for the last 1 steps is 0.258408390999989, loss is 4.371187686920166
step: 212: time taken for the last 1 steps is 0.24944178800001282, loss is 4.292278289794922
step: 213: time taken for the last 1 steps is 0.25588329900000417, loss is 4.3001275062561035
step: 214: time taken for the last 1 steps is 0.2590421250000077, loss is 4.370543956756592
step: 215: time taken for the last 1 steps is 0.25961762600002203, loss is 4.191922664642334
step: 216: time taken for the last 1 steps is 0.2650899879999997, loss is 4.38754415512085
step: 217: time taken for the last 1 steps is 0.262825351999993, loss is 4.216153621673584
step: 218: time taken for the last 1 steps is 0.25955601499998693, loss is 4.374068737030029
step: 219: time taken for the last 1 steps is 0.2557204959999808, loss is 4.335903644561768
step: 220: time taken for the last 1 steps is 0.25875234799997315, loss is 4.248699188232422
step: 221: time taken for the last 1 steps is 0.2553890199999955, loss is 4.281352519989014
step: 222: time taken for the last 1 steps is 0.26401346699998385, loss is 4.3788275718688965
step: 223: time taken for the last 1 steps is 0.25741274899996824, loss is 4.4351677894592285
step: 224: time taken for the last 1 steps is 0.2508081200000447, loss is 4.27348518371582
step: 225: time taken for the last 1 steps is 0.25407534499998974, loss is 4.3527655601501465
step: 226: time taken for the last 1 steps is 0.26032811700002867, loss is 4.1881303787231445
step: 227: time taken for the last 1 steps is 0.24877194100002953, loss is 4.4105305671691895
step: 228: time taken for the last 1 steps is 0.2571053639999832, loss is 4.311103820800781
step: 229: time taken for the last 1 steps is 0.25710870400001795, loss is 4.210919380187988
step: 230: time taken for the last 1 steps is 0.25830068699997355, loss is 4.263632774353027
step: 231: time taken for the last 1 steps is 0.2617945559999839, loss is 4.136380672454834
step: 232: time taken for the last 1 steps is 0.2586165639999649, loss is 4.133800983428955
step: 233: time taken for the last 1 steps is 0.26150511999998116, loss is 4.268002033233643
step: 234: time taken for the last 1 steps is 0.25724366600002213, loss is 4.23779821395874
step: 235: time taken for the last 1 steps is 0.2510931360000086, loss is 4.327415943145752
step: 236: time taken for the last 1 steps is 0.2607689559999926, loss is 4.083348751068115
step: 237: time taken for the last 1 steps is 0.256901809999988, loss is 4.1864705085754395
step: 238: time taken for the last 1 steps is 0.2605832630000009, loss is 4.30706787109375
step: 239: time taken for the last 1 steps is 0.2559698719999801, loss is 4.1464643478393555
step: 240: time taken for the last 1 steps is 0.25319333700002744, loss is 4.3162994384765625
step: 241: time taken for the last 1 steps is 0.26174438499998587, loss is 4.037209510803223
step: 242: time taken for the last 1 steps is 0.2614033579999955, loss is 4.227962493896484
step: 243: time taken for the last 1 steps is 0.2579666109999721, loss is 4.183880805969238
step: 244: time taken for the last 1 steps is 0.25777276700000584, loss is 4.078238010406494
step: 245: time taken for the last 1 steps is 0.2545846749999896, loss is 4.299968719482422
step: 246: time taken for the last 1 steps is 0.25669459599998845, loss is 4.384758472442627
step: 247: time taken for the last 1 steps is 0.265473218000011, loss is 4.18385124206543
step: 248: time taken for the last 1 steps is 0.25879014700001335, loss is 3.889302968978882
step: 249: time taken for the last 1 steps is 0.26663682100002006, loss is 4.418885231018066
step: 250: time taken for the last 1 steps is 0.2566412140000125, loss is 4.30267333984375
step: 251: time taken for the last 1 steps is 0.259615182999994, loss is 4.105579376220703
step: 252: time taken for the last 1 steps is 0.2547257669999681, loss is 4.181215763092041
step: 253: time taken for the last 1 steps is 0.26540830599998344, loss is 4.274713516235352
step: 254: time taken for the last 1 steps is 0.25243632200005095, loss is 4.273143291473389
step: 255: time taken for the last 1 steps is 0.25957230199998094, loss is 4.336678981781006
step: 256: time taken for the last 1 steps is 0.2635708809999642, loss is 4.1258111000061035
step: 257: time taken for the last 1 steps is 0.25923046600001953, loss is 4.173645496368408
step: 258: time taken for the last 1 steps is 0.2581271239999978, loss is 4.333664417266846
step: 259: time taken for the last 1 steps is 0.24891347299995914, loss is 4.020784854888916
step: 260: time taken for the last 1 steps is 0.2584670709999841, loss is 4.313625335693359
step: 261: time taken for the last 1 steps is 0.2507348390000175, loss is 4.3096137046813965
step: 262: time taken for the last 1 steps is 0.26164700399999674, loss is 3.9310176372528076
step: 263: time taken for the last 1 steps is 0.2514092129999881, loss is 4.145771026611328
step: 264: time taken for the last 1 steps is 0.26232634699999835, loss is 4.142434120178223
step: 265: time taken for the last 1 steps is 0.2544166220000079, loss is 4.321177005767822
step: 266: time taken for the last 1 steps is 0.2597196650000342, loss is 4.194943904876709
step: 267: time taken for the last 1 steps is 0.2504601229999821, loss is 4.267314910888672
step: 268: time taken for the last 1 steps is 0.25638195100003713, loss is 3.9375503063201904
step: 269: time taken for the last 1 steps is 0.25710876499999813, loss is 4.302390098571777
step: 270: time taken for the last 1 steps is 0.26005489200002785, loss is 4.018846035003662
step: 271: time taken for the last 1 steps is 0.2612267349999797, loss is 4.288052558898926
step: 272: time taken for the last 1 steps is 0.2636828830000013, loss is 3.987633228302002
step: 273: time taken for the last 1 steps is 0.26190039800002296, loss is 4.17900276184082
step: 274: time taken for the last 1 steps is 0.2585914029999685, loss is 4.28294038772583
step: 275: time taken for the last 1 steps is 0.2609577389999913, loss is 4.20253849029541
step: 276: time taken for the last 1 steps is 0.2546831170000132, loss is 4.150814056396484
step: 277: time taken for the last 1 steps is 0.2545747549999646, loss is 4.421744346618652
step: 278: time taken for the last 1 steps is 0.2627312150000307, loss is 4.14412784576416
step: 279: time taken for the last 1 steps is 0.23578439599998546, loss is 4.081680774688721
val_loss : 4.0692 : val_acc: 0.0797
updating stats...
Epoch: 2 starting...
step: 1: time taken for the last 1 steps is 0.6895236540000269, loss is 4.060554027557373
step: 2: time taken for the last 1 steps is 0.255047229000013, loss is 4.1662750244140625
step: 3: time taken for the last 1 steps is 0.257597928999985, loss is 4.416499137878418
step: 4: time taken for the last 1 steps is 0.25465816199999836, loss is 4.190380096435547
step: 5: time taken for the last 1 steps is 0.25246123699997725, loss is 4.199824333190918
step: 6: time taken for the last 1 steps is 0.25101609900002586, loss is 4.1920366287231445
step: 7: time taken for the last 1 steps is 0.2520030090000205, loss is 3.890657424926758
step: 8: time taken for the last 1 steps is 0.2571485010000174, loss is 4.049408435821533
step: 9: time taken for the last 1 steps is 0.25378751400000965, loss is 4.208953380584717
step: 10: time taken for the last 1 steps is 0.25124547399997255, loss is 4.103820323944092
step: 11: time taken for the last 1 steps is 0.25142426799999384, loss is 4.126346588134766
step: 12: time taken for the last 1 steps is 0.2537671740000178, loss is 4.031586170196533
step: 13: time taken for the last 1 steps is 0.2543134239999745, loss is 4.236934185028076
step: 14: time taken for the last 1 steps is 0.24835310699995716, loss is 4.272700309753418
step: 15: time taken for the last 1 steps is 0.24928251500000442, loss is 3.9991979598999023
step: 16: time taken for the last 1 steps is 0.25865470000002233, loss is 4.10144567489624
step: 17: time taken for the last 1 steps is 0.2431046329999731, loss is 4.187726020812988
step: 18: time taken for the last 1 steps is 0.2513429559999736, loss is 4.126735687255859
step: 19: time taken for the last 1 steps is 0.24605886099999452, loss is 4.2733988761901855
step: 20: time taken for the last 1 steps is 0.2512526750000461, loss is 4.1199822425842285
step: 21: time taken for the last 1 steps is 0.2494953399999531, loss is 4.3857197761535645
step: 22: time taken for the last 1 steps is 0.2521242509999979, loss is 4.071629047393799
step: 23: time taken for the last 1 steps is 0.2556379209999591, loss is 4.299444675445557
step: 24: time taken for the last 1 steps is 0.2507160539999518, loss is 4.1478352546691895
step: 25: time taken for the last 1 steps is 0.25028089500000306, loss is 3.9528934955596924
step: 26: time taken for the last 1 steps is 0.24598034999996798, loss is 4.102459907531738
step: 27: time taken for the last 1 steps is 0.2484587689999671, loss is 4.148408889770508
step: 28: time taken for the last 1 steps is 0.24950185000000147, loss is 4.009032249450684
step: 29: time taken for the last 1 steps is 0.24847834899998134, loss is 4.109431266784668
step: 30: time taken for the last 1 steps is 0.2538478459999851, loss is 4.0545477867126465
step: 31: time taken for the last 1 steps is 0.2520302300000026, loss is 4.252261161804199
step: 32: time taken for the last 1 steps is 0.25529340399998546, loss is 4.322540283203125
step: 33: time taken for the last 1 steps is 0.2528188150000119, loss is 4.501019477844238
step: 34: time taken for the last 1 steps is 0.24769680299999663, loss is 4.263233661651611
step: 35: time taken for the last 1 steps is 0.2544575469999586, loss is 3.9750115871429443
step: 36: time taken for the last 1 steps is 0.2559571869999786, loss is 4.203736305236816
step: 37: time taken for the last 1 steps is 0.2529225369999608, loss is 4.022042751312256
step: 38: time taken for the last 1 steps is 0.25179738499997484, loss is 4.098234176635742
step: 39: time taken for the last 1 steps is 0.2539763579999885, loss is 4.185070037841797
step: 40: time taken for the last 1 steps is 0.25534832599998936, loss is 4.022120475769043
step: 41: time taken for the last 1 steps is 0.2565016880000144, loss is 4.226497650146484
step: 42: time taken for the last 1 steps is 0.25094139800000903, loss is 4.202787399291992
step: 43: time taken for the last 1 steps is 0.24928080500001215, loss is 3.866891384124756
step: 44: time taken for the last 1 steps is 0.2496636330000115, loss is 4.1885809898376465
step: 45: time taken for the last 1 steps is 0.2540368489999878, loss is 4.051706790924072
step: 46: time taken for the last 1 steps is 0.2575352080000357, loss is 4.0146074295043945
step: 47: time taken for the last 1 steps is 0.25197204800002737, loss is 4.0135087966918945
step: 48: time taken for the last 1 steps is 0.25425183399994467, loss is 4.074001312255859
step: 49: time taken for the last 1 steps is 0.26017739100001336, loss is 4.126014232635498
step: 50: time taken for the last 1 steps is 0.24913729299998977, loss is 4.017709732055664
step: 51: time taken for the last 1 steps is 0.24822025399998893, loss is 4.087454795837402
step: 52: time taken for the last 1 steps is 0.2576250600000094, loss is 4.096774578094482
step: 53: time taken for the last 1 steps is 0.25192096699998956, loss is 3.8628809452056885
step: 54: time taken for the last 1 steps is 0.2502871949999985, loss is 4.042823791503906
step: 55: time taken for the last 1 steps is 0.2551518509999937, loss is 4.096584320068359
step: 56: time taken for the last 1 steps is 0.24927577500000098, loss is 4.093833923339844
step: 57: time taken for the last 1 steps is 0.25601656899999625, loss is 4.1299357414245605
step: 58: time taken for the last 1 steps is 0.25305437899999106, loss is 3.8931045532226562
step: 59: time taken for the last 1 steps is 0.25280408499997975, loss is 4.020094394683838
step: 60: time taken for the last 1 steps is 0.25229055500000186, loss is 3.9865925312042236
step: 61: time taken for the last 1 steps is 0.2509068280000406, loss is 3.7625646591186523
step: 62: time taken for the last 1 steps is 0.25223026299994444, loss is 4.132228851318359
step: 63: time taken for the last 1 steps is 0.25400657899996304, loss is 4.251020431518555
step: 64: time taken for the last 1 steps is 0.2543748550000373, loss is 3.9893126487731934
step: 65: time taken for the last 1 steps is 0.2478567969999972, loss is 4.047043800354004
step: 66: time taken for the last 1 steps is 0.25233300600001485, loss is 4.010395050048828
step: 67: time taken for the last 1 steps is 0.25430066400002715, loss is 4.039310455322266
step: 68: time taken for the last 1 steps is 0.25297912799999267, loss is 3.915067672729492
step: 69: time taken for the last 1 steps is 0.24731605699997772, loss is 4.043298244476318
step: 70: time taken for the last 1 steps is 0.2510627600000248, loss is 3.990957736968994
step: 71: time taken for the last 1 steps is 0.252221703000032, loss is 3.917525291442871
step: 72: time taken for the last 1 steps is 0.2518561360000149, loss is 3.5985665321350098
step: 73: time taken for the last 1 steps is 0.2528815359999612, loss is 3.968690872192383
step: 74: time taken for the last 1 steps is 0.2578979450000247, loss is 3.951397657394409
step: 75: time taken for the last 1 steps is 0.2597007910000002, loss is 3.9562714099884033
step: 76: time taken for the last 1 steps is 0.2527217329999871, loss is 4.12548828125
step: 77: time taken for the last 1 steps is 0.25109309100002974, loss is 3.622377872467041
step: 78: time taken for the last 1 steps is 0.2581781509999814, loss is 3.7929651737213135
step: 79: time taken for the last 1 steps is 0.2570306339999888, loss is 3.689714193344116
step: 80: time taken for the last 1 steps is 0.24850777599999674, loss is 3.8619191646575928
step: 81: time taken for the last 1 steps is 0.25081470200001377, loss is 4.062615394592285
step: 82: time taken for the last 1 steps is 0.2585468339999579, loss is 4.077060222625732
step: 83: time taken for the last 1 steps is 0.25180443200002856, loss is 4.073394775390625
step: 84: time taken for the last 1 steps is 0.24951591699999653, loss is 4.137074947357178
step: 85: time taken for the last 1 steps is 0.25397839400000066, loss is 4.117764472961426
step: 86: time taken for the last 1 steps is 0.251094237000018, loss is 3.9760029315948486
step: 87: time taken for the last 1 steps is 0.2530821359999891, loss is 4.024318695068359
step: 88: time taken for the last 1 steps is 0.25651510399995914, loss is 4.068992614746094
step: 89: time taken for the last 1 steps is 0.2539558839999927, loss is 4.054355621337891
step: 90: time taken for the last 1 steps is 0.24800747699998738, loss is 4.033942699432373
step: 91: time taken for the last 1 steps is 0.2541161970000303, loss is 4.038212299346924
step: 92: time taken for the last 1 steps is 0.25360788699998693, loss is 3.9520492553710938
step: 93: time taken for the last 1 steps is 0.2506446589999882, loss is 3.8900487422943115
step: 94: time taken for the last 1 steps is 0.2521775089999778, loss is 3.825072765350342
step: 95: time taken for the last 1 steps is 0.2520350560000111, loss is 3.8265624046325684
step: 96: time taken for the last 1 steps is 0.25294021299998803, loss is 3.9427266120910645
step: 97: time taken for the last 1 steps is 0.24719355099995255, loss is 4.084011077880859
step: 98: time taken for the last 1 steps is 0.25724487800005136, loss is 4.029963970184326
step: 99: time taken for the last 1 steps is 0.2536463269999558, loss is 4.29442024230957
step: 100: time taken for the last 1 steps is 0.2539252429999692, loss is 4.036984920501709
step: 101: time taken for the last 1 steps is 0.24771063200000754, loss is 4.077581882476807
step: 102: time taken for the last 1 steps is 0.25306863699995574, loss is 3.823577404022217
step: 103: time taken for the last 1 steps is 0.2508145720000243, loss is 4.06989860534668
step: 104: time taken for the last 1 steps is 0.24493633599996656, loss is 4.044546604156494
step: 105: time taken for the last 1 steps is 0.25689133200000924, loss is 4.241550445556641
step: 106: time taken for the last 1 steps is 0.2465660980000166, loss is 3.9005274772644043
step: 107: time taken for the last 1 steps is 0.2574096010000062, loss is 3.9601924419403076
step: 108: time taken for the last 1 steps is 0.2569696730000146, loss is 3.9257404804229736
step: 109: time taken for the last 1 steps is 0.2519359140000006, loss is 4.191210746765137
step: 110: time taken for the last 1 steps is 0.25879740900001025, loss is 3.7536303997039795
step: 111: time taken for the last 1 steps is 0.25032065300001705, loss is 3.822922706604004
step: 112: time taken for the last 1 steps is 0.2596106149999855, loss is 3.990379810333252
step: 113: time taken for the last 1 steps is 0.2434905380000032, loss is 3.814377784729004
step: 114: time taken for the last 1 steps is 0.25473566800002345, loss is 4.110430717468262
step: 115: time taken for the last 1 steps is 0.24308763999999883, loss is 4.144354343414307
step: 116: time taken for the last 1 steps is 0.2531678579999834, loss is 3.8247945308685303
step: 117: time taken for the last 1 steps is 0.24499736700005315, loss is 4.005329608917236
step: 118: time taken for the last 1 steps is 0.24327732300002936, loss is 3.952025890350342
step: 119: time taken for the last 1 steps is 0.25320832900001733, loss is 3.9203929901123047
step: 120: time taken for the last 1 steps is 0.24508568999999625, loss is 4.096575736999512
step: 121: time taken for the last 1 steps is 0.25340446300003805, loss is 3.988690137863159
step: 122: time taken for the last 1 steps is 0.2477578019999669, loss is 3.873231887817383
step: 123: time taken for the last 1 steps is 0.24619195100001434, loss is 4.040771007537842
step: 124: time taken for the last 1 steps is 0.2529802249999875, loss is 3.9246785640716553
step: 125: time taken for the last 1 steps is 0.2514415050000025, loss is 4.095849514007568
step: 126: time taken for the last 1 steps is 0.25678181999995786, loss is 3.991701126098633
step: 127: time taken for the last 1 steps is 0.2541619080000146, loss is 3.820143699645996
step: 128: time taken for the last 1 steps is 0.2580276239999648, loss is 3.8458521366119385
step: 129: time taken for the last 1 steps is 0.26074380700003985, loss is 4.060593605041504
step: 130: time taken for the last 1 steps is 0.2500258950000216, loss is 3.8175909519195557
step: 131: time taken for the last 1 steps is 0.2533995229999846, loss is 3.787524461746216
step: 132: time taken for the last 1 steps is 0.2557434080000007, loss is 3.849271535873413
step: 133: time taken for the last 1 steps is 0.2488642240000445, loss is 3.757674217224121
step: 134: time taken for the last 1 steps is 0.24563854999996693, loss is 3.9758071899414062
step: 135: time taken for the last 1 steps is 0.24942779399998471, loss is 3.9901294708251953
step: 136: time taken for the last 1 steps is 0.2534782440000072, loss is 3.977137327194214
step: 137: time taken for the last 1 steps is 0.2473351829999615, loss is 4.172210216522217
step: 138: time taken for the last 1 steps is 0.2552080180000189, loss is 3.9610865116119385
step: 139: time taken for the last 1 steps is 0.25595652399999835, loss is 4.180133819580078
step: 140: time taken for the last 1 steps is 0.24921426099996324, loss is 3.8822977542877197
step: 141: time taken for the last 1 steps is 0.25635792099996024, loss is 3.9525887966156006
step: 142: time taken for the last 1 steps is 0.2552621599999725, loss is 4.059017658233643
step: 143: time taken for the last 1 steps is 0.2514825350000365, loss is 4.070084571838379
step: 144: time taken for the last 1 steps is 0.2565383049999923, loss is 4.012814044952393
step: 145: time taken for the last 1 steps is 0.2613172380000037, loss is 3.870054244995117
step: 146: time taken for the last 1 steps is 0.24905203800000209, loss is 4.240328788757324
step: 147: time taken for the last 1 steps is 0.24596436699999913, loss is 3.921361207962036
step: 148: time taken for the last 1 steps is 0.25676491900003384, loss is 3.908442735671997
step: 149: time taken for the last 1 steps is 0.25625224799995294, loss is 3.8179917335510254
step: 150: time taken for the last 1 steps is 0.2557446790000313, loss is 3.8214547634124756
step: 151: time taken for the last 1 steps is 0.2406664530000171, loss is 4.048144340515137
step: 152: time taken for the last 1 steps is 0.2537321990000123, loss is 3.8156778812408447
step: 153: time taken for the last 1 steps is 0.252552105999996, loss is 3.6674673557281494
step: 154: time taken for the last 1 steps is 0.2544121130000008, loss is 3.79384708404541
step: 155: time taken for the last 1 steps is 0.25277454100000796, loss is 3.996262550354004
step: 156: time taken for the last 1 steps is 0.253717689000041, loss is 3.8331313133239746
step: 157: time taken for the last 1 steps is 0.2543513319999988, loss is 4.0882134437561035
step: 158: time taken for the last 1 steps is 0.2510073760000182, loss is 3.9235732555389404
step: 159: time taken for the last 1 steps is 0.2459378159999801, loss is 3.9841980934143066
step: 160: time taken for the last 1 steps is 0.25027798100001064, loss is 4.073812007904053
step: 161: time taken for the last 1 steps is 0.25966157100003784, loss is 3.716912031173706
step: 162: time taken for the last 1 steps is 0.25909871999999723, loss is 3.886120080947876
step: 163: time taken for the last 1 steps is 0.2563324060000127, loss is 3.8924529552459717
step: 164: time taken for the last 1 steps is 0.2573229449999985, loss is 3.8922181129455566
step: 165: time taken for the last 1 steps is 0.2515256130000125, loss is 3.8636953830718994
step: 166: time taken for the last 1 steps is 0.2533797990000153, loss is 3.7816336154937744
step: 167: time taken for the last 1 steps is 0.2560928010000225, loss is 3.526089906692505
step: 168: time taken for the last 1 steps is 0.2527447470000084, loss is 3.8599436283111572
step: 169: time taken for the last 1 steps is 0.2515059119999705, loss is 3.696331024169922
step: 170: time taken for the last 1 steps is 0.25794713799996316, loss is 3.927506446838379
step: 171: time taken for the last 1 steps is 0.25362101299998585, loss is 3.8404932022094727
step: 172: time taken for the last 1 steps is 0.25688296699996727, loss is 3.868551731109619
step: 173: time taken for the last 1 steps is 0.2534251399999903, loss is 3.814709424972534
step: 174: time taken for the last 1 steps is 0.2518525990000171, loss is 3.9838602542877197
step: 175: time taken for the last 1 steps is 0.24380251200000203, loss is 3.748842716217041
step: 176: time taken for the last 1 steps is 0.2538228679999861, loss is 3.817082643508911
step: 177: time taken for the last 1 steps is 0.25673818399997117, loss is 3.6511948108673096
step: 178: time taken for the last 1 steps is 0.2505686040000228, loss is 3.578589677810669
step: 179: time taken for the last 1 steps is 0.2542758460000414, loss is 3.9722259044647217
step: 180: time taken for the last 1 steps is 0.2530582919999915, loss is 3.8177130222320557
step: 181: time taken for the last 1 steps is 0.25506143100000145, loss is 4.040961742401123
step: 182: time taken for the last 1 steps is 0.24902076400002215, loss is 3.745741605758667
step: 183: time taken for the last 1 steps is 0.2505137820000414, loss is 3.9821670055389404
step: 184: time taken for the last 1 steps is 0.25556414200002564, loss is 4.015752792358398
step: 185: time taken for the last 1 steps is 0.24718692799996234, loss is 3.8230693340301514
step: 186: time taken for the last 1 steps is 0.25544902000001457, loss is 3.804311752319336
step: 187: time taken for the last 1 steps is 0.25311335300000337, loss is 3.91082763671875
step: 188: time taken for the last 1 steps is 0.25685841699998946, loss is 3.861664295196533
step: 189: time taken for the last 1 steps is 0.25431159599997955, loss is 3.89089298248291
step: 190: time taken for the last 1 steps is 0.2452378099999919, loss is 4.118781566619873
step: 191: time taken for the last 1 steps is 0.2539693100000022, loss is 3.999220848083496
step: 192: time taken for the last 1 steps is 0.2536703640000155, loss is 3.9700286388397217
step: 193: time taken for the last 1 steps is 0.24752892499998325, loss is 4.015350341796875
step: 194: time taken for the last 1 steps is 0.2586442510000211, loss is 3.9552369117736816
step: 195: time taken for the last 1 steps is 0.2508691900000031, loss is 3.662324905395508
step: 196: time taken for the last 1 steps is 0.24761721599998054, loss is 3.624174118041992
step: 197: time taken for the last 1 steps is 0.25599907999998095, loss is 3.678323745727539
step: 198: time taken for the last 1 steps is 0.25487487800000963, loss is 3.817084312438965
step: 199: time taken for the last 1 steps is 0.24435280299996975, loss is 3.858469247817993
step: 200: time taken for the last 1 steps is 0.2553516269999818, loss is 3.921903610229492
step: 201: time taken for the last 1 steps is 0.2503238689999989, loss is 3.938459873199463
step: 202: time taken for the last 1 steps is 0.25415111300003446, loss is 3.6183090209960938
step: 203: time taken for the last 1 steps is 0.24515994900002624, loss is 3.6317505836486816
step: 204: time taken for the last 1 steps is 0.2545516719999341, loss is 3.7844176292419434
step: 205: time taken for the last 1 steps is 0.25750616899995293, loss is 3.894787073135376
step: 206: time taken for the last 1 steps is 0.2506427360000316, loss is 3.838219165802002
step: 207: time taken for the last 1 steps is 0.25419664399998965, loss is 4.095887184143066
step: 208: time taken for the last 1 steps is 0.2532784960000072, loss is 3.713913917541504
step: 209: time taken for the last 1 steps is 0.2560210299999426, loss is 3.9571402072906494
step: 210: time taken for the last 1 steps is 0.2566587129999789, loss is 3.9521591663360596
step: 211: time taken for the last 1 steps is 0.24923003800006427, loss is 3.9129881858825684
step: 212: time taken for the last 1 steps is 0.2555076999999528, loss is 3.869257688522339
step: 213: time taken for the last 1 steps is 0.257520649000071, loss is 3.799757957458496
step: 214: time taken for the last 1 steps is 0.2443356930000391, loss is 3.9002232551574707
step: 215: time taken for the last 1 steps is 0.25321700599999986, loss is 3.7885563373565674
step: 216: time taken for the last 1 steps is 0.25528722599995035, loss is 3.834613084793091
step: 217: time taken for the last 1 steps is 0.25569950399994923, loss is 3.5112221240997314
step: 218: time taken for the last 1 steps is 0.25716096199994354, loss is 3.905047655105591
step: 219: time taken for the last 1 steps is 0.2559244190000527, loss is 3.730198860168457
step: 220: time taken for the last 1 steps is 0.2553120970000009, loss is 3.8626229763031006
step: 221: time taken for the last 1 steps is 0.2575756910000564, loss is 3.7866694927215576
step: 222: time taken for the last 1 steps is 0.2508513389999507, loss is 3.9708120822906494
step: 223: time taken for the last 1 steps is 0.25141810100001294, loss is 4.011201858520508
step: 224: time taken for the last 1 steps is 0.25715839200006485, loss is 3.9431447982788086
step: 225: time taken for the last 1 steps is 0.25817251199998736, loss is 4.010535717010498
step: 226: time taken for the last 1 steps is 0.2538321880000467, loss is 3.9456958770751953
step: 227: time taken for the last 1 steps is 0.2550697920000857, loss is 4.164735794067383
step: 228: time taken for the last 1 steps is 0.25393684000005123, loss is 3.784292221069336
step: 229: time taken for the last 1 steps is 0.25452005100009956, loss is 3.7148101329803467
step: 230: time taken for the last 1 steps is 0.2524506309999879, loss is 3.8933990001678467
step: 231: time taken for the last 1 steps is 0.25317876500002967, loss is 3.461291790008545
step: 232: time taken for the last 1 steps is 0.2507163359999822, loss is 3.664804458618164
step: 233: time taken for the last 1 steps is 0.25696423899989895, loss is 3.606860637664795
step: 234: time taken for the last 1 steps is 0.25430755699994734, loss is 3.821992874145508
step: 235: time taken for the last 1 steps is 0.25036125999997694, loss is 3.9950404167175293
step: 236: time taken for the last 1 steps is 0.2534058990000858, loss is 3.7123961448669434
step: 237: time taken for the last 1 steps is 0.2577454539999735, loss is 3.6645419597625732
step: 238: time taken for the last 1 steps is 0.24497987399990961, loss is 3.897904396057129
step: 239: time taken for the last 1 steps is 0.25467665299993314, loss is 3.7769601345062256
step: 240: time taken for the last 1 steps is 0.24843114199995853, loss is 3.9095804691314697
step: 241: time taken for the last 1 steps is 0.25231481799994526, loss is 3.7583563327789307
step: 242: time taken for the last 1 steps is 0.2469408359999079, loss is 3.8762714862823486
step: 243: time taken for the last 1 steps is 0.25697288899993964, loss is 3.4795138835906982
step: 244: time taken for the last 1 steps is 0.2522369680000338, loss is 3.8226609230041504
step: 245: time taken for the last 1 steps is 0.25324653699999544, loss is 3.9292125701904297
step: 246: time taken for the last 1 steps is 0.25298608299999614, loss is 4.060460090637207
step: 247: time taken for the last 1 steps is 0.2493032109999831, loss is 3.7012999057769775
step: 248: time taken for the last 1 steps is 0.2509841039999401, loss is 3.2624080181121826
step: 249: time taken for the last 1 steps is 0.2549889809999968, loss is 3.9142446517944336
step: 250: time taken for the last 1 steps is 0.25917626199998267, loss is 3.756697177886963
step: 251: time taken for the last 1 steps is 0.25079305999997814, loss is 3.7828471660614014
step: 252: time taken for the last 1 steps is 0.2496865489999891, loss is 3.9785420894622803
step: 253: time taken for the last 1 steps is 0.2571733930000164, loss is 3.918090343475342
step: 254: time taken for the last 1 steps is 0.25692794799999774, loss is 3.9352080821990967
step: 255: time taken for the last 1 steps is 0.257614440999987, loss is 3.782555103302002
step: 256: time taken for the last 1 steps is 0.246064279000052, loss is 3.760831117630005
step: 257: time taken for the last 1 steps is 0.25145027199994274, loss is 3.706961154937744
step: 258: time taken for the last 1 steps is 0.252839799999947, loss is 3.647899866104126
step: 259: time taken for the last 1 steps is 0.25370151599997826, loss is 3.4003546237945557
step: 260: time taken for the last 1 steps is 0.253571093000005, loss is 3.8092315196990967
step: 261: time taken for the last 1 steps is 0.25282441899992136, loss is 3.8218555450439453
step: 262: time taken for the last 1 steps is 0.24722660099996574, loss is 3.36043381690979
step: 263: time taken for the last 1 steps is 0.2503766119999682, loss is 3.7000207901000977
step: 264: time taken for the last 1 steps is 0.2539465999999493, loss is 3.578587293624878
step: 265: time taken for the last 1 steps is 0.255111812999985, loss is 3.832127809524536
step: 266: time taken for the last 1 steps is 0.25609357200005434, loss is 3.6265101432800293
step: 267: time taken for the last 1 steps is 0.25074284899994836, loss is 3.893702745437622
step: 268: time taken for the last 1 steps is 0.2574886389999165, loss is 3.5314197540283203
step: 269: time taken for the last 1 steps is 0.2507302289999416, loss is 4.000232219696045
step: 270: time taken for the last 1 steps is 0.253859858999931, loss is 3.567129135131836
step: 271: time taken for the last 1 steps is 0.24726073199997245, loss is 3.6589467525482178
step: 272: time taken for the last 1 steps is 0.2539466299999731, loss is 3.6705353260040283
step: 273: time taken for the last 1 steps is 0.2534541109999964, loss is 3.7243568897247314
step: 274: time taken for the last 1 steps is 0.2536771549999912, loss is 3.790518283843994
step: 275: time taken for the last 1 steps is 0.26001827799996136, loss is 3.783092737197876
step: 276: time taken for the last 1 steps is 0.2520900850000771, loss is 3.7121379375457764
step: 277: time taken for the last 1 steps is 0.24970304900000428, loss is 4.211219787597656
step: 278: time taken for the last 1 steps is 0.2483935240001074, loss is 3.8870503902435303
step: 279: time taken for the last 1 steps is 0.21861689900003967, loss is 3.7971079349517822
val_loss : 3.5785 : val_acc: 0.1617
updating stats...
--> cuda max reserved memory = 12.5
--> max reserved percentage = 56.66 %
--> cuda max memory allocated = 11.0791
--> max allocated percentage = 50.22 %
--> peak active memory = 11.0791
--> peak active memory 50.22 %
cudaMalloc retries = 0
cuda OOM = 0
loss='4.0692', acc='0.0797'
loss='3.5785', acc='0.1617'
--> Highest Val Accuracy = 16.17
--> Model Size = 85.722869 M Params