train.log
9506 lines (9495 loc) · 471 KB
INFO:numexpr.utils:Note: NumExpr detected 56 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
Preparing data...
Preparing done...
average sequence length: 163.50
W1122 15:17:38.822602 30602 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 5.2, Driver API Version: 10.1, Runtime API Version: 10.1
W1122 15:17:38.828788 30602 device_context.cc:465] device: 0, cuDNN Version: 7.6.
loss in epoch 1 iteration 0: 1.3954685926437378
loss in epoch 1 iteration 1: 1.390064001083374
loss in epoch 1 iteration 2: 1.3876252174377441
loss in epoch 1 iteration 3: 1.3833444118499756
loss in epoch 1 iteration 4: 1.3766138553619385
loss in epoch 1 iteration 5: 1.3713159561157227
loss in epoch 1 iteration 6: 1.3636765480041504
loss in epoch 1 iteration 7: 1.3560080528259277
loss in epoch 1 iteration 8: 1.3496676683425903
loss in epoch 1 iteration 9: 1.340202808380127
loss in epoch 1 iteration 10: 1.328973650932312
loss in epoch 1 iteration 11: 1.3285090923309326
loss in epoch 1 iteration 12: 1.3181021213531494
loss in epoch 1 iteration 13: 1.3026565313339233
loss in epoch 1 iteration 14: 1.2882106304168701
loss in epoch 1 iteration 15: 1.281112551689148
loss in epoch 1 iteration 16: 1.2652451992034912
loss in epoch 1 iteration 17: 1.2669715881347656
loss in epoch 1 iteration 18: 1.25684654712677
loss in epoch 1 iteration 19: 1.2507033348083496
loss in epoch 1 iteration 20: 1.24028480052948
loss in epoch 1 iteration 21: 1.235501766204834
loss in epoch 1 iteration 22: 1.2208380699157715
loss in epoch 1 iteration 23: 1.2172213792800903
loss in epoch 1 iteration 24: 1.2082992792129517
loss in epoch 1 iteration 25: 1.192840337753296
loss in epoch 1 iteration 26: 1.1891740560531616
loss in epoch 1 iteration 27: 1.1766554117202759
loss in epoch 1 iteration 28: 1.1731916666030884
loss in epoch 1 iteration 29: 1.1608920097351074
loss in epoch 1 iteration 30: 1.1513519287109375
loss in epoch 1 iteration 31: 1.1614339351654053
loss in epoch 1 iteration 32: 1.130920171737671
loss in epoch 1 iteration 33: 1.1264233589172363
loss in epoch 1 iteration 34: 1.1336860656738281
loss in epoch 1 iteration 35: 1.1287256479263306
loss in epoch 1 iteration 36: 1.1313598155975342
loss in epoch 1 iteration 37: 1.10867178440094
loss in epoch 1 iteration 38: 1.1201082468032837
loss in epoch 1 iteration 39: 1.1228140592575073
loss in epoch 1 iteration 40: 1.101260781288147
loss in epoch 1 iteration 41: 1.092340350151062
loss in epoch 1 iteration 42: 1.0971192121505737
loss in epoch 1 iteration 43: 1.0758672952651978
loss in epoch 1 iteration 44: 1.0767282247543335
loss in epoch 1 iteration 45: 1.1024706363677979
loss in epoch 1 iteration 46: 1.0783361196517944
loss in epoch 2 iteration 0: 1.0725834369659424
loss in epoch 2 iteration 1: 1.060503363609314
loss in epoch 2 iteration 2: 1.0577424764633179
loss in epoch 2 iteration 3: 1.0465812683105469
loss in epoch 2 iteration 4: 1.0480931997299194
loss in epoch 2 iteration 5: 1.0580838918685913
loss in epoch 2 iteration 6: 1.0440220832824707
loss in epoch 2 iteration 7: 1.044769525527954
loss in epoch 2 iteration 8: 1.033262848854065
loss in epoch 2 iteration 9: 1.0375419855117798
loss in epoch 2 iteration 10: 1.0657042264938354
loss in epoch 2 iteration 11: 1.0454919338226318
loss in epoch 2 iteration 12: 1.0568623542785645
loss in epoch 2 iteration 13: 1.0457067489624023
loss in epoch 2 iteration 14: 0.9936767816543579
loss in epoch 2 iteration 15: 1.041584849357605
loss in epoch 2 iteration 16: 1.031395435333252
loss in epoch 2 iteration 17: 1.0215445756912231
loss in epoch 2 iteration 18: 1.0385398864746094
loss in epoch 2 iteration 19: 1.0403368473052979
loss in epoch 2 iteration 20: 1.0118645429611206
loss in epoch 2 iteration 21: 1.0180903673171997
loss in epoch 2 iteration 22: 1.0131797790527344
loss in epoch 2 iteration 23: 1.0390992164611816
loss in epoch 2 iteration 24: 1.0096813440322876
loss in epoch 2 iteration 25: 1.0085821151733398
loss in epoch 2 iteration 26: 1.0116157531738281
loss in epoch 2 iteration 27: 1.0120694637298584
loss in epoch 2 iteration 28: 1.0114126205444336
loss in epoch 2 iteration 29: 1.0212396383285522
loss in epoch 2 iteration 30: 0.9878599047660828
loss in epoch 2 iteration 31: 0.9908968210220337
loss in epoch 2 iteration 32: 1.0123655796051025
loss in epoch 2 iteration 33: 1.003240942955017
loss in epoch 2 iteration 34: 1.040265679359436
loss in epoch 2 iteration 35: 0.9985079169273376
loss in epoch 2 iteration 36: 0.9887409806251526
loss in epoch 2 iteration 37: 1.0069292783737183
loss in epoch 2 iteration 38: 1.0376652479171753
loss in epoch 2 iteration 39: 1.006603479385376
loss in epoch 2 iteration 40: 1.0153368711471558
loss in epoch 2 iteration 41: 0.9946372509002686
loss in epoch 2 iteration 42: 1.0056779384613037
loss in epoch 2 iteration 43: 0.9850974678993225
loss in epoch 2 iteration 44: 0.9986028075218201
loss in epoch 2 iteration 45: 0.982260525226593
loss in epoch 2 iteration 46: 1.021675944328308
loss in epoch 3 iteration 0: 0.9925995469093323
loss in epoch 3 iteration 1: 0.9894546866416931
loss in epoch 3 iteration 2: 0.9828079342842102
loss in epoch 3 iteration 3: 1.0274986028671265
loss in epoch 3 iteration 4: 1.0135293006896973
loss in epoch 3 iteration 5: 0.9982706904411316
loss in epoch 3 iteration 6: 0.9921504259109497
loss in epoch 3 iteration 7: 1.0253090858459473
loss in epoch 3 iteration 8: 1.0227553844451904
loss in epoch 3 iteration 9: 1.0012770891189575
loss in epoch 3 iteration 10: 0.9977245330810547
loss in epoch 3 iteration 11: 1.001591682434082
loss in epoch 3 iteration 12: 0.9978345036506653
loss in epoch 3 iteration 13: 1.0190268754959106
loss in epoch 3 iteration 14: 0.9770386219024658
loss in epoch 3 iteration 15: 0.9858853220939636
loss in epoch 3 iteration 16: 0.9742804765701294
loss in epoch 3 iteration 17: 1.007082462310791
loss in epoch 3 iteration 18: 0.9976646900177002
loss in epoch 3 iteration 19: 0.9718105792999268
loss in epoch 3 iteration 20: 1.0135655403137207
loss in epoch 3 iteration 21: 0.9650017023086548
loss in epoch 3 iteration 22: 1.0049456357955933
loss in epoch 3 iteration 23: 0.9941770434379578
loss in epoch 3 iteration 24: 0.9906175136566162
loss in epoch 3 iteration 25: 0.9979776740074158
loss in epoch 3 iteration 26: 1.025390386581421
loss in epoch 3 iteration 27: 0.9793052077293396
loss in epoch 3 iteration 28: 0.9869596362113953
loss in epoch 3 iteration 29: 1.0061041116714478
loss in epoch 3 iteration 30: 0.9694822430610657
loss in epoch 3 iteration 31: 0.9919165968894958
loss in epoch 3 iteration 32: 0.9729082584381104
loss in epoch 3 iteration 33: 0.997200071811676
loss in epoch 3 iteration 34: 1.011775016784668
loss in epoch 3 iteration 35: 0.974399983882904
loss in epoch 3 iteration 36: 0.967228889465332
loss in epoch 3 iteration 37: 0.9707850813865662
loss in epoch 3 iteration 38: 0.9838470816612244
loss in epoch 3 iteration 39: 0.9991986155509949
loss in epoch 3 iteration 40: 0.9639493227005005
loss in epoch 3 iteration 41: 0.9745542407035828
loss in epoch 3 iteration 42: 0.9700213670730591
loss in epoch 3 iteration 43: 0.9695715308189392
loss in epoch 3 iteration 44: 1.0204198360443115
loss in epoch 3 iteration 45: 0.9551382660865784
loss in epoch 3 iteration 46: 0.994737446308136
loss in epoch 4 iteration 0: 0.9691866636276245
loss in epoch 4 iteration 1: 0.9746191501617432
loss in epoch 4 iteration 2: 0.9937982559204102
loss in epoch 4 iteration 3: 0.9825000166893005
loss in epoch 4 iteration 4: 0.9493007659912109
loss in epoch 4 iteration 5: 0.9543700218200684
loss in epoch 4 iteration 6: 0.951466977596283
loss in epoch 4 iteration 7: 0.9685297012329102
loss in epoch 4 iteration 8: 0.9728102087974548
loss in epoch 4 iteration 9: 0.9597477912902832
loss in epoch 4 iteration 10: 0.955816924571991
loss in epoch 4 iteration 11: 0.9551020264625549
loss in epoch 4 iteration 12: 0.9757920503616333
loss in epoch 4 iteration 13: 0.9588606953620911
loss in epoch 4 iteration 14: 0.972875714302063
loss in epoch 4 iteration 15: 0.9585232734680176
loss in epoch 4 iteration 16: 0.976365864276886
loss in epoch 4 iteration 17: 0.9573033452033997
loss in epoch 4 iteration 18: 0.9513728022575378
loss in epoch 4 iteration 19: 0.9532862305641174
loss in epoch 4 iteration 20: 0.9488862752914429
loss in epoch 4 iteration 21: 0.9194649457931519
loss in epoch 4 iteration 22: 0.9301161170005798
loss in epoch 4 iteration 23: 0.9674838185310364
loss in epoch 4 iteration 24: 0.9621961712837219
loss in epoch 4 iteration 25: 0.9650375247001648
loss in epoch 4 iteration 26: 0.973049521446228
loss in epoch 4 iteration 27: 0.9508160352706909
loss in epoch 4 iteration 28: 0.9201311469078064
loss in epoch 4 iteration 29: 0.91118985414505
loss in epoch 4 iteration 30: 0.922955334186554
loss in epoch 4 iteration 31: 0.9414001107215881
loss in epoch 4 iteration 32: 0.9416683912277222
loss in epoch 4 iteration 33: 0.9262685775756836
loss in epoch 4 iteration 34: 0.922330915927887
loss in epoch 4 iteration 35: 0.9208005666732788
loss in epoch 4 iteration 36: 0.9556438326835632
loss in epoch 4 iteration 37: 0.9052221179008484
loss in epoch 4 iteration 38: 0.9157860279083252
loss in epoch 4 iteration 39: 0.9408758282661438
loss in epoch 4 iteration 40: 0.9370405673980713
loss in epoch 4 iteration 41: 0.925308346748352
loss in epoch 4 iteration 42: 0.9295805096626282
loss in epoch 4 iteration 43: 0.9139207601547241
loss in epoch 4 iteration 44: 0.916073739528656
loss in epoch 4 iteration 45: 0.9103310108184814
loss in epoch 4 iteration 46: 0.8958482146263123
loss in epoch 5 iteration 0: 0.907871425151825
loss in epoch 5 iteration 1: 0.8890647292137146
loss in epoch 5 iteration 2: 0.9181132316589355
loss in epoch 5 iteration 3: 0.9378848671913147
loss in epoch 5 iteration 4: 0.902988851070404
loss in epoch 5 iteration 5: 0.9178439378738403
loss in epoch 5 iteration 6: 0.9239814877510071
loss in epoch 5 iteration 7: 0.890411913394928
loss in epoch 5 iteration 8: 0.8949112296104431
loss in epoch 5 iteration 9: 0.9102426767349243
loss in epoch 5 iteration 10: 0.9116990566253662
loss in epoch 5 iteration 11: 0.882325291633606
loss in epoch 5 iteration 12: 0.8923685550689697
loss in epoch 5 iteration 13: 0.9169273376464844
loss in epoch 5 iteration 14: 0.9153704047203064
loss in epoch 5 iteration 15: 0.9081300497055054
loss in epoch 5 iteration 16: 0.884569525718689
loss in epoch 5 iteration 17: 0.8965110182762146
loss in epoch 5 iteration 18: 0.86756831407547
loss in epoch 5 iteration 19: 0.8701356649398804
loss in epoch 5 iteration 20: 0.9054853916168213
loss in epoch 5 iteration 21: 0.900336503982544
loss in epoch 5 iteration 22: 0.9024986624717712
loss in epoch 5 iteration 23: 0.8832417130470276
loss in epoch 5 iteration 24: 0.8696333765983582
loss in epoch 5 iteration 25: 0.9010987281799316
loss in epoch 5 iteration 26: 0.8917621970176697
loss in epoch 5 iteration 27: 0.8784523606300354
loss in epoch 5 iteration 28: 0.870998203754425
loss in epoch 5 iteration 29: 0.8677819967269897
loss in epoch 5 iteration 30: 0.8673023581504822
loss in epoch 5 iteration 31: 0.8612610101699829
loss in epoch 5 iteration 32: 0.8911495804786682
loss in epoch 5 iteration 33: 0.8590403199195862
loss in epoch 5 iteration 34: 0.8922449946403503
loss in epoch 5 iteration 35: 0.887820303440094
loss in epoch 5 iteration 36: 0.8514235615730286
loss in epoch 5 iteration 37: 0.8783392310142517
loss in epoch 5 iteration 38: 0.8745507001876831
loss in epoch 5 iteration 39: 0.8561462759971619
loss in epoch 5 iteration 40: 0.8594401478767395
loss in epoch 5 iteration 41: 0.8554290533065796
loss in epoch 5 iteration 42: 0.8762661218643188
loss in epoch 5 iteration 43: 0.851452648639679
loss in epoch 5 iteration 44: 0.8792212009429932
loss in epoch 5 iteration 45: 0.8484688997268677
loss in epoch 5 iteration 46: 0.8591905832290649
loss in epoch 6 iteration 0: 0.8318217396736145
loss in epoch 6 iteration 1: 0.8593459725379944
loss in epoch 6 iteration 2: 0.849812388420105
loss in epoch 6 iteration 3: 0.8462134599685669
loss in epoch 6 iteration 4: 0.8490907549858093
loss in epoch 6 iteration 5: 0.8524436950683594
loss in epoch 6 iteration 6: 0.8678955435752869
loss in epoch 6 iteration 7: 0.8593259453773499
loss in epoch 6 iteration 8: 0.8421313762664795
loss in epoch 6 iteration 9: 0.8638089299201965
loss in epoch 6 iteration 10: 0.8677895069122314
loss in epoch 6 iteration 11: 0.8408400416374207
loss in epoch 6 iteration 12: 0.8563240170478821
loss in epoch 6 iteration 13: 0.8568468689918518
loss in epoch 6 iteration 14: 0.8619177937507629
loss in epoch 6 iteration 15: 0.8637444376945496
loss in epoch 6 iteration 16: 0.8210169672966003
loss in epoch 6 iteration 17: 0.837516725063324
loss in epoch 6 iteration 18: 0.8373696804046631
loss in epoch 6 iteration 19: 0.8539358377456665
loss in epoch 6 iteration 20: 0.8503708839416504
loss in epoch 6 iteration 21: 0.8445774912834167
loss in epoch 6 iteration 22: 0.8072218298912048
loss in epoch 6 iteration 23: 0.813836395740509
loss in epoch 6 iteration 24: 0.8498975038528442
loss in epoch 6 iteration 25: 0.8258275389671326
loss in epoch 6 iteration 26: 0.8087275624275208
loss in epoch 6 iteration 27: 0.8095979690551758
loss in epoch 6 iteration 28: 0.8489259481430054
loss in epoch 6 iteration 29: 0.8354438543319702
loss in epoch 6 iteration 30: 0.809438943862915
loss in epoch 6 iteration 31: 0.8138231039047241
loss in epoch 6 iteration 32: 0.8349950909614563
loss in epoch 6 iteration 33: 0.8352787494659424
loss in epoch 6 iteration 34: 0.8276519775390625
loss in epoch 6 iteration 35: 0.8207772374153137
loss in epoch 6 iteration 36: 0.7994301319122314
loss in epoch 6 iteration 37: 0.8039531111717224
loss in epoch 6 iteration 38: 0.8268610835075378
loss in epoch 6 iteration 39: 0.8191311955451965
loss in epoch 6 iteration 40: 0.83078932762146
loss in epoch 6 iteration 41: 0.8152351975440979
loss in epoch 6 iteration 42: 0.8424715399742126
loss in epoch 6 iteration 43: 0.8113421201705933
loss in epoch 6 iteration 44: 0.833625316619873
loss in epoch 6 iteration 45: 0.8248891830444336
loss in epoch 6 iteration 46: 0.8308899402618408
loss in epoch 7 iteration 0: 0.8318812847137451
loss in epoch 7 iteration 1: 0.8234942555427551
loss in epoch 7 iteration 2: 0.830970823764801
loss in epoch 7 iteration 3: 0.8001179099082947
loss in epoch 7 iteration 4: 0.8231655955314636
loss in epoch 7 iteration 5: 0.7834515571594238
loss in epoch 7 iteration 6: 0.8219812512397766
loss in epoch 7 iteration 7: 0.8213329315185547
loss in epoch 7 iteration 8: 0.8048946857452393
loss in epoch 7 iteration 9: 0.7819621562957764
loss in epoch 7 iteration 10: 0.8097481727600098
loss in epoch 7 iteration 11: 0.7926613092422485
loss in epoch 7 iteration 12: 0.7816228270530701
loss in epoch 7 iteration 13: 0.7797985076904297
loss in epoch 7 iteration 14: 0.7912074327468872
loss in epoch 7 iteration 15: 0.7943331003189087
loss in epoch 7 iteration 16: 0.8135220408439636
loss in epoch 7 iteration 17: 0.785663366317749
loss in epoch 7 iteration 18: 0.8158658742904663
loss in epoch 7 iteration 19: 0.8033889532089233
loss in epoch 7 iteration 20: 0.7922537326812744
loss in epoch 7 iteration 21: 0.77069091796875
loss in epoch 7 iteration 22: 0.771425187587738
loss in epoch 7 iteration 23: 0.7971804141998291
loss in epoch 7 iteration 24: 0.7804566025733948
loss in epoch 7 iteration 25: 0.7854028940200806
loss in epoch 7 iteration 26: 0.7883251905441284
loss in epoch 7 iteration 27: 0.8020663857460022
loss in epoch 7 iteration 28: 0.7682420015335083
loss in epoch 7 iteration 29: 0.8010557293891907
loss in epoch 7 iteration 30: 0.8035875558853149
loss in epoch 7 iteration 31: 0.8010453581809998
loss in epoch 7 iteration 32: 0.795353353023529
loss in epoch 7 iteration 33: 0.777157723903656
loss in epoch 7 iteration 34: 0.811118483543396
loss in epoch 7 iteration 35: 0.7869524359703064
loss in epoch 7 iteration 36: 0.81684809923172
loss in epoch 7 iteration 37: 0.7969116568565369
loss in epoch 7 iteration 38: 0.7743387222290039
loss in epoch 7 iteration 39: 0.7629038691520691
loss in epoch 7 iteration 40: 0.7946825623512268
loss in epoch 7 iteration 41: 0.7814467549324036
loss in epoch 7 iteration 42: 0.7915584444999695
loss in epoch 7 iteration 43: 0.7645875811576843
loss in epoch 7 iteration 44: 0.7720376253128052
loss in epoch 7 iteration 45: 0.7704254984855652
loss in epoch 7 iteration 46: 0.7755281329154968
loss in epoch 8 iteration 0: 0.7712116837501526
loss in epoch 8 iteration 1: 0.7687814831733704
loss in epoch 8 iteration 2: 0.7753844857215881
loss in epoch 8 iteration 3: 0.771216869354248
loss in epoch 8 iteration 4: 0.769497275352478
loss in epoch 8 iteration 5: 0.7577829957008362
loss in epoch 8 iteration 6: 0.7631379961967468
loss in epoch 8 iteration 7: 0.7825471758842468
loss in epoch 8 iteration 8: 0.7675445079803467
loss in epoch 8 iteration 9: 0.7429941296577454
loss in epoch 8 iteration 10: 0.7418854236602783
loss in epoch 8 iteration 11: 0.7727715373039246
loss in epoch 8 iteration 12: 0.7658249735832214
loss in epoch 8 iteration 13: 0.7751817107200623
loss in epoch 8 iteration 14: 0.7722018957138062
loss in epoch 8 iteration 15: 0.7422227263450623
loss in epoch 8 iteration 16: 0.7269474267959595
loss in epoch 8 iteration 17: 0.7760573029518127
loss in epoch 8 iteration 18: 0.7336304783821106
loss in epoch 8 iteration 19: 0.7416519522666931
loss in epoch 8 iteration 20: 0.7732457518577576
loss in epoch 8 iteration 21: 0.7255421876907349
loss in epoch 8 iteration 22: 0.7529283761978149
loss in epoch 8 iteration 23: 0.7363418936729431
loss in epoch 8 iteration 24: 0.7402034401893616
loss in epoch 8 iteration 25: 0.7956795692443848
loss in epoch 8 iteration 26: 0.7544950246810913
loss in epoch 8 iteration 27: 0.7450177073478699
loss in epoch 8 iteration 28: 0.7493711113929749
loss in epoch 8 iteration 29: 0.7704940438270569
loss in epoch 8 iteration 30: 0.7363591194152832
loss in epoch 8 iteration 31: 0.7438355684280396
loss in epoch 8 iteration 32: 0.7372117042541504
loss in epoch 8 iteration 33: 0.7695136666297913
loss in epoch 8 iteration 34: 0.73603755235672
loss in epoch 8 iteration 35: 0.7408326268196106
loss in epoch 8 iteration 36: 0.7453048825263977
loss in epoch 8 iteration 37: 0.741847813129425
loss in epoch 8 iteration 38: 0.765419602394104
loss in epoch 8 iteration 39: 0.750770628452301
loss in epoch 8 iteration 40: 0.7520395517349243
loss in epoch 8 iteration 41: 0.7092989087104797
loss in epoch 8 iteration 42: 0.7472370266914368
loss in epoch 8 iteration 43: 0.7257764339447021
loss in epoch 8 iteration 44: 0.743757426738739
loss in epoch 8 iteration 45: 0.734961748123169
loss in epoch 8 iteration 46: 0.7316722869873047
loss in epoch 9 iteration 0: 0.7300987839698792
loss in epoch 9 iteration 1: 0.7340084314346313
loss in epoch 9 iteration 2: 0.7649257183074951
loss in epoch 9 iteration 3: 0.7166370153427124
loss in epoch 9 iteration 4: 0.7361950278282166
loss in epoch 9 iteration 5: 0.7406734228134155
loss in epoch 9 iteration 6: 0.7490411996841431
loss in epoch 9 iteration 7: 0.726374089717865
loss in epoch 9 iteration 8: 0.7356189489364624
loss in epoch 9 iteration 9: 0.7474063038825989
loss in epoch 9 iteration 10: 0.7264783382415771
loss in epoch 9 iteration 11: 0.7192060947418213
loss in epoch 9 iteration 12: 0.7320207357406616
loss in epoch 9 iteration 13: 0.717888593673706
loss in epoch 9 iteration 14: 0.7315883040428162
loss in epoch 9 iteration 15: 0.7487693428993225
loss in epoch 9 iteration 16: 0.7272205948829651
loss in epoch 9 iteration 17: 0.7015507221221924
loss in epoch 9 iteration 18: 0.7302191257476807
loss in epoch 9 iteration 19: 0.7302562594413757
loss in epoch 9 iteration 20: 0.7425158619880676
loss in epoch 9 iteration 21: 0.7056486010551453
loss in epoch 9 iteration 22: 0.7257555723190308
loss in epoch 9 iteration 23: 0.7022297978401184
loss in epoch 9 iteration 24: 0.7332358956336975
loss in epoch 9 iteration 25: 0.723960280418396
loss in epoch 9 iteration 26: 0.746355414390564
loss in epoch 9 iteration 27: 0.6869873404502869
loss in epoch 9 iteration 28: 0.7109772562980652
loss in epoch 9 iteration 29: 0.6927624344825745
loss in epoch 9 iteration 30: 0.6986455917358398
loss in epoch 9 iteration 31: 0.7100334763526917
loss in epoch 9 iteration 32: 0.692020058631897
loss in epoch 9 iteration 33: 0.7183868288993835
loss in epoch 9 iteration 34: 0.6957949995994568
loss in epoch 9 iteration 35: 0.6908816695213318
loss in epoch 9 iteration 36: 0.6995864510536194
loss in epoch 9 iteration 37: 0.7112470269203186
loss in epoch 9 iteration 38: 0.715732991695404
loss in epoch 9 iteration 39: 0.7013790607452393
loss in epoch 9 iteration 40: 0.6947695016860962
loss in epoch 9 iteration 41: 0.6931447386741638
loss in epoch 9 iteration 42: 0.7290127873420715
loss in epoch 9 iteration 43: 0.7256482839584351
loss in epoch 9 iteration 44: 0.707457959651947
loss in epoch 9 iteration 45: 0.7224726676940918
loss in epoch 9 iteration 46: 0.7095064520835876
loss in epoch 10 iteration 0: 0.7416647672653198
loss in epoch 10 iteration 1: 0.6549530625343323
loss in epoch 10 iteration 2: 0.6964388489723206
loss in epoch 10 iteration 3: 0.7055014967918396
loss in epoch 10 iteration 4: 0.7065179944038391
loss in epoch 10 iteration 5: 0.6968803405761719
loss in epoch 10 iteration 6: 0.6862895488739014
loss in epoch 10 iteration 7: 0.7264845967292786
loss in epoch 10 iteration 8: 0.701085090637207
loss in epoch 10 iteration 9: 0.6907557249069214
loss in epoch 10 iteration 10: 0.7009850144386292
loss in epoch 10 iteration 11: 0.7034155130386353
loss in epoch 10 iteration 12: 0.6911093592643738
loss in epoch 10 iteration 13: 0.7129244804382324
loss in epoch 10 iteration 14: 0.7047098278999329
loss in epoch 10 iteration 15: 0.6798200607299805
loss in epoch 10 iteration 16: 0.7010929584503174
loss in epoch 10 iteration 17: 0.6829248666763306
loss in epoch 10 iteration 18: 0.6879277229309082
loss in epoch 10 iteration 19: 0.6903768181800842
loss in epoch 10 iteration 20: 0.7254827618598938
loss in epoch 10 iteration 21: 0.6791942119598389
loss in epoch 10 iteration 22: 0.7072764039039612
loss in epoch 10 iteration 23: 0.6901839971542358
loss in epoch 10 iteration 24: 0.6519026160240173
loss in epoch 10 iteration 25: 0.7073217034339905
loss in epoch 10 iteration 26: 0.6772890686988831
loss in epoch 10 iteration 27: 0.6603526473045349
loss in epoch 10 iteration 28: 0.670166015625
loss in epoch 10 iteration 29: 0.6799165606498718
loss in epoch 10 iteration 30: 0.6869593858718872
loss in epoch 10 iteration 31: 0.6822074055671692
loss in epoch 10 iteration 32: 0.6893329620361328
loss in epoch 10 iteration 33: 0.6765887141227722
loss in epoch 10 iteration 34: 0.6824609041213989
loss in epoch 10 iteration 35: 0.6950007081031799
loss in epoch 10 iteration 36: 0.6967897415161133
loss in epoch 10 iteration 37: 0.706653892993927
loss in epoch 10 iteration 38: 0.66963130235672
loss in epoch 10 iteration 39: 0.6925797462463379
loss in epoch 10 iteration 40: 0.676359236240387
loss in epoch 10 iteration 41: 0.6604172587394714
loss in epoch 10 iteration 42: 0.6736065745353699
loss in epoch 10 iteration 43: 0.6750014424324036
loss in epoch 10 iteration 44: 0.6945067644119263
loss in epoch 10 iteration 45: 0.6809627413749695
loss in epoch 10 iteration 46: 0.7008981704711914
loss in epoch 11 iteration 0: 0.6786425113677979
loss in epoch 11 iteration 1: 0.6778337955474854
loss in epoch 11 iteration 2: 0.685005784034729
loss in epoch 11 iteration 3: 0.6678077578544617
loss in epoch 11 iteration 4: 0.6670859456062317
loss in epoch 11 iteration 5: 0.6834124326705933
loss in epoch 11 iteration 6: 0.6818949580192566
loss in epoch 11 iteration 7: 0.6739338040351868
loss in epoch 11 iteration 8: 0.6596149802207947
loss in epoch 11 iteration 9: 0.6584272980690002
loss in epoch 11 iteration 10: 0.6623277068138123
loss in epoch 11 iteration 11: 0.6607365012168884
loss in epoch 11 iteration 12: 0.670139729976654
loss in epoch 11 iteration 13: 0.6778583526611328
loss in epoch 11 iteration 14: 0.6751372218132019
loss in epoch 11 iteration 15: 0.6710194945335388
loss in epoch 11 iteration 16: 0.6553514003753662
loss in epoch 11 iteration 17: 0.6698591113090515
loss in epoch 11 iteration 18: 0.6611554026603699
loss in epoch 11 iteration 19: 0.6847363710403442
loss in epoch 11 iteration 20: 0.716199517250061
loss in epoch 11 iteration 21: 0.684288501739502
loss in epoch 11 iteration 22: 0.6856546401977539
loss in epoch 11 iteration 23: 0.6825751066207886
loss in epoch 11 iteration 24: 0.6963602304458618
loss in epoch 11 iteration 25: 0.6811912059783936
loss in epoch 11 iteration 26: 0.6608962416648865
loss in epoch 11 iteration 27: 0.6599256992340088
loss in epoch 11 iteration 28: 0.6458843946456909
loss in epoch 11 iteration 29: 0.6550228595733643
loss in epoch 11 iteration 30: 0.6472496390342712
loss in epoch 11 iteration 31: 0.6693813800811768
loss in epoch 11 iteration 32: 0.6477804780006409
loss in epoch 11 iteration 33: 0.6591986417770386
loss in epoch 11 iteration 34: 0.6520959734916687
loss in epoch 11 iteration 35: 0.660656750202179
loss in epoch 11 iteration 36: 0.665496826171875
loss in epoch 11 iteration 37: 0.6704897284507751
loss in epoch 11 iteration 38: 0.6382722854614258
loss in epoch 11 iteration 39: 0.6726146936416626
loss in epoch 11 iteration 40: 0.6391689777374268
loss in epoch 11 iteration 41: 0.6612817049026489
loss in epoch 11 iteration 42: 0.6638034582138062
loss in epoch 11 iteration 43: 0.652697741985321
loss in epoch 11 iteration 44: 0.646255373954773
loss in epoch 11 iteration 45: 0.675376296043396
loss in epoch 11 iteration 46: 0.6843534111976624
loss in epoch 12 iteration 0: 0.6520501375198364
loss in epoch 12 iteration 1: 0.662613570690155
loss in epoch 12 iteration 2: 0.6724337339401245
loss in epoch 12 iteration 3: 0.6534978151321411
loss in epoch 12 iteration 4: 0.6656723022460938
loss in epoch 12 iteration 5: 0.6701639890670776
loss in epoch 12 iteration 6: 0.6674672365188599
loss in epoch 12 iteration 7: 0.697719931602478
loss in epoch 12 iteration 8: 0.649347186088562
loss in epoch 12 iteration 9: 0.6712114214897156
loss in epoch 12 iteration 10: 0.6508598923683167
loss in epoch 12 iteration 11: 0.6619114875793457
loss in epoch 12 iteration 12: 0.6720070242881775
loss in epoch 12 iteration 13: 0.6645147800445557
loss in epoch 12 iteration 14: 0.6882710456848145
loss in epoch 12 iteration 15: 0.6738383173942566
loss in epoch 12 iteration 16: 0.635615885257721
loss in epoch 12 iteration 17: 0.6501191258430481
loss in epoch 12 iteration 18: 0.6600931882858276
loss in epoch 12 iteration 19: 0.6504546403884888
loss in epoch 12 iteration 20: 0.6885488629341125
loss in epoch 12 iteration 21: 0.6608823537826538
loss in epoch 12 iteration 22: 0.6626038551330566
loss in epoch 12 iteration 23: 0.6614724397659302
loss in epoch 12 iteration 24: 0.6529643535614014
loss in epoch 12 iteration 25: 0.6573097705841064
loss in epoch 12 iteration 26: 0.6528633832931519
loss in epoch 12 iteration 27: 0.6401805877685547
loss in epoch 12 iteration 28: 0.6662614941596985
loss in epoch 12 iteration 29: 0.6535218954086304
loss in epoch 12 iteration 30: 0.635834276676178
loss in epoch 12 iteration 31: 0.6903967261314392
loss in epoch 12 iteration 32: 0.6449549794197083
loss in epoch 12 iteration 33: 0.6545307040214539
loss in epoch 12 iteration 34: 0.659397542476654
loss in epoch 12 iteration 35: 0.6277753114700317
loss in epoch 12 iteration 36: 0.6592061519622803
loss in epoch 12 iteration 37: 0.6283028721809387
loss in epoch 12 iteration 38: 0.6449996829032898
loss in epoch 12 iteration 39: 0.6376529335975647
loss in epoch 12 iteration 40: 0.6292754411697388
loss in epoch 12 iteration 41: 0.633913516998291
loss in epoch 12 iteration 42: 0.650786280632019
loss in epoch 12 iteration 43: 0.6188055872917175
loss in epoch 12 iteration 44: 0.6722487807273865
loss in epoch 12 iteration 45: 0.628729522228241
loss in epoch 12 iteration 46: 0.6502054333686829
loss in epoch 13 iteration 0: 0.640186607837677
loss in epoch 13 iteration 1: 0.6462555527687073
loss in epoch 13 iteration 2: 0.6363643407821655
loss in epoch 13 iteration 3: 0.6585367918014526
loss in epoch 13 iteration 4: 0.6374184489250183
loss in epoch 13 iteration 5: 0.6614411473274231
loss in epoch 13 iteration 6: 0.622645378112793
loss in epoch 13 iteration 7: 0.6358985900878906
loss in epoch 13 iteration 8: 0.6475363969802856
loss in epoch 13 iteration 9: 0.6190412044525146
loss in epoch 13 iteration 10: 0.6263806819915771
loss in epoch 13 iteration 11: 0.6194764971733093
loss in epoch 13 iteration 12: 0.6464937925338745
loss in epoch 13 iteration 13: 0.6525228023529053
loss in epoch 13 iteration 14: 0.631266713142395
loss in epoch 13 iteration 15: 0.6200947165489197
loss in epoch 13 iteration 16: 0.6325142979621887
loss in epoch 13 iteration 17: 0.6071545481681824
loss in epoch 13 iteration 18: 0.6060729622840881
loss in epoch 13 iteration 19: 0.6334218382835388
loss in epoch 13 iteration 20: 0.5941237211227417
loss in epoch 13 iteration 21: 0.6289678812026978
loss in epoch 13 iteration 22: 0.6306320428848267
loss in epoch 13 iteration 23: 0.6159337759017944
loss in epoch 13 iteration 24: 0.6153011322021484
loss in epoch 13 iteration 25: 0.6225727200508118
loss in epoch 13 iteration 26: 0.6488649249076843
loss in epoch 13 iteration 27: 0.6305614709854126
loss in epoch 13 iteration 28: 0.6387674808502197
loss in epoch 13 iteration 29: 0.6418316960334778
loss in epoch 13 iteration 30: 0.6639432907104492
loss in epoch 13 iteration 31: 0.6089796423912048
loss in epoch 13 iteration 32: 0.6250000596046448
loss in epoch 13 iteration 33: 0.6203837394714355
loss in epoch 13 iteration 34: 0.6309550404548645
loss in epoch 13 iteration 35: 0.6507856249809265
loss in epoch 13 iteration 36: 0.6285032033920288
loss in epoch 13 iteration 37: 0.6161587238311768
loss in epoch 13 iteration 38: 0.6195883750915527
loss in epoch 13 iteration 39: 0.6378293633460999
loss in epoch 13 iteration 40: 0.6183638572692871
loss in epoch 13 iteration 41: 0.627479076385498
loss in epoch 13 iteration 42: 0.6332768797874451
loss in epoch 13 iteration 43: 0.6770268678665161
loss in epoch 13 iteration 44: 0.6290939450263977
loss in epoch 13 iteration 45: 0.6507452130317688
loss in epoch 13 iteration 46: 0.6302761435508728
loss in epoch 14 iteration 0: 0.6348589062690735
loss in epoch 14 iteration 1: 0.6609249711036682
loss in epoch 14 iteration 2: 0.6086927652359009
loss in epoch 14 iteration 3: 0.6438299417495728
loss in epoch 14 iteration 4: 0.6197487711906433
loss in epoch 14 iteration 5: 0.6372214555740356
loss in epoch 14 iteration 6: 0.6137073636054993
loss in epoch 14 iteration 7: 0.6500061750411987
loss in epoch 14 iteration 8: 0.6199630498886108
loss in epoch 14 iteration 9: 0.6236387491226196
loss in epoch 14 iteration 10: 0.6219751834869385
loss in epoch 14 iteration 11: 0.6378330588340759
loss in epoch 14 iteration 12: 0.6254082322120667
loss in epoch 14 iteration 13: 0.6263447999954224
loss in epoch 14 iteration 14: 0.6360633373260498
loss in epoch 14 iteration 15: 0.6497567296028137
loss in epoch 14 iteration 16: 0.6554396748542786
loss in epoch 14 iteration 17: 0.6383514404296875
loss in epoch 14 iteration 18: 0.646178662776947
loss in epoch 14 iteration 19: 0.6288718581199646
loss in epoch 14 iteration 20: 0.6346921324729919
loss in epoch 14 iteration 21: 0.6320574879646301
loss in epoch 14 iteration 22: 0.6291441917419434
loss in epoch 14 iteration 23: 0.6249653100967407
loss in epoch 14 iteration 24: 0.6382707953453064
loss in epoch 14 iteration 25: 0.6172035336494446
loss in epoch 14 iteration 26: 0.6012514233589172
loss in epoch 14 iteration 27: 0.6380502581596375
loss in epoch 14 iteration 28: 0.6220231056213379
loss in epoch 14 iteration 29: 0.6180908679962158
loss in epoch 14 iteration 30: 0.6247820854187012
loss in epoch 14 iteration 31: 0.6403063535690308
loss in epoch 14 iteration 32: 0.6178343296051025
loss in epoch 14 iteration 33: 0.6227642297744751
loss in epoch 14 iteration 34: 0.63360196352005
loss in epoch 14 iteration 35: 0.6223033666610718
loss in epoch 14 iteration 36: 0.626771092414856
loss in epoch 14 iteration 37: 0.6005940437316895
loss in epoch 14 iteration 38: 0.6327066421508789
loss in epoch 14 iteration 39: 0.6082711815834045
loss in epoch 14 iteration 40: 0.6348557472229004
loss in epoch 14 iteration 41: 0.5869054198265076
loss in epoch 14 iteration 42: 0.6235240697860718
loss in epoch 14 iteration 43: 0.6100212335586548
loss in epoch 14 iteration 44: 0.6353187561035156
loss in epoch 14 iteration 45: 0.6258335709571838
loss in epoch 14 iteration 46: 0.6019359827041626
loss in epoch 15 iteration 0: 0.6146093010902405
loss in epoch 15 iteration 1: 0.6112343072891235
loss in epoch 15 iteration 2: 0.6003054976463318
loss in epoch 15 iteration 3: 0.6100964546203613
loss in epoch 15 iteration 4: 0.6092837452888489
loss in epoch 15 iteration 5: 0.6168461441993713
loss in epoch 15 iteration 6: 0.635871410369873
loss in epoch 15 iteration 7: 0.5943331718444824
loss in epoch 15 iteration 8: 0.646007776260376
loss in epoch 15 iteration 9: 0.5928195714950562
loss in epoch 15 iteration 10: 0.6222984194755554
loss in epoch 15 iteration 11: 0.5858518481254578
loss in epoch 15 iteration 12: 0.6265729069709778
loss in epoch 15 iteration 13: 0.616509199142456
loss in epoch 15 iteration 14: 0.6048811674118042
loss in epoch 15 iteration 15: 0.5971311926841736
loss in epoch 15 iteration 16: 0.6407081484794617
loss in epoch 15 iteration 17: 0.6199561953544617
loss in epoch 15 iteration 18: 0.6230186820030212
loss in epoch 15 iteration 19: 0.6208046078681946
loss in epoch 15 iteration 20: 0.6302918791770935
loss in epoch 15 iteration 21: 0.5654901266098022
loss in epoch 15 iteration 22: 0.6385995745658875
loss in epoch 15 iteration 23: 0.6001989245414734
loss in epoch 15 iteration 24: 0.6511735320091248
loss in epoch 15 iteration 25: 0.6147096753120422
loss in epoch 15 iteration 26: 0.6108824610710144
loss in epoch 15 iteration 27: 0.6223503351211548
loss in epoch 15 iteration 28: 0.6145483255386353
loss in epoch 15 iteration 29: 0.606292188167572
loss in epoch 15 iteration 30: 0.5985914468765259
loss in epoch 15 iteration 31: 0.6159566044807434
loss in epoch 15 iteration 32: 0.5962575078010559
loss in epoch 15 iteration 33: 0.6322420239448547
loss in epoch 15 iteration 34: 0.5800032019615173
loss in epoch 15 iteration 35: 0.6173473596572876
loss in epoch 15 iteration 36: 0.6221277117729187
loss in epoch 15 iteration 37: 0.5968843102455139
loss in epoch 15 iteration 38: 0.5947990417480469
loss in epoch 15 iteration 39: 0.5986552834510803
loss in epoch 15 iteration 40: 0.5945162177085876
loss in epoch 15 iteration 41: 0.5768322944641113
loss in epoch 15 iteration 42: 0.6194610595703125
loss in epoch 15 iteration 43: 0.6136354207992554
loss in epoch 15 iteration 44: 0.6077913641929626
loss in epoch 15 iteration 45: 0.5979841947555542
loss in epoch 15 iteration 46: 0.6195306181907654
loss in epoch 16 iteration 0: 0.6293591260910034
loss in epoch 16 iteration 1: 0.6217397451400757
loss in epoch 16 iteration 2: 0.6399310231208801
loss in epoch 16 iteration 3: 0.6163089871406555
loss in epoch 16 iteration 4: 0.60809725522995
loss in epoch 16 iteration 5: 0.6069396138191223
loss in epoch 16 iteration 6: 0.6326643228530884
loss in epoch 16 iteration 7: 0.5963442325592041
loss in epoch 16 iteration 8: 0.62138432264328
loss in epoch 16 iteration 9: 0.6134137511253357
loss in epoch 16 iteration 10: 0.5756115317344666
loss in epoch 16 iteration 11: 0.6161321997642517
loss in epoch 16 iteration 12: 0.6295775175094604
loss in epoch 16 iteration 13: 0.5909584164619446
loss in epoch 16 iteration 14: 0.575158953666687
loss in epoch 16 iteration 15: 0.5978643298149109
loss in epoch 16 iteration 16: 0.6219720840454102
loss in epoch 16 iteration 17: 0.6147273182868958
loss in epoch 16 iteration 18: 0.6245135068893433
loss in epoch 16 iteration 19: 0.6592197418212891
loss in epoch 16 iteration 20: 0.571800947189331
loss in epoch 16 iteration 21: 0.600271999835968
loss in epoch 16 iteration 22: 0.600049614906311
loss in epoch 16 iteration 23: 0.6047764420509338
loss in epoch 16 iteration 24: 0.6192612648010254
loss in epoch 16 iteration 25: 0.5721785426139832
loss in epoch 16 iteration 26: 0.5896327495574951
loss in epoch 16 iteration 27: 0.5945552587509155
loss in epoch 16 iteration 28: 0.6062804460525513
loss in epoch 16 iteration 29: 0.5989258289337158
loss in epoch 16 iteration 30: 0.5728862285614014
loss in epoch 16 iteration 31: 0.6077942252159119
loss in epoch 16 iteration 32: 0.6052199602127075
loss in epoch 16 iteration 33: 0.6058734059333801
loss in epoch 16 iteration 34: 0.5992992520332336
loss in epoch 16 iteration 35: 0.5906915664672852
loss in epoch 16 iteration 36: 0.5749076008796692
loss in epoch 16 iteration 37: 0.5777125954627991
loss in epoch 16 iteration 38: 0.6368547677993774
loss in epoch 16 iteration 39: 0.5783889293670654
loss in epoch 16 iteration 40: 0.6203365325927734
loss in epoch 16 iteration 41: 0.6011037230491638
loss in epoch 16 iteration 42: 0.6062548756599426
loss in epoch 16 iteration 43: 0.5923464894294739
loss in epoch 16 iteration 44: 0.5893805027008057
loss in epoch 16 iteration 45: 0.5959845781326294
loss in epoch 16 iteration 46: 0.6077088713645935
loss in epoch 17 iteration 0: 0.6125790476799011
loss in epoch 17 iteration 1: 0.6125289797782898
loss in epoch 17 iteration 2: 0.6056845188140869
loss in epoch 17 iteration 3: 0.5655069351196289
loss in epoch 17 iteration 4: 0.5881325006484985
loss in epoch 17 iteration 5: 0.6023129224777222
loss in epoch 17 iteration 6: 0.5756743550300598
loss in epoch 17 iteration 7: 0.5698866844177246
loss in epoch 17 iteration 8: 0.5795510411262512
loss in epoch 17 iteration 9: 0.5718787908554077
loss in epoch 17 iteration 10: 0.6170450448989868
loss in epoch 17 iteration 11: 0.5836141109466553
loss in epoch 17 iteration 12: 0.5859125852584839
loss in epoch 17 iteration 13: 0.5875800251960754
loss in epoch 17 iteration 14: 0.5892443060874939
loss in epoch 17 iteration 15: 0.5920312404632568
loss in epoch 17 iteration 16: 0.6213474869728088
loss in epoch 17 iteration 17: 0.61112380027771
loss in epoch 17 iteration 18: 0.5913992524147034
loss in epoch 17 iteration 19: 0.5826565623283386
loss in epoch 17 iteration 20: 0.5838589072227478
loss in epoch 17 iteration 21: 0.581543505191803
loss in epoch 17 iteration 22: 0.5738905668258667
loss in epoch 17 iteration 23: 0.593722939491272
loss in epoch 17 iteration 24: 0.6177064180374146
loss in epoch 17 iteration 25: 0.6007301807403564
loss in epoch 17 iteration 26: 0.5772152543067932
loss in epoch 17 iteration 27: 0.6048364043235779
loss in epoch 17 iteration 28: 0.5863299369812012
loss in epoch 17 iteration 29: 0.6114875078201294
loss in epoch 17 iteration 30: 0.5953657031059265
loss in epoch 17 iteration 31: 0.6158477067947388
loss in epoch 17 iteration 32: 0.6265726089477539
loss in epoch 17 iteration 33: 0.623054027557373
loss in epoch 17 iteration 34: 0.5984569787979126
loss in epoch 17 iteration 35: 0.5469513535499573
loss in epoch 17 iteration 36: 0.6048927307128906
loss in epoch 17 iteration 37: 0.6278553009033203
loss in epoch 17 iteration 38: 0.580528199672699
loss in epoch 17 iteration 39: 0.5634322762489319
loss in epoch 17 iteration 40: 0.6432176232337952
loss in epoch 17 iteration 41: 0.5733962655067444
loss in epoch 17 iteration 42: 0.5611854791641235
loss in epoch 17 iteration 43: 0.5778064727783203
loss in epoch 17 iteration 44: 0.5772227644920349
loss in epoch 17 iteration 45: 0.6027219891548157
loss in epoch 17 iteration 46: 0.5943653583526611
loss in epoch 18 iteration 0: 0.574112594127655
loss in epoch 18 iteration 1: 0.592474102973938
loss in epoch 18 iteration 2: 0.6075072288513184
loss in epoch 18 iteration 3: 0.5930647253990173
loss in epoch 18 iteration 4: 0.5692285299301147
loss in epoch 18 iteration 5: 0.5955765843391418
loss in epoch 18 iteration 6: 0.5926230549812317
loss in epoch 18 iteration 7: 0.5764279961585999
loss in epoch 18 iteration 8: 0.5768054723739624
loss in epoch 18 iteration 9: 0.6010205149650574
loss in epoch 18 iteration 10: 0.6119867563247681
loss in epoch 18 iteration 11: 0.5896655917167664
loss in epoch 18 iteration 12: 0.5848978757858276
loss in epoch 18 iteration 13: 0.5590786933898926
loss in epoch 18 iteration 14: 0.5649043917655945
loss in epoch 18 iteration 15: 0.5925273895263672
loss in epoch 18 iteration 16: 0.5851421356201172
loss in epoch 18 iteration 17: 0.5582991242408752
loss in epoch 18 iteration 18: 0.5625852346420288
loss in epoch 18 iteration 19: 0.5710762739181519
loss in epoch 18 iteration 20: 0.595299243927002
loss in epoch 18 iteration 21: 0.5649773478507996
loss in epoch 18 iteration 22: 0.5891627669334412
loss in epoch 18 iteration 23: 0.5983882546424866
loss in epoch 18 iteration 24: 0.5841766595840454
loss in epoch 18 iteration 25: 0.5838983058929443
loss in epoch 18 iteration 26: 0.6015399694442749
loss in epoch 18 iteration 27: 0.5893350839614868
loss in epoch 18 iteration 28: 0.5796637535095215
loss in epoch 18 iteration 29: 0.5858866572380066
loss in epoch 18 iteration 30: 0.5866113305091858
loss in epoch 18 iteration 31: 0.5958020091056824
loss in epoch 18 iteration 32: 0.5798400640487671
loss in epoch 18 iteration 33: 0.588447093963623
loss in epoch 18 iteration 34: 0.5799934267997742
loss in epoch 18 iteration 35: 0.5914474725723267
loss in epoch 18 iteration 36: 0.5691731572151184
loss in epoch 18 iteration 37: 0.6050321459770203
loss in epoch 18 iteration 38: 0.5434069037437439
loss in epoch 18 iteration 39: 0.5959277749061584
loss in epoch 18 iteration 40: 0.5595537424087524
loss in epoch 18 iteration 41: 0.5678594708442688
loss in epoch 18 iteration 42: 0.5764535665512085
loss in epoch 18 iteration 43: 0.5774965882301331
loss in epoch 18 iteration 44: 0.581179141998291
loss in epoch 18 iteration 45: 0.5843695402145386
loss in epoch 18 iteration 46: 0.5675491094589233
loss in epoch 19 iteration 0: 0.5897762775421143
loss in epoch 19 iteration 1: 0.6042885184288025
loss in epoch 19 iteration 2: 0.5939893126487732
loss in epoch 19 iteration 3: 0.5873335599899292
loss in epoch 19 iteration 4: 0.5520483255386353
loss in epoch 19 iteration 5: 0.5370315313339233
loss in epoch 19 iteration 6: 0.5477423071861267
loss in epoch 19 iteration 7: 0.5744200944900513
loss in epoch 19 iteration 8: 0.556175708770752
loss in epoch 19 iteration 9: 0.5461294651031494
loss in epoch 19 iteration 10: 0.5722957849502563
loss in epoch 19 iteration 11: 0.573813259601593
loss in epoch 19 iteration 12: 0.5642279386520386
loss in epoch 19 iteration 13: 0.6099944710731506
loss in epoch 19 iteration 14: 0.6088768243789673
loss in epoch 19 iteration 15: 0.5752955079078674
loss in epoch 19 iteration 16: 0.5378250479698181
loss in epoch 19 iteration 17: 0.5943096876144409
loss in epoch 19 iteration 18: 0.5712675452232361
loss in epoch 19 iteration 19: 0.5889773964881897
loss in epoch 19 iteration 20: 0.578083872795105
loss in epoch 19 iteration 21: 0.5821210741996765
loss in epoch 19 iteration 22: 0.586696445941925
loss in epoch 19 iteration 23: 0.5349501967430115
loss in epoch 19 iteration 24: 0.5860986113548279
loss in epoch 19 iteration 25: 0.591693639755249
loss in epoch 19 iteration 26: 0.5881264805793762
loss in epoch 19 iteration 27: 0.5782442688941956
loss in epoch 19 iteration 28: 0.5466016530990601
loss in epoch 19 iteration 29: 0.5823491215705872
loss in epoch 19 iteration 30: 0.5787870287895203
loss in epoch 19 iteration 31: 0.5724577903747559
loss in epoch 19 iteration 32: 0.5767421722412109
loss in epoch 19 iteration 33: 0.5443856120109558
loss in epoch 19 iteration 34: 0.5665178298950195
loss in epoch 19 iteration 35: 0.5608080625534058
loss in epoch 19 iteration 36: 0.5598500967025757
loss in epoch 19 iteration 37: 0.6049149036407471
loss in epoch 19 iteration 38: 0.5850713849067688
loss in epoch 19 iteration 39: 0.591471254825592
loss in epoch 19 iteration 40: 0.5799699425697327
loss in epoch 19 iteration 41: 0.5543279647827148
loss in epoch 19 iteration 42: 0.559066116809845
loss in epoch 19 iteration 43: 0.5632895827293396
loss in epoch 19 iteration 44: 0.531680166721344
loss in epoch 19 iteration 45: 0.5457037091255188
loss in epoch 19 iteration 46: 0.5881417989730835
loss in epoch 20 iteration 0: 0.5753134489059448
loss in epoch 20 iteration 1: 0.571964681148529
loss in epoch 20 iteration 2: 0.620926022529602
loss in epoch 20 iteration 3: 0.5239773988723755
loss in epoch 20 iteration 4: 0.6040768027305603
loss in epoch 20 iteration 5: 0.6077039241790771
loss in epoch 20 iteration 6: 0.5674633979797363
loss in epoch 20 iteration 7: 0.5608850717544556
loss in epoch 20 iteration 8: 0.5674854516983032
loss in epoch 20 iteration 9: 0.5776790976524353
loss in epoch 20 iteration 10: 0.5745290517807007
loss in epoch 20 iteration 11: 0.5728681087493896
loss in epoch 20 iteration 12: 0.5750664472579956
loss in epoch 20 iteration 13: 0.5529995560646057
loss in epoch 20 iteration 14: 0.5824216604232788
loss in epoch 20 iteration 15: 0.5593878626823425
loss in epoch 20 iteration 16: 0.5654425024986267
loss in epoch 20 iteration 17: 0.5582238435745239
loss in epoch 20 iteration 18: 0.5656432509422302
loss in epoch 20 iteration 19: 0.5778374671936035
loss in epoch 20 iteration 20: 0.5796228051185608
loss in epoch 20 iteration 21: 0.5962057709693909
loss in epoch 20 iteration 22: 0.5592641830444336
loss in epoch 20 iteration 23: 0.5521957278251648
loss in epoch 20 iteration 24: 0.5820695757865906
loss in epoch 20 iteration 25: 0.5941590666770935
loss in epoch 20 iteration 26: 0.5494058728218079
loss in epoch 20 iteration 27: 0.5569325685501099
loss in epoch 20 iteration 28: 0.5590215921401978
loss in epoch 20 iteration 29: 0.5422708988189697
loss in epoch 20 iteration 30: 0.5622471570968628
loss in epoch 20 iteration 31: 0.5750249624252319
loss in epoch 20 iteration 32: 0.5578492879867554
loss in epoch 20 iteration 33: 0.5753007531166077
loss in epoch 20 iteration 34: 0.5682048797607422
loss in epoch 20 iteration 35: 0.5911899209022522
loss in epoch 20 iteration 36: 0.566971480846405
loss in epoch 20 iteration 37: 0.5695789456367493
loss in epoch 20 iteration 38: 0.5584194660186768
loss in epoch 20 iteration 39: 0.5650844573974609
loss in epoch 20 iteration 40: 0.5552970767021179
loss in epoch 20 iteration 41: 0.581143319606781
loss in epoch 20 iteration 42: 0.5350347757339478
loss in epoch 20 iteration 43: 0.5632908344268799
loss in epoch 20 iteration 44: 0.577900230884552
loss in epoch 20 iteration 45: 0.5393121242523193
loss in epoch 20 iteration 46: 0.5625767111778259
Evaluating........................................................................................................................epoch:20, time: 62.143919(s), valid (NDCG@10: 0.5411, HR@10: 0.7859), test (NDCG@10: 0.5134, HR@10: 0.7603)
loss in epoch 21 iteration 0: 0.539985716342926
loss in epoch 21 iteration 1: 0.5735242962837219
loss in epoch 21 iteration 2: 0.5638297200202942
loss in epoch 21 iteration 3: 0.582366943359375
loss in epoch 21 iteration 4: 0.5883145928382874
loss in epoch 21 iteration 5: 0.5614680051803589
loss in epoch 21 iteration 6: 0.553020715713501
loss in epoch 21 iteration 7: 0.5370014905929565
loss in epoch 21 iteration 8: 0.5941249132156372
loss in epoch 21 iteration 9: 0.5902681350708008
loss in epoch 21 iteration 10: 0.5625402331352234
loss in epoch 21 iteration 11: 0.5455617904663086
loss in epoch 21 iteration 12: 0.5662745237350464
loss in epoch 21 iteration 13: 0.5699634552001953
loss in epoch 21 iteration 14: 0.5637548565864563
loss in epoch 21 iteration 15: 0.5475408434867859
loss in epoch 21 iteration 16: 0.5721405148506165
loss in epoch 21 iteration 17: 0.5624912977218628
loss in epoch 21 iteration 18: 0.5421813130378723
loss in epoch 21 iteration 19: 0.5602112412452698
loss in epoch 21 iteration 20: 0.5685986876487732
loss in epoch 21 iteration 21: 0.5311540961265564
loss in epoch 21 iteration 22: 0.5555211305618286
loss in epoch 21 iteration 23: 0.5648155808448792
loss in epoch 21 iteration 24: 0.5251337289810181
loss in epoch 21 iteration 25: 0.554641604423523
loss in epoch 21 iteration 26: 0.5831382274627686
loss in epoch 21 iteration 27: 0.5153700113296509
loss in epoch 21 iteration 28: 0.5339467525482178
loss in epoch 21 iteration 29: 0.5878880620002747
loss in epoch 21 iteration 30: 0.5668247938156128
loss in epoch 21 iteration 31: 0.5530834794044495
loss in epoch 21 iteration 32: 0.5704892873764038
loss in epoch 21 iteration 33: 0.5826714634895325
loss in epoch 21 iteration 34: 0.5923284292221069
loss in epoch 21 iteration 35: 0.5480049252510071
loss in epoch 21 iteration 36: 0.5579029321670532
loss in epoch 21 iteration 37: 0.5353330373764038
loss in epoch 21 iteration 38: 0.5374653339385986
loss in epoch 21 iteration 39: 0.5708736777305603
loss in epoch 21 iteration 40: 0.5875089168548584
loss in epoch 21 iteration 41: 0.5800458192825317
loss in epoch 21 iteration 42: 0.5693562626838684
loss in epoch 21 iteration 43: 0.5689210295677185
loss in epoch 21 iteration 44: 0.556664228439331
loss in epoch 21 iteration 45: 0.543552041053772
loss in epoch 21 iteration 46: 0.5828896164894104
loss in epoch 22 iteration 0: 0.5263690948486328
loss in epoch 22 iteration 1: 0.5368265509605408
loss in epoch 22 iteration 2: 0.5652415156364441
loss in epoch 22 iteration 3: 0.5581584572792053
loss in epoch 22 iteration 4: 0.5479701161384583
loss in epoch 22 iteration 5: 0.5851730108261108