GPT2-Mongolian

Mongolian GPT-2 model. Validation perplexity loss is 1.53.

Training data: 600 MB, crawled from mn.wikipedia.com, ikon.mn, and dnn.mn.

Sample