Training of human-like models #971

BigBoxxx · 2024-08-08T08:04:37Z

I believe it's possible to train models with different playing styles by using game records from human players with various styles. Can this human-model training also be implemented using train.py?

lightvector · 2024-08-08T12:48:02Z

Maybe. If you could find a way to reliably categorize the styles of different players across tens of thousands of games, then you could add some parameters to the human sgf metadata encoder to label the style and train it. But to start off, you would have to find a way to categorize styles for the games without having a model trained to do it, because you wouldn't have a model yet.
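To illustrate what "adding some parameters to label the style" could look like, here is a minimal sketch that appends a one-hot style label to an existing metadata feature vector. Everything in it is an assumption made for illustration: the style categories, the function names `encode_style` and `extend_metadata`, and the stand-in base features are hypothetical and are not taken from KataGo's actual metadata encoder.

```python
from typing import List

# Hypothetical style categories; a real project would need to define
# and validate its own labels across many games.
STYLES = ["territorial", "influence", "aggressive"]


def encode_style(style: str) -> List[float]:
    """One-hot encode a style label. Unknown styles map to all zeros."""
    return [1.0 if s == style else 0.0 for s in STYLES]


def extend_metadata(base_features: List[float], style: str) -> List[float]:
    """Append the style one-hot to an existing metadata feature vector."""
    return list(base_features) + encode_style(style)


if __name__ == "__main__":
    # Stand-in for whatever features the existing encoder already produces.
    base = [0.0, 1.0, 0.5]
    print(extend_metadata(base, "influence"))
```

The model's metadata input width would have to grow by `len(STYLES)` to accept the extra values, and every training game would need a style label assigned by some method that does not depend on the model being trained.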

mooy8899 · 2024-08-09T02:10:01Z

define "human-like"

BigBoxxx · 2024-08-09T04:54:38Z

@lightvector Thank you for your answer. We are trying to move in this direction.

Another question is, can we use your b18c384nbt-humanv0.bin.gz model and fine-tune it with some specific human game data? This would be to adapt it to specific player styles or differences in rating definitions across various regions.

jopdorp · 2024-08-13T17:33:18Z

It would also be great to make smaller models, for example b10 human style

BigBoxxx · 2024-08-15T06:12:42Z

@jopdorp Good idea~ so which part of the documentation can I refer to in order to implement the training process?🙏

jopdorp · 2024-08-23T12:31:36Z

@BigBoxxx I think there are two general approaches here. The first is to train a new supervised model from scratch, similar to how @lightvector trained the 18b humansl model. The second:

You could take the 18b human sl model, feed randomly generated inputs into both it and the 10b model to be trained, then update the parameters of the 10b model so that its outputs move closer to those of the 18b model. This way you would not need any game data, and the 10b model can get closer to the capabilities of the 18b model.

As a side note, this second approach could also be applied to make more powerful 10b and 20b normal models (non humansl)
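The second approach described above is usually called knowledge distillation. Below is a toy, pure-Python sketch of the idea, not KataGo code: a fixed two-layer "teacher" stands in for the 18b model, a smaller linear "student" stands in for the 10b model, and the student is updated on random inputs to match the teacher's output distribution. All names (`make_teacher`, `distill`, `mean_kl`), sizes, and hyperparameters are assumptions chosen for illustration. Whether purely random inputs are adequate for a real Go network is not something this sketch addresses.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    # weights holds one row of input weights per output unit.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def make_teacher(n_in, n_hidden, n_out, seed=42):
    """Fixed random two-layer network standing in for the larger model."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.gauss(0, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]

    def teacher(x):
        hidden = [math.tanh(h) for h in linear(w1, x)]
        return softmax(linear(w2, hidden))

    return teacher


def distill(teacher, n_in, n_out, steps=3000, lr=0.1, seed=0):
    """Train a linear-softmax student on random inputs to match the teacher."""
    rng = random.Random(seed)
    student = [[0.0] * n_in for _ in range(n_out)]
    for _ in range(steps):
        x = [rng.uniform(-1, 1) for _ in range(n_in)]
        target = teacher(x)
        pred = softmax(linear(student, x))
        # For cross-entropy against soft targets, d(loss)/d(logit_k) = pred_k - target_k.
        for k in range(n_out):
            grad = pred[k] - target[k]
            for j in range(n_in):
                student[k][j] -= lr * grad * x[j]
    return student


def mean_kl(teacher, student, n_in, samples=200, seed=1):
    """Average KL(teacher || student) over fresh random inputs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        x = [rng.uniform(-1, 1) for _ in range(n_in)]
        p = teacher(x)
        q = softmax(linear(student, x))
        total += sum(pi * math.log(pi / max(qi, 1e-12)) for pi, qi in zip(p, q) if pi > 0)
    return total / samples


if __name__ == "__main__":
    teacher = make_teacher(n_in=4, n_hidden=16, n_out=3)
    untrained = [[0.0] * 4 for _ in range(3)]
    trained = distill(teacher, n_in=4, n_out=3)
    print("KL before:", mean_kl(teacher, untrained, 4))
    print("KL after: ", mean_kl(teacher, trained, 4))
```

In a real setting the same loop would be written in the training framework the models already use, with the teacher's policy and value outputs as soft targets for the student.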

BigBoxxx · 2024-08-28T08:53:58Z

@jopdorp Thanks jopdorp, what documents can I refer to for training a new supervised model from scratch, similar to the 18b humansl model?

jopdorp · 2024-08-28T13:57:05Z

@BigBoxxx I think there is just the source code in the python directory
