This repository has been archived by the owner on Apr 10, 2024. It is now read-only.

Bot mainly generating blank text during and after training #23

Open
TheLittlePeace opened this issue Nov 16, 2023 · 1 comment

@TheLittlePeace

I have tried many different configurations and many different versions, all with the same result: the model continuously generates blank text. I don't know whether this is a problem with aitextgen itself or something on my end. I'm training on the Google Colab page.

Pip-installed versions (simply running !pip install aitextgen fails; this is the setup I got working with the most recent available versions):

!pip install -q git+https://github.com/scorixear/aitextgen.git
!pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118

Before training I also removed all non-ASCII characters from the training file, suspecting they might be the cause — that didn't help.
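For reference, the cleanup step can be sketched like this (a minimal sketch, not my exact script — it simply drops every character outside the ASCII range):

```python
def strip_non_ascii(text: str) -> str:
    # Encode to ASCII, silently dropping anything outside the range,
    # then decode back to a plain string.
    return text.encode("ascii", errors="ignore").decode("ascii")

sample = "smart “quotes” and café 🙂"
print(strip_non_ascii(sample))
```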

The training settings (essentially the defaults provided):

ai = aitextgen(tf_gpt2="124M", to_gpu=True)

ai.train(file_name,
         line_by_line=True,
         from_cache=False,
         num_steps=5000,
         generate_every=1000,
         save_every=1000,
         save_gdrive=True,
         learning_rate=1e-3,
         fp16=False,
         batch_size=1,
         )

The results I'm seeing during training (only two of the five sample texts come out non-blank):

A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
1,000 steps reached: generating sample texts.
==========


==========
Configuration saved in trained_model/generation_config.json
2,000 steps reached: saving model to /trained_model
Generate config GenerationConfig {
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.26.1"
}

A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
2,000 steps reached: generating sample texts.
==========
yoday

==========
Configuration saved in trained_model/generation_config.json
3,000 steps reached: saving model to /trained_model
Generate config GenerationConfig {
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.26.1"
}

A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
3,000 steps reached: generating sample texts.
==========
:gottem:

==========
Configuration saved in trained_model/generation_config.json
4,000 steps reached: saving model to /trained_model
Generate config GenerationConfig {
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.26.1"
}

A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
4,000 steps reached: generating sample texts.
==========

==========
Configuration saved in trained_model/generation_config.json
5,000 steps reached: saving model to /trained_model
Generate config GenerationConfig {
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.26.1"
}

A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
5,000 steps reached: generating sample texts.
==========


==========
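The repeated padding warning in the log is about where pad tokens land in a batch. The toy sketch below (pure Python, no transformers dependency — the `<pad>` marker and batch contents are made up for illustration) shows the difference the warning refers to: with right padding, pad tokens sit between a short prompt and whatever gets generated next, while with left padding the prompt ends the sequence and generation appends cleanly.

```python
PAD = "<pad>"

def pad_batch(prompts, side="right"):
    """Pad a batch of tokenized prompts to equal length on one side."""
    width = max(len(p) for p in prompts)
    out = []
    for p in prompts:
        pads = [PAD] * (width - len(p))
        # Right padding puts pads after the prompt; left padding before it.
        out.append(p + pads if side == "right" else pads + p)
    return out

batch = [["Hello"], ["Once", "upon", "a"]]
print(pad_batch(batch, side="right"))  # pads trail the short prompt
print(pad_batch(batch, side="left"))   # pads lead; the prompt ends the row
```

Whether this padding issue is actually what causes the blank samples here, I can't say — it may just be a benign warning alongside the real problem.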

This continues to happen when generating text from the trained model.

I'm pretty new to this sort of stuff, so is it something I'm doing wrong?

@theSoberSobber

Hey, same problem here — have you found a solution?
