Still having a problem when starting training #905

Closed
kkklo opened this issue Jun 4, 2023 · 2 comments

Comments


kkklo commented Jun 4, 2023

[Dataset 0]
loading image sizes.
100%|███████████████████████████████████████████████████████████████████████████████| 141/141 [00:00<00:00, 250.73it/s]
prepare dataset
preparing accelerator
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ F:\Stable Diffusion\kohya_ss\train_network.py:814 in │
│ │
│ 811 │ args = parser.parse_args() │
│ 812 │ args = train_util.read_config_from_file(args, parser) │
│ 813 │ │
│ ❱ 814 │ train(args) │
│ 815 │
│ │
│ F:\Stable Diffusion\kohya_ss\train_network.py:139 in train │
│ │
│ 136 │ │
│ 137 │ # prepare the accelerator │
│ 138 │ print("preparing accelerator") │
│ ❱ 139 │ accelerator, unwrap_model = train_util.prepare_accelerator(args) │
│ 140 │ is_main_process = accelerator.is_main_process │
│ 141 │ │
│ 142 │ # prepare dtypes matching the mixed precision setting and cast as needed │
│ │
│ F:\Stable Diffusion\kohya_ss\library\train_util.py:2975 in prepare_accelerator │
│ │
│ 2972 │ │ │ if args.wandb_api_key is not None: │
│ 2973 │ │ │ │ wandb.login(key=args.wandb_api_key) │
│ 2974 │ │
│ ❱ 2975 │ accelerator = Accelerator( │
│ 2976 │ │ gradient_accumulation_steps=args.gradient_accumulation_steps, │
│ 2977 │ │ mixed_precision=args.mixed_precision, │
│ 2978 │ │ log_with=log_with, │
│ │
│ F:\Stable Diffusion\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py:370 in __init__ │
│ │
│ 367 │ │ ): │
│ 368 │ │ │ self.native_amp = is_bf16_available(True) │
│ 369 │ │ │ if mixed_precision == "bf16" and not self.native_amp and not is_tpu_availabl │
│ ❱ 370 │ │ │ │ raise ValueError(err.format(mode="bf16", requirement="PyTorch >= 1.10 an │
│ 371 │ │ │ │
│ 372 │ │ │ # Only on the GPU do we care about scaling the gradients │
│ 373 │ │ │ if torch.cuda.is_available() and self.device.type != "cpu": │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: bf16 mixed precision requires PyTorch >= 1.10 and a supported device.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\kira\AppData\Local\Programs\Python\Python310\lib\runpy.py:196 in _run_module_as_main │
│ │
│ 193 │ main_globals = sys.modules["__main__"].__dict__ │
│ 194 │ if alter_argv: │
│ 195 │ │ sys.argv[0] = mod_spec.origin │
│ ❱ 196 │ return _run_code(code, main_globals, None, │
│ 197 │ │ │ │ │ "__main__", mod_spec) │
│ 198 │
│ 199 def run_module(mod_name, init_globals=None, │
│ │
│ C:\Users\kira\AppData\Local\Programs\Python\Python310\lib\runpy.py:86 in _run_code │
│ │
│ 83 │ │ │ │ │ loader = loader, │
│ 84 │ │ │ │ │ package = pkg_name, │
│ 85 │ │ │ │ │ spec = mod_spec) │
│ ❱ 86 │ exec(code, run_globals) │
│ 87 │ return run_globals │
│ 88 │
│ 89 def _run_module_code(code, init_globals=None, │
│ │
│ in <module>:7 │
│ │
│ 4 from accelerate.commands.accelerate_cli import main │
│ 5 if __name__ == '__main__': │
│ 6 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 7 │ sys.exit(main()) │
│ 8 │
│ │
│ F:\Stable Diffusion\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py:45 in │
│ main │
│ │
│ 42 │ │ exit(1) │
│ 43 │ │
│ 44 │ # Run │
│ ❱ 45 │ args.func(args) │
│ 46 │
│ 47 │
│ 48 if __name__ == "__main__": │
│ │
│ F:\Stable Diffusion\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py:1104 in │
│ launch_command │
│ │
│ 1101 │ elif defaults is not None and defaults.compute_environment == ComputeEnvironment.AMA │
│ 1102 │ │ sagemaker_launcher(defaults, args) │
│ 1103 │ else: │
│ ❱ 1104 │ │ simple_launcher(args) │
│ 1105 │
│ 1106 │
│ 1107 def main(): │
│ │
│ F:\Stable Diffusion\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py:567 in │
│ simple_launcher │
│ │
│ 564 │ process = subprocess.Popen(cmd, env=current_env) │
│ 565 │ process.wait() │
│ 566 │ if process.returncode != 0: │
│ ❱ 567 │ │ raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) │
│ 568 │
│ 569 │
│ 570 def multi_gpu_launcher(args): │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['F:\Stable Diffusion\kohya_ss\venv\Scripts\python.exe', 'train_network.py',
'--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=F:/Stable Diffusion/aleng/image',
'--resolution=768,768', '--output_dir=F:/Stable Diffusion/aleng/model', '--logging_dir=F:/Stable Diffusion/aleng/log',
'--network_alpha=128', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-05',
'--unet_lr=0.0001', '--network_dim=128', '--output_name=aleng', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001','--lr_scheduler=constant', '--train_batch_size=2', '--max_train_steps=7050', '--save_every_n_epochs=1',
'--mixed_precision=bf16', '--save_precision=bf16', '--seed=1234', '--caption_extension=.txt', '--cache_latents',
'--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=1', '--clip_skip=2', '--bucket_reso_steps=64', '--xformers',
'--bucket_no_upscale']' returned non-zero exit status 1.
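
The root cause is the first error: Accelerate rejects `--mixed_precision=bf16` because it could not confirm bf16 support. That happens when the installed PyTorch is older than 1.10 (or is a CPU-only build), or when the GPU itself has no bfloat16 support (on NVIDIA that generally means anything older than the Ampere / RTX 30xx generation). A minimal diagnostic sketch, run inside the same venv, shows which condition is failing; `is_bf16_available` is the same helper Accelerate calls in the traceback above, the rest is standard PyTorch introspection:

```python
# Minimal diagnostic sketch (not part of kohya_ss): run inside the training venv
# to see why Accelerate rejects --mixed_precision=bf16.
import torch
from accelerate.utils import is_bf16_available  # helper used by Accelerator in the traceback

print("PyTorch version:", torch.__version__)          # bf16 autocast needs PyTorch >= 1.10
print("CUDA available: ", torch.cuda.is_available())  # a CPU-only build fails the device check
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("GPU supports bf16:", torch.cuda.is_bf16_supported())  # typically Ampere (RTX 30xx) or newer
print("Accelerate reports bf16 available:", is_bf16_available())
```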

@sumire608
Copy link

ValueError: bf16 mixed precision requires PyTorch >= 1.10 and a supported device.
Run setup.bat again.
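
If reinstalling the dependencies does not help because the GPU simply cannot do bf16, the usual workaround is to fall back to fp16 (or disable mixed precision). Below is a hedged sketch of how one could pick a workable mode before launching; `pick_mixed_precision` is a made-up helper for illustration, and the actual fix is just editing the `--mixed_precision` / `--save_precision` flags in the launch command or rerunning `accelerate config`:

```python
# Illustration only: choose a mixed-precision mode the current machine can run,
# then pass the result as --mixed_precision (and --save_precision) to train_network.py.
import torch

def pick_mixed_precision() -> str:
    if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
        return "bf16"  # recent PyTorch on an Ampere-or-newer GPU
    if torch.cuda.is_available():
        return "fp16"  # older CUDA GPUs: fp16 autocast still works
    return "no"        # CPU-only: disable mixed precision entirely

if __name__ == "__main__":
    print(pick_mixed_precision())
```

For the command in the traceback above, that means replacing `--mixed_precision=bf16` and `--save_precision=bf16` with `fp16`, or answering `fp16` / `no` at the mixed precision prompt of `accelerate config`.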

bmaltais (Owner) commented Jun 4, 2023

I am working on a new setup script that should improve future upgrades and help fix issues. I have to move away from the dumb bat file solution to Python for this task. Almost done.

bmaltais pushed a commit that referenced this issue Feb 18, 2024
* Add get_my_logger()

* Use logger instead of print

* Fix log level

* Removed line-breaks for readability

* Use setup_logging()

* Add rich to requirements.txt

* Make simple

* Use logger instead of print

---------

Co-authored-by: Kohya S <52813779+kohya-ss@users.noreply.github.com>
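
For context, the referenced commit replaces `print()` calls with Python's `logging` module and adds `rich` for nicer console output. A rough sketch of that pattern, assuming a `setup_logging()` helper as named in the commit message (the body below is an illustration, not the actual kohya_ss implementation):

```python
# Rough sketch of the print -> logger pattern described by the commit messages above.
# setup_logging() is named in the commit; this body is an assumed example.
import logging

def setup_logging(level: int = logging.INFO) -> None:
    try:
        from rich.logging import RichHandler  # rich was added to requirements.txt
        handler: logging.Handler = RichHandler()
    except ImportError:
        handler = logging.StreamHandler()
    logging.basicConfig(level=level, format="%(message)s", handlers=[handler])

setup_logging()
logger = logging.getLogger(__name__)
logger.info("preparing accelerator")  # instead of print("preparing accelerator")
```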