[Bug] Crash when mlflow is installed and train.report_to: true #6660

Open
1 task done
steveepreston opened this issue Jan 15, 2025 · 1 comment
Labels
bug Something isn't working pending This problem is yet to be addressed

Comments

@steveepreston
Contributor

steveepreston commented Jan 15, 2025

Reminder

  • I have read the above rules and searched the existing issues.

System Info

llamafactory: nightly
mlflow: 2.19.0

Reproduction

If mlflow is installed and "Enable external logger" is checked in the WebUI (equivalent to setting train.report_to: true in the config YAML), llamafactory throws this error: RuntimeError: cannot schedule new futures after shutdown

After uninstalling mlflow, llamafactory no longer throws this error.

Error Traceback:

Traceback (most recent call last):
  File "/kaggle/working/LLaMA-Factory/src/llamafactory/launcher.py", line 23, in <module>
    launch()
  File "/kaggle/working/LLaMA-Factory/src/llamafactory/launcher.py", line 19, in launch
    run_exp()
  File "/kaggle/working/LLaMA-Factory/src/llamafactory/train/tuner.py", line 92, in run_exp
    _training_function(config={"args": args, "callbacks": callbacks})
  File "/kaggle/working/LLaMA-Factory/src/llamafactory/train/tuner.py", line 66, in _training_function
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/kaggle/working/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 82, in run_sft
    trainer = CustomSeq2SeqTrainer(
  File "/kaggle/working/LLaMA-Factory/src/llamafactory/train/sft/trainer.py", line 59, in __init__
    super().__init__(**kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/deprecation.py", line 165, in wrapped_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer_seq2seq.py", line 72, in __init__
    super().__init__(
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/deprecation.py", line 165, in wrapped_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 698, in __init__
    logger.info(f"Using {args.half_precision_backend} half precision backend")
  File "/usr/lib/python3.10/logging/__init__.py", line 1477, in info
    self._log(INFO, msg, args, **kwargs)
  File "/usr/lib/python3.10/logging/__init__.py", line 1624, in _log
    self.handle(record)
  File "/usr/lib/python3.10/logging/__init__.py", line 1634, in handle
    self.callHandlers(record)
  File "/usr/lib/python3.10/logging/__init__.py", line 1696, in callHandlers
    hdlr.handle(record)
  File "/usr/lib/python3.10/logging/__init__.py", line 968, in handle
    self.emit(record)
  File "/kaggle/working/LLaMA-Factory/src/llamafactory/extras/logging.py", line 62, in emit
    self.thread_pool.submit(self._write_log, log_entry)
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 167, in submit
    raise RuntimeError('cannot schedule new futures after shutdown')
RuntimeError: cannot schedule new futures after shutdown
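
For readers unfamiliar with this error: the last two frames show a logging handler submitting work to a `ThreadPoolExecutor` that has already been shut down. The sketch below reproduces that failure mode in isolation. It is not LLaMA-Factory's code: `AsyncHandler`, `reproduce`, and the logger name are hypothetical, and only the `thread_pool.submit(self._write_log, log_entry)` call is modeled on the traceback. It does not explain why the pool gets shut down when mlflow is installed; the report does not establish that.

```python
import logging
from concurrent.futures import ThreadPoolExecutor


class AsyncHandler(logging.Handler):
    """Hypothetical handler that hands each formatted record to a worker thread."""

    def __init__(self):
        super().__init__()
        self.thread_pool = ThreadPoolExecutor(max_workers=1)
        self.lines = []

    def _write_log(self, log_entry):
        # Stand-in for writing the entry to a log file.
        self.lines.append(log_entry)

    def emit(self, record):
        # Same shape as the call in the traceback; raises once the pool is shut down.
        self.thread_pool.submit(self._write_log, self.format(record))

    def close(self):
        self.thread_pool.shutdown(wait=True)
        super().close()


def reproduce():
    """Log once, close the handler while it is still attached, then log again."""
    logger = logging.getLogger("repro_6660")  # assumed name, for illustration only
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = AsyncHandler()
    logger.addHandler(handler)
    logger.info("before shutdown")
    handler.close()  # the pool is now shut down, but the logger still holds the handler
    try:
        logger.info("after shutdown")
    except RuntimeError as exc:
        return str(exc)
    finally:
        logger.removeHandler(handler)
    return None


if __name__ == "__main__":
    print(reproduce())
```

If something closes the handler (or shuts down its pool) while it remains attached to a logger, the next log call fails in exactly this way, which matches the `logger.info(...)` call in `transformers/trainer.py` being the trigger in the traceback above.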
@steveepreston
Contributor Author

steveepreston commented Jan 15, 2025

The WebUI checkbox only sets train.report_to to true or false, so I also tried setting train.report_to: mlflow manually, and the error did not change.

@steveepreston steveepreston changed the title [Bug] Crash when mlflow installed and train.report_to: true [Bug] Crash when mlflow is installed and train.report_to: true Jan 15, 2025