[Bug] CheckpointHook: running val alone raises an error when save_best is set #1587
Open
Labels
bug
Prerequisite
Environment
All environments
Reproduces the problem - code sample
runner.val()
Reproduces the problem - command or script
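When I run validation on its own for debugging (a single call to runner.val(), with no training), the validation loop finishes and prints the metrics, but CheckpointHook then fails with a KeyError because it has no save_best history for the metric.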
Reproduces the problem - error message
10/20 12:19:20 - mmengine - INFO - Epoch(val) [0][124/125] eta: 0:00:00 time: 0.0564 data_time: 0.0008 memory: 482
10/20 12:19:20 - mmengine - INFO - Epoch(val) [0][125/125] eta: 0:00:00 time: 0.0561 data_time: 0.0007 memory: 482
10/20 12:19:20 - mmengine - INFO - Epoch(val) [0][125/125] DepthMetric/abs_rel: 0.7433 DepthMetric/sq_rel: 0.3333 DepthMetric/rmse: 0.4196 DepthMetric/rmse_log: 1.3231 DepthMetric/a1: 0.0834 DepthMetric/a2: 0.1743 DepthMetric/a3: 0.2699 data_time: 0.0170 time: 0.0793
Traceback (most recent call last):
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/runpy.py", line 198, in _run_module_as_main
    return _run_code(code, main_globals, None,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/runpy.py", line 88, in _run_code
    exec(code, run_globals)
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 71, in <module>
    cli.main()
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 501, in main
    run()
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 351, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 310, in run_path
    return _run_module_code(code, init_globals, run_name, pkg_name=pkg_name, script_name=fname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 127, in _run_module_code
    _run_code(code, mod_globals, init_globals, mod_name, mod_spec, pkg_name, script_name)
  File "/home/baihanlin/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 118, in _run_code
    exec(code, run_globals)
  File "/home/baihanlin/Project/UMDE/DreamDE/run_command.py", line 56, in <module>
    demo()
  File "/home/baihanlin/Project/UMDE/DreamDE/run_command.py", line 50, in demo
    coal_dump_mde()
  File "/home/baihanlin/Project/UMDE/DreamDE/run_command.py", line 35, in coal_dump_mde
    run_command_script(
  File "/home/baihanlin/Project/UMDE/DreamDE/tools/run_task.py", line 41, in run_command_script
    runner.val()
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/runner/runner.py", line 1800, in val
    metrics = self.val_loop.run() # type: ignore
              ^^^^^^^^^^^^^^^^^^^
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/runner/loops.py", line 377, in run
    self.runner.call_hook('after_val_epoch', metrics=metrics)
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/runner/runner.py", line 1839, in call_hook
    getattr(hook, fn_name)(self, **kwargs)
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/hooks/checkpoint_hook.py", line 361, in after_val_epoch
    self._save_best_checkpoint(runner, metrics)
  File "/home/baihanlin/miniconda3/envs/DreamDE/lib/python3.12/site-packages/mmengine/hooks/checkpoint_hook.py", line 514, in _save_best_checkpoint
    best_ckpt_path = self.best_ckpt_path_dict[key_indicator]
                     ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
KeyError: 'DepthMetric/abs_rel'
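The traceback ends in a plain dictionary lookup on best_ckpt_path_dict, so the failure pattern can be sketched without mmengine at all. The toy class below is a hypothetical stand-in, not the mmengine implementation: the name ToyCheckpointHook, the after_val_epoch_safe method, and the assumption that the per-metric entries are only filled in on the training path (before_train) are all illustrative guesses based on the error above.

```python
class ToyCheckpointHook:
    """Hypothetical stand-in for a checkpoint hook; NOT mmengine code."""

    def __init__(self, key_indicators):
        self.key_indicators = list(key_indicators)
        # Stays empty until before_train() runs.
        self.best_ckpt_path_dict = {}

    def before_train(self):
        # Assumption: per-metric state is only initialised when training starts.
        for key in self.key_indicators:
            self.best_ckpt_path_dict[key] = None

    def after_val_epoch(self, metrics):
        # Same shape as the failing line in the traceback: a bare dict lookup.
        return {k: self.best_ckpt_path_dict[k] for k in self.key_indicators}

    def after_val_epoch_safe(self, metrics):
        # One possible guard: treat a missing entry as "no best checkpoint yet".
        return {k: self.best_ckpt_path_dict.get(k) for k in self.key_indicators}


keys = ["DepthMetric/abs_rel", "DepthMetric/rmse"]
metrics = {"DepthMetric/abs_rel": 0.7433, "DepthMetric/rmse": 0.4196}

# Validation only: before_train() is never called.
hook = ToyCheckpointHook(keys)
try:
    hook.after_val_epoch(metrics)
    raised = False
    missing_key = None
except KeyError as err:
    raised = True
    missing_key = err.args[0]

safe_result = hook.after_val_epoch_safe(metrics)

# Training path: state is initialised first, so the bare lookup succeeds.
trained_hook = ToyCheckpointHook(keys)
trained_hook.before_train()
trained_result = trained_hook.after_val_epoch(metrics)
```

If the real hook behaves like this sketch, a fix could either initialise the per-metric entries before validation as well, or tolerate missing entries at lookup time; which of the two fits CheckpointHook better is for the maintainers to judge.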
Additional information
No response