Turning some logger.info into logger.debug and removing some logging overhead when not using debug #5211
Conversation
@@ -12,7 +12,8 @@
_loggers = set()
_log_level = NOTSET
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"
LOG_FORMAT = "%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s"
DEBUG_LOG_FORMAT = "%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s"
BTW, I think this was added by Anupam for log parsing (so he could get the time). Just checking in with @hvpeteet to make sure we aren't using it anywhere in cloud.
We don't parse logs in any meaningful way at the moment, so I think this should be fine. We essentially just record stdout and stderr and forward them on to the user. This may also get forwarded to Stackdriver, but each line would already have a timestamp provided upon upload.
Is there a way to provide --debug through the YAML? Just worried if users are depending on the current logs and want to restore their old functionality.
Yes! You can do this by adding debug: true in the YAML. For instance:
debug: true
behaviors:
  3DBall:
    trainer_type: ppo
    hyperparameters:
      batch_size: 64
      buffer_size: 12000
      learning_rate: 0.0003
      beta: 0.001
      epsilon: 0.2
      lambd: 0.99
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 128
      num_layers: 2
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    keep_checkpoints: 5
    max_steps: 500000
    time_horizon: 1000
    summary_freq: 12000
    threaded: true
This use case is not documented; I will add a line in the changelog.
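(For reference: the same behavior should also be available on the command line via the --debug flag mentioned above, e.g. mlagents-learn config.yaml --debug, with the run id and other options passed as usual.)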
Thanks
if log_level == DEBUG:
    formatter = logging.Formatter(fmt=DEBUG_LOG_FORMAT, datefmt=DATE_FORMAT)
for logger in _loggers:
Would this for loop be a good candidate for a separate method, e.g. def set_all_formatters?
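A minimal sketch of what such a helper could look like, assuming the module keeps the _loggers set shown in the diff above (the name set_all_formatters and the handler iteration are illustrative, not part of this PR):

    import logging

    _loggers: set = set()  # loggers tracked by the module, as in the diff above

    def set_all_formatters(formatter: logging.Formatter) -> None:
        # Apply the same formatter to every handler of every tracked logger.
        for logger in _loggers:
            for handler in logger.handlers:
                handler.setFormatter(formatter)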
@@ -104,7 +104,7 @@ def save_replay_buffer(self) -> None:
    Save the training buffer's update buffer to a pickle file.
    """
    filename = os.path.join(self.artifact_path, "last_replay_buffer.hdf5")
    logger.info(f"Saving Experience Replay Buffer to {filename}...")
    logger.debug(f"Saving Experience Replay Buffer to {filename}...")
I think we should keep the loading and saving ones at info, mainly because they can be very slow (tens of seconds) if the replay buffer is large, so the user won't think that the code got stuck.
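As a generic, self-contained sketch of that pattern (the save_buffer function and the use of pickle here are purely illustrative, not the trainer's actual implementation):

    import logging
    import pickle

    logger = logging.getLogger(__name__)

    def save_buffer(buffer, filename: str) -> None:
        # Log at info before and after a potentially slow save so the
        # user can tell the process is working rather than hung.
        logger.info(f"Saving Experience Replay Buffer to {filename}...")
        with open(filename, "wb") as f:
            pickle.dump(buffer, f)
        logger.info("Experience Replay Buffer saved.")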
ml-agents/mlagents/trainers/stats.py
f"{file_name} was left over from a previous run. Deleting." | ||
) | ||
full_fname = os.path.join(directory_name, file_name) | ||
try: | ||
os.remove(full_fname) | ||
except OSError: | ||
logger.warning( | ||
logger.debug( |
Maybe logger.error? This one shouldn't happen unless we try to delete a file that has write permissions blocked.
ml-agents/mlagents/trainers/stats.py
@@ -234,14 +234,14 @@ def _maybe_create_summary_writer(self, category: str) -> None:
def _delete_all_events_files(self, directory_name: str) -> None:
    for file_name in os.listdir(directory_name):
        if file_name.startswith("events.out"):
            logger.warning(
            logger.debug(
I'd keep this one as a warning so that users know when their old TensorBoard files are being overwritten.
🚢 🍝
Proposed change(s)
Clean up some of the console outputs of training. The behavior under --debug remains unchanged.
Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)
Types of change(s)
Checklist
Other comments