Logging update: added PID and formatting #276

Merged (4 commits) on Jul 16, 2023

theobjectivedad
Contributor

@theobjectivedad theobjectivedad commented Jul 14, 2023

This logging enhancement adds the PID, package name, and line number to each log message, e.g.:

[2023-07-14 12:35:12,981] [INFO] [axolotl.scripts.train:218] [PID:19] loading tokenizer... /models/llama-7b-hf
[2023-07-14 12:35:12,987] [INFO] [axolotl.scripts.train:218] [PID:18] loading tokenizer... /models/llama-7b-hf
[2023-07-14 12:35:12,988] [INFO] [axolotl.scripts.train:218] [PID:17] loading tokenizer... /models/llama-7b-hf
...
[2023-07-14 12:35:33,253] [INFO] [torch.distributed.distributed_c10d._store_based_barrier:432] [PID:18] Added key: store_based_barrier_key:2 to store for rank: 1
[2023-07-14 12:35:33,276] [INFO] [torch.distributed.distributed_c10d._store_based_barrier:432] [PID:17] Added key: store_based_barrier_key:2 to store for rank: 0
[2023-07-14 12:35:33,276] [INFO] [torch.distributed.distributed_c10d._store_based_barrier:432] [PID:19] Added key: store_based_barrier_key:2 to store for rank: 2
[2023-07-14 12:35:33,277] [INFO] [torch.distributed.distributed_c10d._store_based_barrier:466] [PID:19] Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 3 nodes.
[2023-07-14 12:35:33,283] [INFO] [torch.distributed.distributed_c10d._store_based_barrier:466] [PID:18] Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 3 nodes.
[2023-07-14 12:35:33,286] [INFO] [torch.distributed.distributed_c10d._store_based_barrier:466] [PID:17] Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 3 nodes.
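
For anyone wanting output in this shape, here is a minimal sketch using only Python's standard `logging` module. The format string is an assumption inferred from the sample lines above, not copied from this PR's diff, and the logger name `demo.scripts` plus the helper names are hypothetical.

```python
import io
import logging
import os

# Assumed format, inferred from the sample output:
# [timestamp] [LEVEL] [logger.function:line] [PID:n] message
LOG_FORMAT = (
    "[%(asctime)s] [%(levelname)s] "
    "[%(name)s.%(funcName)s:%(lineno)d] [PID:%(process)d] %(message)s"
)


def build_logger(name, stream):
    """Return a logger that writes LOG_FORMAT-formatted lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.handlers = [handler]  # replace any handlers from earlier calls
    return logger


def load_tokenizer_demo(logger, path):
    # The function name and line number of this call end up in the prefix.
    logger.info("loading tokenizer... %s", path)


buffer = io.StringIO()
demo_logger = build_logger("demo.scripts", buffer)  # hypothetical logger name
load_tokenizer_demo(demo_logger, "/models/llama-7b-hf")

line = buffer.getvalue().strip()
expected_pid_tag = f"[PID:{os.getpid()}]"  # each worker process reports its own PID
print(line)
```

In a multi-GPU run each worker is a separate process, so `%(process)d` is what makes otherwise identical lines distinguishable.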

Background:

I'm working on potential DeepSpeed + additional logging/visibility updates and find it incredibly useful to see which GPU process and exactly which module path/line number triggered a logging event.

Collaborator

@winglian winglian left a comment

lgtm

@winglian
Collaborator

@theobjectivedad would you mind fixing a few things for the linter please? thanks!

pre-commit run --all-files

@winglian
Collaborator

also, lmk if you want any help with the failing unit tests https://github.com/OpenAccess-AI-Collective/axolotl/actions/runs/5554165505/jobs/10144233892?pr=276

the unit tests include checks on the config validation, so the output messages have likely changed (or are now generators, it seems). It might be easiest not to use the logging module for the validation sub-module.
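
If the validation code does stay on the logging module, another option might be to assert on captured log records instead of formatted output, so a changed prefix does not break the checks. A rough sketch with `unittest`'s `assertLogs`; `validate_config`, the `demo.validation` logger name, and the message text are hypothetical stand-ins, not Axolotl's actual code.

```python
import logging
import unittest

LOG = logging.getLogger("demo.validation")  # hypothetical logger name


def validate_config(cfg):
    """Hypothetical validator: warns on a deprecated key instead of raising."""
    if "old_option" in cfg:
        LOG.warning("old_option is deprecated")
    return cfg


class ValidationLogTest(unittest.TestCase):
    def test_warns_on_deprecated_key(self):
        # assertLogs captures records regardless of the configured formatter.
        with self.assertLogs("demo.validation", level="WARNING") as captured:
            validate_config({"old_option": True})
        self.assertEqual(
            captured.records[0].getMessage(), "old_option is deprecated"
        )


def run_demo_tests():
    """Run the test case above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ValidationLogTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Checking `record.getMessage()` keeps the test independent of timestamps, PIDs, and line numbers in the formatted line.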

@theobjectivedad
Contributor Author

Ty @winglian, I'll have some time to take a crack at it this Sunday. I'm planning to use Axolotl for my research projects, so I need to have a proper dev environment set up anyway.

@theobjectivedad
Contributor Author

Good morning @winglian, please see my last commit ... everything appears to be passing now locally:

=== 30 passed in 3.62s ===
check yaml...............................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
black....................................................................Passed
isort....................................................................Passed
flake8...................................................................Passed
pylint...................................................................Passed
mypy.....................................................................Passed
bandit...................................................................Passed

@winglian winglian merged commit 6f16c45 into axolotl-ai-cloud:main Jul 16, 2023
mkeoliya pushed a commit to mkeoliya/axolotl that referenced this pull request Dec 15, 2023
…enhancement

Logging update: added PID and formatting
djsaunde pushed a commit that referenced this pull request Dec 17, 2024
Logging update: added PID and formatting