Support nsys profiler upload in all cases #911

Open
gobbleturk opened this issue Sep 24, 2024 · 0 comments

gobbleturk (Collaborator) commented Sep 24, 2024

For both jax.profiler (profiler=xplane in MaxText) and the GPU nsys profiler (profiler=nsys in MaxText), we upload the profile to the base_output_directory (source).

Typically this directory is on GCS, but it can also be local. However, for the nsys profiler we hardcode the uploader to use gsutil (source), which causes two problems:

  1. Output directory may not be GCS, so gsutil is not applicable
  2. Hosts may not have gsutil installed, since gsutil is not in requirements.txt

We should modify the nsys profile upload to work in all cases.
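
One possible shape for such an uploader is sketched below. This is not MaxText's actual code: `upload_profile` and `split_gcs_uri` are hypothetical names, and the fallback to the `google-cloud-storage` client assumes that package would be added as a dependency. The idea is to pick the copy method from the destination: a plain file copy for local directories, gsutil when it is on the PATH, and otherwise the Python GCS client, skipping the upload (as #909 does) if neither is available.

```python
import os
import shutil
import subprocess


def split_gcs_uri(uri):
    """Split 'gs://bucket/some/path' into ('bucket', 'some/path')."""
    bucket, _, blob_path = uri[len("gs://"):].partition("/")
    return bucket, blob_path


def upload_profile(local_file, output_dir):
    """Copy local_file into output_dir. Returns the destination, or None if skipped."""
    filename = os.path.basename(local_file)

    if not output_dir.startswith("gs://"):
        # Problem 1: the output directory is local, so gsutil does not apply.
        os.makedirs(output_dir, exist_ok=True)
        return shutil.copy(local_file, os.path.join(output_dir, filename))

    destination = output_dir.rstrip("/") + "/" + filename
    if shutil.which("gsutil"):
        # gsutil is installed on this host, so keep using it.
        subprocess.run(["gsutil", "cp", local_file, destination], check=True)
        return destination

    # Problem 2: gsutil is missing. Try the Python client instead
    # (assumed optional dependency: google-cloud-storage).
    try:
        from google.cloud import storage
    except ImportError:
        # Same behavior as the temporary fix: skip so training can continue.
        print(f"No GCS client available; skipping upload of {local_file}")
        return None

    bucket_name, blob_path = split_gcs_uri(destination)
    storage.Client().bucket(bucket_name).blob(blob_path).upload_from_filename(local_file)
    return destination
```

Branching on the `gs://` prefix keeps the local case free of any cloud tooling, and importing the GCS client lazily means hosts without it still run training.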

Additional context: #909 was added as a temporary fix for problem 2. When gsutil is missing, we skip the profile upload so that training can continue.

gobbleturk added the bug and good first issue labels Sep 24, 2024
hengtaoguo self-assigned this Oct 1, 2024