Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modifying option for nvrtc #2926

Merged
merged 4 commits into from
Sep 10, 2024
Merged

Modifying option for nvrtc #2926

merged 4 commits into from
Sep 10, 2024

Conversation

jjsjann123
Copy link
Collaborator

@jjsjann123 jjsjann123 commented Sep 10, 2024

Adding NVFUSER_ENABLE=kernel_debug to enable debug option -G in nvrtc;
Moving DebugDumpOption::DebugInfo to EnableOption::KernelLineInfo.

@jacobhinkle
Copy link
Collaborator

Option naming

We already have -lineinfo which can be enabled using NVFUSER_DUMP=debug_info. That is similar to -G which also disables optimizations and is called NVFUSER_ENABLE=jit_debug in this PR. I would suggest we try to disambiguate these two. I think debug_info is not clear, while neither is the cmdline option -lineinfo unfortunately.

NVFUSER_DUMP=debug_info is mostly useful for profiling so we can see code locations with ncu. However, to add to the confusion we also have NVFUSER_ENABLE=kernel_profile which enables intra-kernel profiling markers, and NVFUSER_PROF=... which does CUPTI profiling of kernels as well as host latency.

So I think we have four related utilities that use three different env vars. One of the utilities is clearly for functional debugging (this PR), while the others are mostly for profiling and perf debugging. I think we should rename the NVFUSER_DUMP=debug_info option as NVFUSER_ENABLE=kernel_lineinfo (it affects the generated kernel so I think it should be an EnableOption instead of a dump option) and as a separate issue we might want to make sure NVFUSER_ENABLE=kernel_profile is not too confusing: e.g. the NVFUSER_PROF profiler uses a structure called KernelProfile.

@naoyam
Copy link
Collaborator

naoyam commented Sep 10, 2024

Here's the original PR to introduce debug_info: csarofeen/pytorch#1855

@naoyam
Copy link
Collaborator

naoyam commented Sep 10, 2024

I think we should rename the NVFUSER_DUMP=debug_info option as NVFUSER_ENABLE=kernel_lineinfo (it affects the generated kernel so I think it should be an EnableOption instead of a dump option)

+1

@jjsjann123
Copy link
Collaborator Author

funny enough, I was trying to use debug_info but saw it taken for the line info already....

Looks like you guys don't mind the renaming. I'll:

  1. change this flag to NVFUSER_ENABLE=kernel_debug`
  2. As @jacobhinkle suggested, change NVFUSER_DUMP=debug_info option as NVFUSER_ENABLE=kernel_lineinfo

@naoyam
Copy link
Collaborator

naoyam commented Sep 10, 2024

Do we need -G rather than just --lineinfo? The former also disables optimization.

@jjsjann123
Copy link
Collaborator Author

Do we need -G rather than just --lineinfo? The former also disables optimization.

Yes, those two are different.

--device-debug (-G)

Generate debug information. If '--dopt' is not specified, then turns off all optimizations.

--generate-line-info (-lineinfo)

Generate line-number information.

I verified with the generated PTX. We only see debug when -G is given to nvrtc.

.version 8.5
.target sm_80, debug

@jjsjann123
Copy link
Collaborator Author

But realistically, since this is only for @xwang233 's compute sanitizer stuff, is -lineinfo enough for that, or do we need debug mode?

@jjsjann123 jjsjann123 changed the title debug option for nvrtc Modifying option for nvrtc Sep 10, 2024
@naoyam
Copy link
Collaborator

naoyam commented Sep 10, 2024

Yes, that's what I'm asking about. For sanitizer, I think --lineinfo is sufficient. That said, -G may also be useful for actual debugging, so I'm fine to add that too.

@@ -162,6 +161,8 @@ std::unordered_map<EnableOption, std::vector<std::string>> Options<
{"static_fusion_count", EnableOption::StaticFusionCount},
{"warn_register_spill", EnableOption::WarnRegisterSpill},
{"io_to_lower_precision", EnableOption::IoToLowerPrecision},
{"kernel_debug", EnableOption::KernelDebug},
{"kernel_lineinfo", EnableOption::KernelLineInfo},
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tagging @zasdfgbnm , this is renamed from DebugDumpOption::DebugInfo

@@ -179,7 +179,7 @@ std::string FusionExecutor::getStructuredCode(
<< code << "\n======================================\n\n";
}
if (isDebugDumpEnabled(DebugDumpOption::CudaToFile) ||
isDebugDumpEnabled(DebugDumpOption::DebugInfo)) {
isOptionEnabled(EnableOption::KernelLineInfo)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me a little unexpected that an enable option dumps a file. Do we need this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree and I'm more than happy to remove this one. tagging @zasdfgbnm to see if there's a reason that we were dumping the cuda source in the first place.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a strong request (so approved the PR already), but my preference is to remove this line. I feel that's clearer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree and I'm more than happy to remove this one. tagging @zasdfgbnm to see if there's a reason that we were dumping the cuda source in the first place.

I think the original intention was indeed dumping some useful info for analyses using ncu. This can be done NVFUSER_ENABLE=kernel_lineinfo NVFUSER_DUMP=cuda_to_file. Now that we have (too) many options, I'd prefer each option as simple as possible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably there because the line info is pretty useless without the dumped kernel. If you only provide NVFUSER_DUMP=debug_info and not cuda_to_file, then ncu-ui will not show the source (it gives a file not found when you try to look at the source and asks where the file is).

Copy link
Collaborator Author

@jjsjann123 jjsjann123 Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably there because the line info is pretty useless without the dumped kernel. If you only provide NVFUSER_DUMP=debug_info and not cuda_to_file, then ncu-ui will not show the source (it gives a file not found when you try to look at the source and asks where the file is).

🤯 That's a real surprise... I though I've been using the lineinfo on my local machine when I copy over just the profile file without the actual cuda source.

I must have got something wrong... Let me try playing with it a bit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The discussion here covers all the information. I have nothing more to add, and I am OK with the change. Just a small request that we should document how to use these flags in our wiki page.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point. I'll remember to do that before merging this one.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

@jjsjann123 jjsjann123 Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, you don't have to manually copy the cuda source. you can opt in with --import-source 1 in ncu to import the source into the report, so you don't need to manually resolve that in ncu-ui.

But yeah, you still need to have the cuda source dumped out in the first place.

Copy link
Collaborator

@naoyam naoyam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the cleanup!

@jjsjann123 jjsjann123 merged commit 6d70792 into main Sep 10, 2024
5 checks passed
@jjsjann123 jjsjann123 deleted the jit_debug_option branch September 10, 2024 19:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants