Record comms input and output tensor information #1014

sanrise · 2024-11-14T02:05:48Z

Summary: Just copy the output from NCCL metadata about tensor information. Allows for easier analysis of kernel memory access patterns in downstream tools.

Differential Revision: D65785010

facebook-github-bot · 2024-11-14T02:06:03Z

This pull request was exported from Phabricator. Differential Revision: D65785010

sraikund16

Accepting to unblock; let's make sure to add a trace, though.

Summary: Just copy the output from NCCL metadata about tensor information. Allows for easier analysis of kernel memory access patterns in downstream tools. Reviewed By: sraikund16 Differential Revision: D65785010

facebook-github-bot · 2024-11-14T22:55:27Z

This pull request was exported from Phabricator. Differential Revision: D65785010

facebook-github-bot · 2024-11-14T23:56:31Z

This pull request was exported from Phabricator. Differential Revision: D65785010

facebook-github-bot · 2024-11-15T00:02:20Z

This pull request was exported from Phabricator. Differential Revision: D65785010

facebook-github-bot · 2024-11-15T00:05:03Z

This pull request was exported from Phabricator. Differential Revision: D65785010

facebook-github-bot · 2024-11-15T19:12:24Z

This pull request was exported from Phabricator. Differential Revision: D65785010

facebook-github-bot · 2024-11-15T20:14:22Z

This pull request has been merged in d9673d2.

facebook-github-bot · 2024-11-25T21:44:11Z

This pull request has been reverted by 158d409.

Summary: Pull Request resolved: #1017. Reverts the rollback D66458621 (reverts the Kineto rollback). The rollback would only have partially solved the issue, since this part controls transmission of the metadata to the corresponding kernel. The real issue is record_param_comms in PyTorch, which was still recording this metadata and would still produce an invalid trace JSON when working with more than 30 GPUs (our truncation case). Reviewed By: sraikund16 Differential Revision: D66475394 fbshipit-source-id: 5b781ccb27fa898a1a6496c72733f72fd31c822e
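
The thread does not show the exact failure, so the following is only a hypothetical sketch of how a truncated list can produce an invalid trace JSON. It assumes the metadata is rendered as text, cut off after 30 entries, and spliced unquoted into the JSON output. The event name, the `ranks` field, the limit constant, and all helper functions are illustrative assumptions, not the Kineto or PyTorch implementation.

```python
import json

# Hypothetical truncation limit, chosen only to mirror the "more than 30 GPUs" case.
MAX_ITEMS = 30


def truncated_list_repr(values, limit=MAX_ITEMS):
    """Render a list as text; past `limit` items, cut it off and append '...'."""
    if len(values) <= limit:
        return "[" + ", ".join(str(v) for v in values) + "]"
    # Assumption: the truncated form ends with an ellipsis and no closing bracket.
    return "[" + ", ".join(str(v) for v in values[:limit]) + ", ..."


def event_json_raw(ranks):
    """Splice the rendered list directly into the JSON text (the unsafe pattern)."""
    return (
        '{"name": "nccl:all_reduce", "args": {"ranks": '
        + truncated_list_repr(ranks)
        + "}}"
    )


def event_json_safe(ranks):
    """Emit the rendered list as a JSON string, so truncation cannot break parsing."""
    return json.dumps(
        {"name": "nccl:all_reduce", "args": {"ranks": truncated_list_repr(ranks)}}
    )


def is_valid_json(text):
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False
```

Under these assumptions, `is_valid_json(event_json_raw(list(range(8))))` holds, while the same call with 32 ranks fails to parse because the spliced text ends in `...` without a closing bracket. Quoting the value, as in `event_json_safe`, keeps the document parseable regardless of truncation.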

facebook-github-bot added the cla signed label Nov 14, 2024

facebook-github-bot added the fb-exported label Nov 14, 2024

sanrise requested review from briancoutinho and sraikund16 November 14, 2024 02:06

sraikund16 approved these changes Nov 14, 2024

sanrise force-pushed the export-D65785010 branch from 7d78560 to 40b9db2 Compare November 14, 2024 22:55

sanrise force-pushed the export-D65785010 branch from 40b9db2 to 1539c9a Compare November 14, 2024 23:56

sanrise force-pushed the export-D65785010 branch from 1539c9a to 5d83c43 Compare November 15, 2024 00:02

sanrise force-pushed the export-D65785010 branch from 5d83c43 to fa2d485 Compare November 15, 2024 00:04

Record comms input and output tensor information (pytorch#1014)

50ae474

Summary: Just copy the output from NCCL metadata about tensor information. Allows for easier analysis of kernel memory access patterns in downstream tools. Reviewed By: sraikund16 Differential Revision: D65785010

sanrise force-pushed the export-D65785010 branch from fa2d485 to 50ae474 Compare November 15, 2024 19:12

facebook-github-bot closed this in d9673d2 Nov 15, 2024

facebook-github-bot added the Merged label Nov 15, 2024

facebook-github-bot added the Reverted label Nov 25, 2024
