There are important PyTorch-specific markers generated during the training loop, either via NVTX or by other means from PyTorch Lightning.

It would be useful to show a timeline view with execution times for the different PyTorch training sections of the code, as shown in the attached image. The current trace view is quite detailed and goes one step further down, into individual CUDA kernels and so on. A PyTorch-execution-specific trace view that is easy to understand would make it more intuitive to find synchronization or communication bottlenecks in the training loop, and to compare execution times across the different regions of the training/validation code (forward, loss, data loading, etc.).
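For context, here is a minimal sketch of the kind of per-region annotation being asked about. In real PyTorch code these regions would be marked with `torch.cuda.nvtx.range_push`/`range_pop` or `torch.profiler.record_function`; the plain-Python `region` context manager and `timings` dict below are hypothetical stand-ins so the example runs without PyTorch, and the simulated "forward"/"loss"/"data-loading" work is illustrative only.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock time per named region (stand-in for trace spans).
timings = {}

@contextmanager
def region(name):
    # Mimics an NVTX range push/pop pair around a training-loop phase.
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = timings.get(name, 0.0) + time.perf_counter() - start

# One simulated training step with the regions mentioned in the issue.
with region("data-loading"):
    batch = list(range(1000))
with region("forward"):
    out = sum(batch)
with region("loss"):
    loss = out / len(batch)

print(timings)  # e.g. {'data-loading': ..., 'forward': ..., 'loss': ...}
```

A trace view built from such region markers (rather than individual CUDA kernels) would directly support the per-phase comparison described above.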