ZainRizvi changed the title from "Unstable CUDA signal in CI caused by cudnn 9 update" to "Rebase your PRs: Unstable CUDA signal in CI caused by cudnn 9 update" on Jun 7, 2024
Current Status
Mitigated. Remove cudnn failures on your PR by rebasing past https://hud2.pytorch.org/pytorch/pytorch/commit/54fe2d0e89e1d7c64c1fb2ab120e966a750aff4d
Error looks like
Failures on CI related to cudnn
Incident timeline (all times pacific)
Update Try 1 (reverted):
Jun 4, 2024, 8:55 AM PST - Builder PR Merged
Jun 4, 2024, 9:33 AM PST - Pytorch/Pytorch PR Merged
Jun 5, 2024, 1:59 AM PST - Pytorch/Pytorch PR Reverted
Jun 5, 2024, 4:24 AM PST - Builder PR Reverted
Update Try 2 (landed):
Jun 6, 2024, 11:11 PM PST - Builder PR Landed
Jun 6, 2024, 11:45 PM PST - Pytorch/Pytorch PR Merged
Jun 6, 2024, 2:43 PM PST - Followup fix for qlinear failure landed
User impact
How does this affect users of PyTorch CI?
Root cause
Update to cudnn 9
Mitigation
Mitigated, rebase past: https://hud2.pytorch.org/pytorch/pytorch/commit/54fe2d0e89e1d7c64c1fb2ab120e966a750aff4d
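If you are unsure whether your branch already includes that commit, a minimal sketch along these lines can check before rebasing. The helper names `has_commit` and `rebase_past_fix` are hypothetical, and the defaults (`origin` as the remote pointing at pytorch/pytorch, `main` as the branch) are assumptions about your local setup, so adjust them as needed.

```shell
# has_commit (hypothetical helper): succeeds if commit $1 is an ancestor of HEAD.
# Fails if the commit is missing from the branch or unknown to the local repo.
has_commit() {
  git merge-base --is-ancestor "$1" HEAD 2>/dev/null
}

# rebase_past_fix (hypothetical helper): rebase onto REMOTE/BRANCH only when
# the fix commit is not yet in the current branch.
#   $1 = fix commit SHA
#   $2 = remote name (assumed default: origin)
#   $3 = branch name (assumed default: main)
rebase_past_fix() {
  if has_commit "$1"; then
    echo "Branch already contains the fix; no rebase needed."
    return 0
  fi
  echo "Branch predates the fix; rebasing onto ${2:-origin}/${3:-main}."
  git fetch "${2:-origin}" "${3:-main}" && git rebase "${2:-origin}/${3:-main}"
}
```

From inside your checkout, this would be called as `rebase_past_fix 54fe2d0e89e1d7c64c1fb2ab120e966a750aff4d`, followed by a force push of your PR branch if a rebase happened.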
Prevention/followups
To prevent this in the future, we need to address this issue: pytorch/builder#1849