Commit

Add cudnn related synchronization (pause and ping Meta)
nWEIdia committed Jun 28, 2024
1 parent 15bef6a commit ad4bf4e
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions CUDA_UPGRADE_GUIDE.MD
@@ -151,3 +151,4 @@ If you require to update CUDNN version for already existing CUDA version, please
1. Builder PR: https://github.com/pytorch/builder/pull/1271. Important note: the Builder PR and the PyTorch PR need to be validated and landed together to avoid breaking CI and nightly builds!
2. Add the new cuDNN version to the Windows AMI: https://github.com/pytorch/test-infra/pull/1523. Rebuild and retest the AMI. Follow step 6, "Generate new Windows AMI, test and deploy to canary and prod".
3. Create the PyTorch PR: https://github.com/pytorch/pytorch/pull/93086 and the small wheel update PyTorch PR: https://github.com/pytorch/pytorch/pull/104757
4. Note: while getting the PyTorch PR (e.g. the cuDNN v8 to cuDNN v9 upgrade) to pass all tests, you may need to pause and contact Meta for help with: 1) uploading docker tags, to resolve errors like `the repository with name 'pytorch/pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9' does not exist in the registry with id '308535385114'`; 2) uploading new dependency wheels (e.g. nvidia-cudnn-cu12), to resolve errors like `ERROR: Could not find a version that satisfies the requirement nvidia-cudnn-cu12==9.1.0.70; platform_system == "Linux" and platform_machine == "x86_64" (from torch) (from versions: 8.8.1.3, 8.9.2.26, 8.9.7.29)`.
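
The wheel error quoted in step 4 can be triaged mechanically before escalating. The following is a minimal Python sketch, assuming the pip error line keeps the shape quoted in that step; `parse_missing_wheel` and `needs_wheel_upload` are hypothetical helper names for illustration and are not part of any PyTorch or pip tooling.

```python
import re


def parse_missing_wheel(error_line):
    """Parse a pip "Could not find a version" line.

    Returns (package, wanted_version, available_versions), or None when the
    line does not look like that error. The expected format is an assumption
    based on the message quoted in step 4.
    """
    req = re.search(
        r"satisfies the requirement ([A-Za-z0-9_.\-]+)==([0-9][^\s;]*)",
        error_line,
    )
    if req is None:
        return None
    found = re.search(r"\(from versions: ([^)]*)\)", error_line)
    available = []
    if found:
        # Versions are listed comma-separated inside the parentheses.
        available = [v.strip() for v in found.group(1).split(",") if v.strip()]
    return req.group(1), req.group(2), available


def needs_wheel_upload(error_line):
    """True when the pinned version is absent from the versions pip could see."""
    parsed = parse_missing_wheel(error_line)
    if parsed is None:
        return False
    _, wanted, available = parsed
    return wanted not in available
```

If `needs_wheel_upload` returns true for a failing CI log line, the pinned version is simply not published yet on the index pip queried, which is the case where the step above suggests pausing and contacting Meta.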
