PyTorch isn't taking much advantage of MPS's unified memory, and I agree there isn't much point to supporting it on MPS beyond being able to run the same code and ensure a consistent experience.
For ROCm, I'm almost certain it just masquerades as CUDA and is invisible to this code. Intel and other backends like the Ascend NPU rely on the XPU or NPU extensions in PyTorch, and TPUs require XLA.
The CPUOptimizerOffload class is very clever, but it relies heavily on CUDA streams, which aren't available without a CUDA device.
On non-CUDA devices it should use `torch.cpu.Stream` and `torch.cpu.current_stream` instead. Additionally, use
`pin_memory=True if torch.cuda.is_available() else False`, since MPS is a unified memory architecture and pinning buys nothing there.