Hi team, I'm currently testing my training job on an AWS Trainium instance. I encountered the error "Input tensor is not an XLA tensor: torch.FloatTensor" when using the PyTorch Conv1d/Linear modules. I've confirmed that the input tensor has been moved to XLA, as I explicitly called .to(xm.xla_device()) when passing the input tensor to the module's forward method. However, I found that the error was actually caused by the weight and bias created inside those PyTorch modules, e.g. here: https://github.com/pytorch/pytorch/blob/main/torch/nn/modules/conv.py#L375. I printed the device for self.weight and self.bias and they are on CPU. I had to modify the Conv1d source code to resolve the issue. Does anyone know how to make sure those parameters end up on the XLA device?
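For illustration, a minimal sketch of the situation described above (not the original training code; torch_xla availability, channel counts, and shapes are assumptions):

```python
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm

device = xm.xla_device()

# The input is moved to the XLA device, but the module is left untouched,
# so its weight and bias parameters stay on CPU.
conv = nn.Conv1d(in_channels=4, out_channels=8, kernel_size=3)
x = torch.randn(1, 4, 16).to(device)

print(conv.weight.device)  # cpu -- the parameters were never moved
# conv(x)  # raises: Input tensor is not an XLA tensor: torch.FloatTensor
```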
Speaking from experience @JmeanJmy, "Input tensor is not an XLA tensor" is somewhat misleading: it can mean that either the model or the input tensor is not on the XLA device. Have you tried simply moving Conv.weight and Conv.bias (or the whole Conv module) to the device from your own code, rather than editing the source? Something this trivial should not require modifying PyTorch's source code.
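A minimal sketch of that suggestion (again assuming torch_xla is installed; shapes are arbitrary): calling .to(device) on the module relocates its weight and bias along with it, so no source modification is needed:

```python
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm

device = xm.xla_device()

# Moving the whole module relocates weight and bias in one call.
conv = nn.Conv1d(in_channels=4, out_channels=8, kernel_size=3).to(device)
x = torch.randn(1, 4, 16).to(device)

print(conv.weight.device)  # e.g. xla:0 -- parameters now live on the device
out = conv(x)              # forward pass succeeds
xm.mark_step()             # flush the lazy XLA graph so the op executes
```

The same applies to a full model: model.to(xm.xla_device()) moves every registered parameter and buffer at once.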