Torch not compiled with CUDA enabled when deploying T5 using Triton #4651

subhamiitk · 2024-05-04T02:41:13Z

Link to the notebook
https://github.com/aws/amazon-sagemaker-examples/blob/main/inference/nlp/realtime/triton/single-model/t5_pytorch_python-backend/t5_pytorch_python-backend.ipynb

Describe the bug
When following this notebook, endpoint creation fails. CloudWatch shows the following error:
creating server: Invalid argument - load failed for model '/opt/ml/model/::t5_pytorch': version 1 is at UNAVAILABLE state: Internal: AssertionError:
To reproduce
I followed the notebook above to deploy the T5 model; the error occurs at the endpoint creation step.

Logs
error: creating server: Invalid argument - load failed for model '/opt/ml/model/::t5_pytorch': version 1 is at UNAVAILABLE state: Internal: AssertionError:
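
For context, "Torch not compiled with CUDA enabled" (the message in the issue title) is the AssertionError text PyTorch raises when CUDA is requested from a build that has no CUDA support. Below is a minimal sketch for checking which kind of build is present. It assumes it is run in the same Python environment that loads the model, and `torch_cuda_report` is a hypothetical helper name, not something from the notebook.

```python
import importlib.util
import platform


def torch_cuda_report():
    """Collect basic facts about the local PyTorch build as a plain dict.

    Hypothetical helper for illustration; not part of the notebook.
    """
    report = {"python": platform.python_version(), "torch_installed": False}
    if importlib.util.find_spec("torch") is None:
        return report  # torch is not importable in this environment

    import torch

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # torch.version.cuda is None when PyTorch was built without CUDA.
    report["built_with_cuda"] = torch.version.cuda is not None
    # False when the build lacks CUDA or no usable GPU/driver is visible.
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` comes back False, the installed package is likely a CPU-only build, which would be consistent with the error above. If it is True but `cuda_available` is False, the instance type or driver may be the place to look instead.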


HubGab-Git · 2024-10-20T06:21:20Z

Hi @subhamiitk,
Could you share what environment you’re using? I ran the setup with the following configuration, and everything worked smoothly:

•	Platform: JupyterLab
•	Instance: ml.t3.medium
•	Image: SageMaker Distribution 2.0.0
•	Storage: 20GB
•	Kernel: Python 3 (default)

Looking forward to hearing from you!
