📚 The doc issue
When deploying PyTorch models using the `pytorch/torchserve-kfs` image with KServe, I found it challenging to understand the architecture and how the different processes interact with each other. Specifically, I would like to know which processes run in which pods and how resources are allocated to each process; understanding this is crucial for optimizing under large traffic volumes.

As I understand it, TorchServe uses Netty-based HTTP/gRPC servers, while KServe uses Tornado-based HTTP/gRPC servers. However, when deploying with the `pytorch/torchserve-kfs` image, it is unclear which process runs where.

Reference
https://kserve.github.io/website/master/modelserving/v1beta1/torchserve/
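For context, this is the kind of deployment I mean: a minimal `InferenceService` manifest that pulls in the TorchServe runtime, adapted from the referenced KServe docs (the service name and `storageUri` below are illustrative placeholders, not from my actual setup):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: torchserve-example   # illustrative name
spec:
  predictor:
    model:
      modelFormat:
        name: pytorch        # selects the TorchServe serving runtime
      storageUri: gs://my-bucket/torchserve/model-store  # illustrative URI
```

Applying a manifest like this is what produces the predictor pod(s) whose internal process layout (Netty frontend, Python workers, KServe wrapper) I am asking about.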
Suggest a potential alternative/fix
If possible, providing a high-level diagram or explanation of how the different components interact would be incredibly helpful.