
torchserve bloom7b1 demo Load model failed #3202

Open · zqc2011hy opened this issue Jun 22, 2024 · 2 comments

Labels: kfserving, triaged (Issue has been reviewed and triaged)

Comments

@zqc2011hy

🐛 Describe the bug

2024-06-22T03:41:52,860 [ERROR] W-9000-bloom7b1_1.0 org.pytorch.serve.wlm.WorkerThread - Number or consecutive unsuccessful inference 2
2024-06-22T03:41:52,861 [ERROR] W-9000-bloom7b1_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker error
org.pytorch.serve.wlm.WorkerInitializationException: Backend worker did not respond in given time
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:242) [model-server.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
2024-06-22T03:41:52,863 [WARN ] W-9000-bloom7b1_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: bloom7b1, error: Worker died.
2024-06-22T03:41:52,863 [DEBUG] W-9000-bloom7b1_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-bloom7b1_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2024-06-22T03:41:52,863 [WARN ] W-9000-bloom7b1_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery failed again
2024-06-22T03:41:52,864 [INFO ] epollEventLoopGroup-5-2 org.pytorch.serve.wlm.WorkerThread - 9000 Worker disconnected. WORKER_STOPPED


Installation instructions

https://kserve.github.io/website/latest/modelserving/v1beta1/llm/torchserve/accelerate/

Model Packaging

gs://kfserving-examples/models/torchserve/llm/Huggingface_accelerate/bloom

config.properties

No response

Versions

torchserve --start --model-store=/mnt/models/model-store --ts-config=/mnt/models/config/config.properties

Repro instructions

gs://kfserving-examples/models/torchserve/llm/Huggingface_accelerate/bloom

Possible Solution

No response

@agunapal
Collaborator

Hi @zqc2011hy, looking at this log line: "Backend worker did not respond in given time", it seems you need to increase the default_response_timeout value in config.properties.
This value may need to change depending on the hardware you are using.
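As a rough sketch, a config.properties with an increased worker timeout could look like the following; the 600-second value is an illustrative assumption, and should be tuned to however long bloom7b1 actually takes to load and respond on your hardware:

```properties
# /mnt/models/config/config.properties
# Seconds TorchServe waits for a backend worker before failing with
# "Backend worker did not respond in given time" (default is 120).
default_response_timeout=600
```

After editing the file, restart TorchServe (or redeploy the InferenceService) so the new timeout takes effect.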

@agunapal added the triaged and kfserving labels on Jun 22, 2024
@zqc2011hy
Author

zqc2011hy commented Jun 23, 2024

SERVICE_HOSTNAME=$(kubectl get inferenceservice bloom7b1 -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v \
  -H "Host: ${SERVICE_HOSTNAME}" \
  -H "Content-Type: application/json" \
  -d @./text.json \
  http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/bloom7b1:predict

{"predictions":["My dog is cute.\nNice.\n- Hey, Mom.\n- Yeah?\nWhat color's your dog?\n- It's gray.\n- Gray?\nYeah.\nIt looks gray to me.\n- Where'd you get it?\n- Well, Dad says it's kind of...\n- Gray?\n- Gray.\nYou got a gray dog?\n- It's gray.\n- Gray.\nIs your dog gray?\nAre you sure?\nNo.\nYou sure"]}

Could you share the exact contents expected in text.json?
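For reference, the KServe v1 predict protocol wraps inputs in an "instances" array, so a text.json for this endpoint might look like the sketch below; the prompt string is an assumption inferred from the prediction shown above, not a confirmed input:

```json
{
  "instances": [
    {
      "data": "My dog is cute"
    }
  ]
}
```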
