Observed behavior
We are using MQTT in a single-node NATS deployment. There are sudden spikes in JetStream API failures, which cause connection issues, subscription failures, and message publishing failures. This occurs multiple times per day, making it a high-frequency failure event. A clean restart resolves the issue. During these incidents, there are no anomalies in the CPU or memory metrics.
System Details
Instance Details:
CPU: 32 cores
Memory: 128GB
Disk Storage: 50GB
Utilization:
CPU: 2 cores
Memory: 1GB
Disk: 150MB
Number of MQTT connections: 3,000
Number of MQTT subscriptions: 6,000 (QoS 1)
Messages produced: ~30 RPS across all topics
A single NATS queue group subscription is used to consume MQTT-published messages on one topic.
Associated Logs:
mid: 102204 - "cae2bc80-7142-11ec-b9b8-33dad110a235" - Unable to persist session "cae2bc80-7142-11ec-b9b8-33dad110a235" (seq=70876): Timeout after 4.000022403s. Request type "SP" on "$MQTT.sess.RT45Zasv" (reply="$MQTT.JSA.S1Nunr6R.SP.RT45Zasv.1iHZZPsxA2EXBvLS043jtn").
mid: 116735 - "KkuRAJeYH02G8HqecxCiAW" - Unable to add JetStream consumer for subscription on "abcd.user.8a7d3311-4040-40b8-955d-834ce54b8c15": Error - Timeout after 4.000826922s. Request type "CC" on "$JS.API.CONSUMER.DURABLE.CREATE.$MQTT_msgs.51r4DC1W_KkuRAJeYH02G8HqecxLU1k" (reply="$MQTT.JSA.S1Nunr6R.CC.1iHZZPsxA2EXBvLS043jic").
mid: 84480647 - "mqttjs_a1346563" - Read loop processing time: 5.011585369s.
CPU usage also never exceeded 2 cores, even though 32 cores are allocated. Could this indicate a resource bottleneck?
Expected behavior
No connection, subscription, or publish failures.
Server and client version
NATS Server version 2.10.22
Host environment
Kubernetes v1.25
Steps to reproduce
No response