Jetstream API timeouts on MQTT streams #6191

Open
slice-arpitkhatri opened this issue Dec 1, 2024 · 0 comments
Labels
defect Suspected defect such as a bug or regression

slice-arpitkhatri commented Dec 1, 2024

Observed behavior

We are using MQTT on a single-node NATS deployment. Several times a day, JetStream API failures spike suddenly, causing MQTT connection, subscription, and publish failures. A clean restart resolves the issue. CPU and memory metrics show no anomalies during these incidents.

System Details

Instance Details:

CPU: 32 cores
Memory: 128GB
Disk Storage: 50GB

Utilization:

CPU: 2 cores
Memory: 1GB
Disk: 150MB

Number of MQTT connections: 3,000

Number of MQTT subscriptions: 6,000 (QoS 1)

Messages produced: ~30 RPS across all topics

A single NATS queue group subscription is used to consume MQTT-published messages on one topic.

Associated Logs:

  • mid: 102204 - "cae2bc80-7142-11ec-b9b8-33dad110a235" - Unable to persist session "cae2bc80-7142-11ec-b9b8-33dad110a235" (seq=70876): Timeout after 4.000022403s. Request type "SP" on "$MQTT.sess.RT45Zasv" (reply="$MQTT.JSA.S1Nunr6R.SP.RT45Zasv.1iHZZPsxA2EXBvLS043jtn").

  • mid: 116735 - "KkuRAJeYH02G8HqecxCiAW" - Unable to add JetStream consumer for subscription on "abcd.user.8a7d3311-4040-40b8-955d-834ce54b8c15": Error - Timeout after 4.000826922s. Request type "CC" on "$JS.API.CONSUMER.DURABLE.CREATE.$MQTT_msgs.51r4DC1W_KkuRAJeYH02G8HqecxLU1k" (reply="$MQTT.JSA.S1Nunr6R.CC.1iHZZPsxA2EXBvLS043jic").

  • mid: 84480647 - "mqttjs_a1346563" - Read loop processing time: 5.011585369s.
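To quantify how often each kind of request times out during an incident, the log entries above can be tallied by request type. The following is a minimal sketch, not part of the original report: the regular expressions are inferred only from the three sample lines quoted above, and the names `summarize`, `TIMEOUT_RE`, `SLOW_READ_RE`, and `SAMPLE` are hypothetical. Judging from the accompanying messages, "SP" appears to relate to session persistence and "CC" to consumer creation.

```python
import re
from collections import defaultdict

# Patterns inferred from the sample log lines in this issue (an assumption;
# other server versions may format these messages differently).
TIMEOUT_RE = re.compile(
    r'Timeout after (?P<secs>[0-9.]+)s\. '
    r'Request type "(?P<rtype>[A-Z]+)" on "(?P<subject>[^"]+)"'
)
SLOW_READ_RE = re.compile(r'Read loop processing time: (?P<secs>[0-9.]+)s')


def summarize(lines):
    """Group timeout durations by request type and collect slow read-loop times."""
    timeouts = defaultdict(list)
    slow_reads = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            timeouts[m.group("rtype")].append(float(m.group("secs")))
            continue
        m = SLOW_READ_RE.search(line)
        if m:
            slow_reads.append(float(m.group("secs")))
    return dict(timeouts), slow_reads


# The three log lines quoted in this issue, copied verbatim.
SAMPLE = [
    r'mid: 102204 - "cae2bc80-7142-11ec-b9b8-33dad110a235" - Unable to persist session "cae2bc80-7142-11ec-b9b8-33dad110a235" (seq=70876): Timeout after 4.000022403s. Request type "SP" on "$MQTT.sess.RT45Zasv" (reply="$MQTT.JSA.S1Nunr6R.SP.RT45Zasv.1iHZZPsxA2EXBvLS043jtn").',
    r'mid: 116735 - "KkuRAJeYH02G8HqecxCiAW" - Unable to add JetStream consumer for subscription on "abcd.user.8a7d3311-4040-40b8-955d-834ce54b8c15": Error - Timeout after 4.000826922s. Request type "CC" on "$JS.API.CONSUMER.DURABLE.CREATE.$MQTT_msgs.51r4DC1W_KkuRAJeYH02G8HqecxLU1k" (reply="$MQTT.JSA.S1Nunr6R.CC.1iHZZPsxA2EXBvLS043jic").',
    r'mid: 84480647 - "mqttjs_a1346563" - Read loop processing time: 5.011585369s.',
]

if __name__ == "__main__":
    timeouts, slow_reads = summarize(SAMPLE)
    for rtype, durations in sorted(timeouts.items()):
        print(f"{rtype}: {len(durations)} timeout(s), max {max(durations):.3f}s")
    print(f"slow read loops: {len(slow_reads)}")
```

Running this over a full server log (for example, by passing the lines of the log file to `summarize`) would show whether the timeouts cluster on one request type or are spread across all JetStream API calls.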

Another observation: CPU usage never exceeded 2 cores, even though 32 cores are allocated. Could this indicate a resource bottleneck?

Expected behavior

No connection, subscription, or publish failures.

Server and client version

NATS Server version 2.10.22

Host environment

Kubernetes v1.25

Steps to reproduce

No response

slice-arpitkhatri added the defect label Dec 1, 2024