Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The second node took about 7 seconds to join in a single node cluster #18385

Open
4 tasks done
XiuhuaRuan opened this issue Jul 31, 2024 · 9 comments
Open
4 tasks done
Labels
area/performance priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. type/feature

Comments

@XiuhuaRuan
Copy link

XiuhuaRuan commented Jul 31, 2024

Bug report criteria

What happened?

I started the first etcd node with option --force-new-cluster. Then I started the second node to join this new cluster. But it took 7 seconds to set up this cluster.

What did you expect to happen?

The second node could join the cluster quickly.

How can we reproduce it (as minimally and precisely as possible)?

  1. Start the first etcd node with option --force-new-cluster
    ./etcd --name etcd_01 --data-dir etcd01 --listen-client-urls http://192.168.31.105:12379,http://127.0.0.1:12379 --advertise-client-urls http://192.168.31.105:12379 --initial-advertise-peer-urls http://192.168.31.105:12380 --listen-peer-urls http://192.168.31.105:12380 --initial-cluster-token etcd-cluster --initial-cluster etcd_01=http://192.168.31.105:12380 --force-new-cluster

  2. Check the cluster member list successfully
    ./etcdctl --endpoints=192.168.31.105:12379 member list
    7c0730f7752a5a15, started, etcd_01, http://192.168.31.105:12380, http://192.168.31.105:12379, false

  3. Add the secode node to the cluster
    ./etcdctl --endpoints=192.168.31.105:12379 member add etcd_02 --peer-urls=http://192.168.31.106:12380
    Member 40339f82f8313cbb added to cluster 5fb4c1b191df6a5b

ETCD_NAME="etcd_02"
ETCD_INITIAL_CLUSTER="etcd_02=http://192.168.31.106:12380,etcd_01=http://192.168.31.105:12380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.31.106:12380"
ETCD_INITIAL_CLUSTER_STATE="existing"

  1. Start the second etcd node
    ./etcd --name etcd_02 --data-dir etcd02 --listen-client-urls http://192.168.31.106:12379,http://127.0.0.1:12379 --advertise-client-urls http://192.168.31.106:12379 --initial-advertise-peer-urls http://192.168.31.106:12380 --listen-peer-urls http://192.168.31.106:12380 --initial-cluster-token etcd-cluster-07311158 --initial-cluster etcd_01=http://192.168.31.105:12380,etcd_02=http://192.168.31.106:12380 --initial-cluster-state existing

  2. Checking the cluster member list failed at the beginning
    ./etcdctl --endpoints=192.168.31.105:12379 member list
    {"level":"warn","ts":"2024-07-31T13:31:53.591638+0800","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000362700/192.168.31.105:12379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
    Error: context deadline exceeded

  3. After a few seconds, check the cluster member list successfully
    ./etcdctl --endpoints=192.168.31.105:12379 member list
    40339f82f8313cbb, started, etcd_02, http://192.168.31.106:12380, http://192.168.31.106:12379, false
    7c0730f7752a5a15, started, etcd_01, http://192.168.31.105:12380, http://192.168.31.105:12379, false

Anything else we need to know?

I tried etcd-v3.5.9 and etcd-v3.5.15, both got the same result, but with different log output.

With etcd-v3.5.15, the etcd log showed 7 seconds absence between 13:32:42~13:32:49 when the second node started , reported two warnings "failed to reach the peer URL" and "failed to get downgrade enabled status" at 13:32:49.

{"level":"info","ts":"2024-07-31T13:32:42.416866+0800","caller":"netutil/netutil.go:112","msg":"resolved URL Host","url":"http://192.168.31.105:12380","host":"192.168.31.105:12380","resolved-addr":"192.168.31.105:12380"}
{"level":"info","ts":"2024-07-31T13:32:42.416933+0800","caller":"netutil/netutil.go:112","msg":"resolved URL Host","url":"http://192.168.31.106:12380","host":"192.168.31.106:12380","resolved-addr":"192.168.31.106:12380"}
{"level":"warn","ts":"2024-07-31T13:32:49.419147+0800","caller":"etcdserver/cluster_util.go:386","msg":"failed to reach the peer URL","address":"http://192.168.31.105:12380/downgrade/enabled","remote-member-id":"7c0730f7752a5a15","error":"Get \"http://192.168.31.105:12380/downgrade/enabled\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
{"level":"warn","ts":"2024-07-31T13:32:49.419219+0800","caller":"etcdserver/cluster_util.go:357","msg":"failed to get downgrade enabled status","remote-member-id":"7c0730f7752a5a15","error":"Get \"http://192.168.31.105:12380/downgrade/enabled\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}

With etcd-v3.5.9, the etcd log also showed 7 seconds absence between 10:42:15~10:42:22 when the second node started , reported two warnings "failed to convert response" and "failed to get downgrade enabled status" at 10:42:22.

{"level":"info","ts":"2024-07-31T10:42:15.27614+0800","caller":"netutil/netutil.go:112","msg":"resolved URL Host","url":"http://192.168.31.105:12380","host":"192.168.31.105:12380","resolved-addr":"192.168.31.105:12380"}
{"level":"info","ts":"2024-07-31T10:42:15.276169+0800","caller":"netutil/netutil.go:112","msg":"resolved URL Host","url":"http://192.168.31.106:12380","host":"192.168.31.106:12380","resolved-addr":"192.168.31.106:12380"}
{"level":"warn","ts":"2024-07-31T10:42:22.285193+0800","caller":"etcdserver/cluster_util.go:428","msg":"failed to convert response","address":"http://192.168.31.105:12380/downgrade/enabled","remote-member-id":"997928936728488a","error":"strconv.ParseBool: parsing \"failed linearized read: context deadline exceeded\\n\": invalid syntax"}
{"level":"warn","ts":"2024-07-31T10:42:22.285262+0800","caller":"etcdserver/cluster_util.go:381","msg":"failed to get downgrade enabled status","remote-member-id":"997928936728488a","error":"strconv.ParseBool: parsing \"failed linearized read: context deadline exceeded\\n\": invalid syntax"}

Etcd version (please run commands below)

# ./etcd --version
etcd Version: 3.6.0-alpha.0
Git SHA: f181ced
Go Version: go1.22.5
Go OS/Arch: linux/amd64

# ./etcdctl version
etcdctl version: 3.6.0-alpha.0
API version: 3.6

Etcd configuration (command line flags or environment variables)

#on the first node
./etcd --name etcd_01 --data-dir etcd01 --listen-client-urls http://192.168.31.105:12379,http://127.0.0.1:12379 --advertise-client-urls http://192.168.31.105:12379 --initial-advertise-peer-urls http://192.168.31.105:12380 --listen-peer-urls http://192.168.31.105:12380 --initial-cluster-token etcd-cluster-07311158 --initial-cluster etcd_01=http://192.168.31.105:12380 --force-new-cluster

#no the second node
./etcdctl --endpoints=192.168.31.105:12379 member add etcd_02 --peer-urls=http://192.168.31.106:12380
./etcd --name etcd_02 --data-dir etcd02 --listen-client-urls http://192.168.31.106:12379,http://127.0.0.1:12379 --advertise-client-urls http://192.168.31.106:12379 --initial-advertise-peer-urls http://192.168.31.106:12380 --listen-peer-urls http://192.168.31.106:12380 --initial-cluster-token etcd-cluster-07311158 --initial-cluster etcd_01=http://192.168.31.105:12380,etcd_02=http://192.168.31.106:12380 --initial-cluster-state existing

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

# ./etcdctl --endpoints=192.168.31.105:12379,192.168.31.106:12379 member list -w table
+------------------+---------+---------+-----------------------------+-----------------------------+------------+
|        ID        | STATUS  |  NAME   |         PEER ADDRS          |        CLIENT ADDRS         | IS LEARNER |
+------------------+---------+---------+-----------------------------+-----------------------------+------------+
| 40339f82f8313cbb | started | etcd_02 | http://192.168.31.106:12380 | http://192.168.31.106:12379 |      false |
| 7c0730f7752a5a15 | started | etcd_01 | http://192.168.31.105:12380 | http://192.168.31.105:12379 |      false |
+------------------+---------+---------+-----------------------------+-----------------------------+------------+


# ./etcdctl --endpoints=192.168.31.105:12379,192.168.31.106:12379 endpoint status -w table
+----------------------+------------------+---------------+---------+-----------+------------+-----------+------------+--------------------+--------+
|       ENDPOINT       |        ID        |    VERSION    | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------------+------------------+---------------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.31.105:12379 | 7c0730f7752a5a15 | 3.6.0-alpha.0 |   20 kB |      true |      false |         3 |          7 |                  7 |        |
| 192.168.31.106:12379 | 40339f82f8313cbb | 3.6.0-alpha.0 |   20 kB |     false |      false |         3 |          7 |                  7 |        |
+----------------------+------------

</details>


### Relevant log output

```Shell
log output when the second node started:
# ./etcd --name etcd_02 --data-dir etcd02 --listen-client-urls http://192.168.31.106:12379,http://127.0.0.1:12379 --advertise-client-urls http://192.168.31.106:12379 --initial-advertise-peer-urls http://192.168.31.106:12380 --listen-peer-urls http://192.168.31.106:12380 --initial-cluster-token etcd-cluster-07311158 --initial-cluster etcd_01=http://192.168.31.105:12380,etcd_02=http://192.168.31.106:12380 --initial-cluster-state existing
{"level":"warn","ts":"2024-07-31T13:32:42.413207+0800","caller":"embed/config.go:940","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"info","ts":"2024-07-31T13:32:42.413275+0800","caller":"etcdmain/etcd.go:63","msg":"Running: ","args":["./etcd","--name","etcd_02","--data-dir","etcd02","--listen-client-urls","http://192.168.31.106:12379,http://127.0.0.1:12379","--advertise-client-urls","http://192.168.31.106:12379","--initial-advertise-peer-urls","http://192.168.31.106:12380","--listen-peer-urls","http://192.168.31.106:12380","--initial-cluster-token","etcd-cluster-07311158","--initial-cluster","etcd_01=http://192.168.31.105:12380,etcd_02=http://192.168.31.106:12380","--initial-cluster-state","existing"]}
{"level":"info","ts":"2024-07-31T13:32:42.413300+0800","caller":"etcdmain/etcd.go:123","msg":"Initialize and start etcd server","data-dir":"etcd02","dir-type":"empty"}
{"level":"warn","ts":"2024-07-31T13:32:42.413310+0800","caller":"embed/config.go:940","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"info","ts":"2024-07-31T13:32:42.413316+0800","caller":"embed/etcd.go:125","msg":"configuring peer listeners","listen-peer-urls":["http://192.168.31.106:12380"]}
{"level":"info","ts":"2024-07-31T13:32:42.413422+0800","caller":"embed/etcd.go:133","msg":"configuring client listeners","listen-client-urls":["http://127.0.0.1:12379","http://192.168.31.106:12379"]}
{"level":"info","ts":"2024-07-31T13:32:42.413474+0800","caller":"embed/etcd.go:315","msg":"starting an etcd server","etcd-version":"3.6.0-alpha.0","git-sha":"f181ced","go-version":"go1.22.5","go-os":"linux","go-arch":"amd64","max-cpu-set":2,"max-cpu-available":2,"member-initialized":false,"name":"etcd_02","data-dir":"etcd02","wal-dir":"","wal-dir-dedicated":"","member-dir":"etcd02/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":10000,"max-wals":5,"max-snapshots":5,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://192.168.31.106:12380"],"listen-peer-urls":["http://192.168.31.106:12380"],"advertise-client-urls":["http://192.168.31.106:12379"],"listen-client-urls":["http://127.0.0.1:12379","http://192.168.31.106:12379"],"listen-metrics-urls":[],"experimental-set-member-localaddr":false,"experimental-local-address":"","cors":["*"],"host-whitelist":["*"],"initial-cluster":"etcd_01=http://192.168.31.105:12380,etcd_02=http://192.168.31.106:12380","initial-cluster-state":"existing","initial-cluster-token":"etcd-cluster-07311158","quota-backend-bytes":2147483648,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","compact-check-time-enabled":false,"compact-check-time-interval":"1m0s","auto-compaction-mode":"periodic","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","discovery-token":"","discovery-endpoints":"","discovery-dial-timeout":"2s","discovery-request-timeout":"5s","discovery-keepalive-time":"2s","discovery-keepalive-timeout":"6s","discovery-insecure-transport":true,"discovery-insecure-skip-tls-verify":false,"discovery-cert":"","discovery-key":"","discovery-cacert":"","discovery-user":"","downgrade-check-interval":"5s","max-learners":1}
{"level":"info","ts":"2024-07-31T13:32:42.413778+0800","logger":"bbolt","caller":"backend/backend.go:197","msg":"Opening db file (etcd02/member/snap/db) with mode 2d72772d2d2d2d2d2d2d and with options: {Timeout: 0s, NoGrowSync: false, NoFreelistSync: true, PreLoadFreelist: false, FreelistType: hashmap, ReadOnly: false, MmapFlags: 8000, InitialMmapSize: 10737418240, PageSize: 0, NoSync: false, OpenFile: 0x0, Mlock: false, Logger: 0xc000122088}"}
{"level":"info","ts":"2024-07-31T13:32:42.415503+0800","logger":"bbolt","caller":"[email protected]/db.go:321","msg":"Opening bbolt db (etcd02/member/snap/db) successfully"}
{"level":"info","ts":"2024-07-31T13:32:42.415584+0800","caller":"storage/backend.go:80","msg":"opened backend db","path":"etcd02/member/snap/db","took":"1.869628ms"}
{"level":"info","ts":"2024-07-31T13:32:42.415627+0800","caller":"etcdserver/bootstrap.go:93","msg":"bootstrapping cluster"}
{"level":"info","ts":"2024-07-31T13:32:42.416866+0800","caller":"netutil/netutil.go:112","msg":"resolved URL Host","url":"http://192.168.31.105:12380","host":"192.168.31.105:12380","resolved-addr":"192.168.31.105:12380"}
{"level":"info","ts":"2024-07-31T13:32:42.416933+0800","caller":"netutil/netutil.go:112","msg":"resolved URL Host","url":"http://192.168.31.106:12380","host":"192.168.31.106:12380","resolved-addr":"192.168.31.106:12380"}
{"level":"warn","ts":"2024-07-31T13:32:49.419147+0800","caller":"etcdserver/cluster_util.go:386","msg":"failed to reach the peer URL","address":"http://192.168.31.105:12380/downgrade/enabled","remote-member-id":"7c0730f7752a5a15","error":"Get \"http://192.168.31.105:12380/downgrade/enabled\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
{"level":"warn","ts":"2024-07-31T13:32:49.419219+0800","caller":"etcdserver/cluster_util.go:357","msg":"failed to get downgrade enabled status","remote-member-id":"7c0730f7752a5a15","error":"Get \"http://192.168.31.105:12380/downgrade/enabled\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
{"level":"info","ts":"2024-07-31T13:32:49.419251+0800","caller":"etcdserver/bootstrap.go:100","msg":"bootstrapping storage"}
{"level":"info","ts":"2024-07-31T13:32:49.422717+0800","caller":"etcdserver/bootstrap.go:108","msg":"bootstrapping raft"}
{"level":"info","ts":"2024-07-31T13:32:49.422756+0800","caller":"etcdserver/bootstrap.go:493","msg":"starting local member","local-member-id":"40339f82f8313cbb","cluster-id":"5fb4c1b191df6a5b"}
{"level":"info","ts":"2024-07-31T13:32:49.422767+0800","caller":"etcdserver/server.go:307","msg":"bootstrap successfully"}
{"level":"info","ts":"2024-07-31T13:32:49.422881+0800","logger":"raft","caller":"[email protected]/raft.go:1920","msg":"40339f82f8313cbb switched to configuration voters=()"}
{"level":"info","ts":"2024-07-31T13:32:49.422923+0800","logger":"raft","caller":"[email protected]/raft.go:870","msg":"40339f82f8313cbb became follower at term 0"}
{"level":"info","ts":"2024-07-31T13:32:49.422933+0800","logger":"raft","caller":"[email protected]/raft.go:483","msg":"newRaft 40339f82f8313cbb [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]"}
{"level":"warn","ts":"2024-07-31T13:32:49.424043+0800","caller":"auth/store.go:1134","msg":"simple token is not cryptographically signed"}
{"level":"info","ts":"2024-07-31T13:32:49.425120+0800","caller":"mvcc/kvstore.go:405","msg":"kvstore restored","current-rev":1}
{"level":"info","ts":"2024-07-31T13:32:49.427074+0800","caller":"storage/quota.go:93","msg":"enabled backend quota with default value","quota-name":"v3-applier","quota-size-bytes":2147483648,"quota-size":"2.1 GB"}
{"level":"info","ts":"2024-07-31T13:32:49.427264+0800","caller":"rafthttp/pipeline.go:72","msg":"started HTTP pipelining with remote peer","local-member-id":"40339f82f8313cbb","remote-peer-id":"7c0730f7752a5a15"}
{"level":"info","ts":"2024-07-31T13:32:49.427277+0800","caller":"rafthttp/transport.go:286","msg":"added new remote peer","local-member-id":"40339f82f8313cbb","remote-peer-id":"7c0730f7752a5a15","remote-peer-urls":["http://192.168.31.105:12380"]}
{"level":"info","ts":"2024-07-31T13:32:49.427294+0800","caller":"rafthttp/peer.go:133","msg":"starting remote peer","remote-peer-id":"7c0730f7752a5a15"}
{"level":"info","ts":"2024-07-31T13:32:49.427323+0800","caller":"rafthttp/pipeline.go:72","msg":"started HTTP pipelining with remote peer","local-member-id":"40339f82f8313cbb","remote-peer-id":"7c0730f7752a5a15"}
{"level":"info","ts":"2024-07-31T13:32:49.428324+0800","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"40339f82f8313cbb","remote-peer-id":"7c0730f7752a5a15"}
{"level":"info","ts":"2024-07-31T13:32:49.428346+0800","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"40339f82f8313cbb","remote-peer-id":"7c0730f7752a5a15"}
{"level":"info","ts":"2024-07-31T13:32:49.428861+0800","caller":"rafthttp/peer.go:137","msg":"started remote peer","remote-peer-id":"7c0730f7752a5a15"}
{"level":"info","ts":"2024-07-31T13:32:49.428884+0800","caller":"rafthttp/transport.go:317","msg":"added remote peer","local-member-id":"40339f82f8313cbb","remote-peer-id":"7c0730f7752a5a15","remote-peer-urls":["http://192.168.31.105:12380"]}
{"level":"info","ts":"2024-07-31T13:32:49.428901+0800","caller":"etcdserver/server.go:591","msg":"starting etcd server","local-member-id":"40339f82f8313cbb","local-server-version":"3.6.0-alpha.0","cluster-version":"to_be_decided"}
{"level":"info","ts":"2024-07-31T13:32:49.428890+0800","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream Message","local-member-id":"40339f82f8313cbb","remote-peer-id":"7c0730f7752a5a15"}
{"level":"info","ts":"2024-07-31T13:32:49.429056+0800","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"40339f82f8313cbb","remote-peer-id":"7c0730f7752a5a15"}
{"level":"info","ts":"2024-07-31T13:32:49.429169+0800","caller":"embed/etcd.go:284","msg":"now serving peer/client/metrics","local-member-id":"40339f82f8313cbb","initial-advertise-peer-urls":["http://192.168.31.106:12380"],"listen-peer-urls":["http://192.168.31.106:12380"],"advertise-client-urls":["http://192.168.31.106:12379"],"listen-client-urls":["http://127.0.0.1:12379","http://192.168.31.106:12379"],"listen-metrics-urls":[]}
{"level":"info","ts":"2024-07-31T13:32:49.429246+0800","caller":"etcdserver/server.go:492","msg":"starting initial election tick advance","election-ticks":10}
{"level":"info","ts":"2024-07-31T13:32:49.429311+0800","caller":"fileutil/purge.go:49","msg":"started to purge file","dir":"etcd02/member/snap","suffix":"snap.db","max":5,"interval":"30s"}
{"level":"info","ts":"2024-07-31T13:32:49.429335+0800","caller":"embed/etcd.go:611","msg":"serving peer traffic","address":"192.168.31.106:12380"}
{"level":"info","ts":"2024-07-31T13:32:49.429344+0800","caller":"embed/etcd.go:583","msg":"cmux::serve","address":"192.168.31.106:12380"}
{"level":"info","ts":"2024-07-31T13:32:49.429338+0800","caller":"fileutil/purge.go:49","msg":"started to purge file","dir":"etcd02/member/snap","suffix":"snap","max":5,"interval":"30s"}
{"level":"info","ts":"2024-07-31T13:32:49.429472+0800","caller":"fileutil/purge.go:49","msg":"started to purge file","dir":"etcd02/member/wal","suffix":"wal","max":5,"interval":"30s"}
{"level":"info","ts":"2024-07-31T13:32:49.429627+0800","caller":"rafthttp/stream.go:249","msg":"set message encoder","from":"40339f82f8313cbb","to":"7c0730f7752a5a15","stream-type":"stream MsgApp v2"}
{"level":"info","ts":"2024-07-31T13:32:49.429640+0800","caller":"rafthttp/peer_status.go:53","msg":"peer became active","peer-id":"7c0730f7752a5a15"}
{"level":"info","ts":"2024-07-31T13:32:49.429640+0800","logger":"raft","caller":"[email protected]/raft.go:1191","msg":"40339f82f8313cbb [logterm: 0, index: 0, vote: 0] cast MsgPreVote for 7c0730f7752a5a15 [logterm: 2, index: 5] at term 0"}
{"level":"info","ts":"2024-07-31T13:32:49.429652+0800","caller":"rafthttp/stream.go:274","msg":"established TCP streaming connection with remote peer","stream-writer-type":"stream MsgApp v2","local-member-id":"40339f82f8313cbb","remote-peer-id":"7c0730f7752a5a15"}
{"level":"info","ts":"2024-07-31T13:32:49.429736+0800","caller":"rafthttp/stream.go:249","msg":"set message encoder","from":"40339f82f8313cbb","to":"7c0730f7752a5a15","stream-type":"stream Message"}
{"level":"info","ts":"2024-07-31T13:32:49.429746+0800","caller":"rafthttp/stream.go:274","msg":"established TCP streaming connection with remote peer","stream-writer-type":"stream Message","local-member-id":"40339f82f8313cbb","remote-peer-id":"7c0730f7752a5a15"}
{"level":"info","ts":"2024-07-31T13:32:49.430009+0800","logger":"raft","caller":"[email protected]/raft.go:1191","msg":"40339f82f8313cbb [logterm: 0, index: 0, vote: 0] cast MsgPreVote for 7c0730f7752a5a15 [logterm: 2, index: 5] at term 0"}
{"level":"info","ts":"2024-07-31T13:32:49.430138+0800","logger":"raft","caller":"[email protected]/raft.go:1191","msg":"40339f82f8313cbb [logterm: 0, index: 0, vote: 0] cast MsgPreVote for 7c0730f7752a5a15 [logterm: 2, index: 5] at term 0"}
{"level":"info","ts":"2024-07-31T13:32:49.430221+0800","logger":"raft","caller":"[email protected]/raft.go:1191","msg":"40339f82f8313cbb [logterm: 0, index: 0, vote: 0] cast MsgPreVote for 7c0730f7752a5a15 [logterm: 2, index: 5] at term 0"}
{"level":"info","ts":"2024-07-31T13:32:49.430787+0800","caller":"rafthttp/stream.go:412","msg":"established TCP streaming connection with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"40339f82f8313cbb","remote-peer-id":"7c0730f7752a5a15"}
{"level":"info","ts":"2024-07-31T13:32:49.430798+0800","logger":"raft","caller":"[email protected]/raft.go:1191","msg":"40339f82f8313cbb [logterm: 0, index: 0, vote: 0] cast MsgPreVote for 7c0730f7752a5a15 [logterm: 2, index: 5] at term 0"}
{"level":"info","ts":"2024-07-31T13:32:49.431038+0800","caller":"rafthttp/stream.go:412","msg":"established TCP streaming connection with remote peer","stream-reader-type":"stream Message","local-member-id":"40339f82f8313cbb","remote-peer-id":"7c0730f7752a5a15"}
{"level":"info","ts":"2024-07-31T13:32:49.432380+0800","logger":"raft","caller":"[email protected]/raft.go:1078","msg":"40339f82f8313cbb [term: 0] received a MsgVote message with higher term from 7c0730f7752a5a15 [term: 3]"}
{"level":"info","ts":"2024-07-31T13:32:49.432431+0800","logger":"raft","caller":"[email protected]/raft.go:870","msg":"40339f82f8313cbb became follower at term 3"}
{"level":"info","ts":"2024-07-31T13:32:49.432443+0800","logger":"raft","caller":"[email protected]/raft.go:1191","msg":"40339f82f8313cbb [logterm: 0, index: 0, vote: 0] cast MsgVote for 7c0730f7752a5a15 [logterm: 2, index: 5] at term 3"}
{"level":"info","ts":"2024-07-31T13:32:49.433758+0800","logger":"raft","caller":"[email protected]/node.go:370","msg":"raft.node: 40339f82f8313cbb elected leader 7c0730f7752a5a15 at term 3"}
{"level":"info","ts":"2024-07-31T13:32:49.434581+0800","logger":"raft","caller":"[email protected]/raft.go:1920","msg":"40339f82f8313cbb switched to configuration voters=(8937165824920803861)"}
{"level":"info","ts":"2024-07-31T13:32:49.434650+0800","caller":"membership/cluster.go:411","msg":"added member","cluster-id":"5fb4c1b191df6a5b","local-member-id":"40339f82f8313cbb","added-peer-id":"7c0730f7752a5a15","added-peer-peer-urls":["http://192.168.31.105:12380"],"added-peer-is-learner":false}
{"level":"info","ts":"2024-07-31T13:32:49.434712+0800","caller":"membership/cluster.go:594","msg":"set initial cluster version","cluster-id":"5fb4c1b191df6a5b","local-member-id":"40339f82f8313cbb","cluster-version":"3.6"}
{"level":"info","ts":"2024-07-31T13:32:49.434774+0800","caller":"api/capability.go:76","msg":"enabled capabilities for version","cluster-version":"3.6"}
{"level":"info","ts":"2024-07-31T13:32:49.434810+0800","logger":"raft","caller":"[email protected]/raft.go:1920","msg":"40339f82f8313cbb switched to configuration voters=(4626216627098172603 8937165824920803861)"}
{"level":"info","ts":"2024-07-31T13:32:49.434837+0800","caller":"membership/cluster.go:411","msg":"added member","cluster-id":"5fb4c1b191df6a5b","local-member-id":"40339f82f8313cbb","added-peer-id":"40339f82f8313cbb","added-peer-peer-urls":["http://192.168.31.106:12380"],"added-peer-is-learner":false}
{"level":"warn","ts":"2024-07-31T13:32:49.434883+0800","caller":"etcdserver/server.go:2222","msg":"Failed to detect schema version","error":"missing confstate information"}
{"level":"error","ts":"2024-07-31T13:32:49.434903+0800","caller":"version/monitor.go:120","msg":"failed to update storage version","cluster-version":"3.6.0","error":"cannot detect storage schema version: missing confstate information","stacktrace":"go.etcd.io/etcd/server/v3/etcdserver/version.(*Monitor).UpdateStorageVersionIfNeeded\n\tgo.etcd.io/etcd/server/v3/etcdserver/version/monitor.go:120\ngo.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).monitorStorageVersion\n\tgo.etcd.io/etcd/server/v3/etcdserver/server.go:2263\ngo.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).GoAttach.func1\n\tgo.etcd.io/etcd/server/v3/etcdserver/server.go:2444"}
{"level":"warn","ts":"2024-07-31T13:32:49.436333+0800","caller":"etcdserver/server.go:2222","msg":"Failed to detect schema version","error":"missing confstate information"}
{"level":"info","ts":"2024-07-31T13:32:49.436342+0800","caller":"etcdserver/server.go:1780","msg":"published local member to cluster through raft","local-member-id":"40339f82f8313cbb","local-member-attributes":"{Name:etcd_02 ClientURLs:[http://192.168.31.106:12379]}","cluster-id":"5fb4c1b191df6a5b","publish-timeout":"7s"}
{"level":"info","ts":"2024-07-31T13:32:49.436388+0800","caller":"embed/serve.go:104","msg":"ready to serve client requests"}
{"level":"info","ts":"2024-07-31T13:32:49.436433+0800","caller":"embed/serve.go:104","msg":"ready to serve client requests"}
{"level":"info","ts":"2024-07-31T13:32:49.436513+0800","caller":"etcdmain/main.go:44","msg":"notifying init daemon"}
{"level":"info","ts":"2024-07-31T13:32:49.436521+0800","caller":"etcdmain/main.go:50","msg":"successfully notified init daemon"}
{"level":"info","ts":"2024-07-31T13:32:49.436889+0800","caller":"v3rpc/health.go:62","msg":"grpc service status changed","service":"","status":"SERVING"}
{"level":"info","ts":"2024-07-31T13:32:49.437089+0800","caller":"v3rpc/health.go:62","msg":"grpc service status changed","service":"","status":"SERVING"}
{"level":"info","ts":"2024-07-31T13:32:49.438005+0800","caller":"embed/serve.go:188","msg":"serving client traffic insecurely; this is strongly discouraged!","traffic":"grpc+http","address":"192.168.31.106:12379"}
{"level":"info","ts":"2024-07-31T13:32:49.438208+0800","caller":"embed/serve.go:188","msg":"serving client traffic insecurely; this is strongly discouraged!","traffic":"grpc+http","address":"127.0.0.1:12379"}
{"level":"info","ts":"2024-07-31T13:32:53.436050+0800","caller":"version/monitor.go:116","msg":"cluster version differs from storage version.","cluster-version":"3.6.0","storage-version":"3.5.0"}
{"level":"info","ts":"2024-07-31T13:32:53.436124+0800","caller":"schema/migration.go:65","msg":"updated storage version","new-storage-version":"3.6.0"}


log output on the first node during second node started:
{"level":"info","ts":"2024-07-31T13:32:41.802493+0800","logger":"raft","caller":"[email protected]/raft.go:956","msg":"7c0730f7752a5a15 is starting a new election at term 2"}
{"level":"info","ts":"2024-07-31T13:32:41.802531+0800","logger":"raft","caller":"[email protected]/raft.go:899","msg":"7c0730f7752a5a15 became pre-candidate at term 2"}
{"level":"info","ts":"2024-07-31T13:32:41.802549+0800","logger":"raft","caller":"[email protected]/raft.go:1030","msg":"7c0730f7752a5a15 [logterm: 2, index: 5] sent MsgPreVote request to 40339f82f8313cbb at term 2"}
{"level":"info","ts":"2024-07-31T13:32:41.802568+0800","logger":"raft","caller":"[email protected]/raft.go:1043","msg":"7c0730f7752a5a15 received MsgPreVoteResp from 7c0730f7752a5a15 at term 2"}
{"level":"info","ts":"2024-07-31T13:32:41.802575+0800","logger":"raft","caller":"[email protected]/raft.go:1649","msg":"7c0730f7752a5a15 has received 1 MsgPreVoteResp votes and 0 vote rejections"}
{"level":"info","ts":"2024-07-31T13:32:43.164875+0800","logger":"raft","caller":"[email protected]/raft.go:956","msg":"7c0730f7752a5a15 is starting a new election at term 2"}
{"level":"info","ts":"2024-07-31T13:32:43.164921+0800","logger":"raft","caller":"[email protected]/raft.go:899","msg":"7c0730f7752a5a15 became pre-candidate at term 2"}
{"level":"info","ts":"2024-07-31T13:32:43.164937+0800","logger":"raft","caller":"[email protected]/raft.go:1030","msg":"7c0730f7752a5a15 [logterm: 2, index: 5] sent MsgPreVote request to 40339f82f8313cbb at term 2"}
{"level":"info","ts":"2024-07-31T13:32:43.164957+0800","logger":"raft","caller":"[email protected]/raft.go:1043","msg":"7c0730f7752a5a15 received MsgPreVoteResp from 7c0730f7752a5a15 at term 2"}
{"level":"info","ts":"2024-07-31T13:32:43.164963+0800","logger":"raft","caller":"[email protected]/raft.go:1649","msg":"7c0730f7752a5a15 has received 1 MsgPreVoteResp votes and 0 vote rejections"}
{"level":"info","ts":"2024-07-31T13:32:44.664981+0800","logger":"raft","caller":"[email protected]/raft.go:956","msg":"7c0730f7752a5a15 is starting a new election at term 2"}
{"level":"info","ts":"2024-07-31T13:32:44.665013+0800","logger":"raft","caller":"[email protected]/raft.go:899","msg":"7c0730f7752a5a15 became pre-candidate at term 2"}
{"level":"info","ts":"2024-07-31T13:32:44.665026+0800","logger":"raft","caller":"[email protected]/raft.go:1030","msg":"7c0730f7752a5a15 [logterm: 2, index: 5] sent MsgPreVote request to 40339f82f8313cbb at term 2"}
{"level":"info","ts":"2024-07-31T13:32:44.665045+0800","logger":"raft","caller":"[email protected]/raft.go:1043","msg":"7c0730f7752a5a15 received MsgPreVoteResp from 7c0730f7752a5a15 at term 2"}
{"level":"info","ts":"2024-07-31T13:32:44.665052+0800","logger":"raft","caller":"[email protected]/raft.go:1649","msg":"7c0730f7752a5a15 has received 1 MsgPreVoteResp votes and 0 vote rejections"}
{"level":"warn","ts":"2024-07-31T13:32:45.890793+0800","caller":"etcdserver/v3_server.go:909","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6491253898331065351,"retry-timeout":"500ms"}
{"level":"info","ts":"2024-07-31T13:32:46.064774+0800","logger":"raft","caller":"[email protected]/raft.go:956","msg":"7c0730f7752a5a15 is starting a new election at term 2"}
{"level":"info","ts":"2024-07-31T13:32:46.064815+0800","logger":"raft","caller":"[email protected]/raft.go:899","msg":"7c0730f7752a5a15 became pre-candidate at term 2"}
{"level":"info","ts":"2024-07-31T13:32:46.064841+0800","logger":"raft","caller":"[email protected]/raft.go:1030","msg":"7c0730f7752a5a15 [logterm: 2, index: 5] sent MsgPreVote request to 40339f82f8313cbb at term 2"}
{"level":"info","ts":"2024-07-31T13:32:46.064862+0800","logger":"raft","caller":"[email protected]/raft.go:1043","msg":"7c0730f7752a5a15 received MsgPreVoteResp from 7c0730f7752a5a15 at term 2"}
{"level":"info","ts":"2024-07-31T13:32:46.064868+0800","logger":"raft","caller":"[email protected]/raft.go:1649","msg":"7c0730f7752a5a15 has received 1 MsgPreVoteResp votes and 0 vote rejections"}
{"level":"warn","ts":"2024-07-31T13:32:46.391821+0800","caller":"etcdserver/v3_server.go:909","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6491253898331065351,"retry-timeout":"500ms"}
{"level":"warn","ts":"2024-07-31T13:32:46.506772+0800","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"40339f82f8313cbb","rtt":"0s","error":"dial tcp 192.168.31.106:12380: connect: connection refused"}
{"level":"warn","ts":"2024-07-31T13:32:46.506799+0800","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"40339f82f8313cbb","rtt":"0s","error":"dial tcp 192.168.31.106:12380: connect: connection refused"}
{"level":"warn","ts":"2024-07-31T13:32:46.892434+0800","caller":"etcdserver/v3_server.go:909","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6491253898331065351,"retry-timeout":"500ms"}
{"level":"warn","ts":"2024-07-31T13:32:47.394420+0800","caller":"etcdserver/v3_server.go:909","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6491253898331065351,"retry-timeout":"500ms"}
{"level":"info","ts":"2024-07-31T13:32:47.464688+0800","logger":"raft","caller":"[email protected]/raft.go:956","msg":"7c0730f7752a5a15 is starting a new election at term 2"}
{"level":"info","ts":"2024-07-31T13:32:47.464724+0800","logger":"raft","caller":"[email protected]/raft.go:899","msg":"7c0730f7752a5a15 became pre-candidate at term 2"}
{"level":"info","ts":"2024-07-31T13:32:47.464739+0800","logger":"raft","caller":"[email protected]/raft.go:1030","msg":"7c0730f7752a5a15 [logterm: 2, index: 5] sent MsgPreVote request to 40339f82f8313cbb at term 2"}
{"level":"info","ts":"2024-07-31T13:32:47.464757+0800","logger":"raft","caller":"[email protected]/raft.go:1043","msg":"7c0730f7752a5a15 received MsgPreVoteResp from 7c0730f7752a5a15 at term 2"}
{"level":"info","ts":"2024-07-31T13:32:47.464765+0800","logger":"raft","caller":"[email protected]/raft.go:1649","msg":"7c0730f7752a5a15 has received 1 MsgPreVoteResp votes and 0 vote rejections"}
{"level":"warn","ts":"2024-07-31T13:32:47.896854+0800","caller":"etcdserver/v3_server.go:909","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6491253898331065351,"retry-timeout":"500ms"}
{"level":"warn","ts":"2024-07-31T13:32:48.397292+0800","caller":"etcdserver/v3_server.go:909","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6491253898331065351,"retry-timeout":"500ms"}
{"level":"info","ts":"2024-07-31T13:32:48.864186+0800","logger":"raft","caller":"[email protected]/raft.go:956","msg":"7c0730f7752a5a15 is starting a new election at term 2"}
{"level":"info","ts":"2024-07-31T13:32:48.864230+0800","logger":"raft","caller":"[email protected]/raft.go:899","msg":"7c0730f7752a5a15 became pre-candidate at term 2"}
{"level":"info","ts":"2024-07-31T13:32:48.864250+0800","logger":"raft","caller":"[email protected]/raft.go:1030","msg":"7c0730f7752a5a15 [logterm: 2, index: 5] sent MsgPreVote request to 40339f82f8313cbb at term 2"}
{"level":"info","ts":"2024-07-31T13:32:48.864272+0800","logger":"raft","caller":"[email protected]/raft.go:1043","msg":"7c0730f7752a5a15 received MsgPreVoteResp from 7c0730f7752a5a15 at term 2"}
{"level":"info","ts":"2024-07-31T13:32:48.864280+0800","logger":"raft","caller":"[email protected]/raft.go:1649","msg":"7c0730f7752a5a15 has received 1 MsgPreVoteResp votes and 0 vote rejections"}
{"level":"warn","ts":"2024-07-31T13:32:48.905423+0800","caller":"etcdserver/v3_server.go:909","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6491253898331065351,"retry-timeout":"500ms"}
{"level":"warn","ts":"2024-07-31T13:32:49.460680+0800","caller":"etcdserver/v3_server.go:909","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6491253898331065351,"retry-timeout":"500ms"}
{"level":"warn","ts":"2024-07-31T13:32:49.962647+0800","caller":"etcdserver/v3_server.go:909","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6491253898331065351,"retry-timeout":"500ms"}
{"level":"info","ts":"2024-07-31T13:32:50.264437+0800","logger":"raft","caller":"[email protected]/raft.go:956","msg":"7c0730f7752a5a15 is starting a new election at term 2"}
{"level":"info","ts":"2024-07-31T13:32:50.264477+0800","logger":"raft","caller":"[email protected]/raft.go:899","msg":"7c0730f7752a5a15 became pre-candidate at term 2"}
{"level":"info","ts":"2024-07-31T13:32:50.264493+0800","logger":"raft","caller":"[email protected]/raft.go:1030","msg":"7c0730f7752a5a15 [logterm: 2, index: 5] sent MsgPreVote request to 40339f82f8313cbb at term 2"}
{"level":"info","ts":"2024-07-31T13:32:50.264518+0800","logger":"raft","caller":"[email protected]/raft.go:1043","msg":"7c0730f7752a5a15 received MsgPreVoteResp from 7c0730f7752a5a15 at term 2"}
{"level":"info","ts":"2024-07-31T13:32:50.264525+0800","logger":"raft","caller":"[email protected]/raft.go:1649","msg":"7c0730f7752a5a15 has received 1 MsgPreVoteResp votes and 0 vote rejections"}
{"level":"warn","ts":"2024-07-31T13:32:50.463221+0800","caller":"etcdserver/v3_server.go:909","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6491253898331065351,"retry-timeout":"500ms"}
{"level":"warn","ts":"2024-07-31T13:32:50.964455+0800","caller":"etcdserver/v3_server.go:909","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6491253898331065351,"retry-timeout":"500ms"}
{"level":"warn","ts":"2024-07-31T13:32:51.475092+0800","caller":"etcdserver/v3_server.go:909","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6491253898331065351,"retry-timeout":"500ms"}
{"level":"warn","ts":"2024-07-31T13:32:51.516498+0800","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"40339f82f8313cbb","rtt":"0s","error":"dial tcp 192.168.31.106:12380: connect: connection refused"}
{"level":"warn","ts":"2024-07-31T13:32:51.516528+0800","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"40339f82f8313cbb","rtt":"0s","error":"dial tcp 192.168.31.106:12380: connect: connection refused"}
{"level":"info","ts":"2024-07-31T13:32:51.664404+0800","logger":"raft","caller":"[email protected]/raft.go:956","msg":"7c0730f7752a5a15 is starting a new election at term 2"}
{"level":"info","ts":"2024-07-31T13:32:51.664445+0800","logger":"raft","caller":"[email protected]/raft.go:899","msg":"7c0730f7752a5a15 became pre-candidate at term 2"}
{"level":"info","ts":"2024-07-31T13:32:51.664460+0800","logger":"raft","caller":"[email protected]/raft.go:1030","msg":"7c0730f7752a5a15 [logterm: 2, index: 5] sent MsgPreVote request to 40339f82f8313cbb at term 2"}
{"level":"info","ts":"2024-07-31T13:32:51.664480+0800","logger":"raft","caller":"[email protected]/raft.go:1043","msg":"7c0730f7752a5a15 received MsgPreVoteResp from 7c0730f7752a5a15 at term 2"}
{"level":"info","ts":"2024-07-31T13:32:51.664488+0800","logger":"raft","caller":"[email protected]/raft.go:1649","msg":"7c0730f7752a5a15 has received 1 MsgPreVoteResp votes and 0 vote rejections"}
{"level":"warn","ts":"2024-07-31T13:32:51.977379+0800","caller":"etcdserver/v3_server.go:909","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":6491253898331065351,"retry-timeout":"500ms"}
{"level":"warn","ts":"2024-07-31T13:32:52.391315+0800","caller":"etcdserver/v3_server.go:921","msg":"timed out waiting for read index response (local node might have slow network)","timeout":"7s"}
{"level":"info","ts":"2024-07-31T13:32:52.401962+0800","caller":"rafthttp/peer_status.go:53","msg":"peer became active","peer-id":"40339f82f8313cbb"}
{"level":"info","ts":"2024-07-31T13:32:52.402003+0800","caller":"rafthttp/stream.go:412","msg":"established TCP streaming connection with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"7c0730f7752a5a15","remote-peer-id":"40339f82f8313cbb"}
{"level":"info","ts":"2024-07-31T13:32:52.402559+0800","caller":"rafthttp/stream.go:249","msg":"set message encoder","from":"7c0730f7752a5a15","to":"40339f82f8313cbb","stream-type":"stream Message"}
{"level":"info","ts":"2024-07-31T13:32:52.402577+0800","caller":"rafthttp/stream.go:274","msg":"established TCP streaming connection with remote peer","stream-writer-type":"stream Message","local-member-id":"7c0730f7752a5a15","remote-peer-id":"40339f82f8313cbb"}
{"level":"info","ts":"2024-07-31T13:32:52.402689+0800","caller":"rafthttp/stream.go:249","msg":"set message encoder","from":"7c0730f7752a5a15","to":"40339f82f8313cbb","stream-type":"stream MsgApp v2"}
{"level":"info","ts":"2024-07-31T13:32:52.402701+0800","caller":"rafthttp/stream.go:274","msg":"established TCP streaming connection with remote peer","stream-writer-type":"stream MsgApp v2","local-member-id":"7c0730f7752a5a15","remote-peer-id":"40339f82f8313cbb"}
{"level":"info","ts":"2024-07-31T13:32:52.402796+0800","caller":"rafthttp/stream.go:412","msg":"established TCP streaming connection with remote peer","stream-reader-type":"stream Message","local-member-id":"7c0730f7752a5a15","remote-peer-id":"40339f82f8313cbb"}
{"level":"info","ts":"2024-07-31T13:32:52.402843+0800","logger":"raft","caller":"[email protected]/raft.go:1043","msg":"7c0730f7752a5a15 received MsgPreVoteResp from 40339f82f8313cbb at term 2"}
{"level":"info","ts":"2024-07-31T13:32:52.402858+0800","logger":"raft","caller":"[email protected]/raft.go:1649","msg":"7c0730f7752a5a15 has received 2 MsgPreVoteResp votes and 0 vote rejections"}
{"level":"info","ts":"2024-07-31T13:32:52.402876+0800","logger":"raft","caller":"[email protected]/raft.go:883","msg":"7c0730f7752a5a15 became candidate at term 3"}
{"level":"info","ts":"2024-07-31T13:32:52.402887+0800","logger":"raft","caller":"[email protected]/raft.go:1030","msg":"7c0730f7752a5a15 [logterm: 2, index: 5] sent MsgVote request to 40339f82f8313cbb at term 3"}
{"level":"info","ts":"2024-07-31T13:32:52.404043+0800","logger":"raft","caller":"[email protected]/raft.go:1043","msg":"7c0730f7752a5a15 received MsgVoteResp from 7c0730f7752a5a15 at term 3"}
{"level":"info","ts":"2024-07-31T13:32:52.404071+0800","logger":"raft","caller":"[email protected]/raft.go:1649","msg":"7c0730f7752a5a15 has received 1 MsgVoteResp votes and 0 vote rejections"}
{"level":"info","ts":"2024-07-31T13:32:52.405563+0800","logger":"raft","caller":"[email protected]/raft.go:1043","msg":"7c0730f7752a5a15 received MsgVoteResp from 40339f82f8313cbb at term 3"}
{"level":"info","ts":"2024-07-31T13:32:52.405603+0800","logger":"raft","caller":"[email protected]/raft.go:1649","msg":"7c0730f7752a5a15 has received 2 MsgVoteResp votes and 0 vote rejections"}
{"level":"info","ts":"2024-07-31T13:32:52.405630+0800","logger":"raft","caller":"[email protected]/raft.go:938","msg":"7c0730f7752a5a15 became leader at term 3"}
{"level":"info","ts":"2024-07-31T13:32:52.405649+0800","logger":"raft","caller":"[email protected]/node.go:370","msg":"raft.node: 7c0730f7752a5a15 elected leader 7c0730f7752a5a15 at term 3"}
{"level":"warn","ts":"2024-07-31T13:32:56.518983+0800","caller":"rafthttp/probing_status.go:82","msg":"prober found high clock drift","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"40339f82f8313cbb","clock-drift":"2.722000374s","rtt":"500.306964ms"}
@XiuhuaRuan XiuhuaRuan changed the title second node takes 7 seconds to join in a single node cluster The second node took about 7 seconds to join in a single node cluster Jul 31, 2024
@jmhbnz jmhbnz added area/performance priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Aug 1, 2024
@ivanvc
Copy link
Member

ivanvc commented Aug 1, 2024

I'll try to reproduce it. Thanks for the detailed instructions, @XiuhuaRuan.

/assign

@ivanvc
Copy link
Member

ivanvc commented Aug 1, 2024

I was able to duplicate this issue with 3.5.15 and the main branch. I used the following Procfile:

etcd1: bin/etcd --name infra1 --listen-client-urls http://127.0.0.1:2379 --advertise-client-urls http://127.0.0.1:2379 --listen-peer-urls http://127.0.0.1:12380 --initial-advertise-peer-urls http://127.0.0.1:12380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380' --logger=zap --log-outputs=stderr --force-new-cluster
etcd2: bin/etcd --name infra2 --listen-client-urls http://127.0.0.1:22379 --advertise-client-urls http://127.0.0.1:22379 --listen-peer-urls http://127.0.0.1:22380 --initial-advertise-peer-urls http://127.0.0.1:22380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380' --logger=zap --log-outputs=stderr --initial-cluster-state existing

And the following script (after running make build):

rm -rf infra*.etcd

trap 'goreman run stop-all' EXIT

goreman start &

sleep 3
bin/etcdctl --endpoints=127.0.0.1:2379 member list
bin/etcdctl --endpoints=127.0.0.1:2379 member add infra2 --peer-urls=http://127.0.0.1:22380
sleep 1

SECONDS=0
goreman run start etcd2 &

while ! bin/etcdctl --command-timeout=0.1s --endpoints=127.0.0.1:2379 member list | grep infra2; do
  sleep 0.5
done
elapsed="$SECONDS"

goreman run stop-all

echo "second node joined in $elapsed seconds"

Find the files attached and remove the .txt extension to make them work.

@ivanvc ivanvc removed their assignment Aug 1, 2024
@jmhbnz
Copy link
Member

jmhbnz commented Aug 1, 2024

Thanks @ivanvc for reproducing and @XiuhuaRuan for raising. It looks like this could be related to downgrade implementation, @siyuanfoundation is this behaviour expected?

@siyuanfoundation
Copy link
Contributor

When a new member joins an existing cluster, the new member needs to check if its version is compatible with the current cluster version, and the new member's version can be lower than cluster version only when downgrade is enabled. That's the background of needing to check the /downgrade/enabled endpoint.

On the other end, when serving /downgrade/enabled, the server does a linearizableReadNotify check to make sure it reads the latest downgradeInfo from the backend.

In the case of adding a new member to 1 node cluster, after etcdctl member add, the cluster is in a leaderless state because only 1/2 member is running. So the 1 member cannot complete linearizable read, and it will retry until timeout. After which, it would default to downgradeEnabled=false. That's why it could take 7 seconds to proceed.

@serathius this seems to be a bug fix. WDYT?

@serathius
Copy link
Member

Right, the suggestion to fix makes sense, however I would like to better understand the context why it was designed like this before.

@siyuanfoundation
Copy link
Contributor

@YoyinZyc I see you wrote this code. Can you help us understand the context why it was designed like this?

@ahrtr
Copy link
Member

ahrtr commented Aug 3, 2024

The suggestion is to add the second member as a learner firstly, and promote to a voting member it later.

@YoyinZyc
Copy link
Contributor

YoyinZyc commented Aug 5, 2024

@YoyinZyc I see you wrote this code. Can you help us understand the context why it was designed like this?

You mean why we designed to use linearizableReadNotify? It is just like you said, we use it to get the latest downgrade status.

The context timeout issue for the second member sounds like a bug we didn't catch at the early stage.

@serathius
Copy link
Member

I'm thinking about why check /downgrade/enabled endpoint. I think member should be still able to abort after it requests and gets snapshot/entries from the leader. If the answer is that we wanted to detect that early, then maybe linearizable read should be reconsidered.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/performance priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. type/feature
Development

No branches or pull requests

7 participants