scylla_cluster, scylla_node: watch_log_for_alive before watch_rest_for_alive

#462 introduced watch_rest_for_alive, which replaced the calls
to watch_log_for_alive on the scylla node(s) start path.

But if a node is killed and then restarted, and the other nodes
miss the kill event, `watch_rest_for_alive` will consider
that node as already up as seen by the other nodes,
whereas previously `watch_log_for_alive` waited until
the other nodes discovered this node as up again, based
on marks taken right before (re)starting that node.

This change brings the watch_log_for_alive call back,
ahead of the watch_rest_for_alive call.

Fixes #563

Signed-off-by: Benny Halevy <[email protected]>
bhalevy committed Mar 12, 2024
1 parent 5ee8825 commit a3f7956
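
The race described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyObserver`, its methods, and the log line text are hypothetical stand-ins invented for illustration, not ccmlib's or Scylla's actual API or log format. It shows why a state-based check can pass on stale information, while a check limited to log lines written after a mark cannot.

```python
# Toy model only: ToyObserver and everything in it is hypothetical,
# not ccmlib's implementation.

class ToyObserver:
    """An 'old' node that records what it has learned about its peers."""

    def __init__(self):
        self.log = []         # append-only log lines
        self.peer_state = {}  # last known state per peer (REST-style view)

    def mark_log(self):
        # Remember the current end of the log, like a mark taken
        # right before (re)starting another node.
        return len(self.log)

    def notice_up(self, peer):
        # The observer discovers that `peer` is up.
        self.log.append(f"{peer} is now UP")
        self.peer_state[peer] = "UP"

    def rest_sees_alive(self, peer):
        # State-based check: passes even if the recorded state is stale.
        return self.peer_state.get(peer) == "UP"

    def log_sees_alive(self, peer, from_mark):
        # Event-based check: only lines appended after the mark count.
        return f"{peer} is now UP" in self.log[from_mark:]


observer = ToyObserver()
observer.notice_up("node2")  # node2 joined earlier

# node2 is killed, but the observer misses the kill event,
# so its recorded state for node2 stays "UP".
mark = observer.mark_log()   # mark taken right before restarting node2

stale_rest = observer.rest_sees_alive("node2")    # passes on stale state
stale_log = observer.log_sees_alive("node2", mark)  # nothing new after the mark

observer.notice_up("node2")  # observer rediscovers the restarted node2
fresh_log = observer.log_sees_alive("node2", mark)
```

In this model the state-based check succeeds before the observer has rediscovered the restarted node, which is the situation the commit guards against by waiting on the log from the mark first.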
Showing 2 changed files with 4 additions and 2 deletions.
3 changes: 2 additions & 1 deletion ccmlib/scylla_cluster.py
@@ -156,9 +156,10 @@ def start_nodes(self, nodes=None, no_wait=False, verbose=False, wait_for_binary_
                            verbose=verbose, from_mark=mark)

         if wait_other_notice:
-            for old_node, _ in marks:
+            for old_node, mark in marks:
                 for node, _, _ in started:
                     if old_node is not node:
+                        old_node.watch_log_for_alive(node, from_mark=mark, timeout=self.default_wait_other_notice_timeout)
                         old_node.watch_rest_for_alive(node, timeout=self.default_wait_other_notice_timeout,
                                                       wait_normal_token_owner=wait_normal_token_owner)

3 changes: 2 additions & 1 deletion ccmlib/scylla_node.py
@@ -332,8 +332,9 @@ def _start_scylla(self, args, marks, update_pid,
             self.wait_for_binary_interface(from_mark=from_mark, process=self._process_scylla, timeout=t)

         if wait_other_notice:
-            for node, _ in marks:
+            for node, mark in marks:
                 t = self.cluster.default_wait_other_notice_timeout
+                node.watch_log_for_alive(self, from_mark=mark, timeout=t)
                 node.watch_rest_for_alive(self, timeout=t, wait_normal_token_owner=wait_normal_token_owner)
                 self.watch_rest_for_alive(node, timeout=t, wait_normal_token_owner=wait_normal_token_owner)
