-
-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Error enabling msgr2 messenger in Ceph during Ansible playbook execution #11
Comments
That would happen if the Ceph cluster isn't functional. This most commonly happen if you have fully redone your deployment without also wiping the data from the In this scenario you end up with a freshly deployed cluster that's still expecting the servers from the previous deployment and so is unable to achieve a quorum, causing the Ceph API to fall to come online and results in the configuration failure you're getting. |
Ceph monitor initialization issue: monmap min_mon_release older than installed version |
Can you show Normally the logic in the playbook is to set the min-mon-release in the mon map to the same release as |
I have already cleaned the data/ceph/ folder and others. I also used both Quincy and Reef versions. I am lost in this deployment. |
Also the output of |
root@haruunkal:~/incus-deploy# git rev-parse HEAD |
Okay, so it shouldn't be because of lack of support for calling monmaptool with the needed set-min-mon-release, but then it's pretty confusing as to why it would have set a release of 15 when it should have been passed 18. The output of |
Thank you very much for your support. It seems that there was an issue with my lab workstation that was resolved only when I disabled the IPv6 network. After that, the entire process ran perfectly. |
root@haruunkal:~/incus-deploy# monmaptool --print ansible/data/ceph/cluster.e2850e1f-7aab-472e-b6b1-824e19a75071.mon.map |
rsrsrs. Other error: |
Yeah, so the Maybe that older version of You could add the Ceph repository to your own machine and then update to a new version of |
Description:When running the Ansible playbook deploy.yaml from the incus-deploy project, an error occurs while attempting to enable the msgr2 messenger in Ceph. The ceph mon enable-msgr2 command fails with a timeout, indicating that it could not connect to the RADOS cluster.
Error Message:
fatal: [server01]: FAILED! => {"changed": true, "cmd": "ceph mon enable-msgr2", "delta": "0:05:00.070563", "end": "2024-07-11 13:43:18.284315", "msg": "non-zero return code", "rc": 1, "start": "2024-07-11 13:38:18.213752", "stderr": "2024-07-11T13:43:18.279+0000 7ff21f567640 0 monclient(hunting): authenticate timed out after 300\n[errno 110] RADOS timed out (error connecting to the cluster)", "stderr_lines": ["2024-07-11T13:43:18.279+0000 7ff21f567640 0 monclient(hunting): authenticate timed out after 300", "[errno 110] RADOS timed out (error connecting to the cluster)"], "stdout": "", "stdout_lines": []}
fatal: [server03]: FAILED! => {"changed": true, "cmd": "ceph mon enable-msgr2", "delta": "0:05:00.109144", "end": "2024-07-11 13:43:18.320621", "msg": "non-zero return code", "rc": 1, "start": "2024-07-11 13:38:18.211477", "stderr": "2024-07-11T13:43:18.316+0000 7fc48f66d640 0 monclient(hunting): authenticate timed out after 300\n[errno 110] RADOS timed out (error connecting to the cluster)", "stderr_lines": ["2024-07-11T13:43:18.316+0000 7fc48f66d640 0 monclient(hunting): authenticate timed out after 300", "[errno 110] RADOS timed out (error connecting to the cluster)"], "stdout": "", "stdout_lines": []}
fatal: [server02]: FAILED! => {"changed": true, "cmd": "ceph mon enable-msgr2", "delta": "0:05:00.093801", "end": "2024-07-11 13:43:18.316757", "msg": "non-zero return code", "rc": 1, "start": "2024-07-11 13:38:18.222956", "stderr": "2024-07-11T13:43:18.314+0000 7f4cb7b4a640 0 monclient(hunting): authenticate timed out after 300\n[errno 110] RADOS timed out (error connecting to the cluster)", "stderr_lines": ["2024-07-11T13:43:18.314+0000 7f4cb7b4a640 0 monclient(hunting): authenticate timed out after 300", "[errno 110] RADOS timed out (error connecting to the cluster)"], "stdout": "", "stdout_lines": []}
Steps to Reproduce:
Execute the Ansible playbook deploy.yaml in the directory ~/incus-deploy/ansible.
Observe the error during the task to enable the msgr2 messenger in Ceph.
Expected Behavior:
The ceph mon enable-msgr2 command should execute without errors, enabling the msgr2 messenger in the Ceph cluster.
Actual Behavior:
The ceph mon enable-msgr2 command fails with a timeout, indicating it could not connect to the RADOS cluster.
Additional Details:
The error occurs on multiple servers (server01, server02, server03).
Specific error message: RADOS timed out (error connecting to the cluster).
The playbook was executed as root.
Environment:
Ansible version: [2.17.1]]
Ubuntu: 22.04
Execute:
root@haruunkal:
/incus-deploy/terraform# cd ../ansible//incus-deploy/ansible# ansible-playbook deploy.yamlroot@haruunkal:
PLAY [Ceph - Generate cluster keys and maps] ********************************************************************************************
TASK [Gathering Facts] ******************************************************************************************************************
[WARNING]: Platform linux on host server03 is using the discovered Python interpreter at /usr/bin/python3.10, but future installation of
another Python interpreter could change the meaning of that path. See https://docs.ansible.com/ansible-
core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [server03]
[WARNING]: Platform linux on host server04 is using the discovered Python interpreter at /usr/bin/python3.10, but future installation of
another Python interpreter could change the meaning of that path. See https://docs.ansible.com/ansible-
core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [server04]
[WARNING]: Platform linux on host server02 is using the discovered Python interpreter at /usr/bin/python3.10, but future installation of
another Python interpreter could change the meaning of that path. See https://docs.ansible.com/ansible-
core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [server02]
[WARNING]: Platform linux on host server05 is using the discovered Python interpreter at /usr/bin/python3.10, but future installation of
another Python interpreter could change the meaning of that path. See https://docs.ansible.com/ansible-
core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [server05]
[WARNING]: Platform linux on host server01 is using the discovered Python interpreter at /usr/bin/python3.10, but future installation of
another Python interpreter could change the meaning of that path. See https://docs.ansible.com/ansible-
core/2.17/reference_appendices/interpreter_discovery.html for more information.
ok: [server01]
TASK [Generate mon keyring] *************************************************************************************************************
changed: [server03 -> 127.0.0.1]
ok: [server04 -> 127.0.0.1]
ok: [server01 -> 127.0.0.1]
ok: [server05 -> 127.0.0.1]
ok: [server02 -> 127.0.0.1]
TASK [Generate client.admin keyring] ****************************************************************************************************
changed: [server03 -> 127.0.0.1]
ok: [server04 -> 127.0.0.1]
ok: [server01 -> 127.0.0.1]
ok: [server05 -> 127.0.0.1]
ok: [server02 -> 127.0.0.1]
TASK [Generate bootstrap-osd keyring] ***************************************************************************************************
changed: [server03 -> 127.0.0.1]
ok: [server04 -> 127.0.0.1]
ok: [server01 -> 127.0.0.1]
ok: [server05 -> 127.0.0.1]
ok: [server02 -> 127.0.0.1]
TASK [Generate mon map] *****************************************************************************************************************
changed: [server03 -> 127.0.0.1]
ok: [server04 -> 127.0.0.1]
ok: [server01 -> 127.0.0.1]
ok: [server05 -> 127.0.0.1]
ok: [server02 -> 127.0.0.1]
RUNNING HANDLER [Add key to client.admin keyring] ***************************************************************************************
changed: [server03 -> 127.0.0.1]
RUNNING HANDLER [Add key to bootstrap-osd keyring] **************************************************************************************
changed: [server03 -> 127.0.0.1]
RUNNING HANDLER [Add nodes to mon map] **************************************************************************************************
changed: [server03 -> 127.0.0.1] => (item={'name': 'server01', 'ip': 'fd42:60dc:dec6:a73b:216:3eff:fe2d:4c57'})
changed: [server03 -> 127.0.0.1] => (item={'name': 'server02', 'ip': 'fd42:60dc:dec6:a73b:216:3eff:fe05:31f6'})
changed: [server03 -> 127.0.0.1] => (item={'name': 'server03', 'ip': 'fd42:60dc:dec6:a73b:216:3eff:fe01:1c21'})
PLAY [Ceph - Add package repository] ****************************************************************************************************
TASK [Gathering Facts] ******************************************************************************************************************
ok: [server04]
ok: [server05]
ok: [server03]
ok: [server01]
ok: [server02]
TASK [Create apt keyring path] **********************************************************************************************************
ok: [server03]
ok: [server01]
ok: [server05]
ok: [server04]
ok: [server02]
TASK [Add ceph GPG key] *****************************************************************************************************************
changed: [server04]
changed: [server03]
changed: [server05]
changed: [server01]
changed: [server02]
TASK [Get DPKG architecture] ************************************************************************************************************
ok: [server04]
ok: [server03]
ok: [server05]
ok: [server01]
ok: [server02]
TASK [Add ceph package sources] *********************************************************************************************************
changed: [server03]
changed: [server05]
changed: [server04]
changed: [server02]
changed: [server01]
RUNNING HANDLER [Update apt] ************************************************************************************************************
changed: [server01]
changed: [server04]
changed: [server05]
changed: [server03]
changed: [server02]
PLAY [Ceph - Install packages] **********************************************************************************************************
TASK [Gathering Facts] ******************************************************************************************************************
ok: [server01]
ok: [server04]
ok: [server05]
ok: [server03]
ok: [server02]
TASK [Install ceph-common] **************************************************************************************************************
changed: [server02]
changed: [server03]
changed: [server05]
changed: [server04]
changed: [server01]
TASK [Install ceph-mon] *****************************************************************************************************************
skipping: [server04]
skipping: [server05]
changed: [server03]
changed: [server01]
changed: [server02]
TASK [Install ceph-mgr] *****************************************************************************************************************
skipping: [server04]
skipping: [server05]
changed: [server03]
changed: [server02]
changed: [server01]
TASK [Install ceph-mds] *****************************************************************************************************************
skipping: [server04]
skipping: [server05]
changed: [server01]
changed: [server02]
changed: [server03]
TASK [Install ceph-osd] *****************************************************************************************************************
changed: [server01]
changed: [server04]
changed: [server03]
changed: [server02]
changed: [server05]
TASK [Install ceph-rbd-mirror] **********************************************************************************************************
skipping: [server01]
skipping: [server02]
skipping: [server04]
skipping: [server05]
skipping: [server03]
TASK [Install radosgw] ******************************************************************************************************************
skipping: [server01]
skipping: [server02]
skipping: [server03]
changed: [server04]
changed: [server05]
PLAY [Ceph - Set up config and keyrings] ************************************************************************************************
TASK [Transfer the cluster configuration] ***********************************************************************************************
changed: [server01]
changed: [server04]
changed: [server03]
changed: [server05]
changed: [server02]
TASK [Create main storage directory] ****************************************************************************************************
ok: [server04]
ok: [server01]
ok: [server03]
ok: [server05]
ok: [server02]
TASK [Create monitor bootstrap path] ****************************************************************************************************
skipping: [server05]
skipping: [server04]
changed: [server01]
changed: [server03]
changed: [server02]
TASK [Create OSD bootstrap path] ********************************************************************************************************
changed: [server05]
changed: [server04]
changed: [server01]
changed: [server03]
changed: [server02]
TASK [Transfer main admin keyring] ******************************************************************************************************
changed: [server05]
changed: [server03]
changed: [server01]
changed: [server02]
changed: [server04]
TASK [Transfer additional client keyrings] **********************************************************************************************
skipping: [server05]
skipping: [server03]
skipping: [server04]
skipping: [server01]
skipping: [server02]
TASK [Transfer bootstrap mon keyring] ***************************************************************************************************
skipping: [server05]
skipping: [server04]
changed: [server03]
changed: [server02]
changed: [server01]
TASK [Transfer bootstrap mon map] *******************************************************************************************************
skipping: [server05]
skipping: [server04]
changed: [server03]
changed: [server02]
changed: [server01]
TASK [Transfer bootstrap OSD keyring] ***************************************************************************************************
changed: [server05]
changed: [server04]
changed: [server01]
changed: [server03]
changed: [server02]
RUNNING HANDLER [Restart Ceph] **********************************************************************************************************
changed: [server05]
changed: [server03]
changed: [server02]
changed: [server04]
changed: [server01]
PLAY [Ceph - Deploy mon] ****************************************************************************************************************
TASK [Gathering Facts] ******************************************************************************************************************
ok: [server01]
ok: [server02]
ok: [server05]
ok: [server04]
ok: [server03]
TASK [Bootstrap Ceph mon] ***************************************************************************************************************
skipping: [server04]
skipping: [server05]
changed: [server02]
changed: [server03]
changed: [server01]
TASK [Enable and start Ceph mon] ********************************************************************************************************
skipping: [server04]
skipping: [server05]
changed: [server02]
changed: [server03]
changed: [server01]
RUNNING HANDLER [Enable msgr2] **********************************************************************************************************
fatal: [server01]: FAILED! => {"changed": true, "cmd": "ceph mon enable-msgr2", "delta": "0:05:00.070563", "end": "2024-07-11 13:43:18.284315", "msg": "non-zero return code", "rc": 1, "start": "2024-07-11 13:38:18.213752", "stderr": "2024-07-11T13:43:18.279+0000 7ff21f567640 0 monclient(hunting): authenticate timed out after 300\n[errno 110] RADOS timed out (error connecting to the cluster)", "stderr_lines": ["2024-07-11T13:43:18.279+0000 7ff21f567640 0 monclient(hunting): authenticate timed out after 300", "[errno 110] RADOS timed out (error connecting to the cluster)"], "stdout": "", "stdout_lines": []}
fatal: [server03]: FAILED! => {"changed": true, "cmd": "ceph mon enable-msgr2", "delta": "0:05:00.109144", "end": "2024-07-11 13:43:18.320621", "msg": "non-zero return code", "rc": 1, "start": "2024-07-11 13:38:18.211477", "stderr": "2024-07-11T13:43:18.316+0000 7fc48f66d640 0 monclient(hunting): authenticate timed out after 300\n[errno 110] RADOS timed out (error connecting to the cluster)", "stderr_lines": ["2024-07-11T13:43:18.316+0000 7fc48f66d640 0 monclient(hunting): authenticate timed out after 300", "[errno 110] RADOS timed out (error connecting to the cluster)"], "stdout": "", "stdout_lines": []}
fatal: [server02]: FAILED! => {"changed": true, "cmd": "ceph mon enable-msgr2", "delta": "0:05:00.093801", "end": "2024-07-11 13:43:18.316757", "msg": "non-zero return code", "rc": 1, "start": "2024-07-11 13:38:18.222956", "stderr": "2024-07-11T13:43:18.314+0000 7f4cb7b4a640 0 monclient(hunting): authenticate timed out after 300\n[errno 110] RADOS timed out (error connecting to the cluster)", "stderr_lines": ["2024-07-11T13:43:18.314+0000 7f4cb7b4a640 0 monclient(hunting): authenticate timed out after 300", "[errno 110] RADOS timed out (error connecting to the cluster)"], "stdout": "", "stdout_lines": []}
PLAY RECAP ******************************************************************************************************************************
server01 : ok=29 changed=18 unreachable=0 failed=1 skipped=3 rescued=0 ignored=0
server02 : ok=29 changed=18 unreachable=0 failed=1 skipped=3 rescued=0 ignored=0
server03 : ok=32 changed=25 unreachable=0 failed=1 skipped=3 rescued=0 ignored=0
server04 : ok=22 changed=11 unreachable=0 failed=0 skipped=10 rescued=0 ignored=0
server05 : ok=22 changed=11 unreachable=0 failed=0 skipped=10 rescued=0 ignored=0
The text was updated successfully, but these errors were encountered: