I'm not sure how to troubleshoot this any further and would appreciate some help.

When I delete a compute service with openstack compute service delete a04910ef-1441-4949-8ffb-6393c22141b2, I can restore the service in OpenStack by restarting nova-compute on the machine with sudo systemctl restart nova-compute.

However, although the service comes back, the node (os-compute03.maas in my case) is no longer listed by openstack hypervisor list:

$ openstack compute service list --service nova-compute
+--------------------------------------+--------------+-------------------+------+----------+-------+----------------------------+
| ID                                   | Binary       | Host              | Zone | Status   | State | Updated At                 |
+--------------------------------------+--------------+-------------------+------+----------+-------+----------------------------+
| d1fadd40-6035-4f76-b8c1-5b981d003832 | nova-compute | os-compute08.maas | nova | disabled | down  | 2024-09-25T14:02:55.000000 |
| 7e02a0bd-0e53-45bc-9680-99a33d98c05b | nova-compute | os-compute04.maas | nova | enabled  | up    | 2024-10-18T11:43:12.000000 |
| 7f47d65e-b041-44dc-927e-085effdf0ec9 | nova-compute | os-compute09.maas | nova | enabled  | up    | 2024-10-18T11:43:07.000000 |
| 3a7527b3-3664-4ae1-ac90-ac04e835ee5b | nova-compute | os-compute03.maas | nova | enabled  | up    | 2024-10-18T11:43:13.000000 |
+--------------------------------------+--------------+-------------------+------+----------+-------+----------------------------+

$ openstack hypervisor list
+----+---------------------+-----------------+----------+-------+
| ID | Hypervisor Hostname | Hypervisor Type | Host IP  | State |
+----+---------------------+-----------------+----------+-------+
|  1 | os-compute08.maas   | QEMU            | 10.0.1.8 | down  |
|  2 | os-compute04.maas   | QEMU            | 10.0.1.4 | up    |
|  3 | os-compute09.maas   | QEMU            | 10.0.1.9 | up    |
+----+---------------------+-----------------+----------+-------+

^ os-compute03.maas is missing
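
The comparison above can be scripted. This is a minimal sketch, not part of the original post: the function name missing_hosts is hypothetical, and the commented usage assumes the standard -f value -c <column> output options of the openstack client.

```shell
# missing_hosts SERVICES HYPERVISORS
# Both arguments are newline-separated hostname lists.
# Prints every hostname that appears in SERVICES but not in HYPERVISORS.
missing_hosts() {
    # -F: fixed strings (dots are literal), -x: whole-line match,
    # -v: keep only the lines that match none of the patterns in $2.
    printf '%s\n' "$1" | grep -F -x -v -e "$2"
}

# Example usage (assumes admin credentials are already loaded):
#   services=$(openstack compute service list --service nova-compute -f value -c Host)
#   hypervisors=$(openstack hypervisor list -f value -c "Hypervisor Hostname")
#   missing_hosts "$services" "$hypervisors"
```

With the two tables shown above as input, the only host reported would be os-compute03.maas.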


$ sudo nova-manage cell_v2 list_hosts
                                                                                                                                     
Modules with known eventlet monkey patching issues were imported prior to eventlet monkey patching: urllib3. This warning can usually be ignored if the caller is only importing and not executing nova code.
Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".
+-----------+--------------------------------------+-------------------+
| Cell Name |              Cell UUID               |      Hostname     |
+-----------+--------------------------------------+-------------------+
|   cell1   | d3dfd353-b4ee-4293-b362-1e0175ebe337 | os-compute04.maas |
|   cell1   | d3dfd353-b4ee-4293-b362-1e0175ebe337 | os-compute08.maas |
|   cell1   | d3dfd353-b4ee-4293-b362-1e0175ebe337 | os-compute09.maas |
+-----------+--------------------------------------+-------------------+
$ sudo nova-manage cell_v2 discover_hosts --verbose
Modules with known eventlet monkey patching issues were imported prior to eventlet monkey patching: urllib3. This warning can usually be ignored if the caller is only importing and not executing nova code.
Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".
Found 2 cell mappings.
Skipping cell0 since it does not contain hosts.
Getting computes from cell 'cell1': d3dfd353-b4ee-4293-b362-1e0175ebe337
Found 0 unmapped computes in cell: d3dfd353-b4ee-4293-b362-1e0175ebe337

I also tried restarting the Nova controller services:

$ sudo systemctl restart nova-scheduler
$ sudo systemctl restart nova-conductor

When the Nova controller services are restarted, the logs show the following:

2024-10-18 11:53:44.939 630048 INFO nova.scheduler.host_manager [None req-eb8c8117-136c-4a13-ac57-d32d4099388a - - - - - -] Received a sync request from an unknown host 'os-compute09.maas'. Re-created its InstanceList.
2024-10-18 11:53:44.939 630046 INFO nova.scheduler.host_manager [None req-eb8c8117-136c-4a13-ac57-d32d4099388a - - - - - -] Received a sync request from an unknown host 'os-compute09.maas'. Re-created its InstanceList.
2024-10-18 11:53:44.940 630045 INFO nova.scheduler.host_manager [None req-eb8c8117-136c-4a13-ac57-d32d4099388a - - - - - -] Received a sync request from an unknown host 'os-compute09.maas'. Re-created its InstanceList.
2024-10-18 11:53:44.968 630047 INFO nova.scheduler.host_manager [None req-eb8c8117-136c-4a13-ac57-d32d4099388a - - - - - -] Received a sync request from an unknown host 'os-compute09.maas'. Re-created its InstanceList.
2024-10-18 11:54:02.539 630048 INFO nova.scheduler.host_manager [None req-ad410200-f99f-4dc7-896d-637dd9620568 - - - - - -] Received a sync request from an unknown host 'os-compute04.maas'. Re-created its InstanceList.
2024-10-18 11:54:02.541 630047 INFO nova.scheduler.host_manager [None req-ad410200-f99f-4dc7-896d-637dd9620568 - - - - - -] Received a sync request from an unknown host 'os-compute04.maas'. Re-created its InstanceList.
2024-10-18 11:54:02.542 630046 INFO nova.scheduler.host_manager [None req-ad410200-f99f-4dc7-896d-637dd9620568 - - - - - -] Received a sync request from an unknown host 'os-compute04.maas'. Re-created its InstanceList.
2024-10-18 11:54:02.543 630045 INFO nova.scheduler.host_manager [None req-ad410200-f99f-4dc7-896d-637dd9620568 - - - - - -] Received a sync request from an unknown host 'os-compute04.maas'. Re-created its InstanceList.
2024-10-18 11:55:06.432 630048 INFO nova.scheduler.host_manager [None req-53ac9845-afb5-48ec-b24f-344d2afcfc3f - - - - - -] Host mapping not found for host os-compute03.maas. Not tracking instance info for this host.
2024-10-18 11:55:06.432 630048 INFO nova.scheduler.host_manager [None req-53ac9845-afb5-48ec-b24f-344d2afcfc3f - - - - - -] Received a sync request from an unknown host 'os-compute03.maas'. Re-created its InstanceList.
2024-10-18 11:55:06.435 630047 INFO nova.scheduler.host_manager [None req-53ac9845-afb5-48ec-b24f-344d2afcfc3f - - - - - -] Host mapping not found for host os-compute03.maas. Not tracking instance info for this host.
2024-10-18 11:55:06.435 630046 INFO nova.scheduler.host_manager [None req-53ac9845-afb5-48ec-b24f-344d2afcfc3f - - - - - -] Host mapping not found for host os-compute03.maas. Not tracking instance info for this host.
2024-10-18 11:55:06.436 630045 INFO nova.scheduler.host_manager [None req-53ac9845-afb5-48ec-b24f-344d2afcfc3f - - - - - -] Host mapping not found for host os-compute03.maas. Not tracking instance info for this host.
2024-10-18 11:55:06.436 630047 INFO nova.scheduler.host_manager [None req-53ac9845-afb5-48ec-b24f-344d2afcfc3f - - - - - -] Received a sync request from an unknown host 'os-compute03.maas'. Re-created its InstanceList.
2024-10-18 11:55:06.437 630045 INFO nova.scheduler.host_manager [None req-53ac9845-afb5-48ec-b24f-344d2afcfc3f - - - - - -] Received a sync request from an unknown host 'os-compute03.maas'. Re-created its InstanceList.
2024-10-18 11:55:06.436 630046 INFO nova.scheduler.host_manager [None req-53ac9845-afb5-48ec-b24f-344d2afcfc3f - - - - - -] Received a sync request from an unknown host 'os-compute03.maas'. Re-created its InstanceList.

Note in particular: Host mapping not found for host os-compute03.maas. Not tracking instance info for this host. I assumed this is the point at which the node should be added to the Nova cell_v2 mappings, but that does not happen.
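
To pull the affected hosts out of a long scheduler log, a small filter like the following may help. It is only a sketch: the function name hosts_without_mapping is hypothetical, and the commented usage assumes the scheduler logs are reachable through journalctl, which may not be the case if Nova writes to a log directory instead.

```shell
# hosts_without_mapping
# Reads nova-scheduler log lines on stdin and prints, once each,
# the hosts named in "Host mapping not found for host ..." messages.
hosts_without_mapping() {
    sed -n 's/.*Host mapping not found for host \([^ ]*\)\. Not tracking.*/\1/p' | sort -u
}

# Example usage (assumption: logs are in the systemd journal):
#   sudo journalctl -u nova-scheduler | hosts_without_mapping
```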


I believe the following database records relate to the compute nodes:

mysql> SELECT hypervisor_hostname, host_ip FROM nova.compute_nodes;
+---------------------+----------+
| hypervisor_hostname | host_ip  |
+---------------------+----------+
| os-compute08.maas   | 10.0.1.8 |
| os-compute04.maas   | 10.0.1.4 |
| os-compute09.maas   | 10.0.1.9 |
| os-compute03.maas   | 10.0.1.3 |
+---------------------+----------+
4 rows in set (0.00 sec)

os-compute03.maas is indeed not listed in the nova_api.host_mappings table:

mysql> SELECT * FROM nova_api.host_mappings;
+---------------------+------------+----+---------+-------------------+
| created_at          | updated_at | id | cell_id | host              |
+---------------------+------------+----+---------+-------------------+
| 2024-07-02 21:23:10 | NULL       |  1 |       2 | os-compute08.maas |
| 2024-07-02 21:23:11 | NULL       |  2 |       2 | os-compute04.maas |
| 2024-07-02 21:23:11 | NULL       |  3 |       2 | os-compute09.maas |
+---------------------+------------+----+---------+-------------------+

I have not tried adding the host to this table manually. I assume this should happen automatically, but when? What am I missing?

How can I debug this further?


Best answer

After three days of struggling, I found the solution just as I finally decided to ask for help and write this post.

If anyone else runs into this, try running the following command:

$ sudo nova-manage cell_v2 map_cell_and_hosts
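
A possible explanation, offered as an assumption rather than something confirmed in this thread: as far as I understand, discover_hosts only considers rows in nova.compute_nodes whose mapped column is 0. A compute node record that still has mapped = 1 but no row in nova_api.host_mappings would then be skipped, which would be consistent with the "Found 0 unmapped computes" output. The sketch below looks for such hosts. The function name stale_mapped_hosts is hypothetical, and the commented mysql calls assume the same database access used for the queries in the question.

```shell
# stale_mapped_hosts NODES MAPPINGS
# NODES:    lines of "<hypervisor_hostname> <mapped>" (space or tab separated)
# MAPPINGS: newline-separated hostnames from nova_api.host_mappings
# Prints hosts flagged as mapped (mapped = 1) that have no host mapping row.
stale_mapped_hosts() {
    printf '%s\n' "$1" | awk '$2 == 1 { print $1 }' | grep -F -x -v -e "$2"
}

# Example usage (-N drops column headers, -B gives tab-separated rows):
#   nodes=$(sudo mysql -N -B -e 'SELECT hypervisor_hostname, mapped FROM nova.compute_nodes WHERE deleted = 0')
#   maps=$(sudo mysql -N -B -e 'SELECT host FROM nova_api.host_mappings')
#   stale_mapped_hosts "$nodes" "$maps"
```

If a host is printed before running map_cell_and_hosts and no longer printed afterwards, that would support this explanation.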

Reference: