Errors
The platform.linux_distribution problem
Symptom
RuntimeError: AttributeError: module 'platform' has no attribute 'linux_distribution'
Cause
platform.linux_distribution was removed in Python 3.8, so ceph-deploy fails with this error on newer interpreters.
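A quick way to confirm the cause on the deploy node; the snippet below is a minimal sketch that uses only the standard library and assumes nothing about ceph itself:

import platform
import sys

print(sys.version)                                 # 3.8 or newer here
print(hasattr(platform, 'linux_distribution'))     # prints False on 3.8+, hence the AttributeError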
Solution
The fix is crude but effective: patch the program logic directly.
Edit /usr/lib/python3/dist-packages/ceph_deploy/hosts/remotes.py
Find the platform_information function and wrap the following two lines in a try statement:
linux_distribution = _linux_distribution or platform.linux_distribution
distro, release, codename = linux_distribution()
Then set default values for the three variables. After the change it looks like this:
def platform_information(_linux_distribution=None):
    """ detect platform information from remote host """
    """
    linux_distribution = _linux_distribution or platform.linux_distribution
    distro, release, codename = linux_distribution()
    """
    distro = release = codename = None
    try:
        linux_distribution = _linux_distribution or platform.linux_distribution
        distro, release, codename = linux_distribution()
    except AttributeError:
        pass
Save and exit, then try running it again.
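If you would rather end up with real values than None, the same try block can fall back to the third-party distro package instead of just passing. This is only a sketch of a variant of the patch above (the rest of the function stays unchanged), and it assumes the distro package is installed on every node, for example with pip3 install distro:

    distro = release = codename = None
    try:
        linux_distribution = _linux_distribution or platform.linux_distribution
        distro, release, codename = linux_distribution()
    except AttributeError:
        try:
            import distro as distro_mod    # third-party replacement for the removed API
            distro, release, codename = distro_mod.name(), distro_mod.version(), distro_mod.codename()
        except ImportError:
            pass                           # distro not installed: keep the None defaults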
ceph-deploy output parsing problem
Symptom
[a1][INFO ] Running command: sudo fdisk -l
[ceph_deploy][ERROR ] Traceback (most recent call last):
[ceph_deploy][ERROR ] File "/usr/lib/python3/dist-packages/ceph_deploy/util/decorators.py", line 69, in newfunc
[ceph_deploy][ERROR ] return f(*a, **kw)
[ceph_deploy][ERROR ] File "/usr/lib/python3/dist-packages/ceph_deploy/cli.py", line 166, in _main
[ceph_deploy][ERROR ] return args.func(args)
[ceph_deploy][ERROR ] File "/usr/lib/python3/dist-packages/ceph_deploy/osd.py", line 434, in disk
[ceph_deploy][ERROR ] disk_list(args, cfg)
[ceph_deploy][ERROR ] File "/usr/lib/python3/dist-packages/ceph_deploy/osd.py", line 375, in disk_list
[ceph_deploy][ERROR ] if line.startswith('Disk /'):
[ceph_deploy][ERROR ] TypeError: startswith first arg must be bytes or a tuple of bytes, not str
[ceph_deploy][ERROR ]
Cause
A type mismatch: under Python 3 the remote fdisk output is read as bytes, so the prefix passed to startswith must also be bytes.
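The mismatch is easy to reproduce outside ceph-deploy; a minimal sketch (the sample fdisk line is made up for illustration):

line = b'Disk /dev/sdb: 1 TiB, 1099511627776 bytes, 2147483648 sectors'   # remote output arrives as bytes

# line.startswith('Disk /')           # TypeError: startswith first arg must be bytes or a tuple of bytes, not str
print(line.startswith(b'Disk /'))     # True once the prefix is bytes as well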
Solution
Open /usr/lib/python3/dist-packages/ceph_deploy/osd.py
Find the following line:
if line.startswith('Disk /'):
Replace it with:
if line.startswith(b'Disk /'):
Save and exit, then run it again.
Leftover block problem
If you force-removed the previous cluster with ceph-deploy, the old OSD data may still be left on the disks.
So when you redeploy and use ceph-deploy to zap a disk, you may hit an error like this:
[x1][INFO ] Running command: /usr/sbin/ceph-volume lvm zap /dev/sdb
[x1][WARNIN] --> Zapping: /dev/sdb
[x1][WARNIN] --> --destroy was not specified, but zapping a whole device will remove the partition table
[x1][WARNIN] stderr: wipefs: error: /dev/sdb: probing initialization failed: Device or resource busy
[x1][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[x1][WARNIN] stderr: wipefs: error: /dev/sdb: probing initialization failed: Device or resource busy
[x1][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[x1][WARNIN] stderr: wipefs: error: /dev/sdb: probing initialization failed: Device or resource busy
[x1][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[x1][WARNIN] stderr: wipefs: error: /dev/sdb: probing initialization failed: Device or resource busy
[x1][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[x1][WARNIN] stderr: wipefs: error: /dev/sdb: probing initialization failed: Device or resource busy
[x1][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[x1][WARNIN] stderr: wipefs: error: /dev/sdb: probing initialization failed: Device or resource busy
[x1][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[x1][WARNIN] stderr: wipefs: error: /dev/sdb: probing initialization failed: Device or resource busy
[x1][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[x1][WARNIN] stderr: wipefs: error: /dev/sdb: probing initialization failed: Device or resource busy
[x1][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[x1][WARNIN] --> RuntimeError: could not complete wipefs on device: /dev/sdb
[x1][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-volume lvm zap /dev/sdb
To fix this, the leftover block device has to be removed by hand. First inspect the disk layout with lsblk:
lsblk
On my machine the output looks like this:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop1 7:1 0 55.5M 1 loop /snap/core18/2409
loop2 7:2 0 71.3M 1 loop /snap/lxd/16099
loop3 7:3 0 61.9M 1 loop /snap/core20/1518
loop4 7:4 0 67.8M 1 loop /snap/lxd/22753
loop5 7:5 0 47M 1 loop /snap/snapd/16292
loop6 7:6 0 55.6M 1 loop /snap/core18/2538
sda 8:0 0 50G 0 disk
├─sda1 8:1 0 1M 0 part
└─sda2 8:2 0 50G 0 part /
sdb 8:16 0 1T 0 disk
└─ceph--66fb0189--7b8a--423e--a26c--f4a85545f396-osd--block--df953059--5020--4c8c--8b82--4dd8a22a0b1c
253:0 0 1024G 0 lvm
rbd0 252:0 0 20G 0 disk
rbd1 252:16 0 20G 0 disk
rbd2 252:32 0 2G 0 disk
rbd3 252:48 0 20G 0 disk
rbd4 252:64 0 4G 0 disk
rbd5 252:80 0 8G 0 disk
rbd6 252:96 0 8G 0 disk
rbd7 252:112 0 8G 0 disk
You can see that under the disk to be wiped, /dev/sdb, there is a storage block left over from the previous ceph cluster: ceph--66fb0189--7b8a--423e--a26c--f4a85545f396-osd--block--df953059--5020--4c8c--8b82--4dd8a22a0b1c
Removing it fixes the problem:
sudo dmsetup remove --force ceph--66fb0189--7b8a--423e--a26c--f4a85545f396-osd--block--df953059--5020--4c8c--8b82--4dd8a22a0b1c
After removing this block device, the disk may still contain LVM metadata or partition signatures; force-wipe them with wipefs:
wipefs -af /dev/sdb
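Alternatively, as the "--destroy was not specified" warning in the log above hints, running ceph-volume directly on the node with the --destroy flag is supposed to remove the leftover LVM volumes as part of the zap, which may save the manual dmsetup step. Treat it as an option to try only when the data on the disk is disposable:
sudo ceph-volume lvm zap --destroy /dev/sdb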