
[Ceph] Handling an OSD failure caused by lost LVM metadata


I. Introduction

1. Overview

References:
RHEL / CentOS : How to rebuild LVM from Archive (metadata backups)
Red Hat Enterprise Linux 7: Logical Volume Manager Administration
Analysis of OSD auto-start at boot with BlueStore

This article describes how the OSD fault was diagnosed and repaired, in two parts: LVM repair and bringing the OSD back up.

2. Problem description

root@node163:~# ceph -s
  cluster:
    id:     9bc47ff2-5323-4964-9e37-45af2f750918
    health: HEALTH_WARN
            too many PGs per OSD (256 > max 250)

  services:
    mon: 3 daemons, quorum node163,node164,node165
    mgr: node163(active), standbys: node164, node165
    mds: ceph-1/1/1 up  {0=node165=up:active}, 2 up:standby
    osd: 3 osds: 2 up, 2 in

  data:
    pools:   3 pools, 256 pgs
    objects: 46 objects, 100MiB
    usage:   2.20GiB used, 198GiB / 200GiB avail
    pgs:     256 active+clean

root@node163:~# ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF 
-1       0.29306 root default                             
-5       0.09769     host node163                         
 1   hdd 0.09769         osd.1      down        0 1.00000 
-3       0.09769     host node164                         
 0   hdd 0.09769         osd.0        up  1.00000 1.00000 
-7       0.09769     host node165                         
 2   hdd 0.09769         osd.2        up  1.00000 1.00000 
root@node163:~# lvs
root@node163:~# vgs
root@node163:~# pvs
root@node163:~# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0  100G  0 disk 
vda    254:0    0  100G  0 disk 
├─vda1 254:1    0  487M  0 part /boot
├─vda2 254:2    0 54.4G  0 part /
├─vda3 254:3    0    1K  0 part 
├─vda5 254:5    0 39.5G  0 part /data
├─vda6 254:6    0  5.6G  0 part [SWAP]
└─vda7 254:7    0  105M  0 part /boot/efi

root@node163:~# df -h
Filesystem                                   Size  Used Avail Use% Mounted on
udev                                         2.0G     0  2.0G   0% /dev
tmpfs                                        394M   47M  347M  12% /run
/dev/vda2                                     54G   12G   40G  23% /
tmpfs                                        2.0G     0  2.0G   0% /dev/shm
tmpfs                                        5.0M     0  5.0M   0% /run/lock
tmpfs                                        2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/vda1                                    464M  178M  258M  41% /boot
/dev/vda5                                     39G   48M   37G   1% /data
/dev/vda7                                    105M  550K  105M   1% /boot/efi
tmpfs                                        394M     0  394M   0% /run/user/0

root@node163:~# dd if=/dev/sda bs=512 count=4 | hexdump -C
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
4+0 records in
4+0 records out
2048 bytes (2.0 kB, 2.0 KiB) copied, 0.000944624 s, 2.2 MB/s
00000800

II. Recovery procedure

The output above shows that the disk's LVM metadata has been lost and the disk is not mounted, which is why the OSD fails to start.
The recovery therefore consists of two parts: LVM repair, then bringing the OSD back up.

1. LVM repair

1.1 Overview

The LVM configuration directory is laid out as follows. Whenever the configuration of a VG or LV changes, LVM writes a backup and an archive copy of the metadata:

/etc/lvm/              main LVM configuration directory
/etc/lvm/archive       metadata archives (one file per metadata-changing operation, e.g. vgcreate, lvcreate, lvchange)
/etc/lvm/backup        metadata backups (the latest complete VG metadata)
/etc/lvm/lvm.conf      main LVM configuration file, which also controls metadata backup and archiving

vgcfgrestore --list <vg_name> lists the archived metadata for a VG. By walking through the recorded operations you can identify the most complete configuration, i.e. the newest archive file to use when recovering accidentally destroyed LVM metadata.
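A minimal sketch of how to pick the candidate file, assuming the default /etc/lvm/archive location and using <vg_name> as a placeholder: list the archives for the VG, then compare their sizes and descriptions, since the largest (latest) archive normally carries the most complete metadata.

vgcfgrestore --list <vg_name>
ls -lSr /etc/lvm/archive/<vg_name>_*.vg     # sorted by size, largest last: usually the most complete metadata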

root@node163:/etc/lvm/archive# vgcfgrestore --list ceph-07e80157-b488-41e5-b217-4079d52edb08

  File:        /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00000-999427028.vg
  Couldn't find device with uuid UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN.
  VG name:        ceph-07e80157-b488-41e5-b217-4079d52edb08
  Description:    Created *before* executing '/sbin/vgcreate --force --yes ceph-07e80157-b488-41e5-b217-4079d52edb08 /dev/sda'
  Backup Time:    Wed Jun 29 14:53:47 2022


  File:        /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00001-98007334.vg
  VG name:        ceph-07e80157-b488-41e5-b217-4079d52edb08
  Description:    Created *before* executing '/sbin/lvcreate --yes -l 100%FREE -n osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d ceph-07e80157-b488-41e5-b217-4079d52edb08'
  Backup Time:    Wed Jun 29 14:53:47 2022


  File:        /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00002-65392131.vg
  VG name:        ceph-07e80157-b488-41e5-b217-4079d52edb08
  Description:    Created *before* executing '/sbin/lvchange --addtag ceph.type=block /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d'
  Backup Time:    Wed Jun 29 14:53:47 2022


  File:        /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00003-1190179092.vg
  VG name:        ceph-07e80157-b488-41e5-b217-4079d52edb08
  Description:    Created *before* executing '/sbin/lvchange --addtag ceph.block_device=/dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d'
  Backup Time:    Wed Jun 29 14:53:47 2022


  File:        /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00004-1217184452.vg
  VG name:        ceph-07e80157-b488-41e5-b217-4079d52edb08
  Description:    Created *before* executing '/sbin/lvchange --addtag ceph.vdo=0 /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d'
  Backup Time:    Wed Jun 29 14:53:48 2022


  File:        /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00005-2051164187.vg
  VG name:        ceph-07e80157-b488-41e5-b217-4079d52edb08
  Description:    Created *before* executing '/sbin/lvchange --addtag ceph.osd_id=1 /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d'
  Backup Time:    Wed Jun 29 14:53:48 2022

By default, pvcreate writes the physical volume label into the second 512-byte sector of the disk; the label begins with the string LABELONE.
Whether a PV device is intact can therefore be checked with dd if=<pv_disk_path> bs=512 count=2 | hexdump -C.
Note: the PV label carries, among other things, the PV UUID and the size of the block device.
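As a quick scripted check (a sketch; <pv_disk_path> stands for the OSD data disk, /dev/sda in this case), the second 512-byte sector can be tested for the LABELONE magic:

dd if=<pv_disk_path> bs=512 skip=1 count=1 2>/dev/null | grep -qa LABELONE \
    && echo "PV label present" || echo "PV label missing or wiped"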

-- Faulty node --
root@node163:~# dd if=/dev/sda bs=512 count=2 | hexdump -C
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
2+0 records in
2+0 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000803544 s, 1.3 MB/s
00000400


-- Healthy node --
root@node164:/etc/lvm/archive# dd if=/dev/sda bs=512 count=2 | hexdump -C
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
2+0 records in
2+0 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000111721 s, 9.2 MB/s
00000200  4c 41 42 45 4c 4f 4e 45  01 00 00 00 00 00 00 00  |LABELONE........|
00000210  1c 9f f4 1e 20 00 00 00  4c 56 4d 32 20 30 30 31  |.... ...LVM2 001|
00000220  59 6c 6a 79 78 64 59 53  66 4e 44 54 4b 7a 36 64  |YljyxdYSfNDTKz6d|
00000230  41 31 44 56 46 79 52 78  5a 52 39 58 61 49 45 52  |A1DVFyRxZR9XaIER|
00000240  00 00 00 00 19 00 00 00  00 00 10 00 00 00 00 00  |................|
00000250  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000260  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
00000270  00 f0 0f 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000280  00 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00  |................|
00000290  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400

In addition, as long as the physical disk has not been overwritten with new data, the on-disk LVM metadata can still be read with dd if=<pv_disk_path> count=12 | strings:

root@node163:~# dd if=/dev/sda count=12 | strings 
 LVM2 x[5A%r0N*>
ceph-07e80157-b488-41e5-b217-4079d52edb08 {
id = "e1Ge2Y-6DAn-EZzA-6btK-MGMW-qVrP-ldcE9R"
seqno = 1
format = "lvm2"
status = ["RESIZEABLE", "READ", "WRITE"]
flags = []
extent_size = 8192
max_lv = 0
max_pv = 0
metadata_copies = 0
physical_volumes {
pv0 {
id = "UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN"
device = "/dev/sda"
status = ["ALLOCATABLE"]
flags = []
dev_size = 209715200
pe_start = 2048
pe_count = 25599
# Generated by LVM2 version 2.02.133(2) (2015-10-30): Wed Jun 29 14:53:47 2022
contents = "Text Format Volume Group"
version = 1
description = ""
creation_host = "node163"    # Linux node163 4.4.58-20180615.kylin.server.YUN+-generic #kylin SMP Tue Jul 10 14:55:31 CST 2018 aarch64
creation_time = 1656485627    # Wed Jun 29 14:53:47 2022
ceph-07e80157-b488-41e5-b217-4079d52edb08 {
id = "e1Ge2Y-6DAn-EZzA-6btK-MGMW-qVrP-ldcE9R"
seqno = 2
format = "lvm2"
status = ["RESIZEABLE", "READ", "WRITE"]
flags = []
extent_size = 8192
max_lv = 0
max_pv = 0
metadata_copies = 0
physical_volumes {
pv0 {
id = "UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN"
device = "/dev/sda"
status = ["ALLOCATABLE"]
flags = []
dev_size = 209715200
pe_start = 2048
pe_count = 25599
logical_volumes {
osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d {
12+0 records in
12+0 records out
id = "oV0BZG-WLSM-v2jL-god

1.2 Collect the LVM information

Before repairing the LVM metadata, first obtain the LV name (normally osd-block-<osd_fsid>) and the VG name (normally prefixed with ceph-).
Note: the osd_fsid can be looked up with ceph osd dump | grep <osd_id> | awk '{print $NF}'.

# The LV for osd.1 is osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d and its VG is ceph-07e80157-b488-41e5-b217-4079d52edb08

root@node163:/etc/lvm/archive# grep `ceph osd dump | grep osd.1 | awk '{print $NF}'` -R *
ceph-07e80157-b488-41e5-b217-4079d52edb08_00001-98007334.vg:description = "Created *before* executing '/sbin/lvcreate --yes -l 100%FREE -n osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d ceph-07e80157-b488-41e5-b217-4079d52edb08'"
# Reviewing the recorded VG operations and comparing the archive file sizes shows that the most complete archive is /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg, and that the PV UUID is UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN

root@node163:/etc/lvm/archive# vgcfgrestore --list ceph-07e80157-b488-41e5-b217-4079d52edb08
  File:        /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg
  VG name:        ceph-07e80157-b488-41e5-b217-4079d52edb08
  Description:    Created *before* executing '/sbin/lvchange --addtag ceph.block_uuid=oV0BZG-WLSM-v2jL-godE-o6vd-fdfu-w7Ms5w /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d'
  Backup Time:    Wed Jun 29 14:53:48 2022

root@node163:/etc/lvm/archive# cat ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg | grep -A 5 physical_volumes 
    physical_volumes {

        pv0 {
            id = "UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN"
            device = "/dev/sda"    # Hint only

1.3 Reconstruct the PV label

The initial checks showed that the PV information on osd.1's physical disk has been wiped, so vgcfgrestore alone cannot restore the VG configuration.
Instead, take a spare disk and create a new PV on it using the original PV UUID and the archive file, then dd the first two sectors of that spare disk onto osd.1's original disk to restore the original PV information.
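Condensed, the label transplant looks like the sketch below (device names match this environment: /dev/sdb is the spare disk on node122, /dev/sda is the wiped PV on node163; the PV UUID and archive file come from section 1.2). The full session follows.

# On a host with a spare disk, recreate a PV carrying the original UUID from the archive file
pvcreate -ff --uuid UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN --restorefile ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg /dev/sdb
# Capture its first two 512-byte sectors (sector 1 holds the LABELONE label)
dd if=/dev/sdb of=file_label bs=512 count=2
# Copy file_label to the faulty node, back up the original sectors there, then write the label onto /dev/sda
dd if=/dev/sda of=/home/file_backup bs=512 count=2
dd if=/home/file_label of=/dev/sda bs=512 count=2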

root@node163:/etc/lvm/archive# vgcfgrestore -f ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg ceph-07e80157-b488-41e5-b217-4079d52edb08
  Couldn't find device with uuid UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN.
  PV unknown device missing from cache
  Format-specific setup for unknown device failed
  Restore failed.
[root@node122 ~]# pvcreate -ff --uuid UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN --restorefile ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg /dev/sdb 
  Couldn't find device with uuid UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN.
  Physical volume "/dev/sdb" successfully created.
[root@node122 ~]# dd if=/dev/sdb of=file_label bs=512 count=2
2+0 records in
2+0 records out
1024 bytes (1.0 kB) copied, 0.219809 s, 4.7 kB/s

[root@node122 ~]# dd if=./file_label | hexdump -C
2+0 records in
2+0 records out
1024 bytes (1.0 kB) copied, 6.1274e-05 s, 16.7 MB/s
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200  4c 41 42 45 4c 4f 4e 45  01 00 00 00 00 00 00 00  |LABELONE........|
00000210  2b a3 c4 46 20 00 00 00  4c 56 4d 32 20 30 30 31  |+..F ...LVM2 001|
00000220  55 6a 78 71 75 48 69 48  4a 65 4e 59 31 41 42 64  |UjxquHiHJeNY1ABd|
00000230  51 66 30 30 6f 44 6a 32  32 43 68 65 65 4f 54 4e  |Qf00oDj22CheeOTN|
00000240  00 00 00 00 19 00 00 00  00 00 10 00 00 00 00 00  |................|
00000250  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000260  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
00000270  00 f0 0f 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000280  00 00 00 00 00 00 00 00  02 00 00 00 00 00 00 00  |................|
00000290  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400

1.4 Restore the PV

root@node163:/etc/lvm/archive# dd if=/dev/sda bs=512 count=2 | hexdump -C
2+0 records in
2+0 records out
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000761583 s, 1.3 MB/s

root@node163:/etc/lvm/archive# dd if=/dev/sda of=/home/file_backup bs=512 count=2 
2+0 records in
2+0 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000825143 s, 1.2 MB/s
root@node163:/etc/lvm/archive# dd if=/home/file_label of=/dev/sda bs=512 count=2
2+0 records in
2+0 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.00122898 s, 833 kB/s

root@node163:/etc/lvm/archive# dd if=/dev/sda bs=512 count=2 | hexdump -C
2+0 records in
2+0 records out
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200  4c 41 42 45 4c 4f 4e 45  01 00 00 00 00 00 00 00  |LABELONE........|
00000210  2b a3 c4 46 20 00 00 00  4c 56 4d 32 20 30 30 31  |+..F ...LVM2 001|
00000220  55 6a 78 71 75 48 69 48  4a 65 4e 59 31 41 42 64  |UjxquHiHJeNY1ABd|
00000230  51 66 30 30 6f 44 6a 32  32 43 68 65 65 4f 54 4e  |Qf00oDj22CheeOTN|
00000240  00 00 00 00 19 00 00 00  00 00 10 00 00 00 00 00  |................|
00000250  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000260  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
00000270  00 f0 0f 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000280  00 00 00 00 00 00 00 00  02 00 00 00 00 00 00 00  |................|
00000290  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.00244905 s, 418 kB/s
root@node163:/etc/lvm/archive# pvcreate -ff --uuid UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN --restorefile ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg /dev/sda 
  Couldn't find device with uuid UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN.
  Physical volume "/dev/sda" successfully created

root@node163:/etc/lvm/archive# pvs
  PV         VG   Fmt  Attr PSize   PFree  
  /dev/sda        lvm2 ---  100.00g 100.00g

1.5 Restore the VG/LV

root@node163:/etc/lvm/archive# vgcfgrestore -f ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg ceph-07e80157-b488-41e5-b217-4079d52edb08
  Restored volume group ceph-07e80157-b488-41e5-b217-4079d52edb08

root@node163:/etc/lvm/archive# vgs
  VG                                        #PV #LV #SN Attr   VSize   VFree
  ceph-07e80157-b488-41e5-b217-4079d52edb08   1   1   0 wz--n- 100.00g    0 

root@node163:/etc/lvm/archive# lvs
  LV                                             VG                                        Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d ceph-07e80157-b488-41e5-b217-4079d52edb08 -wi------- 100.00g

root@node163:~# ll /dev/mapper/
total 0
drwxr-xr-x  2 root root      80 Jul  1 17:28 ./
drwxr-xr-x 19 root root    4520 Jul  1 17:28 ../
lrwxrwxrwx  1 root root       7 Jul  1 17:33 ceph--07e80157--b488--41e5--b217--4079d52edb08-osd--block--8cd1658a--97d7--42d6--8f67--6a076c6fb42d -> ../dm-0
crw-------  1 root root 10, 236 Jul  1 17:28 control
root@node163:~# dd if=/dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d bs=512 count=2 | hexdump -C
2+0 records in
2+0 records out
00000000  62 6c 75 65 73 74 6f 72  65 20 62 6c 6f 63 6b 20  |bluestore block |
00000010  64 65 76 69 63 65 0a 38  63 64 31 36 35 38 61 2d  |device.8cd1658a-|
00000020  39 37 64 37 2d 34 32 64  36 2d 38 66 36 37 2d 36  |97d7-42d6-8f67-6|
00000030  61 30 37 36 63 36 66 62  34 32 64 0a 02 01 16 01  |a076c6fb42d.....|
00000040  00 00 8c d1 65 8a 97 d7  42 d6 8f 67 6a 07 6c 6f  |....e...B..gj.lo|
00000050  b4 2d 00 00 c0 ff 18 00  00 00 fd f6 bb 62 ac 78  |.-...........b.x|
00000060  dc 18 04 00 00 00 6d 61  69 6e 08 00 00 00 06 00  |......main......|
00000070  00 00 62 6c 75 65 66 73  01 00 00 00 31 09 00 00  |..bluefs....1...|
00000080  00 63 65 70 68 5f 66 73  69 64 24 00 00 00 39 62  |.ceph_fsid$...9b|
00000090  63 34 37 66 66 32 2d 35  33 32 33 2d 34 39 36 34  |c47ff2-5323-4964|
000000a0  2d 39 65 33 37 2d 34 35  61 66 32 66 37 35 30 39  |-9e37-45af2f7509|
000000b0  31 38 0a 00 00 00 6b 76  5f 62 61 63 6b 65 6e 64  |18....kv_backend|
000000c0  07 00 00 00 72 6f 63 6b  73 64 62 05 00 00 00 6d  |....rocksdb....m|
000000d0  61 67 69 63 14 00 00 00  63 65 70 68 20 6f 73 64  |agic....ceph osd|
000000e0  20 76 6f 6c 75 6d 65 20  76 30 32 36 09 00 00 00  | volume v026....|
000000f0  6d 6b 66 73 5f 64 6f 6e  65 03 00 00 00 79 65 73  |mkfs_done....yes|
00000100  07 00 00 00 6f 73 64 5f  6b 65 79 28 00 00 00 41  |....osd_key(...A|
00000110  51 44 35 39 72 74 69 41  62 65 2f 4c 52 41 41 65  |QD59rtiAbe/LRAAe|
00000120  6a 4b 6e 42 6d 56 4e 6a  4a 75 37 4e 78 37 79 37  |jKnBmVNjJu7Nx7y7|
00000130  58 38 57 55 41 3d 3d 05  00 00 00 72 65 61 64 79  |X8WUA==....ready|
00000140  05 00 00 00 72 65 61 64  79 06 00 00 00 77 68 6f  |....ready....who|
00000150  61 6d 69 01 00 00 00 31  7e 77 c5 2d 00 00 00 00  |ami....1~w.-....|
00000160  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.00132415 s, 773 kB/s

2. Bringing the OSD back up

OSD mounting is driven by ceph-volume. Once the LVM metadata has been repaired, run systemctl start ceph-volume@lvm-<osd_id>-`ceph osd dump | grep <osd_id> | awk '{print $NF}'` to recreate the LVM-related mounts and start the OSD.
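A small sketch of the same invocation with the OSD id and fsid captured in variables first (osd.1 is assumed here, matching this case; grep -w avoids accidentally matching osd.10 and the like):

OSD_ID=1
OSD_FSID=$(ceph osd dump | grep -w "osd.${OSD_ID}" | awk '{print $NF}')
systemctl start ceph-volume@lvm-${OSD_ID}-${OSD_FSID}
systemctl status ceph-volume@lvm-${OSD_ID}-${OSD_FSID}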

root@node163:~# systemctl start ceph-volume@lvm-1-`ceph osd dump | grep osd.1 | awk '{print $NF'}`
root@node163:~# systemctl status ceph-volume@lvm-1-`ceph osd dump | grep osd.1 | awk '{print $NF'}`
● ceph-volume@lvm-1-8cd1658a-97d7-42d6-8f67-6a076c6fb42d.service - Ceph Volume activation: lvm-1-8cd1658a-97d7-42d6-8f67-6a076c6fb42d
   Loaded: loaded (/lib/systemd/system/ceph-volume@.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Fri 2022-07-01 17:54:49 CST; 4s ago
 Main PID: 55683 (code=exited, status=0/SUCCESS)

Jul 01 17:54:48 node163 systemd[1]: Starting Ceph Volume activation: lvm-1-8cd1658a-97d7-42d6-8f67-6a076c6fb42d...
Jul 01 17:54:49 node163 sh[55683]: Running command: ceph-volume lvm trigger 1-8cd1658a-97d7-42d6-8f67-6a076c6fb42d
Jul 01 17:54:49 node163 systemd[1]: Started Ceph Volume activation: lvm-1-8cd1658a-97d7-42d6-8f67-6a076c6fb42d.

root@node163:~# ceph osd in osd.1
marked in osd.1. 

root@node163:~# ceph -s
  cluster:
    id:     9bc47ff2-5323-4964-9e37-45af2f750918
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node163,node164,node165
    mgr: node163(active), standbys: node164, node165
    mds: ceph-1/1/1 up  {0=node165=up:active}, 2 up:standby
    osd: 3 osds: 3 up, 3 in

  data:
    pools:   3 pools, 256 pgs
    objects: 46 objects, 100MiB
    usage:   3.21GiB used, 297GiB / 300GiB avail
    pgs:     256 active+clean

Note: if the OSD still fails to come up after the command above, run ceph-volume lvm trigger <osd_id>-<osd_fsid> directly to watch each step it performs and pinpoint where it gets stuck.

root@node163:~# ceph-volume lvm trigger 1-8cd1658a-97d7-42d6-8f67-6a076c6fb42d
Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-1
Running command: restorecon /var/lib/ceph/osd/ceph-1
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-1
Running command: ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d --path /var/lib/ceph/osd/ceph-1
Running command: ln -snf /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d /var/lib/ceph/osd/ceph-1/block
Running command: chown -h ceph:ceph /var/lib/ceph/osd/ceph-1/block
Running command: chown -R ceph:ceph /dev/dm-0
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-1
Running command: systemctl enable ceph-volume@lvm-1-8cd1658a-97d7-42d6-8f67-6a076c6fb42d
Running command: systemctl enable --runtime ceph-osd@1
Running command: systemctl start ceph-osd@1
--> ceph-volume lvm activate successful for osd ID: 1

Source: https://www.cnblogs.com/luxf0/p/16435630.html