Simulating and Recovering from the Loss of the Entire OCR Disk Group in Oracle RAC

Author: Internet

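Test goal: simulate the loss of the entire CRS disk group and work out how to recover from it.
Procedure: recreate a disk group with the same name for the OCR, restore the OCR from a backup, then recreate the voting file.

Environment: RDBMS 11.2.0.4
Reference: restoring an ASM-based OCR after the CRS disk group is completely lost, Linux/Unix platforms (Doc ID 2331776.1)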

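Steps
1 Check the current cluster status, OCRCHECK and VOTEDISK
2 Check the existing OCR backups; if necessary, take a manual backup before the simulation
3 Use dd to simulate the damage
4 Confirm that GI is stopped on all nodes
5 Start CRS in exclusive mode (on the node that holds the latest OCR backup, and only on that node)
6 Create the ASM disk group for CRS (note: the name must match the original CRS disk group; the redundancy level may differ)
7 Restore the OCR from the latest backup
8 Start the CRS daemon on the current node (11.2.0.1 only; not applicable to this version, skipped)
9 Recreate the ASM spfile; otherwise recreating the voting file cannot find the ASM disks. (The root cause is the asm_diskstring parameter: you can set it first, recreate the voting file, and then recreate the ASM spfile.)
10 Recreate the voting file
11 Stop CRS
12 Start CRS on every node
13 Verify CRS, OCRCHECK and VOTEDISK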
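
The steps above can be sketched as a dry-run helper that only prints the commands in order, so the sequence can be reviewed before anything is typed by hand. This is a minimal sketch, not a supported tool: the function name print_recovery_plan and its five arguments are hypothetical, the commands are the ones used later in this walkthrough, and -excl -nocrs assumes 11.2.0.2 or later.

```shell
#!/bin/sh
# Dry run only: prints the recovery commands in order and executes none of them.
# print_recovery_plan and its arguments are illustrative names, not Oracle tools.
print_recovery_plan() {
  grid_home=$1    # Grid Infrastructure home
  diskgroup=$2    # must match the original OCR disk group name
  disk=$3         # replacement disk for the new disk group
  backup=$4       # OCR backup file to restore
  diskstring=$5   # value for asm_diskstring
  cat <<EOF
# all nodes, as root
$grid_home/bin/crsctl stop crs -f
# node that holds the latest OCR backup, as root
$grid_home/bin/crsctl start crs -excl -nocrs
# same node, SQL*Plus as sysasm
create diskgroup $diskgroup external redundancy disk '$disk' attribute 'COMPATIBLE.ASM' = '11.2';
alter system set asm_diskstring='$diskstring';
# same node, as root
$grid_home/bin/ocrconfig -restore $backup
$grid_home/bin/crsctl replace votedisk +$diskgroup
$grid_home/bin/crsctl stop crs -f
# all nodes, as root
$grid_home/bin/crsctl start crs
EOF
}
```

With the values from this test cluster the call would be: print_recovery_plan /u01/app/11.2.0/grid OCR /dev/asm-test3 backup_20220319_091917.ocr '/dev/asm*'. Recreating the ASM spfile is left out of the sketch because the intermediate pfile is edited by hand.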

++++++++++ Detailed walkthrough of the simulation

1  Check the cluster status

[root@o11gr21 ~]# su - grid
[grid@o11gr21 ~]$ crsctl check cluster -all
**************************************************************
o11gr21:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
o11gr22:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
o11gr23:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
[grid@o11gr21 ~]$ 

-- Check the current OCR and voting disks

[grid@o11gr21 o11gr2-cluster]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3324
         Available space (kbytes) :     258796
         ID                       :  953786240
         Device/File Name         :       +OCR
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check bypassed due to non-privileged user

[grid@o11gr21 o11gr2-cluster]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   2cb64f80fd974f61bf9ce6fa0f0bfbde (/dev/asm-ocr1) [OCR]
 2. ONLINE   8a44f55e4f8f4f8dbf3dbe890877a857 (/dev/asm-ocr2) [OCR]
 3. ONLINE   81ffe4523d5b4f24bf14083103925201 (/dev/asm-ocr3) [OCR]
Located 3 voting disk(s).
[grid@o11gr21 o11gr2-cluster]$ 

2  Check the cluster's OCR backups

[grid@o11gr21 ~]$ ocrconfig -showbackup

o11gr21     2019/08/24 13:50:40     /u01/app/11.2.0/grid/cdata/o11gr2-cluster/backup00.ocr

o11gr21     2019/08/10 13:55:54     /u01/app/11.2.0/grid/cdata/o11gr2-cluster/backup01.ocr

o11gr21     2019/08/24 13:50:40     /u01/app/11.2.0/grid/cdata/o11gr2-cluster/day.ocr

o11gr21     2019/08/24 13:50:40     /u01/app/11.2.0/grid/cdata/o11gr2-cluster/week.ocr

o11gr22     2022/03/18 16:45:11     /u01/app/11.2.0/grid/cdata/o11gr2-cluster/backup_20220318_164511.ocr
[grid@o11gr21 ~]$ 

-- After taking one manual backup (for example, ocrconfig -manualbackup run as root), list the backups again

[grid@o11gr21 o11gr2-cluster]$ ocrconfig -showbackup

o11gr21     2019/08/24 13:50:40     /u01/app/11.2.0/grid/cdata/o11gr2-cluster/backup00.ocr

o11gr21     2019/08/10 13:55:54     /u01/app/11.2.0/grid/cdata/o11gr2-cluster/backup01.ocr

o11gr21     2019/08/24 13:50:40     /u01/app/11.2.0/grid/cdata/o11gr2-cluster/day.ocr

o11gr21     2019/08/24 13:50:40     /u01/app/11.2.0/grid/cdata/o11gr2-cluster/week.ocr

o11gr21     2022/03/19 09:19:17     /u01/app/11.2.0/grid/cdata/o11gr2-cluster/backup_20220319_091917.ocr

o11gr22     2022/03/18 16:45:11     /u01/app/11.2.0/grid/cdata/o11gr2-cluster/backup_20220318_164511.ocr
[grid@o11gr21 o11gr2-cluster]$ 
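
Step 5 has to run on the node that holds the most recent backup, so it helps to pick that row out of the listing mechanically. A minimal sketch, assuming the four-column layout shown above (node, date, time, path); the function name latest_ocr_backup is hypothetical and the helper only reads text from standard input.

```shell
#!/bin/sh
# Reads `ocrconfig -showbackup` output on stdin and prints "<node> <path>"
# for the most recent backup. latest_ocr_backup is an illustrative name.
latest_ocr_backup() {
  awk '
    # keep only rows shaped like: node  YYYY/MM/DD  HH:MM:SS  path
    NF == 4 && $2 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]$/ {
      print $2, $3, $1, $4
    }
  ' | LC_ALL=C sort | tail -n 1 | awk '{ print $3, $4 }'
}
```

It would be fed as: ocrconfig -showbackup | latest_ocr_backup. The fixed-width date and time format sorts correctly as plain text, which is why no date parsing is needed.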

3  Find the disks that currently hold the OCR, then simulate the damage

SQL> select group_number ,name from v$asm_diskgroup;

GROUP_NUMBER NAME
------------ ------------------------------
           1 DATA
           2 OCR

SQL> 

SQL> select path from v$asm_disk where GROUP_NUMBER =2;

PATH
--------------------------------------------------------------------------------
/dev/asm-ocr3
/dev/asm-ocr2
/dev/asm-ocr1

SQL> 

-- Use dd to wipe the headers of /dev/asm-ocr1, /dev/asm-ocr2 and /dev/asm-ocr3

dd if=/dev/zero of=/dev/asm-ocr1 bs=1M count=10
dd if=/dev/zero of=/dev/asm-ocr2 bs=1M count=10
dd if=/dev/zero of=/dev/asm-ocr3 bs=1M count=10
[root@o11gr21 ~]# dd if=/dev/zero of=/dev/asm-ocr1 bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0615155 s, 170 MB/s
[root@o11gr21 ~]# dd if=/dev/zero of=/dev/asm-ocr2 bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0584999 s, 179 MB/s
[root@o11gr21 ~]# dd if=/dev/zero of=/dev/asm-ocr3 bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0442179 s, 237 MB/s
[root@o11gr21 ~]# 

-- After this, asmca can no longer find the OCR disks. The alert logs (alerto11gr21.log and alert_+ASM1.log) show the following

2022-03-19 09:28:11.498:
[/u01/app/11.2.0/grid/bin/oraagent.bin(4383)]CRS-5822:Agent '/u01/app/11.2.0/grid/bin/oraagent_grid' disconnected from server. Details at (:CRSAGF00117:) {0:1:9} in /u01/app/11.2.0/grid/log/o11gr21/agent/crsd/oraagent_grid/oraagent_grid.log.
2022-03-19 09:28:11.499:
[/u01/app/11.2.0/grid/bin/orarootagent.bin(4387)]CRS-5822:Agent '/u01/app/11.2.0/grid/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) {0:3:62} in /u01/app/11.2.0/grid/log/o11gr21/agent/crsd/orarootagent_root/orarootagent_root.log.
2022-03-19 09:28:11.501:
[/u01/app/11.2.0/grid/bin/scriptagent.bin(4521)]CRS-5822:Agent '/u01/app/11.2.0/grid/bin/scriptagent_grid' disconnected from server. Details at (:CRSAGF00117:) {0:9:9} in /u01/app/11.2.0/grid/log/o11gr21/agent/crsd/scriptagent_grid/scriptagent_grid.log.
2022-03-19 09:28:12.848:
[crsd(9211)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /u01/app/11.2.0/grid/log/o11gr21/crsd/crsd.log.
2022-03-19 09:28:12.856:
[crsd(9211)]CRS-0804:Cluster Ready Service aborted due to Oracle Cluster Registry error [PROC-26: Error while accessing the physical storage
]. Details at (:CRSD00111:) in /u01/app/11.2.0/grid/log/o11gr21/crsd/crsd.log.
2022-03-19 09:28:17.906:
[ohasd(2734)]CRS-2765:Resource 'ora.crsd' has failed on server 'o11gr21'.
[client(9325)]CRS-10001:19-Mar-22 09:28 ACFS-9203: true
[client(9341)]CRS-10001:19-Mar-22 09:28 ACFS-9203: true
SQL> alter diskgroup OCR dismount force /* ASM SERVER:4059585829 */
GMON dismounting group 2 at 11 for pid 32, osid 8717
NOTE: Disk OCR_0000 in mode 0x7f marked for de-assignment
NOTE: Disk OCR_0001 in mode 0x7f marked for de-assignment
NOTE: Disk OCR_0002 in mode 0x7f marked for de-assignment
SUCCESS: diskgroup OCR was dismounted
SUCCESS: alter diskgroup OCR dismount force /* ASM SERVER:4059585829 */
Sat Mar 19 09:28:08 2022
NOTE: diskgroup resource ora.OCR.dg is offline
SUCCESS: ASM-initiated MANDATORY DISMOUNT of group OCR
Sat Mar 19 09:28:08 2022
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_4284.trc:
ORA-15078: ASM diskgroup was forcibly dismounted
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_4284.trc:
ORA-15078: ASM diskgroup was forcibly dismounted
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_4284.trc:
ORA-15078: ASM diskgroup was forcibly dismounted
WARNING: requested mirror side 1 of virtual extent 5 logical extent 0 offset 708608 is not allocated; I/O request failed
WARNING: requested mirror side 2 of virtual extent 5 logical extent 1 offset 708608 is not allocated; I/O request failed
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_4284.trc:
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
Sat Mar 19 09:28:08 2022
SQL> alter diskgroup OCR check /* proxy */
ORA-15032: not all alterations performed
ORA-15001: diskgroup "OCR" does not exist or is not mounted
ERROR: alter diskgroup OCR check /* proxy */
 Received dirty detach msg from inst 3 for dom 2
List of instances:
 1 2 3
Dirty detach reconfiguration started (new ddet inc 3, cluster inc 6)
 Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 2 invalid = TRUE
 0 GCS resources traversed, 0 cancelled
freeing rdom 2
Dirty Detach Reconfiguration complete
NOTE: client exited [4273]
Sat Mar 19 09:28:12 2022
NOTE: [crsd.bin@o11gr21 (TNS V1-V3) 9211] opening OCR file
Sat Mar 19 09:28:19 2022
NOTE: [crsd.bin@o11gr21 (TNS V1-V3) 9315] opening OCR file
Sat Mar 19 09:28:25 2022
NOTE: [crsd.bin@o11gr21 (TNS V1-V3) 9576] opening OCR file
Sat Mar 19 09:28:32 2022
NOTE: [crsd.bin@o11gr21 (TNS V1-V3) 9626] opening OCR file
Sat Mar 19 09:28:38 2022
NOTE: [crsd.bin@o11gr21 (TNS V1-V3) 9642] opening OCR file
Sat Mar 19 09:28:46 2022
NOTE: [crsd.bin@o11gr21 (TNS V1-V3) 9673] opening OCR file
Sat Mar 19 09:28:52 2022
NOTE: [crsd.bin@o11gr21 (TNS V1-V3) 9716] opening OCR file
Sat Mar 19 09:28:59 2022
NOTE: [crsd.bin@o11gr21 (TNS V1-V3) 9730] opening OCR file
Sat Mar 19 09:29:05 2022
NOTE: [crsd.bin@o11gr21 (TNS V1-V3) 9784] opening OCR file
Sat Mar 19 09:29:11 2022
NOTE: [crsd.bin@o11gr21 (TNS V1-V3) 9797] opening OCR file

-- Checking the cluster shows that CRS is no longer running

[grid@o11gr23 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[grid@o11gr23 ~]$ crsctl check cluster -all
**************************************************************
o11gr21:
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
o11gr22:
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
o11gr23:
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
[grid@o11gr23 ~]$ 
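
On a larger cluster the same check can be reduced to a list of affected nodes. A minimal sketch, assuming the output layout shown above (a line with the node name followed by a colon, then one CRS-nnnn line per daemon); the function name nodes_without_crs is hypothetical.

```shell
#!/bin/sh
# Reads `crsctl check cluster -all` output on stdin and prints every node
# that did not report CRS-4537 (Cluster Ready Services is online).
nodes_without_crs() {
  awk '
    /^[^ ]+:$/   { node = substr($0, 1, length($0) - 1); order[++n] = node; next }
    /^CRS-4537:/ { online[node] = 1 }
    END {
      for (i = 1; i <= n; i++)
        if (!(order[i] in online)) print order[i]
    }
  '
}
```

It would be fed as: crsctl check cluster -all | nodes_without_crs. An empty result means every node reported CRS online.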

4 Stop GI on all nodes

crsctl stop crs -f 

5 Start CRS in exclusive mode on the node that holds the latest OCR backup (here, o11gr21)

crsctl start crs -excl   -- 11.2.0.1
crsctl start crs -excl -nocrs  -- 11.2.0.2 and later
[root@o11gr21 bin]# ./crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'o11gr21'
CRS-2676: Start of 'ora.mdnsd' on 'o11gr21' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'o11gr21'
CRS-2676: Start of 'ora.gpnpd' on 'o11gr21' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'o11gr21'
CRS-2672: Attempting to start 'ora.gipcd' on 'o11gr21'
CRS-2676: Start of 'ora.cssdmonitor' on 'o11gr21' succeeded
CRS-2676: Start of 'ora.gipcd' on 'o11gr21' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'o11gr21'
CRS-2672: Attempting to start 'ora.diskmon' on 'o11gr21'
CRS-2676: Start of 'ora.diskmon' on 'o11gr21' succeeded
CRS-2676: Start of 'ora.cssd' on 'o11gr21' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'o11gr21'
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'o11gr21'
CRS-2672: Attempting to start 'ora.ctssd' on 'o11gr21'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'o11gr21' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'o11gr21'
CRS-2676: Start of 'ora.drivers.acfs' on 'o11gr21' succeeded
CRS-2676: Start of 'ora.ctssd' on 'o11gr21' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'o11gr21' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'o11gr21'
CRS-2676: Start of 'ora.asm' on 'o11gr21' succeeded
[root@o11gr21 bin]# 

-- At this point the GI resource status cannot be queried, but the ASM instance is up

[root@o11gr21 bin]# ./crsctl status resource -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
[root@o11gr21 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4534: Cannot communicate with Event Manager
[root@o11gr21 bin]# ps -ef | grep pmon
grid      11924      1  0 09:41 ?        00:00:00 asm_pmon_+ASM1
root      13358   6219  0 09:47 pts/0    00:00:00 grep pmon
[root@o11gr21 bin]# 

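6  Create the disk group in SQL*Plus. I first created it as CRS, a different name from the original; the OCR restore then failed, so I recreated it under the original name, OCR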

create diskgroup CRS external redundancy disk '/dev/asm-test3' attribute 'COMPATIBLE.ASM' = '11.2';

[grid@o11gr21 o11gr2-cluster]$ sqlplus /nolog

SQL*Plus: Release 11.2.0.4.0 Production on Sat Mar 19 09:48:47 2022

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

SQL> conn / as sysasm
Connected.
SQL> create diskgroup CRS external redundancy disk '/dev/asm-test3' attribute 'COMPATIBLE.ASM' = '11.2';

Diskgroup created.

SQL> 
[grid@o11gr21 o11gr2-cluster]$ asmcmd
ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576      2047     1995                0            1995              0             N  CRS/
ASMCMD> 

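-- Restoring the latest OCR backup fails. The cause is that the new disk group is named CRS, which differs from the disk group name recorded for the OCR. Creating the disk group under the original name (OCR) fixes it.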

[root@o11gr21 bin]# cd /u01/app/11.2.0/grid/cdata/o11gr2-cluster/
[root@o11gr21 o11gr2-cluster]# ls -l
total 37148
-rw-------. 1 root root 7602176 Aug 24  2019 backup00.ocr
-rw-------. 1 root root 7602176 Aug 10  2019 backup01.ocr
-rw-------. 1 root root 7630848 Mar 19 09:19 backup_20220319_091917.ocr
-rw-------. 1 root root 7602176 Aug 24  2019 day.ocr
-rw-------. 1 root root 7602176 Aug 24  2019 week.ocr
[root@o11gr21 o11gr2-cluster]# 
ocrconfig -restore backup_20220319_091917.ocr

[root@o11gr21 o11gr2-cluster]# /u01/app/11.2.0/grid/bin/ocrconfig -restore backup_20220319_091917.ocr
PROT-35: The configured OCR locations are not accessible.
[root@o11gr21 o11gr2-cluster]#

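-- The restore fails. My first suspicion: the original disk group was called OCR and the new one is called CRS, so the names do not match. Recreate it as OCR and retry.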

create diskgroup OCR external redundancy disk '/dev/asm-test3' attribute 'COMPATIBLE.ASM' = '11.2';

SQL> alter diskgroup CRS mount;

Diskgroup altered.

SQL> drop diskgroup CRS;

Diskgroup dropped.

SQL> create diskgroup OCR external redundancy disk '/dev/asm-test3' attribute 'COMPATIBLE.ASM' = '11.2';

Diskgroup created.

SQL> 

7 Restore the OCR backup again. This time it succeeds, so the disk group name does have to match the original

[root@o11gr21 ~]# cd /u01/app/11.2.0/grid/cdata/o11gr2-cluster/
[root@o11gr21 o11gr2-cluster]# /u01/app/11.2.0/grid/bin/ocrconfig -restore backup_20220319_091917.ocr
[root@o11gr21 o11gr2-cluster]# 
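
A plausible explanation for the earlier PROT-35 is that the configured OCR location still pointed at +OCR while only +CRS existed. On Linux that location is normally recorded in /etc/oracle/ocr.loc as an ocrconfig_loc= entry. The sketch below extracts the disk group name from that file's contents so it can be compared with the disk group about to be created; the function name ocr_diskgroup_from_loc is hypothetical, and the file path is an assumption about a default Linux install rather than something shown in this test.

```shell
#!/bin/sh
# Reads the contents of ocr.loc on stdin and prints the ASM disk group name
# from the first "ocrconfig_loc=+NAME" entry (prints nothing for non-ASM locations).
ocr_diskgroup_from_loc() {
  sed -n 's/^ocrconfig_loc=+\([^/]*\).*$/\1/p' | head -n 1
}
```

It would be fed as: ocr_diskgroup_from_loc < /etc/oracle/ocr.loc. If the printed name differs from the new disk group, the restore has no matching target.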

8 Start CRS on the current node (11.2.0.1 only; skipped for 11.2.0.4)

crsctl start res ora.crsd -init

9 Recreate the voting file. This fails; the logs show that the cause is a null asm_diskstring

crsctl replace votedisk +OCR 

[root@o11gr21 ~]# /u01/app/11.2.0/grid/bin/crsctl replace votedisk +OCR 
CRS-4602: Failed 27 to add voting file aee3a63cd40b4fc6bf2f334b0738088f.
Failed to replace voting disk group with +OCR.
CRS-4000: Command Replace failed, or completed with errors.
[root@o11gr21 ~]# /u01/app/11.2.0/grid/bin/crsctl query css votedisk
Located 0 voting disk(s).
[root@o11gr21 ~]# 

-- Check the alert log

[client(5938)]CRS-1002:The OCR was restored from file backup_20220319_091917.ocr.
2022-03-19 10:44:13.513:
[cssd(3002)]CRS-1638:Unable to locate voting file with ID aee3a63c-d40b4fc6-bf2f334b-0738088f that is being added to the list of configured voting files; details at (:CSSNM00022:) in /u01/app/11.2.0/grid/log/o11gr21/cssd/ocssd.log
2022-03-19 10:44:13.513:
[cssd(3002)]CRS-1638:Unable to locate voting file with ID aee3a63c-d40b4fc6-bf2f334b-0738088f that is being added to the list of configured voting files; details at (:CSSNM00027:) in /u01/app/11.2.0/grid/log/o11gr21/cssd/ocssd.log
2022-03-19 10:44:13.513:
[cssd(3002)]CRS-1630:A configuration change request failed because not all the new voting files were discovered; Details at (:CSSNM00012:) in /u01/app/11.2.0/grid/log/o11gr21/cssd/ocssd.log
2022-03-19 10:44:15.786:
[cssd(3002)]CRS-1638:Unable to locate voting file with ID aee3a63c-d40b4fc6-bf2f334b-0738088f that is being added to the list of configured voting files; details at (:CSSNM00022:) in /u01/app/11.2.0/grid/log/o11gr21/cssd/ocssd.log
2022-03-19 10:44:15.786:
[cssd(3002)]CRS-1638:Unable to locate voting file with ID aee3a63c-d40b4fc6-bf2f334b-0738088f that is being added to the list of configured voting files; details at (:CSSNM00027:) in /u01/app/11.2.0/grid/log/o11gr21/cssd/ocssd.log
2022-03-19 10:44:15.786:
[cssd(3002)]CRS-1630:A configuration change request failed because not all the new voting files were discovered; Details at (:CSSNM00012:) in /u01/app/11.2.0/grid/log/o11gr21/cssd/ocssd.log

-- Continue with ocssd.log

2022-03-19 10:44:13.512: [    CSSD][3005736704]clssnmReadDiscoveryProfile: voting file discovery string()
2022-03-19 10:44:13.512: [    CSSD][3005736704]clssnmvDDiscThread: using discovery string  for voting file add
2022-03-19 10:44:13.512: [   SKGFD][3005736704]Discovery with str::

2022-03-19 10:44:13.512: [   SKGFD][3005736704]UFS discovery with ::

2022-03-19 10:44:13.512: [   SKGFD][3005736704]Execute glob on the string /dev/raw/*
2022-03-19 10:44:13.513: [   SKGFD][3005736704]running stat on disk:/dev/raw/rawctl
2022-03-19 10:44:13.513: [   SKGFD][3005736704]Fetching UFS disk :/dev/raw/rawctl:

2022-03-19 10:44:13.513: [   SKGFD][3005736704]OSS discovery with ::

-- Fix the ASM disk string: asm_diskstring is null, so set it to a value that matches the disks

SQL> show parameter string

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring                       string
SQL>      

SQL> alter system set asm_diskstring ='/dev/asm*';

System altered.

SQL> 
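
Why the empty string failed can be checked with a small pattern test. A minimal sketch with a hypothetical function name, disk_is_discoverable: it treats the discovery string as a single shell pattern (comma-separated lists are not handled), and for an empty string it falls back to /dev/raw/*, which mirrors the glob seen in the ocssd.log above and is an assumption rather than documented behavior for every platform.

```shell
#!/bin/sh
# Returns 0 if the device path matches the discovery string, 1 otherwise.
# An empty discovery string falls back to /dev/raw/* (taken from the ocssd.log
# in this test; an assumption, not a general rule).
disk_is_discoverable() {
  diskstring=${1:-/dev/raw/*}
  device=$2
  case "$device" in
    $diskstring) return 0 ;;   # unquoted on purpose: used as a pattern
  esac
  return 1
}
```

With the values from this test, disk_is_discoverable '' /dev/asm-test3 fails while disk_is_discoverable '/dev/asm*' /dev/asm-test3 succeeds, which is consistent with CSSD not finding the new voting file until asm_diskstring was set.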

10  After the change, recreating the voting disk succeeds

[root@o11gr21 ~]# /u01/app/11.2.0/grid/bin/crsctl replace votedisk +OCR 
Successful addition of voting disk 837c32bd1b864f42bfb55750cb0babca.
Successfully replaced voting disk group with +OCR.
CRS-4266: Voting file(s) successfully replaced
[root@o11gr21 ~]# 
[root@o11gr21 ~]# /u01/app/11.2.0/grid/bin/crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   837c32bd1b864f42bfb55750cb0babca (/dev/asm-test3) [OCR]
Located 1 voting disk(s).
[root@o11gr21 ~]# 
[root@o11gr21 ~]# /u01/app/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3304
         Available space (kbytes) :     258816
         ID                       :  953786240
         Device/File Name         :       +OCR
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

[root@o11gr21 ~]# 

-- Recreate the spfile for ASM

create pfile='/home/grid/asm_pfile' from memory ;

SQL> create pfile='/home/grid/asm_pfile' from memory ;

File created.

SQL> 

-- After editing asm_pfile, recreate the spfile from it

create spfile='+OCR' from pfile='/home/grid/asm_pfile';

SQL> create spfile='+OCR' from pfile='/home/grid/asm_pfile';

File created.

SQL> 

SQL> shutdown immediate
ASM diskgroups volume disabled
ASM diskgroups dismounted
ASM instance shutdown
SQL> startup
ASM instance started

Total System Global Area 1135747072 bytes
Fixed Size                  2260728 bytes
Variable Size            1108320520 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted
ASM diskgroups volume enabled
SQL> 

SQL> show parameter spfile

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      +OCR/o11gr2-cluster/asmparamet
                                                 erfile/registry.253.1099739169
SQL> show parameter string

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring                       string      /dev/asm*
SQL> 

-- Alternatively, create the spfile from the following basic parameters (not used in this run)

vi /home/grid/pfile_tmp

*.asm_power_limit=1
*.diagnostic_dest='/u01/app/oragrid'
*.instance_type='asm'
*.large_pool_size=12M
*.remote_login_passwordfile='EXCLUSIVE'

11  Stop CRS

crsctl stop crs -f

[root@o11gr21 bin]# ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'o11gr21'
CRS-2673: Attempting to stop 'ora.ctssd' on 'o11gr21'
CRS-2673: Attempting to stop 'ora.asm' on 'o11gr21'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'o11gr21'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'o11gr21'
CRS-2677: Stop of 'ora.mdnsd' on 'o11gr21' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'o11gr21' succeeded
CRS-2677: Stop of 'ora.drivers.acfs' on 'o11gr21' succeeded
CRS-2677: Stop of 'ora.asm' on 'o11gr21' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'o11gr21'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'o11gr21' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'o11gr21'
CRS-2677: Stop of 'ora.cssd' on 'o11gr21' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'o11gr21'
CRS-2677: Stop of 'ora.gipcd' on 'o11gr21' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'o11gr21'
CRS-2677: Stop of 'ora.gpnpd' on 'o11gr21' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'o11gr21' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@o11gr21 bin]# 

12  Start CRS on all nodes of the cluster

[root@o11gr21 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@o11gr21 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
[root@o11gr21 bin]# 
[root@o11gr21 bin]# 
[root@o11gr21 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[root@o11gr21 bin]# 

[root@o11gr21 bin]# ./crsctl check cluster -all
**************************************************************
o11gr21:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
o11gr22:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
o11gr23:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
[root@o11gr21 bin]# 

13 Verify with ocrcheck and the voting disk query

[root@o11gr21 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   837c32bd1b864f42bfb55750cb0babca (/dev/asm-test3) [OCR]
Located 1 voting disk(s).
[root@o11gr21 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3304
         Available space (kbytes) :     258816
         ID                       :  953786240
         Device/File Name         :       +OCR
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

[root@o11gr21 bin]# 
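
The final check can be reduced to a number. A minimal sketch, assuming the table layout shown above; the function name online_votedisk_count is hypothetical. Here the expected value is 1, because the rebuilt disk group uses external redundancy on a single disk, whereas the original disk group carried three voting files.

```shell
#!/bin/sh
# Reads `crsctl query css votedisk` output on stdin and prints the number
# of voting files whose STATE column is ONLINE.
online_votedisk_count() {
  awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ } END { print n + 0 }'
}
```

It would be fed as: crsctl query css votedisk | online_votedisk_count.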

This completes the simulation and repair of the CRS disk group failure.

END

Source: https://blog.csdn.net/xxzhaobb/article/details/123591657