18c & 19c Physical Standby Switchover Best Practices using SQL*Plus (Doc ID 2485237.1)
Author: Internet
APPLIES TO:
Oracle Database - Enterprise Edition - Version 18.3.0.0.0 and later
Information in this document applies to any platform.
GOAL
This document describes the switchover steps for 18c and 19c.
SOLUTION
Prerequisites
Latest PSU/bundle patches
Master Note for Database Proactive Patch Program (Doc ID 756671.1)
Setup/configuration verification
- The primary and standby should be running the same RDBMS version.
- Check the alert logfiles and make sure there are no errors.
- Query v$database_block_corruption and v$nonlogged_block on both the primary and the standby and make sure there is no corruption.
- Make sure the primary and physical standby configuration is healthy and there are no errors in redo transport or redo apply.
Verify the Physical Standby Database Is Performing Properly
Optionally, use the queries below to check the redo transport and apply status.
On primary
To check the remote redo transport status: if there are any errors, V$ARCHIVE_DEST.ERROR shows the details.
SQL> col DEST_NAME for a20
SQL> col DESTINATION for a25
SQL> col ERROR for a15
SQL> col ALTERNATE for a20
SQL> set lines 1000
SQL> select DEST_NAME,DESTINATION,ERROR,ALTERNATE,TYPE,status,VALID_TYPE,VALID_ROLE from V$ARCHIVE_DEST where STATUS <>'INACTIVE';
To check the last archivelog created at the primary:
SQL> select thread#, max(sequence#) "Last Primary Seq Generated" from gv$archived_log val, gv$database vdb where val.resetlogs_change# = vdb.resetlogs_change# group by thread# order by 1;
On Standby:
Use the queries below to check the last archivelog received from the primary database and the last one applied (if the database is RAC, the result is displayed for each thread).
Query output is: last archive log sequence received by the standby
SQL> select thread#, max(sequence#) "Last Standby Seq Received" from gv$archived_log val, gv$database vdb where val.resetlogs_change# = vdb.resetlogs_change# group by thread# order by 1;
Query output is: last archive log sequence applied by the standby
SQL> select thread#, max(sequence#) "Last Standby Seq Applied" from gv$archived_log val, gv$database vdb where val.resetlogs_change# = vdb.resetlogs_change# and val.applied in ('YES','IN-MEMORY') group by thread# order by 1;
Verify Initialization Parameters
The following parameters in particular must be configured correctly:
log_archive_config : should include the primary and standby databases (if there are multiple standby databases, all of them should be listed)
fal_server : remote server from which archivelogs can be fetched
db_unique_name : unique name within this configuration
log_archive_dest_n : destination used to send archived logs/redo to the remote database
In a configuration with one primary and one standby database, the primary should have a log_archive_dest_n destination that sends redo to the standby using VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE), and the standby should have a similar destination.
Once the switchover completes, the new primary uses its log_archive_dest_n configuration to send archive logs/redo.
Ensure 'compatible' is set to the same value on the primary and the standby.
If the file locations differ between primary and standby, use db_file_name_convert and log_file_name_convert for datafiles and redo logfiles respectively.
Refer: Set Primary Database Initialization Parameters
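The parameters listed above can be compared side by side with a query such as the following. This is a minimal sketch rather than part of the original note; it assumes the parameter names appear in lowercase in v$parameter and that the same query is run on both databases.

```sql
-- Run on both the primary and the standby, then compare the output.
select name, value
from v$parameter
where name in ('log_archive_config', 'fal_server', 'db_unique_name',
               'compatible', 'db_file_name_convert', 'log_file_name_convert')
order by name;
```

The log_archive_dest_n settings themselves are easier to read from V$ARCHIVE_DEST, as in the redo transport query earlier in this document.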
Understand and Test Fallback Options
Check: A.4 Problems Switching Over to a Physical Standby Database
Pre-Switchover
Ensure the prerequisites are completely verified, then follow the guidance below for a successful switchover.
These steps should be executed before the planned outage starts, to make sure there are no issues.
Verify that redo/archive log apply is healthy and there is no gap
Run the query below on the physical standby to check the last archive log sequence received and applied for each thread. It does not include the current sequence, because the SQL extracts its details from v$archived_log.
SQL> select thread#, max(sequence#) "Last Standby Seq Applied"
from gv$archived_log val, gv$database vdb
where val.resetlogs_change# = vdb.resetlogs_change#
and val.applied in ('YES','IN-MEMORY')
group by thread# order by 1;
Check the MRP process status (it should be running and applying the logs)
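One possible way to check the MRP status is to query v$dataguard_process, the view this document also uses after the switchover. The filter on the process name is an assumption of this sketch; if no row is returned, managed recovery is most likely not running.

```sql
-- Run on the standby.
select name, role, action, thread#, sequence#
from v$dataguard_process
where name like 'MRP%';
```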
Commands to stop and start the managed recovery process:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
If standby recovery (MRP) was started with a delay, or if the standby is always maintained with a lag, the switchover will spend time applying logs to bring the standby in sync.
Before the switchover, keep the archive log apply lag to a minimum; this reduces the total switchover time window.
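The current lag can be observed on the standby through v$dataguard_stats. This is a sketch added for illustration and is not part of the original note.

```sql
-- Run on the standby: shows how far transport and apply are behind the primary.
select name, value, time_computed
from v$dataguard_stats
where name in ('transport lag', 'apply lag');
```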
Verify the apply delay configurations
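As a sketch of how the delay configuration might be checked, the DELAY_MINS column of V$ARCHIVE_DEST on the primary shows whether a DELAY attribute is set for a destination. The choice of columns here is illustrative.

```sql
-- Run on the primary: a non-zero DELAY_MINS indicates a configured apply delay.
select dest_id, dest_name, delay_mins, status
from v$archive_dest
where status <> 'INACTIVE';
```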
If the archive log gap is large:
1) Refer to Monitoring Redo Transport Services to make sure there is no transport lag
2) The standby can also be recovered using an incremental backup taken from the primary
Restoring and Recovering Files Over the Network
Check the datafiles & Tempfiles status
All datafiles are expected to be online on both the primary and the standby. If any file is offline (or otherwise not in ONLINE status), restore and recover it so that the standby database files match the primary database files.
If files were taken offline and need to be online after the switchover, bring them online:
SQL> ALTER DATABASE DATAFILE 'datafile-name' ONLINE;
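To find datafiles that are not online, a query along these lines can be run on both the primary and the standby. It is a minimal sketch that relies on the STATUS column of v$datafile; ideally it returns no rows.

```sql
-- SYSTEM is the normal status for SYSTEM tablespace datafiles, so it is excluded too.
select file#, name, status
from v$datafile
where status not in ('ONLINE', 'SYSTEM');
```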
For Tempfiles:
If the listed tempfiles are sufficient for the application, no action is needed.
If more tempfiles need to be added, check the primary as well and add the additional files.
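The tempfiles can be listed with a query such as the one below, run on both databases for comparison. The size_mb alias is only for readability in this sketch.

```sql
select file#, name, status, bytes/1024/1024 as size_mb
from v$tempfile;
```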
Online and standby redo logfile configuration
Online redologfile:
col member for a50
select a.thread#,a.group#,a.bytes,a.blocksize,b.type,a.status,b.member from v$log a,v$logfile b where a.group#=b.group#;
When the above query is run on the primary, a.status is typically INACTIVE, ACTIVE or CURRENT.
On the standby, the expected a.status is UNUSED, CLEARING or CLEARING_CURRENT. If the output shows anything else, the online redo logfiles need to be cleared manually.
For Standby redo logfile(SRL):
The standby redo logfile status is expected to be UNASSIGNED or ACTIVE.
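The SRL status can be read from v$standby_log. A minimal sketch:

```sql
-- Run on the standby: one row per standby redo log group.
select group#, thread#, sequence#, bytes, status
from v$standby_log
order by thread#, group#;
```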
Command to clear ORL group:
If an ORL or SRL needs to be cleared on the standby, the managed recovery process has to be stopped first.
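Put together, clearing a log group on the standby might look like the sequence below. This is a sketch: the group number 4 is a placeholder, and the real value should be taken from the v$log output for the group that needs clearing.

```sql
-- Run on the standby.
-- Stop managed recovery before clearing any log group.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
-- Clear the log group (4 is a placeholder group number).
ALTER DATABASE CLEAR LOGFILE GROUP 4;
-- Restart managed recovery.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
```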
If the ORLs are not cleared before switchover time, the SWITCHOVER command clears them and then starts the database, but the switchover takes longer to complete.
If the wait is long (more than 15 minutes), the session for the Oracle process is killed due to a timeout. If the switchover is terminated by a timeout, retry until the switchover is successful.
If the database is configured to use OMF for redo logfiles, or log_file_name_convert is set, the online redo logfiles are cleared automatically when the managed recovery process is started.
If the file locations are the same on the primary and standby, configure log_file_name_convert with the same value for the search string and the replacement string.
Example: log_file_name_convert='dummy','dummy'
To manage standby redo logfiles, refer to: Managing Standby Redo Logs
Checking the alert logfiles
1) In the primary alert logfile:
* Are any issues reported for redo transport?
* Are there any password file issues?
* Are there any TNS or connection issues?
2) On the standby database, make sure:
* There are no errors related to managed recovery
* Recovery is moving forward by applying the archive logs / redo logs
* There are no TNS or connection issues
* There are no I/O or corruption issues
SQL> select * from v$database_block_corruption;
SQL> select * from v$nonlogged_block;
Both queries should return no rows.
Check Archive log GAP & Redo Delay apply
You must configure the LOG_ARCHIVE_DEST_n and LOG_ARCHIVE_DEST_STATE_n parameters for each standby database so that when a switchover or failover occurs, all standby sites continue to receive redo data from the new primary database.
On the primary database, check STATUS and GAP_STATUS in v$archive_dest_status for the redo shipping destination (log_archive_dest_2 in this example):
STATUS should be VALID
GAP_STATUS should be NO GAP
If a different result is reported, the switchover should NOT be attempted.
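A query of the following form returns the two values discussed above. It is a sketch that assumes, as the text does, that log_archive_dest_2 is the redo shipping destination.

```sql
-- Run on the primary; dest_id = 2 corresponds to log_archive_dest_2.
select dest_id, status, gap_status
from v$archive_dest_status
where dest_id = 2;
```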
If a delay is configured, stop the managed recovery process and then start it again without the delay.
If the delay is not removed, the switchover will take longer.
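Restarting redo apply without the delay can be sketched as follows, using the NODELAY option of managed recovery. Run it on the standby.

```sql
-- Stop managed recovery, then restart it ignoring any configured DELAY.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY DISCONNECT FROM SESSION;
```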
Specifying a Time Delay for the Application of Archived Redo Log Files
Switchover:
If standby connections need to be preserved (not disconnected) during the switchover, set the STANDBY_DB_PRESERVE_STATES parameter to SESSION or ALL.
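Setting the parameter might look like the statement below. This sketch assumes the parameter is static, so that the change is written to the spfile and takes effect after the standby instance is restarted; verify this against the reference documentation for your release.

```sql
-- Run on the standby. SCOPE=SPFILE is an assumption (static parameter);
-- the instance must be restarted for the new value to be used.
ALTER SYSTEM SET STANDBY_DB_PRESERVE_STATES = SESSION SCOPE = SPFILE SID = '*';
```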
Verify the switchover
Execute the SQL below on the primary. If the verification is successful, a "Database altered" message is returned.
SQL> ALTER DATABASE SWITCHOVER TO <standby db_name> VERIFY;
In case of an error, fix the issue and then rerun the switchover verify command.
Example: "ORA-16475: succeeded with warnings, check alert log for more details". In this case, check the alert logfile and resolve all the errors/warnings.
Switchover steps
If the switchover verify is successful, execute the command to switch over the database.
1) Execute in the current primary
If step 1 is successful, proceed to step 2 and open the new primary database.
2) Execute in the new primary database
3) The old primary (now the new standby) should be mounted or opened, depending on the case.
If the standby is an Oracle Active Data Guard physical standby:
If the standby is NOT an Oracle Active Data Guard physical standby:
4) Start redo apply on the new standby
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
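The four steps above can be sketched as the following sequence. The name stby is a hypothetical placeholder for the standby named in the VERIFY command, and the choice between STARTUP and STARTUP MOUNT in step 3 depends on whether Active Data Guard is in use. Treat this as an outline under those assumptions, not as the exact commands from the original note.

```sql
-- 1) On the current primary ("stby" is a placeholder for the standby name).
ALTER DATABASE SWITCHOVER TO stby;

-- 2) On the new primary (the former standby).
ALTER DATABASE OPEN;

-- 3) On the old primary, which is now the new standby.
--    With Active Data Guard, open it (a standby opens read-only):
STARTUP
--    Without Active Data Guard, mount it instead:
-- STARTUP MOUNT

-- 4) On the new standby: start redo apply.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```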
Post Switchover
In primary:
Check that the archivelogs are being transferred to the standby and applied
SQL> select dest_id,error,status from v$archive_dest where dest_id=<your remote log_archive_dest_<n>>;
SQL> select max(sequence#),thread# from v$log_history group by thread#;
For example, if the remote destination is log_archive_dest_2, use dest_id=2.
In standby:
Verify the archivelog availability and the application of the archivelog files
SQL> select max(sequence#),thread# from v$archived_log group by thread#;
SQL> select name,role,instance,thread#,sequence#,action from gv$dataguard_process;
Additionally, the alert logfiles can be checked to confirm archivelog transfer and archivelog apply on the standby.
Tags: Switchover, 18c, log, database, standby, primary, 2485237.1, SQL, redo. Source: https://www.cnblogs.com/zylong-sys/p/12040628.html