
18c & 19c Physical Standby Switchover Best Practices using SQL*Plus (Doc ID 2485237.1)

Author: Internet

APPLIES TO:

Oracle Database - Enterprise Edition - Version 18.3.0.0.0 and later
Information in this document applies to any platform.

GOAL

This document describes the physical standby switchover steps for 18c and 19c.

SOLUTION

Prerequisites

Latest PSU/bundle patches

      Master Note for Database Proactive Patch Program (Doc ID 756671.1)

Setup/configuration verification

Optionally, you can use the queries below to check the redo transport and apply status.

On primary
To check the remote redo transport status: if there are any errors, V$ARCHIVE_DEST.ERROR shows the details.

SQL> col DEST_NAME for a20
SQL> col DESTINATION for a25
SQL> col ERROR for a15
SQL> col ALTERNATE for a20
SQL> set lines 1000
SQL> select DEST_NAME,DESTINATION,ERROR,ALTERNATE,TYPE,status,VALID_TYPE,VALID_ROLE from V$ARCHIVE_DEST where STATUS <>'INACTIVE';

To check the last archivelog created at the primary:

SQL> select thread#, max(sequence#) "Last Primary Seq Generated"  
from gv$archived_log val, gv$database vdb
where val.resetlogs_change# = vdb.resetlogs_change#
group by thread# order by 1;

 

On Standby:
Using the query below, check the last archivelog received from the primary database (if the database is RAC, the result is displayed for each thread).

Query output: last archive log sequence received by the standby:

SQL> select thread#, max(sequence#) "Last Standby Seq Received"  
 from gv$archived_log val, gv$database vdb
 where val.resetlogs_change# = vdb.resetlogs_change#
 group by thread# order by 1;

Query output: last archive log sequence applied by the standby:

SQL> select thread#, max(sequence#) "Last Standby Seq Applied"
 from gv$archived_log val, gv$database vdb
 where val.resetlogs_change# = vdb.resetlogs_change#
 and val.applied in ('YES','IN-MEMORY')
 group by thread# order by 1;   

Verify Initialization Parameters

The following parameters in particular must be configured correctly:

log_archive_config : should include the primary and standby database (if there are multiple standby databases, the details of all of them should be included)
fal_server         : remote server from which archivelogs can be fetched
db_unique_name     : unique name within this configuration
log_archive_dest_n : destination used to send archives to the remote database

In an ideal configuration with one primary and one standby, the primary should have a log_archive_dest_n setting that sends archives to the standby with the VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) clause, and the standby should have a similar setting.
Once the switchover completes, the new primary then already has the log_archive_dest_n configuration it needs to send archive logs/redo.

Ensure 'compatible' is set to the same value on the primary and the standby.
If the file locations differ between primary and standby, use db_file_name_convert and log_file_name_convert for datafiles and redo logfiles respectively.
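
As a quick way to review the parameters listed above, a query along these lines can be run on both the primary and the standby. This is only a sketch: it assumes log_archive_dest_2 is the remote destination (adjust the number to your own setup), and the values still have to be judged against your own configuration.

```sql
-- Show the Data Guard related initialization parameters discussed above.
-- Assumption: the remote destination is log_archive_dest_2.
SELECT name, value
  FROM v$parameter
 WHERE name IN ('log_archive_config', 'fal_server', 'db_unique_name',
                'log_archive_dest_2', 'log_archive_dest_state_2',
                'compatible', 'db_file_name_convert', 'log_file_name_convert')
 ORDER BY name;
```

Comparing the two outputs side by side makes it easier to spot a destination or convert parameter that is set on one site only.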

Refer: Set Primary Database Initialization Parameters

Understand and Test Fallback Options

Check:  A.4 Problems Switching Over to a Physical Standby Database

Pre-Switchover


Ensure the prerequisites are completely verified. Along with the prerequisites, follow the guidance below for a successful switchover.
These steps should be executed before the actual planned outage starts, to make sure there are no issues.

Verify that redo/archive log apply is healthy and there is no gap

Run the query below on the physical standby to check the last archive log sequence applied for each thread. This does not include the current sequence, because the SQL extracts its details from v$archived_log.

SQL> select    thread#, max(sequence#) "Last Standby Seq Applied"
              from     gv$archived_log val, gv$database vdb
              where    val.resetlogs_change# = vdb.resetlogs_change#
              and      val.applied in ('YES','IN-MEMORY')
              group by thread# order by 1;


Check the MRP process status (it should be running and applying the logs)

SQL> select * from gv$dataguard_process;


Commands to stop & start the managed recovery process:

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  DISCONNECT;


If, for any reason, standby database recovery (MRP) was started with a delay, or the standby is always maintained with a lag, the switchover will need extra time to apply the outstanding logs and bring the standby in sync.
Before the switchover, try to keep the archive log apply lag minimal, which reduces the total switchover time window.
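
One way to watch the lag described above is the V$DATAGUARD_STATS view on the standby. The query below is a minimal sketch, not part of the original note; what counts as an acceptable lag is an assumption you need to set for your own environment.

```sql
-- Run on the standby: transport and apply lag as computed by Data Guard.
-- VALUE is reported as an interval string (days hours:minutes:seconds).
SELECT name, value, unit, time_computed
  FROM v$dataguard_stats
 WHERE name IN ('transport lag', 'apply lag');
```

Running it a few times before the planned window shows whether the lag is shrinking or growing.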

Verify the apply delay configurations
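
To see whether an apply delay is configured on a destination, the DELAY_MINS column of V$ARCHIVE_DEST can be checked on the primary. This is a sketch under the assumption that the delay was set through the DELAY attribute of log_archive_dest_n.

```sql
-- Run on the primary: DELAY_MINS is non-zero when a DELAY attribute
-- is set on the corresponding log_archive_dest_n parameter.
SELECT dest_id, dest_name, delay_mins
  FROM v$archive_dest
 WHERE status <> 'INACTIVE';
```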

 

If the archive log gap is large, then

1) Monitor redo transport services to make sure there is no transport lag (see: Monitoring Redo Transport Services)

2) Standby can also be recovered using incremental backup taken from primary

Restoring and Recovering Files Over the Network

 

Check the datafiles & Tempfiles status

All datafiles are expected to be online on both primary and standby. If any file is offline or otherwise not in ONLINE status, restore and recover it so that the standby database files match the primary database files.

If files were taken offline and they need to be online after the switchover, bring them online:

SQL> SELECT NAME FROM V$DATAFILE WHERE STATUS='OFFLINE';
SQL> ALTER DATABASE DATAFILE 'datafile-name' ONLINE;
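
The OFFLINE check above only catches one status. A slightly broader sketch, assuming that ONLINE and SYSTEM are the only statuses you consider healthy, lists every datafile in any other state:

```sql
-- Run on both primary and standby and compare the results.
-- SYSTEM is the normal status for files of the SYSTEM tablespace.
SELECT file#, name, status
  FROM v$datafile
 WHERE status NOT IN ('ONLINE', 'SYSTEM');
```

If the query returns no rows on either site, no datafile needs attention before the switchover.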


For Tempfiles:

SQL> select tf.name filename, bytes, ts.name tablespace    from v$tempfile tf, v$tablespace ts where tf.ts#=ts.ts#;

If the listed tempfiles are sufficient for the application, no further action is needed.

If more tempfiles need to be added, check the primary as well and add the additional files.

Online and standby redo logfile configuration

 Online redologfile:

set lines 150
col member for a50
select a.thread#,a.group#,a.bytes,a.blocksize,b.type,a.status,b.member from v$log a,v$logfile b where a.group#=b.group#;


When the above query is run on the primary, a.status is one of INACTIVE, ACTIVE or CURRENT.
On the standby, the expected a.status is UNUSED, CLEARING or CLEARING_CURRENT. If the output shows a different status, the online redo logfiles need to be cleared manually.
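
To narrow the output down to the groups that may need manual clearing, a filter on the expected statuses can be used. This is a sketch based only on the status list given above:

```sql
-- Run on the standby: ORL groups whose status is outside the expected set.
SELECT thread#, group#, status
  FROM v$log
 WHERE status NOT IN ('UNUSED', 'CLEARING', 'CLEARING_CURRENT')
 ORDER BY thread#, group#;
```

Any group returned here is a candidate for the CLEAR LOGFILE GROUP command, once managed recovery has been stopped.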

For Standby redo logfile(SRL):

select s.thread#,s.group#,s.status,s.bytes,l.type,l.member from v$logfile l,v$standby_log s where s.group#=l.group#;


The standby redo logfile status should be UNASSIGNED or ACTIVE.

Command to clear ORL group:

SQL> ALTER DATABASE CLEAR LOGFILE GROUP <ORL GROUP# >;


If an ORL or SRL needs to be cleared on the standby, the managed recovery process has to be stopped first.


If the ORLs are not cleared by switchover time, the SWITCHOVER command clears the ORLs and then starts the database, but the switchover takes longer to complete.
If the wait is long (more than 15 minutes), the session of the Oracle process is killed due to timeout. If the switchover is terminated because of the timeout, retry until the switchover is successful.

If the database is configured to use OMF for redo logfiles, or log_file_name_convert is set, the online redo logfiles are cleared automatically when the managed recovery process is started.

Note: setting the log_file_name_convert parameter is recommended on both primary and standby, even when the SRL and ORL locations are the same on both sites.
If the file locations are the same on primary and standby, configure log_file_name_convert with the same value for the search string and the replacement string.
Example:  log_file_name_convert='dummy','dummy'
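
The example value above could be set as follows. This is a sketch that assumes the instance uses an spfile; because the parameter is changed in the spfile only, the new value takes effect after the next instance restart.

```sql
-- Assumption: the instance is started from an spfile.
-- 'dummy','dummy' is the placeholder pair from the example above.
ALTER SYSTEM SET log_file_name_convert='dummy','dummy' SCOPE=SPFILE;
```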

To manage standby redo logfiles, refer to: Managing Standby Redo Logs

Checking the alert logfiles

 

1) In the primary alert logfile, check that:
    * There are no issues reported for redo transport
    * There are no password file issues
    * There are no TNS or connection issues

2) On the standby database, make sure that:
    * There are no errors related to managed recovery
    * Recovery is moving forward by applying the archive logs / redo logs
    * There are no TNS or connection issues
    * There are no I/O issues or corruption issues
    select * from v$database_block_corruption;  -- should return no rows
    select * from v$nonlogged_block; -- should return no rows

 

Check Archive log GAP & Redo Delay apply

You must configure the LOG_ARCHIVE_DEST_n and LOG_ARCHIVE_DEST_STATE_n parameters for each standby database so that when a switchover or failover occurs, all standby sites continue to receive redo data from the new primary database.

Execute the command below on the primary database.

This assumes log_archive_dest_2 is configured for redo shipping.

SQL> SELECT STATUS, GAP_STATUS FROM V$ARCHIVE_DEST_STATUS WHERE DEST_ID = 2;


STATUS should be VALID
GAP_STATUS should be NO GAP

If a different result is reported, the switchover should NOT be attempted.

If a delay is configured, stop the managed recovery process and then start it again without the delay
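
As an additional cross-check from the standby side, the V$ARCHIVE_GAP view can be queried. This sketch is not part of the original note and complements the GAP_STATUS check on the primary rather than replacing it.

```sql
-- Run on the physical standby: one row per thread with a detected gap,
-- giving the range of missing archived log sequences.
SELECT thread#, low_sequence#, high_sequence#
  FROM v$archive_gap;
```

If the query returns no rows, the standby has not detected an archive gap.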

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY;


If the delay is not removed, the switchover will take longer.

Specifying a Time Delay for the Application of Archived Redo Log Files

 

Switchover:

If connections to the standby need to be preserved (not disconnected) during the switchover, set the parameter STANDBY_DB_PRESERVE_STATES to SESSION or ALL.

Verify the switchover

Execute the SQL below on the primary. If the verification is successful, a "Database altered" message is returned.

SQL> ALTER DATABASE SWITCHOVER TO <standby db_name> VERIFY;

In case of an error, fix the issue and then rerun the switchover verify command.

    Example: "ORA-16475: succeeded with warnings, check alert log for more details". In this case, check the alert logfile and resolve all the errors/warnings.

Switchover steps

If the switchover verify is successful, execute the command to switch over the database.

1) Execute in the current primary

SQL> ALTER DATABASE SWITCHOVER TO <standby db_name>;

If step 1 is successful, continue with step 2 and open the new primary database.

2) Execute in the new primary database

SQL> ALTER DATABASE OPEN;

3) The old primary (now the new standby) should be mounted or opened, depending on the case.

If the standby is an Oracle Active Data Guard physical standby:

SQL> STARTUP;

If the standby is NOT an Oracle Active Data Guard physical standby:

SQL> STARTUP MOUNT;

4) Start redo apply in the new standby

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
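
Once the four steps above have completed, the new roles can be confirmed from V$DATABASE. The query below is a minimal sketch; run it on both databases.

```sql
-- Run on both databases after the switchover.
-- The new primary should report DATABASE_ROLE = PRIMARY,
-- the new standby DATABASE_ROLE = PHYSICAL STANDBY.
SELECT db_unique_name, database_role, open_mode, switchover_status
  FROM v$database;
```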

 

Post Switchover

In primary:

Check that the archivelogs are being transferred to the standby and applied

SQL> alter system archive log current;
SQL> select dest_id,error,status from v$archive_dest where dest_id=<your remote log_archive_dest_<n>>;
SQL> select max(sequence#),thread# from v$log_history group by thread#;



If the remote archive destination is 2, i.e. log_archive_dest_2:

SQL> select max(sequence#)  from v$archived_log where applied='YES' and dest_id=2;


In standby:

Verify that the archivelogs are arriving and being applied

SQL> select max(sequence#),thread# from v$archived_log group by thread#;
SQL> select name,role,instance,thread#,sequence#,action from gv$dataguard_process;

 

Additionally, the alert logfiles can be checked to confirm archivelog transfer and archivelog apply on the standby.

Tags: Switchover,18c,log,database,standby,primary,2485237.1,SQL,redo
Source: https://www.cnblogs.com/zylong-sys/p/12040628.html