其他分享
首页 > 其他分享> > 拷贝分区统计信息

拷贝分区统计信息

作者:互联网

场景:
分区表按月分区,
例如:11月30日建12月的分区, 11月30日22:00收集统计信息时,12月分区数据为0,造成12月1日早晨业务上来的时候走错执行计划。
分区内数据随业务线性增长,无法提前铺底收集统计信息。 涉及改表分区列为谓词的sql较多,不适合逐条使用hint或者sql profile修复。
如上场景可以考虑拷贝分区统计信息。

(1)拷贝后新分区统计信息的最大值和最小值会按照规则自动设定。
(2)即使原分区的统计信息上有列的直方图,copy时不会拷贝直方图信息到新的分区。
(3)使用copy的方法不同于统计信息的导入导出,不会锁定统计信息,不影响后续收集。
(4)针对不同的表和分区可能有未知影响,需开发测试并确认后可实施。


介绍和测试如下:


Concept:

COPY_TABLE_STATS Procedure
This procedure copies the statistics of the source [sub] partition to the destination [sub] partition.
It also copies statistics of all dependent object such as columns and local indexes.
If the statistics forsource are not available then nothing is copied.
It can optionally scale the statistics (such as the number of blks, or number of rows) based on the given scale_factor.
【存储过程可以将一个分区的统计信息拷贝到另外一个分区,也包含相关依赖对象的统计信息,例如列统计信息和本地索引的统计信息。
可以按照比例扩容拷贝后统计信息的规模,这里不做研究。】


Usage Notes

This procedure updates the minimum and maximum values of destination partition for the first partitioning column as follows:
If the partitioning type is HASH the minimum and maximum values of the destination partition are same as that of the source partition.
【对于hash分区,copy后的统计信息分区列最大值和最小值与原分区相同】。

If the partitioning type is LIST then
--if the destination partition is a NOT DEFAULT paritition then
----the minimum value of the destination partition is set to the minimum value of the value list that describes the destination partition
----the maximum value of the destination partition is set to the maximum value of the value list that describes the destination partition
--alternatively, if the destination partition is a DEFAULT paritition, then
----the minimum value of the destination partition is set to the minimum value of the source partition
----the maximum value of the destination partition is set to the maximum value of the source partition
【对于list分区,
如果拷贝的目标分区不是default分区,分区列统计信息最小值设置为分区定义的最小值,最大值设置为分区定义的最大值。(测试不拷贝直方图)。
如果拷贝的目标分区 是default分区,分区列统计信息最小值设置为源分区的最小值, 最大值设置为源分区的最大值。】

If the partitioning type is RANGE then
--the minimum value of the destination partition is set to the high bound of previous partition
--the maximum value of the destination partition is set to the high bound of the destination partition
unless the high bound of the destination partition is MAXVALUE,
in which case the maximum value of the destination partition is set to the high bound of the previous partition
【对于range分区:
拷贝后分区列的最小值将设置为上一个分区的上限值。 最大值将设置为目标分区定义的最大值。】


Note that if the destination partition is the first partition then minimum values are equal to maximum values.
The procedure is is extended to handle the second and subsequent key columns with two additional rules (which do not apply to the first key column) :
If the source partition column's minimum value is equal to its maximum value,
and both are equal to the source partition's lower bound, and it has a single distinct value,
then the destination partition column's minimum and maximum values are both set to the destination partition's lower bound.

If the above condition does not apply,
the destination partition column's maximum value is
set to the greater of the destination partition upper bound and
the source partition column's maximum value, with the following exception:
if the destination partition is D and its preceding partition is D-1 and the key column to be adjusted is Cn,
the maximum value for Cn is set to the upper bound of D (ignoring the maximum value of the source partition column) provided
that the upper bounds of the previous key column Cn-1 are the same in partitions D and D-1.

 

DBMS_STATS.COPY_TABLE_STATS (
ownname VARCHAR2,
tabname VARCHAR2,
srcpartname VARCHAR2,
dstpartname VARCHAR2,
scale_factor VARCHAR2 DEFAULT 1,
flags NUMBER DEFAULT,
force BOOLEAN DEFAULT FALSE);

Parameter introduce:
ownname
Schema of the table of source and destination [sub] partitions

tabname
Table name of source and destination [sub] partitions

srcpartname
Source [sub] partition

dtspartname
Destination [sub] partition

scale_factor
Scale factor to scale nblks, nrows etc. in dstpartname

flags
For internal Oracle use (should be left as NULL)

force
When value of this argument is TRUE, copy statistics even if locked

 


如下为测试:
create table TEST_TAB1 (
start_date DATE,
store_id NUMBER
)
PARTITION BY RANGE (start_date)
(
PARTITION TEST_TAB_p1 VALUES LESS THAN (TO_DATE('1-2-2020', 'DD-MM-YYYY')),
PARTITION TEST_TAB_p2 VALUES LESS THAN (TO_DATE('1-3-2020', 'DD-MM-YYYY'))
);


insert into TEST_TAB1 values ( to_date('2020-01-01','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-02','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-03','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-04','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-05','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-06','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-07','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-08','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-09','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-10','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-11','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-12','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-13','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-14','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-15','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-16','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-17','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-18','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-19','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 values ( to_date('2020-01-20','yyyy-mm-dd hh24'), 2);
insert into TEST_TAB1 select * from TEST_TAB1;
insert into TEST_TAB1 select * from TEST_TAB1;
insert into TEST_TAB1 select * from TEST_TAB1;
insert into TEST_TAB1 select * from TEST_TAB1;
insert into TEST_TAB1 select * from TEST_TAB1;
insert into TEST_TAB1 select * from TEST_TAB1;
insert into TEST_TAB1 select * from TEST_TAB1;
/
/
/
commit;

 

exec dbms_stats.gather_table_stats('TEST','TEST_TAB1',CASCADE => TRUE ,degree => 1,NO_INVALIDATE => FALSE, method_opt=>'for columns start_date size 254');

exec dbms_stats.gather_table_stats('TEST','TEST_TAB1',CASCADE => TRUE ,degree => 1,NO_INVALIDATE => FALSE,method_opt=>'for all columns size repeat');

create index idx1 on TEST_TAB1(start_date) global;


set lines 200 pages 2000
col table_name for a15
col COLUMN_NAME for a15
col LOW_VALUE for a20
col HIGH_VALUE for a20
col HISTOGRAM for a10
col partition_name for a15
col Analyzed for a30
SELECT table_name,partition_name,num_rows,blocks,sample_size,to_char(last_analyzed,'yyyy-mm-dd hh24:mi:ss') Analyzed
FROM Dba_Tab_Partitions WHERE table_name='TEST_TAB1';

TABLE_NAME PARTITION_NAME NUM_ROWS BLOCKS SAMPLE_SIZE ANALYZED
--------------- --------------- ---------- ---------- ----------- ------------------------------
TEST_TAB1 TEST_TAB_P1 1310720 3022 1310720 2020-12-02 10:03:26
TEST_TAB1 TEST_TAB_P2 0 0 2020-12-02 10:03:26


SELECT table_name,partition_name,column_name,num_distinct,low_value,HIGH_VALUE,density,HISTOGRAM,last_analyzed
FROM user_part_col_statistics WHERE table_name='TEST_TAB1';

TABLE_NAME PARTITION_NAME COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE DENSITY HISTOGRAM LAST_ANALYZED
--------------- --------------- --------------- ------------ -------------------- -------------------- ---------- ---------- -------------------
TEST_TAB1 TEST_TAB_P1 START_DATE 20 78780101010101 78780114010101 3.7101E-07 FREQUENCY 2020-12-02 10:19:06
TEST_TAB1 TEST_TAB_P1 STORE_ID 1 C103 C103 1 NONE 2020-12-02 10:16:53
TEST_TAB1 TEST_TAB_P2 START_DATE 0 0 NONE 2020-12-02 10:19:06
TEST_TAB1 TEST_TAB_P2 STORE_ID 0 0 NONE 2020-12-02 10:16:53


DBMS_STATS.COPY_TABLE_STATS (
ownname VARCHAR2,
tabname VARCHAR2,
srcpartname VARCHAR2,
dstpartname VARCHAR2,
scale_factor VARCHAR2 DEFAULT 1,
flags NUMBER DEFAULT,
force BOOLEAN DEFAULT FALSE);


begin
DBMS_STATS.COPY_TABLE_STATS (
ownname =>'TEST',
tabname =>'TEST_TAB1',
srcpartname =>'TEST_TAB_P1',
dstpartname =>'TEST_TAB_P2');
end;
/


set lines 200 pages 2000
col table_name for a15
col COLUMN_NAME for a15
col LOW_VALUE for a20
col HIGH_VALUE for a20
col HISTOGRAM for a10
col partition_name for a15
col Analyzed for a30
SELECT table_name,partition_name,num_rows,blocks,sample_size,to_char(last_analyzed,'yyyy-mm-dd hh24:mi:ss') Analyzed
FROM Dba_Tab_Partitions WHERE table_name='TEST_TAB1';

TABLE_NAME PARTITION_NAME NUM_ROWS BLOCKS SAMPLE_SIZE ANALYZED
--------------- --------------- ---------- ---------- ----------- ------------------------------
TEST_TAB1 TEST_TAB_P1 1310720 3022 1310720 2020-12-02 10:19:06
TEST_TAB1 TEST_TAB_P2 1310720 3022 1310720 2020-12-02 10:19:06


SELECT table_name,partition_name,column_name,num_distinct,low_value,HIGH_VALUE,density,HISTOGRAM,last_analyzed
FROM user_part_col_statistics WHERE table_name='TEST_TAB1';

TABLE_NAME PARTITION_NAME COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE DENSITY HISTOGRAM LAST_ANALYZED
--------------- --------------- --------------- ------------ -------------------- -------------------- ---------- ---------- -------------------
TEST_TAB1 TEST_TAB_P1 START_DATE 20 78780101010101 78780114010101 3.7101E-07 FREQUENCY 2020-12-02 10:19:06
TEST_TAB1 TEST_TAB_P1 STORE_ID 1 C103 C103 1 NONE 2020-12-02 10:16:53
TEST_TAB1 TEST_TAB_P2 START_DATE 20 78780201010101 78780301010101 .05 NONE 2020-12-02 10:19:06
TEST_TAB1 TEST_TAB_P2 STORE_ID 1 C103 C103 1 NONE 2020-12-02 10:16:53


select UTL_RAW.CAST_TO_VARCHAR2('78780201010101') from dual;
select utl_raw.cast_to_number('78780201010101') from dual;
【raw为Oracle的内部类型,通过utl函数可以尝试将varchar类型和number类型转换后的raw类型反编译回去,
但日期和时间戳类型暂无相应工具,无法实际查看最大值最小值】

【复制的统计信息直方图是不复制的】


explain plan for select * from TEST_TAB1 where START_DATE = to_date('2020-02-11','yyyy-mm-dd');
拷贝前:

----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 1 | 12 | 2 (0)| 00:00:01 | 2 | 2 |
|* 2 | TABLE ACCESS FULL | TEST_TAB1 | 1 | 12 | 2 (0)| 00:00:01 | 2 | 2 |
----------------------------------------------------------------------------------------------------

拷贝后:
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 65536 | 704K| 4 (0)| 00:00:01 | | |
| 1 | TABLE ACCESS BY GLOBAL INDEX ROWID| TEST_TAB1 | 65536 | 704K| 4 (0)| 00:00:01 | 2 | 2 |
|* 2 | INDEX RANGE SCAN | IDX1 | 1 | | 3 (0)| 00:00:01 | | |
----------------------------------------------------------------------------------------------------------------

exec dbms_stats.gather_table_stats('TEST','TEST_TAB1',CASCADE => TRUE ,degree => 1,NO_INVALIDATE => FALSE,method_opt=>'for columns start_date size repeat');
【使用copy的方法不同于统计信息的导入导出,不会锁定统计信息,不影响后续收集。】

标签:insert,01,TAB1,partition,2020,TEST,拷贝,分区,统计
来源: https://www.cnblogs.com/nathon-wang/p/14112033.html