数据库
首页 > 数据库> > MaxCompute SQL中insert语句使用详情

MaxCompute SQL中insert语句使用详情

作者:互联网

MaxCompute insert语句使用

本文介绍使用INSERT OVERWRITE和INSERT INTO两种命令更新表数据,主要内容包括:

  1. insert into table ... values ...语句
  2. insert into/overwrite ... select ...语句
  3. Insert多路输出(MULTI INSERT)
  4. 输出到动态分区(DYNAMIC PARTITION)

INSERT VALUES语句

命令使用

INSERT INTO TABLE tablename 
[PARTITION (partcol1=val1, partcol2=val2,...)][(co1name1,colname2,...)] 
[VALUES (col1_value,col2_value,...),(col1_value,col2_value,...),...]

 

注意:通过VALUES写入DATETIME、TIMESTAMP类型数据时,需要在VALUES中指定类型名称,如下所示。 

insert into table srcp (p='abc') values (datetime'2017-11-11 00:00:00',timestamp'2017-11-11 00:00:00.123456789');

1. 特定分区内插入数据:

insert into table sale_detail
partition (sale_date='202003', region='hangzhou')
values ('unique', 'user1', 1000),
  ('adidas', 'user2', 2000),
  ('zara', 'user3', 1500),
  ('veromoda', 'user1', 2000),
  ('nike', 'user4', 1000),
  ('ur', 'user2', 1400),
  ('ochirly', 'user1', 2000),
  ('lily', 'user3', 3000),
  ('onemore', "user2", 1200);

查询插入数据结果如下:

shop_name	customer_id	total_price	sale_date	region
+----------+------------+------------+----------+-------+
unique	user1	1000.0	202003	hangzhou
adidas	user2	2000.0	202003	hangzhou
zara	user3	1500.0	202003	hangzhou
veromoda	user1	2000.0	202003	hangzhou
nike	user4	1000.0	202003	hangzhou
ur	user2	1400.0	202003	hangzhou
ochirly	user1	2000.0	202003	hangzhou
lily	user3	3000.0	202003	hangzhou
onemore	user2	1200.0	202003	hangzhou

2. 非特定分区插入数据:

insert into table sale_detail
partition (sale_date, region)
  (shop_name, customer_id, total_price, sale_date, region)
values
  ('unique', 'user1', 1000, '202001', 'shanghai'),
  ('adidas', 'user2', 2000, '202001', 'shanghai'),
  ('zara', 'user3', 1500, '202001', 'shanghai');

查询插入结果:

shop_name	customer_id	total_price	sale_date	region
+----------+------------+------------+----------+-------+
unique	user1	1000.0	202001	shanghai
adidas	user2	2000.0	202001	shanghai
zara	user3	1500.0	202001	shanghai

VALUES TABLE功能

VALUES TABLE并不仅限于在INSERT语句中使用,任何DML语句都可以使用。VALUES TABLE功能的使用方式如下:

使用示例:

--删除已存在的表srcp。
drop table if exists srcp;
--创建分区表srcp。
create table if not exists srcp (key string,value bigint) partitioned by (p string);
--为表srcp中插入数据。
insert into table srcp partition (p) select concat(a,b), length(a)+length(b),'20170102' from values ('d',4),('e',5),('f',6) t(a,b);
--查询表srcp。
select * from srcp where p='20170102';
+-----+------------+---+
| key | value      | p |
+-----+------------+---+
| d4  | 2          | 20170102 |
| e5  | 2          | 20170102 |
| f6  | 2          | 20170102 |
+-----+------------+---

INSERT into/overwrite语句

在MaxCompute SQL处理数据的过程中,INSERT OVERWRITE/INTO用于将计算的结果保存目标表中。

INSERT OVERWRITE|INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)] [(col1,col2 ...)]
select_statement
FROM from_statement;

使用示例:

1. 计算sale_detail表中不同地区的销售额存入表sale_detail_insert中:

-- 创建目标表sale_detail_insert
create table sale_detail_insert like sale_detail;

-- 给目标表增加分区
alter table sale_detail_insert add partition (
  sale_date='201912',
  region='guangzhou'
);

-- 从源表sale_detail中取出数据插入目标表sale_detail_insert
insert overwrite table sale_detail_insert
partition (sale_date='201912', region='guangzhou')
select shop_name, customer_id, total_price
from sale_detail where region='beijing';

-- 查询插入的数据
select * from sale_detail_insert where region='guangzhou' and sale_date='201912';

查询插入的结果如下:

shop_name	customer_id	total_price	sale_date	region
+----------+------------+------------+----------+-------+
unique	user1	1000.0	201912	guangzhou
adidas	user2	2000.0	201912	guangzhou
zara	user3	1500.0	201912	guangzhou
veromoda	user1	2000.0	201912	guangzhou
nike	user4	1000.0	201912	guangzhou
ur	user2	1400.0	201912	guangzhou
ochirly	user1	2000.0	201912	guangzhou
lily	user3	3000.0	201912	guangzhou
onemore	user2	1200.0	201912	guangzhou

2. 源表与目标表的对应关系依赖于在select子句中列的顺序,而不是表与表之间列名的对应关系。例如如下语句:

insert overwrite table sale_detail_insert partition (sale_date='2018', region='china')
  select customer_id, shop_name, total_price from sale_detail;

此时,会将sale_detail.customer_id的数据插入sale_detail_insert.shop_name,将sale_detail.shop_name的数据插入sale_detail_insert.customer_id

3. 向某个分区插入数据时,分区列不允许出现在select列表中。下面语句报错返回,sale_date,region为分区列,不允许出现在静态分区的insert语句中。

insert overwrite table sale_detail_insert partition (sale_date='2013', region='china')
   select shop_name, customer_id, total_price, sale_date, region  from sale_detail;

4. partition的值只能是常量,不可以出现表达式。以下为错误用法。

insert overwrite table sale_detail_insert partition (sale_date=datepart('2016-09-18 01:10:00', 'yyyy') , region='china')
   select shop_name, customer_id, total_price from sale_detail;

使用动态分区注意事项

如果您需要更新表数据到动态分区,请注意以下事项:

Insert多路输出(MULTI INSERT)

MaxCompute SQL支持在一个语句中将数据插入不同的目标表或者分区中实现多路输出。

命令格式

FROM from_statement
INSERT OVERWRITE | INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)]
select_statement1 [FROM from_statement]
[INSERT OVERWRITE | INTO TABLE tablename2 [PARTITION (partcol1=val3, partcol2=val4 ...)]
select_statement2 [FROM from_statement]]

使用示例:

1. 将表sale_detail的数据插入到sale_detail_multi里的202003年及202004年中hangzhou和beijing的销售记录中:

-- 创建表sale_detail_multi
create table sale_detail_multi like sale_detail;

-- 开启全表扫描,仅此session有效
set odps.sql.allow.fullscan=ture;

-- 将表sale_detail中的数据插入到表sale_detail_multi
from sale_detail
insert overwrite table sale_detail_multi partition (sale_date='202003', region='hangzhou')
select shop_name, customer_id, total_price where region='hangzhou'
insert overwrite table sale_detail_multi partition (sale_date='202004', region='beijing')
select shop_name, customer_id, total_price where region='beijing';

select * from sale_detail_multi;

查询插入的数据如下:

shop_name	customer_id	total_price	sale_date	region
+----------+------------+------------+----------+-------+
unique	user1	1000.0	202003	hangzhou
adidas	user2	2000.0	202003	hangzhou
zara	user3	1500.0	202003	hangzhou
veromoda	user1	2000.0	202003	hangzhou
nike	user4	1000.0	202003	hangzhou
ur	user2	1400.0	202003	hangzhou
ochirly	user1	2000.0	202003	hangzhou
lily	user3	3000.0	202003	hangzhou
onemore	user2	1200.0	202003	hangzhou
unique	user1	1000.0	202004	beijing
adidas	user2	2000.0	202004	beijing
zara	user3	1500.0	202004	beijing
veromoda	user1	2000.0	202004	beijing
nike	user4	1000.0	202004	beijing
ur	user2	1400.0	202004	beijing
ochirly	user1	2000.0	202004	beijing
lily	user3	3000.0	202004	beijing
onemore	user2	1200.0	202004	beijing

 2. 如果同一分区出现多次,如下语句,则报错返回:

from sale_detail
insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
select shop_name, customer_id, total_price
insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
select shop_name, customer_id, total_price;

3. 如果同一张表的不同分区,同时有insert overwrite和insert into操作,如下语句则报错返回:

from sale_detail
insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
select shop_name, customer_id, total_price
insert into table sale_detail_multi partition (sale_date='2011', region='china' )
select shop_name, customer_id, total_price;

输出到动态分区(DYNAMIC PARTITION)

背景信息

在使用INSERT OVERWRITE语句将数据插入到分区表时,MaxCompute提供了如下两种方式:

动态分区语法

INSERT OVERWRITE|INTO TABLE tablename PARTITION (partcol1, partcol2 ...) 
select_statement FROM from_statement;

使用示例:

示例1:将源表中的数据插入到目标表中。在SQL运行之前,无法得知会产生哪些分区。只有在语句运行结束后,才能通过region字段产生的值确定产生的分区。

-- 创建目标表total_revenues
create table total_revenues (revenue bigint) partitioned by (region string);

-- 将源表sale_detail中的数据插入到目标表total_revenues
-- MaxCompute 2.0 支持total_price的自动类型转换,这里MaxCompute 1.0不支持
insert overwrite table total_revenues partition (region)
select cast(total_price as bigint) as revenue, region from sale_detail where region='hangzhou';

select * from total_revenues;

插入数据查询结果如下:

revenue	region
+--------+-------+
1000	hangzhou
2000	hangzhou
1500	hangzhou
2000	hangzhou
1000	hangzhou
1400	hangzhou
2000	hangzhou
3000	hangzhou
1200	hangzhou

示例2:将源表中的数据插入到目标表中。多级分区,指定一级分区sale_date。

insert overwrite table sale_detail_dypart partition (sale_date='2013', region)
select shop_name,customer_id,total_price,region from sale_detail;

示例3:动态分区插入时,动态分区列必须在SELECT列表中,否则会执行失败,例如下面语句。

insert overwrite table sale_detail_dypart partition (sale_date='2013', region)
select shop_name,customer_id,total_price from sale_detail;

示例4:动态分区插入时,不能仅指定低级子分区,而动态插入高级分区,否则会执行失败,例如下面语句。

insert overwrite table sales partition (region='china', sale_date)
select shop_name,customer_id,total_price,sale_date from sale_detail;

标签:insert,分区,MaxCompute,detail,sale,hangzhou,SQL,region
来源: https://blog.csdn.net/yitian_z/article/details/105588335