2014-10-20
1

Does Hive support range partitioning?

I mean, does Hive support something like the following:

insert overwrite table table2 PARTITION (employeeId BETWEEN 2001 and 3000) 
select employeeName FROM emp10 where employeeId BETWEEN 2001 and 3000; 

where table2 and emp10 both have two columns:

employeeName & employeeId

When I run the query above, I get this error:

FAILED: ParseException line 1:56 mismatched input 'BETWEEN' expecting) near 'employeeId' in destination specification 

Answers

2

It is not possible. Here is a quote from the Hive documentation:

A table can have one or more partition columns and a separate data directory is created for each distinct value combination in the partition columns
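To illustrate the quote, here is a minimal sketch of what the partition specification does accept: a partition column paired with a single constant value. The partition column name employeeId_range and the table layout are assumptions for illustration, not something given in the question.

```sql
-- Assumes table2 was created with a hypothetical partition column:
--   CREATE TABLE table2 (employeeName STRING, employeeId INT)
--   PARTITIONED BY (employeeId_range INT);

-- Static partition: the spec names one constant value, never a predicate
-- such as BETWEEN. All selected rows go into the directory for value 2.
INSERT OVERWRITE TABLE table2 PARTITION (employeeId_range = 2)
SELECT employeeName, employeeId
FROM emp10
WHERE employeeId BETWEEN 2000 AND 2999;
```

The range filter lives in the WHERE clause of the SELECT, while the PARTITION clause only says which single partition receives the rows.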

1

No, it is not possible. I work around it with a separate computed column, something like:

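-- floor() keeps the partition value a whole number; plain / returns a double in Hive
insert overwrite table table2 PARTITION (employeeId_range) select employeeName, employeeId, cast(floor(employeeId/1000) as int) FROM emp10 where employeeId BETWEEN 2000 and 2999;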

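This ensures that all values in the range end up in the same partition. When querying the table, since we already know how the range is computed, we can write: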

select employeeName , employeeId FROM table2 where employeeId_range=2;

This way we can also parallelise queries over a given range. Hope it helps.
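For completeness, a rough sketch of the setup this workaround seems to need, plus a query that spans several ranges. The table definition is an assumption (the question does not show it), and whether the two SET lines are required depends on your Hive version and configuration.

```sql
-- Assumed layout: data columns plus one computed partition column.
CREATE TABLE table2 (employeeName STRING, employeeId INT)
PARTITIONED BY (employeeId_range INT);

-- A fully dynamic partition spec, PARTITION (employeeId_range) with no
-- constant value, needs dynamic partitioning in nonstrict mode.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- A range that crosses bucket boundaries: ids 2500..4200 live in
-- buckets 2, 3 and 4 (floor(id / 1000)). Filtering on the partition
-- column lets Hive skip every other partition directory.
SELECT employeeName, employeeId
FROM table2
WHERE employeeId_range BETWEEN 2 AND 4
  AND employeeId BETWEEN 2500 AND 4200;
```

The filter on employeeId_range does the partition pruning; the filter on employeeId then trims the rows at the edges of the first and last bucket.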