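Writing out a table by group in Hive
Can Hive write a query's output to different files (for example, separate .csv files) according to the GROUP BY clause in the query?
For example, given the toy dataset extract: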
LName FName Car_make Year
----- ----- -------- ----
Smith Audrey Ford 2000
Smith Audrey Ford 2013
Smith Audrey Toyota 1996
Miller Heath Ford 1995
Miller Heath Dodge 1990
Miller Heath Dodge 2010
I would like to write out the dataset by group, using:
INSERT OVERWRITE LOCAL DIRECTORY '/user/drwho/foodf'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
SELECT
LNAME,
FNAME,
CAR_MAKE,
AVG(YEAR) AS AVERG
FROM EXTRACT
GROUP BY LNAME, FNAME, CAR_MAKE
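The output should be written to the local directory as one file per group: SMITH_AUDREY_FORD.csv, SMITH_AUDREY_TOYOTA.csv, and so on. Is this possible in Hive? If not, what about Pig?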
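Edit:
I found that although this is not possible in Hive, we can follow @KS Nidhin's suggestion: write the query output to a local directory, then split it into per-group files with awk: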
$ cat extract.txt
Smith,Audrey,Ford,2000
Smith,Audrey,Ford,2013
Smith,Audrey,Toyota,1996
Miller,Heath,Ford,1995
Miller,Heath,Dodge,1990
Miller,Heath,Dodge,2010
$ awk -F "," '{ print > ($1"_"$2"_"$3".txt") }' extract.txt
$ ls -1
extract.txt
Miller_Heath_Dodge.txt
Miller_Heath_Ford.txt
Smith_Audrey_Ford.txt
Smith_Audrey_Toyota.txt
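The same idea can be applied to the aggregated output of the query rather than to the raw rows. The following is only a minimal sketch under some assumptions: the scratch path `hive_split_demo`, the part-file name `000000_0`, and the sample averaged rows are stand-ins made up for illustration (the averages are computed by hand from the toy dataset), not actual Hive output. It upper-cases the group key to get file names in the SMITH_AUDREY_FORD.csv style, and closes each file after writing so that a large number of groups does not run into awk's open-file limit.

```shell
#!/bin/sh
# Sketch: split comma-delimited rows into one .csv per (LName, FName, Car_make) group.
set -eu

# Hypothetical scratch location; point this at the real export directory instead.
workdir="${TMPDIR:-/tmp}/hive_split_demo"
rm -rf "$workdir"
mkdir -p "$workdir/export" "$workdir/out"

# Stand-in for a part file written by the INSERT OVERWRITE LOCAL DIRECTORY query.
cat > "$workdir/export/000000_0" <<'EOF'
Smith,Audrey,Ford,2006.5
Smith,Audrey,Toyota,1996.0
Miller,Heath,Ford,1995.0
Miller,Heath,Dodge,2000.0
EOF

cd "$workdir/out"

# Build the file name from the first three fields, append the row, then close
# the file so awk never holds more than one output file open at a time.
cat "$workdir"/export/* | awk -F ',' '{
    name = toupper($1 "_" $2 "_" $3) ".csv"
    print >> name
    close(name)
}'

ls -1
```

Because the rows are appended with `>>`, the output directory should be empty before each run; otherwise rows from an earlier run would be duplicated.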