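Hive: writing column headers to a local file?

I want to write the results of a query to a local file, along with the column names.

Does Hive support this?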

Insert overwrite local directory 'tmp/blah.blah' select * from table_name;

Also, a separate question: is StackOverflow the best place to get Hive help? @Nija has been very helpful, but I don't want to keep bothering them...

Source

2011-04-13 CMaury

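Hive does support writing to a local directory, and your syntax looks right for it.
See the docs on SELECTS and FILTERS for more information.

I don't think Hive has a way to write the column names to the file for the query you are running... I can't say for certain that it doesn't, but I don't know of one.

I think the only place better than SO for Hive questions is the mailing list.

Source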

2011-04-14 02:28:15 Nija

Try

set hive.cli.print.header=true;

Source

2011-11-26 20:28:33 iggy

Is there a way to make this the default permanently, instead of having to specify it in every hive shell and/or command invocation? – 2012-10-01 22:10:06


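I tried it; it prints the header to the console, not to the local file... – maverick 2012-11-09 21:42:04

@JD Yes, just put it in the '.hiverc' file in your home directory – wlk 2013-09-16 14:38:44

Sure. Put set hive.cli.print.header=true; in the .hiverc file in your home directory, or in any of the other hive user properties files.

Vague warning: be careful, because this has made my queries crash in the past (but I can't remember why).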
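
As a minimal sketch of the .hiverc approach, the shell function below appends the setting to the file only when it is not already present. The function name enable_hive_header and its optional path argument are made up for illustration; only the setting itself and the .hiverc location come from the answer above.

```shell
# Hypothetical helper: add the header setting to a Hive rc file once.
# Usage: enable_hive_header [path]   (defaults to ~/.hiverc)
enable_hive_header() (
  rc=${1:-"$HOME/.hiverc"}
  setting='set hive.cli.print.header=true;'
  # Create the file if it does not exist yet.
  touch "$rc"
  # Append only when no identical line is already there,
  # so calling the function repeatedly is harmless.
  grep -qxF "$setting" "$rc" || printf '%s\n' "$setting" >> "$rc"
)
```

Calling enable_hive_header with no argument edits ~/.hiverc, which the Hive CLI reads at startup, so new sessions should print headers without an explicit set.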

Source

2012-10-10 18:38:10

The property hive.cli.print.header=true does not work with the "insert overwrite local directory" command. It does work if we run hive -e 'select ..' > Out.tsv – Munesh 2016-07-30 00:52:18
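
The comment above suggests redirecting CLI output instead of using insert overwrite local directory. A minimal sketch of that idea as a shell function follows; the name export_with_header and its two arguments are hypothetical, and it assumes hive is on the PATH.

```shell
# Hypothetical helper: run a query through the Hive CLI and save the
# output, header line included, to a local file.
# Usage: export_with_header 'select * from table_name' out.tsv
export_with_header() {
  # $1 = HiveQL query, $2 = local output file
  # -S runs the CLI in silent mode; -e executes the quoted string.
  # The header ends up in the file because the CLI itself prints the
  # result set to stdout, which is what gets redirected.
  hive -S -e "set hive.cli.print.header=true; $1" > "$2"
}
```

The CLI separates columns with tabs, so the resulting file can be treated as TSV.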

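Indeed, @nija's answer is correct, at least as far as I know. There is no way to write the column names when doing an insert overwrite into [local] directory ... (whether or not you use local).

As for the crash described by @user1735861, there is a known bug in Hive 0.7.1 (fixed in 0.8.0): after running set hive.cli.print.header=true;, any HQL command or query that produces no output causes a NullPointerException. For example: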

 
$ hive -S 
hive> use default; 
hive> set hive.cli.print.header=true; 
hive> use default; 
Exception in thread "main" java.lang.NullPointerException 
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:222) 
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:287) 
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:517) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:616) 
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)

Whereas this is fine:

 
$ hive -S 
hive> set hive.cli.print.header=true; 
hive> select * from dual; 
c 
c 
hive>

Non-HQL commands are all fine, though (set, dfs, !, etc.)

More details here: https://issues.apache.org/jira/browse/HIVE-2334

Source

2012-10-26 15:04:34 Hercynium

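Not a great solution, but here is what I did:

-- Step 1 (HiveQL): materialize the query result as tab-delimited text files
create table test_dat
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t" STORED AS
INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
LOCATION '/tmp/test_dat' as select * from YOUR_TABLE;

# Step 2 (shell): a zero-row query with headers enabled prints only the column names
hive -e 'set hive.cli.print.header=true;select * from YOUR_TABLE limit 0' > /tmp/test_dat/header.txt

# Step 3 (shell): header first, then the data files (assumes they are readable under /tmp/test_dat)
cat /tmp/test_dat/header.txt /tmp/test_dat/000* > all.dat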
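
The same header trick can be combined with the insert overwrite local directory export from the question. The sketch below is an illustration under stated assumptions, not this answer's exact method: merge_with_header is a made-up name, the part files are assumed to match 000* as in the cat command above, and the export is assumed to use Hive's default Ctrl-A (\001) field delimiter, which is converted to tabs so the data lines match the tab-separated header.

```shell
# Hypothetical helper: prepend a header line to files exported with
# "insert overwrite local directory".
# Usage: merge_with_header table_name /local/export/dir merged.tsv
merge_with_header() (
  table=$1
  dir=$2
  out=$3
  {
    # A zero-row query with headers enabled prints only the column names.
    hive -S -e "set hive.cli.print.header=true; select * from ${table} limit 0"
    # Assumption: exported fields are separated by \001 (Hive's default).
    cat "$dir"/000* | tr '\001' '\t'
  } > "$out"
)
```

Values that themselves contain tabs would make the merged file ambiguous, so this is only suitable when the data is known to be tab-free.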

Source

2013-03-20 17:19:36 Jeremy

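This can be slow – OneSolitaryNoob 2014-10-01 21:33:09

I ran into this problem today and was able to get what I needed by doing a UNION ALL between the original query and a new dummy query that creates the header row. I added a sort column to each part, set to 0 for the header and 1 for the data, so that I could sort by that field and make sure the header row comes out first.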

create table new_table as 
select 
    field1, 
    field2, 
    field3 
from 
(
    select 
    0 as sort_col, --header row gets lowest number 
    'field1_name' as field1, 
    'field2_name' as field2, 
    'field3_name' as field3 
    from 
    some_small_table --table needs at least 1 row 
    limit 1 --only need 1 header row 
    union all 
    select 
    1 as sort_col, --original query goes here 
    field1, 
    field2, 
    field3 
    from 
    main_table 
) a 
order by 
    sort_col --make sure header row is first

It's a bit clunky, but at least you get what you need with a single query.

Hope this helps!

Source

2014-08-09 01:40:30 McLeodComputing

This will fail if the column values are booleans, arrays, etc. – amrk7 2016-09-12 14:30:51
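
One possible way around the type mismatch mentioned in the comment above is to cast the data columns to STRING so that both branches of the UNION ALL agree. The sketch below only builds the query text and mirrors the structure of the answer's query; build_header_union_query is a hypothetical helper, and the CAST is an assumption that should cover primitive types such as BOOLEAN, while complex types such as arrays would need their own conversion.

```shell
# Hypothetical helper: print a header + data UNION ALL query in which
# every data column is cast to STRING. It does not run the query.
# Usage: build_header_union_query main_table field1 field2 field3
build_header_union_query() (
  table=$1
  shift
  header=""
  data=""
  outer=""
  for col in "$@"; do
    header="${header}, '${col}' AS ${col}"             # literal column name
    data="${data}, CAST(${col} AS STRING) AS ${col}"   # value as text
    outer="${outer:+${outer}, }${col}"
  done
  # The header branch reads one row from the same table,
  # so the table needs at least one row.
  cat <<EOF
SELECT ${outer}
FROM (
  SELECT 0 AS sort_col${header} FROM ${table} LIMIT 1
  UNION ALL
  SELECT 1 AS sort_col${data} FROM ${table}
) a
ORDER BY sort_col
EOF
)
```

The printed text could then be passed to the CLI, for example with hive -e "$(build_header_union_query main_table field1 field2 field3)".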
