Spark CSV data validation fails for Hive date and timestamp data types

Hive table schema:

c_date     date           
c_timestamp    timestamp

It is a text table.

Hive table data:

hive> select * from all_datetime_types; 
OK 
0001-01-01 0001-01-01 00:00:00.000000001 
9999-12-31 9999-12-31 23:59:59.999999999

CSV produced by the Spark job:

c_date,c_timestamp 
0001-01-01 00:00:00.0,0001-01-01 00:00:00.0 
9999-12-31 00:00:00.0,9999-12-31 23:59:59.999

Issues:

00:00:00.0 is appended to the values of the date column.
The timestamp is truncated to millisecond precision.
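
Both symptoms would be consistent with the CSV writer formatting date and timestamp values through one shared `SimpleDateFormat` pattern. The sketch below reproduces that effect using only the JDK. The pattern `yyyy-MM-dd HH:mm:ss.S` is an assumption chosen because it matches the output shown above; it is not taken from the question, and whether a particular spark-csv version behaves this way should be checked against its source. The class and method names are hypothetical.

```java
import java.sql.Date;
import java.sql.Timestamp;
import java.text.SimpleDateFormat;

class SingleFormatDemo {
    // Assumption: one pattern is applied to both date and timestamp columns.
    static final String ASSUMED_PATTERN = "yyyy-MM-dd HH:mm:ss.S";

    // Formats any java.util.Date subclass (java.sql.Date, java.sql.Timestamp)
    // with the assumed pattern. SimpleDateFormat only knows milliseconds.
    static String formatWithAssumedPattern(java.util.Date value) {
        return new SimpleDateFormat(ASSUMED_PATTERN).format(value);
    }

    public static void main(String[] args) {
        Date date = Date.valueOf("9999-12-31");
        Timestamp ts = Timestamp.valueOf("9999-12-31 23:59:59.999999999");

        // Shared pattern: the date gains a time part, the timestamp loses digits.
        System.out.println(formatWithAssumedPattern(date)); // 9999-12-31 00:00:00.0
        System.out.println(formatWithAssumedPattern(ts));   // 9999-12-31 23:59:59.999

        // The types' own toString() keeps the date bare and the nanoseconds intact.
        System.out.println(date); // 9999-12-31
        System.out.println(ts);   // 9999-12-31 23:59:59.999999999
    }
}
```

If this is what happens, no single pattern can fix both columns at once, which is why converting the values to strings before writing looks attractive.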

Code used:

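import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.DataFrameWriter;
import org.apache.spark.sql.hive.HiveContext;

SparkConf conf = new SparkConf(true).setMaster("yarn-cluster").setAppName("SAMPLE_APP");
SparkContext sc = new SparkContext(conf);
HiveContext hc = new HiveContext(sc);
DataFrame df = hc.table("testdb.all_datetime_types"); // read the Hive table
df.printSchema();
DataFrameWriter writer = df.repartition(1).write(); // one partition, so one output file
writer.format("com.databricks.spark.csv").option("header", "true").save(outputHdfsFile); // outputHdfsFile is not shown in the question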

I am aware of the dateFormat option, but the date and timestamp columns can have different formats in Hive.

Can I simply convert all columns to strings?
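
One way to try this, as a minimal sketch rather than a confirmed fix: build one `CAST(... AS STRING)` expression per column and pass the result to `DataFrame.selectExpr` before writing. The helper below is plain Java so it can be run without a cluster. `CastExprBuilder` and `castAllToString` are hypothetical names, and the backtick quoting is assumed to be accepted by the Spark 1.6 expression parser.

```java
class CastExprBuilder {
    // Builds one "CAST(`col` AS STRING) AS `col`" expression per column name,
    // keeping the original column names so the CSV header is unchanged.
    // Assumption: column names do not themselves contain backticks.
    static String[] castAllToString(String[] columnNames) {
        String[] exprs = new String[columnNames.length];
        for (int i = 0; i < columnNames.length; i++) {
            String quoted = "`" + columnNames[i] + "`";
            exprs[i] = "CAST(" + quoted + " AS STRING) AS " + quoted;
        }
        return exprs;
    }

    public static void main(String[] args) {
        String[] exprs = castAllToString(new String[] {"c_date", "c_timestamp"});
        for (String expr : exprs) {
            System.out.println(expr);
        }
        // Hypothetical use with the DataFrame from the question (untested):
        // DataFrame asStrings = df.selectExpr(castAllToString(df.columns()));
        // asStrings.repartition(1).write()
        //     .format("com.databricks.spark.csv").option("header", "true").save(outputHdfsFile);
    }
}
```

With every column already a string, the writer has nothing left to format, so the date column should keep its bare `yyyy-MM-dd` form. Note that Spark SQL, as far as I know, keeps timestamps at microsecond precision, so a cast would not bring back the nanosecond digits visible in Hive.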

2017-03-23 dev ツ

You can use the timestampFormat option in Spark to specify your timestamp format.

spark.read.option("timestampFormat", "MM/dd/yyyy h:mm:ss a").csv("path")

2017-03-24 08:49:02 raam86

Thanks for the reply! But, as I mentioned at the end of my question, I cannot use a hard-coded timestamp format.

Is there a way to store nanosecond timestamps in Spark 1.6?

You mentioned that the date and timestamp have different formats; you can use both of them at the same time. – raam86
