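Processing dates with regular expressions in Apache Pig

Assuming the field time looks like 2013-01-01T00:00:00.000Z, piggybank.jar has been imported, and EXTRACT has been defined (DEFINE EXTRACT org.apache.pig.piggybank.evaluation.string.EXTRACT();), what is the best way to extract the fields year, month, day, hour, minute, and second? This is what I have so far: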
data = FOREACH data GENERATE FLATTEN(EXTRACT(time, '(\\d+)-(\\d+)-(\\d+)T(\\d+):(\\d+):(\\d+).(\\s+)'))
AS (
year: int,
month: int,
day: int,
hour: int,
minute: int,
second: int,
tail: chararray
);
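
One detail worth checking in the pattern above: the tail `.(\\s+)` expects whitespace after the seconds, while the sample value ends in `.000Z`, and the unescaped `.` matches any character. Below is a minimal sketch of a variant, not a definitive answer. It swaps in Pig's built-in REGEX_EXTRACT_ALL instead of the piggybank EXTRACT UDF, assumes every value has exactly the sample's shape, and uses the relation names `parts` and `fields` and the `_s` field names purely for illustration.

```pig
-- Sketch only: assumes `data` has a chararray field `time`
-- shaped like 2013-01-01T00:00:00.000Z.
-- REGEX_EXTRACT_ALL must match the whole string, so the pattern
-- also covers the fractional seconds and the trailing Z.
parts = FOREACH data GENERATE FLATTEN(
    REGEX_EXTRACT_ALL(time, '(\\d+)-(\\d+)-(\\d+)T(\\d+):(\\d+):(\\d+)\\.(\\d+)Z')
) AS (
    year_s: chararray,
    month_s: chararray,
    day_s: chararray,
    hour_s: chararray,
    minute_s: chararray,
    second_s: chararray,
    millis_s: chararray
);

-- Cast the captured strings to int in a separate step
-- (hypothetical relation name `fields`).
fields = FOREACH parts GENERATE
    (int) year_s   AS year,
    (int) month_s  AS month,
    (int) day_s    AS day,
    (int) hour_s   AS hour,
    (int) minute_s AS minute,
    (int) second_s AS second;
```

Rows whose time value does not match the pattern would be expected to come through with null fields, so a FILTER on `year IS NOT NULL` may be needed afterwards.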