2014-01-13

Pig schema and type exception

Here is a very simple demo that reproduces the problem on Pig 0.11.

=== testSchemaDATA ===

1_a 
2_b 
3_c 

The first script:

a = load 'testSchemaDATA' as (str:chararray); 
a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as num; 
a2 = foreach a1 generate (int)num as num; 
dump a2; 

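This script works, and dump returns the answer.

The second script fails (the only difference between the two scripts is the schema declaration on a1):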

a = load 'testSchemaDATA' as (str:chararray); 
a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num,char); 
a2 = foreach a1 generate (int)num as num; 
dump a2; 

It reports: ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1052: Cannot cast bytearray to int

I don't know how to explain this. Is this a bug?

Answers

This will work:

a = load 'testSchemaDATA' as (str:chararray); 
a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num:int,char:chararray); 
a2 = foreach a1 generate num as num; 
dump a2; 

It gives you the output:

(1) 
(2) 
(3) 

Likewise,

a = load 'testSchemaDATA' as (str:chararray); 
a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num:int,char:chararray); 
a2 = foreach a1 generate char as char; 
dump a2; 

gives you the output:

(a) 
(b) 
(c) 

The difference is that in this case the results of STRSPLIT are explicitly typed as int and chararray. If no type is given, a field defaults to bytearray.
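A possible variant that keeps the explicit (int) cast from the question is sketched below. It assumes that declaring both fields as chararray is handled the same way as the int/chararray declaration above, and it has not been verified on 0.11. The describe statements are there only to check which types each relation ends up with.

```pig
-- Sketch only: declare the flattened fields as chararray, then cast afterwards.
a = load 'testSchemaDATA' as (str:chararray);
a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num:chararray, char:chararray);
describe a1;   -- check that both fields carry a declared type rather than NULL
a2 = foreach a1 generate (int)num as num;   -- chararray to int is a supported cast
describe a2;
dump a2;
```

If describe a1 still reports NULL for either field, the cast in a2 can be expected to fail in the same way as the second script in the question.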

If you run a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as num; then describe a1 gives:

a1: {num: bytearray} 

If you run a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num,char); then describe a1 gives:

a1: {num: NULL,char: NULL} 

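It looks like the type ends up as NULL in this case, which would explain why the (int) cast is rejected. I'm not sure why this happens. It would be good if someone could explain.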