Cassandra Pig example fails when wide row input is enabled

Using Cassandra 1.1.6, Pig 0.10.0, and Hadoop 1.1.0, I can successfully run the pig_cassandra example script that ships with Cassandra in examples/pig.
But when I change
rows = LOAD 'cassandra://PigTest/SomeApp' USING CassandraStorage();
to:
rows = LOAD 'cassandra://PigTest/SomeApp?widerows=true' USING CassandraStorage();
I get the following error:
java.lang.IndexOutOfBoundsException: Index: 8, Size: 2
at java.util.ArrayList.rangeCheck(ArrayList.java:604)
at java.util.ArrayList.get(ArrayList.java:382)
at org.apache.pig.data.DefaultTuple.get(DefaultTuple.java:156)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:579)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:248)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:316)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:233)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPreCombinerLocalRearrange.getNext(POPreCombinerLocalRearrange.java:126)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:233)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:256)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:271)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
This happens in both local and MapReduce mode, and also when I set PIG_WIDEROW_INPUT=true.
The following Pig Latin script fails when the widerows=true parameter is present:
rows = LOAD 'cassandra://PigTest/SomeApp?widerows=true' USING CassandraStorage();
cols = FOREACH rows GENERATE flatten(columns.name);
DUMP cols;
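I can't seem to get past this: with wide row input enabled, the static columns in the SomeApp column family are not read. Other column families have the same problem.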
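One way to narrow this down is to compare the schema Pig holds for rows with the tuples the loader actually emits. The trace shows POProject asking DefaultTuple.get for field 8 of a tuple that holds only 2 fields, which would fit a schema that lists more fields than the wide row tuples carry. The sketch below is a diagnostic only, not a fix. It assumes the same PigTest keyspace and SomeApp column family as above; the alias first_rows and the limit of 5 are arbitrary choices, and its output has not been verified against this setup.

```pig
-- Load with wide row input enabled, exactly as in the failing script.
rows = LOAD 'cassandra://PigTest/SomeApp?widerows=true' USING CassandraStorage();

-- Print the schema Pig has for the relation, as reported by the loader.
DESCRIBE rows;

-- Dump a few whole tuples without projecting any field, so the
-- per-field lookup that fails in the trace should not be needed.
first_rows = LIMIT rows 5;
DUMP first_rows;
```

If DESCRIBE lists more fields than the dumped tuples contain, that would point at a mismatch between the schema CassandraStorage reports and the tuples it returns in wide row mode.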
Thanks Justen, this issue still seems to be present in Cassandra 1.1.8. – Rob
@Rob - There is another bug in CassandraStorage in 1.1.8 that will probably be fixed in 1.1.9/1.2.1: [CASSANDRA-5098](https://issues.apache.org/jira/browse/CASSANDRA-5098). – Justen