2014-02-24 18 views

Unit testing Mongo-Hadoop jobs with MRUnit

How do you test mongo-hadoop jobs?

What I have tried so far:

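    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.bson.BSONObject;
    import org.bson.BasicBSONObject;
    import org.junit.Before;
    import org.junit.Test;

    public class MapperTest {

        // Input is (any key, BSON document); output is (word, count).
        MapDriver<Object, BSONObject, Text, IntWritable> d;

        @Before
        public void setUp() throws IOException {
            WordMapper mapper = new WordMapper();
            d = MapDriver.newMapDriver(mapper);
        }

        @Test
        public void testMapper() throws IOException {
            BSONObject doc = new BasicBSONObject("sentence", "Two words");
            // MRUnit copies the input pair here; this call throws the exception shown below.
            d.withInput(new Text("anykey"), doc);

            d.withOutput(new Text("Two"), new IntWritable(1));
            d.withOutput(new Text("words"), new IntWritable(1));

            d.runTest();
        }
    }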

This produces the following output:

    java.lang.IllegalStateException: No applicable class implementing Serialization in conf at io.serializations for class org.bson.BasicBSONObject
        at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:67)
        at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:91)
        at org.apache.hadoop.mrunit.internal.io.Serialization.copyWithConf(Serialization.java:104)
        at org.apache.hadoop.mrunit.TestDriver.copy(TestDriver.java:608)
        at org.apache.hadoop.mrunit.TestDriver.copyPair(TestDriver.java:612)
        at org.apache.hadoop.mrunit.MapDriverBase.addInput(MapDriverBase.java:118)
        at org.apache.hadoop.mrunit.MapDriverBase.withInput(MapDriverBase.java:207)
        ...

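Try converting the BSONObject to a BSONWritable. – Archit

Were you able to solve your problem? I am currently facing the same issue. I suspect @Archit's comment is not going to work, because BSONObject is your input, not your output. –

No, unfortunately not. We decided to ditch Hadoop. :) – marko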

Answer

You need to set the serializers. Example:

    mapDriver.getConfiguration().setStrings("io.serializations", "org.apache.hadoop.io.serializer.WritableSerialization", MongoSerDe.class.getName());

MongoSerDe source: https://gist.github.com/lfrancke/01d1819a94f14da171e3

However, I get the error "org.bson.io.BasicOutputBuffer.pipe(Ljava/io/DataOutput;)I" when using this (MongoSerDe).
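
To illustrate why registering an extra serializer addresses the original exception, here is a toy model, using only the JDK, of how a Hadoop-style serializer lookup behaves: each registered serialization is asked whether it accepts the class, and a registry that only handles Writable types has nothing that accepts a plain BSON document. Every type name in it (ToyWritable, ToyText, ToyBsonDocument, ToySerialization and its implementations) is a hypothetical stand-in, not a Hadoop, MRUnit, or mongo-hadoop class, and the real lookup may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types come from Hadoop, MRUnit, or mongo-hadoop.
public class SerializationLookupSketch {

    /** Hypothetical stand-in for the Writable marker role. */
    interface ToyWritable {}

    /** Hypothetical stand-in for a Writable key type such as Text. */
    static class ToyText implements ToyWritable {}

    /** Hypothetical stand-in for a BSON document; note it is NOT a ToyWritable. */
    static class ToyBsonDocument {}

    /** A serialization declares which classes it is able to handle. */
    interface ToySerialization {
        boolean accept(Class<?> c);
        String name();
    }

    /** Accepts only classes that implement ToyWritable. */
    static class ToyWritableSerialization implements ToySerialization {
        public boolean accept(Class<?> c) { return ToyWritable.class.isAssignableFrom(c); }
        public String name() { return "writable"; }
    }

    /** Accepts only ToyBsonDocument and its subclasses. */
    static class ToyBsonSerialization implements ToySerialization {
        public boolean accept(Class<?> c) { return ToyBsonDocument.class.isAssignableFrom(c); }
        public String name() { return "bson"; }
    }

    /** Returns the name of the first registered serialization that accepts c, else throws. */
    static String lookup(List<ToySerialization> registered, Class<?> c) {
        for (ToySerialization s : registered) {
            if (s.accept(c)) {
                return s.name();
            }
        }
        throw new IllegalStateException("No applicable serialization for " + c);
    }

    public static void main(String[] args) {
        List<ToySerialization> registered = new ArrayList<>();
        registered.add(new ToyWritableSerialization());

        // A Writable-like key is handled by the default registration.
        System.out.println(lookup(registered, ToyText.class));

        // A BSON-like value is not, which mirrors the failure in the question.
        try {
            lookup(registered, ToyBsonDocument.class);
        } catch (IllegalStateException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }

        // Registering a BSON-aware serialization makes the same lookup succeed.
        registered.add(new ToyBsonSerialization());
        System.out.println(lookup(registered, ToyBsonDocument.class));
    }
}
```

In the real test, the counterpart of adding ToyBsonSerialization is the setStrings call on "io.serializations" in this answer, which lists a BSON-aware serialization alongside WritableSerialization.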