Hadoop: Implementing a nested for loop in MapReduce [Java]

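I want to implement a statistical formula that requires comparing each data point with every other data point. My dataset is a file of numeric values, for example 10.22, 15.77, 16.55, 9.88.

I need to go through this file like this: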

double sum = 0;
for (int i = 0; i < data.length; i++) {
    for (int j = 0; j < data.length; j++) {
        sum += data[i] + data[j];
    }
}

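Basically, as my map function receives each line, I need to run some instructions in the reducer against the rest of the file, just like a nested for loop. I have already tried the distributed cache and some form of ChainMapper, but to no avail. Any ideas on how I could do this would be much appreciated. Even an out-of-the-box approach would help.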

Asked 2014-04-30 by user3587335

Could you elaborate on your example? Please add a few rows and then show the computation for one data point. – Sudarshan

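As a simple example, 10.22 is the first point and 15.77 is the second. So i = 0 (10.22) is paired with j = 0 (10.22), then j = 1 (15.77), then j = 2 (16.55), then j = 3 (9.88). In other words, for each point in the dataset, all the other points in the dataset are traversed. – user3587335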

So for each line in the file you need to iterate over the entire file. Have I understood the problem correctly? – Sudarshan

You need to override the run method of the Reducer class.

@Override
public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKey()) {
        // Key and values for the outer loop (index i)
        Text currentKey = context.getCurrentKey();
        Iterable<VALUEIN> currentValues = context.getValues();
        if (context.nextKey()) {
            // The context now points at the next key group, i.e. the one
            // for index j of your second loop. Note that this advances the
            // same forward-only iteration that the outer while loop uses.
        }
    }
    cleanup(context);
}
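If the data is small enough to fit in memory, a simpler route may be to have the mapper emit every point under one constant key, so that a single reduce call sees all of them. The sketch below is plain Java rather than a real Reducer: `PairwiseSum` and `pairwiseSum` are names I made up, and the `Iterable<Double>` stands in for the values handed to reduce. The values are copied into a list first, because the nested loop needs to visit them more than once.

```java
import java.util.ArrayList;
import java.util.List;

public class PairwiseSum {
    // Hypothetical helper: mirrors what a reduce() body could do with its values.
    static double pairwiseSum(Iterable<Double> values) {
        // Buffer the values so they can be traversed repeatedly.
        List<Double> data = new ArrayList<>();
        for (double v : values) {
            data.add(v);
        }
        double sum = 0.0;
        for (int i = 0; i < data.size(); i++) {        // outer loop over i
            for (int j = 0; j < data.size(); j++) {    // inner loop over j
                sum += data.get(i) + data.get(j);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Sample points taken from the comments on the question.
        System.out.println(pairwiseSum(List.of(10.22, 15.77, 16.55, 9.88)));
    }
}
```

The obvious trade-off is that everything goes through one reducer and must fit in its heap, so this only suits modest data sizes.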

Alternatively, if you have no reducer, you can do the same thing in the mapper by overriding:

@Override
public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKeyValue()) {
        // Calling context.nextKeyValue() again here advances to the next
        // key/value pair, which is the record your second loop would look at.
    }
    cleanup(context);
}
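Another map-side option, closer to the distributed cache idea from the question, is to load the whole file into memory once and then run the inner loop for every incoming record. The following is only a plain-Java sketch of that logic, with no Hadoop classes: `CachedPairMapper` and `partialSum` are hypothetical names, the constructor plays the role of setup(), and `partialSum` plays the role of map().

```java
import java.util.List;

public class CachedPairMapper {
    // Stands in for the full dataset that setup() would read from the cached file.
    private final double[] cached;

    CachedPairMapper(List<Double> allPoints) {
        cached = allPoints.stream().mapToDouble(Double::doubleValue).toArray();
    }

    // Hypothetical equivalent of map(): called once per input record (outer index i).
    double partialSum(double point) {
        double sum = 0.0;
        for (double other : cached) {   // inner loop over j
            sum += point + other;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Double> points = List.of(10.22, 15.77, 16.55, 9.88);
        CachedPairMapper mapper = new CachedPairMapper(points);
        double total = 0.0;
        for (double p : points) {
            total += mapper.partialSum(p);  // a reducer would add up these partial sums
        }
        System.out.println(total);
    }
}
```

Each record produces one partial result, so the outer loop is spread across the mappers and only the final addition needs a reducer.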

Let me know if this helps.
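One side note on the sample loop in the question: for that particular sum no nested loop is needed, because the sum over all i and j of (x_i + x_j) equals 2 * n * (sum of all x_i), which a single pass can compute. This applies only to that sample expression, and I am assuming the real statistic is more involved. A quick check in plain Java (the class and method names are mine):

```java
public class SumIdentity {
    // Direct translation of the nested loop from the question.
    static double nested(double[] x) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < x.length; j++) {
                sum += x[i] + x[j];
            }
        }
        return sum;
    }

    // Same quantity from one pass: 2 * n * (sum of all elements).
    static double singlePass(double[] x) {
        double total = 0.0;
        for (double v : x) {
            total += v;
        }
        return 2.0 * x.length * total;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        System.out.println(nested(x) + " " + singlePass(x));
    }
}
```

If your real formula can be rearranged in a similar way, a plain count and sum in one MapReduce pass would be enough.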

Answered 2014-04-30 08:17:23

Thanks. Let me test it when I wake up in the morning. – user3587335
