2013-10-23

Submitting a job with the Hadoop HDInsight .NET SDK API

I am using the HDInsight .NET Hadoop API to submit a MapReduce job from an ASP.NET application.

using Microsoft.Hadoop.Mapreduce;

var hadoop = Hadoop.Connect();

var result = hadoop.MapReduceJob.ExecuteJob();

// Also tried this, but it throws the same exception

// var result = hadoop.MapReduceJob.ExecuteJob(config);

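The ExecuteJob() call fails and throws an exception at runtime. Has anyone been able to run this call successfully? Is it possible to customize the Map() function by adding more input parameters or objects (beyond the MapperBase class that Microsoft provides)? Can the logic in the Mapper and Reducer methods access a cache or a database?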

Answer


A sample that submits a MapReduce job using the HDInsight .NET SDK is posted here:

http://www.windowsazure.com/en-us/manage/services/hdinsight/submit-hadoop-jobs-programmatically/#mapreduce-sdk

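// Namespaces used by this snippet (Microsoft.Hadoop.Client comes from the HDInsight .NET SDK)
using System;
using System.Linq;
using System.Security.Cryptography.X509Certificates;
using Microsoft.Hadoop.Client;

// Define the MapReduce job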
MapReduceJobCreateParameters mrJobDefinition = new MapReduceJobCreateParameters() 
{ 
    JarFile = "wasb:///example/jars/hadoop-examples.jar", 
    ClassName = "wordcount" 
}; 

mrJobDefinition.Arguments.Add("wasb:///example/data/gutenberg/davinci.txt"); 
mrJobDefinition.Arguments.Add("wasb:///example/data/WordCountOutput"); 

// Get the certificate object from certificate store using the friendly name to identify it 
X509Store store = new X509Store(); 
store.Open(OpenFlags.ReadOnly); 
X509Certificate2 cert = store.Certificates.Cast<X509Certificate2>().First(item => item.FriendlyName == certfriendlyname);
JobSubmissionCertificateCredential creds = new JobSubmissionCertificateCredential(new Guid(subscriptionID), cert, clusterName); 

// Create a hadoop client to connect to HDInsight 
var jobClient = JobSubmissionClientFactory.Connect(creds); 

// Run the MapReduce job 
JobCreationResults mrJobResults = jobClient.CreateMapReduceJob(mrJobDefinition); 

// Wait for the job to complete 
WaitForJobCompletion(mrJobResults, jobClient); 
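
WaitForJobCompletion is not defined in the snippet above. It appears to be a helper written for the sample rather than an SDK method. A minimal sketch of such a polling helper, using only standard .NET, might look like the following. The names JobPolling and WaitUntil are hypothetical, and the SDK members mentioned in the usage comment (GetJob, StatusCode, JobStatusCode) are assumptions that should be checked against the installed SDK version.

```csharp
using System;
using System.Threading;

static class JobPolling
{
    // Calls isFinished() repeatedly until it returns true or the timeout elapses.
    // Returns true if isFinished() reported completion, false if the wait timed out.
    public static bool WaitUntil(Func<bool> isFinished, TimeSpan pollInterval, TimeSpan timeout)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (!isFinished())
        {
            if (DateTime.UtcNow >= deadline)
            {
                return false;
            }
            Thread.Sleep(pollInterval);
        }
        return true;
    }
}

// Hypothetical wiring to the job client from the snippet above
// (GetJob, StatusCode and JobStatusCode are assumed, not verified here):
//
// bool finished = JobPolling.WaitUntil(
//     () =>
//     {
//         var status = jobClient.GetJob(mrJobResults.JobId).StatusCode;
//         return status == JobStatusCode.Completed || status == JobStatusCode.Failed;
//     },
//     TimeSpan.FromSeconds(10),
//     TimeSpan.FromMinutes(30));
```

Taking the status check as a delegate keeps the loop independent of the SDK, and the timeout keeps a stuck job from blocking the ASP.NET request indefinitely.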

Please state the .NET namespaces for the APIs you are using: MapReduceJobCreateParameters, JobSubmissionClientFactory, WaitForJobCompletion – Rajesh

It is documented here: –