2016-04-13 47 views
0

How can I programmatically submit a Spark application in yarn-client mode?

I have a simple Spark job that replaces spaces with commas in a given input file.

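When I run this job locally (from the IDE, or by executing the built jar), it completes successfully. When the master is set to "yarn-client", the job hangs for a long time and then throws the exception below.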

Our use case requires submitting the job programmatically, rather than building a jar and submitting it through spark-submit.

Spark version: 1.6.1, Hadoop version: 2.7.1

My POM includes all the Spark, YARN, and Hadoop dependencies.

The job fails with the following exception:

java.net.ConnectException: Call From spark.node123.com/192.168.2.1 to 0.0.0.0:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused 
    at sun.reflect.GeneratedConstructorAccessor13.newInstance(Unknown Source) 
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792) 
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732) 
    at org.apache.hadoop.ipc.Client.call(Client.java:1480) 
    at org.apache.hadoop.ipc.Client.call(Client.java:1407) 
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) 
    at com.sun.proxy.$Proxy10.getClusterMetrics(Unknown Source) 
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:152) 
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:498) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) 
    at com.sun.proxy.$Proxy11.getClusterMetrics(Unknown Source) 
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:246) 
    at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:129) 
    at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:129) 
    at org.apache.spark.Logging$class.logInfo(Logging.scala:58) 
    at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:62) 
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:128) 
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57) 
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144) 
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:530) 
    at tardis.platform.TardisContext$.apply(TardisContext.scala:20) 
    at tardis.common.plugins.Heartbeat.isAbleTocreateContext(Heartbeat.scala:45) 
    at tardis.common.plugins.Heartbeat.performAction(Heartbeat.scala:33) 
    at tardis.core.scheduler.jobs.PluginExecutorJob.execute(PluginExecutorJob.scala:40) 
    at org.quartz.core.JobRunShell.run(JobRunShell.java:202) 
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573) 
Caused by: java.net.ConnectException: Connection refused 
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) 
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) 
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) 
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531) 
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495) 
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609) 
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707) 
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) 
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529) 
    at org.apache.hadoop.ipc.Client.call(Client.java:1446) 
    ... 25 more 
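
The target `0.0.0.0:8032` in the trace is Hadoop's default value for `yarn.resourcemanager.address`, which usually suggests that the client never loaded a `yarn-site.xml`. As a rough diagnostic, the sketch below reads that property from a `yarn-site.xml` file using only the Java standard library. The class name `YarnSiteCheck` and the use of the `HADOOP_CONF_DIR` environment variable to locate the file are assumptions for illustration; in an embedded application the directory would normally also have to be on the classpath for Hadoop to pick it up.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Spark or Hadoop.
class YarnSiteCheck {

    /** Returns the value of the named property from a Hadoop-style XML config, if present. */
    static Optional<String> readProperty(InputStream xml, String key) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                NodeList names = prop.getElementsByTagName("name");
                NodeList values = prop.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0
                        && key.equals(names.item(0).getTextContent().trim())) {
                    return Optional.of(values.item(0).getTextContent().trim());
                }
            }
            return Optional.empty();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) throws Exception {
        String confDir = System.getenv("HADOOP_CONF_DIR");
        if (confDir == null) {
            System.out.println("HADOOP_CONF_DIR is not set");
            return;
        }
        File site = new File(confDir, "yarn-site.xml");
        if (!site.isFile()) {
            System.out.println("No yarn-site.xml found in " + confDir);
            return;
        }
        try (InputStream in = new FileInputStream(site)) {
            System.out.println("yarn.resourcemanager.address = "
                    + readProperty(in, "yarn.resourcemanager.address").orElse("(not set)"));
        }
    }
}
```

If the property is missing and `yarn.resourcemanager.hostname` is not set either, the client would be expected to fall back to `0.0.0.0:8032`, which matches the address in the exception.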

Answers

1

I had to add the Hadoop and YARN configuration to submit the application successfully in yarn-client mode.
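
The answer does not show which configuration was added. One possible approach, sketched here as an assumption rather than as what the author did, is to pass the Hadoop settings through Spark's `spark.hadoop.*` prefix, which Spark copies into the Hadoop configuration it builds. The class name `YarnClientSettings` and the host names are hypothetical, and the block only builds the key/value pairs so that it runs without Spark on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that collects the settings a yarn-client submission might need.
class YarnClientSettings {

    /**
     * @param rmHost    host name of the YARN ResourceManager (placeholder value in main)
     * @param defaultFs URI of the default file system, e.g. an HDFS NameNode (placeholder value in main)
     */
    static Map<String, String> forCluster(String rmHost, String defaultFs) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("spark.master", "yarn-client");
        // Keys prefixed with "spark.hadoop." are forwarded to the Hadoop configuration.
        settings.put("spark.hadoop.yarn.resourcemanager.hostname", rmHost);
        // 8032 is the default ResourceManager client port; adjust if the cluster differs.
        settings.put("spark.hadoop.yarn.resourcemanager.address", rmHost + ":8032");
        settings.put("spark.hadoop.fs.defaultFS", defaultFs);
        return settings;
    }

    public static void main(String[] args) {
        Map<String, String> settings = forCluster("rm.example.com", "hdfs://nn.example.com:8020");
        settings.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

With Spark available, each entry could be applied through `SparkConf.set(key, value)` before the context is created. Putting the cluster's `yarn-site.xml` and `core-site.xml` on the application's classpath is an alternative that avoids hard-coding addresses.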

0

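You cannot easily submit your Spark job remotely in client mode, because your machine has to run the driver itself, and the driver needs many connections to and from the cluster. If you insist on this approach, you must configure the firewall to allow the required ports to reach the cluster. Using cluster mode, or submitting from the master node, is less painful.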
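
To help tell a firewall problem apart from a configuration problem, the TCP connection can be tested directly from the submitting machine. The sketch below is a minimal probe using only the Java standard library; the class name `PortProbe`, the default host, and the 2-second timeout are assumptions for illustration. As a rule of thumb, a blocked port tends to time out, while "Connection refused" more often means nothing is listening at that address.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Spark or Hadoop.
class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are placeholders; pass the ResourceManager host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8032;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

Running it against the ResourceManager's real host name and against `0.0.0.0` would show whether the address, rather than the network, is the issue.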

+0

My use case is to submit the job programmatically (i.e. without using the spark-submit command). By the way, the problem is not the firewall. – sandyyyy

+0

@sandyyyy You may want to check out https://github.com/kakao/cuesheet – iboss
