2016-01-26 169 views
0

I created a Spark job in IntelliJ and want it to be loaded and run by spark-jobserver. To do that I followed the steps in this link: http://github.com/ooyala/spark-jobserver. My Spark version is 1.4.0. When the job runs inside spark-jobserver it fails with "java.lang.NoClassDefFoundError: org/apache/spark/sql/SQLContext".
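For context, the upload-and-run steps from the jobserver README look roughly like this (host/port are the defaults; the jar path and app name here are assumptions, and the class path matches the class shown in the stack trace below):

```shell
# Upload the assembled job jar under an arbitrary app name ("smartapp" is made up)
curl --data-binary @target/scala-2.10/smartapp.jar localhost:8090/jars/smartapp

# Start the job, passing the input.string config param that validate() checks
curl -d "input.string = a" \
  'localhost:8090/jobs?appName=smartapp&classPath=sql.hiveSparkRest'
```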

Here is the Scala code in my project:

import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.{SparkConf, SparkContext}

// spark-jobserver
import com.typesafe.config.{Config, ConfigFactory}
import scala.util.Try
import spark.jobserver.{SparkJob, SparkJobInvalid, SparkJobValid, SparkJobValidation}

class hiveSparkRest extends SparkJob {
  var idCard: String = ""

  // Local entry point for running the job outside the job server
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[4]", "SmartApp")
    val config = ConfigFactory.parseString("")

    val results = runJob(sc, config)
    println("Result is " + results)
  }

  override def validate(sc: SparkContext, config: Config): SparkJobValidation = {
    Try(config.getString("input.string"))
      .map(x => SparkJobValid)
      .getOrElse(SparkJobInvalid("No input.string config param"))
  }

  override def runJob(sc: SparkContext, config: Config): Any = {
    idCard = config.getString("input.string")
    enterTimesMax(sc)
  }

  def enterTimesMax(sc: SparkContext): Unit = {
    // The HiveContext is created here, so callers only need a SparkContext
    val hiveContext = new HiveContext(sc)
    hiveContext.sql("use default")

    val sqlUrl =
      "select max(num) from (select idcard, count(1) as num from passenger group by idcard) as t"

    val idCardArray = hiveContext.sql(sqlUrl).collect()
  }
}

However, when I execute it I get curl: (52) Empty reply from server, and spark-jobserver responds with this error:

> job-server[ERROR] Uncaught error from thread [JobServer-akka.actor.default-dispatcher-12] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[JobServer] 
job-server[ERROR] java.lang.NoClassDefFoundError: org/apache/spark/sql/SQLContext 
job-server[ERROR] at java.lang.ClassLoader.defineClass1(Native Method) 
job-server[ERROR] at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631) 
job-server[ERROR] at java.lang.ClassLoader.defineClass(ClassLoader.java:615) 
job-server[ERROR] at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141) 
job-server[ERROR] at java.net.URLClassLoader.defineClass(URLClassLoader.java:283) 
job-server[ERROR] at java.net.URLClassLoader.access$000(URLClassLoader.java:58) 
job-server[ERROR] at java.net.URLClassLoader$1.run(URLClassLoader.java:197) 
job-server[ERROR] at java.security.AccessController.doPrivileged(Native Method) 
job-server[ERROR] at java.net.URLClassLoader.findClass(URLClassLoader.java:190) 
job-server[ERROR] at java.lang.ClassLoader.loadClass(ClassLoader.java:306) 
job-server[ERROR] at java.lang.ClassLoader.loadClass(ClassLoader.java:247) 
job-server[ERROR] at sql.hiveSparkRest.shadePassenger(hiveSparkRest.scala:62) 
job-server[ERROR] at sql.hiveSparkRest.runJob(hiveSparkRest.scala:56) 
job-server[ERROR] at spark.jobserver.JobManagerActor$$anonfun$spark$jobserver$JobManagerActor$$getJobFuture$4.apply(JobManagerActor.scala:222) 
job-server[ERROR] at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) 
job-server[ERROR] at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) 
job-server[ERROR] at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:42) 
job-server[ERROR] at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) 
job-server[ERROR] at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) 
job-server[ERROR] at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) 
job-server[ERROR] at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) 
job-server[ERROR] at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) 
job-server[ERROR] Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.SQLContext 
job-server[ERROR] at java.net.URLClassLoader$1.run(URLClassLoader.java:202) 
job-server[ERROR] at java.security.AccessController.doPrivileged(Native Method) 
job-server[ERROR] at java.net.URLClassLoader.findClass(URLClassLoader.java:190) 
job-server[ERROR] at java.lang.ClassLoader.loadClass(ClassLoader.java:306) 
job-server[ERROR] at java.lang.ClassLoader.loadClass(ClassLoader.java:247) 
job-server[ERROR] ... 22 more 
job-server ... finished with exit code 255 

It seems that HiveContext is provided by the Spark assembly jar spark-assembly-1.4.0-hadoop1.0.4.jar.

+0

Are you sure you have added the [spark-sql](http://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.10/1.4.0) JAR as a dependency? –

+0

@FelipeAlmeida Thanks for the help! Which spark-sql JAR do you mean? I thought the class SQLContext was provided by spark-assembly-1.4.0-hadoop1.0.4.jar on the Spark classpath. – Robin

+2

You can find the jar at the link I gave you. You need to add it to your dependency file (pom.xml, build.sbt, or equivalent). –
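In build.sbt, the dependencies this comment suggests would look roughly like the following (versions match the Spark 1.4.0 setup described in the question; the "provided" scope assumes the job server's Spark distribution supplies these jars at runtime, so adjust to your build):

```scala
// build.sbt -- sketch of the spark-sql / spark-hive dependencies
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.4.0" % "provided",
  "org.apache.spark" %% "spark-sql"  % "1.4.0" % "provided",
  "org.apache.spark" %% "spark-hive" % "1.4.0" % "provided"
)
```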

Answer

0

I don't think the ooyala repo is the main one anymore. In the maintained repo, the link below shows a test job that uses HiveContext. For the SparkHiveJob trait you need the job-server-extras jar.

https://github.com/spark-jobserver/spark-jobserver/blob/master/job-server-extras/src/spark.jobserver/HiveTestJob.scala
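Based on the HiveTestJob linked above, a job built on the extras trait would look roughly like this sketch (the trait signatures are taken from that file and may differ in your jobserver version; the trait hands you the HiveContext, so the job never constructs an SQLContext or HiveContext itself; the table and query come from the question):

```scala
import com.typesafe.config.Config
import org.apache.spark.sql.hive.HiveContext
import spark.jobserver.{SparkHiveJob, SparkJobValid, SparkJobValidation}

// Sketch only: assumes the job-server-extras jar is on the classpath
object PassengerMaxJob extends SparkHiveJob {
  override def validate(hive: HiveContext, config: Config): SparkJobValidation =
    SparkJobValid

  override def runJob(hive: HiveContext, config: Config): Any = {
    hive.sql("use default")
    hive.sql(
      "select max(num) from (select idcard, count(1) as num from passenger group by idcard) as t"
    ).collect()
  }
}
```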

+0

Thanks for the help, Norul. I will try it. – Robin

+0

The answer seems to be right! But when I run my app (my Scala class extends SparkSqlJob), I get the following response: { "status": "ERROR", "result": "Invalid job type for this context" } Can you help me solve this? I have posted a new question to track it, thanks! http://stackoverflow.com/questions/35032545/the-error-invalid-job-type-for-this-context-in-spark-sql-job-with-spark-jobse – Robin
