2015-10-30

I am trying to fetch records from Cassandra and create an RDD with Spark 1.5.1, but the job fails with ClassNotFoundException: com.datastax.spark.connector.japi.rdd.CassandraTableScanJavaRDD.

JavaRDD<Encounters> rdd = javaFunctions(ctx).cassandraTable("keyspace1", "employee", mapRowTo(Encounters.class)); 

When I submit the job with spark-submit on Spark 1.5.1, I get:

Exception in thread "main" java.lang.NoClassDefFoundError: com/datastax/spark/connector/japi/rdd/CassandraTableScanJavaRDD 
at java.lang.Class.forName0(Native Method) 
at java.lang.Class.forName(Class.java:274) 
at org.apache.spark.util.Utils$.classForName(Utils.scala:173) 
at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:56) 
at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala) 
Caused by: java.lang.ClassNotFoundException: com.datastax.spark.connector.japi.rdd.CassandraTableScanJavaRDD 
at java.net.URLClassLoader$1.run(URLClassLoader.java:366) 
at java.net.URLClassLoader$1.run(URLClassLoader.java:355) 
at java.security.AccessController.doPrivileged(Native Method) 
at java.net.URLClassLoader.findClass(URLClassLoader.java:354) 
at java.lang.ClassLoader.loadClass(ClassLoader.java:425) 
at java.lang.ClassLoader.loadClass(ClassLoader.java:358) 

Current dependencies:

<dependency> 
    <groupId>org.apache.spark</groupId> 
    <artifactId>spark-core_2.11</artifactId> 
    <version>1.5.1</version> 
</dependency> 
<dependency> 
    <groupId>org.apache.spark</groupId> 
    <artifactId>spark-sql_2.11</artifactId> 
    <version>1.5.1</version> 
</dependency> 
<dependency> 
    <groupId>org.apache.hadoop</groupId> 
    <artifactId>hadoop-client</artifactId> 
    <version>2.7.1</version> 
</dependency> 
<dependency> 
    <groupId>com.datastax.spark</groupId> 
    <artifactId>spark-cassandra-connector-java_2.11</artifactId> 
    <version>1.5.0-M2</version> 
</dependency> 
<dependency> 
    <groupId>com.datastax.cassandra</groupId> 
    <artifactId>cassandra-driver-core</artifactId> 
    <version>3.0.0-alpha4</version> 
</dependency> 

Java code:

import com.tempTable.Encounters; 

import org.apache.spark.SparkConf; 
import org.apache.spark.SparkContext; 
import org.apache.spark.api.java.JavaRDD; 

import java.util.Date; 

import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions; 
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapRowTo; 

Long now = new Date().getTime(); 
SparkConf conf = new SparkConf(true) 
        .setAppName("SparkSQLJob_" + now) 
        .set("spark.cassandra.connection.host", "192.168.1.75") 
        .set("spark.cassandra.connection.port", "9042"); 

SparkContext ctx = new SparkContext(conf); 
JavaRDD<Encounters> rdd = javaFunctions(ctx).cassandraTable("keyspace1", "employee", mapRowTo(Encounters.class)); 
System.out.println("rdd count = "+rdd.count()); 

Is there a version problem in the dependencies? Please help me resolve this error. Thanks in advance.


Are you using mvn package or assembly? – eliasah

Answer


The simple answer is:

You need a jar file with all of your dependencies bundled inside it.

The executor machines must have all of the relevant jar files on their classpath.
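
Note that spark-submit ships only your application jar by default. As an alternative to bundling everything, dependency jars can also be passed explicitly at submit time with --jars; a minimal sketch, where the jar paths and the main class name are placeholders:

spark-submit \
  --class com.tempTable.Main \
  --jars /path/to/spark-cassandra-connector-java_2.11-1.5.0-M2.jar \
  /path/to/app.jar

The downside is that every transitive dependency has to be listed by hand, which is why a fat jar is usually simpler.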

The solution is to build a fat jar using Gradle:

buildscript { 
    dependencies { 
        classpath 'com.github.jengelman.gradle.plugins:shadow:1.2.2' 
    } 
    repositories { 
        jcenter() 
    } 
} 

apply plugin: 'com.github.johnrengelman.shadow' 
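
For the snippet above to actually build the application, the script also needs the java plugin and the project dependencies; a minimal sketch, with the coordinates copied from the Maven POM in the question (the repository choice is an assumption):

apply plugin: 'java'

repositories {
    mavenCentral()
}

dependencies {
    // spark-core is provided by the cluster at runtime; with a newer
    // Gradle you can use compileOnly to keep it out of the fat jar
    compile 'org.apache.spark:spark-core_2.11:1.5.1'
    compile 'com.datastax.spark:spark-cassandra-connector-java_2.11:1.5.0-M2'
}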

Then run "gradle shadowJar" to build your jar file. After that, submit your job with the resulting jar; it should resolve your problem.
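
For example, assuming the defaults, shadowJar writes the jar to build/libs with an -all suffix, and the submit command would look something like this (the main class, master URL, and jar name are placeholders):

gradle shadowJar
spark-submit \
  --class com.tempTable.Main \
  --master spark://192.168.1.75:7077 \
  build/libs/myapp-all.jar

If you are building with Maven instead (per the comment above), the maven-shade-plugin produces an equivalent fat jar; a minimal sketch (the plugin version is an assumption):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.4.1</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>

Then "mvn package" produces the shaded jar alongside the regular one.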