I want to create a simple Apache Spark application, built with SBT, that connects to Cassandra via the DataStax Cassandra connector and performs some operations, but I get the following error:

Symbol 'type <none>.package.DataFrame' is missing from the classpath.

build.sbt

name := "spark-app" 
version := "1.0" 
scalaVersion := "2.11.11" 


libraryDependencies ++= Seq(
    "com.datastax.spark" %% "spark-cassandra-connector" % "2.0.0", 
    "org.apache.spark" %% "spark-core" % "2.1.1" % "provided" 
) 

resolvers += "Spark Packages Repo" at "https://dl.bintray.com/spark-packages/maven" 

My simple application:

package com.budgetbakers.be.dwh.spark 
import com.datastax.spark.connector._ 
import org.apache.spark.{SparkConf, SparkContext} 

object Distinct { 
  def main(args: Array[String]): Unit = { 
    // Point the connector at the local Cassandra node 
    val conf = new SparkConf(true) 
      .set("spark.cassandra.connection.host", "127.0.0.1") 

    val sc = new SparkContext(conf) 
    // Print the distinct values of the "gender" column from ks.users 
    println(sc.cassandraTable("ks", "users").select("gender").distinct().collect().mkString(",")) 
    sc.stop() 
  } 
} 

When I try to package the project, I get the following compile error:

[error] /.../Distinct.scala:18: Symbol 'type <none>.package.DataFrame' is missing from the classpath. 
[error] This symbol is required by 'value com.datastax.spark.connector.package.dataFrame'. 
[error] Make sure that type DataFrame is in your classpath and check for conflicting dependencies with `-Ylog-classpath`. 
[error] A full rebuild may help if 'package.class' was compiled against an incompatible version of <none>.package. 
[error]  println(sc.cassandraTable("ks", "users").select("gender").distinct().collect().mkString(",")) 
[error]   ^

Am I missing something? Maybe there is a dependency conflict?

Versions I am using:

  • Cassandra: 3.1
  • Apache Spark: 2.1.1
  • Spark Cassandra Connector: 2.0.0
  • Scala: 2.11
  • SBT: 0.13.15
  • sbt-assembly plugin: 0.14.0

Answer


Try adding the spark-sql dependency alongside the core library. For future reference, there are example build files here.
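
A minimal sketch of the adjusted libraryDependencies, assuming spark-sql is pinned to the same version as spark-core (2.1.1) and, like it, marked provided:

libraryDependencies ++= Seq(
    "com.datastax.spark" %% "spark-cassandra-connector" % "2.0.0", 
    "org.apache.spark" %% "spark-core" % "2.1.1" % "provided", 
    // spark-sql supplies org.apache.spark.sql.DataFrame, the type the 
    // connector's package object references in the failing compile step 
    "org.apache.spark" %% "spark-sql" % "2.1.1" % "provided" 
)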


Shouldn't it be a transitive dependency of the Cassandra Spark connector? Just curious. Given that the OP doesn't use Spark SQL at all, I find it hard to see why the dependency should be in the OP's 'build.sbt'. –


It gets included by including the package object, and the compiler has to look up all of the classes referenced there. As for why it isn't pulled in automatically by the scc, I don't remember why that decision was made, perhaps to allow multiple Spark versions – RussS
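
If the connector does leave the choice of Spark version to the application, one sketch of keeping spark-core and spark-sql in lockstep is to factor the version into a single value (the sparkVersion name is illustrative, not from the post):

val sparkVersion = "2.1.1" // should match the Spark version of the target cluster

libraryDependencies ++= Seq(
    "com.datastax.spark" %% "spark-cassandra-connector" % "2.0.0", 
    "org.apache.spark" %% "spark-core" % sparkVersion % "provided", 
    "org.apache.spark" %% "spark-sql" % sparkVersion % "provided" 
)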