2016-04-14 37 views
2

Spark SQL .registerTempTable: ERROR Table not found

After creating a DataFrame, I run into a problem with registerTempTable. What could be the cause? Thanks.

import org.apache.spark.sql.SQLContext 
val sqlContext = new SQLContext(sc) 
import sqlContext.implicits._ 
trainingData.registerTempTable("trainingdata") 
val countResult = sqlContext.sql("SELECT COUNT(*) FROM trainingdata").collect() 

The error message is:

    java.lang.RuntimeException: Table Not Found: trainingdata
        at scala.sys.package$.error(package.scala:27)
        at org.apache.spark.sql.catalyst.analysis.SimpleCatalog.lookupRelation(Catalog.scala:139)
        at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:257)
        at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$7.applyOrElse(Analyzer.scala:268)
        at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$7.applyOrElse(Analyzer.scala:264)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:57)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:57)
        at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:51)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:56)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$1.apply(LogicalPlan.scala:54)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$1.apply(LogicalPlan.scala:54)
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:249)

Can you share where your DataFrame 'trainingData' comes from? How did you create it? – FaigB

Answers

0

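Is it possible that you never actually created the trainingData DataFrame?

You need a statement like one of the following:

  • If you are reading from a Hive table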

    val trainingData = sqlContext.table(s"libname.tablename") 
    
  • If you are converting a sequence/array into a DataFrame

    val trainingData = Seq((1,2,3,4)).toDF("ce_sp", "ce_sp2", "ce_colour", "ce_sp3") 
    

Here are a number of other ways to convert an RDD to a DataFrame: How to convert rdd object to dataframe in spark
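For reference, here is a minimal end-to-end sketch for Spark 1.x that puts the pieces together. The sample row and column names are made up for illustration, and the app name and local master are assumptions so that the snippet also runs outside spark-shell. One thing worth keeping in mind: a temp table is registered in the catalog of the SQLContext that the DataFrame belongs to, so the sketch creates the DataFrame and runs the query through the same context.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Assumption: local mode for illustration. In spark-shell, getOrCreate
// simply returns the SparkContext the shell already started.
val conf = new SparkConf().setAppName("temp-table-sketch").setMaster("local[*]")
val sc = SparkContext.getOrCreate(conf)

val sqlContext = new SQLContext(sc)
import sqlContext.implicits._ // brings toDF into scope

// Hypothetical sample data; replace with your real source.
val trainingData = Seq((1, 2, 3, 4)).toDF("ce_sp", "ce_sp2", "ce_colour", "ce_sp3")

// Registers the table with the SQLContext that owns trainingData.
trainingData.registerTempTable("trainingdata")

// Query through that same SQLContext so the table can be resolved.
val countResult = sqlContext.sql("SELECT COUNT(*) FROM trainingdata").collect()
```

If trainingData was built from a different SQLContext than the one used for the query, trainingData.sqlContext.sql(...) is one way to make sure both sides use the same context.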

0

As of Spark 2 and later, you don't need to import the implicits class to run SQL; you can run the query directly, something like the following:

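// SEASON_UPDATE2 must already be registered as a table or temp view
val sqlSeason = spark.sql("""
    select distinct a.sku, a.season, a.counter from SEASON_UPDATE2 a
""")

// createOrReplaceTempView is the Spark 2.x replacement for registerTempTable
sqlSeason.createOrReplaceTempView("SEASON_UPDATE1")

sqlSeason.show()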
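The query above only resolves if SEASON_UPDATE2 already exists as a table or temp view. A self-contained sketch of the whole round trip on Spark 2.x might look like the following. The sample rows, the app name, and the local master are assumptions made for illustration, not part of the original answer. The implicits import is used here only for toDF, not for spark.sql.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: local mode for illustration. In spark-shell, getOrCreate
// reuses the session the shell already provides as `spark`.
val spark = SparkSession.builder()
  .appName("temp-view-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._ // needed only for toDF below

// Hypothetical rows standing in for the real SEASON_UPDATE2 data;
// the first two rows are duplicates on purpose.
val seasonUpdate2 = Seq(
  ("sku1", "summer", 1),
  ("sku1", "summer", 1),
  ("sku2", "winter", 2)
).toDF("sku", "season", "counter")

// Register the source view before querying it.
seasonUpdate2.createOrReplaceTempView("SEASON_UPDATE2")

val sqlSeason = spark.sql(
  "select distinct a.sku, a.season, a.counter from SEASON_UPDATE2 a")
sqlSeason.createOrReplaceTempView("SEASON_UPDATE1")

// The new view is visible to later queries on the same session.
val rowCount = spark.sql("select count(*) from SEASON_UPDATE1")
  .collect().head.getLong(0)
```

Temp views created this way are scoped to the SparkSession, so later queries need to go through the same `spark` instance.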