2016-11-19 65 views
0

Hive 0.14, Spark 1.6. I am trying to connect to Hive tables from Spark. I have put my hive-site.xml in the Spark conf folder, but every time I run this code it connects to the embedded Hive metastore, i.e. Derby. I have done a lot of googling, but the only suggestion I get is to put hive-site.xml into the Spark configuration folder, which I have already done. Could someone please suggest a solution? My code is below. In short: I am not able to connect to the Hive metastore from Spark SQL.

FYI: my existing Hive installation uses MySQL as the metastore.
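For reference, the metastore connection in my hive-site.xml uses the standard JDO properties, roughly like this (the host and database name here are placeholders for my actual values):

<property> 
    <name>javax.jdo.option.ConnectionURL</name> 
    <value>jdbc:mysql://localhost:3306/metastore</value> 
    <description>JDBC connect string for the MySQL metastore database (host and database name are placeholders)</description> 
</property> 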

I run this code directly from Eclipse, not through the spark-submit utility.

package org.scala.spark 

import org.apache.spark.SparkConf 
import org.apache.spark.SparkContext 
import org.apache.spark.sql.hive.HiveContext 

object HiveToHdfs { 

    def main(args: Array[String]): Unit = { 
        val conf = new SparkConf().setAppName("HDFS to Local").setMaster("local") 
        val sc = new SparkContext(conf) 
        // HiveContext is expected to pick up hive-site.xml from the classpath; 
        // here it silently falls back to a local Derby metastore instead 
        val hiveContext = new HiveContext(sc) 
        import hiveContext.implicits._ 

        // load a local file into an existing Hive table 
        hiveContext.sql("load data local inpath '/home/cloudera/Documents/emp_table.txt' into table employee") 

        sc.stop() 
    } 
} 

Below is the error log from Eclipse:

16/11/18 22:09:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 
16/11/18 22:09:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 
16/11/18 22:09:06 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 
16/11/18 22:09:06 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 
**16/11/18 22:09:06 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY** 
16/11/18 22:09:06 INFO ObjectStore: Initialized ObjectStore 
16/11/18 22:09:06 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0 
16/11/18 22:09:06 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException 
16/11/18 22:09:07 INFO HiveMetaStore: Added admin role in metastore 
16/11/18 22:09:07 INFO HiveMetaStore: Added public role in metastore 
16/11/18 22:09:07 INFO HiveMetaStore: No user is added in admin role, since config is empty 
16/11/18 22:09:07 INFO HiveMetaStore: 0: get_all_databases 
16/11/18 22:09:07 INFO audit: ugi=cloudera ip=unknown-ip-addr cmd=get_all_databases 
16/11/18 22:09:07 INFO HiveMetaStore: 0: get_functions: db=default pat=* 
16/11/18 22:09:07 INFO audit: ugi=cloudera ip=unknown-ip-addr cmd=get_functions: db=default pat=* 
16/11/18 22:09:07 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table. 
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------ 
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522) 
    at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:194) 
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:238) 
    at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:218) 
    at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:208) 
    at org.apache.spark.sql.hive.HiveContext.functionRegistry$lzycompute(HiveContext.scala:462) 
    at org.apache.spark.sql.hive.HiveContext.functionRegistry(HiveContext.scala:461) 
    at org.apache.spark.sql.UDFRegistration.<init>(UDFRegistration.scala:40) 
    at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:330) 
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:90) 
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101) 
    at org.scala.spark.HiveToHdfs$.main(HiveToHdfs.scala:15) 
    at org.scala.spark.HiveToHdfs.main(HiveToHdfs.scala) 
Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------ 
    at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:612) 
    at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554) 
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508) 
    ... 12 more 
16/11/18 22:09:07 INFO SparkContext: Invoking stop() from shutdown hook 

Please let me know if any other information is needed.

+0

Can you share your hive-site.xml? –

+0

Hi Nirmal, here is my hive-site.xml; I changed it to txt format: [hive-site.xml](https://www.dropbox.com/s/gj6mtt07e0po20w/hive-site.txt?dl=0) –

+0

Check the directory permissions. I see the error below: Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------ –

Answer

0

Check this link -> https://issues.apache.org/jira/browse/SPARK-15118. The metastore may be using a MySQL database.
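If hive-site.xml is not being picked up, that is typically because it is not on the application classpath: spark-submit adds the Spark conf folder automatically, but a run from Eclipse does not. One option is to add that folder to the Eclipse classpath; another is to point the HiveContext at the metastore service explicitly. A minimal sketch, assuming a Hive metastore thrift service is running (the host and port below are placeholders):

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc) 
// placeholder URI: replace with the host/port of the actual metastore 
// thrift service (started with: hive --service metastore) 
hiveContext.setConf("hive.metastore.uris", "thrift://localhost:9083") 
// if this lists the databases from the MySQL-backed metastore rather 
// than just "default", the connection is going to the right place 
hiveContext.sql("show databases").show()

Setting the property before the first query matters, since the metastore client is created lazily on first use.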

The error above relates to this Hive property:

<property> 
    <name>hive.exec.scratchdir</name> 
    <value>/tmp/hive</value> 
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description> 
</property> 

Grant write permissions on the /tmp/hive directory.
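For example, from a shell (assuming the hadoop client is on the PATH; 733 matches the description above, though many guides simply use 777):

hadoop fs -chmod -R 733 /tmp/hive

If the path resolves to the local filesystem rather than HDFS (common when running Spark in local mode), a plain chmod -R on the local /tmp/hive applies instead.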
