2016-11-19 65 views
0

Hive 0.14, Spark 1.6. I am trying to connect to Hive tables from Spark. I have put my hive-site.xml in the Spark conf folder, but every time I run this code it connects to the embedded Hive metastore, i.e. Derby. I have done a lot of googling, but the only suggestion I get is to put hive-site.xml into the Spark configuration folder, which I have already done. Could someone please suggest a solution? My code is below. In short: I am not able to connect to the Hive metastore from Spark SQL.

FYI: my existing Hive installation uses MySQL as the metastore.
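For reference, the metastore connection in my hive-site.xml uses the standard JDO properties, roughly like this (the host and database name here are placeholders for my actual values):

<property> 
    <name>javax.jdo.option.ConnectionURL</name> 
    <value>jdbc:mysql://localhost:3306/metastore</value> 
    <description>JDBC connect string for the MySQL metastore database (host and database name are placeholders)</description> 
</property> 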

I run this code directly from Eclipse, not through the spark-submit utility.

package org.scala.spark 

import org.apache.spark.SparkConf 
import org.apache.spark.SparkContext 
import org.apache.spark.sql.hive.HiveContext 

object HiveToHdfs { 

    def main(args: Array[String]): Unit = { 
        val conf = new SparkConf().setAppName("HDFS to Local").setMaster("local") 
        val sc = new SparkContext(conf) 
        // HiveContext is expected to pick up hive-site.xml from the classpath; 
        // here it silently falls back to a local Derby metastore instead 
        val hiveContext = new HiveContext(sc) 
        import hiveContext.implicits._ 

        // load a local file into an existing Hive table 
        hiveContext.sql("load data local inpath '/home/cloudera/Documents/emp_table.txt' into table employee") 

        sc.stop() 
    } 
} 

Below is the error log from Eclipse:

16/11/18 22:09:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 
16/11/18 22:09:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 
16/11/18 22:09:06 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 
16/11/18 22:09:06 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 
**16/11/18 22:09:06 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY** 
16/11/18 22:09:06 INFO ObjectStore: Initialized ObjectStore 
16/11/18 22:09:06 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0 
16/11/18 22:09:06 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException 
16/11/18 22:09:07 INFO HiveMetaStore: Added admin role in metastore 
16/11/18 22:09:07 INFO HiveMetaStore: Added public role in metastore 
16/11/18 22:09:07 INFO HiveMetaStore: No user is added in admin role, since config is empty 
16/11/18 22:09:07 INFO HiveMetaStore: 0: get_all_databases 
16/11/18 22:09:07 INFO audit: ugi=cloudera ip=unknown-ip-addr cmd=get_all_databases 
16/11/18 22:09:07 INFO HiveMetaStore: 0: get_functions: db=default pat=* 
16/11/18 22:09:07 INFO audit: ugi=cloudera ip=unknown-ip-addr cmd=get_functions: db=default pat=* 
16/11/18 22:09:07 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table. 
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------ 
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522) 
    at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:194) 
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:238) 
    at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:218) 
    at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:208) 
    at org.apache.spark.sql.hive.HiveContext.functionRegistry$lzycompute(HiveContext.scala:462) 
    at org.apache.spark.sql.hive.HiveContext.functionRegistry(HiveContext.scala:461) 
    at org.apache.spark.sql.UDFRegistration.<init>(UDFRegistration.scala:40) 
    at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:330) 
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:90) 
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101) 
    at org.scala.spark.HiveToHdfs$.main(HiveToHdfs.scala:15) 
    at org.scala.spark.HiveToHdfs.main(HiveToHdfs.scala) 
Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------ 
    at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:612) 
    at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554) 
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508) 
    ... 12 more 
16/11/18 22:09:07 INFO SparkContext: Invoking stop() from shutdown hook 

Please let me know if any other information is needed.

+0

Can you share your hive-site.xml? –

+0

Hi Nirmal, here is my hive-site.xml; I changed it to txt format: [hive-site.xml](https://www.dropbox.com/s/gj6mtt07e0po20w/hive-site.txt?dl=0) –

+0

Check the directory permissions. I see the error below: Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------ –

Answer

0

Check this link -> https://issues.apache.org/jira/browse/SPARK-15118. The metastore may be using a MySQL database.
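If hive-site.xml is not being picked up, that is typically because it is not on the application classpath: spark-submit adds the Spark conf folder automatically, but a run from Eclipse does not. One option is to add that folder to the Eclipse classpath; another is to point the HiveContext at the metastore service explicitly. A minimal sketch, assuming a Hive metastore thrift service is running (the host and port below are placeholders):

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc) 
// placeholder URI: replace with the host/port of the actual metastore 
// thrift service (started with: hive --service metastore) 
hiveContext.setConf("hive.metastore.uris", "thrift://localhost:9083") 
// if this lists the databases from the MySQL-backed metastore rather 
// than just "default", the connection is going to the right place 
hiveContext.sql("show databases").show()

Setting the property before the first query matters, since the metastore client is created lazily on first use.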

The error above relates to this Hive property:

<property> 
    <name>hive.exec.scratchdir</name> 
    <value>/tmp/hive</value> 
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description> 
</property> 

Grant write permissions on the /tmp/hive directory.
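For example, from a shell (assuming the hadoop client is on the PATH; 733 matches the description above, though many guides simply use 777):

hadoop fs -chmod -R 733 /tmp/hive

If the path resolves to the local filesystem rather than HDFS (common when running Spark in local mode), a plain chmod -R on the local /tmp/hive applies instead.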
