Hive on Spark: Failed to create spark client. I am trying to get Hive 2.1.1 on Spark 2.1.0 working on a single instance. I am not sure whether this setup is correct; at the moment I only have one machine, so I cannot build a cluster.
When I run any insert query in Hive, I get the error:
hive> insert into mcus (id, name) values (1, 'ARM');
Query ID = server_20170223121333_416506b4-13ba-45a4-a0a2-8417b187e8cc
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
I am afraid I did not configure it correctly, because I cannot find any Spark logs under hdfs dfs -ls /spark/eventlog. Here is the Spark/YARN-related part of my hive-site.xml:
<property>
  <name>hive.exec.stagingdir</name>
  <value>/tmp/hive-staging</value>
</property>
<property>
  <name>hive.fetch.task.conversion</name>
  <value>more</value>
</property>
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
<property>
  <name>spark.master</name>
  <value>spark://ThinkPad-W550s-Lab:7077</value>
</property>
<property>
  <name>spark.eventLog.enabled</name>
  <value>true</value>
</property>
<property>
  <name>spark.eventLog.dir</name>
  <value>hdfs://localhost:8020/spark/eventlog</value>
</property>
<property>
  <name>spark.executor.memory</name>
  <value>2g</value>
</property>
<property>
  <name>spark.serializer</name>
  <value>org.apache.spark.serializer.KryoSerializer</value>
</property>
<property>
  <name>spark.home</name>
  <value>/home/server/spark</value>
</property>
<property>
  <name>spark.yarn.jar</name>
  <value>hdfs://localhost:8020/spark-jars/*</value>
</property>
1) Since I did not configure a fs.default.name value in Hadoop, can I just use hdfs://localhost:8020 as the file system path in the configuration files, or should I change the port to 9000? (I get the same error when I change 8020 to 9000.)
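For context, a minimal core-site.xml sketch for a hypothetical single-node setup (fs.defaultFS is the current name of the deprecated fs.default.name property) would pin the NameNode port explicitly; the key point is that whatever port is set here must match every hdfs:// URI used elsewhere, such as spark.eventLog.dir and spark.yarn.jar:

```xml
<!-- core-site.xml: hypothetical single-node sketch, not the asker's actual file.
     The port here must match all hdfs://localhost:PORT URIs in hive-site.xml. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
```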
2) I start Spark via start-master.sh and start-slave.sh spark://ThinkPad-W550s-Lab:7077. Is that correct?
3) According to this thread, how can I check the Spark Executor Memory + Overhead value in order to set the yarn.scheduler.maximum-allocation-mb and yarn.nodemanager.resource.memory-mb values? The yarn.scheduler.maximum-allocation-mb and yarn.nodemanager.resource.memory-mb values are already much larger than spark.executor.memory.
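As a rough check for question 3: the YARN container a Spark executor requests is spark.executor.memory plus spark.yarn.executor.memoryOverhead, which in Spark 2.1 defaults to max(384 MB, 10% of executor memory) when not set explicitly. A small sketch of that arithmetic (the 2048 MB input is taken from the spark.executor.memory=2g setting above):

```python
def executor_container_mb(executor_memory_mb,
                          overhead_mb=None,
                          min_overhead_mb=384,
                          overhead_fraction=0.10):
    """Approximate the YARN container size (MB) one Spark executor requests.

    Mirrors Spark 2.1's default for spark.yarn.executor.memoryOverhead:
    max(384 MB, 10% of spark.executor.memory), unless set explicitly.
    """
    if overhead_mb is None:
        overhead_mb = max(min_overhead_mb,
                          int(executor_memory_mb * overhead_fraction))
    return executor_memory_mb + overhead_mb

# spark.executor.memory = 2g => 2048 MB; 10% is 204 MB, so the
# 384 MB floor applies: 2048 + 384 = 2432 MB per container.
print(executor_container_mb(2048))
```

yarn.scheduler.maximum-allocation-mb must be at least this per-container total, and yarn.nodemanager.resource.memory-mb at least as large, or YARN will refuse to allocate the executor.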
4) How can I fix the Failed to create spark client error? Thanks a lot!