2016-05-18 92 views
2

I am trying to install Spark 1.6.1 on Windows 10, and so far I have done the following...

  1. Downloaded Spark 1.6.1, unzipped it to a directory, and set SPARK_HOME
  2. Downloaded Scala 2.11.8, unzipped it to a directory, and set SCALA_HOME
  3. Set the _JAVA_OPTIONS environment variable
  4. Downloaded winutils from https://github.com/steveloughran/winutils.git by downloading the zip of the repository, then set the HADOOP_HOME environment variable. (Not sure if this is incorrect; I couldn't clone the repository because permission was denied.)
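For reference, the environment-variable steps above can be sketched from a Windows command prompt roughly as follows. The install paths are assumptions — substitute wherever you actually unzipped things, ideally somewhere without spaces in the path:

```shell
:: Sketch of the environment setup described in the steps above (Windows cmd).
:: The paths are illustrative examples, not the asker's actual locations.
setx SPARK_HOME C:\Spark
setx SCALA_HOME C:\Scala
setx HADOOP_HOME C:\Hadoop
:: _JAVA_OPTIONS is read by every JVM that starts; e.g. to cap the heap:
setx _JAVA_OPTIONS -Xmx512M
:: setx writes to the registry; open a NEW command prompt for the
:: variables to be visible.
```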

When I go to the Spark home directory and run bin\spark-shell I get

'C:\Program' is not recognized as an internal or external command, operable program or batch file. 

I must be missing something. I don't see how I could run a bash script from a Windows environment anyway, but hopefully I don't need to understand that just to get this working. I have been following this tutorial - https://hernandezpaul.wordpress.com/2016/01/24/apache-spark-installation-on-windows-10/. Any help would be appreciated.

Answer

3

You need to download the winutils executable, not the source code.

You can download it here, or if you really want the entire Hadoop distribution, you can find the 2.6.0 binaries here. Then you need to set HADOOP_HOME to the directory containing winutils.exe.
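As a hedged sketch (the C:\Hadoop path is an assumption): Hadoop's Windows shim typically looks for winutils.exe under a bin subdirectory of HADOOP_HOME, so a layout like the following is a safe bet, and you can sanity-check it from cmd:

```shell
:: Illustrative layout -- adjust the root to wherever you put winutils:
::
::   C:\Hadoop\
::     bin\
::       winutils.exe
::
:: Quick sanity check once HADOOP_HOME is set (new prompt):
%HADOOP_HOME%\bin\winutils.exe ls C:\
```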

Also, make sure the directory you put Spark in does not contain spaces; this is very important, otherwise it won't work.

Once you have that set up, you don't start spark-shell.sh; you start spark-shell.cmd:

C:\Spark\bin>spark-shell 
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). 
log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. 
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties 
To adjust logging level use sc.setLogLevel("INFO") 
Welcome to 
      ____              __ 
     / __/__  ___ _____/ /__ 
    _\ \/ _ \/ _ `/ __/ '_/ 
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.1 
      /_/ 

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_91) 
Type in expressions to have them evaluated. 
Type :help for more information. 
Spark context available as sc. 
16/05/18 19:31:56 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-core-3.2.10.jar." 
16/05/18 19:31:56 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-api-jdo-3.2.6.jar." 
16/05/18 19:31:56 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-rdbms-3.2.9.jar." 
16/05/18 19:31:56 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies) 
16/05/18 19:31:56 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies) 
16/05/18 19:32:01 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0 
16/05/18 19:32:01 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException 
16/05/18 19:32:07 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-core-3.2.10.jar." 
16/05/18 19:32:07 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-api-jdo-3.2.6.jar." 
16/05/18 19:32:07 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-rdbms-3.2.9.jar." 
16/05/18 19:32:07 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies) 
16/05/18 19:32:08 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies) 
16/05/18 19:32:12 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0 
16/05/18 19:32:12 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException 
SQL context available as sqlContext. 

scala> 
+0

Thanks so much for the help with the space in the path! When running spark-shell I ran into another error, related to building Spark: "Failed to find Spark assembly JAR. You need to build Spark before running this program." I'll look into that one. I'm glad I posted this question; I wouldn't have guessed it was just a space issue, but it makes sense that robust command-line parsing wouldn't be a priority for a utility like this –

+1

@Mike I agree, but that's what we've got :\ . –

+0

Hi Yuval, is winutils 32-bit or 64-bit? I'm still hitting an error while it tries to initialize the SQL context, right in the middle of spark-shell starting up. It gives me the following warning: "Your hostname, DELE-6565 resolves to a loopback/non-reachable address: fe80:0:0:0:0:5efe:c0a8:103%net1, but we couldn't find any external IP address!" and then throws an exception... "java.lang.RuntimeException: java.lang.NullPointerException." I'm trying to track down the cause. –