Bluemix Apache Spark service - Scala - read a file

This is a basic question, but I am trying to retrieve the contents of a file with Scala code in a Bluemix notebook on the Apache Spark service in Analytics, and I keep getting authentication errors. Does anyone have a Scala authentication example for accessing a file? Thanks in advance!
I tried the following simple script:
val file = sc.textFile("swift://notebooks.keystone/kdd99.data")
file.take(1)
I also tried:
def setConfig(name:String) : Unit = {
val pfx = "fs.swift.service." + name
val conf = sc.getConf
conf.set(pfx + ".auth.url", "hardcoded")
conf.set(pfx + ".tenant", "hardcoded")
conf.set(pfx + ".username", "hardcoded")
conf.set(pfx + ".password", "hardcoded")
conf.set(pfx + ".apikey", "hardcoded")
conf.set(pfx + ".auth.endpoint.prefix", "endpoints")
}
setConfig("keystone")
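I also wondered whether these options need to go on the Hadoop configuration rather than on the SparkConf returned by `sc.getConf`, since (as far as I understand) the `swift://` scheme reads its settings from there. A minimal sketch of that variant, for a notebook where `sc` already exists; all values are placeholders:

```scala
def setHadoopConfig(name: String): Unit = {
  val pfx = "fs.swift.service." + name
  // swift:// URLs resolve their credentials via the Hadoop configuration,
  // not via the SparkConf
  val hconf = sc.hadoopConfiguration
  hconf.set(pfx + ".auth.url", "hardcoded")
  hconf.set(pfx + ".tenant", "hardcoded")
  hconf.set(pfx + ".username", "hardcoded")
  hconf.set(pfx + ".password", "hardcoded")
  hconf.set(pfx + ".apikey", "hardcoded")
  hconf.set(pfx + ".auth.endpoint.prefix", "endpoints")
}

setHadoopConfig("keystone")
```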
I also tried this script, from a previous question:
import scala.collection.breakOut
val name= "keystone"
val YOUR_DATASOURCE = """auth_url:https://identity.open.softlayer.com
project: hardcoded
project_id: hardcoded
region: hardcoded
user_id: hardcoded
domain_id: hardcoded
domain_name: hardcoded
username: hardcoded
password: hardcoded
filename: hardcoded
container: hardcoded
tenantId: hardcoded
"""
val settings:Map[String,String] = YOUR_DATASOURCE.split("\\n").
map(l=>(l.split(":",2)(0).trim(), l.split(":",2)(1).trim()))(breakOut)
val conf = sc.getConf
conf.set("fs.swift.service.keystone.auth.url", settings.getOrElse("auth_url", ""))
conf.set("fs.swift.service.keystone.tenant", settings.getOrElse("tenantId", ""))
conf.set("fs.swift.service.keystone.username", settings.getOrElse("username", ""))
conf.set("fs.swift.service.keystone.password", settings.getOrElse("password", ""))
conf.set("fs.swift.service.keystone.apikey", settings.getOrElse("password", ""))
conf.set("fs.swift.service.keystone.auth.endpoint.prefix", "endpoints")
println("sett: "+ settings.getOrElse("auth_url",""))
val file = sc.textFile("swift://notebooks.keystone/kdd99.data")
/* The following line gives errors */
file.take(1)
The error is below:
Name: org.apache.hadoop.fs.swift.exceptions.SwiftConfigurationException
Message: Missing mandatory configuration option: fs.swift.service.keystone.auth.url
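As a sanity check, the credentials parsing in the script above can be exercised on its own with placeholder values; a small sketch using `.toMap` instead of `breakOut` (which is gone in newer Scala versions):

```scala
// Hypothetical credentials text in the same one-"key: value"-per-line layout
val src = """auth_url: https://identity.open.softlayer.com
username: demo
password: secret"""

// Same idea as above: split each line on the first ':' (limit 2 keeps the
// "https://..." part intact) and trim both halves
val settings: Map[String, String] = src.split("\\n")
  .map { l => val Array(k, v) = l.split(":", 2); (k.trim, v.trim) }
  .toMap

println(settings("auth_url"))  // https://identity.open.softlayer.com
```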
EDIT
A Python option would also be fine. I tried the following, with "spark" as the configuration name, for two different files:
def set_hadoop_config(credentials):
prefix = "fs.swift.service." + credentials['name']
hconf = sc._jsc.hadoopConfiguration()
hconf.set(prefix + ".auth.url", credentials['auth_url']+'/v3/auth/tokens')
hconf.set(prefix + ".auth.endpoint.prefix", "endpoints")
hconf.set(prefix + ".tenant", credentials['project_id'])
hconf.set(prefix + ".username", credentials['user_id'])
hconf.set(prefix + ".password", credentials['password'])
hconf.setInt(prefix + ".http.port", 8080)
hconf.set(prefix + ".region", credentials['region'])
hconf.setBoolean(prefix + ".public", True)
Thanks NSHUKLA – tbuda
I have edited the question with a Python version. Could you take a look? – tbuda
For Python the code looks correct (you can refer to the sample "Analytics Notebooks and Apache Spark", which has the Python code for def set_hadoop_config(credentials)). I have tried it with .csv and .txt files using the keystone name. Are you hitting the problem in Spark only with the .data file, while, as you said, it works with a .txt file? – NSHUKLA