2016-08-03

IBM Bluemix set_hadoop_config error: whenever I run the setup step for data analysis with Apache Spark, I get this error.

def set_hadoop_config(credentials): 
    prefix = "fs.swift.service." + credentials['name'] 
    hconf = sc._jsc.hadoopConfiguration() 
    hconf.set(prefix + ".auth.url", credentials['auth_url']+'/v3/auth/tokens') 
    hconf.set(prefix + ".auth.endpoint.prefix", "endpoints") 
    hconf.set(prefix + ".tenant", credentials['project_id']) 
    hconf.set(prefix + ".username", credentials['user_id']) 
    hconf.set(prefix + ".password", credentials['password']) 
    hconf.setInt(prefix + ".http.port", 8080) 
    hconf.set(prefix + ".region", credentials['region']) 
    hconf.setBoolean(prefix + ".public", True) 

credentials['name'] = 'keystone' 
set_hadoop_config(credentials) 

--------------------------------------------------------------------------- 
NameError         Traceback (most recent call last) 
<ipython-input-6-976c35e1d85e> in <module>() 
----> 1 credentials['name'] = 'keystone' 
     2 set_hadoop_config(credentials) 

NameError: name 'credentials' is not defined 

Does anyone know how to fix this? I'm stuck.

Answer


I think you are missing the credentials dictionary, i.e. you should pass in the parameter values for accessing the Object Storage service, like this:

credentials = 
{ 
    'auth_uri':'', 
    'global_account_auth_uri':'', 
    'username':'admin_b055482b7febbd287d9020d65cdd55f5653d0ffb', 
    'password':"XXXXXX", 
    'auth_url':'https://identity.open.softlayer.com', 
    'project':'object_storage_e5e45537_ea14_4d15_b90a_5fdd271ea402', 
    'project_id':'7d7e5f2a83fe47e586b91f459d47169f', 
    'region':'dallas', 
    'user_id':'001c394e06d74b86a76a786615e358e2', 
    'domain_id':'2df6373c549e49f8973fb6d22ab18c1a', 
    'domain_name':'639347', 
    'filename':'2015_SQL.csv', 
    'container':'notebooks', 
    'tenantId':'s322-e1e9acad6196b9-a1259eb961e2' 
} 
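Once the dictionary is defined before the call, `set_hadoop_config(credentials)` no longer raises the NameError. As a plain-Python sketch of which Hadoop configuration keys the function writes (using an in-memory dict in place of `sc._jsc.hadoopConfiguration()`, with made-up placeholder values):

```python
# Sketch only: mirrors the keys set_hadoop_config sets, without a SparkContext.
# The credential values below are placeholders, not real service credentials.
def build_swift_config(credentials):
    prefix = "fs.swift.service." + credentials['name']
    return {
        prefix + ".auth.url": credentials['auth_url'] + '/v3/auth/tokens',
        prefix + ".auth.endpoint.prefix": "endpoints",
        prefix + ".tenant": credentials['project_id'],
        prefix + ".username": credentials['user_id'],
        prefix + ".password": credentials['password'],
        prefix + ".http.port": 8080,
        prefix + ".region": credentials['region'],
        prefix + ".public": True,
    }

creds = {
    'name': 'keystone',
    'auth_url': 'https://identity.open.softlayer.com',
    'project_id': 'PLACEHOLDER_PROJECT_ID',
    'user_id': 'PLACEHOLDER_USER_ID',
    'password': 'PLACEHOLDER_PASSWORD',
    'region': 'dallas',
}
conf = build_swift_config(creds)
```

All keys share the `fs.swift.service.keystone` prefix, which is why `credentials['name'] = 'keystone'` must be set before the function is called.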

If you are using a notebook, you can get the dictionary above via "Insert to code" for the file listed under the Data Source panel (on the right-hand side).

To access the file, you need a Swift URI, like this:

raw_data = sc.textFile("swift://" + credentials['container'] + "." + credentials['name'] + "/" + credentials['filename']) 
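The URI follows the pattern `swift://<container>.<service-name>/<filename>`. A minimal sketch of how the string above is assembled, using the container, service name, and filename from this thread:

```python
# Sketch: build the Swift URI from the credentials dictionary.
# Only the three keys used in the URI are shown here.
credentials = {
    'container': 'notebooks',
    'name': 'keystone',
    'filename': '2015_SQL.csv',
}
uri = ("swift://" + credentials['container'] + "."
       + credentials['name'] + "/" + credentials['filename'])
# uri == "swift://notebooks.keystone/2015_SQL.csv"
```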

Thanks! You helped me solve this problem. –