2013-10-28 71 views

Folks, the following Python script fails when run as an EMR job:

job state = FAILED 

Terminated
Last State Change: Access denied checking streaming input path: s3n://elasticmapreduce/samples/wordcount/input/ 

Code:

import boto 
import boto.emr 
from boto.emr.step import StreamingStep 
from boto.emr.bootstrap_action import BootstrapAction 
import time 

S3_BUCKET="mytesetbucket123asdf" 
conn = boto.connect_emr() 

step = StreamingStep(
    name='Wordcount', 
    mapper='s3n://elasticmapreduce/samples/wordcount/wordSplitter.py', 
    reducer='aggregate', 
    input='s3n://elasticmapreduce/samples/wordcount/input/', 
    output='s3n://' + S3_BUCKET + '/wordcount/output/2013-10-25') 

jobid = conn.run_jobflow(
    name="test", 
    log_uri="s3://" + S3_BUCKET + "/logs/", 
    visible_to_all_users=True,  # a boolean; any non-empty string, even "False", is truthy
    steps=[step])

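# Poll until the job flow reaches a terminal state. Waiting only for
# COMPLETED would loop forever once the job flow has FAILED.
state = conn.describe_jobflow(jobid).state
print "job state = ", state
print "job id = ", jobid
while state not in (u'COMPLETED', u'FAILED', u'TERMINATED'):
    print time.localtime()
    time.sleep(10)
    jobflow = conn.describe_jobflow(jobid)  # one API call per poll
    state = jobflow.state
    print jobflow
    print "job state = ", state
    print "job id = ", jobid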

# These paths must match the output path passed to StreamingStep above.
print "final output can be found in s3://" + S3_BUCKET + "/wordcount/output/2013-10-25"
print "try: $ s3cmd sync s3://" + S3_BUCKET + "/wordcount/output/2013-10-25/ ."

What happens if you try input='s3n://elasticmapreduce/samples/wordcount/input' or input='s3n://elasticmapreduce/samples/wordcount/input/*' instead? – alko

Answer

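The problem is somewhere in Boto... If we specify an IAM user instead of using a role, it works perfectly. EMR of course supports IAM roles... and the IAM role we tested with has full permissions to perform any task, so it is not a misconfiguration issue...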
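
The answer does not include code, so the following is only a minimal sketch of what "specifying an IAM user" could look like with boto 2. It assumes the IAM user's keys are available in the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. The helper names iam_user_credentials and connect_emr_as_iam_user are hypothetical and are not part of boto or of the original solution.

```python
import os


def iam_user_credentials(env=None):
    """Collect explicit IAM user keys from environment-style variables.

    The variable names are an assumption made for this sketch.
    Raises KeyError if either variable is missing.
    """
    env = os.environ if env is None else env
    return {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }


def connect_emr_as_iam_user(env=None):
    """Open an EMR connection with explicit IAM user keys instead of a role."""
    import boto  # imported lazily so the helper above works without boto installed
    return boto.connect_emr(**iam_user_credentials(env))
```

In the script from the question, conn = connect_emr_as_iam_user() would then stand in for conn = boto.connect_emr(). Keys passed explicitly take precedence over credentials that boto would otherwise pick up from the instance role.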

Could you post the code for your solution? Thanks – ecoe