2016-08-14 31 views

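Python Fabric parallel execution failing on EC2: Updated

So, background first. I was running Ubuntu 14.04 and ran the following script without problems; in it I put a file on several EC2 instances. Note: I use the same IPs to illustrate both the success and the failure, but assume I ran this script from scratch both times, generating new IPs each time.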

import boto.ec2
import os
from time import sleep  # needed by the status loop below
from fabric.api import run, parallel, env, sudo
from fabric.tasks import execute
from fabric.operations import put

# file path python scripts and data 
rps_file = "review_page_scraper.py" 


# make sure hosts are clear before we add to them 
env.hosts = [] 


# how many instances to start and how to split up the data frame 
num_instances = 3 


# EC2 access keys 
access_key = 'my_access_key' 
secret_key = 'my_secret_key' 


# get a connection to the east region 
conn = boto.ec2.connect_to_region("us-east-1", 
            aws_access_key_id=access_key, 
            aws_secret_access_key=secret_key) 


# create the reservation of instances 
reservation = conn.run_instances('my_ami_id', 
           key_name='original_key', # my original key 
           security_groups=['my_sec_grp'], 
           instance_type='t2.micro', 
           min_count=num_instances, 
           max_count=num_instances) 


# get list of instances 
instance_lst = reservation.instances 


# get a status update and wait if the instance isn't up and running yet 
for instance in instance_lst:
    while instance.state != "running":
        sleep(5)
        instance.update()
    print "%s is running" % instance.ip_address


# get username and host, add 'ubuntu' as username 
hosts = ["ubuntu@" + ip.ip_address for ip in instance_lst]
env.hosts = hosts # set environment variable 


@parallel 
def upload_scripts_data(file_name): 
    path = "~/amazon_proj/amazon/" 
    put(path + file_name, "~") # put it in the home dir of EC2 instance 


# execute functions w/ rps_file 
execute(upload_scripts_data, rps_file) # send review_page_scraper helpers 

Here is the output:

In [25]: execute(upload_scripts_data, rps_file) # send review_page_scraper helpers 
[ubuntu@<ip>] Executing task 'upload_scripts_data'
[ubuntu@<ip>] Executing task 'upload_scripts_data'
[ubuntu@<ip>] Executing task 'upload_scripts_data'
[ubuntu@<ip>] put: /home/rerwin21/amazon_proj/amazon/review_page_scraper.py -> /home/ubuntu/review_page_scraper.py
[ubuntu@<ip>] put: /home/rerwin21/amazon_proj/amazon/review_page_scraper.py -> /home/ubuntu/review_page_scraper.py
[ubuntu@<ip>] put: /home/rerwin21/amazon_proj/amazon/review_page_scraper.py -> /home/ubuntu/review_page_scraper.py
Out[25]: 
{u'ubuntu@<ip>': None,
 u'ubuntu@<ip>': None,
 u'ubuntu@<ip>': None}

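Now, the problem: I ruined my Ubuntu installation and lost my key pair, which I had generated with ssh-keygen -t rsa and named 'original_key' when I imported the public key into AWS. So I had to reinstall Ubuntu, and I chose 16.04. I generated a new key with ssh-keygen -t rsa and saved it to ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub for the private and public keys, respectively.

Then I imported the public key into AWS and saved it under the name "id_rsa_pub". So now I run the same script as above, with the key_name parameter changed to "id_rsa_pub". I also ran chmod 0400 id_rsa, as AWS instructs. The output is: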

In [22]: execute(upload_scripts_data, rps_file) 
[ubuntu@<ip>] Executing task 'upload_scripts_data'
[ubuntu@<ip>] Executing task 'upload_scripts_data'
[ubuntu@<ip>] Executing task 'upload_scripts_data'
!!! Parallel execution exception under host u'ubuntu@<ip>':
Process ubuntu@<ip>:
Traceback (most recent call last): 
File "/home/rerwin21/anaconda2/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap 
self.run() 
File "/home/rerwin21/anaconda2/lib/python2.7/multiprocessing/process.py", line 114, in run 
self._target(*self._args, **self._kwargs) 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/tasks.py", line 242, in inner 
submit(task.run(*args, **kwargs)) 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/tasks.py", line 174, in run 
return self.wrapped(*args, **kwargs) 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/decorators.py", line 181, in inner 
Traceback (most recent call last): 

File "<ipython-input-22-ed18eb00cc62>", line 1, in <module> 
execute(upload_scripts_data, rps_file) 

File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/tasks.py", line 412, in execute 
ran_jobs = jobs.run() 

File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/job_queue.py", line 168, in run 
self._fill_results(results) 

File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/job_queue.py", line 191, in _fill_results 
datum = self._comms_queue.get_nowait() 

File "/home/rerwin21/anaconda2/lib/python2.7/multiprocessing/queues.py", line 152, in get_nowait 
return self.get(False) 

File "/home/rerwin21/anaconda2/lib/python2.7/multiprocessing/queues.py", line 135, in get 
res = self._recv() 

TypeError: ('__init__() takes exactly 2 arguments (3 given)', <class 'paramiko.ssh_exception.NoValidConnectionsError'>, (None, 'Unable to connect to port 22 on or 54.165.186.168')) 

    return func(*args, **kwargs) 
File "<ipython-input-19-ed4344124d24>", line 4, in upload_scripts_data 
put(path + file_name, "~") 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/network.py", line 677, in host_prompting_wrapper 
return func(*args, **kwargs) 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/operations.py", line 345, in put 
ftp = SFTP(env.host_string) 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/sftp.py", line 33, in __init__ 
self.ftp = connections[host_string].open_sftp() 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/network.py", line 159, in __getitem__ 
self.connect(key) 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/network.py", line 151, in connect 
user, host, port, cache=self, seek_gateway=seek_gateway) 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/network.py", line 603, in connect 
raise NetworkError(msg, e) 
NetworkError: Low level socket error connecting to host 54.165.186.168 on port 22: Unable to connect to port 22 on or 54.165.186.168 (tried 1 time) 
[ubuntu@<ip>] put: /home/rerwin21/amazon_proj/amazon/review_page_scraper.py -> /home/ubuntu/review_page_scraper.py
!!! Parallel execution exception under host u'ubuntu@<ip>':
Process ubuntu@<ip>:
Traceback (most recent call last): 
File "/home/rerwin21/anaconda2/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap 
self.run() 
File "/home/rerwin21/anaconda2/lib/python2.7/multiprocessing/process.py", line 114, in run 
self._target(*self._args, **self._kwargs) 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/tasks.py", line 242, in inner 
submit(task.run(*args, **kwargs)) 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/tasks.py", line 174, in run 
return self.wrapped(*args, **kwargs) 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/decorators.py", line 181, in inner 
return func(*args, **kwargs) 
File "<ipython-input-19-ed4344124d24>", line 4, in upload_scripts_data 
put(path + file_name, "~") 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/network.py", line 677, in host_prompting_wrapper 
return func(*args, **kwargs) 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/operations.py", line 345, in put 
ftp = SFTP(env.host_string) 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/sftp.py", line 33, in __init__ 
self.ftp = connections[host_string].open_sftp() 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/network.py", line 159, in __getitem__ 
self.connect(key) 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/network.py", line 151, in connect 
user, host, port, cache=self, seek_gateway=seek_gateway) 
File "/home/rerwin21/anaconda2/lib/python2.7/site-packages/fabric/network.py", line 603, in connect 
raise NetworkError(msg, e) 
NetworkError: Timed out trying to connect to 54.173.57.59 (tried 1 time) 

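I apologize for such a detailed question and so much output. I've exhausted my own knowledge and haven't found anything online that matches my problem exactly.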
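To rule out the key file itself, the chmod 0400 step mentioned earlier can be checked from Python. This is only a sketch: key_permissions_ok is a hypothetical helper, not part of boto or Fabric, and it assumes a POSIX filesystem where the private key should be inaccessible to group and others.

```python
import os
import stat


def key_permissions_ok(path):
    """Return True if the file at `path` grants no access to group or others."""
    # S_IMODE strips the file-type bits and keeps only the permission bits.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # 0o077 masks every group/other bit; all of them must be zero.
    return (mode & 0o077) == 0
```

For the key described above, this would be called as key_permissions_ok(os.path.expanduser("~/.ssh/id_rsa")).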

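Update:
A few things to note. I'm using the same security group and AMI that I used with the old key pair. Also, and more confusingly, if I run the execute(upload_scripts_data, rps_file) command a second time, it runs without error.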
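One plausible reading of this behaviour (an assumption, not something the output proves) is that EC2 reports an instance as "running" before its SSH daemon accepts connections, so the first execute() reaches port 22 too early and a second run succeeds. A minimal sketch of waiting for the port first; wait_for_port and its parameters are hypothetical names, not part of boto or Fabric:

```python
import socket
import time


def wait_for_port(host, port=22, timeout=120.0, interval=5.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.time() + timeout
    while True:
        try:
            # Each attempt waits at most `interval` seconds for the connection.
            sock = socket.create_connection((host, port), timeout=interval)
            sock.close()
            return True
        except (socket.error, socket.timeout):
            # Refused or timed out: give up at the deadline, else pause and retry.
            if time.time() >= deadline:
                return False
            time.sleep(interval)
```

In the script above, this could be called as wait_for_port(instance.ip_address) for each instance after the status loop and before execute(...).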

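Check the security group, for example with the IP '54.173.57.59'. Does it allow incoming TCP connections on port 22? Is the instance running an ssh daemon?

Thanks @NehalJWani, I'll respond in an update above; please take a look.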

Answer


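UPDATE: This did not solve it. The only way I have found to get past the problem is to run the parallel command again.

Very embarrassing, but I have to post it. As I mentioned, I was forced to completely reinstall Ubuntu and lost my key pair. What I overlooked was my ssh_config file. To fix the issue: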
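Running the command again by hand can be automated with a small retry wrapper. The sketch below is generic Python; retry and its parameters are hypothetical names, and which exception types to catch is an assumption that depends on how the failure surfaces.

```python
import time


def retry(func, attempts=3, delay=10.0, exceptions=(Exception,)):
    """Call func() up to `attempts` times, sleeping `delay` seconds between tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except exceptions as err:
            last_error = err
            # Sleep only if another attempt is still to come.
            if attempt < attempts - 1:
                time.sleep(delay)
    # Every attempt failed: re-raise the most recent error.
    raise last_error
```

With the task from the question this might look like retry(lambda: execute(upload_scripts_data, rps_file), attempts=3, delay=15.0). Fabric 1.x aborts through SystemExit by default, which is not a subclass of Exception, so the exceptions tuple may need adjusting. Fabric 1.x also has env.connection_attempts and env.timeout settings, which may be a simpler alternative given the "(tried 1 time)" in the error messages.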

sudo gedit 

Once in gedit, uncomment the Port 22 line:

# Port 22 

Port 22 

Save, and now I'm back up and running!