2014-02-06

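Python error when downloading images?

I'm new to Python and am trying to automate downloading images from Google. I want to enter a keyword and have my program run on its own, downloading the images from Google and saving them to a folder on my computer. Here is my code: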

import json 
import os 
import time 
import requests 
from PIL import Image 
from StringIO import StringIO 
from requests.exceptions import ConnectionError 


def go(query, path):

    BASE_URL = 'https://ajax.googleapis.com/ajax/services/search/images?'\
               'v=1.0&q=' + query + '&start=%d'

    BASE_PATH = os.path.join(path, query)

    if not os.path.exists(BASE_PATH):
        os.makedirs(BASE_PATH)

    start = 0 # Google's start query string parameter for pagination.
    while start < 60: # Google will only return a max of 56 results.
        r = requests.get(BASE_URL % start)
        for image_info in json.loads(r.text)['responseData']['results']:
            url = image_info['unescapedUrl']
            try:
                image_r = requests.get(url)
            except ConnectionError, e:
                print 'could not download %s' % url
                continue

            # Remove file-system path characters from name.
            title = image_info['titleNoFormatting'].replace('/', '').replace('\\', '')

            file = open(os.path.join(BASE_PATH, '%s.jpg') % title, 'w')
            try:
                Image.open(StringIO(image_r.content)).save(file, 'JPEG')
            except IOError, e:
                # Throw away some gifs
                print 'could not save %s' % url
                continue
            finally:
                file.close()

        print start
        start += 4 # 4 images per page.

        time.sleep(1.5)

Example usage:

go('angry human faces', 'myDirectory')

But I keep getting an error saying:

file = open(os.path.join(BASE_PATH, '%s.jpg') % title, 'w') 
IOError: [Errno 22] invalid mode ('w') or filename: u'myDirectory\\landscape\\Nature - Landscapes - Views - Desktop Wallpapers | MIRIADNA..jpg'

What do I need to do to fix this? Please help! I would really appreciate it.

Answers

filename: u'... - Desktop Wallpapers | MIRIADNA..jpg'
                                     ^ This is the problem

Windows does not allow the pipe character (|) in file names.

http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx

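The following characters are reserved:

  • < (less than)
  • > (greater than)
  • : (colon)
  • " (double quote)
  • / (forward slash)
  • \ (backslash)
  • | (vertical bar or pipe)
  • ? (question mark)
  • * (asterisk)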
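
As a rough illustration, the reserved characters listed above can be checked against a title in a few lines. This is only a sketch, written for Python 3; `RESERVED_CHARS` and `find_reserved` are names made up for this example, and the sample title is the one from the file name in the question's error message.

```python
# Hypothetical names for illustration; the set mirrors the reserved
# characters listed above.
RESERVED_CHARS = '<>:"/\\|?*'


def find_reserved(title):
    """Return the reserved characters that occur in title, in order."""
    return [ch for ch in title if ch in RESERVED_CHARS]


# Title taken from the file name in the question's error message.
title = 'Nature - Landscapes - Views - Desktop Wallpapers | MIRIADNA.'
print(find_reserved(title))  # ['|']
```

The pipe comes from the page title returned by the image search, not from the script's source code, so it will not show up when reading the code itself.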

In your case, a reserved character appears in the title of the image you downloaded, which you then use as the file name. You can easily strip these characters, for example:

title = ''.join(ch for ch in title if ch not in '<>:"/\\|?*')
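
If the same clean-up is needed in more than one place, it could be wrapped in a small helper. The following is a minimal sketch for Python 3, not part of the original code: `sanitize_title`, its `replacement` parameter, and the `'untitled'` fallback are all assumptions made for this example.

```python
import os

# Assumption: the same reserved set as in the list above.
RESERVED_CHARS = '<>:"/\\|?*'


def sanitize_title(title, replacement=''):
    """Replace every reserved character in title with replacement."""
    table = {ord(ch): replacement for ch in RESERVED_CHARS}
    cleaned = title.translate(table).strip()
    # Hypothetical fallback so the resulting file name is never empty.
    return cleaned or 'untitled'


title = 'Nature - Landscapes - Views - Desktop Wallpapers | MIRIADNA.'
safe = sanitize_title(title)
# Build the target path from the cleaned title instead of the raw one.
path = os.path.join('myDirectory', 'landscape', safe + '.jpg')
print(path)
```

Passing `replacement='_'` keeps a visible marker where a character was removed, which can help when two titles differ only in a reserved character.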

But there's no pipe character? – user3105664

@user3105664 Yes, there is: '| miranda.jpg' – TankorSmash

But I didn't include any pipe characters in the code itself. When I ran the program, it returned an error. – user3105664