Answers
8
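On the Mac there is webkit2png, and on Linux + KDE you can use khtml2png. I've tried the former and it works quite well; I've only heard of the latter being put to use.
I recently came across QtWebKit, which claims to be cross-platform (Qt rolled WebKit into their library, I guess). I've never tried it, though, so I can't tell you much more.
The QtWebKit link shows how to access it from Python. For the other tools, you should at least be able to do the same by calling them through subprocess.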
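As a rough illustration of the subprocess suggestion, here is a minimal Python 3 sketch. It assumes the chosen tool (webkit2png or khtml2png) is already installed and on PATH. The helper names `build_command` and `capture` are made up for this example, and the caller supplies the full argument list, since the expected arguments differ from tool to tool.

```python
import shutil
import subprocess


def build_command(tool, args):
    """Return the argument list for a command-line screenshot tool.

    `tool` is the executable name (for example 'webkit2png') and `args`
    is the list of arguments to pass to it, including the URL.
    """
    return [tool] + list(args)


def capture(tool, args):
    """Run the tool and return its exit code.

    Raises RuntimeError if the executable cannot be found on PATH.
    """
    if shutil.which(tool) is None:
        raise RuntimeError('%s was not found on PATH' % tool)
    # Passing a list (no shell=True) avoids quoting problems in the URL.
    return subprocess.call(build_command(tool, args))
```

A call would then look like `capture('webkit2png', ['http://example.com'])`; check each tool's help output for the options and positional arguments it actually expects.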
0
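You don't mention what environment you're running in, which makes a big difference, because there is no pure-Python web browser capable of rendering HTML.
If you're on a Mac, though, I've used webkit2png with great success. If not, there are plenty of options, as others have pointed out.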
5
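I can't comment on ars's answer, but I actually got Roland Tapken's code running with QtWebKit, and it works quite well.
I just wanted to confirm that what Roland posted on his blog works great on Ubuntu. Our production version ended up not using anything he wrote, but we are using the PyQt/QtWebKit bindings with much success.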
38
Here is a simple solution using WebKit: http://webscraping.com/blog/Webpage-screenshots-with-webkit/
import sys
import time
from PyQt4.QtCore import *
from PyQt4.QtGui import *
from PyQt4.QtWebKit import *


class Screenshot(QWebView):
    def __init__(self):
        # a QApplication must exist before any widget is created
        self.app = QApplication(sys.argv)
        QWebView.__init__(self)
        self._loaded = False
        self.loadFinished.connect(self._loadFinished)

    def capture(self, url, output_file):
        self.load(QUrl(url))
        self.wait_load()
        # set to webpage size
        frame = self.page().mainFrame()
        self.page().setViewportSize(frame.contentsSize())
        # render image
        image = QImage(self.page().viewportSize(), QImage.Format_ARGB32)
        painter = QPainter(image)
        frame.render(painter)
        painter.end()
        print('saving %s' % output_file)
        image.save(output_file)

    def wait_load(self, delay=0):
        # process app events until page loaded
        while not self._loaded:
            self.app.processEvents()
            time.sleep(delay)
        # reset the flag so the next capture() waits for its own page
        self._loaded = False

    def _loadFinished(self, result):
        self._loaded = True


s = Screenshot()
s.capture('http://webscraping.com', 'website.png')
s.capture('http://webscraping.com/blog', 'blog.png')
33
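Here is my solution, put together with help from various sources. It takes a full-page screen capture of a web page, optionally crops it, and generates a thumbnail from the cropped image.
Requirements:
- Install NodeJS
- Using Node's package manager, install phantomjs: npm -g install phantomjs
- Install selenium (in your virtualenv, if you are using one)
- Install ImageMagick
- Add phantomjs to the system path (on Windows)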
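Before running the script, it can help to confirm that the requirements are actually in place. The following is a minimal Python 3 sketch, not part of the original answer: the helper names `missing_tools` and `selenium_installed` are hypothetical, and it assumes the executables are named `phantomjs` and `convert` (the ImageMagick command the script calls).

```python
import importlib.util
import shutil


def missing_tools(names=('phantomjs', 'convert')):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def selenium_installed():
    """Return True if the selenium package can be imported."""
    return importlib.util.find_spec('selenium') is not None


if __name__ == '__main__':
    missing = missing_tools()
    if missing:
        print('Not found on PATH: ' + ', '.join(missing))
    if not selenium_installed():
        print('The selenium package is not installed')
```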
import os
from subprocess import Popen, PIPE
from selenium import webdriver

abspath = lambda *p: os.path.abspath(os.path.join(*p))
ROOT = abspath(os.path.dirname(__file__))


def execute_command(command):
    # any output on stdout is treated as an error message
    result = Popen(command, shell=True, stdout=PIPE).stdout.read()
    if len(result) > 0 and not result.isspace():
        raise Exception(result)


def do_screen_capturing(url, screen_path, width, height):
    print("Capturing screen..")
    driver = webdriver.PhantomJS()
    # it saves the service log file in the same directory
    # if you want to have the log file stored elsewhere
    # initialize the webdriver.PhantomJS() as
    # driver = webdriver.PhantomJS(service_log_path='/var/log/phantomjs/ghostdriver.log')
    driver.set_script_timeout(30)
    if width and height:
        driver.set_window_size(width, height)
    driver.get(url)
    driver.save_screenshot(screen_path)
    # shut down the phantomjs process once the capture is saved
    driver.quit()


def do_crop(params):
    print("Cropping captured image..")
    command = [
        'convert',
        params['screen_path'],
        '-crop', '%sx%s+0+0' % (params['width'], params['height']),
        params['crop_path']
    ]
    execute_command(' '.join(command))


def do_thumbnail(params):
    print("Generating thumbnail from cropped captured image..")
    command = [
        'convert',
        params['crop_path'],
        '-filter', 'Lanczos',
        '-thumbnail', '%sx%s' % (params['width'], params['height']),
        params['thumbnail_path']
    ]
    execute_command(' '.join(command))


def get_screen_shot(**kwargs):
    url = kwargs['url']
    width = int(kwargs.get('width', 1024))  # screen width to capture
    height = int(kwargs.get('height', 768))  # screen height to capture
    filename = kwargs.get('filename', 'screen.png')  # file name e.g. screen.png
    path = kwargs.get('path', ROOT)  # directory path to store screen

    crop = kwargs.get('crop', False)  # crop the captured screen
    crop_width = int(kwargs.get('crop_width', width))  # the width of crop screen
    crop_height = int(kwargs.get('crop_height', height))  # the height of crop screen
    crop_replace = kwargs.get('crop_replace', False)  # does crop image replace original screen capture?

    thumbnail = kwargs.get('thumbnail', False)  # generate thumbnail from screen, requires crop=True
    thumbnail_width = int(kwargs.get('thumbnail_width', width))  # the width of thumbnail
    thumbnail_height = int(kwargs.get('thumbnail_height', height))  # the height of thumbnail
    thumbnail_replace = kwargs.get('thumbnail_replace', False)  # does thumbnail image replace crop image?

    screen_path = abspath(path, filename)
    crop_path = thumbnail_path = screen_path

    if thumbnail and not crop:
        raise Exception('Thumbnail generation requires crop image, set crop=True')

    do_screen_capturing(url, screen_path, width, height)

    if crop:
        if not crop_replace:
            crop_path = abspath(path, 'crop_' + filename)
        params = {
            'width': crop_width, 'height': crop_height,
            'crop_path': crop_path, 'screen_path': screen_path}
        do_crop(params)

        if thumbnail:
            if not thumbnail_replace:
                thumbnail_path = abspath(path, 'thumbnail_' + filename)
            params = {
                'width': thumbnail_width, 'height': thumbnail_height,
                'thumbnail_path': thumbnail_path, 'crop_path': crop_path}
            do_thumbnail(params)
    return screen_path, crop_path, thumbnail_path


if __name__ == '__main__':
    '''
    Requirements:
    Install NodeJS
    Using Node's package manager install phantomjs: npm -g install phantomjs
    install selenium (in your virtualenv, if you are using that)
    install imageMagick
    add phantomjs to system path (on windows)
    '''
    url = 'http://stackoverflow.com/questions/1197172/how-can-i-take-a-screenshot-image-of-a-website-using-python'
    screen_path, crop_path, thumbnail_path = get_screen_shot(
        url=url, filename='sof.png',
        crop=True, crop_replace=False,
        thumbnail=True, thumbnail_replace=False,
        thumbnail_width=200, thumbnail_height=150,
    )
These are the generated images:
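One thing to watch in the script above: it joins the ImageMagick arguments into a single shell string, so a path containing spaces would be split into several arguments. A possible variation, sketched here in Python 3 under the assumption that `convert` is on PATH, passes the arguments as a list instead. The function names are hypothetical; the `-crop`, `-filter` and `-thumbnail` options are the same ones the script already uses.

```python
import subprocess


def crop_command(screen_path, crop_path, width, height):
    """Argument list that crops a width x height region from the top-left corner."""
    geometry = '%dx%d+0+0' % (width, height)
    return ['convert', screen_path, '-crop', geometry, crop_path]


def thumbnail_command(crop_path, thumbnail_path, width, height):
    """Argument list that builds a thumbnail with the Lanczos filter."""
    geometry = '%dx%d' % (width, height)
    return ['convert', crop_path, '-filter', 'Lanczos',
            '-thumbnail', geometry, thumbnail_path]


def run(command):
    """Run the command without a shell; raises CalledProcessError on failure."""
    subprocess.check_call(command)
```

For example, `run(crop_command('my shot.png', 'crop.png', 1024, 768))` keeps `my shot.png` as a single argument, with no extra quoting needed.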
-1
Try this..
#!/usr/bin/env python
import gtk.gdk
import time
import random

while 1:
    # generate a random time between 120 and 300 sec
    random_time = random.randrange(120, 300)
    # wait between 120 and 300 seconds (or between 2 and 5 minutes)
    print("Next picture in: %.2f minutes" % (float(random_time) / 60))
    time.sleep(random_time)
    # grab the whole desktop (the root window), not a single web page
    w = gtk.gdk.get_default_root_window()
    sz = w.get_size()
    print("The size of the window is %d x %d" % sz)
    pb = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB, False, 8, sz[0], sz[1])
    pb = pb.get_from_drawable(w, w.get_colormap(), 0, 0, 0, 0, sz[0], sz[1])
    # use the current timestamp to give each screenshot a unique name
    ts = time.time()
    filename = "screenshot"
    filename += str(ts)
    filename += ".png"
    if pb is not None:
        pb.save(filename, "png")
        print("Screenshot saved to " + filename)
    else:
        print("Unable to get the screenshot.")
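A quick search of the site brings up many, many near-duplicates of this. Here's a good start: http://stackoverflow.com/questions/713938/how-can-i-generate-a-screenshot-of-a-webpage-using-a-server-side-script – Shog9 2009-07-28 22:55:30
Shog9: Thanks! Your link has some... will check it out. – 2009-07-28 23:22:11
Shog9: Why don't you add it as an answer? Then it can earn you points. – 2009-07-28 23:27:05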