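Scraping a JavaScript URL, but Selenium returns an empty string
I want to open a URL and then scrape data from it. The URL is contained in a tag that looks like this: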
<script src="http://includes.mpt-static.com/data/7CE5047496" type="text/javascript" charset="utf-8"></script>
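I tried to retrieve/open the URL with Selenium, but it just returns a blank string. What puzzles me is that when I click the src URL directly in the view-source page, a page opens and shows the data table I want. But when I copy and paste the URL into the browser, it returns a blank page. Also, a new src URL is generated every time I reload the page. Does anyone know why this happens?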
The URL: view-source:http://mypricetrack.com/amazon/B00N2BW2PK
My code:
import time
from fake_useragent import UserAgent
import urllib2
import csv
from bs4 import BeautifulSoup
import json
import requests
from selenium import webdriver
#FAKE-USER_AGENT
ua = UserAgent(cache = False)
headers = {'User-Agent': ua.random}
#SENDING REQUEST TO PRICETRACKER WEBSITE
product = 'B00N2BW2PK'
page = requests.get('http://www.mypricetrack.com/amazon/'+str(product), headers = headers)
soup = BeautifulSoup(page.text, 'html.parser')
#print(soup.prettify())
#GETTING URL FOR DATA
data_link = []
for tag in soup.findAll('script', {'charset': 'utf-8'}):
    data_link.append(tag['src'])
string2 = data_link[1]
print string2
#OPENING URL FOR DATA
driver = webdriver.Firefox()
driver.get(string2)
time.sleep(5)
htmlSource = driver.page_source
print htmlSource
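One thing I have not ruled out is that the data URL is only served to requests that come from the product page itself, for example through a Referer check or a cookie set on the first request. That is only a guess on my part, not something I have confirmed. Below is a minimal sketch of how it could be tested, written for Python 3 with the standard library only. The helper names (`extract_script_srcs`, `fetch`, `fetch_data_script`) and the `"/data/"` filter are my own assumptions, based on the example src URL above.

```python
from html.parser import HTMLParser
from http.cookiejar import CookieJar
from urllib.request import HTTPCookieProcessor, Request, build_opener


class ScriptSrcParser(HTMLParser):
    """Collect the src of every <script charset="utf-8"> tag."""

    def __init__(self):
        super().__init__()
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("charset") == "utf-8" and attrs.get("src"):
            self.srcs.append(attrs["src"])


def extract_script_srcs(html):
    """Return the src URLs of the utf-8 script tags found in html."""
    parser = ScriptSrcParser()
    parser.feed(html)
    return parser.srcs


def fetch(opener, url, referer=None, user_agent="Mozilla/5.0"):
    """GET url through opener, optionally sending a Referer header."""
    headers = {"User-Agent": user_agent}
    if referer:
        headers["Referer"] = referer
    with opener.open(Request(url, headers=headers), timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def fetch_data_script(product):
    """Load the product page, then request its data script right away.

    Assumption: the freshly generated src URL has to be requested with the
    same cookies and with the product page as Referer.
    """
    opener = build_opener(HTTPCookieProcessor(CookieJar()))
    page_url = "http://www.mypricetrack.com/amazon/" + product
    html = fetch(opener, page_url)
    # Assumption: the data script is the one whose URL contains "/data/".
    srcs = [s for s in extract_script_srcs(html) if "/data/" in s]
    if not srcs:
        return None
    return fetch(opener, srcs[0], referer=page_url)
```

Calling `fetch_data_script('B00N2BW2PK')` would then show whether the response is still empty when the request carries the Referer and cookies. If it is, the cause is presumably something else, such as the URL being valid for a single use only.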