2015-11-30 37 views
0

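I'm trying to use Selenium and Python to print some information, but it doesn't print everything matched by the CSS selectors. This is the while loop. Why does it print only one item, and how should I use Selenium to print all of them?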

pageIndex = 1 
while True: # Keep looping through all pages 
    # Navigate to the search page 
    browser.get("https://www.houz.com/page_num="+ str(pageIndex)) 
    time.sleep(6) 

    links = browser.find_elements_by_css_selector('div > h3 > a') 
    for link in links: 
     urls = link.text 


    jobs = browser.find_elements_by_css_selector('div > div.description') 
    for title in jobs: 
     jobtitles = title.text 


    with open("1Exportdata.csv", "a") as csvFile: 
     csvFile.write(url + "," + jobtitle + "\n") 

    pageIndex += 1 
    if pageIndex == 5010: 
     browser.close() 
+0

What is the point of running a 'for' loop that only assigns new values to 'urls' and 'jobtitles'? – Andersson

+0

Just updated the question with the full loop – Sarfraz

Answers

2

Because you use:

for title in jobs: 
    jobtitles = title.text 

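On the first iteration, jobtitles is the first title.text, but on the second iteration it becomes the second title.text. When the loop ends, it holds only the last title.text.

For example: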

>>> for i in [1, 2, 3]:
...     num = i
...
>>> print(num)
3
>>>
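The same effect can be reproduced without a browser. The sketch below is only an illustration: `texts` is a made-up list standing in for the `.text` values of the matched elements. It contrasts overwriting one variable with collecting every value in a list.

```python
# Hypothetical stand-in for the .text values of the matched elements.
texts = ["first job", "second job", "third job"]

# Overwriting: each pass replaces the previous value.
for text in texts:
    jobtitle = text
print(jobtitle)  # third job

# Collecting: each pass appends to a list, so nothing is lost.
jobtitles = []
for text in texts:
    jobtitles.append(text)
print(jobtitles)  # ['first job', 'second job', 'third job']
```

Either collecting the values or doing the work inside the loop body keeps every item; assigning to a single name and using it after the loop keeps only the last one.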

So you need to put the with open("1Exportdata.csv", "a") as csvFile: block inside the for loop. Since you have two lists, I suggest using zip to pair them up:

pageIndex = 1
while True:  # Keep looping through all pages
    # Navigate to the search page
    browser.get("https://www.houz.com/page_num=" + str(pageIndex))
    time.sleep(6)

    links = browser.find_elements_by_css_selector('div > h3 > a')
    jobs = browser.find_elements_by_css_selector('div > div.description')

    # Pair each link with its description and write one row per pair
    for link, title in zip(links, jobs):
        url = link.text
        jobtitle = title.text

        with open("1Exportdata.csv", "a") as csvFile:
            csvFile.write(url + "," + jobtitle + "\n")

    pageIndex += 1
    if pageIndex == 5010:
        browser.close()
        break  # Leave the loop once the browser is closed
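One caveat with zip is worth illustrating. The sketch below uses made-up sample lists in place of the scraped values and shows that zip stops at the shorter input, so a page with a missing description would silently lose a row. The length check is a suggested safeguard, not part of the original script.

```python
# Made-up sample data; in the real script these would come from the page.
urls = ["Company A", "Company B", "Company C"]
titles = ["Designer", "Architect"]  # one description is missing

pairs = list(zip(urls, titles))
print(pairs)  # [('Company A', 'Designer'), ('Company B', 'Architect')]

# zip stops at the shorter input, so "Company C" is dropped without an error.
if len(urls) != len(titles):
    print("warning: %d links but %d descriptions" % (len(urls), len(titles)))
```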

Also, I don't think the while loop is needed here; try a for loop instead:

for pageIndex in range(1, 5010):  # Pages 1 to 5009, the same pages the while loop visits
    # Navigate to the search page
    browser.get("https://www.houz.com/page_num=" + str(pageIndex))
    time.sleep(6)

    links = browser.find_elements_by_css_selector('div > h3 > a')
    jobs = browser.find_elements_by_css_selector('div > div.description')

    # Pair each link with its description and write one row per pair
    for link, title in zip(links, jobs):
        url = link.text
        jobtitle = title.text

        with open("1Exportdata.csv", "a") as csvFile:
            csvFile.write(url + "," + jobtitle + "\n")

browser.close()  # Close the browser after the last page
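Joining fields with "," by hand breaks the row whenever a description itself contains a comma. A possible refinement, sketched here with the standard csv module, lets the writer quote such fields. The helper name `write_rows` and the sample rows are made up for illustration, and io.StringIO stands in for the real file so the sketch runs anywhere.

```python
import csv
import io

def write_rows(rows, csv_file):
    """Write (url, jobtitle) pairs, quoting any field that contains a comma."""
    writer = csv.writer(csv_file)
    for url, jobtitle in rows:
        writer.writerow([url, jobtitle])

# Made-up rows standing in for the scraped text.
rows = [("Acme Design", "Kitchens, baths and more"), ("Bolt Studio", "Interiors")]
buffer = io.StringIO()
write_rows(rows, buffer)
print(buffer.getvalue())
```

With a real file, the same helper could be called on a handle opened once before the page loop, for example with open("1Exportdata.csv", "a", newline=""), instead of reopening the file for every row.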
+0

Error >>> File "C:\Users\Win\Downloads\Runprog\Project cusom taio\Profiles\1.py", line 51, in scraper csvFile.write(url + "," + jobtitle + "\n") File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode return codecs.charmap_encode(input,self.errors,encoding_table)[0] UnicodeEncodeError: 'charmap' codec can't encode character '\u2601' in position 15: character maps to <undefined> – Sarfraz

+0

@Sarfraz: Hmm... that was just a typo. How about now? –

+0

I had already changed the names to the singular. Here is the error: File "C:\Users\Win\Downloads\Runprog\Project cusom taio\Profiles\1.py", line 51, in scraper csvFile.write(url + "," + jobtitle + "\n") File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode return codecs.charmap_encode(input,self.errors,encoding_table)[0] UnicodeEncodeError: 'charmap' codec can't encode character '\u2601' in position 15: character maps to <undefined> – Sarfraz
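The traceback passes through cp1252.py, which suggests the file was opened with the Windows default encoding. cp1252 has no mapping for '\u2601' (a cloud symbol), so the write fails. A minimal sketch of one possible fix, assuming Python 3 as the C:\Python34 path indicates, is to pass encoding="utf-8" to open. The file name and the sample text below are made up for the demonstration.

```python
import os
import tempfile

# Made-up value containing the character named in the traceback.
text = "Cloud \u2601 Interiors"

# cp1252 cannot represent U+2601, which reproduces the reported failure.
try:
    text.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False
print(cp1252_ok)  # False

# Opening the file with an explicit UTF-8 encoding avoids the problem.
path = os.path.join(tempfile.mkdtemp(), "export_demo.csv")  # hypothetical file name
with open(path, "a", encoding="utf-8") as csv_file:
    csv_file.write(text + "," + "Designer" + "\n")

# Read the row back to confirm it was stored intact.
with open(path, encoding="utf-8") as csv_file:
    content = csv_file.read()
print(content == text + ",Designer\n")  # True
```

In the script from the answer, this would amount to changing the call to open("1Exportdata.csv", "a", encoding="utf-8").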