2017-09-13 55 views

Error: port 443 timed out while scraping data

I'm trying to scrape some data but keep getting timeout errors. My network is working fine and I've updated to the latest R version, so I'm not sure how to fix this. It happens with every URL I try.

library(RCurl) 
library(XML) 

url <- "https://inciweb.nwcg.gov/" 
content <- getURLContent(url) 
## Error in function (type, msg, asError = TRUE) : 
##  Failed to connect to inciweb.nwcg.gov port 443: Timed out 

Answer


You may need to set an explicit timeout for slower connections:

library(httr) 
library(rvest) 

pg <- GET("https://inciweb.nwcg.gov/", timeout(60)) 

incidents <- html_table(content(pg))[[1]] 

str(incidents) 
## 'data.frame': 10 obs. of 7 variables: 
## $ Incident: chr "Highline Fire" "Cottonwood Fire" "Rattlesnake Point Fire" "Coolwater Complex" ... 
## $ Type : chr "Wildfire" "Wildfire" "Wildfire" "Wildfire" ... 
## $ Unit : chr "Payette National Forest" "Elko District Office" "Nez Perce - Clearwater National Forests" "Nez Perce - Clearwater National Forests" ... 
## $ State : chr "Idaho, USA" "Nevada, USA" "Idaho, USA" "Idaho, USA" ... 
## $ Status : chr "Active" "Active" "Active" "Active" ... 
## $ Acres : chr "83,630" "1,500" "4,843" "2,969" ... 
## $ Updated : chr "1 min. ago" "1 min. ago" "3 min. ago" "5 min. ago" ... 

A temporary workaround

l <- charToRaw(paste0(readLines("https://inciweb.nwcg.gov/"), collapse="\n")) 

pg <- read_html(l) 

html_table(pg)[[1]] 
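If the failures are intermittent rather than a hard block, retrying the request can also help. httr ships `RETRY()` for exactly this; a minimal sketch (untested against this particular site):

```r
library(httr)

# RETRY() re-issues the request with backoff between attempts,
# here up to 3 attempts with a 60-second cap on each attempt.
pg <- RETRY("GET", "https://inciweb.nwcg.gov/",
            timeout(60), times = 3)
```

This only papers over transient network problems; if the server consistently refuses the connection, retrying will not change the outcome.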

Hmm, tried with different timeout values (#), but I keep getting this: `pg <- GET("https://inciweb.nwcg.gov/", timeout(60))` Error in curl::curl_fetch_memory(url, handle = handle): Timeout was reached: Connection timed out after 10000 milliseconds # – S31
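The "10000 milliseconds" in that error suggests the connection phase, not the whole request, is what is expiring: libcurl's connect timeout is separate from the total-request timeout that `timeout(60)` sets. A sketch of raising it through `httr::config()`, which passes options straight to libcurl (`connecttimeout` is the libcurl option name; untested against this site):

```r
library(httr)

# timeout() caps the entire request; connecttimeout caps only the
# TCP connection phase, which is the part timing out here.
pg <- GET("https://inciweb.nwcg.gov/",
          timeout(60),
          config(connecttimeout = 60))
```

If this still times out from R while a browser succeeds, the server or a local firewall/proxy may be treating R's requests differently, which is worth checking next.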


Yes. I've also tried other sites from R and hit the same problem. Visiting those sites in a browser works fine – S31


Yes. Using Windows 7 – S31