I am trying to retrieve some information from r-users.com. With the code below, I get this warning message:
XML content does not seem to be XML
Any help would be greatly appreciated.
library(data.table)
library(XML)
pages <- c(1:10)
urls <- rbindlist(lapply(pages, function(x) {
  url <- paste("https://www.r-users.com/jobs/page/", x, "/", sep = "")
  data.frame(url)
}), fill = TRUE)
jobLocations <- rbindlist(apply(urls, 1, function(url) {
  doc1 <- htmlParse(url)
  locations <- getNodeSet(doc1, '//*[@id="mainContent"]/div[2]/ol/li/dl/dd[3]/span')
  data.frame(sapply(locations, function(x) { xmlValue(x) }))
}), fill = TRUE)
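If I visit one of the URLs and view the source, for example https://www.r-users.com/jobs/page/1/, there is no XML on the page (although it may load XML in the background to get the results). I suspect the error message is accurate: you are parsing HTML, not XML.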
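One possible cause, offered as an assumption rather than something confirmed in this thread: `htmlParse()` is handed an `https://` URL directly, and as far as I know the XML package's built-in downloader does not handle HTTPS, so the parser never receives real markup. A minimal sketch of a workaround is to download the page as text first and then parse it with `asText = TRUE`. The helpers `fetch_html()` and `extract_text()`, the use of the httr package, and the sample markup with its `location` class are all hypothetical and only for illustration; the real page structure may differ.

```r
library(XML)

# Hypothetical helper: download a page as a character string.
# Assumes the httr package is installed; it is not used in the question.
fetch_html <- function(url) {
  resp <- httr::GET(url)
  httr::stop_for_status(resp)
  httr::content(resp, as = "text", encoding = "UTF-8")
}

# Hypothetical helper: parse HTML that is already in memory and
# return the text of every node matching the XPath expression.
extract_text <- function(html_text, xpath) {
  doc <- htmlParse(html_text, asText = TRUE)
  nodes <- getNodeSet(doc, xpath)
  vapply(nodes, xmlValue, character(1))
}

# Made-up markup standing in for a downloaded page, so the sketch
# runs without network access.
sample_html <- '<html><body><div id="mainContent"><ol>
  <li><span class="location">Berlin</span></li>
  <li><span class="location">Remote</span></li>
</ol></div></body></html>'

locations <- extract_text(
  sample_html,
  '//*[@id="mainContent"]//span[@class="location"]'
)
print(locations)
```

Against the live site, the same idea would be `extract_text(fetch_html(url), xpath)` inside the loop over `urls`, using the XPath from the question, assuming the download succeeds and the page layout matches that XPath.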