2015-12-23 45 views
1

R: Getting JSON data into an R data frame

I am trying to re-run old code that used to work, and I have run into some problems. An MWE follows:

library("httr") 
library("XML") 
library("stringr") 
library("jsonlite") 
library("rgeos") 
library("maptools") 
library("stringr") 
library("RJSONIO") 

push <- readRDS("lq/lq_list.Rdata") # restore R object from a file 

push looks like this:

[898] "{title: \"La Quinta Inn Vancouver Airport\", innNumber: \"0759\", latitude:\n      \"49.177832\", longitude: \"-123.127116\", imagePath: \"/bin/lq-com/hotelSearchImage.0759.jpg\", isInnAndSuites: \"false\", street: \"8640 Alexandra Rd\", street2: \"\", city: \"Richmond\", stateProv: \"BC\", postalCode: \"V6X 1C4\", countryDisplay: \"Canada\"\n   }"                 
[899] "{title: \"La Quinta Inn & Suites Oshawa\", innNumber: \"6601\", latitude:\n      \"43.898034\", longitude: \"-78.861257\", imagePath: \"/bin/lq-com/hotelSearchImage.6601.jpg\", isInnAndSuites: \"true\", street: \"63 King Street East\", street2: \"\", city: \"Oshawa\", stateProv: \"ON\", postalCode: \"L1H 1B4\", countryDisplay: \"Canada\"\n   }" 

The rest of my code:

hotels <- data.frame(Title=character(), InnNumber=character(), Latitude=character(),Longitude= character(), 
       ImagePath=character(), isInnAndSuites= character(), 
       street = character(), street2=character(), city = character(), stateProv=character(), 
       postalCode = character(), countryDisplay=character(), 
       stringsAsFactors=FALSE) # create empty data frame 
## For-loop to parse the data for remaining La Quinta inns, 2:899 and store them in `tmp`. 
for (i in 2:length(push)){ 
    json_file <- fromJSON(push[i]) 

    ## Added for robustness: replaces NULL entries for inn column cells with NAs. 
    ## Can be removed without problems. 
    json_file <- lapply(json_file, function(x) { 
        x[sapply(x, is.null)] <- NA 
        unlist(x) 
    }) 
    tmp <- rbind(tmp, data.frame(do.call("cbind", json_file))) 
} 

hotels <- tmp[!duplicated(tmp$innNumber),] 

## Fix header names screwed up by fromJSON() 
colnames(hotels) <- c("Title","InnNumber","Latitude","Longitude", "ImagePath", "isInnAndSuites","street","street2", "city", "stateProv", 
       "postalCode", "countryDisplay") 

Error message:

Error in rbind(deparse.level, ...) : 
    numbers of columns of arguments do not match 

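This suggests that something is wrong with tmp <- rbind(tmp,data.frame(do.call("cbind",json_file))). But I do not see why the number of columns would differ if I have already replaced the NULLs with NAs.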
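One way to narrow this down is to count the fields in each parsed record before binding anything, since rbind() on data frames stops as soon as one piece has a different number of columns. The sketch below is only an illustration: `parsed` is a hypothetical stand-in for the results of fromJSON(push[i]), with made-up records in which one field is missing.

```r
# Hypothetical parsed records standing in for fromJSON(push[i]);
# the second one is deliberately missing the street2 field.
parsed <- list(
  list(title = "Inn A", innNumber = "0001", street2 = ""),
  list(title = "Inn B", innNumber = "0002"),
  list(title = "Inn C", innNumber = "0003", street2 = "")
)

# Number of fields per record; every record should have the same count.
n_fields <- lengths(parsed)
print(table(n_fields))

# Indices of the records whose field count differs from the first record.
odd <- which(n_fields != n_fields[1])
print(odd)
```

Running the same check on the real parsed records would show which elements of push produce a row of a different width.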

+1

A simple debugging attempt would be to replace 'rbind' with 'dplyr::bind_rows' (which does not require matching columns) and look at the result. – Gregor

+0

@Gregor Thanks, but I managed to find a workaround. – user2205916
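The suggestion in the comment above can be tried on a toy case first. This is a minimal sketch that assumes the dplyr package is installed; the two data frames `a` and `b` are made up for illustration and are not taken from the question's data.

```r
library(dplyr)

# Two hypothetical one-row data frames with different sets of columns.
a <- data.frame(title = "Inn A", innNumber = "0001", stringsAsFactors = FALSE)
b <- data.frame(title = "Inn B", stringsAsFactors = FALSE)

# rbind(a, b) would fail because the column counts differ;
# bind_rows() matches columns by name and fills the gaps with NA.
combined <- bind_rows(a, b)
print(combined)
```

The NA cells in the result point to the rows that were missing a column, which is what makes this useful for debugging.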

Answer

0

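Because of the nature of the data manipulation I did before the for loop, I had to convert my data from a character vector to a list by adding one line of code, as follows: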

for (i in 2:length(push)){ 
    json_file <- fromJSON(push[i]) 
    ## Added line: convert the parsed record from a character vector to a list. 
    json_file <- as.list(json_file) 
    json_file <- lapply(json_file, function(x) { 
        x[sapply(x, is.null)] <- NA 
        unlist(x) 
    }) 
    tmp <- rbind(tmp, data.frame(do.call("cbind", json_file))) 
}
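For illustration, the same as.list() conversion can be shown on a small case without the loop. This is a rough sketch under assumptions: `records` is a hypothetical list of named character vectors, standing in for what fromJSON(push[i]) is assumed to have returned here, and only three of the twelve fields are included.

```r
# Hypothetical records: named character vectors, one per inn.
records <- list(
  c(title = "La Quinta Inn Vancouver Airport", innNumber = "0759", city = "Richmond"),
  c(title = "La Quinta Inn & Suites Oshawa", innNumber = "6601", city = "Oshawa")
)

# as.list() turns each named vector into a named list with one element
# per field, so each record becomes a one-row data frame.
rows <- lapply(records, function(rec) {
  as.data.frame(as.list(rec), stringsAsFactors = FALSE)
})

# Bind all rows in one call instead of growing tmp inside the loop.
hotels <- do.call(rbind, rows)
print(dim(hotels))
```

Building the rows first and binding them once also avoids needing a pre-existing tmp object before the loop starts.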