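What does this ddply error mean: 'names' attribute [9] must be the same length as the vector [1]
I'm working through Machine Learning for Hackers, and I'm stuck at this line: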
from.weight <- ddply(priority.train, .(From.EMail), summarise, Freq = length(Subject))
It produces the following error:
Error in attributes(out) <- attributes(col) :
'names' attribute [9] must be the same length as the vector [1]
Here is the output of traceback():
> traceback()
11: FUN(1:5[[1L]], ...)
10: lapply(seq_len(n), extract_col_rows, df = x, i = i)
9: extract_rows(x$data, x$index[[i]])
8: `[[.indexed_df`(pieces, i)
7: pieces[[i]]
6: function (i)
{
piece <- pieces[[i]]
if (.inform) {
res <- try(.fun(piece, ...))
if (inherits(res, "try-error")) {
piece <- paste(capture.output(print(piece)), collapse = "\n")
stop("with piece ", i, ": \n", piece, call. = FALSE)
}
}
else {
res <- .fun(piece, ...)
}
progress$step()
res
}(1L)
5: .Call("loop_apply", as.integer(n), f, env)
4: loop_apply(n, do.ply)
3: llply(.data = .data, .fun = .fun, ..., .progress = .progress,
.inform = .inform, .parallel = .parallel, .paropts = .paropts)
2: ldply(.data = pieces, .fun = .fun, ..., .progress = .progress,
.inform = .inform, .parallel = .parallel, .paropts = .paropts)
1: ddply(priority.train, .(From.EMail), summarise, Freq = length(Subject))
The priority.train object is a data frame. Here is more information about it:
> mode(priority.train)
[1] "list"
> names(priority.train)
[1] "Date" "From.EMail" "Subject" "Message" "Path"
> sapply(priority.train, mode)
Date From.EMail Subject Message Path
"list" "character" "character" "character" "character"
> sapply(priority.train, class)
$Date
[1] "POSIXlt" "POSIXt"
$From.EMail
[1] "character"
$Subject
[1] "character"
$Message
[1] "character"
$Path
[1] "character"
> length(priority.train)
[1] 5
> nrow(priority.train)
[1] 1250
> ncol(priority.train)
[1] 5
> str(priority.train)
'data.frame': 1250 obs. of 5 variables:
$ Date : POSIXlt, format: "2002-01-31 22:44:14" "2002-02-01 00:53:41" "2002-02-01 02:01:44" "2002-02-01 10:29:23" ...
$ From.EMail: chr "[email protected]" "[email protected]" "[email protected]" "[email protected]" ...
$ Subject : chr "please help a newbie compile mplayer :-)" "re: please help a newbie compile mplayer :-)" "re: please help a newbie compile mplayer :-)" "re: please help a newbie compile mplayer :-)" ...
$ Message : chr " \n Hello,\n \n I just installed redhat 7.2 and I think I have everything \nworking properly. Anyway I want to in"| __truncated__ "Make sure you rebuild as root and you're in the directory that you\ndownloaded the file. Also it might complain of a few depen"| __truncated__ "Lance wrote:\n\n>Make sure you rebuild as root and you're in the directory that you\n>downloaded the file. Also it might compl"| __truncated__ "Once upon a time, rob wrote :\n\n> I dl'd gcc3 and libgcc3, but I still get the same error message when I \n> try rpm --rebuil"| __truncated__ ...
$ Path : chr "../03-Classification/data/easy_ham/01061.6610124afa2a5844d41951439d1c1068" "../03-Classification/data/easy_ham/01062.ef7955b391f9b161f3f2106c8cda5edb" "../03-Classification/data/easy_ham/01063.ad3449bd2890a29828ac3978ca8c02ab" "../03-Classification/data/easy_ham/01064.9f4fc60b4e27bba3561e322c82d5f7ff" ...
Warning messages:
1: In encodeString(object, quote = "\"", na.encode = FALSE) :
it is not known that wchar_t is Unicode on this platform
2: In encodeString(object, quote = "\"", na.encode = FALSE) :
it is not known that wchar_t is Unicode on this platform
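I would post a sample of the data, but it is rather long, and I don't think the content itself is relevant here.
The same error also occurs with this call: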
> ddply(priority.train, .(Subject))
Error in attributes(out) <- attributes(col) :
'names' attribute [9] must be the same length as the vector [1]
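Does anyone have a clue about what is happening here? The error seems to be generated by an object other than priority.train, since the names attribute it mentions has 9 elements, while priority.train has only 5 columns.
I'd appreciate any help. Thanks!
Problem solved
I found the problem thanks to @user1317221_G's tip to use the dput function. The problem is the Date field, which is a POSIXlt object and is therefore stored as a list of 9 components (sec, min, hour, mday, mon, year, wday, yday, isdst). To work around this, I simply converted the dates to a character vector, ran ddply, and then restored the original dates: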
> tmp <- priority.train$Date
> priority.train$Date <- as.character(priority.train$Date)
> from.weight <- ddply(priority.train, .(From.EMail), summarise, Freq = length(Subject))
> priority.train$Date <- tmp
> rm(tmp)
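For anyone hitting the same thing, here is a minimal sketch of how the problem could be spotted and avoided without the temporary copy. It is not the code from the book: the toy data frame, the example.com addresses, and the name lt.cols are made up for illustration. It also swaps in two alternatives to the workaround above: converting POSIXlt columns to POSIXct (which stores one number per row instead of a list of components), and counting senders with base R's table() instead of ddply.

```r
# Toy stand-in for priority.train (hypothetical data, not the original e-mails).
toy <- data.frame(
  From.EMail = c("a@example.com", "b@example.com", "a@example.com"),
  Subject    = c("hello", "re: hello", "re: hello"),
  stringsAsFactors = FALSE
)
# Assigning with $<- keeps the POSIXlt class, which is how such columns arise.
toy$Date <- as.POSIXlt(c("2002-01-31 22:44:14",
                         "2002-02-01 00:53:41",
                         "2002-02-01 02:01:44"), tz = "UTC")

# Diagnose: find the columns that are POSIXlt, i.e. lists under the hood.
lt.cols <- names(toy)[sapply(toy, inherits, what = "POSIXlt")]
print(lt.cols)

# The components include sec, min, hour, mday, mon, year, wday, yday, isdst
# (newer R versions may add zone and gmtoff).
print(names(unclass(toy$Date)))

# Alternative fix: convert each POSIXlt column to POSIXct in place.
for (col in lt.cols) {
  toy[[col]] <- as.POSIXct(toy[[col]])
}

# Per-sender counts in base R, with the same Freq column name as above.
from.weight <- as.data.frame(table(From.EMail = toy$From.EMail),
                             responseName = "Freq",
                             stringsAsFactors = FALSE)
print(from.weight)
```

After the conversion the Date column is still a date-time, so nothing has to be restored afterwards. Whether ddply itself then accepts the data frame depends on the plyr version in use, so treat that part as untested here.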
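Could I suggest adding 'str(priority.train)' to your additional information?
@sebastian-c Sure! I'm editing the question now. – Motasim
"What does this error mean in R?" is possibly the most useless question title you could use. Please give it more thought next time. – flodel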