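I have a data.frame in R.
I need to compare rows of data, and if they are the same in every column except one, I need to merge those rows and combine the differing values into a single column. I feel this is a common need when using R, so ddply or some other package should be able to accomplish it. Below is the data as it is now, dat, and what it should look like after some code, foo.
I am new to R, so any help is greatly appreciated. How can I reshape a data.frame in R without loops?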
Before:
dat <- structure(list(detected_id = c(11, 11, 4), reviewer_name = c("mike",
"mike", "john"), created_at = c("2016-05-04 10:02:45", "2016-05-04 10:02:45",
"2016-05-04 10:02:45"), stage = c(2L, 2L, 1L), V7 = c("Detected Organism: Staphylococcus Aureus, Comment: Looks good",
"Detected Organism: Staphylococcus Aureus, Comment: Note 1",
"Detected Organism: Human Adenovirus 7, Comment: test")), .Names = c("detected_id",
"reviewer_name", "created_at", "stage", "V7"), row.names = c(NA,
-3L), class = "data.frame")
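For illustration only, here is a minimal base R sketch of the merge described above. It assumes that "the same" means equal in every column except V7, and that the V7 values should be joined with "; ". It uses aggregate() rather than ddply, and the names keys and merged are made up for this example. The three example rows are rebuilt inside the block so that it runs on its own.

```r
# Rebuild the three example rows so this block is self-contained.
dat <- data.frame(
  detected_id   = c(11, 11, 4),
  reviewer_name = c("mike", "mike", "john"),
  created_at    = rep("2016-05-04 10:02:45", 3),
  stage         = c(2L, 2L, 1L),
  V7 = c("Detected Organism: Staphylococcus Aureus, Comment: Looks good",
         "Detected Organism: Staphylococcus Aureus, Comment: Note 1",
         "Detected Organism: Human Adenovirus 7, Comment: test"),
  stringsAsFactors = FALSE
)

# Assumption: every column except V7 identifies a group.
keys <- setdiff(names(dat), "V7")

# Collapse the V7 strings of each group into one "; "-separated string.
merged <- aggregate(dat["V7"], by = dat[keys],
                    FUN = paste, collapse = "; ")
print(merged)
```

This keeps created_at as a plain character string and does not reformat it or convert any column to a factor.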
After:
foo <- structure(list(detected_id = c(11L, 4L), reviewer_name = c("mike",
"john"), created_at = structure(c(1L, 1L), .Label = "5/4/16 10:02", class = "factor"),
stage = c(2L, 1L), V7 = structure(c(2L, 1L), .Label = c("Detected Organism: Human Adenovirus 7, Comment: test",
"Detected Organism: Staphylococcus Aureus, Comment: Looks good; Detected Organism: Staphylococcus Aureus, Comment: Note 1"
), class = "factor")), .Names = c("detected_id", "reviewer_name",
"created_at", "stage", "V7"), row.names = c(NA, -2L), class = "data.frame")
Edit:
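The solutions below work for the dataset I provided, but I have found cases where they do not work as I expected. Here is an example of a data.frame that fails. Note that the detected_id column is obsolete for my purposes.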
dat <- structure(list(detected_id = c(11, 11, 11, 11, 12, 4), reviewer_name = c("Mike",
"Mike", "Mike", "Mike", "John", "John"), created_at = c("2016-05-04 10:02:45",
"2016-05-04 10:02:45", "2016-05-04 10:02:45", "2016-05-04 10:02:45",
"2016-05-04 10:02:45", "2016-05-04 10:02:45"), stage = c(2L,
3L, 2L, 3L, 1L, 1L), V7 = c("Detected Organism: Staphylococcus Aureus, Comment: Looks good",
"Detected Organism: Staphylococcus Aureus, Comment: Looks good",
"Detected Organism: Staphylococcus Aureus, Comment: Note 1",
"Detected Organism: Staphylococcus Aureus, Comment: Note 1",
"Detected Organism: Stenotrophomonas Maltophilia, Comment: new note",
"Detected Organism: Human Adenovirus 7, Comment: test")), .Names = c("detected_id",
"reviewer_name", "created_at", "stage", "V7"), row.names = c(NA,
-6L), class = "data.frame")
SOLUTION: Remove the detected_id column before reshaping the data.frame, thanks to @eddi.
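As a rough sketch of what dropping detected_id before grouping might look like on the second example, here is a base R version. It uses aggregate(), which is not necessarily what the answer used, and the names keys and merged are hypothetical. It again assumes that rows are grouped on every remaining column except V7 and that the V7 values are joined with "; ".

```r
# Rebuild the six-row example from the edit so this block is self-contained.
dat <- data.frame(
  detected_id   = c(11, 11, 11, 11, 12, 4),
  reviewer_name = c("Mike", "Mike", "Mike", "Mike", "John", "John"),
  created_at    = rep("2016-05-04 10:02:45", 6),
  stage         = c(2L, 3L, 2L, 3L, 1L, 1L),
  V7 = c("Detected Organism: Staphylococcus Aureus, Comment: Looks good",
         "Detected Organism: Staphylococcus Aureus, Comment: Looks good",
         "Detected Organism: Staphylococcus Aureus, Comment: Note 1",
         "Detected Organism: Staphylococcus Aureus, Comment: Note 1",
         "Detected Organism: Stenotrophomonas Maltophilia, Comment: new note",
         "Detected Organism: Human Adenovirus 7, Comment: test"),
  stringsAsFactors = FALSE
)

# Drop the column that should no longer split the groups.
dat$detected_id <- NULL

# Assumption: every remaining column except V7 identifies a group.
keys <- setdiff(names(dat), "V7")

# Collapse the V7 strings of each group into one "; "-separated string.
merged <- aggregate(dat["V7"], by = dat[keys],
                    FUN = paste, collapse = "; ")
print(merged)
```

With detected_id removed, the two John rows (originally ids 12 and 4) fall into one group, whereas keeping detected_id in the grouping would leave them as separate rows.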
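Nice solution, works as expected. Thanks! – webDevleoper101
See my edit. – webDevleoper101
@webDevleoper101 I'm not sure what "fail" means to you. It works exactly as expected. It is a little unclear what you are hoping for - perhaps you want to take 'detected_id' out of the 'by'. – eddi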