R: Optimizing finding the maximum of a function on a data frame and then trimming all the rest of my data

First, the data come from Temperature.xls, which can be downloaded from this link: RBook

My code looks like this:

library(plyr) # provides ddply(), summarise and .()
temp = read.table("Temperature.txt", header = TRUE)
length(unique(temp$Year)) # number of unique values in the Year vector
# mean temperature per Year and Month
res = ddply(temp, c("Year","Month"), summarise, Mean = mean(Temperature, na.rm = TRUE))
# standard deviation and number of non-missing observations per Year and Month
res1 = ddply(temp, .(Year,Month), summarise,
    SD = sd(Temperature, na.rm = TRUE),
    N = sum(!is.na(Temperature))
    )
# order res1 by year, then by SD
res1 = res1[order(res1$Year,res1$SD),]
# find the maximum SD in each year and show only those rows in a separate data frame
res1_maxsd = ddply(res1, .(Year), summarise, MaxSD = max(SD, na.rm = TRUE)) # max SD in each Year
res1_max = merge(res1_maxsd,res1, all = FALSE) # merge with the original to see the other variables in the max rows
res1_m = res1_max[res1_max$MaxSD==res1_max$SD,] # keep the rows that match the max value (NA comparisons yield all-NA rows)
res1_mm = res1_m[complete.cases(res1_m),] # trim the leftover all-NA rows

I know the last 4 lines can be cut down to fewer lines. Can I somehow do the last 2 lines in one command? I stumbled upon:

res1_m = res1_max[complete.cases(res1_max$MaxSD==res1_max$SD),]

However, this does not give me what I want, which is a final, smaller data frame containing only the MaxSD rows (with all the variables).
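One likely reason the shortcut keeps too many rows: `complete.cases()` on a logical vector only tests for `NA`, so `TRUE` and `FALSE` both count as complete. The sketch below illustrates this on made-up data (the `toy` data frame and its values are hypothetical, standing in for `res1_max`) and uses `which()` to drop `FALSE` and `NA` in one step.

```r
# Hypothetical toy data standing in for res1_max; the values are made up.
toy <- data.frame(
  Year  = c(2000, 2000, 2000, 2001, 2001),
  Month = c(1, 2, 3, 1, 2),
  SD    = c(1.5, 2.5, NA, 0.5, 0.25),
  MaxSD = c(2.5, 2.5, 2.5, 0.5, 0.5)
)

# TRUE where the row holds the yearly maximum, FALSE otherwise, NA where SD is missing.
is_max <- toy$MaxSD == toy$SD

# complete.cases() only checks for NA, so the FALSE rows are kept as well.
kept_by_complete_cases <- toy[complete.cases(is_max), ]

# which() returns the indices of TRUE only, dropping FALSE and NA together.
kept_by_which <- toy[which(is_max), ]
```

Applied to the question's data, the equivalent single step would be `res1_max[which(res1_max$MaxSD == res1_max$SD), ]`.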


2016-04-22 Corel

Are you trying to find the year with the largest temperature variation and filter the data for that year? – Psidom

Yes, only by month rather than by year. So for each year I would get one row, for the month with the largest variation... – Corel

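Instead of fixing the last 2 lines, why not start from res1? Reverse the SD order and take the first row of each year, which gives you an equivalent final data set...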

res1 <- res1[order(res1$Year,-res1$SD),] 
res_final <- res1[!duplicated(res1$Year),]
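To make the behaviour of this approach concrete, here is a minimal sketch on made-up data (the `toy` data frame is hypothetical and only mimics the shape of `res1`). It assumes the default `na.last = TRUE` of `order()`, which sorts missing SD values to the end of each year.

```r
# Hypothetical toy version of res1; the values are made up for illustration.
toy <- data.frame(
  Year  = c(2000, 2000, 2000, 2001, 2001),
  Month = c(1, 2, 3, 1, 2),
  SD    = c(1.5, 2.5, NA, 0.5, 0.25)
)

# Sort by Year, then by decreasing SD; NA values in SD go last within each year.
toy_sorted <- toy[order(toy$Year, -toy$SD), ]

# duplicated() flags every repeat of a Year after its first appearance,
# so negating it keeps exactly the first (largest-SD) row per year.
toy_final <- toy_sorted[!duplicated(toy_sorted$Year), ]
```

Note that this keeps exactly one row per year. If two months tie for the largest SD, only the first of them survives, whereas a comparison against the yearly maximum would keep both.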


2016-04-22 22:27:58

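Thanks, this does the job, but I'd like to know whether there is another, more general way, since this method relies on the fact that `!duplicated` only considers the first occurrence in each year... – Corel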

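The last four lines can be cut down if you use the dplyr package. Since you want to keep some of the information from the original data set, you probably don't want to use summarise: it returns only the summary information, so you would have to merge the result back with the original data set. mutate and filter are therefore better options: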

library(dplyr) 
res1_mm1 <- res1 %>% group_by(Year) %>% filter(SD == max(SD, na.rm = TRUE))

You can also use mutate to create a new column MaxSD, which in your case is identical to SD in the resulting data frame:

res1_mm1 <- res1 %>% group_by(Year) %>% mutate(MaxSD = max(SD, na.rm = TRUE)) %>%
      filter(SD == MaxSD)
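For comparison, the same group-wise filter can be sketched in base R without any package, using `ave()` to compute the per-year maximum alongside each row. This is only an illustrative alternative on made-up data (the `toy` data frame is hypothetical), not part of the answer above.

```r
# Hypothetical toy version of res1 with a tie in 2001; the values are made up.
toy <- data.frame(
  Year  = c(2000, 2000, 2000, 2001, 2001),
  Month = c(1, 2, 3, 1, 2),
  SD    = c(1.5, 2.5, NA, 0.5, 0.5)
)

# ave() returns, for every row, the maximum SD within that row's Year.
max_sd <- ave(toy$SD, toy$Year, FUN = function(x) max(x, na.rm = TRUE))

# Keep the rows whose SD equals the yearly maximum; which() drops NA comparisons.
toy_max <- toy[which(toy$SD == max_sd), ]
```

Unlike the `!duplicated` approach, a comparison against the group maximum keeps every tied row, so 2001 contributes two rows here.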


2016-04-22 22:43:15 Psidom
