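Counting data outside of a range in R

Hopefully this makes sense: I receive data from colleagues as CSV files, each of which may be thousands of lines long. The files have several columns, but initially I am interested in two of them, named "target" and "temperature". "target" has several categories, and each category can have many (or few) "temperature" data points. For example: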

target  temperature 
RSV   87.2 
RSV   86.9 
...... 
HSV   84.3 
HSV   89.7

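etc.

Each target has its own defined temperature range, so I need some way of defining these ranges and then counting, for each target, how many samples fall inside or outside the defined range.

Any and all suggestions gratefully received.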

Source

2017-03-10 Lee

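What have you tried? What is your desired output? Please make this question [reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – r2evans

See `?cut` or `?findInterval` to define your range values. – thelatemail

The script below computes a range for each target, then counts how many of that target's samples fall within or outside the defined range.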
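To make the `cut` / `findInterval` suggestion concrete, here is a minimal sketch for a single target. The limits `86.8` and `87.5` are made-up values for illustration, not real ranges for RSV. Both functions treat intervals as closed on the left and open on the right when called this way, so a value equal to the upper limit counts as "above".

```r
# Temperatures for one target (taken from the example data)
temps <- c(87.2, 86.9, 86.8, 86.7)

# Hypothetical lower and upper limits for this target (assumption)
limits <- c(86.8, 87.5)

# cut(): label each value as below / within / above the range;
# right = FALSE makes the intervals [lower, upper)
status <- cut(temps,
              breaks = c(-Inf, limits, Inf),
              labels = c("below", "within", "above"),
              right = FALSE)
print(table(status))

# findInterval(): 0 = below lower, 1 = within, 2 = at or above upper
idx <- findInterval(temps, limits)
print(sum(idx == 1))  # number of samples within the range
```

With these assumed limits, three of the four values fall within the range and one falls below it.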

# data from colleagues 
df <- data.frame(target=c("RSV", "RSV", "RSV", "RSV", 
          "HSV", "HSV", "HSV", 
          "SRV", "SRV", "SRV"), 
       temperature=c(87.2, 86.9, 86.8, 86.7, 
           84.3, 89.7, 88.7, 
           54.3, 59.7, 58.7)) 

# target with ranges 
res <- data.frame(target=character(0), 
        min.temperature=numeric(0), 
        max.temperature=numeric(0), 
        within=numeric(0), 
        outside=numeric(0)) 

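# targets; sort(unique()) also works when target is a character column 
# (the data.frame() default since R 4.0), where levels() would return NULL 
targets <- sort(unique(as.character(df$target))) 

for (i in seq_along(targets)) { 
    # temperatures of the current target 
    temps <- df[df$target == targets[i], ]$temperature 

    # some way of defining these ranges; here simply the observed min and max, 
    # so every sample ends up "within" until real limits are substituted 
    t.min <- min(temps) 
    t.max <- max(temps) 

    # samples in [min; max] 
    in.range <- df$temperature >= t.min & 
                df$temperature <= t.max 

    t.within <- nrow(df[df$target == targets[i] & in.range, ]) 
    t.outside <- nrow(df[df$target == targets[i] & !in.range, ]) 

    # append one summary row per target 
    res <- rbind(res, data.frame(target = targets[i], 
        min.temperature = t.min, 
        max.temperature = t.max, 
        within = t.within, 
        outside = t.outside)) 
} 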

print(res) 
#   target min.temperature max.temperature within outside
# 1    HSV            84.3            89.7      3       0
# 2    RSV            86.7            87.2      4       0
# 3    SRV            54.3            59.7      3       0
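Because the script above derives each range from the data itself, every sample is counted as "within". If the ranges are instead known in advance, one possible variant is to keep them in a lookup table and merge it onto the data. This is only a sketch: the `ranges` table, its `lower` / `upper` column names, and all limit values are assumptions made up for illustration.

```r
# Data from colleagues (same example values as above)
df <- data.frame(target = c("RSV", "RSV", "RSV", "RSV",
                            "HSV", "HSV", "HSV",
                            "SRV", "SRV", "SRV"),
                 temperature = c(87.2, 86.9, 86.8, 86.7,
                                 84.3, 89.7, 88.7,
                                 54.3, 59.7, 58.7))

# Hypothetical per-target limits (assumption; replace with the real ranges)
ranges <- data.frame(target = c("RSV", "HSV", "SRV"),
                     lower  = c(86.8, 85.0, 55.0),
                     upper  = c(87.5, 90.0, 60.0))

# Attach each sample's limits, then flag samples inside [lower, upper]
m <- merge(df, ranges, by = "target")
m$within <- m$temperature >= m$lower & m$temperature <= m$upper

# Count flagged and unflagged samples per target
counts <- data.frame(
  target  = sort(unique(m$target)),
  within  = as.vector(tapply(m$within, m$target, sum)),
  outside = as.vector(tapply(!m$within, m$target, sum))
)
print(counts)
```

With these assumed limits the counts are 2 within and 1 outside for HSV, 3 and 1 for RSV, and 2 and 1 for SRV. Unlike the `cut` default, this comparison treats both ends of the range as inclusive.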

Source

2017-03-10 04:42:59
