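Constructing a data frame in R by combining information from two columns

I have a data frame consisting of two columns: true.de.status and decision.de. The dataset is reproducible, as follows: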

dat = structure(c(0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 
0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 
0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0), .Dim = c(100L, 
2L), .Dimnames = list(NULL, c("true.de.status", "decision.de" 
)))

The first few rows of dat are:

     true.de.status decision.de
[1,]              0           0
[2,]              0           0
[3,]              1           1
[4,]              0           1
[5,]              1           0
[6,]              0           0
[7,]              1           1
[8,]              1           0

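Now I want to plot a curve with the number of genes on the x-axis (i.e. the total number of rows in dat) and the number of true positives on the y-axis. The x-axis is easy: seq(0,100) gives me 0, 1, ..., 100 genes. For the y-axis I need to compute from the two columns true.de.status and decision.de: as I go through the rows, I can count the true positives as the number of genes (rows) increases. For example,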

first 1 gene included: True positive (TP) = 0 
first 2 genes included: TP = 0 
first 3 genes included: TP = 1 (since both columns have 1 and they match) 
first 4 genes included: TP = 1 (`decision.de` is 1, but `true.de.status` is 0, so it is a false positive) 
first 5 genes included: TP = 1 (two columns don't match) 
......

Is there a simple way to manipulate the dat data frame and return a vector of true-positive counts with the same length as dim(dat)[1]? Thanks!

Source

2013-11-20 alittleboy

Just so we're clear, this is not a data frame but a matrix. That is why I needed to use "[" rather than "$" to access its columns. –

See if this is what you want:
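To illustrate the point of that comment, here is a minimal sketch built from the first eight rows shown in the question. The names `m` and `d` are only for illustration and do not appear in the original post.

```r
# First eight rows of dat from the question, rebuilt as a small matrix
m <- cbind(true.de.status = c(0, 0, 1, 0, 1, 0, 1, 1),
           decision.de    = c(0, 0, 1, 1, 0, 0, 1, 0))

is.matrix(m)       # TRUE
is.data.frame(m)   # FALSE

# Columns of a matrix are selected with "[", by name or by position
m[, "decision.de"]

# m$decision.de would raise an error, because "$" is not valid for
# atomic vectors (a matrix is one). Converting to a data frame first
# makes "$" available.
d <- as.data.frame(m)
d$decision.de
```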

plot(cumsum(dat[ , "true.de.status"] == 1 & 
       dat[ , "decision.de"] == 1) , 
     type="s")

(By default the x values will be 1:100. If you want points or lines, you can change the type argument. Obviously, you can use vec <- ... to assign a name to the cumsum values.)

Source
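As a rough sketch of that last remark, the cumulative counts can be stored first and then plotted against an explicit x-axis. The example uses only the first eight rows from the question; the names `m` and `tp`, the axis labels, and the choice to prepend a 0 (so that x starts at zero genes, as the question's seq(0,100) suggests) are assumptions for illustration.

```r
# First eight rows of dat from the question
m <- cbind(true.de.status = c(0, 0, 1, 0, 1, 0, 1, 1),
           decision.de    = c(0, 0, 1, 1, 0, 0, 1, 0))

# A row is a true positive when both columns are 1;
# cumsum() turns the logical vector into a running count
tp <- cumsum(m[, "true.de.status"] == 1 & m[, "decision.de"] == 1)
tp   # 0 0 1 1 1 1 2 2

# Prepend 0 so the curve starts at "0 genes included, 0 true positives"
plot(0:nrow(m), c(0, tp), type = "s",
     xlab = "Number of genes", ylab = "True positives")
```

With the full dat from the question, the same two steps give a vector of length dim(dat)[1].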

2013-11-20 21:41:54

It looks like you want

df <- as.data.frame(dat) 
df$TP <- cumsum(as.numeric(df$true.de.status == 1 & df$decision.de == 1))

This returns the cumulative count of rows in which both columns are 1, i.e. where they match on a positive.

Source
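A quick check against the counts worked out by hand in the question (TP = 0, 0, 1, 1, 1 for the first five genes) might look like the sketch below. The five-row data frame is copied from the printed rows of dat. The as.numeric() call is left out here because cumsum() already coerces logical values to integers.

```r
# First five rows of dat from the question, as a data frame
df <- data.frame(true.de.status = c(0, 0, 1, 0, 1),
                 decision.de    = c(0, 0, 1, 1, 0))

# Running count of rows where both columns equal 1
df$TP <- cumsum(df$true.de.status == 1 & df$decision.de == 1)
df$TP   # 0 0 1 1 1
```

Rows 4 and 5 are not true positives themselves; they keep TP = 1 because the count is cumulative.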

2013-11-20 21:24:58 colcarroll

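Maybe the OP wants the 'cumsum' of your TP? – Henrik

Thanks! I was wondering why the last two examples he gave were counted as true positives... – colcarroll

The question could probably have been stated more clearly... – Henrik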
