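Construct a data frame in R from the combined information of two columns

I have a data frame with two columns, `true.de.status` and `decision.de`. The dataset can be reproduced as follows: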
dat = structure(c(0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,
0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0), .Dim = c(100L,
2L), .Dimnames = list(NULL, c("true.de.status", "decision.de"
)))
The first few rows of `dat` are:
true.de.status decision.de
[1,] 0 0
[2,] 0 0
[3,] 1 1
[4,] 0 1
[5,] 1 0
[6,] 0 0
[7,] 1 1
[8,] 1 0
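Now I want to plot a curve with the number of genes on the x-axis (that is, up to the total number of rows in `dat`) and the number of true positives on the y-axis. The x-axis is easy to determine: `seq(0,100)` gives me 0, 1, ..., 100 genes. For the y-axis, I need to compute a value from the two columns `true.de.status` and `decision.de`: as I go through the rows one by one, I can count the true positives found so far as the number of genes (rows) increases. For example,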
first 1 gene included: True positive (TP) = 0
first 2 genes included: TP = 0
first 3 genes included: TP = 1 (since both columns have 1 and they match)
first 4 genes included: TP = 1 (`decision.de` is 1, but `true.de.status` is 0, so it is a false positive)
first 5 genes included: TP = 1 (two columns don't match)
......
Is there a simple way to manipulate the `dat` data frame so that it returns a vector of true-positive counts with the same length as `dim(dat)[1]`? Thanks!
Just so we are clear, this is not a data frame but a matrix. That is why its columns have to be accessed with "[" rather than "$".
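
One way the running count could be computed is sketched below. It assumes a true positive is a row where both columns equal 1, as in the worked example in the question. The names `dat_head`, `is_tp` and `tp_cum` are made up for illustration, and `dat_head` only rebuilds the first eight rows shown above so that the snippet runs on its own.

```r
# First eight rows of `dat` from the question, rebuilt as a matrix
# (a small stand-in; the real `dat` has 100 rows).
dat_head <- cbind(
  true.de.status = c(0, 0, 1, 0, 1, 0, 1, 1),
  decision.de    = c(0, 0, 1, 1, 0, 0, 1, 0)
)

# Assumption: a row is a true positive when both columns are 1.
# Matrix columns are selected with "[", not "$".
is_tp <- dat_head[, "true.de.status"] == 1 & dat_head[, "decision.de"] == 1

# Running count of true positives as rows are included one at a time.
tp_cum <- cumsum(is_tp)
tp_cum
```

For these eight rows the result is 0 0 1 1 1 1 2 2, which agrees with the counts listed for the first five genes. Applying the same two steps to the full `dat` should give a vector of length `nrow(dat)`, which could then be drawn with something like `plot(seq_len(nrow(dat)), tp_cum, type = "s")`.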