2014-03-06

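Forgive the very novice question, but I am trying to create a new column in a data frame that contains percentages based on other columns. For example, the data I am working with looks like the following, where the That column is a binary factor (i.e. presence or absence of "that"), the Verb column is the individual verb (i.e. a verb that may or may not be followed by "that"), and the Freq column gives the frequency of each verb. How do I create a new column containing percentage data calculated from the other columns?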

     That    Verb Freq
1    That believe    3
2  NoThat   think    4
3    That     say    3
4    That believe    3
5    That   think    4
6  NoThat     say    3
7  NoThat believe    3
8  NoThat   think    4
9    That     say    3
10 NoThat   think    4

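What I want is to add another column that gives, for each distinct verb, the overall rate of the "that" expression (coded as "That"). Something like the following: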

     That    Verb Freq Perc.That
1    That believe    3      33.3
2  NoThat   think    4      25.0
3    That     say    3      33.3
4    That believe    3      33.3
5    That   think    4      25.0
6  NoThat     say    3      33.3
7  NoThat believe    3      33.3
8  NoThat   think    4      25.0
9    That     say    3      33.3
10 NoThat   think    4      25.0

I may have missed a similar question elsewhere. If so, my apologies. Thanks in advance for any help.

Answer

You want the ddply function from the plyr library:

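#install.packages('plyr')  # run once if plyr is not installed
library(plyr)

dat # your data frame (lowercase column names here: that, verb, freq)

# Split dat by verb, then add perc.that = each row's freq divided by
# the total freq of that verb. Column names are case-sensitive, so use
# Verb and Freq instead if your data frame has capitalized names.
ddply(dat, .(verb), transform, perc.that = freq/sum(freq))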

#     that    verb freq perc.that
#1    That believe    3 0.3333333
#2    That believe    3 0.3333333
#3  NoThat believe    3 0.3333333
#4    That     say    3 0.3333333
#...
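
For anyone who wants to try this without plyr, here is a minimal base R sketch. It rebuilds the sample data from the question (with the capitalized column names used there) and repeats the same freq/sum(freq) calculation with ave(), which keeps the original row order and can be scaled to a percentage. The second column, Share.That, is a hypothetical name and reflects an assumption about an alternative reading of the question: the share of rows coded "That" within each verb. With this sample data the two readings give different numbers for some verbs, so check which one you actually need.

```r
# Sample data copied from the question.
dat <- data.frame(
  That = c("That", "NoThat", "That", "That", "That",
           "NoThat", "NoThat", "NoThat", "That", "NoThat"),
  Verb = c("believe", "think", "say", "believe", "think",
           "say", "believe", "think", "say", "think"),
  Freq = c(3, 4, 3, 3, 4, 3, 3, 4, 3, 4)
)

# Same calculation as the ddply() call above, as a rounded percentage:
# each row's Freq divided by the Freq total of its verb.
verb_total <- ave(dat$Freq, dat$Verb, FUN = sum)
dat$Perc.That <- round(100 * dat$Freq / verb_total, 1)

# Alternative reading (an assumption, not from the answer):
# percentage of rows coded "That" within each verb.
is_that <- as.numeric(dat$That == "That")
dat$Share.That <- round(100 * ave(is_that, dat$Verb, FUN = mean), 1)

print(dat)
```

Perc.That reproduces the column requested in the question. Because Freq is constant within each verb in this sample, Freq/sum(Freq) is the same as one over the number of rows for that verb, which is worth keeping in mind when applying it to real data.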

This works like a charm. Thanks. – user3388984