2014-01-09
0

Row sums based on the values of a factor column

I have the following data frame:

Fruit <- c("orange", "orange", "apple", "pineapple", "lemon", "apple", "orange") 

Name <- c("julius", "julius", "john", "mary", "kathy", "john", "julius") 

df <- data.frame(Fruit, Name);df 

My goal is to count how many of each fruit each person ate, so that I end up with the table below:

       orange apple pineapple lemon
julius      2     1
john              2
mary                        1
kathy       1                     1

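I have been trying the aggregate function, but I can only get it to output the total number of fruits each person ate, as follows: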

df2 <- aggregate(Fruit~Name,df,length); df2 

The output is:

    Name Fruit
1   john     2
2 julius     3
3  kathy     1
4   mary     1

Any help would be greatly appreciated. Thanks

Answers

4

Option 1

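# dcast() is a function in the reshape2 package, so install reshape2, not "dcast"
library(reshape2)
# with no aggregation function given, dcast() defaults to length, i.e. it counts rows
dcast(df, Name~Fruit)
    Name apple lemon orange pineapple
1   john     2     0      0         0
2 julius     0     0      3         0
3  kathy     0     1      0         0
4   mary     0     0      0         1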

Option 2

table(df)
# as pointed out by lebatsnok, the general command would be with(df, table(Fruit, Name))
            Name
Fruit       john julius kathy mary
  apple        2      0     0    0
  lemon        0      0     1    0
  orange       0      3     0    0
  pineapple    0      0     0    1
+0

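Thanks @Codoremifa. You made it so simple. I went with the second option, though, since the first one threw the following error: "package 'dcast' is not available (for R version 3.0.2)" – kigode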

+0

'table(df)' works in this case because you have no other variables in the data frame. In the general case, 'with(df, table(Fruit, Name))' is better. – lebatsnok

+0

Thanks @lebatsnok. – TheComeOnMan
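
To illustrate lebatsnok's point, here is a minimal sketch of what happens once the data frame has another variable. The `Day` column and its values are hypothetical, added only for this example; everything else is the data from the question.

```r
Fruit <- c("orange", "orange", "apple", "pineapple", "lemon", "apple", "orange")
Name <- c("julius", "julius", "john", "mary", "kathy", "john", "julius")
df <- data.frame(Fruit, Name)

# Hypothetical extra column, not part of the original question
df$Day <- c("mon", "tue", "mon", "mon", "tue", "tue", "wed")

# table(df) now cross-tabulates all three columns and returns a 3-way array
dim(table(df))

# Naming the two columns explicitly keeps the result a two-way table;
# putting Name first gives one row per person, as in the question
counts <- with(df, table(Name, Fruit))
counts
```

Listing the columns by name also makes the row and column order explicit, which `table(df)` otherwise takes from the column order of the data frame.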

2

It looks like you want a simple two-way frequency table:

table(Fruit, Name)
#            Name
# Fruit       john julius kathy mary
#   apple        2      0     0    0
#   lemon        0      0     1    0
#   orange       0      3     0    0
#   pineapple    0      0     0    1
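
If the result is needed as an ordinary data frame with one row per person, as in the layout the question asks for, one possible approach is sketched below. It assumes the `Fruit` and `Name` vectors from the question; the names `tab` and `wide` are illustrative only.

```r
Fruit <- c("orange", "orange", "apple", "pineapple", "lemon", "apple", "orange")
Name <- c("julius", "julius", "john", "mary", "kathy", "john", "julius")

# Swap the arguments so that people are rows and fruits are columns
tab <- table(Name, Fruit)

# Convert the contingency table to a data frame with one column per fruit
wide <- as.data.frame.matrix(tab)
wide

# Row sums give back the per-person totals that aggregate() produced
rowSums(tab)
```

The row sums should agree with the `aggregate(Fruit~Name, df, length)` output shown in the question, which is a quick way to check that no rows were lost.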
1
> library(gmodels) 
> 
> CrossTable(Fruit, Name) 


   Cell Contents
|-------------------------|
|                       N |
| Chi-square contribution |
|             N/Row Total |
|             N/Col Total |
|           N/Table Total |
|-------------------------|


Total Observations in Table: 7


             | Name
       Fruit |      john |    julius |     kathy |      mary | Row Total |
-------------|-----------|-----------|-----------|-----------|-----------|
       apple |         2 |         0 |         0 |         0 |         2 |
             |     3.571 |     0.857 |     0.286 |     0.286 |           |
             |     1.000 |     0.000 |     0.000 |     0.000 |     0.286 |
             |     1.000 |     0.000 |     0.000 |     0.000 |           |
             |     0.286 |     0.000 |     0.000 |     0.000 |           |
-------------|-----------|-----------|-----------|-----------|-----------|
       lemon |         0 |         0 |         1 |         0 |         1 |
             |     0.286 |     0.429 |     5.143 |     0.143 |           |
             |     0.000 |     0.000 |     1.000 |     0.000 |     0.143 |
             |     0.000 |     0.000 |     1.000 |     0.000 |           |
             |     0.000 |     0.000 |     0.143 |     0.000 |           |
-------------|-----------|-----------|-----------|-----------|-----------|
      orange |         0 |         3 |         0 |         0 |         3 |
             |     0.857 |     2.286 |     0.429 |     0.429 |           |
             |     0.000 |     1.000 |     0.000 |     0.000 |     0.429 |
             |     0.000 |     1.000 |     0.000 |     0.000 |           |
             |     0.000 |     0.429 |     0.000 |     0.000 |           |
-------------|-----------|-----------|-----------|-----------|-----------|
   pineapple |         0 |         0 |         0 |         1 |         1 |
             |     0.286 |     0.429 |     0.143 |     5.143 |           |
             |     0.000 |     0.000 |     0.000 |     1.000 |     0.143 |
             |     0.000 |     0.000 |     0.000 |     1.000 |           |
             |     0.000 |     0.000 |     0.000 |     0.143 |           |
-------------|-----------|-----------|-----------|-----------|-----------|
Column Total |         2 |         3 |         1 |         1 |         7 |
             |     0.286 |     0.429 |     0.143 |     0.143 |           |
-------------|-----------|-----------|-----------|-----------|-----------|