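Paste values

I wish to pivot the result column in df horizontally, creating a data set with a separate row for each region, state, county combination, where the columns are ordered by year and then by city.

I also wish to identify each row in the new data set by region, state and county, and to remove the white space between the four results columns. The code below accomplishes all of this, but I suspect it is not very efficient.

Is there a way to do this with reshape2 without creating a unique identifier for each group and numbering the observations within each group? Is there a way to use apply in place of the for-loop to remove the white space from a matrix? (I am using "matrix" loosely here, not in the sense of a mathematical or programming construct.) I realize these are two separate questions, and perhaps I should post each one separately.

Given that I can already achieve the desired result and am only hoping to improve the code, I do not know whether I should post this at all, but I am hoping to learn. Thank you for any advice.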

df <- read.table(text= " 
region state county city year result 
1   1  1  1  1  1 
1   1  1  2  1  2 
1   1  1  1  2  3 
1   1  1  2  2  4 
1   1  2  3  1  4 
1   1  2  4  1  3 
1   1  2  3  2  2 
1   1  2  4  2  1 
1   2  1  1  1  0 
1   2  1  2  1 NA 
1   2  1  1  2  0 
1   2  1  2  2  0 
1   2  2  3  1  2 
1   2  2  4  1  2 
1   2  2  3  2  2 
1   2  2  4  2  2 
2   1  1  1  1  9 
2   1  1  2  1  9 
2   1  1  1  2  8 
2   1  1  2  2  8 
2   1  2  3  1  1 
2   1  2  4  1  0 
2   1  2  3  2  1 
2   1  2  4  2  0 
2   2  1  1  1  2 
2   2  1  2  1  4 
2   2  1  1  2  6 
2   2  1  2  2  8 
2   2  2  3  1  3 
2   2  2  4  1  3 
2   2  2  3  2  2 
2   2  2  4  2  2 
", header=TRUE, na.strings=NA) 

desired.result <- read.table(text= " 
region state county results 
1   1  1  1234 
1   1  2  4321 
1   2  1  0.00 
1   2  2  2222 
2   1  1  9988 
2   1  2  1010 
2   2  1  2468 
2   2  2  3322 
", header=TRUE, colClasses=c('numeric','numeric','numeric','character')) 

# redefine variables for package reshape2 creating a unique id for each 
# region, state, county combination and then number observations in 
# each of those combinations 

library(reshape2) 

id.var <- df$region*100000 + df$state*1000 + df$county 
obsnum <- sequence(rle(id.var)$lengths) 

df2 <- dcast(df, region + state + county ~ obsnum, value.var = "result") 

# remove spaces between columns of results matrix 
# with a for-loop. How can I use apply to do this? 

x <- df2[,4:(4+max(obsnum)-1)] 

# use a dot to represent a missing observation 

x[is.na(x)] = '.' 

x.cat = numeric(nrow(x)) 

for(i in 1:nrow(x)) { 
    x.cat[i] = paste(x[i,], collapse="") 
} 

df3 <- cbind(df2[,1:3],x.cat) 
colnames(df3) <- c("region", "state", "county", "results") 
df3 

df3 == desired.result

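Edit:

Matthew Lundberg's answer below is excellent. Afterwards, I realized I also needed to create an output data set in which the four results columns above contain numeric (rational) values and are separated by spaces. So I have posted an obvious way to do that, which modifies Matthew's answer. I do not know whether this is accepted protocol, but the new scenario seems so closely related to the original post that I did not think I should post a new question.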

Source

2012-12-31 Mark Miller

I think this does what you want:

df$result <- as.character(df$result) 
df$result[is.na(df$result)] <- '.' 


aggregate(result ~ county+state+region, data=df, paste0, collapse='') 

  county state region result
1      1     1      1   1234
2      2     1      1   4321
3      1     2      1   0.00
4      2     2      1   2222
5      1     1      2   9988
6      2     1      2   1010
7      1     2      2   2468
8      2     2      2   3322

This relies on your data frame being sorted in the proper order (as yours is).
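If the row order cannot be taken for granted, one option is to sort explicitly before aggregating. This is only a minimal sketch: the data frame toy is made up for illustration (same columns as df, one group, rows deliberately shuffled), and the sort keys assume the "year, then city" column order described in the question.

```r
# Hypothetical, deliberately shuffled data with the same columns as df
toy <- data.frame(
  region = 1, state = 1, county = 1,
  city   = c(2, 1, 2, 1),
  year   = c(2, 1, 1, 2),
  result = c("4", "1", "2", "3"),
  stringsAsFactors = FALSE
)

# Sort by the grouping keys, then by year and city, so that paste0()
# receives the values of each group in the intended order
toy <- toy[order(toy$region, toy$state, toy$county, toy$year, toy$city), ]

res <- aggregate(result ~ county + state + region, data = toy,
                 FUN = paste0, collapse = "")
res$result   # "1234"
```

Without the order() step, the same call would paste the values in whatever order the rows happen to appear.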

Source

2012-12-31 23:09:25

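Thank you for the outstanding answer. I later realized I also needed an output data set in which the four results columns are numeric and separated by spaces. I could not work out how to modify your answer at first, but I came close, and I have posted the code here.

Matthew Lundberg's answer is excellent. Afterwards, I realized I also needed to create an output data set in which the four results columns above contain numeric (rational) values and are separated by spaces. So here I provide an obvious way to do that by modifying Matthew's answer. I do not know whether this is accepted protocol, but the new scenario seems so closely related to the original post that I did not think I should post a new question.

The first two lines are modifications of Matthew's answer.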

df$result[is.na(df$result)] <- 'NA' 
df2 <- aggregate(result ~ county+state+region, data=df, paste)

Then I specify that NA represents a missing observation and use apply to obtain numeric output.

df2$result[df2$result=='NA'] = NA 
new.df <- data.frame(df2[,1:3], apply(df2$result,2,as.numeric))

The output is below. Note that I added 0.5 to each value of result in df from the original post.

county state region  X1  X2  X3  X4
     1     1      1 1.5 2.5 3.5 4.5
     2     1      1 4.5 3.5 2.5 1.5
     1     2      1 0.5  NA 0.5 0.5
     2     2      1 2.5 2.5 2.5 2.5
     1     1      2 9.5 9.5 8.5 8.5
     2     1      2 1.5 0.5 1.5 0.5
     1     2      2 2.5 4.5 6.5 8.5
     2     2      2 3.5 3.5 2.5 2.5
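A possible variation, offered only as a sketch: if result is still numeric (that is, it has not been converted to character), passing na.action = na.pass to aggregate keeps the rows containing NA, so the round trip through the string 'NA' may not be needed. The data frame toy below is made up for illustration and is not the df from the question.

```r
# Hypothetical numeric data: two groups of two observations, one missing
toy <- data.frame(
  region = c(1, 1, 1, 1),
  state  = c(1, 1, 2, 2),
  county = c(1, 1, 1, 1),
  result = c(1.5, 2.5, 0.5, NA)
)

# na.pass stops the formula method from dropping the NA row;
# returning the whole vector per group yields a numeric matrix column
wide <- aggregate(result ~ county + state + region, data = toy,
                  FUN = function(v) v, na.action = na.pass)

vals <- wide$result   # numeric matrix, one row per group

# Bind the keys and the values into an ordinary data frame
new.toy <- cbind(wide[, 1:3], as.data.frame(vals))
new.toy
```

This assumes every group has the same number of observations, as in the original data; otherwise aggregate returns a list column instead of a matrix.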

Source

2013-01-01 12:01:42

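In my original post I asked how to use apply to remove the white space between the columns of a data set. Thanks to Matthew Lundberg's answer to my larger question, that turned out not to be necessary. Nevertheless, removing the white space between the columns of a data set is something I need to do often. For completeness, I post here a way of doing so with paste0 and apply, taken in part from Matthew's answer.

To remove all of the white space from the data set x: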

x <- read.table(text= " 
A B C D 
1 1 1 1 
1 1 2 2 
1 NA 1 3 
1 1 2 4 
1 2 1 5 
1 2 NA 6 
1 2 1 7 
1 2 2 8 
", header=TRUE, na.strings=NA) 

# use a dot to represent a missing observation 

x[is.na(x)] = '.' 

y <- as.data.frame(apply(x, 1, function(i) paste0(i, collapse=''))) 
colnames(y) <- 'result' 
y

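This gives a single column, result, holding one string per row: 1111 for the first row, for example, and 1.13 for the third.

The code below removes the white space only between the second and third columns: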

z <- as.data.frame(apply(x[,2:3], 1, function(i) paste0(i, collapse=''))) 

y <- data.frame(x[,1], z, x[,4]) 
colnames(y) <- c('A','BC','D') 
y

This gives a data frame with three columns, A, BC and D; its third row, for example, is 1, .1 and 3.

Source

2013-01-01 20:06:21

There is no need to create an anonymous function for 'apply'. Use the '...' argument to pass collapse through to 'paste0' instead: apply(x, 1, paste0, collapse = '')
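The suggestion in this comment can be sketched as follows. The data frame x here is a small made-up example with character columns only, not the x from the answer; the do.call variant is an alternative that was not part of the original thread, shown because it pastes the columns together directly without going through a matrix.

```r
# Hypothetical all-character data; '.' stands for a missing observation
x <- data.frame(A = c("1", "1"),
                B = c("1", "."),
                C = c("2", "1"),
                stringsAsFactors = FALSE)

# Extra arguments to apply() are forwarded to the function,
# so collapse = "" reaches paste0() without an anonymous wrapper
by_apply <- apply(x, 1, paste0, collapse = "")

# Alternative: paste the columns element-wise, one string per row
by_docall <- do.call(paste0, x)

by_apply    # "112" "1.1"
by_docall   # "112" "1.1"
```

One caveat, worth checking against your own data: apply converts a data frame to a matrix first, and when character and numeric columns are mixed, the numeric columns are formatted to a common width, which can introduce padding spaces for values with differing numbers of digits.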
