2015-12-17

R - Sorting a semi-numeric column

I created a dataset to illustrate the problem I have.

My data looks like this:

   id       time act
1   1      time1   a
2   1      time2   a
3   1      time3   a
4   1    time101   a
5   1    time103   a
6   1   time1001   b
7   1   time1003   b
8   1   time1004   b
9   1  time10000   b
10  1 time100010   c

I want to spread the data so that the time columns come out in the correct order, like this:

id 1 2 3 101 103 1001 1003 1004 10000 100010
 1 a a a   a   a    b    b    b     b      c

Here is what I don't fully understand. When I spread my data, I get something like this:

library(dplyr) 
library(tidyr) 

dt %>% spread(time, act) 

  id time1 time10000 time100010 time1001 time1003 time1004 time101 time103 time2 time3
1  1     a         b          c        b        b        b       a       a     a     a

So R seems to recognise part of the numeric order, but it places time10000 before time2 and time3.

Why is that? Can I fix it?

What I want is something like this:

  id time1 time2 time3 time101 time103 time1001 time1003 time1004 time10000 time100010
1  1     a     a     a       a       a        b        b        b         b          c

Data

dt = structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), 
    time = structure(c(1L, 9L, 10L, 7L, 8L, 4L, 5L, 6L, 2L, 3L 
     ), .Label = c("time1", "time10000", "time100010", "time1001", 
    "time1003", "time1004", "time101", "time103", "time2", "time3" 
    ), class = "factor"), act = structure(c(1L, 1L, 1L, 1L, 1L, 
    2L, 2L, 2L, 2L, 3L), .Label = c("a", "b", "c"), class = "factor")), .Names = c("id", 
"time", "act"), class = "data.frame", row.names = c(NA, -10L)) 

Answers

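spread() orders the new columns by the factor levels of time, and those levels are stored in alphabetical order (see the .Label vector in the data), which is why time10000 sorts before time2. Reorder the factor levels: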

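> # the rows are already in the desired order and every time value is unique,
> # so the values themselves can be used as the levels
> dt$time <- factor(dt$time, levels = as.character(dt$time))
> dt %>% spread(time, act)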
    id time1 time2 time3 time101 time103 time1001 time1003 time1004 time10000 
1 1  a  a  a  a  a  b  b  b   b 
    time100010 
1   c
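
The one-liner above relies on the rows already being in the desired order. If they are not, one possible variant is to order the levels by their numeric part instead. The following is only a minimal sketch: it assumes every level is a text prefix followed by digits, it shuffles the rows on purpose to show that row order no longer matters, and the helper names `lev` and `num` are illustrative rather than part of the original answer.

```r
# Rebuild the example data with the rows deliberately shuffled
dt <- data.frame(
  id   = 1L,
  time = factor(c("time1003", "time2", "time100010", "time1", "time101",
                  "time10000", "time3", "time1001", "time103", "time1004")),
  act  = c("b", "a", "c", "a", "a", "b", "a", "b", "a", "b")
)

# By default the levels are sorted as character strings,
# so "time10000" comes before "time2"
levels(dt$time)

# Drop every non-digit character, convert what is left to a number,
# and use that number to order the levels
lev <- levels(dt$time)
num <- as.numeric(gsub("[^0-9]", "", lev))
dt$time <- factor(dt$time, levels = lev[order(num)])

# The levels now follow the numeric order of their suffixes
levels(dt$time)
```

Because spread() follows the level order, running `dt %>% spread(time, act)` after this should give the columns in numeric order regardless of how the rows were arranged.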