2017-03-12

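Problem generating factors in R

I am trying to convert categorical variables to factors in R after using the melt() function to go from wide to long format. However, when I run the factor function and supply levels and labels, every value comes back as NA: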

Does anyone know why this happens?

law <- read.csv("lawyers_class_new.csv") 


library(reshape2) 
law <- melt(law, id.vars = c("Subj"), measure.vars = c("lawyer", "neutral", "engineer", "neutral_urb", "neutral_rur")) 
law <- law[order(law$Subj),] 
law <- within(law, 
       Subj <- factor(Subj), 
       variable <- factor(variable) 
      ) 
law$variable<- ordered(law$variable,levels=c(1,2,3,4,5),labels=c("lawyer","neutral", 
    "engineer","neutral_urb","neutral_rur")) 


Output: 

law$variable 
    [1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>  <NA> <NA> <NA> <NA> 
[18] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
[35] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
[52] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
[69] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
[86] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
[103] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
[120] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
[137] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 

Melted data frame:

Subj  Cond  variable     value
1     2     lawyer       3
1     3     neutral      1
1     1     engineer     3.5
1     5     neutral_urb  3
1     4     neutral_rur  3.5
2     2     lawyer       1
2     3     neutral      3.5
2     1     engineer     4.5
2     5     neutral_urb  2
2     4     neutral_rur  3.5

Original data frame:

Subj  lawyer  neutral  engineer  neutral_urb  neutral_rur
1     3       1        3.5       3            3.5
2     1       3.5      4.5       2            3.5

Please make a reproducible example. We do not have access to lawyers_class_new.csv. http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example


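By the time of your second conversion to an ordered factor, the levels do not seem to be `1:5`. The `levels` argument should give the factor levels *as they appear* in the data; `labels` is optional and only needed if you want to change them to something else. – Gregor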
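The point about `levels` versus `labels` can be illustrated with a small sketch. The vector `x` and the short labels below are made up for illustration; only the condition names come from the question.

```r
# Hypothetical stand-in for the `variable` column produced by melt()
conds <- c("lawyer", "neutral", "engineer", "neutral_urb", "neutral_rur")
x <- c("lawyer", "neutral", "engineer", "neutral_urb", "neutral_rur", "lawyer")

# levels = 1:5 never matches the strings in x, so every element becomes NA
bad <- factor(x, levels = 1:5, labels = conds)
print(bad)

# levels must list the values as they appear in the data
good <- factor(x, levels = conds)
print(good)

# labels is only needed to rename the levels (short names are illustrative)
renamed <- factor(x, levels = conds, labels = c("L", "N", "E", "NU", "NR"))
print(renamed)
```

The same matching rule applies to `ordered()`, which passes its arguments on to `factor()`, so this is a likely explanation for the all-`NA` result in the question.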


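Also, I do not know your goal, but many people mistakenly believe that an ordered factor is necessary to get things in a particular order (for plotting, say). That is not the case. The only reason to use an "ordered" factor is the contrasts used when modeling. – Gregor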
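A minimal sketch of this distinction, assuming R's default contrast options (`contr.treatment` for unordered factors, `contr.poly` for ordered ones). The data are made up for illustration.

```r
conds <- c("lawyer", "neutral", "engineer")
x <- c("neutral", "engineer", "lawyer", "neutral")

# A plain factor already keeps the level order given in `levels`
plain <- factor(x, levels = conds)
# The ordered version has the same levels in the same order
ord <- factor(x, levels = conds, ordered = TRUE)

print(levels(plain))
print(levels(ord))

# Tabulation (and, similarly, plotting) follows the level order in both cases
print(table(plain))

# The two differ in the contrasts used for modeling
print(contrasts(plain))  # treatment contrasts
print(contrasts(ord))    # polynomial contrasts
```

So if the aim is only to control display order, `factor(x, levels = ...)` should be enough in this sketch.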

Answer


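To minimize errors, I would not import character columns as factors. It also seems that using within() does not create a proper factor for law$variable. I would therefore specify the factors like this to ensure the correct level order.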

law <- read.table(text="Subj Cond variable value 
1   2  lawyer  3 
1   3  neutral  1 
1   1  engineer  3.5 
1   5  neutral_urb 3 
1   4  neutral_rur 3.5 
2   2  lawyer  1 
2   3  neutral  3.5 
2   1  engineer  4.5 
2   5  neutral_urb 2 
2   4  neutral_rur 3.5", header=TRUE, stringsAsFactors=FALSE) 

law <- law[order(law$Subj),] 

law$Subj <- as.factor(law$Subj) 
law$variable <- factor(law$variable,levels =c("lawyer","neutral", 
    "engineer","neutral_urb","neutral_rur")) 

str(law) 
'data.frame': 10 obs. of 4 variables: 
$ Subj : Factor w/ 2 levels "1","2": 1 1 1 1 1 2 2 2 2 2 
$ Cond : int 2 3 1 5 4 2 3 1 5 4 
$ variable: Factor w/ 5 levels "lawyer","neutral",..: 1 2 3 4 5 1 2 3 4 5 
$ value : num 3 1 3.5 3 3.5 1 3.5 4.5 2 3.5
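On the remark about within(): one possible explanation is that `within()` takes a single expression, so a second assignment written as a separate argument is never evaluated. The sketch below uses a small made-up data frame to show the difference. Wrapping the assignments in braces is the usual way to run several of them.

```r
# Hypothetical data frame with the same column names as in the question
df <- data.frame(Subj = c(1, 1, 2, 2),
                 variable = c("lawyer", "neutral", "lawyer", "neutral"),
                 stringsAsFactors = FALSE)

# Only the first assignment is the `expr` argument; the second is
# passed through `...` and silently ignored
one_arg <- within(df, Subj <- factor(Subj), variable <- factor(variable))
str(one_arg)

# Braces make both assignments part of the single expression
braced <- within(df, {
  Subj <- factor(Subj)
  variable <- factor(variable)
})
str(braced)
```

In the question itself the effect is hidden, because melt() already returns `variable` as a factor.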