2015-01-13

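Ordering a character matrix by a numeric column

I am working with a matrix read in from a CSV that contains both numbers and characters. This is a smaller matrix, but it is basically what I am working with: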

[,1] [,2] [,3]   [,4] [,5] [,6] [,7] [,8] [,9] 
V2 "A" "1" "Sample X1" "34712" "39390" "38858" "38574" "38660" 
V3 "A" "2" "Sample X2" "35333" "39940" "40533" "39936" "40669" 
V4 "A" "3" "Sample X3" "33612" "39601" "38658" "39220" "39465" 
V5 "A" "4" "Sample X4" "34309" "39200" "38597" "39820" "40081" 
V6 "A" "5" "Sample X5" "33637" "39404" "40497" "39388" "40033" 
V7 "A" "6" "Sample X6" "35314" "39522" "40345" "38624" "40306" 
V8 "A" "7" "Sample X7" "35548" "39000" "41408" "38310" "39849" 
V9 "A" "8" "Sample X8" "33972" "39930" "39777" "39582" "39570" 
V10 "A" "9" "Sample X9" "34808" "39857" "39252" "39248" "38465" 
V11 "A" "10" "Sample X10" "34316" "39798" "39776" "39516" "38812" 
V12 "A" "11" "Sample X11" "34476" "38581" "39672" "38997" "38794" 
V13 "A" "12" "Sample X12" "36246" "38809" "37872" "38100" "36925" 
V14 "B" "1" "Sample X13" "33642" "40201" "40202" "39320" "40426" 
V15 "B" "2" "Sample X14" "33381" "40624" "40349" "41350" "40490" 
V16 "B" "3" "Sample X15" "34465" "42096" "41194" "40613" "40416" 
V17 "B" "4" "Sample X16" "33957" "41905" "42273" "40710" "40681" 
V18 "B" "5" "Sample X17" "33877" "42040" "42226" "40788" "41261" 
V19 "B" "6" "Sample X18" "33970" "41860" "41149" "41093" "40877" 
V20 "B" "7" "Sample X19" "34745" "42040" "40186" "40862" "41044" 
V21 "B" "8" "Sample X20" "34140" "41274" "39880" "40356" "40496" 
V22 "B" "9" "Sample X21" "33929" "40652" "41410" "40760" "40718" 
V23 "B" "10" "Sample X22" "33684" "39220" "40478" "41500" "40094" 
V24 "B" "11" "Sample X23" "33141" "41446" "41121" "40726" "41020" 
V25 "B" "12" "Sample X24" "33405" "38481" "37716" "38562" "38218" 
V26 "C" "1" "Sample X25" "71560" "86402" "85614" "84273" "83264" 
V27 "C" "2" "Sample X26" "72144" "86266" "88082" "87672" "87356" 
V28 "C" "3" "Sample X27" "71946" "90201" "89156" "88386" "88006" 
V29 "C" "4" "Sample X28" "71758" "89108" "88225" "86006" "88654" 
V30 "C" "5" "Sample X29" "71144" "86558" "88614" "87028" "88809" 
V31 "C" "6" "Sample X30" "70504" "89230" "88869" "86653" "86356" 
V32 "C" "7" "Sample X31" "67874" "88405" "84878" "84914" "85425" 
V33 "C" "8" "Sample X32" "70273" "87865" "87529" "87945" "86172" 

I would like to order the matrix by the second column (the one without a header), like so:

A 1 . . . 
B 1 
C 1 
A 2 
B 2 
C 2 
A 3 
. 
. 
. 
A 12 
B 12 
C 12 . . . 

I looked around and found that you can use the command:

data <- data[order(data[,2]),] 

But it comes out like this:

A 1 . . . 
B 1 
C 1 
A 10 
B 10 
C 10 
A 11 
B 11 
C 11 
A 12 
B 12 
C 12 
A 2 
B 2 
C 2 
. 
. 
. 
A 9 
B 9 
C 9 . . . 

Is this because the matrix is a character matrix? How can I make just the second column numeric so that I can sort by it?

Thanks

Answers


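Holding your data in a matrix is a bad idea when you want a mixture of classes (e.g. numeric and character) across columns. Instead, you should use a data frame.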

Ideally, read the data into a data frame with read.csv or read.table. Otherwise, coerce your matrix to a data frame with as.data.frame.
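As a minimal sketch of the first option, the snippet below reads a small CSV straight into a data frame. The inline CSV text and the column names (group, id, value) are made up for illustration; with a real file you would pass its path instead of text =.

```r
# Hypothetical CSV content, supplied inline so the example is self-contained.
csv_text <- "group,id,value
A,1,34712
A,10,34316
A,2,35333"

# read.csv() picks a type per column, so id and value are read as numbers
# while group stays character.
d <- read.csv(text = csv_text, stringsAsFactors = FALSE)

sapply(d, class)    # class of each column
d[order(d$id), ]    # rows ordered by id as a number: 1, 2, 10
```

Because each column keeps its own class, no conversion is needed before ordering.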

Given a matrix m (data in your case):

d <- as.data.frame(m, stringsAsFactors=FALSE) 
d[, 3] <- as.numeric(d[, 3]) # coerce the relevant column to numeric 
d[order(d[, 3]), ] 

Note that you can also order the matrix as required with m[order(as.numeric(m[, 3])), ], but the result will still be a matrix in which every column is character.
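The direct matrix approach can be sketched as follows. The toy matrix and its values are made up, and here the numbers happen to sit in column 2; adjust the index to whichever column holds the numbers in your own matrix.

```r
# Toy character matrix (made-up values); column 2 holds numbers stored as text.
m <- matrix(c("A", "1",
              "A", "10",
              "A", "2",
              "B", "1",
              "B", "10",
              "B", "2"),
            ncol = 2, byrow = TRUE)

# Order rows by the numeric value of column 2.
# Tied rows keep their original relative order.
sorted <- m[order(as.numeric(m[, 2])), ]

sorted[, 2]     # "1" "1" "2" "2" "10" "10"
mode(sorted)    # "character": only the row order changed, not the type
```

The conversion happens only inside order(), so the matrix is reordered without any of its values being modified.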

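Note: the explanation for the sorting behaviour you are seeing is that character vectors are compared as text, so anything beginning with 1 (e.g. 10) comes before 2.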
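A small illustration of that difference, using a made-up vector of numbers stored as text:

```r
# Numbers stored as text (illustrative values).
x <- c("1", "2", "9", "10", "11", "12")

sort(x)                # compared as text: "1" "10" "11" "12" "2" "9"
sort(as.numeric(x))    # compared as numbers: 1 2 9 10 11 12
```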


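Thanks for the tip about data.frame; I was confused about how to hold data of more than one class. Is there any way to change how it sorts, so that it goes 1, 2, 3, 4, ..., 10, 11, 12? Or is the best approach to pull those rows out and put them at the end? –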


@IlyaLederman Not sure what you mean. The code I provided (`d <- as.data.frame(data); d[order(d[, 3]), ]`), or `data[order(as.numeric(data[, 3])), ]`, should both order the rows as you want. – jbaums


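I still get the order 1,10,11,12,2,3,4,5,9,7,8,9. I ran lapply(data, class) and it says everything is a factor. I don't really understand what a factor is. I read the CSV in like this: data <- t(read.csv("data.csv", header = FALSE, skip = 5))[-1, ] and then ran data <- as.data.frame(data) –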
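
On the factor question raised in the last comment: a factor stores each distinct value as an integer code plus a table of labels (its levels), and for text input the levels are sorted as text. The sketch below uses made-up values and is only an assumption about what is happening with the asker's data. It shows why converting a factor directly to numeric gives the codes rather than the numbers, and the usual workaround of converting to character first.

```r
# A factor built from numbers stored as text (made-up values).
f <- factor(c("1", "10", "2", "12"))

levels(f)                      # "1" "10" "12" "2": levels are sorted as text
as.numeric(f)                  # 1 2 4 3: the internal level codes, not the values
as.numeric(as.character(f))    # 1 10 2 12: the actual values

# Ordering by the converted values gives numeric order.
f[order(as.numeric(as.character(f)))]
```

Passing stringsAsFactors = FALSE to as.data.frame, as in the answer's code, keeps the columns as character so that as.numeric can be applied to them directly.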