2015-04-27 27 views
7

Case-insensitive sorting of a vector of strings in R

I have the following vector:

mylist <- c("MBT.LN.ID", "ISA51VG.LN.ID", "R848.LN.ID", "sHz.LN.ID", "FK565.LN.ID", 
    "bCD.LN.ID", "MALP2s.LN.ID", "ADX.LN.ID", "AddaVax.LN.ID", "FCA.LN.ID", 
    "Pam3CSK4.LN.ID", "D35.LN.ID", "ALM.LN.ID", "K3.LN.ID", "K3SPG.LN.ID", 
    "MPLA.LN.ID", "DMXAA.LN.ID", "cGAMP.LN.ID", "Poly_IC.LN.ID", 
    "cdiGMP.LN.ID") 

I want to sort them alphabetically, ignoring case.

The expected output is:

[1] "AddaVax.LN.ID" "ADX.LN.ID"  "ALM.LN.ID"  "bCD.LN.ID"  "cdiGMP.LN.ID" "cGAMP.LN.ID" 
[7] "D35.LN.ID"  "DMXAA.LN.ID" "FCA.LN.ID"  "FK565.LN.ID" "ISA51VG.LN.ID" "K3.LN.ID"  
[13] "K3SPG.LN.ID" "MALP2s.LN.ID" "MBT.LN.ID"  "MPLA.LN.ID"  "Pam3CSK4.LN.ID" "Poly_IC.LN.ID" 
[19] "R848.LN.ID"  "sHz.LN.ID" 

I tried this, but it failed (using R 3.2.0 alpha):

> sort(mylist) 
[1] "ADX.LN.ID"  "ALM.LN.ID"  "AddaVax.LN.ID" "D35.LN.ID" 
[5] "DMXAA.LN.ID" "FCA.LN.ID"  "FK565.LN.ID" "ISA51VG.LN.ID" 
[9] "K3.LN.ID"  "K3SPG.LN.ID" "MALP2s.LN.ID" "MBT.LN.ID" 
[13] "MPLA.LN.ID"  "Pam3CSK4.LN.ID" "Poly_IC.LN.ID" "R848.LN.ID" 
[17] "bCD.LN.ID"  "cGAMP.LN.ID" "cdiGMP.LN.ID" "sHz.LN.ID" 
+3

The output of `sort` depends on your locale: http://stackoverflow.com/a/7229428/3710546 –

+1

I get the expected output with `sort(mylist)`. What is your locale? – Cath

+0

@CathG: `LANG=en_US.UTF-8 LC_CTYPE="C" LC_NUMERIC="C" LC_TIME="C" LC_COLLATE="C" LC_MONETARY="C" LC_MESSAGES="C" LC_PAPER="C" LC_NAME="C" LC_ADDRESS="C" LC_TELEPHONE="C" LC_MEASUREMENT="C" LC_IDENTIFICATION="C" LC_ALL=C` – pdubois

Answers

12

Try

mylist[order(tolower(mylist))] 
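
A minimal sketch of why this works, using a small made-up vector `x` rather than the question's data: `tolower()` produces lower-cased sort keys, `order()` returns the permutation that sorts those keys, and indexing the original vector with that permutation keeps the original capitalisation. The helper name `sort_ci` is hypothetical and only used for illustration.

```r
# Hypothetical helper: sort a character vector while ignoring case.
sort_ci <- function(x) {
  # Order by lower-cased keys, but return the original strings.
  x[order(tolower(x))]
}

# Made-up sample data (a subset of name stems, not the full question vector).
x <- c("bCD", "ADX", "sHz", "AddaVax", "cGAMP", "ALM")
sort_ci(x)
```

Because every key is lower case, the result should no longer depend on whether the collation locale places upper-case letters before lower-case ones.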
+1

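@DavidArenburg, yes, I didn't want to change my locale settings, but I wondered whether it would work (thanks for trying it). So collate = C makes order and sort case-sensitive – Cath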

6

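As @Pascal pointed out, and as documented in help(Comparison), sort is locale-specific. One option is to switch your collation locale (e.g. Sys.setlocale("LC_COLLATE", "us"); the accepted locale names are platform-dependent), but that can be inconvenient. Another option is gtools::mixedsort, which may also be useful because your strings contain numbers as well.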

library(gtools) 
mixedsort(mylist) 

# [1] "AddaVax.LN.ID" "ADX.LN.ID"  "ALM.LN.ID"  "bCD.LN.ID"  "cdiGMP.LN.ID" "cGAMP.LN.ID" "D35.LN.ID"  "DMXAA.LN.ID" "FCA.LN.ID"  "FK565.LN.ID" 
# [11] "ISA51VG.LN.ID" "K3.LN.ID"  "K3SPG.LN.ID" "MALP2s.LN.ID" "MBT.LN.ID"  "MPLA.LN.ID"  "Pam3CSK4.LN.ID" "Poly_IC.LN.ID" "R848.LN.ID"  "sHz.LN.ID" 
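
To illustrate the locale dependence mentioned above, here is a rough sketch that sorts under an explicitly chosen collation locale and then restores the previous one. The helper name `sort_with_collate` and the sample vector `x` are assumptions for illustration. Only the "C" locale is used here, since other locale names vary by platform.

```r
# Hypothetical helper: run sort() under a given LC_COLLATE setting,
# then restore whatever collation locale was active before.
sort_with_collate <- function(x, locale) {
  old <- Sys.getlocale("LC_COLLATE")
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
  Sys.setlocale("LC_COLLATE", locale)
  sort(x)
}

# Made-up sample data.
x <- c("bCD", "ADX", "sHz", "AddaVax")

# The "C" locale compares by character code, so upper-case letters
# sort before lower-case ones, as in the question's output.
sort_with_collate(x, "C")
```

Under "C", "ADX" comes before "AddaVax" and all strings starting with a lower-case letter come last, which matches the behaviour reported in the question.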
3
> library(searchable) 
> sort(ignore.case(mylist)) 
[1] "AddaVax.LN.ID" "ADX.LN.ID"  "ALM.LN.ID"  "bCD.LN.ID"  "cdiGMP.LN.ID" 
[6] "cGAMP.LN.ID" "D35.LN.ID"  "DMXAA.LN.ID" "FCA.LN.ID"  "FK565.LN.ID" 
[11] "ISA51VG.LN.ID" "K3.LN.ID"  "K3SPG.LN.ID" "MALP2s.LN.ID" "MBT.LN.ID"  
[16] "MPLA.LN.ID"  "Pam3CSK4.LN.ID" "Poly_IC.LN.ID" "R848.LN.ID"  "sHz.LN.ID" 