2015-11-02 42 views
2
sessionInfo() 
R version 3.2.2 (2015-08-14) 
Platform: x86_64-w64-mingw32/x64 (64-bit) 
Running under: Windows 7 x64 (build 7601) Service Pack 1 

locale: 
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C     
[5] LC_TIME=German_Germany.1252  

attached base packages: 
[1] stats  graphics grDevices utils  datasets methods base  

other attached packages: 
[1] dplyr_0.4.3  plyr_1.8.3  tidyr_0.3.1  gridExtra_2.0.0 scales_0.3.0 
[6] ggplot2_1.0.1 RPostgreSQL_0.4 DBI_0.3.1  

loaded via a namespace (and not attached): 
[1] Rcpp_0.12.1  lubridate_1.3.3 assertthat_0.1 digest_0.6.8  MASS_7.3-44  
[6] R6_2.1.1   grid_3.2.2  gtable_0.1.2  magrittr_1.5  stringi_0.5-5 
[11] reshape2_1.4.1 proto_0.3-10  tools_3.2.2  stringr_1.0.0 munsell_0.4.2 
[16] parallel_3.2.2 colorspace_1.2-6 memoise_0.2.1 

For example, I have n strings in a column, as shown below. I want to sort the strings in R based on their last word.

dput(dsp) 
c("handlingstation/cropping/ forward/Linie 1", "handlingstation/cropping/ forward/Linie 2", 
"conveyorstation/Linie 1", "conveyorstation/Linie 2", "soft/handling/cleaning/backward/Linie 3", 
"jumper/doublejumper/Linie 1", "jumper/doublejumper/Linie 2" 
) 



dsp 
[1] "handlingstation/cropping/ forward/Linie 1" 
[2] "handlingstation/cropping/ forward/Linie 2" 
[3] "conveyorstation/Linie 1"      
[4] "conveyorstation/Linie 2"      
[5] "soft/handling/cleaning/backward/Linie 3" 
[6] "jumper/doublejumper/Linie 1"     
[7] "jumper/doublejumper/Linie 2" 

Desired output

dsp_sorted 
[1] "handlingstation/cropping/ forward/Linie 1" 
[2] "conveyorstation/Linie 1"      
[3] "jumper/doublejumper/Linie 1"     
[4] "handlingstation/cropping/ forward/Linie 2" 
[5] "conveyorstation/Linie 2"      
[6] "jumper/doublejumper/Linie 2"     
[7] "soft/handling/cleaning/backward/Linie 3" 

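I want all the strings in a particular column to be ordered by their last word. Here the ordering should be based on Linie 1, Linie 2, and so on.

Can someone tell me how to do this?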

Answers

4

You could try something like the following

dsp[order(sub(".*/", "", dsp))] 
# [1] "handlingstation/cropping/ forward/Linie 1" "conveyorstation/Linie 1"      
# [3] "jumper/doublejumper/Linie 1"     "handlingstation/cropping/ forward/Linie 2" 
# [5] "conveyorstation/Linie 2"      "jumper/doublejumper/Linie 2"     
# [7] "soft/handling/cleaning/backward/Linie 3" 

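This uses a regular expression to remove everything up to and including the last occurrence of / and then orders your vector by the word that remains.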
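The two steps described above can be made visible by storing the intermediate key. This is only a sketch in base R; the names `key`, `idx` and `dsp_sorted` are introduced here for illustration.

```r
dsp <- c("handlingstation/cropping/ forward/Linie 1",
         "handlingstation/cropping/ forward/Linie 2",
         "conveyorstation/Linie 1", "conveyorstation/Linie 2",
         "soft/handling/cleaning/backward/Linie 3",
         "jumper/doublejumper/Linie 1", "jumper/doublejumper/Linie 2")

# ".*" is greedy, so ".*/" matches up to and including the LAST slash;
# replacing that match with "" leaves only the final path component
key <- sub(".*/", "", dsp)
print(key)

# order() returns the permutation that sorts the keys;
# tied keys keep their original relative order
idx <- order(key)
dsp_sorted <- dsp[idx]
print(dsp_sorted)
```

Keeping `key` as a separate vector makes it easy to check what is actually being compared before the original strings are reordered.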


Using a mixed ordering may be safer, though (because you have both numbers and characters within a single value)

library(gtools) 
dsp[mixedorder(sub(".*/", "", dsp))] 
# [1] "handlingstation/cropping/ forward/Linie 1" "conveyorstation/Linie 1"      
# [3] "jumper/doublejumper/Linie 1"     "handlingstation/cropping/ forward/Linie 2" 
# [5] "conveyorstation/Linie 2"      "jumper/doublejumper/Linie 2"     
# [7] "soft/handling/cleaning/backward/Linie 3" 
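To illustrate why a mixed ordering can matter, here is a small sketch with hypothetical labels that include a two-digit number (the sample data in the question only go up to Linie 3). Plain character ordering compares the labels character by character, so "Linie 10" lands before "Linie 2". The `gtools` call is guarded because the package may not be installed.

```r
# Hypothetical labels, not taken from the question's data
labels <- c("Linie 1", "Linie 2", "Linie 10")

# Plain character ordering: "Linie 10" sorts before "Linie 2"
plain <- labels[order(labels)]
print(plain)

# mixedorder() compares the embedded numbers numerically instead
if (requireNamespace("gtools", quietly = TRUE)) {
  print(labels[gtools::mixedorder(labels)])
}
```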

Another option (depending on your real data) is to extract the number from the end of the string and sort by it

dsp[order(as.numeric(sub(".*\\D(\\d+)$", "\\1", dsp)))] 
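As a worked example of sorting by the trailing number, the sketch below uses hypothetical strings with a two-digit suffix. It assumes every string ends in digits preceded by at least one non-digit character. The pattern requires a non-digit (`\\D`) right before the captured group, because a greedy `.*` placed directly in front of the digit group would leave only the final digit to capture.

```r
# Hypothetical input; the names x, num and sorted are illustrative only
x <- c("a/Linie 10", "b/Linie 2", "c/Linie 1")

# Capture the whole run of digits at the end of each string
num <- as.numeric(sub(".*\\D(\\d+)$", "\\1", x))
print(num)

# Order the original strings by the numeric value
sorted <- x[order(num)]
print(sorted)
```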

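Apparently the stringi package also offers a mixed ordering, enabled by specifying opts_collator = list(numeric = TRUE), and it can extract the last word of a string too, so you could also do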

library(stringi) 
dsp[stri_order(stri_extract_last_words(dsp), opts_collator = list(numeric = TRUE))] 
# [1] "handlingstation/cropping/ forward/Linie 1" "conveyorstation/Linie 1"      
# [3] "jumper/doublejumper/Linie 1"     "handlingstation/cropping/ forward/Linie 2" 
# [5] "conveyorstation/Linie 2"      "jumper/doublejumper/Linie 2"     
# [7] "soft/handling/cleaning/backward/Linie 3" 
+0

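Thank you very much. It works well. In my actual data frame ("function"), dsp is a column. Could you tell me how to sort the data frame "function" by applying the mixed ordering above (gtools) to the dsp column? – Chanti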

+0

I think it's just 'function[mixedorder(sub(".*/", "", function$dsp)), ]'. Bad name for a data set, btw. –

+1

Thanks, David. I completely agree with you about the name of the data set. I have changed it. – Chanti
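The data frame case from the comments can be sketched as follows. The data frame name `funs` and the extra column `id` are hypothetical (`function` is a reserved word in R, which is one reason to rename it). The sketch uses base `order()`; `gtools::mixedorder(key)` could be swapped in at the same place if the numbers may have more than one digit.

```r
# Hypothetical data frame with dsp as one of its columns
funs <- data.frame(
  id  = 1:7,
  dsp = c("handlingstation/cropping/ forward/Linie 1",
          "handlingstation/cropping/ forward/Linie 2",
          "conveyorstation/Linie 1", "conveyorstation/Linie 2",
          "soft/handling/cleaning/backward/Linie 3",
          "jumper/doublejumper/Linie 1", "jumper/doublejumper/Linie 2"),
  stringsAsFactors = FALSE
)

# Build the sort key from the column, then reorder whole rows;
# the trailing comma in [rows, ] keeps all columns
key <- sub(".*/", "", funs$dsp)
funs_sorted <- funs[order(key), ]
print(funs_sorted)
```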
