Sorting binary sequences with R

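Imagine the following sequences: [image: binary sequences]

I would like to sort them by similarity:

Lines 2, 3, 4 and 5 are equally similar to line 1, because each differs from it by only one bit. So the order of lines 2, 3, 4, 5 could just as well be 3, 2, 5, 4.

Line 6 comes next, because it differs from line 1 by 2 bits.

Can this be done with R?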

Source

2016-07-14 Hans-Christian Willibald

Let

x <- c("0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111", 
     "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111")

1) Using the digitsum function from this answer:

digitsum <- function(x) sum(floor(x/10^(0:(nchar(x) - 1))) %% 10) 
x[order(sapply(as.numeric(x), digitsum))] 
# [1] "0000" "0001" "0010" "0100" "1000" "0011" "0101" "0110" "1001" "1010" "1100" 
# [12] "0111" "1011" "1101" "1110" "1111"

2) Using a regular expression:

x[order(gsub(0, "", x))] 
# [1] "0000" "0001" "0010" "0100" "1000" "0011" "0101" "0110" "1001" "1010" "1100" 
# [12] "0111" "1011" "1101" "1110" "1111"
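To see why the regular-expression version sorts correctly, it may help to look at the sort keys it produces. The following is only an illustrative sketch on a smaller vector; the name keys is introduced here and is not part of the answer above. Deleting every 0 leaves a run of 1s, and a shorter run of 1s is a prefix of a longer one, so it sorts first.

```r
x <- c("0000", "0101", "1000", "1110", "0011")

# Deleting every "0" leaves only the 1s; these strings act as the sort keys.
keys <- gsub("0", "", x)
keys
# [1] ""    "11"  "1"   "111" "11"

# A shorter run of 1s is a prefix of a longer one, so it sorts first.
by_string <- x[order(keys)]

# Counting the remaining characters gives the same ordering with numeric keys.
by_count <- x[order(nchar(keys))]
```

Both by_string and by_count put the sequences with the fewest 1s first; ties keep their original relative order.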

Source

2016-07-14 20:34:07 Julius

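Instead of the digitsum function, couldn't you just do: 'x[order(sapply(strsplit(x, ""), function(x) sum(x == 1)))]' – eipi10

@eipi10, sure, but the regular-expression solution is probably neater than any other solution that involves summing digits. – Julius

I agree. But it's fun to figure out all the second-best ways of doing things in R. – eipi10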

Well, here is my attempt. Give it a try and see whether it suits your needs. It depends on the stringr package.

library('stringr') 
# Creates a small test data frame to mimic the data you have. 
df <- data.frame(numbers = c('0000', '0001', '0010', '0011', '0100', '0101', '0111', '1000'), stringsAsFactors = FALSE) 
df$count <- str_count(df$numbers, '1') # Counts instances of 1 occurring in each string 
df[with(df, order(count)), ] # Orders data frame by number of counts. 

  numbers count
1    0000     0
2    0001     1
3    0010     1
5    0100     1
8    1000     1
4    0011     2
6    0101     2
7    0111     3

Source

2016-07-14 20:32:03 Sam

This only works if the first entry is '0000'. The OP may need a more general solution –
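One way to read that comment: counting 1s gives the distance to '0000' only. A minimal base R sketch of a more general version follows, assuming all strings have the same length and that the reference is whichever sequence you choose (here, the first element). hamming_to is a hypothetical helper name used for illustration, not a function from any package.

```r
# Number of positions at which each string in x differs from ref
# (assumes every string has the same length as ref).
hamming_to <- function(x, ref) {
  ref_bits <- strsplit(ref, "")[[1]]
  vapply(strsplit(x, ""), function(bits) sum(bits != ref_bits), integer(1))
}

x <- c("0110", "0111", "0010", "1001", "0100")

# Distance of every sequence to the first one.
d <- hamming_to(x, x[1])
d
# [1] 0 1 1 4 1

# Order by distance to the reference; ties keep their original order.
sorted <- x[order(d)]
sorted
# [1] "0110" "0111" "0010" "0100" "1001"
```

With ref set to '0000' this reduces to counting the 1s in each string, so it gives the same ordering as the answers above.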

Since we are talking about string distances, you may want to use the stringdist function from the stringdist package to do this:

library(stringdist) 
x <- c("0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111", 
     "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111") 

# stringdistmatrix(x, '0000') computes the distance from each element of x
# to the reference string, '0000' in this case
distances <- stringdistmatrix(x, '0000') 

#use the distances to order the vector 
x[order(distances)] 
#[1] "0000" "0001" "0010" "0100" "1000" "0011" "0101" "0110" 
# "1001" "1010" "1100" "0111" "1011" "1101" "1110" "1111"

Or in one go:

x[order(stringdist(x, '0000'))]

Source

2016-07-14 21:22:36 LyzandeR
