Getting a sequence of specific substrings in R

I created the following matrix in R:

positions = cbind(seq(from = 20, to = 68, by = 4),seq(from = 22, to = 70, by = 4))

I also have the following string:

"SEQRES   1 L   36  THR PHE GLY SER GLY GLU ALA ASP CYS GLY LEU ARG PRO   "

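I am trying to use the apply function to build a list of substr(mystring, start.position, end.position) results, where the start index comes from positions[,1] and the end index from positions[,2]. I can easily do this with a for loop, but I assumed apply would be faster.

I got it working as follows, but I wonder whether there is a cleaner way: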

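# Each row of parse.me holds a start position, an end position, and the string.
# cbind() with a string yields a character matrix, hence the as.numeric() calls.
get.AA.seqres <- function(items){
  start.position = as.numeric(items[1])
  end.position = as.numeric(items[2])
  string = items[3]
  return (substr(string, start.position, end.position))
}

parse.me = cbind(seq(from = 20, to = 68, by = 4), seq(from = 22, to = 70, by = 4), input)
apply(parse.me, MARGIN = 1, get.AA.seqres)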
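
One reason the version above needs as.numeric() is that binding the string into the matrix coerces every cell to character. A minimal sketch of a row-wise variant that keeps the matrix numeric and passes the string through a closure is shown here. It assumes the SEQRES line uses standard PDB fixed-width spacing, so the first residue sits in columns 20 to 22; the name `row_wise` is only illustrative.

```r
# SEQRES line with PDB fixed-width spacing (assumed), first residue at column 20
input <- "SEQRES   1 L   36  THR PHE GLY SER GLY GLU ALA ASP CYS GLY LEU ARG PRO   "
positions <- cbind(seq(from = 20, to = 68, by = 4), seq(from = 22, to = 70, by = 4))

# Apply over rows of the numeric matrix; p[1] is the start, p[2] the end
row_wise <- apply(positions, MARGIN = 1, function(p) substr(input, p[1], p[2]))
print(row_wise)
```

This still calls substr() once per row, so it is unlikely to be faster than a for loop; it only removes the character coercion.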

2012-05-28 user1357015

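Why don't you split on whitespace and discard the first three elements? – Andrie

PDB file fields are defined by column position, not by whitespace. Since the specification refers explicitly to column numbers, I am hesitant to split on whitespace. Thanks, though! – user1357015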
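
To illustrate the concern about splitting on whitespace, here is a small sketch using a hypothetical SEQRES line whose chain identifier (column 12) is blank. The line is made up for illustration and assumes the same fixed-width layout as the example in the question; `tokens` and `by_column` are illustrative names.

```r
# Hypothetical SEQRES line with a blank chain ID; residues still start at column 20
line <- "SEQRES   1     36  THR PHE GLY"

# Whitespace splitting: the blank field vanishes, so every later token shifts left
tokens <- strsplit(line, "\\s+")[[1]]
print(tokens)

# Column-based extraction is unaffected by the blank field
by_column <- substring(line, c(20, 24, 28), c(22, 26, 30))
print(by_column)
```

With the blank field, dropping a fixed number of leading tokens would also drop a residue, while the column positions still line up.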

Try this:

> substring(input, positions[, 1], positions[, 2]) 
[1] "THR" "PHE" "GLY" "SER" "GLY" "GLU" "ALA" "ASP" "CYS" "GLY" "LEU" "ARG" "PRO"
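
The reason substring() works here while a bare substr() call does not can be sketched as follows: substring() returns one result per start/stop pair and recycles the string, whereas substr() returns a result only as long as its first argument. The sketch assumes the SEQRES line has standard PDB fixed-width spacing; `residues` and `first_only` are illustrative names.

```r
# SEQRES line with PDB fixed-width spacing (assumed), first residue at column 20
input <- "SEQRES   1 L   36  THR PHE GLY SER GLY GLU ALA ASP CYS GLY LEU ARG PRO   "
positions <- cbind(seq(from = 20, to = 68, by = 4), seq(from = 22, to = 70, by = 4))

# substring(): result length follows the longest argument, so 13 residues
residues <- substring(input, positions[, 1], positions[, 2])
print(length(residues))

# substr(): result length follows x (length 1), so only the first pair is used
first_only <- substr(input, positions[, 1], positions[, 2])
print(first_only)
```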

2012-05-28 18:29:49

I like Andrie's pragmatic suggestion, but if you need to go this route for some other reason, your problem sounds like one that Vectorize() can solve:

#Your data 
positions = cbind(seq(from = 20, to = 68, by = 4),seq(from = 22, to = 70, by = 4)) 
input <- "SEQRES   1 L   36  THR PHE GLY SER GLY GLU ALA ASP CYS GLY LEU ARG PRO   "

#Vectorize the function substr() 
vsubstr <- Vectorize(substr, USE.NAMES = FALSE) 
vsubstr(input, positions[,1], positions[,2]) 
#----- 
[1] "THR" "PHE" "GLY" "SER" "GLY" "GLU" "ALA" "ASP" "CYS" "GLY" "LEU" "ARG" "PRO" 

#Or, read the help page on ?substr about the bit for recycling in the first paragraph of details 

substr(rep(input, nrow(positions)), positions[,1], positions[,2]) 
#----- 
[1] "THR" "PHE" "GLY" "SER" "GLY" "GLU" "ALA" "ASP" "CYS" "GLY" "LEU" "ARG" "PRO"
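
Vectorize() is a wrapper around mapply(), so the same result can be sketched by calling mapply() directly. This is only an equivalent formulation under the assumption that the SEQRES line has standard PDB fixed-width spacing; `via_mapply` is an illustrative name.

```r
# SEQRES line with PDB fixed-width spacing (assumed), first residue at column 20
input <- "SEQRES   1 L   36  THR PHE GLY SER GLY GLU ALA ASP CYS GLY LEU ARG PRO   "
positions <- cbind(seq(from = 20, to = 68, by = 4), seq(from = 22, to = 70, by = 4))

# mapply() calls substr(input, start, stop) once per start/stop pair,
# recycling the single string; USE.NAMES = FALSE drops the names it would add
via_mapply <- mapply(substr, input, positions[, 1], positions[, 2], USE.NAMES = FALSE)
print(via_mapply)
```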

2012-05-28 18:07:09 Chase
