On Linux, you can use awk together with fread, or pipe its output into read.table. Here is the awk command:
pth <- '/home/akrun/file.txt' #change it to your path
v1 <- sprintf("awk '/^(ID_REF|ILMN)/{ matched = 1} matched {$1=$1; print}' OFS=\",\" %s", pth)
With fread:
library(data.table)
fread(cmd = v1) # cmd= makes it explicit that v1 is a shell command
#         ID_REF 1688628068_A.AVG_Signal 1688628068_A.Avg_NBEADS
#1: ILMN_1343291               62821.840                     135
#2: ILMN_1343292                3255.167                     131
#3: ILMN_1343293               42924.910                     152
#4: ILMN_1343294               55255.210                     100
#   1688628068_A.BEAD_STDERR 1688628068_A.Detection_Pval
#1:                413.93990                           0
#2:                 47.76587                           0
#3:                539.30260                           0
#4:                746.14570                           0
Or with read.table:
read.table(pipe(v1), header=TRUE, sep=',', check.names=FALSE)
#        ID_REF 1688628068_A.AVG_Signal 1688628068_A.Avg_NBEADS
#1 ILMN_1343291               62821.840                     135
#2 ILMN_1343292                3255.167                     131
#3 ILMN_1343293               42924.910                     152
#4 ILMN_1343294               55255.210                     100
#  1688628068_A.BEAD_STDERR 1688628068_A.Detection_Pval
#1                413.93990                           0
#2                 47.76587                           0
#3                539.30260                           0
#4                746.14570                           0
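Note that I changed the delimiter to a comma (OFS="," in the awk command), and I renamed the column 1688628068_A.Detection Pval to 1688628068_A.Detection_Pval, because for some reason the extra space causes problems with fread. With read.table this is not a problem, so the following also works: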
v2 <- sprintf("awk '/^(ID_REF|ILMN)/{ matched = 1} matched { print}' %s", pth)
read.table(pipe(v2), header=TRUE, check.names=FALSE)
#        ID_REF 1688628068_A.AVG_Signal 1688628068_A.Avg_NBEADS
#1 ILMN_1343291               62821.840                     135
#2 ILMN_1343292                3255.167                     131
#3 ILMN_1343293               42924.910                     152
#4 ILMN_1343294               55255.210                     100
#  1688628068_A.BEAD_STDERR 1688628068_A.Detection_Pval
#1                413.93990                           0
#2                 47.76587                           0
#3                539.30260                           0
#4                746.14570                           0
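To try the awk filter without the real file, here is a minimal, self-contained sketch. The temporary file and its two preamble lines are made up for illustration (the actual file layout is an assumption); only the ID_REF header and the ILMN rows mirror the output above. It assumes awk is available on the PATH.

```r
# Hypothetical sample file: the preamble lines are invented;
# the header and data rows are a reduced copy of the output shown above.
pth <- tempfile(fileext = ".txt")
writeLines(c(
  "made-up preamble line 1",
  "made-up preamble line 2",
  "ID_REF\t1688628068_A.AVG_Signal\t1688628068_A.Detection_Pval",
  "ILMN_1343291\t62821.840\t0",
  "ILMN_1343292\t3255.167\t0"
), pth)

# awk prints nothing until a line starts with ID_REF or ILMN;
# from that line on, "matched" is set and every line is printed.
cmd <- sprintf(
  "awk '/^(ID_REF|ILMN)/{ matched = 1 } matched { print }' %s",
  shQuote(pth)  # quote the path in case it contains spaces
)

# Read the filtered stream; check.names = FALSE keeps the names as written.
df <- read.table(pipe(cmd), header = TRUE, check.names = FALSE)
str(df)
```

shQuote() is used here only so that a path containing spaces does not break the shell command; it is not part of the original answer.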
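It looks like you have more column names than columns. Is '1688628068_A.Detection Pval' a single column? If the file has '#' lines that need to be skipped, read.table('yourfile.txt', header = TRUE, fill = TRUE) should read it. – akrun
@akrun Yes, it is a single column. – Hashim
One option is to change the column name in the file to '1688628068_A.Detection_Pval' and read it without using fill = TRUE. – akrun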