ValueError: invalid literal for int() with base 10: '3"\r'

Below is a sample of the contents of my csv file (test.csv). Note: my test.csv file is about 60 MB.

"Position","Value" 
"2545600","19" 
"2545601","19" 
"2545602","19" 
"2545603","19" 
"2545604","20" 
"2545605","20" 
"2545606","21" 
"2545607","22" 
"2545608","21" 
"2545609","20" 
"2545610","21" 
"2545611","18" 
"2545612","19" 
"2545613","21" 
"2545614","21" 
"2545615","21" 
"2545616","21" 
"2545617","22" 
"2545618","25" 
"2545619","25"

My Python code (test.py) is below:

#!/usr/bin/python 
import sys 

txt = open(sys.argv[1], 'r') 
out = open(sys.argv[2], 'w') 
mil = float(sys.argv[3]) 

out.write('chr\tstart\tend\tfeature\t'+sys.argv[2]+'\n') 

for line in txt: 
    if 'Position' not in line: 
        line = line.strip('",\n')
        line = line.split('","')

        line[1] = str(int(line[1])/mil)

        out.write('gi|255767013|ref|NC_000964.3|\t'+line[0]+'\t'+line[0]+'\t\t'+line[1]+'\n')

txt.close() 
out.close()

My command line:

python test.py test.csv test.igv 5

After running the command, I get this error:

Traceback (most recent call last): 
    File "test.py", line 15, in <module> 
    line[1] = str(int(line[1])/mil) 
ValueError: invalid literal for int() with base 10: '3"\r'

However, if I create a new empty csv file, small.csv, and copy/paste only a few lines from my test.csv file into it (like the sample above), the command runs successfully.

python test.py small.csv small.igv 5

Input small.csv:

"Position","Value" 
"2545600","19" 
"2545601","19" 
"2545602","19" 
"2545603","19" 
"2545604","20" 
"2545605","20" 
"2545606","21" 
"2545607","22" 
"2545608","21" 
"2545609","20"

Output small.igv:

chr start end feature small.igv 
gi|255767013|ref|NC_000964.3| 2545600 2545600  3.8 
gi|255767013|ref|NC_000964.3| 2545601 2545601  3.8 
gi|255767013|ref|NC_000964.3| 2545602 2545602  3.8 
gi|255767013|ref|NC_000964.3| 2545603 2545603  3.8 
gi|255767013|ref|NC_000964.3| 2545604 2545604  4.0 
gi|255767013|ref|NC_000964.3| 2545605 2545605  4.0 
gi|255767013|ref|NC_000964.3| 2545606 2545606  4.2 
gi|255767013|ref|NC_000964.3| 2545607 2545607  4.4 
gi|255767013|ref|NC_000964.3| 2545608 2545608  4.2 
gi|255767013|ref|NC_000964.3| 2545609 2545609  4.0

This is what I want. So my question is: why can't I do this on a larger csv file?

Source

2013-01-21 Stickers

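In this case it is much better to use the csv module. Each row read from the csv file is returned as a list of strings. The problem of stripping whitespace does not arise, and you can specify the delimiter in the arguments to csv.reader (not needed here).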

import csv 
import sys 

out = open(sys.argv[2], 'w') 
mil = float(sys.argv[3]) 

out.write('chr\tstart\tend\tfeature\t'+sys.argv[2]+'\n') 
with open(sys.argv[1], 'rb') as f: 
    reader = csv.reader(f, delimiter=',') 
    headers = reader.next() # Consider headers separately 
    for line in reader: 
        line[1] = str(int(line[1])/mil)
        out.write('gi|255767013|ref|NC_000964.3|\t'+line[0]+'\t'+line[0]+'\t\t'+line[1]+'\n')
out.close()

Running python test.py test.csv test.igv 5 && cat test.igv should then display the expected output.

Source
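The code in this answer is written for Python 2 (binary mode 'rb' and reader.next()). As a rough sketch of the same idea under Python 3, the version below reads from any text stream and uses next(reader). The function name convert and its label parameter are made up for illustration, and the in-memory sample with '\r\n' line endings is an assumption about what the large file looks like.

```python
import csv
import io


def convert(src, dst, mil, label):
    """Write Position/Value rows from src to dst in the tab-separated layout used above."""
    dst.write('chr\tstart\tend\tfeature\t' + label + '\n')
    reader = csv.reader(src)
    next(reader)  # skip the "Position","Value" header row
    for row in reader:
        if not row:  # ignore completely empty lines
            continue
        position, value = row[0], row[1]
        scaled = str(int(value) / mil)
        dst.write('gi|255767013|ref|NC_000964.3|\t' + position + '\t'
                  + position + '\t\t' + scaled + '\n')


# Hypothetical sample with Windows-style line endings, kept in memory for the demo.
src = io.StringIO('"Position","Value"\r\n"2545600","19"\r\n"2545604","20"\r\n',
                  newline='')
dst = io.StringIO()
convert(src, dst, 5.0, 'test.igv')
result = dst.getvalue()
print(result)
```

With real files, the csv documentation recommends opening the input as open(sys.argv[1], newline='') so that the csv module handles line endings itself.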

2013-01-21 19:48:25 sidi

Try

for line in ..... : 
    line = line.strip()

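This strips the line ending (and any other surrounding whitespace, such as '\r') from the line.

A better solution: use Python's csv module to handle these details for you.

Source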
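To make the failure concrete, here is a minimal sketch on a single hand-made line. It assumes the large file uses Windows-style '\r\n' line endings, which the '3"\r' in the traceback suggests; the sample line itself is invented for illustration.

```python
# One made-up line with a '\r\n' ending, as the traceback suggests the big file has.
line = '"2545600","3"\r\n'

# The original call: strip() with an argument removes only the listed characters,
# so it stops at '\r' and leaves the closing quote and '\r' behind.
bad = line.strip('",\n').split('","')

# Stripping whitespace first removes '\r\n'; the quotes can then be stripped.
good = line.strip().strip('"').split('","')

print(repr(bad))
print(repr(good))
print(int(good[1]) / 5.0)
```

Calling int(bad[1]) raises the same ValueError as in the question, because the field still ends in a quote followed by '\r'.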

2013-01-21 19:22:37

I'd suggest the csv module would be more helpful.

For example, given this data:

"Position","Value" 
"2545600","19" 
"2545601","19" 
"2545602","19" 
"2545603","19"

import csv 
f = open("ex.csv") 
for line in csv.reader(f): 
    print line

gives the output

['Position', 'Value'] 
['2545600', '19'] 
['2545601', '19'] 
['2545602', '19'] 
['2545603', '19']

which is easier to manage.

In addition, the csv module can also write csv files.

Source
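As a small sketch of the writing side (Python 3 syntax), csv.writer with quoting=csv.QUOTE_ALL produces the same fully quoted layout as the sample data. The rows and the in-memory buffer here are placeholders for illustration, not part of the original answer.

```python
import csv
import io

# Placeholder rows; in practice these would come from your own data.
rows = [("2545600", "19"), ("2545601", "19")]

buf = io.StringIO(newline='')  # in-memory stand-in for an output file
writer = csv.writer(buf, quoting=csv.QUOTE_ALL)  # quote every field
writer.writerow(["Position", "Value"])
writer.writerows(rows)

written = buf.getvalue()
print(written)
```

When writing to a real file, open it with open(path, 'w', newline='') so that the writer's own line terminator is not translated a second time.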

2013-01-21 19:31:35 sotapme
