2016-04-27 82 views
1

Python newbie here: importing a CSV into a list in Python

I have a CSV file containing numbers in this format

9143.680696, 427953.500000, 11919.104475, 11908.727555, 1.000871, 0.029506, 15.546608, 93, 121, 123, 7 
7704.773182, 330297.500000, 19186.759308, 19170.146116, 1.000867, 0.029426, 14.302257, 93, 121, 123, 7 

I need to read the file so that the resulting list looks like this

[ 
[[9143.680696, 427953.500000, 11919.104475, 11908.727555, 1.000871, 0.029506, 15.546608, 93, 121, 123], [7]], 
[[7704.773182, 330297.500000, 19186.759308, 19170.146116, 1.000867, 0.029426, 14.302257, 93, 121, 123], [7]] 
] 

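The last number of each row is stored in a separate list, as with the 7 here.

I have looked at some existing answers, but they store the values in the list as strings, which does not work for the problem I am dealing with.

Thank you very much for your help.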


+0

Do you expect the last four items of each row to be treated as integers or as floats (93 or 93.0)? – RafG

Answers

0

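The simplest way without using external modules:

Update: I replaced the plain float(...) conversion with a new convert(...) function that tries to produce a float and, if the token is not a number and raises an exception, returns the original string instead (alternatively, it could do something else at that point).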

def convert(value_str):
    try:  # try to convert it to a float:
        return float(value_str)
    except ValueError:  # if it is not a valid float literal, return the original string:
        return value_str

with open("file.csv") as csvfile: 
    split_lines = [line.split(",") for line in csvfile] 
    data = [[[convert(n) for n in line[:-1]], [convert(line[-1])]] for line in split_lines] 

    print(data) 

Output for the sample data from the question (manually formatted):

[ 
    [ [9143.680696, 427953.5, 11919.104475, 11908.727555, 1.000871, 0.029506, 15.546608, 93.0, 121.0, 123.0], [7.0] ], 
    [ [7704.773182, 330297.5, 19186.759308, 19170.146116, 1.000867, 0.029426, 14.302257, 93.0, 121.0, 123.0], [7.0] ] 
] 
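RafG's comment on the question asks whether the trailing fields should be 93 or 93.0. If they should stay integers, one possible variation is to try int before float. This is only a sketch: the name convert_number and the in-memory sample_lines list are assumptions made for illustration, not part of the answer above.

```python
def convert_number(token):
    """Return an int if the token parses as one, else a float, else the stripped string."""
    token = token.strip()
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            continue  # not valid for this type, try the next one
    return token


# Sample rows copied from the question, used here instead of a file on disk.
sample_lines = [
    "9143.680696, 427953.500000, 11919.104475, 11908.727555, 1.000871, 0.029506, 15.546608, 93, 121, 123, 7 ",
    "7704.773182, 330297.500000, 19186.759308, 19170.146116, 1.000867, 0.029426, 14.302257, 93, 121, 123, 7 ",
]

rows = []
for line in sample_lines:
    values = [convert_number(tok) for tok in line.split(",")]
    # everything except the last value, then the last value in its own list
    rows.append([values[:-1], values[-1:]])

print(rows[0])
```

With this variation the last fields come back as 93, 121, 123 and [7] rather than 93.0, 121.0, 123.0 and [7.0].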
+0

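If a value in the list is -nan, will it raise an error? – ethanruan

+0

Yes, this assumes there are only valid floating-point numbers separated by commas. You could add a check, e.g. return the value as a string if the conversion fails. Adding that to my answer... –

+0

@ethanruan I added a convert() function to handle the case where a token is not a valid float. –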
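As a side note on the -nan question: Python's built-in float() accepts the spellings "nan" and "inf" (with an optional sign, in any letter case), so such tokens convert to a float NaN rather than raising ValueError. The sketch below checks this and shows one way to flag NaN values afterwards. The helper name convert_or_none and the choice to map NaN to None are assumptions made for illustration only.

```python
import math

# float() parses these special spellings instead of raising ValueError.
for token in ["nan", "-nan", " NaN ", "inf", "-inf"]:
    value = float(token)
    print(repr(token), "->", value, "| is NaN:", math.isnan(value))


def convert_or_none(token):
    """Convert to float; map NaN to None; return the stripped string if not numeric."""
    try:
        value = float(token)
    except ValueError:
        return token.strip()
    return None if math.isnan(value) else value
```

Whether NaN should become None, stay as NaN, or be rejected depends on what the downstream code expects.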

3

You can try something like this,

>>> csv = '''9143.680696, 427953.500000, 11919.104475, 11908.727555, 1.000871, 0.029506, 15.546608, 93, 121, 123, 7
7704.773182, 330297.500000, 19186.759308, 19170.146116, 1.000867, 0.029426, 14.302257, 93, 121, 123, 7'''
>>> [[line.split(',')[0:-1], [line.split(',')[-1]]] for line in csv.splitlines()] 
[[['9143.680696', ' 427953.500000', ' 11919.104475', ' 11908.727555', ' 1.000871', ' 0.029506', ' 15.546608', ' 93', ' 121', ' 123'], [' 7']], [['7704.773182', ' 330297.500000', ' 19186.759308', ' 19170.146116', ' 1.000867', ' 0.029426', ' 14.302257', ' 93', ' 121', ' 123'], [' 7']]] 

If you want the items as floats, you can use map

>>> # list(...) is needed on Python 3, where map() returns an iterator
>>> data = [list(map(float, line.split(','))) for line in csv.splitlines()]
>>> [[items[:-1], items[-1]] for items in data] 
[[[9143.680696, 427953.5, 11919.104475, 11908.727555, 1.000871, 0.029506, 15.546608, 93.0, 121.0, 123.0], 7.0], [[7704.773182, 330297.5, 19186.759308, 19170.146116, 1.000867, 0.029426, 14.302257, 93.0, 121.0, 123.0], 7.0]] 

Pretty-printing:

>>> import pprint 
>>> pprint.pprint([[items[:-1], items[-1]] for items in data]) 
[[[9143.680696, 
    427953.5, 
    11919.104475, 
    11908.727555, 
    1.000871, 
    0.029506, 
    15.546608, 
    93.0, 
    121.0, 
    123.0], 
    7.0], 
[[7704.773182, 
    330297.5, 
    19186.759308, 
    19170.146116, 
    1.000867, 
    0.029426, 
    14.302257, 
    93.0, 
    121.0, 
    123.0], 
    7.0]] 
+0

I think ethanruan wants to store decimal numbers, so maybe add a conversion? – Whysmerhill

+0

On Python 3, you need 'list(map(...))' or a list comprehension. – RafG

+0

@Whysmerhill I have added a solution. Thanks –
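RafG's remark about Python 3 can be illustrated with a short sketch. On Python 3, map() returns a lazy iterator that cannot be sliced, so it has to be materialised before taking [:-1] and [-1:]. The tokens below are taken from the end of a sample row, and the variable names are illustrative only.

```python
tokens = "93, 121, 123, 7".split(",")

lazy = map(float, tokens)  # on Python 3 this is an iterator, not a list
# lazy[:-1] would raise TypeError here, because map objects do not support slicing

as_list = list(map(float, tokens))           # option 1: wrap map() in list()
as_comprehension = [float(t) for t in tokens]  # option 2: a list comprehension

print(as_list[:-1], as_list[-1:])
```

Both options produce the same list; the comprehension is often considered the more readable of the two.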

0

Just use the [] (slice) operator to get the left and right parts of the list:

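import csv
...
rows = []  # not named "list", which would shadow the built-in list() used below
with open(filename, newline="") as fd:  # text mode: on Python 3 csv expects str, not bytes
    reader = csv.reader(fd, delimiter=",")
    for row in reader:
        left = list(map(float, row[:-1]))   # every field except the last
        right = list(map(float, row[-1:]))  # the last field, kept as a one-item list
        rows.append([left, right])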
1

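CSV libraries usually read fields as strings, so you need to convert the fields explicitly. From the csv module documentation:

Each row read from the csv file is returned as a list of strings. No automatic data type conversion is performed.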

>>> import csv
>>> with open('eggs.csv', newline='') as csvfile:
...     spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
...     for row in spamreader:
...         print(row)  # process the row here

Likewise, the CSV library treats all fields the same way, so you need to wrap the last field in a list explicitly.

For example:

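>>> a = ["1.23", "2.34", "10", "100", "1000"]
>>> # list(...) is needed on Python 3, where map() objects cannot be concatenated with +
>>> list(map(float, a[0:2])) + list(map(int, a[2:4])) + [[int(a[4])]]
[1.23, 2.34, 10, 100, [1000]]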
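The csv module also offers a quoting mode, csv.QUOTE_NONNUMERIC, under which the reader converts every unquoted field to float. A minimal sketch follows, assuming every field in the file is numeric and unquoted (a non-numeric unquoted field would raise ValueError). The sample data is held in an io.StringIO object here purely so the example runs without a file on disk.

```python
import csv
import io

# In-memory stand-in for the CSV file from the question.
sample = io.StringIO(
    "9143.680696, 427953.500000, 11919.104475, 11908.727555, 1.000871, 0.029506, 15.546608, 93, 121, 123, 7 \n"
    "7704.773182, 330297.500000, 19186.759308, 19170.146116, 1.000867, 0.029426, 14.302257, 93, 121, 123, 7 \n"
)

# QUOTE_NONNUMERIC makes the reader return unquoted fields as floats;
# skipinitialspace drops the blank that follows each comma.
reader = csv.reader(sample, quoting=csv.QUOTE_NONNUMERIC, skipinitialspace=True)

# Split each row into "all but the last value" and "the last value in its own list".
rows = [[row[:-1], row[-1:]] for row in reader]

print(rows)
```

This trades flexibility for brevity: every value becomes a float, so the integer-looking fields come back as 93.0, 121.0, 123.0 and 7.0.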
0

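You need to iterate over each row and convert its values to numbers (floats in the code below), and store them in the list format you want.

For example: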

import csv
l = list()
with open('data.csv', 'r', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    for row in reader:
        # convert every field but the last, and keep the last one in its own list
        l.append([[float(x) for x in row[:-1]], [float(row[-1])]])
print(l)
0

You can try this, assuming the input file is named input.csv

import csv
new_list = []
with open('input.csv', newline='') as inp:
    csv_reader = csv.reader(inp, delimiter=',')
    for line in csv_reader:
        # list(...) is needed on Python 3, where map() returns an iterator
        new_list.append([list(map(float, line[:-1]))] + [list(map(float, line[-1:]))])

Demo from IPython,

In [1]: import csv 

In [2]: new_list = [] 

In [3]: with open('input.csv') as inp:
   ...:     csv_reader = csv.reader(inp, delimiter=',')
   ...:     for line in csv_reader:
   ...:         new_list.append([list(map(float, line[:-1]))] + [list(map(float, line[-1:]))])
   ...:

In [4]: new_list 
Out[4]:
[[[9143.680696,
    427953.5, 
    11919.104475, 
    11908.727555, 
    1.000871, 
    0.029506, 
    15.546608, 
    93.0, 
    121.0, 
    123.0], 
    [7.0]], 
[[7704.773182, 
    330297.5, 
    19186.759308, 
    19170.146116, 
    1.000867, 
    0.029426, 
    14.302257, 
    93.0, 
    121.0, 
    123.0], 
    [7.0]]]