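Handling strings that contain commas in a CSV file
I am using this dataset, which is nicely formatted, https://raw.githubusercontent.com/jpatokal/openflights/master/data/airports.dat
to practice some text mining with Python, but some entries, such as: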
6898,"RAAF Williams, Laverton Base","Laverton","Australia",\N,"YLVT",-37.86360168457031,144.74600219726562,18,10,"O","Australia/Hobart","airport","OurAirports"
6899,"Nowra Airport","Nowra","Australia","NOA","YSNW",-34.94889831542969,150.53700256347656,400,10,"O","Australia/Sydney","airport","OurAirports"
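have commas inside their names. This makes the resulting lists irregular, because a single field (the name) gets split into several elements.
My code turns each line into a list: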
with open(filename) as txt:
    for line in txt:
        linea = line.split(',')
        linea[3] = linea[3].strip('"')
My main problem is that linea[3] should be the country in this case, Australia, but it returns Laverton.
I also tried the csv library, with almost no difference.
Also related to this: my code returns the following for that entry:
['6898', 'RAAF Williams, Laverton Base', 'Laverton', 'Australia', '\\N', 'YLVT', '-37.86360168457031', '144.74600219726562', '18', '10', 'O', 'Australia/Hobart', 'airport', 'OurAirports']
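Have you tried pandas read_csv? split(',') is simply not correct here. –
Your output does not match your problem description; 'Australia' is at index 3, just as you wanted. – timgeb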
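For comparison, here is a minimal sketch of the two approaches side by side, using the two rows quoted in the question. The names ROWS and parse_airports are made up for illustration and are not part of the original code. It suggests that the list shown in the question is what csv.reader produces, while the plain split(',') is what shifts the country to index 4.

```python
import csv

# The two sample rows from the question. Raw strings keep the literal \N marker intact.
ROWS = [
    r'6898,"RAAF Williams, Laverton Base","Laverton","Australia",\N,"YLVT",-37.86360168457031,144.74600219726562,18,10,"O","Australia/Hobart","airport","OurAirports"',
    r'6899,"Nowra Airport","Nowra","Australia","NOA","YSNW",-34.94889831542969,150.53700256347656,400,10,"O","Australia/Sydney","airport","OurAirports"',
]


def parse_airports(lines):
    """Parse CSV lines; a comma inside double quotes stays within its field."""
    return list(csv.reader(lines))


# Naive split: the comma inside the quoted name produces one extra element.
naive = ROWS[0].split(',')
print(len(naive), naive[3].strip('"'))   # 15 Laverton

# csv.reader honours the quoting, so every row has the same 14 fields.
parsed = parse_airports(ROWS)
print(len(parsed[0]), parsed[0][3])      # 14 Australia
print(parsed[0][1])                      # RAAF Williams, Laverton Base
```

With the real file, the same function can be given the file object directly, for example `with open(filename, newline='', encoding='utf-8') as txt: rows = parse_airports(txt)`. The encoding is an assumption about airports.dat.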