2017-02-28 97 views
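pandas read_csv does not parse CSV file correctly

I want to read a CSV file and convert it into a DataFrame, but all the columns show up in the first row, and I cannot separate them whether I pass delimiter, pass sep, or pass neither. How should I change the code to get the correct result? Here is a line from the file: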

1330-5235-5560-xxxxx,"Jan 1, 2017",12:35:13 AM PST,,Charge,,Smart Plan (Calling & Texting),com.xxx,1,unlimited_usca_tariff_and,astar-y3,US,NC,27288,USD,4.99,0.950333,EUR,9.49 
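
As a quick check outside pandas, a single line can be tokenized with Python's built-in csv module. This is only a minimal sketch, assuming the file contains the line exactly as shown above; the variable names are illustrative.

```python
import csv

# The sample line as shown above (trailing space omitted).
line = ('1330-5235-5560-xxxxx,"Jan 1, 2017",12:35:13 AM PST,,Charge,,'
        'Smart Plan (Calling & Texting),com.xxx,1,unlimited_usca_tariff_and,'
        'astar-y3,US,NC,27288,USD,4.99,0.950333,EUR,9.49')

# csv.reader accepts any iterable of lines; the default dialect honors quotes.
fields = next(csv.reader([line]))

print(len(fields))  # number of fields found on the line
print(fields[1])    # the quoted date, kept together as one field
```

If a line taken from the real file produces a different field count, the file probably differs from the sample shown here, for example in how it is quoted.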

If you remove 'delimiter', is the problem the same? – jezrael

Because the default separator is sep=','. – jezrael

Yes, it is exactly the same. I have added the exact CSV line I tried. –

Answers

You need to set quoting=csv.QUOTE_NONE in read_csv:

import csv

import pandas as pd

# with QUOTE_NONE the quotes are ordinary characters, so the comma inside
# the quoted date splits it in two and the first field becomes the index
df = pd.read_csv('sample.csv', quoting=csv.QUOTE_NONE)

# join the two halves of the date back together
df['Transaction Date'] = df['Description'] + df['Transaction Date']
# restore the first column from the index
df['Description'] = df.index

# remove " from values
df['Description'] = df['Description'].str.strip('"')
df['Transaction Date'] = df['Transaction Date'].str.strip('"')
df['Amount (Merchant Currency)'] = (df['Amount (Merchant Currency)'].str.strip('"')
                                    .astype(float))

df = df.reset_index(drop=True)
print(df.head(1))


            Description Transaction Date Transaction Time  Tax Type  \
0  8330-5235-5560-88882       Jan 8 2084  82:35:83 AM PST       NaN

  Transaction Type  Refund Type                   Product Title Product id  \
0           Charge          NaN  Smart Plan (Calling & Texting)  com.fight

   Product Type              Sku Id  Hardware Buyer Country Buyer State  \
0             8  unlimited_usca_and  astar-y3            US          NC

   Buyer Postal Code Buyer Currency  Amount (Buyer Currency)  \
0              24288            USD                     9.99

   Currency Conversion Rate Merchant Currency  Amount (Merchant Currency)
0                   0.95028               EUR                        9.49
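
To see why the columns have to be re-joined after reading with QUOTE_NONE, here is a small self-contained sketch on an in-memory CSV. The three-column sample and its values are made up for illustration and are not the real file; only the column names Description and Transaction Date are taken from the code above.

```python
import csv
import io

import pandas as pd

# Hypothetical three-column sample; the real file has many more columns.
raw = 'Description,Transaction Date,Amount\nA-1,"Jan 1, 2017",9.49\n'

df = pd.read_csv(io.StringIO(raw), quoting=csv.QUOTE_NONE)

# The data row now has 4 fields but the header only 3, so pandas uses the
# first field as the index and the remaining values shift one column left.
print(df.index.tolist())                # the ID ended up in the index
print(df['Description'].tolist())       # first half of the date, with a leading quote
print(df['Transaction Date'].tolist())  # second half of the date, with a trailing quote

# Re-join the two halves, restore the ID column, and drop the stray quotes.
df['Transaction Date'] = (df['Description'] + df['Transaction Date']).str.strip('"')
df['Description'] = df.index
df = df.reset_index(drop=True)
print(df)
```

Note that the comma inside the date is consumed as a separator, so the re-joined value no longer contains it.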