2016-07-26
0

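Using Python to process a CSV data file

I want to work with NBA data, which is why I need to make this comparison: I need the percentage of home wins. But it cannot convert the strings to int.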

results["HomeWin"]=int(results["Home Team"])<int(results["OT?"]) 
y_true=results["HomeWin"].values 
print("Home win percentage is{0:.1f}%".format(100*results["HomeWin"].sum()/results["HomeWin"].count())) 

The error is: cannot convert the series to <type 'int'>

+1

Use 'astype(int)': 'results["HomeWin"] = results["Home Team"].astype(int)' – EdChum

Answer

1

You need Series.astype to convert the numeric strings to int:

results["HomeWin"] = results["Home Team"].astype(int) < results["OT?"].astype(int) 

Sample:

import pandas as pd 

results = pd.DataFrame({'Home Team':['1','2','3'], 
        'OT?':['4','2','1']}) 

print (results) 
  Home Team OT?
0         1   4
1         2   2
2         3   1


results["HomeWin"] = results["Home Team"].astype(int) < results["OT?"].astype(int) 
print (results) 
  Home Team OT?  HomeWin
0         1   4     True
1         2   2    False
2         3   1    False
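
To connect this back to the question's goal, here is a minimal sketch of computing the home-win percentage from the boolean column. The data is the same illustrative three-row frame as in the sample, not real NBA results, and the variable name home_win_pct is only an assumption for this example.

```python
import pandas as pd

# Illustrative data only: numeric strings, as in the sample above.
results = pd.DataFrame({'Home Team': ['1', '2', '3'],
                        'OT?': ['4', '2', '1']})

# astype(int) converts each element; the built-in int() cannot take a whole Series.
results["HomeWin"] = results["Home Team"].astype(int) < results["OT?"].astype(int)

# The mean of a boolean column is the fraction of True values.
home_win_pct = 100 * results["HomeWin"].mean()
print("Home win percentage is {0:.1f}%".format(home_win_pct))
```

Using mean() on the boolean column is equivalent to sum() divided by count() when there are no missing values, and it avoids integer division under Python 2.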
+0

It gives the error: invalid literal for long() with base 10: 'November' –

+0

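The problem is that some fields contain string data, such as 'November', that cannot be converted to a number. You can check for the problematic values with:
print(results.loc[pd.to_numeric(results["Home Team"], errors='coerce').isnull() | pd.to_numeric(results["OT?"], errors='coerce').isnull(), ['Home Team', 'OT?']])
– jezrael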
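
The check described in this comment can be sketched step by step as follows. The four-row frame is hypothetical data made up for illustration, and the names home, ot, bad and bad_rows are assumptions of this example rather than anything from the original dataset.

```python
import pandas as pd

# Hypothetical data containing strings that are not numbers.
results = pd.DataFrame({'Home Team': ['1', '2', '3', 'November'],
                        'OT?': ['4', 'September', '1', '5']})

# errors='coerce' turns every value that cannot be parsed into NaN
# instead of raising an exception.
home = pd.to_numeric(results["Home Team"], errors='coerce')
ot = pd.to_numeric(results["OT?"], errors='coerce')

# A row is problematic if either column failed to parse.
bad = home.isnull() | ot.isnull()
bad_rows = results.loc[bad, ['Home Team', 'OT?']]
print(bad_rows)
```

Printing the original strings for the flagged rows, rather than the coerced values, shows exactly which text caused the conversion to fail.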

+0

You can reproduce it with:
results = pd.DataFrame({'Home Team': ['1', '2', '3', 'November'], 'OT?': ['4', 'September', '1', '5']})
– jezrael
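
Once the offending rows are known, one possible way to proceed is to drop them before comparing. This is only a sketch: whether dropping rows is appropriate depends on why the text is there (for example, misaligned CSV columns would need a different fix), and the names cols, numeric and clean are assumptions of this example.

```python
import pandas as pd

# Hypothetical data with two rows that contain non-numeric text.
results = pd.DataFrame({'Home Team': ['1', '2', '3', 'November'],
                        'OT?': ['4', 'September', '1', '5']})

cols = ['Home Team', 'OT?']

# Convert both columns; unparseable values become NaN.
numeric = results[cols].apply(pd.to_numeric, errors='coerce')

# Keep only rows where both values parsed, then cast back to int.
clean = numeric.dropna().astype(int)

clean["HomeWin"] = clean["Home Team"] < clean["OT?"]
print(clean)
```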