2017-07-31

For a pandas DataFrame column, TypeError: float() argument must be a string or a number

Here is the code, where 'LoanAmount', 'ApplicantIncome' and 'CoapplicantIncome' are of dtype object:

document=pandas.read_csv("C:/Users/User/Documents/train_u6lujuX_CVtuZ9i.csv") 


document.isnull().any() 
document = document.fillna(lambda x: x.median()) 

for col in ['LoanAmount', 'ApplicantIncome', 'CoapplicantIncome']: 
    document[col]=document[col].astype(float) 

document['LoanAmount_log'] = np.log(document['LoanAmount']) 
document['TotalIncome'] = document['ApplicantIncome'] + document['CoapplicantIncome'] 
document['TotalIncome_log'] = np.log(document['TotalIncome']) 

I get the following error when converting the object-typed columns to float:

TypeError: float() argument must be a string or a number 

Please help, as I need these features to train my classification model. Here is a snippet of the CSV file:

Loan_ID   Gender  Married  Dependents  Education     Self_Employed  ApplicantIncome  CoapplicantIncome  LoanAmount  Loan_Amount_Term  Credit_History  Property_Area  Loan_Status
LP001002  Male    No       0           Graduate      No             5849             0                              360               1               Urban          Y
LP001003  Male    Yes      1           Graduate      No             4583             1508               128         360               1               Rural          N
LP001005  Male    Yes      0           Graduate      Yes            3000             0                  66          360               1               Urban          Y
LP001006  Male    Yes      0           Not Graduate  No             2583             2358               120         360               1               Urban          Y

Can you add a snippet of the CSV file? Also add the error – Dark

The line number! @Bharathshetty –

@Bharathshetty The error occurs when fitting the training data to the classifier –

Answer

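In your code, document = document.fillna(lambda x: x.median()) does not compute a median: the lambda itself is passed as the fill value, so the result is a function rather than a number. A function cannot be converted to float; float() needs a numeric string or a number.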
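
To see why the conversion fails, here is a minimal sketch in plain Python. The name fill_value is hypothetical and stands for the lambda that was passed to fillna; float() rejects it the same way it rejects any function object:

```python
# Hypothetical stand-in for the lambda that was passed to fillna
fill_value = lambda x: x.median()

try:
    float(fill_value)  # a function is neither a string nor a number
except TypeError as exc:
    message = str(exc)

print(message)
```
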

Hope the following code helps:

# Compute the median once, then fill the missing LoanAmount values with that number
median = document['LoanAmount'].median()
document['LoanAmount'] = document['LoanAmount'].fillna(median)  # Or document = document.fillna(method='ffill')
for col in ['LoanAmount', 'ApplicantIncome', 'CoapplicantIncome']:
    document[col] = document[col].astype(float)

# Derived features used for training
document['LoanAmount_log'] = np.log(document['LoanAmount'])
document['TotalIncome'] = document['ApplicantIncome'] + document['CoapplicantIncome']
document['TotalIncome_log'] = np.log(document['TotalIncome'])
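
If the intent of the original lambda was to fill each numeric column with its own median, one possible sketch is below. The small DataFrame is a hypothetical stand-in that mirrors the four rows shown in the question, not the real CSV, and numeric_cols is an illustrative name:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV, mirroring the rows in the question
document = pd.DataFrame({
    "ApplicantIncome": [5849, 4583, 3000, 2583],
    "CoapplicantIncome": [0, 1508, 0, 2358],
    "LoanAmount": [np.nan, 128, 66, 120],
})

numeric_cols = ["LoanAmount", "ApplicantIncome", "CoapplicantIncome"]

# median() returns one number per column; fillna matches them by column label
medians = document[numeric_cols].median()
document[numeric_cols] = document[numeric_cols].fillna(medians).astype(float)
```

Here the missing LoanAmount in the first row is replaced by the median of the remaining three values.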
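
Now I get the following error - ValueError: Input contains NaN, infinity or a value too large for dtype('float32'). –

The code I provided works with my data. – Dark

Since you have a lot of columns, it is better to use 'ffill' and then fit the data. – Dark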
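
One caveat with forward fill, sketched under the assumption that the data looks like the snippet in the question: the first row has no LoanAmount, and a forward fill has no earlier value to copy into it, so a NaN can survive. The series below is a hypothetical stand-in for that column, and the backward fill is just one possible way to close the gap. The ffill() method is used here as the equivalent of fillna(method='ffill').

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the LoanAmount column from the question's snippet
loan_amount = pd.Series([np.nan, 128.0, 66.0, 120.0], name="LoanAmount")

# Forward fill cannot fill a missing value in the very first row
forward_filled = loan_amount.ffill()
still_missing = int(forward_filled.isnull().sum())

# Following up with a backward fill covers leading gaps
filled = loan_amount.ffill().bfill()

# Check for NaN and infinity before fitting a model
all_finite = bool(np.isfinite(filled).all())
```

Running a check like np.isfinite on every feature column before fitting shows which column still triggers the ValueError.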
