You cannot use the euro symbol in a variable name:
Identifiers (also referred to as names) are described by the following lexical definitions:
identifier ::= (letter|"_") (letter | digit | "_")*
letter ::= lowercase | uppercase
lowercase ::= "a"..."z"
uppercase ::= "A"..."Z"
digit ::= "0"..."9"
You will need to use a string instead:
df["price_€"] ...
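The grammar quoted above is the ASCII-only Python 2 definition. As a quick check, here is a minimal sketch that assumes Python 3, where `str.isidentifier()` reports whether a string is a legal name. The sample names are made up for illustration. Python 3 does accept non-ASCII letters in names, but `€` is a currency symbol rather than a letter, so it is still rejected:

```python
# Python 3: str.isidentifier() tells you whether a string is a legal name.
# The names below are hypothetical examples, not taken from the question.
names = ["price_eur", "price_€", "_total", "2nd_price"]

validity = {name: name.isidentifier() for name in names}

for name, ok in validity.items():
    print(name, ok)
```

Anything that fails this check can still be used as a column label, because a label is just a string passed to `df[...]`.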
pandas has no problem with the euro symbol for me:
import pandas as pd
df = pd.DataFrame([[1, 2]], columns=["£", "€"])
print(df["€"])
print(df["£"])
0 2
Name: €, dtype: int64
0 1
Name: £, dtype: int64
The file is CP1252-encoded, so you need to specify the encoding:
import pandas as pd
import codecs
df = pd.read_csv("PPR-2015.csv",header=0,encoding="cp1252")
print(df.columns)
Index([u'Date of Sale (dd/mm/yyyy)', u'Address', u'Postal Code', u'County',
u'Price (€)', u'Not Full Market Price', u'VAT Exclusive', u'Description of Property', u'Property Size Description'], dtype='object')
print(df[u'Price (€)'])
0 €138,000.00
1 €270,000.00
2 €67,000.00
3 €900,000.00
4 €176,000.00
5 €155,000.00
6 €100,000.00
7 €120,000.00
8 €470,000.00
9 €140,000.00
10 €592,000.00
11 €85,000.00
12 €422,500.00
13 €225,000.00
14 €55,000.00
...
17433 €262,000.00
17434 €155,000.00
17435 €750,000.00
17436 €96,291.69
17437 €112,000.00
17438 €350,000.00
17439 €190,000.00
17440 €25,000.00
17441 €100,000.00
17442 €75,000.00
17443 €46,000.00
17444 €175,000.00
17445 €48,500.00
17446 €150,000.00
17447 €400,000.00
Name: Price (€), Length: 17448, dtype: object
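To see why the encoding matters, here is a small standard-library sketch, assuming Python 3. CP1252 stores `€` as the single byte 0x80, which is not valid UTF-8 on its own, so a reader that assumes UTF-8 cannot recover the symbol. The `header` bytes are a hypothetical stand-in for the column name as it would sit in the file:

```python
# How the euro sign is stored in each encoding.
euro_bytes = "€".encode("cp1252")  # one byte: 0x80
utf8_bytes = "€".encode("utf-8")   # three bytes: 0xE2 0x82 0xAC

# A lone 0x80 byte is not valid UTF-8, so decoding it as UTF-8 fails.
try:
    euro_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Hypothetical raw header bytes as they would appear in a CP1252 file.
header = b"Price (\x80)"
decoded = header.decode("cp1252")

print(repr(euro_bytes), repr(utf8_bytes), utf8_ok, decoded)
```

Passing `encoding="cp1252"` to `read_csv` makes pandas do the same decoding for the whole file.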
Then convert to float:
df[u'Price (€)'] = df[u'Price (€)'].str.replace(ur'[€,]', '').astype('float')
print(df['Price (€)'.decode("utf-8")])
Output:
0 138000
1 270000
2 67000
3 900000
4 176000
5 155000
6 100000
7 120000
8 470000
9 140000
10 592000
11 85000
12 422500
13 225000
14 55000
...
17433 262000.00
17434 155000.00
17435 750000.00
17436 96291.69
17437 112000.00
17438 350000.00
17439 190000.00
17440 25000.00
17441 100000.00
17442 75000.00
17443 46000.00
17444 175000.00
17445 48500.00
17446 150000.00
17447 400000.00
Name: Price (€), Length: 17448, dtype: float64
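The snippet above is Python 2 (`u''` and `ur''` literals). For Python 3 with a recent pandas, a rough equivalent might look like the sketch below. It builds a tiny frame from three of the values shown above instead of reading the CSV, and it passes `regex=True` explicitly, since newer pandas versions treat the pattern as a literal string by default:

```python
import pandas as pd

# Three sample values copied from the printed column above.
df = pd.DataFrame({"Price (€)": ["€138,000.00", "€96,291.69", "€25,000.00"]})

# Strip the euro sign and thousands separators, then convert to float.
# regex=True makes pandas treat "[€,]" as a character class.
df["Price (€)"] = (
    df["Price (€)"].str.replace(r"[€,]", "", regex=True).astype(float)
)

prices = df["Price (€)"].tolist()
print(prices)
```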
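Do you mean that when you print the dataframe you see 'data_ '? If so, then your problem is the encoding. –
Hi Padraic, yes, when I print the frame I see 'price_ '. Is there a way to fix this, or do I need to change the input file manually? – Marcus
How are you creating the dataframe? –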