2015-04-17

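Pandas selects the wrong column

I'm new to pandas. I ran the following code on my titanic CSV file (which I got from a tutorial), and titanic.Sex returns the Sex column as expected.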

    import pandas as pd

    titanic = pd.read_csv('/Users/eflannery/Dropbox/titanic.csv')
    titanic.Sex
    0  male 
    1  female 
    2  female 
    3  female 
    4  male 
    5  male 
    6  male 
    7  male 
    8  female 
    9  female 
    10 female 
    11 female 
    12  male 
    13  male 
    14 female 
    ... 
    876  male 
    877  male 
    878  male 
    879 female 
    880 female 
    881  male 
    882 female 
    883  male 
    884  male 
    885 female 
    886  male 
    887 female 
    888 female 
    889  male 
    890  male 
    Name: Sex, Length: 891, dtype: object 

If I run the same code on one of my own CSV files, it does not pull out the cov column as expected. I have checked that my file is a CSV. Any ideas?

sam = pd.read_csv('/Users/eflannery/Dropbox/Cartika/CNV_data/samFlags_NK65EvolB20_baseCounts2.csv') 
sam.cov 
<bound method DataFrame.cov of    chrom pos cov 
0    berg02  1 0 
1    berg02  2 0 
2    berg02  3 0 
3    berg02  4 0 
4    berg02  5 2 
5    berg02  6 3 
6    berg02  7 3 
7    berg02  8 3 
8    berg02  9 4 
9    berg02  10 4 
10   berg02  11 4 
11   berg02  12 4 
12   berg02  13 4 
13   berg02  14 4 
14   berg02  15 4 
15   berg02  16 4 
16   berg02  17 5 
17   berg02  18 5 
18   berg02  19 5 
19   berg02  20 5 
20   berg02  21 5 
21   berg02  22 5 
22   berg02  23 5 
23   berg02  24 6 
24   berg02  25 6 
25   berg02  26 6 
26   berg02  27 6 
27   berg02  28 6 
28   berg02  29 6 
29   berg02  30 6 
...    ... ... ... 
18433379 PBANKA_API 30273 25 
18433380 PBANKA_API 30274 25 
18433381 PBANKA_API 30275 25 
18433382 PBANKA_API 30276 25 
18433383 PBANKA_API 30277 25 
18433384 PBANKA_API 30278 24 
18433385 PBANKA_API 30279 24 
18433386 PBANKA_API 30280 24 
18433387 PBANKA_API 30281 24 
18433388 PBANKA_API 30282 24 
18433389 PBANKA_API 30283 24 
18433390 PBANKA_API 30284 24 
18433391 PBANKA_API 30285 18 
18433392 PBANKA_API 30286 16 
18433393 PBANKA_API 30287 16 
18433394 PBANKA_API 30288 16 
18433395 PBANKA_API 30289 16 
18433396 PBANKA_API 30290 13 
18433397 PBANKA_API 30291 13 
18433398 PBANKA_API 30292 13 
18433399 PBANKA_API 30293 13 
18433400 PBANKA_API 30294 10 
18433401 PBANKA_API 30295 8 
18433402 PBANKA_API 30296 5 
18433403 PBANKA_API 30297 5 
18433404 PBANKA_API 30298 5 
18433405 PBANKA_API 30299 5 
18433406 PBANKA_API 30300 5 
18433407 PBANKA_API 30301 5 
18433408 PBANKA_API 30302 2 

[18433409 rows x 3 columns]> 

Please make the dataframes small, and include a sample of each source file. – dawg

Answer

3

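Using the . access syntax is convenient, but you've stumbled across the problem with it: when a method has the same name as a column, you get the method, not the column. Use dictionary-style access instead: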

>>> df 
    chrom pos cov 
0 berg02 1 0 
1 berg02 2 0 
2 berg02 3 1 
>>> df.cov 
<bound method DataFrame.cov of  chrom pos cov 
0 berg02 1 0 
1 berg02 2 0 
2 berg02 3 1> 
>>> df["cov"] 
0 0 
1 0 
2 1 
Name: cov, dtype: int64 
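To find out in advance which column names will be shadowed this way, something like the following might help. It is only a sketch: the three-row frame is the toy data from this answer rather than the real file, and the names `shadowed` and `coverage` are made up for illustration.

```python
import pandas as pd

# Toy frame matching the example above (not the asker's real data).
df = pd.DataFrame({
    "chrom": ["berg02", "berg02", "berg02"],
    "pos": [1, 2, 3],
    "cov": [0, 0, 1],
})

# Column names that collide with an existing DataFrame attribute or method.
shadowed = [name for name in df.columns if hasattr(pd.DataFrame, name)]
print(shadowed)

# Attribute access resolves to the bound method, not the column.
print(callable(df.cov))

# Bracket access always returns the column as a Series.
coverage = df["cov"]
print(coverage.tolist())
```

Any name that shows up in `shadowed` should be read with `df["name"]` rather than `df.name`.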

Or I suppose you could actually compute the covariance :-)

>>> df.cov() 
    pos  cov 
pos 1.0 0.500000 
cov 0.5 0.333333 
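If you do want the covariance, note that depending on the pandas version, calling cov() on a frame that still contains the string column chrom may raise an error instead of silently skipping it. A sketch that sidesteps this by selecting the numeric columns first, again on the toy frame (`numeric`, `cov_matrix` and `pos_cov` are illustrative names):

```python
import pandas as pd

# Toy frame matching the example above (not the asker's real data).
df = pd.DataFrame({
    "chrom": ["berg02", "berg02", "berg02"],
    "pos": [1, 2, 3],
    "cov": [0, 0, 1],
})

# Select the numeric columns explicitly so the string column is never involved.
numeric = df[["pos", "cov"]]

# Pairwise sample covariance matrix of the selected columns.
cov_matrix = numeric.cov()
print(cov_matrix)

# Covariance of a single pair of columns, via Series.cov.
pos_cov = numeric["pos"].cov(numeric["cov"])
print(pos_cov)
```

Note that the column has to be fetched as `numeric["cov"]` here too, for the same reason as above.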

Haha! Thanks a million!! – pinkvirus