2015-07-19 71 views

Pandas DataFrame not working as expected

I'm using pandas on the table at this link:

http://sports.yahoo.com/nfl/stats/byposition?pos=QB&conference=NFL&year=season_2014&sort=49&timeframe=All 

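I'm trying to create a player object from each (relevant) row. So I want to go from the 3rd row through the end, using a bunch of different fields to build a player object, including name, team, passing yards, etc.

Here's my attempt: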

def getAllQBs():
    QBs = []
    table = pd.read_html(requests.get(QB_LINK).content)[5]
    finalTable = table[2:]
    print(finalTable)

    for row in finalTable.iterrows():
        print(row)
        name = row[0]
        team = row[1]
        passingYards = row[7]
        passingTouchdowns = row[10]
        interceptions = row[11]
        rushingYards = row[13]
        rushingTouchdowns = row[16]
        rushingFumbles = row[19]
        newQB = QB(name, team, rushingYards, rushingTouchdowns, rushingFumbles, passingYards, passingTouchdowns, interceptions)
        QBs.append(newQB)
        print(newQB.toString())
    return QBs

Passing yards is the 8th element from the left in the row, so I figured I could use row[7] to access it. However, when I run this function, I get:

Traceback (most recent call last):
  File "main.py", line 66, in <module>
    main()
  File "main.py", line 64, in main
    getAllQBs()
  File "main.py", line 27, in getAllQBs
    passingYards = row[7]
IndexError: tuple index out of range

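It looks as though I'm inadvertently working with columns. However, I'm using DataFrame.iterrows(), which I thought would take care of this...

Any ideas?

Thanks, bclayman

Answer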


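iterrows() yields tuples of the form (index, Series), where the Series is the row data you want to access. That is why row[7] raises IndexError: each tuple has only two elements. In this case, if the index is not meaningful, you can unpack it into a dummy variable, as shown below.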

for (_, row) in finalTable.iterrows(): 
    .....
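
To make the unpacking concrete, here is a minimal sketch that runs without the network. The small `table` below is a made-up stand-in for the scraped Yahoo table (the names, teams and numbers are placeholders, not real stats), and it assumes integer column labels, which is what the question's `row[7]` style of access implies:

```python
import pandas as pd

# Hypothetical stand-in for the scraped table: two junk header rows
# followed by the rows we care about. Column labels default to 0, 1, 2.
table = pd.DataFrame([
    ["junk", "junk", "junk"],
    ["Name", "Team", "Yds"],
    ["Player A", "AAA", "4000"],
    ["Player B", "BBB", "3500"],
])
final_table = table[2:]  # row slice: skip the first two rows

# Each item produced by iterrows() is a 2-tuple: (index label, row Series).
first = next(final_table.iterrows())
print(len(first))

rows = []
for _, row in final_table.iterrows():
    # row is a Series keyed by column label, so row[0] is the first column
    rows.append((row[0], row[1], row[2]))

print(rows)
```

With the tuple unpacked, `row[7]`, `row[10]` and so on in the original function should index into the row's Series rather than into the (index, Series) pair.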