2017-04-06 273 views
0

I want to select a single column from a pandas DataFrame that I am reading from a CSV file:

tweets = pd.read_csv(r'C:\Users\PedroLuis\Documents\Manita\LASSO 20170219-20170402.csv', sep = " , ", engine='python') 
tweets = pd.DataFrame(tweets) 

When I list the columns, this is what I see:

list(tweets) 
Out: ['"","text","favorited","favoriteCount","replyToSN","created","truncated","replyToSID","id","replyToUID","statusSource","screenName","retweetCount","isRetweet","retweeted","longitude","latitude"'] 

I tried to select the second column by its name:

tweets['text'] 

But I get this error:

KeyError: 'text'

+0

That's strange. What happens when you try 'tweets.iloc[:, 1]'? PS: you don't need the 'tweets = pd.DataFrame(tweets)' line, because read_csv() already returns a DataFrame.

+0

What is the output of tweets.columns?

Answers

2

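The problem is the spaces in your sep=" , ". The fields in your file are separated by a bare comma, so the separator never matches and all the columns end up combined into one.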

Change it to

tweets = pd.read_csv(r'C:\Users\PedroLuis\Documents\Manita\LASSO 20170219-20170402.csv', sep = ",", engine='python') 

You should then be able to call tweets['text'].
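To see the effect of the separator without the original file, here is a minimal sketch on a small in-memory sample. Only the header names come from the question; the two data rows and the names SAMPLE, bad and good are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample: the header names are from the question,
# the two data rows are invented for illustration.
SAMPLE = (
    '"","text","favorited"\n'
    '"1","hello world","FALSE"\n'
    '"2","another tweet","TRUE"\n'
)

# sep=" , " (space, comma, space) never occurs in the data, so each
# line is kept whole and the header becomes one single column name.
bad = pd.read_csv(io.StringIO(SAMPLE), sep=" , ", engine="python")

# sep="," splits on the commas, so 'text' becomes a column of its own.
good = pd.read_csv(io.StringIO(SAMPLE), sep=",", engine="python")

print(list(bad.columns))   # one long header string
print(list(good.columns))  # separate column names
print(good["text"])        # selecting by name now works
```

With the first call, bad['text'] would raise the same KeyError as in the question, because the only column name is the whole header line.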

1

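If you look closely at the output of list(), you will see that it contains a single string enclosed in single quotes, with each header inside it enclosed in double quotes. That means pandas did not interpret this line the way you expected.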

Out: ['"","text","favorited","favoriteCount","replyToSN","created","truncated","replyToSID","id","replyToUID","statusSource","screenName","retweetCount","isRetweet","retweeted","longitude","latitude"'] 

whereas it should look like

Out: ['','text','favorited','favoriteCount','replyToSN','created','truncated','replyToSID','id','replyToUID','statusSource','screenName','retweetCount','isRetweet','retweeted','longitude','latitude'] 

I don't know what your input looks like, but as Niche.P said, cleaning up your separator argument may be a solution. Otherwise it could be an encoding issue.
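Since the input file is not shown, one way to check it is to print the raw header line before parsing and look at what actually separates the fields. This is only a sketch: the temporary file stands in for the real CSV, its contents are invented, and encoding="utf-8" is an assumption about the file.

```python
import os
import tempfile

# Hypothetical stand-in for the real CSV; the contents are invented.
sample = '"","text","favorited"\n"1","hello world","FALSE"\n'
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, encoding="utf-8"
) as f:
    f.write(sample)
    path = f.name

# Read only the first line, without any CSV parsing.
# encoding="utf-8" is an assumption; the real file may use another one.
with open(path, encoding="utf-8") as f:
    first_line = f.readline()

# repr() makes spaces, quotes and stray characters visible.
print(repr(first_line))

os.remove(path)  # clean up the temporary file
```

If the printed line shows bare commas between the quoted names, sep=',' is the separator to use.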