How do I create a list of records grouped by index in pandas?

I have a CSV of records:

name,credits,email 
bob,,[email protected] 
bob,6.0,[email protected] 
bill,3.0,[email protected] 
bill,4.0,[email protected] 
tammy,5.0,[email protected]

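where name is the index. Since multiple records share the same name, I'd like to roll each entire row (minus the name) into a list, producing JSON of the form: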

{ 
    "bob": [ 
     { "credits": null, "email": "[email protected]"}, 
     { "credits": 6.0, "email": "[email protected]" } 
    ], 
    // ... 
}

My current solution is a bit kludgey because it seems to use pandas only as a tool for reading the CSV, but it does produce my expected JSON-ish output:

#!/usr/bin/env python3 

import io 
import pandas as pd 
from pprint import pprint 
from collections import defaultdict 

def read_data(): 
    s = """name,credits,email
bob,,[email protected]
bob,6.0,[email protected]
bill,3.0,[email protected]
bill,4.0,[email protected]
tammy,5.0,[email protected]
"""

    data = io.StringIO(s) 
    return pd.read_csv(data) 

if __name__ == "__main__": 
    df = read_data() 
    columns = df.columns 
    index_name = "name" 
    print(df.head()) 

    records = defaultdict(list) 

    name_index = list(columns.values).index(index_name) 
    columns_without_index = [column for i, column in enumerate(columns) if i != name_index] 

    for record in df.values:
        name = record[name_index]
        record_without_index = [field for i, field in enumerate(record) if i != name_index]
        remaining_record = {k: v for k, v in zip(columns_without_index, record_without_index)}
        records[name].append(remaining_record)
    pprint(dict(records))

Is there a way to do the same thing natively in pandas (and numpy)?

Source

2017-07-25 erip

Is this what you want?

cols = df.columns.drop('name').tolist()

Or, following @jezrael's suggestion:

cols = df.columns.difference(['name'])
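The two expressions are not always interchangeable. As a small sketch (the column order below is made up to make the difference visible): `Index.drop` keeps the remaining columns in their original order, while `Index.difference` is a set operation that sorts its result by default. With the question's CSV both give `credits`, `email`, because those columns already happen to be in alphabetical order.

```python
import pandas as pd

# Hypothetical column order, chosen so that sorting changes the result.
columns = pd.Index(["name", "email", "credits"])

# drop() removes the label and leaves the remaining labels in place.
cols_drop = columns.drop("name").tolist()

# difference() is a set operation; by default it sorts the result when it can.
cols_diff = columns.difference(["name"]).tolist()

print(cols_drop)
print(cols_diff)
```

If the key order inside each record matters, `drop` is probably the safer choice.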

Then:

s = df.groupby('name')[cols].apply(lambda x: x.to_dict('records')).to_json()

To pretty-print it:

In [45]: print(json.dumps(json.loads(s), indent=2)) 
{
  "bill": [
    {
      "credits": 3.0,
      "email": "[email protected]"
    },
    {
      "credits": 4.0,
      "email": "[email protected]"
    }
  ],
  "bob": [
    {
      "credits": null,
      "email": "[email protected]"
    },
    {
      "credits": 6.0,
      "email": "[email protected]"
    }
  ],
  "tammy": [
    {
      "credits": 5.0,
      "email": "[email protected]"
    }
  ]
}
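For reference, the pieces above can be assembled into one self-contained script. This is a sketch rather than the answerer's exact session: the e-mail addresses are placeholders (the real ones are not visible on this page), and `to_dict` is called with the full orient name `'records'`.

```python
import io
import json

import pandas as pd

# Placeholder addresses: the real ones are not shown in the question.
CSV = """name,credits,email
bob,,bob@example.com
bob,6.0,bob@example.com
bill,3.0,bill@example.com
bill,4.0,bill@example.com
tammy,5.0,tammy@example.com
"""

df = pd.read_csv(io.StringIO(CSV))

# Every column except the grouping key.
cols = df.columns.drop("name").tolist()

# One list of row dicts per name, serialized to a JSON string.
s = df.groupby("name")[cols].apply(lambda g: g.to_dict("records")).to_json()

# Round-trip through json to get a plain dict and a readable dump.
result = json.loads(s)
print(json.dumps(result, indent=2))
```

Going through `to_json` is what turns the missing `credits` value into `null`. The parsed `result` therefore holds `None` for that field.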

Source

2017-07-25 12:53:12 MaxU

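Almost! It would be nice if I didn't have to explicitly list the columns after the `groupby`, but I think this is simple enough. – erip

@erip, I've updated my post - please check... – MaxU

Perfect! Thanks a lot for your help! – erip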
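On the point raised in the comments about listing columns after `groupby`: one way to avoid it, sketched here under the assumption that `name` is the only key, is to read `name` into the index and group on that index level. This is not necessarily what the updated answer does, and the e-mail addresses are placeholders.

```python
import io

import pandas as pd

# Placeholder addresses: the real ones are not shown in the question.
CSV = """name,credits,email
bob,,bob@example.com
bob,6.0,bob@example.com
bill,3.0,bill@example.com
bill,4.0,bill@example.com
tammy,5.0,tammy@example.com
"""

# index_col moves "name" out of the columns, so no column list is needed.
df = pd.read_csv(io.StringIO(CSV), index_col="name")

# Each group holds only the non-index columns; to_dict("records") ignores the index.
records = {
    name: group.to_dict("records")
    for name, group in df.groupby(level=0)
}

print(records)
```

Unlike the `to_json` route, this keeps the missing `credits` value as a float NaN in the Python dict, so it would need converting to `None` before `json.dumps` if strict JSON is required.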
