I have a CSV of records: how do I create lists of records keyed by a pandas categorical index?
name,credits,email
bob,,[email protected]
bob,6.0,[email protected]
bill,3.0,[email protected]
bill,4.0,[email protected]
tammy,5.0,[email protected]
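where name is the index. Since there are multiple records with the same name, I'd like to roll each entire row (minus the name) into a list, to create JSON of the form: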
{
  "bob": [
    { "credits": null, "email": "[email protected]" },
    { "credits": 6.0, "email": "[email protected]" }
  ],
  // ...
}
My current solution is a bit kludgey, since it seems to use pandas only as a tool for reading the CSV, but it still produces the JSON-ish output I expect:
#!/usr/bin/env python3
import io
import pandas as pd
from pprint import pprint
from collections import defaultdict


def read_data():
    # Inline CSV so the example is self-contained
    s = """name,credits,email
bob,,[email protected]
bob,6.0,[email protected]
bill,3.0,[email protected]
bill,4.0,[email protected]
tammy,5.0,[email protected]
"""
    data = io.StringIO(s)
    return pd.read_csv(data)


if __name__ == "__main__":
    df = read_data()
    columns = df.columns
    index_name = "name"
    print(df.head())
    records = defaultdict(list)
    # Position of the index column within each row
    name_index = list(columns.values).index(index_name)
    columns_without_index = [column for i, column in enumerate(columns) if i != name_index]
    for record in df.values:
        name = record[name_index]
        # Drop the index field, then pair the remaining fields with their column names
        record_without_index = [field for i, field in enumerate(record) if i != name_index]
        remaining_record = {k: v for k, v in zip(columns_without_index, record_without_index)}
        records[name].append(remaining_record)
    pprint(dict(records))
Is there a way to do the same thing natively in pandas (and numpy)?
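Almost! It would be nice if I didn't have to explicitly list the columns after the "groupby", but I think that's simple enough. – erip
@erip, I've updated my post - please check... – MaxU
Perfect! Thanks a lot for your help! – erip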
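The answer these comments refer to is not included here, so what follows is only a minimal sketch of the groupby idea they mention, not necessarily what MaxU posted. It assumes a hypothetical helper name, records_by_index, derives the non-index columns from df.columns so they don't have to be listed by hand, and replaces NaN with None so that missing credits would serialize as null. The email values are kept exactly as they appear in the question.

```python
import io

import pandas as pd

# Same sample data as in the question (email values kept as shown there).
CSV = """name,credits,email
bob,,[email protected]
bob,6.0,[email protected]
bill,3.0,[email protected]
bill,4.0,[email protected]
tammy,5.0,[email protected]
"""


def records_by_index(df, index_name):
    """Hypothetical helper: map each index value to a list of row dicts."""
    # Every column except the index, so nothing is listed explicitly.
    value_columns = [c for c in df.columns if c != index_name]
    result = {}
    # sort=False keeps the groups in order of first appearance.
    for name, group in df.groupby(index_name, sort=False):
        rows = group[value_columns].to_dict(orient="records")
        # Replace NaN with None so json.dumps would emit null.
        result[name] = [
            {k: (None if pd.isna(v) else v) for k, v in row.items()}
            for row in rows
        ]
    return result


df = pd.read_csv(io.StringIO(CSV))
result = records_by_index(df, "name")
```

Passing result to json.dumps should then give output in the shape shown in the question. Whether this counts as "native" enough is a judgment call, since the NaN cleanup is still plain Python.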