Python - merging data from two tables

I want to merge table data using Python (Python 3.4). My sample data is shown below, along with the result table I want to obtain.

[Table 1]

Name1 Name2 
AAAA XXXX 
BBBB YYYY 
CCCC ZZZZ

[Table 2]

Index1 Sample1 Sample2 Sample3 
AAAA 10 20 30 
BBBB 25 25 25 
CCCC 30 31 32 
XXXX 27 29 31 
YYYY 45 21 56 
ZZZZ 48 24 10

[Result table]

Index2 Sample1 Sample2 Sample3 
AAAA+XXXX 37 49 61 
BBBB+YYYY 70 46 81 
CCCC+ZZZZ 78 55 42

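Although this looks like a simple problem, I can't find a good solution: I'm new to Python and not familiar with its libraries. With SQL on a database it would probably be easy, but I want to solve it without a database. Does anyone have a good idea?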

Source

2015-10-07 ToBeSpecific

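I suggest first reading "Table 2" into a key-value data type such as a Python dictionary. That gives you your key-value pairs. Then you can parse the "Table 1" file to see which values to add together. – enpenax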
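The dictionary idea in this comment could be sketched roughly as follows. This is only an illustration, not the commenter's code: the helper names `parse_table` and `merge_tables` are made up here, and the tables are embedded as strings (an assumption) so the snippet runs without any files.

```python
# Sample data from the question, embedded as text (assumption: whitespace-separated).
TABLE1 = """
Name1 Name2
AAAA XXXX
BBBB YYYY
CCCC ZZZZ
"""

TABLE2 = """
Index1 Sample1 Sample2 Sample3
AAAA 10 20 30
BBBB 25 25 25
CCCC 30 31 32
XXXX 27 29 31
YYYY 45 21 56
ZZZZ 48 24 10
"""


def parse_table(text):
    """Hypothetical helper: split a table into (header, rows)."""
    rows = [line.split() for line in text.strip().splitlines()]
    return rows[0], rows[1:]


def merge_tables(table1_text, table2_text):
    """Hypothetical helper: sum the Table 2 rows named by each Table 1 pair."""
    _, pairs = parse_table(table1_text)
    header, rows = parse_table(table2_text)
    # Key-value pairs: index name -> list of integer samples.
    samples = {row[0]: [int(v) for v in row[1:]] for row in rows}
    merged = []
    for name1, name2 in pairs:
        sums = [a + b for a, b in zip(samples[name1], samples[name2])]
        merged.append(["{}+{}".format(name1, name2)] + sums)
    return ["Index2"] + header[1:], merged


header, merged = merge_tables(TABLE1, TABLE2)
print(" ".join(header))
for row in merged:
    print(" ".join(str(v) for v in row))
```

A name in Table 1 that is missing from Table 2 would raise `KeyError` here; whether to skip or report such rows depends on the real data.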

Take a look at [Pandas](http://pandas.pydata.org/), in particular the section on [DataFrame joining and merging](http://pandas.pydata.org/pandas-docs/stable/merging.html#database-style-dataframe-joining-merging). – Evert
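For illustration, a pandas version of this suggestion might look like the sketch below. It assumes pandas is installed and that the tables are whitespace-separated; the data is embedded as strings here only to keep the snippet self-contained, and the variable names are not from the thread.

```python
import io

import pandas as pd

TABLE1 = """Name1 Name2
AAAA XXXX
BBBB YYYY
CCCC ZZZZ
"""

TABLE2 = """Index1 Sample1 Sample2 Sample3
AAAA 10 20 30
BBBB 25 25 25
CCCC 30 31 32
XXXX 27 29 31
YYYY 45 21 56
ZZZZ 48 24 10
"""

# Read both tables, splitting columns on runs of whitespace.
table1 = pd.read_csv(io.StringIO(TABLE1), sep=r"\s+")
table2 = pd.read_csv(io.StringIO(TABLE2), sep=r"\s+")

sample_cols = [c for c in table2.columns if c != "Index1"]

# Database-style left joins: look up the samples for Name1 and for Name2.
left = table1.merge(table2, how="left", left_on="Name1", right_on="Index1")
right = table1.merge(table2, how="left", left_on="Name2", right_on="Index1")

# Both join results keep Table 1's row order, so they can be added row by row.
result = left[sample_cols] + right[sample_cols]
result.insert(0, "Index2", table1["Name1"] + "+" + table1["Name2"])

print(result.to_string(index=False))
```

With real files, `io.StringIO(...)` would be replaced by the file path.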

How is the table data stored? In a '.txt' file? – ZdaR

The following csv approach works for your sample data:

import csv 

with open('table2.txt', 'r') as f_table2: 
    csv_table2 = csv.reader(f_table2, delimiter=' ', skipinitialspace=True) 
    table2_header = next(csv_table2) 
    table2_data = {cols[0] : cols[1:] for cols in csv_table2} 

with open('table1.txt', 'r') as f_table1, open('output.csv', 'w', newline='\n') as f_output: 
    csv_table1 = csv.reader(f_table1, delimiter=' ', skipinitialspace=True) 
    table1_header = next(csv_table1) 
    csv_output = csv.writer(f_output) 
    csv_output.writerow(table2_header) 

    csv_output.writerows(
     ['{}+{}'.format(cols[0], cols[1])] + [int(x) + int(y) for x, y in zip(table2_data[cols[0]], table2_data[cols[1]])] for cols in csv_table1)

This gives you an output CSV file like the following:

Index1,Sample1,Sample2,Sample3 
AAAA+XXXX,37,49,61 
BBBB+YYYY,70,46,81 
CCCC+ZZZZ,78,55,42

Tested with Python 3.4.3

Source

2015-10-07 09:20:57

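The csv module seems very useful for handling txt files too. Your code works on my sample. I'd like to test both the csv approach and the pandas approach with a larger dataset. Thanks for your help. – ToBeSpecific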

If you want to work in pure Python (without third-party libraries such as numpy), it could be done this way:

class Entry:
    def __init__(self, index, sample1, sample2, sample3):
        self.index = index
        self.sample1 = sample1
        self.sample2 = sample2
        self.sample3 = sample3

    def __add__(self, other):
        return '{index2} {sample1} {sample2} {sample3}'.format(
            index2=self.index + '+' + other.index,
            sample1=self.sample1 + other.sample1,
            sample2=self.sample2 + other.sample2,
            sample3=self.sample3 + other.sample3,
        )


def read_table(path_to_data):
    def extract_body(content):
        return [e.strip().split(' ') for e in content[1:]]

    with open(path_to_data, 'r') as f:
        content = f.readlines()
    return extract_body(content)


content1 = read_table('data1.txt') 
content2 = read_table('data2.txt') 

entries = [Entry(e[0], int(e[1]), int(e[2]), int(e[3])) for e in content2] 

# output 
print('Index2 Sample1 Sample2 Sample3') 

for line in content1: 
    entry1 = next(e for e in entries if e.index == line[0]) 
    entry2 = next(e for e in entries if e.index == line[1]) 

    print(entry1 + entry2)
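One possible variation on the answer above, offered only as a sketch: `__add__` returns a new `Entry` instead of a formatted string, and a dictionary replaces the linear `next(...)` search. The `by_index` name and the inlined data are assumptions made so the snippet runs without files.

```python
class Entry:
    def __init__(self, index, samples):
        self.index = index
        self.samples = list(samples)

    def __add__(self, other):
        # Adding two entries yields a new Entry, so results can be reused.
        return Entry(
            self.index + '+' + other.index,
            [a + b for a, b in zip(self.samples, other.samples)],
        )

    def __str__(self):
        return ' '.join([self.index] + [str(s) for s in self.samples])


# Table 2 rows and Table 1 pairs from the question, inlined for illustration.
rows = [
    ('AAAA', [10, 20, 30]),
    ('BBBB', [25, 25, 25]),
    ('CCCC', [30, 31, 32]),
    ('XXXX', [27, 29, 31]),
    ('YYYY', [45, 21, 56]),
    ('ZZZZ', [48, 24, 10]),
]
pairs = [('AAAA', 'XXXX'), ('BBBB', 'YYYY'), ('CCCC', 'ZZZZ')]

# Dictionary lookup avoids scanning the whole entry list for every pair.
by_index = {name: Entry(name, values) for name, values in rows}
combined = [by_index[a] + by_index[b] for a, b in pairs]

print('Index2 Sample1 Sample2 Sample3')
for entry in combined:
    print(entry)
```

Storing the samples in a list also means the class no longer hard-codes exactly three sample columns.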

Source

2015-10-07 08:53:39 ewilazarus

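Since I'm new to Python, I wanted to use ready-made libraries as much as possible. Still, I was surprised that you used pure Python without any data-processing library, and I learned something about how to handle this kind of task. Thanks for your help. – ToBeSpecific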

@ToBeSpecific https://xkcd.com/353/ – ewilazarus

I guess I have to learn more to fly with Python. – ToBeSpecific
