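Merge two files with the same column name and different rows using pandas in Python
I have two data files, a.csv and b.csv.
The first file, a.csv, has 4 columns and some comments: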
# coating file for detector A/R
# column 1 is the angle of incidence (degrees)
# column 2 is the wavelength (microns)
# column 3 is the transmission probability
# column 4 is the reflection probability
14.2 531.0 0.0618 0.9382
14.2 532.0 0.07905 0.92095
14.2 533.0 0.09989 0.90011
14.2 534.0 0.12324 0.87676
14.2 535.0 0.14674 0.85326
14.2 536.0 0.16745 0.83255
14.2 537.0 0.1837 0.8163
#
# 171 lines, 5 comments, 166 data
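The second file, b.csv, has two columns and a different number of rows. One of its columns (the wavelength) is shared with a.csv: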
# Version 2.0 - nm, [email protected] to 1, burrows+2006c91.21_T1350_g4.7_f100_solar
# Wavelength(nm) Flambda(ergs/cm^s/s/nm)
300.0 1.53345164121e-32
300.1 1.53345164121e-32
300.2 1.53345164121e-32
# total lines = 20003, comment lines = 2, data lines = 20001
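Now I want to merge these two files on the common wavelength column, which is the second column of a.csv (the wavelengths should be the same in both files).
The output should look like: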
# coating file for detector A/R
# column 1 is the angle of incidence (degrees)
# column 2 is the wavelength (microns)
# column 3 is the transmission probability
# column 4 is the reflection probability
# Version 2.0 - nm, [email protected] to 1, burrows+2006c91.21_T1350_g4.7_f100_solar
# Wavelength(nm) Flambda(ergs/cm^s/s/nm)
14.2 531.0 0.0618 0.9382 1.14325276212
14.2 532.0 0.07905 0.92095 1.14557732058
Note: the comments are also merged.
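One possible way to carry the comments over is to read the leading `#` lines of each file separately and concatenate them. This is only a sketch: `read_header_comments` is a hypothetical helper name, the in-memory strings stand in for a.csv and b.csv, and it assumes that only the comments before the first data line are wanted (as in the desired output, where the trailing line-count comments do not appear).

```python
import io


def read_header_comments(handle):
    """Return the leading '#' lines of an open text file, without newlines."""
    comments = []
    for line in handle:
        if not line.startswith('#'):
            break  # first data line reached; trailing comments are ignored
        comments.append(line.rstrip('\n'))
    return comments


# Small in-memory stand-ins for a.csv and b.csv (assumed layout, shortened).
a_text = """# coating file for detector A/R
# column 2 is the wavelength (microns)
14.2 531.0 0.0618 0.9382
#
# 171 lines, 5 comments, 166 data
"""
b_text = """# Wavelength(nm) Flambda(ergs/cm^s/s/nm)
300.0 1.53345164121e-32
"""

# Header comments of a.csv first, then those of b.csv.
header = (read_header_comments(io.StringIO(a_text))
          + read_header_comments(io.StringIO(b_text)))
print('\n'.join(header))
```

With real files, `io.StringIO(a_text)` would be replaced by `open('a.csv')`.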
In the file b.csv, the wavelength 531.0 is at line number 2313.
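Before merging, it may help to confirm which wavelengths of a.csv actually occur in b.csv. The sketch below uses shortened in-memory stand-ins for both files (the sample rows and the list `a_waves` are assumptions for illustration) and relies on `Series.isin`, which compares the parsed float values exactly.

```python
import io
import pandas as pd

# Shortened stand-in for b.csv (assumed sample rows).
b_text = """# Wavelength(nm) Flambda(ergs/cm^s/s/nm)
300.0 1.5e-32
300.1 1.5e-32
531.0 1.1
532.0 1.2
"""
df2 = pd.read_csv(io.StringIO(b_text), sep=r'\s+', comment='#',
                  header=None, names=['wave', 'flux'])

# Wavelengths taken from a.csv (assumed subset for illustration).
a_waves = [531.0, 532.0, 533.0]

# Rows of b.csv whose wavelength also appears in a.csv.
mask = df2['wave'].isin(a_waves)
print(df2[mask])

# Wavelengths of a.csv that have no partner in b.csv.
missing = sorted(set(a_waves) - set(df2['wave']))
print('missing in b.csv:', missing)
```

If `missing` is not empty, an inner join on the wavelength would silently drop those rows of a.csv.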
How can we do this in Python?
My initial attempt is this:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author : Bhishan Poudel
# Date : Jun 17, 2016
# Imports
from __future__ import print_function
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# read in dataframes
#======================================================================
# read in a file
#
infile = 'a.csv'
colnames = ['angle', 'wave','trans','refl']
print('{} {} {} {}'.format('\nreading file : ', infile, '',''))
df1 = pd.read_csv(infile, sep=r'\s+', header=None, skiprows=0,
                  comment='#', names=colnames, usecols=(0, 1, 2, 3))
print('{} {} {} {}'.format('df.head \n', df1.head(),'',''))
#------------------------------------------------------------------
#======================================================================
# read in a file
#
infile = 'b.csv'
colnames = ['wave', 'flux']
print('{} {} {} {}'.format('\nreading file : ', infile, '',''))
df2 = pd.read_csv(infile, sep=r'\s+', header=None, skiprows=0,
                  comment='#', names=colnames, usecols=(0, 1))
print('{} {} {} {}'.format('df.head \n', df2.head(),'','\n'))
#----------------------------------------------------------------------
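# NOTE: this only stacks the rows of df2 under df1; it does not join on 'wave'.
# DataFrame.append was removed in pandas 2.0, so pd.concat is used instead.
result = pd.concat([df1, df2], ignore_index=True)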
print(result.head())
print("\n")
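The attempt above stacks the two frames instead of joining them. A minimal sketch of one alternative, assuming an inner join on the shared `wave` column is what is wanted, uses `pd.merge`. The in-memory strings are shortened stand-ins for a.csv and b.csv, with flux values taken from the desired output shown earlier. The header handling simply collects every `#` line of the stand-ins.

```python
import io
import pandas as pd

# Shortened stand-ins for a.csv and b.csv (assumed sample rows).
a_text = """# coating file for detector A/R
# column 2 is the wavelength (microns)
14.2 531.0 0.0618 0.9382
14.2 532.0 0.07905 0.92095
14.2 533.0 0.09989 0.90011
"""
b_text = """# Wavelength(nm) Flambda(ergs/cm^s/s/nm)
300.0 1.53345164121e-32
531.0 1.14325276212
532.0 1.14557732058
"""

df1 = pd.read_csv(io.StringIO(a_text), sep=r'\s+', comment='#', header=None,
                  names=['angle', 'wave', 'trans', 'refl'])
df2 = pd.read_csv(io.StringIO(b_text), sep=r'\s+', comment='#', header=None,
                  names=['wave', 'flux'])

# Inner join: keep only the wavelengths present in both files.
result = pd.merge(df1, df2, on='wave', how='inner')

# Collect the comment lines of both stand-ins to use as the output header.
header_lines = [ln for ln in (a_text + b_text).splitlines()
                if ln.startswith('#')]

# Write comments first, then the merged rows, space-separated and unlabeled.
out = io.StringIO()
out.write('\n'.join(header_lines) + '\n')
result.to_csv(out, sep=' ', header=False, index=False)
print(out.getvalue())
```

The row for 533.0 is dropped here because the stand-in for b.csv has no such wavelength; `how='left'` would keep it with a missing flux instead. To write to disk, a file opened with `open('merged.csv', 'w')` (a hypothetical output name) could take the place of `out`.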
Some useful links are the following:
How to merge data frame with same column names
http://pandas.pydata.org/pandas-docs/stable/merging.html