2014-05-20 56 views
3

How do I read a Fortran fixed-width formatted text file in Python?

I have a text file in a Fortran fixed-width format (here are the first 3 lines):

00033+3251 A B  C?  6.96 5.480" 358 9.12 F0V 0.00  2.28s 1.00: 2MASS, dJ=1.3 
00033+3251 Aa Ab Aab S1,E 0.62 0.273m 0 9.28 F0V 11.28 K2  1.68* 0.32* SB 1469 
00033+3251 Aab Ac A E*  4.26 0.076" 0 9.12 F0V 0.00  2.00s 0.28* 2008MNRAS.383.1506 

and the file format description:

-------------------------------------------------------------------------------- 
Bytes Format Units Label  Explanations 
-------------------------------------------------------------------------------- 
1- 10 A10 ---  WDS  WDS(J2000) 
12- 14 A3 ---  Primary Designation of the primary 
16- 18 A3 ---  Secondary Designation of the secondary component 
20- 22 A3 ---  Parent Designation of the parent (1) 
24- 29 A6 ---  Type  Observing technique/status (2) 
31- 35 F5.2 d  logP  ? Logarithm (10) of period in days 
37- 44 F8.3 ---  Sep  Separation or axis 
    45 A1 ---  x_Sep  ['"m] Units of sep. (',",m) 
47- 49 I3 deg  PA  Position angle 
51- 55 F5.2 mag  Vmag1  V-magnitude of the primary 
57- 61 A5 ---  SP1  Spectral type of the primary 
63- 67 F5.2 mag  Vmag2  V-magnitude of the secondary 
69- 73 A5 ---  SP2  Spectral type of the secondary 
75- 79 F5.2 solMass Mass1  Mass of the primary 
    80 A1 ---  MCode1 Mass estimation code for primary (3) 
82- 86 F5.2 solMass Mass2  Mass of the secondary 
    87 A1 ---  MCode2 Mass estimation code for secondary (3) 
89-108 A20 ---  Rem  Remark 

How can I read this file in Python? The only thing I have found is the read_fwf function in the pandas library.

import pandas as pd 

filename = 'systems' 
# colspecs are half-open (start, end) intervals: start = first byte - 1, end = last byte
columns = ((0,10),(11,14),(15,18),(19,22),(23,29),(30,35),(36,44),(44,45),(46,49),(50,55),(56,61),(62,67),(68,73),(74,79),(79,80),(81,86),(86,87),(88,108)) 
data = pd.read_fwf(filename, colspecs = columns, header=None) 

Is this the only feasible and efficient way? I would like to be able to do this without pandas. Do you have any suggestions?

+0

What have you tried so far? Can you show us some code... –

+0

I have tried the pandas read_fwf function. It works, but I don't want to use an extra module in my program. I would like to solve my task with NumPy. – drastega

+0

Can you show us some code? –

Answers

2
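# (start, end) slice pairs: start = first byte - 1, end = last byte
columns = ((0,10),(11,14),(15,18),(19,22),(23,29),(30,35),
           (36,44),(44,45),(46,49),(50,55),(56,61),(62,67),
           (68,73),(74,79),(79,80),(81,86),(86,87),(88,108))
with open('systems') as file:  # file name taken from the question
    string = file.readline()
# slice the line into one string per field
dataline = [string[c[0]:c[1]] for c in columns]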

Note that the column indices are (startbyte-1, endbyte), so that a single-character field is, for example, (44,45).

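This leaves you with a list of strings. You will probably want to convert them to floats, integers, and so on. There are many questions here on that topic.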
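The conversion step could be sketched roughly as follows, using only the standard library. The names FIELDS, to_float, to_int and parse_line are hypothetical helpers made up for this illustration; the byte ranges and labels are copied from the format description in the question. Mapping blank numeric fields to None is an assumption about how missing values should be handled, not something the file format prescribes.

```python
def to_float(text):
    """Convert a fixed-width field to float; a blank field becomes None (assumption)."""
    text = text.strip()
    return float(text) if text else None


def to_int(text):
    """Convert a fixed-width field to int; a blank field becomes None (assumption)."""
    text = text.strip()
    return int(text) if text else None


# (label, first byte, last byte, converter), 1-based inclusive byte ranges
# exactly as listed in the format description.
FIELDS = [
    ("WDS", 1, 10, str.strip),
    ("Primary", 12, 14, str.strip),
    ("Secondary", 16, 18, str.strip),
    ("Parent", 20, 22, str.strip),
    ("Type", 24, 29, str.strip),
    ("logP", 31, 35, to_float),
    ("Sep", 37, 44, to_float),
    ("x_Sep", 45, 45, str.strip),
    ("PA", 47, 49, to_int),
    ("Vmag1", 51, 55, to_float),
    ("SP1", 57, 61, str.strip),
    ("Vmag2", 63, 67, to_float),
    ("SP2", 69, 73, str.strip),
    ("Mass1", 75, 79, to_float),
    ("MCode1", 80, 80, str.strip),
    ("Mass2", 82, 86, to_float),
    ("MCode2", 87, 87, str.strip),
    ("Rem", 89, 108, str.strip),
]


def parse_line(line):
    """Return a dict mapping each label to its converted value."""
    # Pad short lines so that slicing past the end yields blanks.
    line = line.rstrip("\r\n").ljust(108)
    # Byte range (first, last) becomes the Python slice [first - 1:last].
    return {name: conv(line[first - 1:last]) for name, first, last, conv in FIELDS}
```

Calling parse_line on each line of the file gives one dictionary per record. This only works if the file still has its original column alignment; lines whose whitespace has been collapsed will not parse correctly.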

+0

Single-column indices should be e.g. '(44,45)'. Otherwise they return an empty string. – Stefan

+0

Good catch. I'll fix it and add a note. – agentp

1

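There is FortranRecordReader from the fortranformat module, but it copes poorly with modern Fortran files that contain asterisks, comments, and the like. Still, for a well-behaved file it can be useful in combination with namedtuple. For example: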

from collections import namedtuple 
from fortranformat import FortranRecordReader 

# one reader per Fortran FORMAT string; it returns a list of 17 values per line
fline = FortranRecordReader('(a1,i3,i5,i5,i5,1x,a3,a4,1x,f13.5,f11.5,f11.3,f9.3,1x,a2,f11.3,f9.3,1x,i3,1x,f12.5,f11.5)') 
record = namedtuple('nucleo', 'cc NZ N Z A el o  massexcess uncmassex binding uncbind  B beta uncbeta am_int am_float uncatmass') 

# parse every line into a named record and keep them all
with open('AME2012.mas12.ff', 'r') as f: 
    nuclei = [record._make(fline.read(line)) for line in f] 

You could also try the "parse" module, or write your own.
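For the hand-written route, one standard-library option (not mentioned in the answers above, so treat it as a suggestion rather than the answerer's method) is the struct module, which can split a fixed-width record in a single call. The sketch below encodes the question's format description as a struct format string: "Ns" reads an N-byte field and "x" skips one blank byte. RECORD and split_record are hypothetical names, and the sketch assumes the file is plain ASCII.

```python
import struct

# Widths follow the byte ranges of the question's format description:
# 1-10, 12-14, 16-18, 20-22, 24-29, 31-35, 37-44, 45, 47-49, 51-55,
# 57-61, 63-67, 69-73, 75-79, 80, 82-86, 87, 89-108.
RECORD = struct.Struct(
    "10s x 3s x 3s x 3s x 6s x 5s x 8s 1s x 3s x 5s x 5s x 5s x 5s x 5s 1s x 5s 1s x 20s"
)


def split_record(line):
    """Split one text line into 18 stripped strings (assumes ASCII content)."""
    # unpack() needs exactly RECORD.size bytes, so pad short lines with
    # blanks and cut off anything beyond the last field.
    raw = line.rstrip("\r\n").encode("ascii").ljust(RECORD.size)[:RECORD.size]
    return [field.decode("ascii").strip() for field in RECORD.unpack(raw)]
```

The result is still a list of strings, so numeric fields need the same float or int conversion as with plain slicing. The main gain is that the layout is stated once, in a form close to the Fortran format itself.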
