2016-06-12 38 views
-2

Search for strings (from file1.txt) in file2.txt

file1.txt contains user names, i.e.

tony 
peter 
john 
... 

file2.txt contains the user details, with exactly one line of details per user, i.e.

alice 20160102 1101 abc 
john 20120212 1110 zjc9 
mary 20140405 0100 few3 
peter 20140405 0001 io90 
tango 19090114 0011 n4-8 
tony 20150405 1001 ewdf 
zoe 20000211 0111 jn09 
... 

I want to get from file2.txt the details of the short list of users given in file1.txt, i.e.

john 20120212 1110 zjc9 
peter 20140405 0001 io90 
tony 20150405 1001 ewdf 

How can I do this using Python?

+0

If you start each line with four spaces, it gets rendered as code - or you can use the {} button in the markdown editor to format the highlighted code. – AlBlue

+4

SO is neither a code-writing nor a tutorial service. Please learn [ask]. – jonrsharpe

+0

Please read about [Python files](https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files) and [strings](http://www.learnpython.org/en/Basic_String_Operations). If you run into errors while programming, ask about those. – ravigadila

Answers

0

You can use .split(' '), assuming there will always be a space between the name and the other information in file2.txt.

Here is an example:

UserList = []

with open("file1.txt", "r") as fuser:
    UserLine = fuser.readline()
    while UserLine != '':
        # Strip the newline and any trailing spaces from the user name.
        UserList.append(UserLine.strip())
        UserLine = fuser.readline()

InfoUserList = []
InfoList = []

with open("file2.txt", "r") as finfo:
    InfoLine = finfo.readline()
    while InfoLine != '':
        InfoList.append(InfoLine)
        line1 = InfoLine.split(' ')
        InfoUserList.append(line1[0])  # Take just the user name to compare it later
        InfoLine = finfo.readline()

for user in UserList:
    for i in range(len(InfoUserList)):
        if user == InfoUserList[i]:
            # The stored line still ends with a newline, so strip it before printing.
            print(InfoList[i].rstrip())
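
The same idea can be sketched more compactly with a set lookup, so that each line of file2.txt is checked once instead of once per user. This is only an illustration under assumptions: the helper name `shortlist` is hypothetical, and `io.StringIO` objects stand in for the two files so the snippet runs on its own. With real files, pass the objects returned by `open("file1.txt")` and `open("file2.txt")` instead.

```python
import io


def shortlist(user_lines, info_lines):
    """Return the lines of info_lines whose first field is a wanted user name."""
    # Collect the wanted names, ignoring blank lines and stray whitespace.
    wanted = {line.strip() for line in user_lines if line.strip()}
    result = []
    for line in info_lines:
        fields = line.split()  # split on any run of whitespace
        if fields and fields[0] in wanted:
            result.append(line.rstrip())
    return result


# In-memory stand-ins for file1.txt and file2.txt (sample data from the question).
file1 = io.StringIO("tony \npeter \njohn \n")
file2 = io.StringIO(
    "alice 20160102 1101 abc \n"
    "john 20120212 1110 zjc9 \n"
    "mary 20140405 0100 few3 \n"
    "peter 20140405 0001 io90 \n"
    "tango 19090114 0011 n4-8 \n"
    "tony 20150405 1001 ewdf \n"
    "zoe 20000211 0111 jn09 \n"
)

matches = shortlist(file1, file2)
for line in matches:
    print(line)
```

The matches come out in the order they appear in file2.txt, which is the order shown in the question.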
0
import pandas as pd

# df1.txt / df2.txt presumably correspond to file1.txt / file2.txt in the question
df1 = pd.read_csv('df1.txt', header=None)
df2 = pd.read_csv('df2.txt', header=None)
df1[0] = df1[0].str.strip()  # remove the whitespace that follows the field
df2 = df2[0].str.strip().str.split(' ', expand=True)  # strip trailing whitespace, then split into columns
df = df1.merge(df2)

Out[26]:
       0         1     2     3
0   tony  20150405  1001  ewdf
1  peter  20140405  0001  io90
2   john  20120212  1110  zjc9
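
For readers who want to try the merge approach without creating files, here is a small self-contained sketch. It assumes the sample data from the question, and the column names `user`, `date`, `flags` and `code` are made up for illustration (the original files have no header). The frames are built by splitting the lines in plain Python so that trailing spaces cannot produce extra columns.

```python
import pandas as pd

# Sample data from the question, including the trailing spaces.
users_txt = "tony \npeter \njohn \n"
details_txt = (
    "alice 20160102 1101 abc \n"
    "john 20120212 1110 zjc9 \n"
    "mary 20140405 0100 few3 \n"
    "peter 20140405 0001 io90 \n"
    "tango 19090114 0011 n4-8 \n"
    "tony 20150405 1001 ewdf \n"
    "zoe 20000211 0111 jn09 \n"
)

# One column of stripped user names.
users = pd.DataFrame({"user": [line.strip() for line in users_txt.splitlines()]})

# Four string columns; the column names are hypothetical.
details = pd.DataFrame(
    [line.split() for line in details_txt.splitlines()],
    columns=["user", "date", "flags", "code"],
)

# An inner merge keeps only users present in both frames,
# in the order of the left frame (users).
merged = users.merge(details, on="user", how="inner")

# Sort by name to get the alphabetical order shown in the question.
merged_sorted = merged.sort_values("user").reset_index(drop=True)
print(merged_sorted.to_string(index=False))
```

Because every field is kept as a string, values such as 0001 keep their leading zeros.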
0

You can use pandas:

import pandas as pd 

file1 = pd.read_csv('file1.txt', sep=' ', header=None)
file2 = pd.read_csv('file2.txt', sep=' ', header=None)

shortlist = file2.loc[file2[0].isin(file1.values.T[0])] 

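It will give you the following result (note that the third column is parsed as integers, so 0001 is displayed as 1):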

       0         1     2     3
1   john  20120212  1110  zjc9
3  peter  20140405     1  io90
5   tony  20150405  1001  ewdf

The result above is a DataFrame; to convert it back to an array, just use shortlist.values.
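
If the leading zeros matter, one option is to read every column as text by passing `dtype=str` to `read_csv`. The sketch below is a variant under assumptions, not the answer's exact code: it uses in-memory sample data without trailing spaces in place of the two files, and splits on runs of whitespace with `sep=r"\s+"` rather than on a single space.

```python
import io

import pandas as pd

# In-memory stand-ins for file1.txt and file2.txt (no trailing spaces assumed).
file1_txt = "tony\npeter\njohn\n"
file2_txt = (
    "alice 20160102 1101 abc\n"
    "john 20120212 1110 zjc9\n"
    "mary 20140405 0100 few3\n"
    "peter 20140405 0001 io90\n"
    "tango 19090114 0011 n4-8\n"
    "tony 20150405 1001 ewdf\n"
    "zoe 20000211 0111 jn09\n"
)

# dtype=str keeps every field as text, so 0001 is not turned into 1.
file1 = pd.read_csv(io.StringIO(file1_txt), sep=r"\s+", header=None, dtype=str)
file2 = pd.read_csv(io.StringIO(file2_txt), sep=r"\s+", header=None, dtype=str)

# Keep the rows of file2 whose first column appears in file1.
shortlist = file2.loc[file2[0].isin(file1[0])]
print(shortlist.to_string(header=False, index=False))
```

With real files, replace the `io.StringIO(...)` arguments with the file names.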
