2017-08-30

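Reformat some mailing lists according to an Excel file

I want to modify a bunch of mailing lists. Each mailing list contains a list of e-mail addresses (one per line), which I call the "old" addresses. For a given person, the old address is referenced alongside the new one in an .xlsx file. If an old address is not referenced there, it is obsolete and must be removed. Sometimes an e-mail address in a mailing list is already good. In that case it must be kept unchanged.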
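
To make the three cases concrete, here is a minimal sketch of the rule on its own, without any file handling. The addresses, the `old_to_new` dict and the `classify` helper are made up for illustration; only the domain `gooddomain.fr` comes from my actual setup.

```python
def classify(address, old_to_new, good_domain):
    """Return (outcome, address_to_write) for one address.

    outcome is 'kept', 'replaced' or 'removed'; address_to_write is None
    when the address must be dropped.
    """
    _, sep, domain = address.partition('@')
    if sep and domain == good_domain:
        return 'kept', address                     # already a good address
    if address in old_to_new:
        return 'replaced', old_to_new[address]     # referenced in the Excel file
    return 'removed', None                         # obsolete address


# Hypothetical data standing in for the Excel file and one mailing list.
old_to_new = {'alice@olddomain.fr': 'alice@gooddomain.fr'}
addresses = ['alice@olddomain.fr', 'bob@gooddomain.fr', 'carol@olddomain.fr']

outcomes = [classify(a, old_to_new, 'gooddomain.fr') for a in addresses]
new_list = [new for _, new in outcomes if new is not None]
print(outcomes)
print(new_list)
```

Here `alice` is replaced, `bob` is kept as is, and `carol` is dropped because she is not referenced.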

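I did it in Python. I did not really run into a problem, but I realized it was not that obvious, so I wanted to share my work. First, because it resembles some posts I have already seen, so it may be helpful; second, because my code is absolutely not optimized (I did not need to optimize it, since in my case it runs in about 0.5 seconds), and I would be curious to know how one would handle 10^8 mailing lists.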

Answer


Here is the Python code I finally implemented:

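import os

import xlrd  # note: xlrd 2.0 and later read only .xls files; reading .xlsx needs xlrd < 2.0

path_old = 'toto'  # directory containing the old mailing lists
path_new = 'tata'  # directory receiving the new mailing lists (must already exist)
mailing_lists = os.listdir(path_old)
good_domain = 'gooddomain.fr'
printing_level = 3  # 0 = silent, 3 = most verbose

# reading of the excel file
xlsfilename = 'adresses.xlsx'
xlsfile = xlrd.open_workbook(xlsfilename)
sheet = xlsfile.sheet_by_index(0)  # fetch the first sheet once instead of on every row
number_of_persons = 250
number_column_old_mail = 7
number_column_new_mail = 5
newmail = []
oldmail = []
for count in range(number_of_persons):
    oldmail.append(sheet.cell(count, number_column_old_mail).value)
    newmail.append(sheet.cell(count, number_column_new_mail).value)
############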

for mailinglist_name in mailing_lists:
    if printing_level > 0:
        print('* dealing with mailing list ', mailinglist_name)
    new_mailinglist = []
    new_name = mailinglist_name + '_new'

    with open(os.path.join(path_old, mailinglist_name), 'r') as inputfile:
        for line in inputfile:
            line = line.strip()  # drop the newline and any surrounding whitespace
            if not line:  # ignore blank lines
                continue
            ok = False

            # case 1: the address inside the old mailing list is ok ==> copied in the new mailing list
            if '@' in line:
                if line[line.index('@') + 1:] == good_domain:
                    new_mailinglist.append(line)
                    if printing_level > 1:
                        print(' --> address ', line, ' already ok ==> kept unmodified')
                    ok = True

            # case 2: the address inside the old mailing list is not ok ==> must be treated
            if not ok:
                if printing_level > 1:
                    print(' --> old address ', line, ' must be treated')
                try:
                    # case 2a: the old address is in the excel file ==> replaced
                    ind = oldmail.index(line)
                    if printing_level > 2:
                        print(' --> old address found in the excel file and replaced by ', newmail[ind])
                    new_mailinglist.append(newmail[ind])
                except ValueError:
                    # case 2b: the old address is obsolete ==> removed
                    if printing_level > 2:
                        print(' --> old address removed')

    with open(os.path.join(path_new, new_name), 'w') as outputfile:
        for address in new_mailinglist:
            outputfile.write(address + '\n')
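
On the question of scaling: `oldmail.index(line)` scans the whole list for every address, so one obvious thing to try is a dict built once from the two columns, plus writing each address as soon as it is known instead of collecting a list first. The following is only a sketch under those assumptions, not something I have benchmarked; `build_lookup`, `rewritten` and `rewrite_file` are names I made up, and the sample addresses are invented.

```python
def build_lookup(oldmail, newmail):
    """Map each old address to its new one (average O(1) lookups)."""
    lookup = {}
    for old, new in zip(oldmail, newmail):
        # setdefault keeps the first occurrence, like list.index does
        lookup.setdefault(old, new)
    return lookup


def rewritten(lines, lookup, good_domain):
    """Yield the addresses to keep, one at a time, from an iterable of lines."""
    for line in lines:
        address = line.strip()
        if not address:
            continue                      # blank line
        _, sep, domain = address.partition('@')
        if sep and domain == good_domain:
            yield address                 # case 1: already ok
        elif address in lookup:
            yield lookup[address]         # case 2a: replaced
        # case 2b: obsolete address, nothing is yielded


def rewrite_file(src, dst, lookup, good_domain):
    """Stream one mailing list from src to dst without holding it in memory."""
    with open(src) as inputfile, open(dst, 'w') as outputfile:
        for address in rewritten(inputfile, lookup, good_domain):
            outputfile.write(address + '\n')


# Small in-memory check with invented data (a duplicated old address included).
lookup = build_lookup(
    ['a@old.fr', 'a@old.fr', 'b@old.fr'],
    ['a@gooddomain.fr', 'ignored@gooddomain.fr', 'b@gooddomain.fr'],
)
sample = ['a@old.fr\n', '\n', 'c@gooddomain.fr\n', 'gone@old.fr\n']
result = list(rewritten(sample, lookup, 'gooddomain.fr'))
print(result)
```

In the script above, the loop body would then reduce to one `rewrite_file(...)` call per mailing list, with `lookup` built once from `oldmail` and `newmail`. For a very large number of lists the files are independent of each other, so they could also be processed in parallel, but I have not tried that.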