2017-03-22 85 views
1

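Copying files - Python loop

Good afternoon. I have some code here that is meant to copy about 200 CSVs from a source folder into destination folders based on the "sector" each file belongs to. The sector is identified through a "symbols list", a set of csvs that each contain a list of tickers. The code works for the most part, except that it takes the ticker list from the last csv in the symbols list and copies all of those CSVs into every destination folder. Basically I need to somehow combine the first for loop with the 3 for loops below it, but I am having a hard time doing that. Any advice would be greatly appreciated.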

import os,sys,shutil 
import glob 
import pandas as pd 

source_dir = r'C:\TS'
dest_dir = r'C:\TS\Combined\Groups\Cross Asset Class'
#dest_dir = r'C:\TS\Combined\copytest'
base = r'C:\TS\Combined\Groups'

dest_dirlist = (base + '/Cross Asset Class', base + '/Bonds', base + '/Commodities',
                base + '/Countries', base + '/Currencies', base + '/Industry Sectors',
                base + '/Segments and Styles', base + '/Us Sectors', base + '/Volatilities')
print(dest_dirlist)

symbolslist = (base + '/Cross Asset Class.csv', base + '/Bonds.csv', base + '/Commodities.csv',
               base + '/Currencies.csv', base + '/Industry Sectors.csv',
               base + '/Segments and Styles.csv', base + '/US Sectors.csv')


for file in symbolslist: 
    print(file) 
    df_symbolslist = pd.read_csv(file) 
    print(df_symbolslist) 


for ticker_file in glob.glob(os.path.join(source_dir, '*.csv*')):
    for ticker in df_symbolslist['Ticker']:
        print(ticker)
        if ticker in ticker_file:
            for path in dest_dirlist:
                shutil.copy(ticker_file, path)
                print(ticker + ' File Copied')

Thank you very much for your time.

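`df_symbolslist = pd.read_csv(file)` keeps overwriting `df_symbolslist`, so you only ever process the last one. I'm confused about the format here... what do the symbol list csvs look like? How do they encode the sector and the symbol? – tdelaney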
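
The overwriting described in this comment can be reproduced in isolation. The following is only a minimal sketch: it uses the standard `csv` module on hypothetical in-memory text rather than `pd.read_csv` on the real files, and the sector names and tickers (`AAA`, `BBB`, `CCC`) are made up for illustration. Only the `Ticker` column name comes from the question.

```python
import csv
import io

# Hypothetical stand-ins for the symbol-list files; the real ones live on disk.
# Only the 'Ticker' column name is taken from the question.
symbol_files = {
    "Bonds": "Sector,Ticker\nBonds,AAA\nBonds,BBB\n",
    "Commodities": "Sector,Ticker\nCommodities,CCC\n",
}


def read_tickers(text):
    """Return the values of the 'Ticker' column from CSV text."""
    return [row["Ticker"] for row in csv.DictReader(io.StringIO(text))]


# Pattern from the question: the name is rebound on every pass,
# so only the last file's tickers are left once the loop ends.
for name, text in symbol_files.items():
    last_only = read_tickers(text)

# One way around it: store each result under its own key.
tickers_by_sector = {name: read_tickers(text)
                     for name, text in symbol_files.items()}

print(last_only)
print(tickers_by_sector)
```

After the loop, `last_only` holds only the tickers from the final file, while `tickers_by_sector` keeps one list per sector, which is what the copy step needs.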

This is what the symbol list csv looks like: http://prntscr.com/en97is – user7669093

This is what the csvs to be copied look like: http://prntscr.com/en98d2 – user7669093

Answers

0

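If I understand the problem correctly, I think you should take a slightly different approach. You can create a dict that maps each ticker to the list of asset classes it belongs to, and then use that map to copy the files. I don't think pandas helps you here; you can use the standard csv module to build the map row by row.

I haven't tested this code because I don't have the right data set, but consider doing something like this: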

import os,sys,shutil 
import glob 
import csv 
import collections 

source_dir = r'C:\TS' 
dest_dir = r'C:\TS\Combined\Groups\Cross Asset Class' 
#dest_dir = r'C:\TS\Combined\copytest' 
base = r'C:\TS\Combined\Groups' 

# asset classes of interest 
asset_classes = ('Cross Asset Class', 'Bonds', 'Commodities', 
    'Countries', 'Currencies', 'Industry Sectors', 'Segments and Styles', 
    'Us Sectors', 'Volatilities') 

# asset class directories indexed by class 
dest_dir_index = {asset_class.upper():os.path.join(base, asset_class) 
    for asset_class in asset_classes} 
print(dest_dir_index) 

# make sure destination dirs exist 
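for dir_name in dest_dir_index.values():
    # makedirs also creates missing parent directories; exist_ok avoids
    # an error when the directory is already there
    os.makedirs(dir_name, exist_ok=True)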

# dict that creates key:[classes...] item when first accessed, used to keep 
# list of asset classes for each ticker symbol. 
ticker_to_class_index = collections.defaultdict(list) 

for asset_class in asset_classes:
    symbolcsv = "{}.csv".format(os.path.join(base, asset_class))
    print(symbolcsv)
    with open(symbolcsv, newline='') as fp:
        reader = csv.reader(fp)
        next(reader)  # skip header row
        # the unpacking assumes each row has exactly two columns
        for sectors, ticker in reader:
            ticker_to_class_index[ticker.upper()].append(asset_class)

# split ticker out of csv filenames and copy file to all asset classes 
# mapped for that ticker. 
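for ticker_file in glob.glob(os.path.join(source_dir, '*.csv*')):
    # splitext returns a (root, ext) tuple, so take element 0 before upper()
    ticker = os.path.splitext(os.path.basename(ticker_file))[0].upper()
    print(ticker)
    # .get avoids adding empty entries to the defaultdict for unknown tickers
    for asset_class in ticker_to_class_index.get(ticker, ()):
        # dest_dir_index is keyed by the upper-cased class name
        dest_dir = dest_dir_index[asset_class.upper()]
        shutil.copy(ticker_file, dest_dir)
        print("{} copied to {}".format(ticker_file, dest_dir))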

Thanks a lot, tdelaney. Let me try it. Also, here is a Dropbox link showing what my csvs look like: https://www.dropbox.com/s/iqbt3pa6qiz70vg/Cross%20Asset%20Class.csv?dl=0 – user7669093

Here is the type of file I want to copy: https://www.dropbox.com/s/vusdpn8oizcda82/XLY.csv?dl=0 – user7669093

Gave it a shot, but it doesn't seem to like the `upper` on lines 49 or 51. It gives me a tuple error: `ticker = os.path.splitext(os.path.basename(ticker_file)).upper()` AttributeError: 'tuple' object has no attribute 'upper' – user7669093
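
The AttributeError reported in this comment is consistent with how `os.path.splitext` behaves: it returns a `(root, ext)` tuple rather than a string, so `.upper()` has to be called on the first element. A minimal sketch follows; the path is hypothetical and simply modelled on the `XLY.csv` file linked in the thread.

```python
import os

# Hypothetical path, modelled on the XLY.csv file linked in the thread.
ticker_file = os.path.join("TS", "xly.csv")

# splitext splits "xly.csv" into a (root, extension) tuple.
parts = os.path.splitext(os.path.basename(ticker_file))
root, ext = parts

# Upper-case the root only; calling parts.upper() would raise AttributeError.
ticker = root.upper()

print(parts)
print(ticker)
```

Indexing with `[0]` instead of unpacking works equally well, for example `os.path.splitext(os.path.basename(ticker_file))[0].upper()`.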