2012-09-12 142 views

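Matching a character class multiple times in a string

I'm writing a short script to clean up folder and file names for upload to SharePoint. Since SharePoint is fussy, and some of its file name rules go beyond simply disallowed characters (for example, multiple consecutive periods are not allowed), regular expressions look like the way to go rather than replacing single characters. However, one expression that doesn't seem to be working is: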

[/<>*?|:"~#%&{}\\]+ 

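For a simple character class match I would have expected this to work fine, and it appears to do so in Notepad++. My expectation is that a string like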

St\r/|ng 

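would match \, / and | with the regex above. However, no matter what I do, I only get a match on the first backslash in the string, or on whichever character of that class it encounters first. This is being done with the Python re library. Does anyone know what the problem is here?

import os, sys, shutil, re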

def cleanPath(path): 
    #Compiling regex... 
    multi_dot = re.compile(r"[\.]{2,}") 
    start_dot = re.compile(r"^[\.]") 
    end_dot = re.compile(r"[\.]$") 
    disallowed_chars = re.compile(r'[/<>*?|:"~#%&{}\\]+') 
    dis1 = re.compile(r'\.files$') 
    dis2 = re.compile(r'_files$') 
    dis3 = re.compile(r'-Dateien$') 
    dis4 = re.compile(r'_fichiers$') 
    dis5 = re.compile(r'_bestanden$') 
    dis5b = re.compile(r'_file$')  # renamed: previously reused the name dis5 and overwrote the '_bestanden$' pattern
    dis6 = re.compile(r'_archivos$') 
    dis7 = re.compile(r'-filer$') 
    dis8 = re.compile(r'_tiedostot$') 
    dis9 = re.compile(r'_pliki$') 
    dis10 = re.compile(r'_soubory$') 
    dis11 = re.compile(r'_elemei$') 
    dis12 = re.compile(r'_ficheiros$') 
    dis13 = re.compile(r'_arquivos$') 
    dis14 = re.compile(r'_dosyalar$') 
    dis15 = re.compile(r'_datoteke$') 
    dis16 = re.compile(r'_fitxers$') 
    dis17 = re.compile(r'_failid$') 
    dis18 = re.compile(r'_fails$') 
    dis19 = re.compile(r'_bylos$') 
    dis20 = re.compile(r'_fajlovi$') 
    dis21 = re.compile(r'_fitxategiak$') 
    regxlist = [multi_dot,start_dot,end_dot,disallowed_chars,dis1,dis2,dis3,dis4,dis5,dis5b,dis6,dis7,dis8,dis9,dis10,dis11,dis12,dis13,dis14,dis15,dis16,dis17,dis18,dis19,dis20,dis21] 
    print("************************************\n\n"+path+"\n\n************************************\n") 
    for x in regxlist: 
        match = x.search(path) 
        if match: 
            print("\n") 
            print("MATCHED") 
            print(match.group()) 

    print("___________________________________________________________________________") 
    return path 


#testlist of conditions that should be found, some OK, some bad 
testlist = ["string","str....ing","str..ing","str.ing",".string","string.",".string.","$tring",r"st\r\ing","st/r/ing",r"st\r/|ng","/str<i>ng","str.filesing","string.files"] 
testlist_ans = ["OK","Match ....","Match ..","OK","Match .","Match .","Match . .","OK",r"Match \ ","Match /",r"Match \/|","Match/< >","OK","Match .files"] 
count = 0 
for i in testlist: 
    print(testlist_ans[count]) 
    count = count + 1 

    cleanPath(i) 

I don't think the problem is the regex, but rather the Python function being used. Can you show us the Python code? – KRyan

Answers

re.sub(pattern, new_txt, subject)  # replace all instances of pattern with new_txt
re.findall(pattern, subject)  # find all instances
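
As a minimal sketch of the difference, using the pattern and the test string from the question: search() returns only the first match, while findall() and sub() work on every match in the string. The "_" replacement is an assumption chosen for illustration; the question does not say what the disallowed characters should become.

```python
import re

# Same character class as in the question.
disallowed_chars = re.compile(r'[/<>*?|:"~#%&{}\\]+')

sample = r"St\r/|ng"

# search() stops at the first match, so only the backslash is reported.
first = disallowed_chars.search(sample)
print(first.group())                      # \

# findall() returns every non-overlapping match; "/|" is one run because of the +.
print(disallowed_chars.findall(sample))   # ['\\', '/|']

# sub() replaces every match; "_" is an arbitrary replacement (assumption).
print(disallowed_chars.sub("_", sample))  # St_r_ng
```

If the match positions matter as well, finditer() yields one match object per match instead of plain strings.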