Can't compare strings in Python

I have this code that is supposed to open and read two text files and match any word that exists in both. A match is indicated by printing "SUCCESS" and writing the word to the temp.txt file. listac.txt is formatted as

/teetetet
/eteasdsa
/asdasdfsa
/asdsafads
.
.
...etc

paths.txt is formatted as

/asdadasd.php/asdadas/asdad/asd
/adadad.html/asdadals/asdsa/asd
.
.
...etc

So I use the split function to get the first /asadasda (within paths.txt) before the dot:

dir = open('listac.txt','r')
path = open('paths.txt','r')
paths = path.readlines()
paths_size = len(paths)
matches = open('temp.txt','w')
dirs = dir.readlines()

for pline in range(0,len(paths)):
    for dline in range(0,len(dirs)):
        p = paths[pline].rstrip('\n').split(".")[0].replace(" ", "")
        dd = dirs[dline].rstrip('\n').replace(" ", "")
        #print p.lower()
        #print dd.lower()
        if (p.lower() == dd.lower()):
            print "SUCCESS\n"
            matches.write(str(p).lower() + '\n')

The problem is that the words never match. I have even printed out each comparison before each if statement, and they are equal. Is there something else that Python does before it compares strings?

=======

Thanks, everyone, for the help. As you suggested, I cleaned up the code so that it ended up like this:

dir = open('listac.txt','r') 
path = open('paths.txt','r') 
#paths = path.readlines() 
#paths_size = len(paths) 

for line in path:
    p = line.rstrip().split(".")[0].replace(" ", "")
    for lines in dir:
        d = str(lines.rstrip())
        if p == d:
            print p + " = " + d

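Apparently, declaring and initializing p before entering the second for loop made the difference in the comparison. When I declared both p and d inside the second for loop, it did not work. I don't know the reason for that, but if anyone does, I'm listening :)

Thanks again!

Source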

2012-09-11 user1663160

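In your example, there are no matches.

Too complicated. Don't loop over 'range(0, len(paths))'; just iterate over the list itself. And why 'rstrip('\n')'? There might be an extra '\r'. Just use 'rstrip()'. – Matthias

You can also move 'p = ...' outside the inner for loop, since it performs the same computation every time. – mgilson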
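
Matthias's point about a stray '\r' can be illustrated with a small sketch. The sample strings are made up, and it only assumes that one of the files has Windows-style '\r\n' line endings that were not translated when reading; the thread never confirms that this was the actual cause. The helper names are hypothetical, and the snippet is written for Python 3, unlike the Python 2 code in the thread.

```python
def strip_newline_only(line):
    # Mirrors the question's rstrip('\n'): a trailing '\r' survives.
    return line.rstrip("\n")


def strip_all_trailing_whitespace(line):
    # rstrip() with no argument removes '\r', '\n', spaces and tabs.
    return line.rstrip()


# Hypothetical lines: the same entry with Unix and Windows line endings.
unix_line = "/teetetet\n"
windows_line = "/teetetet\r\n"

naive_unix = strip_newline_only(unix_line)
naive_windows = strip_newline_only(windows_line)

# A plain print can make both values look identical on a terminal;
# repr() exposes the hidden carriage return.
print(repr(naive_unix), repr(naive_windows))

# The comparison fails with rstrip('\n') and succeeds with rstrip().
print(naive_unix == naive_windows)
print(strip_all_trailing_whitespace(unix_line)
      == strip_all_trailing_whitespace(windows_line))
```

Printing the repr() of both strings just before the comparison is a quick way to check whether invisible characters are what keeps two seemingly equal values from matching.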

I'd have to see more of the data set to tell why you aren't getting matches. I've refactored some of your code to make it more pythonic.

dirFile = open('listac.txt','r') 
pathFile = open('paths.txt','r') 
paths = pathFile.readlines() 
dirs = dirFile.readlines() 

matches = open('temp.txt','w') 

for pline in paths:
    p = pline.rstrip('\n').split(".")[0].replace(" ", "")
    for dline in dirs:
        dd = dline.rstrip('\n').replace(" ", "")
        #print p.lower()
        #print dd.lower()
        if p.lower() == dd.lower():
            print "SUCCESS\n"
            matches.write(str(p).lower() + '\n')

Source

2012-09-11 14:50:32 desimusxvii

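+1, but you could first convert 'dirs' into a set ('dirs = {line.lower() for line in dirFile}'), then check whether 'p.lower()' is in that set, and iterate over the files directly, avoiding 'readlines()' and all those 'rstrip()' calls entirely.

@TimPietzcker Sure. I would have written it completely differently. I thought it would be more helpful to ease a beginner into this. – desimusxvii

Since we're reading the data files entirely into memory anyway, why not try using sets and taking the intersection?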

def format_data(x): 
    return x.rstrip().replace(' ','').split('.')[0].lower() 

with open('listac.txt') as dirFile: 
    dirStuff = set(format_data(dline) for dline in dirFile) 

with open('paths.txt') as pathFile: 
    intersection = dirStuff.intersection(format_data(pline) for pline in pathFile) 

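# Open the output file here so that 'matches' is defined in this snippet.
with open('temp.txt', 'w') as matches:
    for elem in intersection:
        print "SUCCESS\n"
        matches.write(str(elem)+"\n")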

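I used the same format_data function for both data sets since they look about the same, but you can use separate functions if you like. Also note that this solution only reads one of the two files into memory. The intersection with the other one should be computed lazily.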
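
To make that last point concrete, here is a small self-contained sketch. It assumes in-memory stand-ins (io.StringIO) for the two files and made-up sample lines, and it is written for Python 3. It reuses format_data from above and relies on the fact that set.intersection accepts any iterable, so the second file can be fed in through a generator one line at a time, without first building a list of its lines.

```python
import io


def format_data(x):
    # Same normalisation as in the answer: trim, drop spaces,
    # keep the part before the first dot, lowercase.
    return x.rstrip().replace(' ', '').split('.')[0].lower()


# Hypothetical stand-ins for listac.txt and paths.txt.
dir_file = io.StringIO("/teetetet\n/eteasdsa\n/asdasdfsa\n")
path_file = io.StringIO(
    "/asdadasd.php/asdadas/asdad/asd\n"
    "/Eteasdsa.html/asdadals/asdsa/asd\n"
)

# Only the first file is materialised, as a set.
dir_stuff = set(format_data(line) for line in dir_file)

# The generator is consumed line by line while the intersection is built.
intersection = dir_stuff.intersection(
    format_data(line) for line in path_file
)

print(sorted(intersection))
```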

As pointed out in the comments, this makes no attempt to preserve order. However, if you really do need to preserve the order, try the following:

<snip> 
... 
</snip> 

with open('paths.txt') as pathFile:
    for line in pathFile:
        # format_data is the helper defined earlier in this answer
        if format_data(line) in dirStuff:
            print "SUCCESS\n"
            #...
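
Since part of the snippet above is elided, what follows is a separate, self-contained sketch of the same order-preserving idea, not a reconstruction of the elided lines. The helper name find_matches and the sample data are made up for illustration, io.StringIO objects stand in for the real files, and the code is written for Python 3.

```python
import io


def format_data(x):
    # Same normalisation as in the answer above.
    return x.rstrip().replace(' ', '').split('.')[0].lower()


def find_matches(dir_lines, path_lines):
    """Hypothetical helper: formatted path entries that also occur in
    dir_lines, in the order in which they appear in path_lines."""
    dir_stuff = set(format_data(line) for line in dir_lines)
    found = []
    for line in path_lines:
        candidate = format_data(line)
        if candidate in dir_stuff:  # set lookup, no inner loop needed
            found.append(candidate)
    return found


# Made-up stand-ins for listac.txt and paths.txt.
dir_file = io.StringIO("/asdasdfsa\n/teetetet\n/eteasdsa\n")
path_file = io.StringIO(
    "/teetetet.php/a/b\n"
    "/nomatch.html/c/d\n"
    "/asdasdfsa.html/e/f\n"
)

matches = find_matches(dir_file, path_file)
print(matches)
```

The result follows the order of the path lines, not the order of the directory list, because only the directory entries are stored in an unordered set.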

Source

2012-09-11 14:55:57 mgilson

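This will result in different output, though, since the order of the lines will be lost. That most likely doesn't matter.

I'd prefer 'a & b' over 'a.intersection(b)'. "What's in a and b" == "what's in a & b". – Alfe

Operators like ∩ aren't ASCII and therefore aren't part of Python yet ;-) – Alfe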
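
The two spellings discussed in these comments are not fully interchangeable, which a short sketch can show (the sample values are made up; Python 3). The '&' operator requires both operands to be sets, whereas the 'intersection' method accepts any iterable, which is what allows a generator over a file to be passed in directly.

```python
dirs = {"/teetetet", "/eteasdsa"}
paths = ["/eteasdsa", "/asdadasd"]  # a list, not a set

# The method form accepts any iterable.
via_method = dirs.intersection(paths)

# The operator form needs a set on both sides.
via_operator = dirs & set(paths)

# Mixing a set and a list with '&' raises TypeError.
try:
    dirs & paths
except TypeError as exc:
    operator_error = type(exc).__name__
else:
    operator_error = None

print(via_method == via_operator, operator_error)
```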
