Comparing Unicode characters from user input with Unicode characters from a file

I have the following code, which reads a Unicode string from the user:

import sys, locale

print "Enter a nepali string"
split_string = raw_input().decode(sys.stdin.encoding or locale.getpreferredencoding(True))

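I already have some Unicode strings stored in a file. If one of those strings matches as a substring of the string the user entered, I need to split the input string. Suppose the file contains “सुर”: if the user enters “सुरक्षा”, I only want “क्षा” as output.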

with codecs.open("prefixnepali.txt","rw","utf-8") as prefix:
    for line in prefix:
        line=ud.normalize('NFC',line)
        if line in split_string:
            prefixy=split_string[len(line):len(split_string)]
            print prefixy
        else:
            print line

But when I run the program, I get

दि

सुर

रु

These are the Unicode strings in the file, printed when I enter “सुरक्षा” in the terminal. What is going wrong here?

Source

2015-08-19 Bishal Gautam

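The problem is probably simple: each line read from the file still has a newline character at its end, so it never matches the input. Use splitlines, as described in "Reading a file without newlines" and "Getting rid of \n when using .readlines()":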

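import codecs
import unicodedata as ud  # assumed: the question's "ud" alias refers to unicodedata

with codecs.open("prefixnepali.txt", "r", "utf-8") as prefix:
    # splitlines() drops the trailing newline from every line
    for line in prefix.read().splitlines():
        line = ud.normalize('NFC', line)
        if line in split_string:
            prefixy = split_string[len(line):]
            print prefixy
        else:
            print line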
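
To see why the trailing newline breaks the match, here is a minimal sketch. It is written for Python 3 (the code above is Python 2), and the two strings are hard-coded stand-ins for one line of the file and the user's input rather than values read from prefixnepali.txt.

```python
import unicodedata

# Stand-in for one line as read from the file: the trailing "\n" is still attached.
raw_line = "\u0938\u0941\u0930\n"                    # "सुर" plus a newline
# Stand-in for what the user typed.
user_input = "\u0938\u0941\u0930\u0915\u094d\u0937\u093e"  # "सुरक्षा"

# The newline is part of the string, so the substring test fails.
print(raw_line in user_input)      # False

# Strip the line ending and normalize both sides before comparing.
clean_line = unicodedata.normalize("NFC", raw_line.rstrip("\r\n"))
user_input = unicodedata.normalize("NFC", user_input)
print(clean_line in user_input)    # True
```

Normalizing the user input as well as the file line keeps both sides of the comparison in the same form.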

By the way, line in split_string looks for line anywhere inside split_string. If you want an exact prefix match, you should use split_string.find(line) == 0 or split_string[0:len(line)] == line.
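
The prefix check can also be written with str.startswith. The following is a rough sketch in Python 3, not the asker's actual program: strip_known_prefix is a hypothetical helper name, and the prefix list is hard-coded from the output shown in the question instead of being read from the file.

```python
import unicodedata

def strip_known_prefix(word, prefixes):
    """Return word without the first matching prefix, or None if none matches."""
    word = unicodedata.normalize("NFC", word)
    for prefix in prefixes:
        # strip() removes the trailing newline left over from reading the file.
        prefix = unicodedata.normalize("NFC", prefix.strip())
        # Skip blank lines: "" is a prefix of every string.
        if prefix and word.startswith(prefix):
            return word[len(prefix):]
    return None

# Hypothetical file contents ("दि", "सुर", "रु"), each with its newline.
prefixes = ["\u0926\u093f\n", "\u0938\u0941\u0930\n", "\u0930\u0941\n"]
remainder = strip_known_prefix("\u0938\u0941\u0930\u0915\u094d\u0937\u093e", prefixes)  # "सुरक्षा"
print(remainder)
```

This returns the remainder for the first prefix that matches, in file order. If the file can contain overlapping prefixes and the longest one should win, the list would need to be sorted by length first.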

Source

2015-08-19 09:03:37 Rishi
