2016-06-08 60 views
2

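Unescape string in Python

I have an input file containing a list of inputs, one per line. Each input line is enclosed in double quotes. An input sometimes contains backslashes or a few double quotes inside the enclosing double quotes (see the example below).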

Sample input:

"each line is enclosed in double-quotes" 
"Double quotes inside a \"double-quoted\" string!" 
"This line contains backslashes \\not so cool\\" 
"too many double-quotes in a line \"\"\"too much\"\"\"" 
"too many backslashes \\\\\\\"horrible\"\\\\\\" 

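I want to take the above input and simply convert the escaped double quotes in each line to a backtick (`).

I assume there is a simple one-line solution. I tried the following, but it doesn't work. Any other one-liner, or a fix for the code below, would be greatly appreciated.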

import re

def fix(line):
    return re.sub(r'\\"', '`', line)

It fails on input lines 3 and 5:

"each line is enclosed in double-quotes" 
"Double quotes inside a `double-quoted` string!" 
"This line contains backslashes \\not so cool\` 
"too many double-quotes in a line ```too much```" 
"too many backslashes \\\\\\`horrible`\\\\\` 

Every fix I can think of breaks other lines. Please help!

Answers

2

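This isn't quite what you asked for, since it replaces with " rather than `, but I'll mention it anyway: you can always use the csv module to convert the \" for you properly: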

>>> import csv
>>> for line in csv.reader(["each line is enclosed in double-quotes",
...       "Double quotes inside a \"double-quoted\" string!", 
...       "This line contains backslashes \\not so cool\\", 
...       "too many double-quotes in a line \"\"\"too much\"\"\"", 
...       "too many backslashes \\\\\\\"horrible\"\\\\\\", 
...       ]): 
...   print(line) 
...  
['each line is enclosed in double-quotes'] 
['Double quotes inside a "double-quoted" string!'] 
['This line contains backslashes \\not so cool\\'] 
['too many double-quotes in a line """too much"""'] 
['too many backslashes \\\\\\"horrible"\\\\\\'] 

If it matters that they are actual backticks, you can simply do a replace on the text returned by the csv module.
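A minimal sketch of that idea, assuming the lines are handed to csv.reader as raw text exactly as they appear in the file, and that backslash is the escape character. The escapechar and doublequote parameters are part of the csv module; the helper name unescape_lines is made up for illustration. Note that csv also drops the enclosing quotes and collapses each \\ into a single backslash, which may or may not be what is wanted.

```python
import csv

def unescape_lines(lines):
    """Parse backslash-escaped, double-quoted lines with csv, then swap " for `."""
    reader = csv.reader(
        (line.strip() for line in lines),  # drop trailing spaces and newlines
        quotechar='"',
        escapechar='\\',     # treat backslash as the escape character
        doublequote=False,   # quotes are escaped with a backslash, not doubled
    )
    # Each input line parses to a single field; skip rows from blank lines.
    return [row[0].replace('"', '`') for row in reader if row]

# Raw strings stand in for lines read from the input file.
raw_lines = [
    r'"Double quotes inside a \"double-quoted\" string!"',
    r'"This line contains backslashes \\not so cool\\"',
]
for text in unescape_lines(raw_lines):
    print(text)
```

If the enclosing quotes and the doubled backslashes have to be preserved, this approach is probably not the right fit, since csv unescapes everything in one pass.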

1

Add a + after the backslash:

return re.sub(r'\\+"', '`', line) 
+0

Still breaks for input line 3. – Bala
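Both regex attempts treat the \\" at the end of line 3 as an escaped quote, although it is really an escaped backslash followed by the closing quote. One possible way around this, offered as a sketch rather than a definitive fix, is to consume the line one escape sequence at a time (a backslash plus whatever character follows it) and replace only the \" pairs. It assumes the backslashes themselves should stay as they are and that the enclosing quotes are kept.

```python
import re

def fix(line):
    # Match a backslash together with the character after it, scanning left
    # to right, so that \\ is consumed as one pair and cannot "lend" its
    # second backslash to a following quote.
    # Only the pair \" becomes a backtick; any other pair is left unchanged.
    return re.sub(
        r'\\(.)',
        lambda m: '`' if m.group(1) == '"' else m.group(0),
        line,
    )

# Raw strings stand in for lines read from the input file.
print(fix(r'"This line contains backslashes \\not so cool\\"'))
print(fix(r'"too many backslashes \\\\\\\"horrible\"\\\\\\"'))
```

Because re.sub scans for non-overlapping matches from left to right, an even run of backslashes is consumed pair by pair and the quote after it is left alone, while an odd run ends in a \" pair that gets replaced.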