Unicode string in Python

I have a Unicode string in Python:

(Pdb) email 
'\x00t\x00e\x00s\x00t\[email protected]\x00g\x00m\x00a\x00i\x00l\x00.\x00c\x00o\x00m\x00' 
(Pdb) print email 
[email protected]

I need to validate that this value is in email format, but how can I convert this string into an actual ASCII string?

Source

2013-11-09 CIF

It looks like the string is encoded as UTF-16.

>>> '\x00t\x00e\x00s\x00t\[email protected]\x00g\x00m\x00a\x00i\x00l\x00.\x00c\x00o\x00m\x00'.decode('utf-16') 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "C:\Python27\lib\encodings\utf_16.py", line 16, in decode 
    return codecs.utf_16_decode(input, errors, True) 
UnicodeDecodeError: 'utf16' codec can't decode byte 0x00 in position 28: truncated data

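The data is also truncated: it has an odd number of bytes. Dropping the first byte, or decoding as big-endian and ignoring the dangling last byte, makes it decode: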

>>> '\x00t\x00e\x00s\x00t\[email protected]\x00g\x00m\x00a\x00i\x00l\x00.\x00c\x00o\x00m\x00'[1:].decode('utf-16') 
u'[email protected]' 

>>> '\x00t\x00e\x00s\x00t\[email protected]\x00g\x00m\x00a\x00i\x00l\x00.\x00c\x00o\x00m\x00'[1:].decode('utf-16-le') 
u'[email protected]' 
>>> '\x00t\x00e\x00s\x00t\[email protected]\x00g\x00m\x00a\x00i\x00l\x00.\x00c\x00o\x00m\x00'.decode('utf-16-be', 'ignore') 
u'[email protected]'
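
To get from there to the validation the question asks about, a minimal sketch follows. It assumes the value really is big-endian UTF-16 with one stray trailing byte. The sample address `user@example.com`, the helper names `decode_utf16_be` and `looks_like_email`, and the deliberately loose regular expression are all illustrative assumptions, not part of the original post. It uses bytes literals so it should run on Python 2.7 and Python 3 alike.

```python
import re

# Hypothetical sample: big-endian UTF-16 plus one stray trailing NUL byte,
# mimicking the shape of the value shown in the question.
raw = u'user@example.com'.encode('utf-16-be') + b'\x00'


def decode_utf16_be(data):
    """Decode big-endian UTF-16 bytes, dropping an incomplete trailing byte."""
    if len(data) % 2:
        # Odd length: the last byte cannot form a full 16-bit code unit.
        data = data[:-1]
    return data.decode('utf-16-be')


# Loose shape check (an assumption, not a full RFC 5322 validator):
# something, an '@', something, a dot, something, with no whitespace.
EMAIL_RE = re.compile(r'^[^@\s]+@[^@\s]+\.[^@\s]+$')


def looks_like_email(text):
    return EMAIL_RE.match(text) is not None


email = decode_utf16_be(raw)

# Raises UnicodeEncodeError if the address contains non-ASCII characters.
ascii_email = email.encode('ascii')

print(email)
print(looks_like_email(email))
```

Trimming the odd byte explicitly, rather than passing 'ignore', keeps the decoder strict about everything else, so genuinely corrupt input still raises an error instead of being silently dropped.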

Source

2013-11-09 09:24:44 falsetru

It is truncated at one end or the other, depending on whether the data is little-endian or big-endian.

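If I had to guess at the cause of the truncation, I would blame something that splits on whitespace. Depending on byte order, the UTF-16 encoding of the space character is ' \x00' or '\x00 ', so a naive 'split' on the encoded bytes will cut a character in half. The same goes for any other ASCII whitespace, of course. You cannot safely split an encoded string, least of all UTF-16. 'strip' is just as likely to break things.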
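
The whitespace hypothesis can be reproduced with a small sketch. The sample text `a b` and the variable names are made up for illustration; nothing here is taken from the asker's actual code, and whether a split really caused the truncation in the question remains a guess.

```python
text = u'a b'
encoded = text.encode('utf-16-le')  # b'a\x00 \x00b\x00'

# Splitting the raw bytes on an ASCII space cuts the space's
# two-byte code unit in half and leaves its NUL byte behind.
pieces = encoded.split(b' ')
# pieces == [b'a\x00', b'\x00b\x00']

# The second piece now has a leading NUL and an odd length,
# so a strict decode fails with "truncated data".
try:
    pieces[1].decode('utf-16-le')
    second_piece_decodes = True
except UnicodeDecodeError:
    second_piece_decodes = False

# Safer order of operations: decode first, then split the text.
words = encoded.decode('utf-16-le').split()

print(pieces)
print(second_piece_decodes)
print(words)
```

The broken second piece has the same shape as the value in the question: a leading NUL byte and an odd byte count.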
