Print encoded string

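I am developing my own MP3 decoder in Python, but I'm a bit stuck on decoding the ID3 tags. I don't want to use existing libraries such as mutagen or eyeD3; I want to follow the ID3v2 specification myself.

The problem is that the frame data is in some format that I can't print. Using the debugger I can see that the value is "Hideaway", but it is preceded by a strange character, as you can see here: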

'data': '\x00Hideaway'

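I have the following questions: What kind of encoding is that? How can I decode and print that string? Do you think other MP3 files use different encodings in their ID3 tags?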

By the way, I have the UTF-8 declaration

# -*- coding: utf-8 -*-

at the top of my file, and I read the file using Python's normal I/O methods (file.read()).

Source

2014-06-24 Jorge Zapata

The character \x00 represents a single byte with the value zero, sitting just before the H. So your string looks like this:

Zero - H - i - d - e ...

Strings normally contain letters or digits, not zero bytes. Perhaps this usage is specific to ID3v2?

Consulting the ID3v2 standard (http://id3.org/id3v2.4.0-structure), we see that it is:
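To check this yourself, you can look at the raw byte values. This is a minimal sketch, assuming the frame data is held as a byte string exactly as shown in the question; the variable names are only illustrative.

```python
# Frame payload as shown in the question (assumed to be raw bytes).
data = b'\x00Hideaway'

# bytearray yields integer byte values on both Python 2 and Python 3.
byte_values = list(bytearray(data))

print(byte_values[0])  # 0, the leading zero byte
print(data[1:])        # the remaining bytes, i.e. the text itself
```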

Frames that allow different types of text encoding contains a text 
encoding description byte. Possible encodings: 

$00 ISO-8859-1 [ISO-8859-1]. Terminated with $00. 
$01 UTF-16 [UTF-16] encoded Unicode [UNICODE] with BOM. All 
     strings in the same frame SHALL have the same byteorder. 
     Terminated with $00 00. 
$02 UTF-16BE [UTF-16] encoded Unicode [UNICODE] without BOM. 
     Terminated with $00 00. 
$03 UTF-8 [UTF-8] encoded Unicode [UNICODE]. Terminated with $00.

So the leading zero byte indicates that the text is ISO-8859-1 encoded and runs up to the next zero byte.

Your program could handle it like this:

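title = fp.read(number_of_bytes)
encoding_byte = title[:1]  # slicing gives a byte string on Python 2 and 3
if encoding_byte == b'\x00':
    title = title[1:].decode('iso-8859-1')
elif encoding_byte == b'\x01':
    title = title[1:].decode('utf-16')  # the BOM selects the byte order
elif encoding_byte == b'\x02':
    title = title[1:].decode('utf-16-be')
elif encoding_byte == b'\x03':
    title = title[1:].decode('utf-8')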
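The same idea can be wrapped in a small helper that also drops a trailing terminator when one is present. This is only a sketch under a few assumptions: the payload is a byte string holding a single text value, the file was opened in binary mode, and the names TEXT_ENCODINGS and decode_text_frame are made up for illustration. It also shows that other files may well use a different encoding byte, so all four cases from the table are worth handling.

```python
# Encoding byte -> (Python codec name, terminator), following the
# ID3v2.4 table quoted above.
TEXT_ENCODINGS = {
    0x00: ('iso-8859-1', b'\x00'),
    0x01: ('utf-16', b'\x00\x00'),     # the BOM decides the byte order
    0x02: ('utf-16-be', b'\x00\x00'),
    0x03: ('utf-8', b'\x00'),
}


def decode_text_frame(payload):
    """Decode a text frame payload: one encoding byte, then the text."""
    raw = bytearray(payload)
    if not raw:
        return u''
    encoding_byte = raw[0]
    if encoding_byte not in TEXT_ENCODINGS:
        raise ValueError('unknown text encoding byte: %d' % encoding_byte)
    codec, terminator = TEXT_ENCODINGS[encoding_byte]
    text = bytes(raw[1:])
    # Drop one trailing terminator if the tag writer included it.
    if text.endswith(terminator):
        text = text[:-len(terminator)]
    return text.decode(codec)


print(decode_text_frame(b'\x00Hideaway'))  # Hideaway
```

Raising on an unknown encoding byte is a design choice made here so that malformed tags are noticed early; falling back to ISO-8859-1 would be another reasonable option.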

Source

2014-06-24 19:00:10

Great, thanks man –
