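Python - downsample wav audio file
I have to downsample a wav file from 44100 Hz to 16000 Hz without using any external Python libraries (preferably with wave and/or audioop). I tried just changing the frame rate of the wav file to 16000 with the setframerate function, but that only slows down the entire recording. How can I downsample the audio file to 16 kHz while keeping the same audio length?
Thank you very much in advance.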
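To illustrate why setframerate alone slows the recording down: the frame rate in the header only tells the player how fast to step through the frames, so the same number of frames at a lower rate simply lasts longer. The sketch below is a minimal illustration using only the wave module; the helper name duration_seconds and the in-memory test file are assumptions made for the example, not part of the question.

```python
import io
import wave


def duration_seconds(n_frames, frame_rate):
    """Playback time is the frame count divided by the frame rate."""
    return n_frames / float(frame_rate)


# Build one second of 16-bit mono silence at 44100 Hz in memory
# (a synthetic stand-in for a real recording).
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(44100)
    w.writeframes(b"\x00\x00" * 44100)

buf.seek(0)
with wave.open(buf, "rb") as r:
    n_frames = r.getnframes()

# Same frames, two different header rates
original = duration_seconds(n_frames, 44100)
relabelled = duration_seconds(n_frames, 16000)
print(original, relabelled)
```

Keeping the duration constant therefore means reducing the number of frames by the same ratio (16000/44100), which is what a real resampler does.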
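You can use resample in scipy. It is a bit of a headache, because some type conversion is needed between the bytestring native to Python and the arrays scipy expects. There is another headache: the wave module in Python gives no way to tell whether the data is signed (only whether it is 8 or 16 bits). By WAV convention, though, 8-bit PCM samples are unsigned and 16-bit samples are signed. It might (should) work for both, but I haven't tested it.
Here is a small program that converts 8-bit and 16-bit mono from 44.1 kHz to 16 kHz. If you have stereo or use other formats, it shouldn't be hard to adapt. Edit the input/output names at the start of the code. I never got around to using command-line arguments.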
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# downsample.py
#
# Copyright 2015 John Coppens <[email protected]>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
# MA 02110-1301, USA.
#
#
import wave
import numpy as np
import scipy.signal as sps

inwave = "sine_44k.wav"
outwave = "sine_16k.wav"


class DownSample():
    def __init__(self):
        self.in_rate = 44100
        self.out_rate = 16000

    def open_file(self, fname):
        try:
            self.in_wav = wave.open(fname, "rb")
        except (IOError, EOFError, wave.Error):
            print("Cannot open wav file (%s)" % fname)
            return False

        if self.in_wav.getframerate() != self.in_rate:
            print("Frame rate is not %d (it's %d)" %
                  (self.in_rate, self.in_wav.getframerate()))
            return False

        self.in_nframes = self.in_wav.getnframes()
        print("Frames: %d" % self.in_nframes)

        # WAV PCM stores 8-bit samples unsigned and 16-bit samples
        # as signed little-endian integers.
        if self.in_wav.getsampwidth() == 1:
            self.nptype = np.dtype("u1")
        elif self.in_wav.getsampwidth() == 2:
            self.nptype = np.dtype("<i2")
        else:
            print("Unsupported sample width: %d bytes" %
                  self.in_wav.getsampwidth())
            return False

        return True

    def resample(self, fname):
        self.out_wav = wave.open(fname, "wb")
        self.out_wav.setframerate(self.out_rate)
        self.out_wav.setnchannels(self.in_wav.getnchannels())
        self.out_wav.setsampwidth(self.in_wav.getsampwidth())

        print("Nr output channels: %d" % self.out_wav.getnchannels())

        raw = self.in_wav.readframes(self.in_nframes)
        audio = np.frombuffer(raw, self.nptype)

        # Count samples, not bytes (mono input is assumed)
        nroutsamples = int(round(len(audio) * self.out_rate / float(self.in_rate)))
        print("Nr output samples: %d" % nroutsamples)

        audio_out = sps.resample(audio, nroutsamples)
        # Clip before casting so resampler overshoot cannot wrap around
        info = np.iinfo(self.nptype)
        audio_out = np.clip(np.round(audio_out), info.min, info.max)
        audio_out = audio_out.astype(self.nptype)

        self.out_wav.writeframes(audio_out.tobytes())
        self.out_wav.close()
        self.in_wav.close()


def main():
    ds = DownSample()
    if not ds.open_file(inwave):
        return 1
    ds.resample(outwave)
    return 0


if __name__ == '__main__':
    main()
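For the stereo case mentioned above, one possible adaptation is to de-interleave the frames into one column per channel and resample along the time axis. This is only a sketch under assumptions: the function name resample_interleaved is made up for illustration, 16-bit little-endian PCM is assumed by default, and the test tone is synthetic.

```python
import numpy as np
import scipy.signal as sps


def resample_interleaved(raw, n_channels, in_rate, out_rate, dtype="<i2"):
    """Resample interleaved PCM bytes, handling each channel separately."""
    # One row per frame, one column per channel
    samples = np.frombuffer(raw, dtype=dtype).reshape(-1, n_channels)
    n_out = int(round(samples.shape[0] * out_rate / float(in_rate)))
    # axis=0 resamples along time, so channels are never mixed together
    resampled = sps.resample(samples, n_out, axis=0)
    info = np.iinfo(np.dtype(dtype))
    clipped = np.clip(np.round(resampled), info.min, info.max)
    # Flattening in C order re-interleaves the channels frame by frame
    return clipped.astype(dtype).tobytes()


# Synthetic stereo second: 440 Hz tone on the left, silence on the right
t = np.arange(44100) / 44100.0
left = (10000 * np.sin(2 * np.pi * 440 * t)).astype("<i2")
right = np.zeros(44100, dtype="<i2")
stereo_raw = np.column_stack((left, right)).tobytes()

stereo_16k = resample_interleaved(stereo_raw, 2, 44100, 16000)
```

The returned bytes can be passed to writeframes on an output file whose channel count and sample width match the input.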
Thank you all for your answers. I have already found a solution, and it works very well. Here is the whole function.
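import audioop  # note: removed from the standard library in Python 3.13
import os
import wave


def downsampleWav(src, dst, inrate=44100, outrate=16000, inchannels=2, outchannels=1):
    if not os.path.exists(src):
        print('Source not found!')
        return False

    # dirname is empty when dst has no directory component
    dst_dir = os.path.dirname(dst)
    if dst_dir and not os.path.exists(dst_dir):
        os.makedirs(dst_dir)

    try:
        s_read = wave.open(src, 'rb')
        s_write = wave.open(dst, 'wb')
    except (IOError, EOFError, wave.Error):
        print('Failed to open files!')
        return False

    n_frames = s_read.getnframes()
    data = s_read.readframes(n_frames)

    try:
        # ratecv returns (fragment, state); only the fragment is audio data
        converted, _ = audioop.ratecv(data, 2, inchannels, inrate, outrate, None)
        # Reduce to mono only for stereo input (tomono on mono data
        # would halve its length); factors 1, 0 keep the left channel
        if inchannels == 2 and outchannels == 1:
            converted = audioop.tomono(converted, 2, 1, 0)
    except audioop.error:
        print('Failed to downsample wav')
        return False

    try:
        s_write.setparams((outchannels, 2, outrate, 0, 'NONE', 'Uncompressed'))
        s_write.writeframes(converted)
    except (IOError, wave.Error):
        print('Failed to write wav')
        return False

    try:
        s_read.close()
        s_write.close()
    except (IOError, wave.Error):
        print('Failed to close wav files')
        return False

    return True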
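I know this is old, but I just had the same problem, so I tried the code and I think it has a subtle bug. If my inchannels = 1 and outchannels = 1, the tomono function gets called anyway, which messes up my audio signal (the length gets halved). Also, when writing the frames, shouldn't you write only converted[0] (depending on whether tomono was called), since the new state returned by ratecv is irrelevant? – user667804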
If you went for 11025 Hz it would be easier: just low-pass filter and then take every 4th sample – samgak
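The "low-pass filter, then take every 4th sample" idea can be sketched with scipy.signal.decimate, which applies an anti-aliasing filter before discarding samples. Note that this yields 11025 Hz rather than the 16 kHz asked for; the function name downsample_by_4 and the synthetic test tone are assumptions made for the example.

```python
import numpy as np
import scipy.signal as sps


def downsample_by_4(samples):
    """Low-pass filter, then keep every 4th sample (44100 Hz -> 11025 Hz)."""
    # decimate filters first so frequencies above the new Nyquist
    # limit do not alias into the result
    return sps.decimate(samples, 4)


# One second of a 440 Hz test tone at 44100 Hz (synthetic stand-in for real audio)
t = np.arange(44100) / 44100.0
tone = np.sin(2 * np.pi * 440 * t)
tone_11k = downsample_by_4(tone)
```

Simply slicing with samples[::4] would skip the filtering step and let high-frequency content fold back into the audible range.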
Is audioop's ratecv what you are after? https://docs.python.org/2/library/audioop.html#audioop.ratecv –
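It needs to be 16 kHz, because our pipeline tools need that in order to export it for Unity projects. Would you mind giving me an example of using the audioop.ratecv function? I'm confused by the fragment parameter of that function. How do I get it? @JimJeffries – d3cr1pt0r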
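On the fragment question: the fragment is just the raw bytes of PCM audio, for example what readframes returns from the wave module. ratecv returns a (newfragment, newstate) tuple; pass None as the state on the first call and feed the returned state back in when converting a file chunk by chunk. The sketch below assumes 16-bit mono input; the helper name ratecv_in_chunks and the silent test data are made up for illustration, and audioop is imported defensively because it was removed from the standard library in Python 3.13.

```python
try:
    import audioop  # deprecated since Python 3.11, removed in 3.13
except ImportError:
    audioop = None


def ratecv_in_chunks(chunks, width, nchannels, inrate, outrate):
    """Convert an iterable of raw PCM byte chunks, carrying the state along."""
    if audioop is None:
        raise RuntimeError("audioop is not available in this Python version")
    state = None  # None on the first call
    out = []
    for fragment in chunks:
        # fragment: raw bytes holding whole frames, e.g. from readframes()
        converted, state = audioop.ratecv(
            fragment, width, nchannels, inrate, outrate, state)
        out.append(converted)
    return b"".join(out)


# One second of 16-bit mono silence at 44100 Hz, split into ten chunks
chunk = b"\x00\x00" * 4410
if audioop is not None:
    result = ratecv_in_chunks([chunk] * 10, 2, 1, 44100, 16000)
```

For a file that fits in memory, a single call with state None on the whole readframes result is enough; only the first element of the returned tuple should be written out.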