2015-06-03 122 views
2

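Python - Downsampling a wav audio file

I have to downsample a wav file from 44100 Hz to 16000 Hz without using any external Python libraries, so preferably with wave and/or audioop. I tried changing the wav file's frame rate to 16000 with the setframerate function, but that just slows down the entire recording. How can I downsample the audio file to 16 kHz and keep the same audio length?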

Thank you very much in advance.
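
For readers wondering why setframerate alone slows the recording down: a WAV file's duration is its frame count divided by its frame rate, so relabelling 44100 Hz frames as 16000 Hz stretches playback by 44100/16000. A minimal sketch using only the standard library; the helper names make_wav and duration_seconds are made up for this illustration.

```python
import io
import wave

def make_wav(frames, framerate, sampwidth=2, nchannels=1):
    """Return the bytes of a WAV file holding `frames`, labelled with `framerate`."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(nchannels)
        w.setsampwidth(sampwidth)
        w.setframerate(framerate)
        w.writeframes(frames)
    return buf.getvalue()

def duration_seconds(wav_bytes):
    """Duration is simply frame count divided by frame rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as r:
        return r.getnframes() / r.getframerate()

one_second = b"\x00\x00" * 44100          # 44100 frames of 16-bit mono silence
original = make_wav(one_second, 44100)
relabelled = make_wav(one_second, 16000)  # same frames, only the header changes

print(duration_seconds(original))    # 1.0
print(duration_seconds(relabelled))  # 2.75625
```

Keeping the length therefore means producing fewer frames (16000 per second of audio), not just changing the header.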

+0

It would be easier if you went to 11025 Hz: just low-pass filter and then take every 4th sample – samgak
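
The suggestion above (low-pass filter, then keep every 4th sample) can be sketched in pure Python. This is only an illustration: the Hamming-windowed sinc filter, the tap count and the function names lowpass_taps and decimate are assumptions, not something from the thread, and the result is 11025 Hz rather than the 16 kHz asked for.

```python
import math

def lowpass_taps(cutoff, num_taps=101):
    """Hamming-windowed sinc low-pass; cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            sinc = 2 * cutoff
        else:
            sinc = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC

def decimate(samples, factor=4, num_taps=101):
    """Low-pass filter `samples`, keeping only every `factor`-th filtered value."""
    # Cutoff a little below the new Nyquist frequency (0.5 / factor).
    taps = lowpass_taps(0.4 / factor, num_taps)
    half = num_taps // 2
    out = []
    for i in range(0, len(samples), factor):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half          # filter is centred on sample i
            if 0 <= j < len(samples):
                acc += t * samples[j]
        out.append(acc)
    return out

tone = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]
print(len(decimate(tone)))  # 1103 samples, now at 11025 Hz
```

The filter output is only computed at the positions that are kept, which avoids three quarters of the work.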

+0

Is audioop's ratecv what you're after? https://docs.python.org/2/library/audioop.html#audioop.ratecv –

+0

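It needs to be 16 kHz because our pipeline tool has to export it for Unity projects. Would you mind giving me an example of using the audioop.ratecv function? I'm confused by that function's fragment parameter. How do I get it? @JimJeffries – d3cr1pt0r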
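
On the fragment question: in audioop a fragment is just the raw PCM bytes, which is what wave's readframes returns. A small sketch, assuming 16-bit mono input built in memory so it is self-contained; read_fragment is a made-up helper name, and the audioop import is guarded because the module was removed from the standard library in Python 3.13.

```python
import io
import wave

try:
    import audioop  # standard library up to Python 3.12; removed in 3.13
except ImportError:
    audioop = None

def read_fragment(wav_file):
    """Return (fragment, sample_width, channels, rate) from a wave-readable source."""
    with wave.open(wav_file, "rb") as r:
        fragment = r.readframes(r.getnframes())  # raw PCM bytes
        return fragment, r.getsampwidth(), r.getnchannels(), r.getframerate()

# One second of 16-bit mono silence at 44100 Hz, written to memory.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(44100)
    w.writeframes(b"\x00\x00" * 44100)
buf.seek(0)

fragment, width, channels, rate = read_fragment(buf)
print(len(fragment))  # 88200 bytes = 44100 frames * 2 bytes * 1 channel

if audioop is not None:
    # ratecv returns (converted_fragment, state); state=None means a one-shot call.
    converted, state = audioop.ratecv(fragment, width, channels, rate, 16000, None)
    print(len(converted) // (width * channels))  # frame count at 16 kHz
```

The returned state only needs to be passed back in when a long file is converted chunk by chunk.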

Answers

1

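You can use resample in scipy. It's a bit of a headache, because some type conversion is needed between the byte string that is native to Python and the arrays that scipy needs. There is another headache: the wave module in Python gives no way to query whether the data is signed (it only reports whether it is 8 or 16 bits). It might (should) work for both, but I haven't tested it.

Here is a small program that converts 8-bit and 16-bit mono files from 44.1 kHz to 16 kHz. If you have stereo or use other formats, it shouldn't be hard to adapt. Edit the input/output names at the start of the code. I never got around to adding command-line arguments.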

#!/usr/bin/env python 
# -*- coding: utf-8 -*- 
# 
# downsample.py 
# 
# Copyright 2015 John Coppens <[email protected]> 
# 
# This program is free software; you can redistribute it and/or modify 
# it under the terms of the GNU General Public License as published by 
# the Free Software Foundation; either version 2 of the License, or 
# (at your option) any later version. 
# 
# This program is distributed in the hope that it will be useful, 
# but WITHOUT ANY WARRANTY; without even the implied warranty of 
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 
# GNU General Public License for more details. 
# 
# You should have received a copy of the GNU General Public License 
# along with this program; if not, write to the Free Software 
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, 
# MA 02110-1301, USA. 
# 
# 

inwave = "sine_44k.wav"
outwave = "sine_16k.wav"

import sys
import wave
import numpy as np
import scipy.signal as sps

class DownSample():
    def __init__(self):
        self.in_rate = 44100
        self.out_rate = 16000

    def open_file(self, fname):
        try:
            self.in_wav = wave.open(fname, "rb")
        except (IOError, EOFError, wave.Error):
            print("Cannot open wav file (%s)" % fname)
            return False

        if self.in_wav.getframerate() != self.in_rate:
            print("Frame rate is not %d (it's %d)" %
                  (self.in_rate, self.in_wav.getframerate()))
            return False

        self.in_nframes = self.in_wav.getnframes()
        print("Frames: %d" % self.in_nframes)

        # PCM WAV files store 8-bit samples unsigned and 16-bit samples signed.
        if self.in_wav.getsampwidth() == 1:
            self.nptype = np.uint8
        elif self.in_wav.getsampwidth() == 2:
            self.nptype = np.int16
        else:
            print("Unsupported sample width: %d bytes" % self.in_wav.getsampwidth())
            return False

        return True

    def resample(self, fname):
        self.out_wav = wave.open(fname, "wb")
        self.out_wav.setframerate(self.out_rate)
        self.out_wav.setnchannels(self.in_wav.getnchannels())
        self.out_wav.setsampwidth(self.in_wav.getsampwidth())

        print("Nr output channels: %d" % self.out_wav.getnchannels())

        # Convert the byte string returned by readframes into a numpy array.
        audio = np.frombuffer(self.in_wav.readframes(self.in_nframes), self.nptype)
        # Count samples, not bytes, so that 16-bit input keeps its duration.
        nroutsamples = int(round(len(audio) * float(self.out_rate) / self.in_rate))
        print("Nr output samples: %d" % nroutsamples)

        audio_out = sps.resample(audio.astype(np.float64), nroutsamples)
        # Resampling can overshoot the integer range: round and clip before converting back.
        info = np.iinfo(self.nptype)
        audio_out = np.clip(np.rint(audio_out), info.min, info.max).astype(self.nptype)

        self.out_wav.writeframes(audio_out.tobytes())

        self.out_wav.close()
        self.in_wav.close()

def main():
    ds = DownSample()
    if not ds.open_file(inwave): return 1
    ds.resample(outwave)
    return 0

if __name__ == '__main__':
    sys.exit(main())
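
For the stereo adaptation mentioned above, the main extra step is de-interleaving the frames before resampling. A rough sketch, assuming 16-bit PCM; the function names pcm16_to_channels and channels_to_pcm16 are made up for this illustration and are not part of the program above.

```python
import numpy as np

def pcm16_to_channels(raw, nchannels):
    """Interleaved 16-bit PCM bytes -> float array of shape (nframes, nchannels)."""
    samples = np.frombuffer(raw, dtype="<i2")  # WAV sample data is little-endian
    return samples.reshape(-1, nchannels).astype(np.float64)

def channels_to_pcm16(frames):
    """Float array of shape (nframes, nchannels) -> interleaved 16-bit PCM bytes."""
    clipped = np.clip(np.rint(frames), -32768, 32767)
    return clipped.astype("<i2").tobytes()

# Three stereo frames: left channel 1, 2, 3 and right channel -1, -2, -3.
raw = np.array([[1, -1], [2, -2], [3, -3]], dtype="<i2").tobytes()
frames = pcm16_to_channels(raw, 2)
left, right = frames[:, 0], frames[:, 1]
print(left.tolist(), right.tolist())
```

With the data in this shape, scipy.signal.resample can be applied along axis 0 (its default), so each channel is resampled separately before channels_to_pcm16 re-interleaves the result.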
3

Thank you all for your answers. I have found a solution and it works very nicely. Here is the whole function.

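import os
import wave
import audioop  # standard library up to Python 3.12; removed in Python 3.13

def downsampleWav(src, dst, inrate=44100, outrate=16000, inchannels=2, outchannels=1):
    if not os.path.exists(src):
        print('Source not found!')
        return False

    # dirname is empty when dst has no directory part, so guard makedirs.
    dst_dir = os.path.dirname(dst)
    if dst_dir and not os.path.exists(dst_dir):
        os.makedirs(dst_dir)

    try:
        s_read = wave.open(src, 'rb')
        s_write = wave.open(dst, 'wb')
    except (IOError, EOFError, wave.Error):
        print('Failed to open files!')
        return False

    n_frames = s_read.getnframes()
    data = s_read.readframes(n_frames)

    try:
        # ratecv returns (fragment, state); the state only matters for chunked conversion.
        converted, _ = audioop.ratecv(data, 2, inchannels, inrate, outrate, None)
        # tomono expects stereo input, so skip it when the source is already mono.
        if inchannels == 2 and outchannels == 1:
            # Factors (1, 0) keep the left channel only.
            converted = audioop.tomono(converted, 2, 1, 0)
    except audioop.error:
        print('Failed to downsample wav')
        return False

    try:
        s_write.setparams((outchannels, 2, outrate, 0, 'NONE', 'Uncompressed'))
        s_write.writeframes(converted)
    except (IOError, wave.Error):
        print('Failed to write wav')
        return False

    try:
        s_read.close()
        s_write.close()
    except (IOError, wave.Error):
        print('Failed to close wav files')
        return False

    return True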
+1

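I know this is old, but I just had the same problem, so I tried the code and I think it has a subtle bug. If my inchannels = 1 and outchannels = 1, the tomono function still gets called, which messes up my audio signal (the length is halved). Also, when writing the frames, shouldn't you write only converted[0] (depending on whether tomono was explicitly called), because the new state returned by ratecv is irrelevant? – user667804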