Convert audio stereo to audio byte

I am trying to do some audio processing and I am really stuck on stereo-to-mono conversion. I have searched the internet for how to convert stereo to mono.

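As far as I know, I can take the left channel and the right channel, sum them, and divide by 2. But when I dump the result back into a WAV file, I get a lot of foreground noise. I know that processing the data can introduce noise, for example through overflow in the byte variables.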

This is my class for retrieving byte[] data chunks from an MP3 file:

public class InputSoundDecoder {

private int BUFFER_SIZE = 128000; 
private String _inputFileName; 
private File _soundFile; 
private AudioInputStream _audioInputStream; 
private AudioFormat _audioInputFormat; 
private AudioFormat _decodedFormat; 
private AudioInputStream _audioInputDecodedStream; 

public InputSoundDecoder(String fileName) throws UnsuportedSampleRateException{ 
    this._inputFileName = fileName; 
    this._soundFile = new File(this._inputFileName); 
    try{ 
     this._audioInputStream = AudioSystem.getAudioInputStream(this._soundFile); 
    } 
    catch (Exception e){ 
     e.printStackTrace(); 
     System.err.println("Could not open file: " + this._inputFileName); 
     System.exit(1); 
    } 

    this._audioInputFormat = this._audioInputStream.getFormat(); 

    this._decodedFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100, 16, 2, 1, 44100, false); 
    this._audioInputDecodedStream = AudioSystem.getAudioInputStream(this._decodedFormat, this._audioInputStream); 

    /** Supported sample rates */ 
    switch((int)this._audioInputFormat.getSampleRate()){ 
     case 22050: 
       this.BUFFER_SIZE = 2304; 
      break; 

     case 44100: 
       this.BUFFER_SIZE = 4608; 
      break; 

     default: 
      throw new UnsuportedSampleRateException((int)this._audioInputFormat.getSampleRate()); 
    } 

    System.out.println ("# Channels: " + this._decodedFormat.getChannels()); 
    System.out.println ("Sample size (bits): " + this._decodedFormat.getSampleSizeInBits()); 
    System.out.println ("Frame size: " + this._decodedFormat.getFrameSize()); 
    System.out.println ("Frame rate: " + this._decodedFormat.getFrameRate()); 

} 

public byte[] getSamples(){ 
    byte[] abData = new byte[this.BUFFER_SIZE]; 
    int bytesRead = 0; 

    try{ 
     bytesRead = this._audioInputDecodedStream.read(abData,0,abData.length); 
    } 
    catch (Exception e){ 
     e.printStackTrace(); 
     System.err.println("Error getting samples from file: " + this._inputFileName); 
     System.exit(1); 
    } 

    if (bytesRead > 0) 
     return abData; 
    else 
     return null; 
}

}

This means that each time I call getSamples, it returns an array like:

buff = {Lchannel, Rchannel, Lchannel, Rchannel, Lchannel, Rchannel, Lchannel, Rchannel ...}

The processing routine that converts to mono looks like this:

byte[] buff = null; 
     while((buff = _input.getSamples()) != null){ 

      /** Convert to mono */ 
      byte[] mono = new byte[buff.length/2]; 

      for (int i = 0 ; i < mono.length/2; ++i){ 
       int left = (buff[i * 4] << 8) | (buff[i * 4 + 1] & 0xff); 
       int right = (buff[i * 4 + 2] <<8) | (buff[i * 4 + 3] & 0xff); 
       int avg = (left + right)/2; 
       short m = (short)avg; /*Mono is an average between 2 channels (stereo)*/ 
       mono[i * 2] = (byte)((short)(m >> 8)); 
       mono[i * 2 + 1] = (byte)(m & 0xff); 
      }

}

The result is written to a WAV file using:

 public static void writeWav(byte [] theResult, int samplerate, File outfile) { 
     // now convert theResult into a wav file 
     // probably should use a file if samplecount is too big! 
     int theSize = theResult.length; 


     InputStream is = new ByteArrayInputStream(theResult); 
     //Short2InputStream sis = new Short2InputStream(theResult); 

     AudioFormat audioF = new AudioFormat(
       AudioFormat.Encoding.PCM_SIGNED, 
       samplerate, 
       16, 
       1,   // channels 
       2,   // framesize 
       samplerate, 
       false 
     ); 

     AudioInputStream ais = new AudioInputStream(is, audioF, theSize); 

     try { 
      AudioSystem.write(ais, AudioFileFormat.Type.WAVE, outfile); 
     } catch (IOException ioe) { 
      System.err.println("IO Exception; probably just done with file"); 
      return; 
     } 


    }

with 44100 as the sample rate.

Note that the byte[] array I get is already PCM, because the MP3 -> PCM conversion is done by specifying:

this._decodedFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100, 16, 2, 1, 44100, false); 
this._audioInputDecodedStream = AudioSystem.getAudioInputStream(this._decodedFormat, this._audioInputStream);

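When I do what I described and write to the WAV file, I hear a lot of noise. I intend to apply an FFT to the bytes, but I think the results are incorrect because of the heavy noise.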

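I took two songs, one of which is a 20-second excerpt of the other, and when I compare the FFT results of the excerpt with the corresponding 20-second subset of the original, they do not match at all.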

I think the cause is an incorrect stereo -> mono conversion.

I hope someone knows something about this.

Regards.

2013-05-09 Mario

If it is caused by overflow, why not divide by 2 first and then sum? – James 2013-05-09 16:26:55
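
On the overflow question, a minimal sketch (assuming 16-bit signed samples; the class and method names are hypothetical, not from the original code) suggests that summing in an int before dividing should not overflow, since Java promotes short operands to int, while dividing each channel first can lose one bit of precision:

```java
final class AverageOverflowCheck {
    // Sum first: the operands are promoted to int, so the sum of two
    // 16-bit values (at most 65534 in magnitude, or -65536) always fits.
    static short averageSumFirst(short left, short right) {
        return (short) ((left + right) / 2);
    }

    // Divide first: also overflow-free, but each division truncates,
    // so odd sample values lose their lowest bit before the sum.
    static short averageDivideFirst(short left, short right) {
        return (short) (left / 2 + right / 2);
    }

    public static void main(String[] args) {
        System.out.println(averageSumFirst(Short.MAX_VALUE, Short.MAX_VALUE)); // 32767
        System.out.println(averageSumFirst(Short.MIN_VALUE, Short.MIN_VALUE)); // -32768
        System.out.println(averageSumFirst((short) 1, (short) 1));             // 1
        System.out.println(averageDivideFirst((short) 1, (short) 1));          // 0
    }
}
```

If this holds for the data in question, heavy noise is more likely to come from how the bytes are assembled into samples than from the averaging itself.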

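You may be getting the endianness of the data wrong. Try something like a read and write with no conversion, or better yet, feed in a known clean data source (perhaps a square wave that uses only 2 distinct amplitude values) and inspect the raw bytes of the output. With a little experience, the problem can also be recognized quickly from the signal graph in audio software. – 2013-05-09 16:29:05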
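
The square-wave check suggested here could be sketched roughly as follows. This is only an illustration, assuming 16-bit signed little-endian stereo frames; `SquareWaveProbe`, `squareWave`, and `hex` are hypothetical names, not part of the original code. The amplitude 0x1234 is chosen so that the high and low bytes are easy to tell apart in a hex dump:

```java
final class SquareWaveProbe {
    // Builds interleaved stereo frames (4 bytes each) holding a square wave
    // that alternates between +amplitude and -amplitude every halfPeriod frames.
    // Samples are written low byte first (little-endian), same value on both channels.
    static byte[] squareWave(int frames, int halfPeriod, short amplitude) {
        byte[] out = new byte[frames * 4];
        for (int i = 0; i < frames; i++) {
            short s = ((i / halfPeriod) % 2 == 0) ? amplitude : (short) -amplitude;
            for (int ch = 0; ch < 2; ch++) {
                out[i * 4 + ch * 2] = (byte) (s & 0xff);            // low byte
                out[i * 4 + ch * 2 + 1] = (byte) ((s >> 8) & 0xff); // high byte
            }
        }
        return out;
    }

    // Formats the first count bytes as space-separated two-digit hex values.
    static String hex(byte[] data, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Math.min(count, data.length); i++) {
            sb.append(String.format("%02x ", data[i] & 0xff));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] wave = squareWave(4, 2, (short) 0x1234);
        // prints: 34 12 34 12 34 12 34 12 cc ed cc ed cc ed cc ed
        System.out.println(hex(wave, wave.length));
    }
}
```

Running such a buffer through the conversion and dumping the result the same way should make a swapped byte order visible immediately: the expected pairs `34 12` and `cc ed` would come out as something else.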

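If I don't convert, all I get from an MP3 file is the raw encoded bytes. Converting is not an optional step; it has to be done to get the real sound values into the array. Dividing first and then summing gives the same result... – Mario 2013-05-09 16:33:43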

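As pointed out in the comments, the byte order may be wrong. Also, casting to a signed short and then shifting it may cause the first byte to become 0xFF.

Try: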

int HI = 0; int LO = 1; 
int left = (buff[i * 4 + HI] << 8) | (buff[i * 4 + LO] & 0xff); 
int right = (buff[i * 4 + 2 + HI] << 8) | (buff[i * 4 + 2 + LO] & 0xff); 
int avg = (left + right)/2; 
mono[i * 2 + HI] = (byte)((avg >> 8) & 0xff); 
mono[i * 2 + LO] = (byte)(avg & 0xff);

Then swap the values of HI and LO and see whether it gets better.
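
Put together, the little-endian case (HI = 1, LO = 0) might look like the sketch below. It assumes interleaved 16-bit signed stereo PCM, which is what the decoded format in the question declares with bigEndian = false. `StereoToMono`, `toMono`, and the `length` parameter are hypothetical additions; `length` is there only so that a partially filled buffer is not averaged past the bytes actually read:

```java
final class StereoToMono {
    // Averages interleaved 16-bit signed little-endian stereo PCM into mono.
    // Each input frame is 4 bytes: L-low, L-high, R-low, R-high.
    // Only the first `length` bytes of `stereo` are considered.
    static byte[] toMono(byte[] stereo, int length) {
        int frames = length / 4;
        byte[] mono = new byte[frames * 2];
        for (int i = 0; i < frames; i++) {
            // The high byte keeps its sign; the low byte is masked to stay unsigned.
            int left = (stereo[i * 4 + 1] << 8) | (stereo[i * 4] & 0xff);
            int right = (stereo[i * 4 + 3] << 8) | (stereo[i * 4 + 2] & 0xff);
            int avg = (left + right) / 2; // computed in int, so no overflow
            mono[i * 2] = (byte) (avg & 0xff);            // low byte first
            mono[i * 2 + 1] = (byte) ((avg >> 8) & 0xff); // then high byte
        }
        return mono;
    }
}
```

For example, a single frame with left = 1000 and right = -2000 (bytes e8 03 30 f8) should yield the mono sample -500 (bytes 0c fe).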

2013-05-09 19:04:07

Thank you very much! The problem was the endianness! I used HI = 1, LO = 0 and it works like a charm! – Mario 2013-05-09 19:18:34
