AudioGraph encoding and PCM conversion problem: 32-bit PCM to 16-bit PCM

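I am writing a client for the Watson Speech to Text service in C# in a Universal Windows app. For now, instead of calling the Watson service, I write the audio to a file and open it in Audacity to confirm the format is correct, because the Watson service is not returning a correct response. The reason is explained below.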

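For some reason, even though I create 16-bit PCM encoding properties, when I read the buffer I can only read the data as 32-bit PCM, and that works well. If I read it as 16-bit PCM, playback is in slow motion and the speech is basically corrupted.

I don't really know what exactly needs to be done to convert from 32-bit to 16-bit, but here is what I have in my C# application: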

//Creating PCM Encoding properties 
var pcmEncoding = AudioEncodingProperties.CreatePcm(16000, 1, 16); 
var result = await AudioGraph.CreateAsync(
    new AudioGraphSettings(AudioRenderCategory.Speech) 
    { 
     DesiredRenderDeviceAudioProcessing = AudioProcessing.Raw, 
     AudioRenderCategory = AudioRenderCategory.Speech, 
     EncodingProperties = pcmEncoding 
    } 
); 
graph = result.Graph; 

//Initialize microphone 
var microphone = await DeviceInformation.CreateFromIdAsync(MediaDevice.GetDefaultAudioCaptureId(AudioDeviceRole.Default)); 
var micInputResult = await graph.CreateDeviceInputNodeAsync(MediaCategory.Speech, pcmEncoding, microphone); 

//Create frame output node 
frameOutputNode = graph.CreateFrameOutputNode(pcmEncoding); 

//Callback function to fire when buffer is filled with data 
graph.QuantumProcessed += (s, a) => ProcessFrameOutput(frameOutputNode.GetFrame()); 
frameOutputNode.Start(); 

//Make the microphone write into the frame node 
micInputResult.DeviceInputNode.AddOutgoingConnection(frameOutputNode); 
micInputResult.DeviceInputNode.Start(); 

graph.Start();

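That completes the initialization stage. Now, actually reading the data from the buffer and writing it to a file only works when I use 32-bit PCM encoding with the following function (the commented-out part is the 16-bit PCM code that produces slow-motion speech):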

private void ProcessFrameOutput(AudioFrame frame) 
{ 
    //Making a copy of the audio frame buffer 
    var audioBuffer = frame.LockBuffer(AudioBufferAccessMode.Read); 
    var buffer = Windows.Storage.Streams.Buffer.CreateCopyFromMemoryBuffer(audioBuffer); 
    buffer.Length = audioBuffer.Length; 

    using (var dataReader = DataReader.FromBuffer(buffer)) 
    { 
     dataReader.ByteOrder = ByteOrder.LittleEndian; 

     byte[] byteData = new byte[buffer.Length]; 
     int pos = 0; 

     while (dataReader.UnconsumedBufferLength > 0) 
     { 
      /*Reading Float -> Int 32*/ 
      /*With this code I can import raw wav file into the Audacity 
       using Signed 32-bit PCM Encoding, and it is working well*/ 
      var singleTmp = dataReader.ReadSingle(); 
      var int32Tmp = (Int32)(singleTmp * Int32.MaxValue); 
      byte[] chunkBytes = BitConverter.GetBytes(int32Tmp); 
      byteData[pos++] = chunkBytes[0]; 
      byteData[pos++] = chunkBytes[1]; 
      byteData[pos++] = chunkBytes[2]; 
      byteData[pos++] = chunkBytes[3]; 

      /*Reading Float -> Int 16 (Slow Motion)*/ 
      /*With this code I can import raw wav file into the Audacity 
       using Signed 16-bit PCM Encoding, but when I play it, it's in 
       a slow motion*/ 
      //var singleTmp = dataReader.ReadSingle(); 
      //var int16Tmp = (Int16)(singleTmp * Int16.MaxValue); 
      //byte[] chunkBytes = BitConverter.GetBytes(int16Tmp); 
      //byteData[pos++] = chunkBytes[0]; 
      //byteData[pos++] = chunkBytes[1]; 
     } 

     WriteBytesToFile(byteData); 
    } 
}

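Can anyone think of why this happens? Is it because Int32 PCM samples are larger, so that when I use Int16 the data gets stretched and the sound becomes longer? Or am I not sampling correctly?

Note: I tried reading the bytes directly from the buffer and using them as raw data, but the data is not encoded as PCM that way. Reading Int16/Int32 directly from the buffer does not work either. In the example above I only use the frame output node. If I create a file output node that writes to a raw file automatically, it works very well as 16-bit PCM, so something in my callback function must be wrong and causing the slow motion.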

Thanks

Source

2016-06-07 Dima Rudeshko

For the future: if you provide a sample of the corrupted raw data, your problem will be easier to solve.

//Creating PCM Encoding properties 
var pcmEncoding = AudioEncodingProperties.CreatePcm(16000, 1, 16); 
var result = await AudioGraph.CreateAsync(
    new AudioGraphSettings(AudioRenderCategory.Speech) 
    { 
     DesiredRenderDeviceAudioProcessing = AudioProcessing.Raw, 
     AudioRenderCategory = AudioRenderCategory.Speech, 
     EncodingProperties = pcmEncoding 
    } 
); 
graph = result.Graph;

pcmEncoding does not make much sense here, because AudioGraph only supports float encoding.

 byte[] byteData = new byte[buffer.Length];

This should be buffer.Length / 2, because you are converting from float data at 4 bytes per sample to Int16 data at 2 bytes per sample.

 /*Reading Float -> Int 16 (Slow Motion)*/ 
     /*With this code I can import raw wav file into the Audacity 
      using Signed 16-bit PCM Encoding, but when I play it, it's in 
      a slow motion*/ 
     var singleTmp = dataReader.ReadSingle(); 
     var int16Tmp = (Int16)(singleTmp * Int16.MaxValue); 
     byte[] chunkBytes = BitConverter.GetBytes(int16Tmp); 
     byteData[pos++] = chunkBytes[0]; 
     byteData[pos++] = chunkBytes[1];

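This is the correct code; it should work with 2 bytes per sample. Your "slow motion" is most likely related to the incorrectly sized buffer above.

I must admit, Microsoft needs someone to review its bloated APIs.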
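To make the sizing concrete, here is a minimal, self-contained sketch of the float to Int16 conversion with an output array of exactly 2 bytes per sample. The names PcmConverter and FloatToPcm16 are hypothetical and not part of any Windows API. The clamping step is an addition that is not in the original code, and the sketch assumes a little-endian machine, which is what BitConverter gives on Windows. If the output array is instead allocated at the input length, the unused second half stays zero, which is presumably heard as a stretch of silence after every frame.

```csharp
using System;

public static class PcmConverter
{
    // Converts 32-bit float samples (4 bytes each) into
    // little-endian signed 16-bit PCM (2 bytes each).
    public static byte[] FloatToPcm16(byte[] floatBytes)
    {
        int sampleCount = floatBytes.Length / sizeof(float);

        // 2 bytes per sample: exactly half the input length, no zero padding.
        byte[] pcm = new byte[sampleCount * sizeof(short)];

        for (int i = 0; i < sampleCount; i++)
        {
            float sample = BitConverter.ToSingle(floatBytes, i * sizeof(float));

            // Clamp to [-1, 1] so the cast to short cannot overflow.
            sample = Math.Max(-1.0f, Math.Min(1.0f, sample));
            short value = (short)(sample * short.MaxValue);

            // Write the low byte first (little-endian).
            pcm[2 * i] = (byte)(value & 0xFF);
            pcm[2 * i + 1] = (byte)((value >> 8) & 0xFF);
        }

        return pcm;
    }
}
```

In ProcessFrameOutput, the bytes copied from the frame buffer could be passed to this function and the returned array handed to WriteBytesToFile.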

Source

2016-06-07 20:18:44

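Thanks for your comment. I am actually doing the conversion the same way, right below my Int32 code. Sorry, it is only commented out to show that it is the part that produces slow motion. Update 1: I sped the audio file up 2x, and it does not sound like the original; it seems that some chunks of audio are missing.

I updated the answer.

I spent 2 days trying to solve this, and simply adding "/ 2" fixed it. I just tested it with the Watson service and it works like a charm. Thank you very much!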
