2017-04-18 56 views
1

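Google Cloud Speech API Python code sample has a possible bug

I am working through the code snippet provided for the Google Speech API, found here. The code should be enough to convert a .wav file into transcribed text.

The block of concern is here: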

import io

def transcribe_file(speech_file):
    """Transcribe the given audio file."""
    from google.cloud import speech
    speech_client = speech.Client()

    with io.open(speech_file, 'rb') as audio_file:
        content = audio_file.read()
        audio_sample = speech_client.sample(
            content=content,
            source_uri=None,
            encoding='LINEAR16',
            sample_rate_hertz=16000)

    alternatives = audio_sample.recognize('en-US')
    for alternative in alternatives:
        print('Transcript: {}'.format(alternative.transcript))

At first, I thought the code might be outdated, and sample_rate_hertz=16000 had to be changed to sample_rate=16000.

After that, I get an error on this line:
alternatives = audio_sample.recognize('en-US')
which reads:
AttributeError: 'Sample' object has no attribute 'recognize'

I am curious how to correct this. I can't seem to find any documentation on this method. Perhaps it needs to be replaced as well.

+0

Please take a look [here](http://stackoverflow.com/questions/38703853/how-to-use-google-speech-recognition-api-in-python/38788928#38788928), as there is a similar working example.

Answers

1

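You need to read the file as binary, then call service.speech().syncrecognize with a single argument (a dict) that contains all the required parameters, such as:

  • encoding,
  • sample rate,
  • language

You might try something like: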

import base64
import json

# get_speech_service() is assumed to be defined as in the linked example;
# it is not part of this snippet.
with open(speech_file, 'rb') as speech:
    speech_content = base64.b64encode(speech.read())

service = get_speech_service()
service_request = service.speech().syncrecognize(
    body={
        'config': {
            'encoding': 'LINEAR16',  # raw 16-bit signed LE samples
            'sampleRate': 16000,  # 16 kHz
            'languageCode': 'en-US',  # a BCP-47 language tag
        },
        'audio': {
            'content': speech_content.decode('UTF-8')
        }
    })
response = service_request.execute()
print(json.dumps(response))
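
To get at the text rather than the raw JSON, the response can be walked for its transcripts. This is a minimal sketch, assuming the response is a dict with a 'results' list whose items each carry an 'alternatives' list with a 'transcript' field. The helper name extract_transcripts and the sample_response data are hypothetical, made up for illustration, and are not output from the real API.

```python
def extract_transcripts(response):
    """Collect the transcript of every alternative in a syncrecognize-style response."""
    transcripts = []
    # .get() with defaults keeps this safe when the API returns no results.
    for result in response.get('results', []):
        for alternative in result.get('alternatives', []):
            transcripts.append(alternative.get('transcript', ''))
    return transcripts


# Hypothetical response, shaped like the JSON printed above (not real API output).
sample_response = {
    'results': [
        {'alternatives': [{'transcript': 'hello world', 'confidence': 0.9}]},
    ]
}
print(extract_transcripts(sample_response))
```

An empty response (no 'results' key) yields an empty list instead of raising a KeyError.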

Please take a look here, as there is a similar working example.

1

You are using the GitHub quickstart.py sample, and it appears to be out of sync with the documentation for the Google Cloud Speech API class Sample. But it is still in BETA.

Assuming isinstance(audio_sample, <class Sample(object)>) == True,
then .recognize in

alternatives = audio_sample.recognize('en-US')

should be one of

async_recognize, streaming_recognize, sync_recognize
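
If it is unclear which of these names the installed version exposes, one option is to probe the object at runtime. This is a rough sketch, not the library's own approach: pick_recognize_method and FakeSample are hypothetical names, and FakeSample only stands in for a Sample object that has sync_recognize but no recognize.

```python
def pick_recognize_method(sample, candidates=("recognize", "sync_recognize")):
    """Return the first callable method on `sample` whose name is in `candidates`."""
    for name in candidates:
        method = getattr(sample, name, None)
        if callable(method):
            return method
    raise AttributeError(
        "none of {} found on {}".format(", ".join(candidates), type(sample).__name__)
    )


class FakeSample(object):
    """Hypothetical stand-in for a Sample that only offers sync_recognize."""

    def sync_recognize(self, language_code):
        return ["transcript for " + language_code]


recognize = pick_recognize_method(FakeSample())
print(recognize("en-US"))
```

With the real library, audio_sample would be passed in place of FakeSample(). Whether the two methods accept the same arguments depends on the installed version, so that should be checked against its documentation.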