1

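I built an application that uses speech-to-text to transcribe audio into text. The accuracy is low, and some of the sentences make no sense. Is there a way to improve speech-to-text accuracy? IBM Watson speech recognition is not very accurate.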

Here is an example:

http://book.vidalab.co/books/alice-in-wonderland

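Alice in Wonderland, in section 2:

"at home go to white pawn this way, you see ad" should be "at home go to white pawn this way, you see Alice"

'rat white' should be 'red and white'

"the white army tries to win and the red Tris twins" should be "and the white army tries to win and the red army tries to win"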

+0

It is not artificial intelligence. Look at how it handles this poem: http://www.waylink-english.co.uk/?page=16100

+0

I don't expect it to parse poetry. But literature is not quite the same thing. Or is literature out of bounds as well?

Answers

1

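You could try a different service, for example Speechmatics. It is not great at getting the speakers' names right, but the words are much more accurate than Watson's. The result looks like this: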

Credits of Alice in Wonderland by Alice girs Timberg this is a box recording all of her vocal recordings are in the public domain for more information or volunteer. Please visit libber Vox dot org. 
I just listed stage directions read by McKayla Curtis Lewis Carroll. 
Read by Shannon Brown Alice read by Amanda Friday the Red Queen read by Shauna canat White Queen read by Elizabeth Klatt White Rabbit read by Todd Humpty Dumpty read by Jeff Machado written read by Brett Hirsch. 
The Mock Turtle read by Ted the alarm Mad Hatter read by Elliot gage the March Hare by Charlotte Duckett's dormouse read by Kimberly Krauss frog read by Larry Wilson Duchess read by L.A. Cheshire Cat read by Sarah Herschell Tweedle-Dee read. 
By Charlotte Brown. 
Do you do do I read by the sea a solo the King of Hearts read by Ted alarm the Queen of Hearts read by eating Ray Headrick knave by glorious Joe Carter pillar back at 2 loss to spot read by Dave Harris. 
Five Spot read by Dave Harith. Seven of spades read by Dave Hereth end of credits. 

Recognizing surnames is a very hard problem, and not many companies get it right.

0

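Any STT system has two main parts: an acoustic model and a language model. The first deals with the audio and the speaker, and handles things such as noise, pronunciation, and accent. The language model covers the structure of the given language and the words used in the speech.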
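To make the split concrete, here is a toy sketch of how a recognizer might combine the two scores when choosing between candidate transcripts. Everything in it is an assumption for illustration: the two candidates, the log-probability values, the `lm_weight` parameter, and the function names are made up, and real systems (Watson included) use far larger models and a proper search rather than a two-entry dictionary.

```python
# Toy example: pick the transcript with the best combined
# acoustic + language-model score. All numbers are invented.

# Hypothetical acoustic log-probabilities: how well each candidate
# matches the audio. Here the wrong candidate sounds slightly better.
acoustic_logp = {
    "rat and white": -4.0,
    "red and white": -4.2,
}

# Hypothetical bigram language model (log-probabilities).
bigram_logp = {
    ("<s>", "red"): -2.0,
    ("red", "and"): -1.0,
    ("and", "white"): -1.0,
    ("<s>", "rat"): -5.0,
    ("rat", "and"): -3.0,
}


def lm_score(sentence, bigrams, unknown=-6.0):
    """Sum bigram log-probabilities; unseen bigrams get a flat penalty."""
    words = ["<s>"] + sentence.split()
    return sum(bigrams.get(pair, unknown) for pair in zip(words, words[1:]))


def best_hypothesis(acoustic, bigrams, lm_weight=1.0):
    """Return the candidate with the highest combined score."""
    return max(
        acoustic,
        key=lambda hyp: acoustic[hyp] + lm_weight * lm_score(hyp, bigrams),
    )


print(best_hypothesis(acoustic_logp, bigram_logp))                 # red and white
print(best_hypothesis(acoustic_logp, bigram_logp, lm_weight=0.0))  # rat and white
```

With the language model switched off (`lm_weight=0.0`) the acoustically closer but nonsensical candidate wins, which is roughly what happens when the language model has never seen text like yours.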

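If you want to test an STT system, use recordings that are as close as possible to your target speech. A system that performs very well on general speech or medical dictation may be a poor fit for talks about archaeology or poetry.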
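If you do run such a test, it helps to put a number on it instead of eyeballing the output. A common measure is word error rate (WER): the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. Below is a minimal sketch; the function name and the simple lowercase-and-split tokenization are my own assumptions, and a real evaluation would also normalize punctuation and numbers.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                          # deletion
                cur[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word)  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


# One substitution plus one deletion against three reference words.
print(word_error_rate("red and white", "rat white"))
```

Run it over the same set of in-domain recordings for each service you are considering, and compare the averages.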