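Translating Hindi to English using Python

I am working on a project that recognizes speech in a language (English, Hindi, Marathi, etc.) selected by a source language code, and translates it into another language selected by a target language code.

Everything is done in Python.

The Google API recognizes the speech and returns it as text, and the Microsoft API then translates that text into the other language.

But I am facing an error, which is: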

Traceback (most recent call last):
  File "pitranslate.py", line 60, in <module>
    translation_result = requests.get(translation_url + urllib.urlencode(translation_args), headers = headers)
  File "/usr/lib/python2.7/urllib.py", line 1332, in urlencode
    v = quote_plus(str(v))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)

My input: क्या कर रहे हो

Here is the complete code:

import json 
import requests 
import urllib 
import subprocess 
import argparse 
import speech_recognition as sr 
from subprocess import call 

parser = argparse.ArgumentParser(description='This is a demo script by DaveConroy.com.') 
parser.add_argument('-o','--origin_language', help='Origin Language',required=True) 
parser.add_argument('-d','--destination_language', help='Destination Language', required=True) 
#parser.add_argument('-t','--text_to_translate', help='Text to Translate', required=True) 
args = parser.parse_args() 

## show values ## 
print ("Origin: %s" % args.origin_language) 
print ("Destination: %s" % args.destination_language) 
#print ("Text: %s" % args.text_to_translate) 

# obtain audio from the microphone 
r = sr.Recognizer() 
with sr.Microphone() as source: 
    print("Say something!") 

    audio = r.listen(source) 
args.text_to_translate = r.recognize_google(audio, language=args.origin_language) 
text = args.text_to_translate 
#text=r.recognize_google(audio) 
print text 
origin_language=args.origin_language 
destination_language=args.destination_language 


def speakOriginText(phrase): 
    googleSpeechURL = "http://translate.google.com/translate_tts?tl="+ origin_language +"&q=" + phrase 
    subprocess.call(["mplayer",googleSpeechURL], shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE) 

def speakDestinationText(phrase): 
    googleSpeechURL = "http://translate.google.com/translate_tts?tl=" + destination_language +"&q=" + phrase 
    print googleSpeechURL 
    subprocess.call(["mplayer",googleSpeechURL], shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE) 

args = { 
     'client_id': 'create and enter your client id', 
     'client_secret': 'create id and enter here',#your azure secret here 
     'scope': 'http://api.microsofttranslator.com', 
     'grant_type': 'client_credentials' 
    } 

oauth_url = 'https://datamarket.accesscontrol.windows.net/v2/OAuth2-13' 
oauth_junk = json.loads(requests.post(oauth_url,data=urllib.urlencode(args)).content) 
translation_args = { 
     'text': text, 
     'to': destination_language, 
     'from': origin_language 
     } 

headers={'Authorization': 'Bearer '+oauth_junk['access_token']} 
translation_url = 'http://api.microsofttranslator.com/V2/Ajax.svc/Translate?' 
translation_result = requests.get(translation_url+urllib.urlencode(translation_args),headers=headers) 
translation=translation_result.text[2:-1] 

speakOriginText('Translating ' + translation_args["text"]) 
speakDestinationText(translation) 

How can I overcome this problem?

Possible duplicate of [UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)](http://stackoverflow.com/questions/9942594/unicodeencodeerror-ascii-codec-cant-encode-character-u-xa0-in-position-20) – Carpetsmoker

Did you get an answer to this? – Manvi

Answer

This error comes up when your text is in another language and is stored as UTF-8, for example My_input = क्या कर रहे हो. To convert or translate this text, you have to decode it:

# -*- coding: utf-8 -*-
My_input = "क्या कर रहे हो"  # a UTF-8 byte string in Python 2
My_input = My_input.decode("utf-8")  # now a unicode object

This way you can decode and encode the string.
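
In the traceback from the question, the failure happens inside `urlencode`, which calls `str()` on each value. On Python 2.7 that implicitly encodes a `unicode` value as ASCII, and that fails for Devanagari text. One way around it, sketched below, is to encode every text value to UTF-8 bytes before calling `urlencode`. This is only a minimal sketch: `encode_params` is a hypothetical helper name that is not part of the original script, the sample dictionary stands in for the recognized speech, and all values are assumed to be text or bytes.

```python
# -*- coding: utf-8 -*-
# Sketch only: encode text values as UTF-8 before URL-encoding them.
try:
    from urllib import urlencode        # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3


def encode_params(params):
    """Return a copy of params whose text values are UTF-8 byte strings.

    encode_params is a hypothetical helper, not part of the original script.
    It assumes every value is either text or bytes.
    """
    encoded = {}
    for key, value in params.items():
        if isinstance(value, bytes):
            encoded[key] = value                  # already bytes, leave as-is
        else:
            encoded[key] = value.encode("utf-8")  # text -> UTF-8 bytes
    return encoded


translation_args = {
    "text": u"क्या कर रहे हो",  # stands in for the recognized speech
    "to": u"en",
    "from": u"hi",
}
query = urlencode(encode_params(translation_args))
print(query)
```

In the script from the question, the same idea would mean passing `encode_params(translation_args)` to `urllib.urlencode` instead of `translation_args`. The phrases concatenated into the `translate_tts` URLs are not URL-encoded either, so they may need similar treatment.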
