2013-05-16 72 views
2

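Python - tokenizing, replacing words

I'm trying to create sentences with random words inserted into them. Specifically, I have something like this: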

"The weather today is [weather_state]." 

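and I'd like to find all the tokens in square brackets [] and then swap each one for a randomly chosen counterpart from a dictionary or list, leaving me with: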

"The weather today is warm." 
"The weather today is bad." 

"The weather today is mildly suiting for my old bones." 

Keep in mind that the [bracket] tokens won't always be in the same position, and there will be multiple bracketed tokens in my string, such as:

"[person] is feeling really [how] today, so he's not going [where]." 

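I really don't know where to start with this, or whether the tokenize or token modules are even the best solution for it. Any hints that point me in the right direction would be greatly appreciated!

EDIT: Just to clarify, I don't need to use square brackets; any non-standard character will do.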

+0

This may be a silly suggestion, but have you looked at string formatting's '{}'s? – akaIDIOT

Answers

4

You're looking for re.sub with a callback function:

import random
import re

# Candidate replacements for each placeholder name.
words = {
    'person': ['you', 'me'],
    'how': ['fine', 'stupid'],
    'where': ['away', 'out']
}

def random_str(m):
    # m.group(1) is the name captured between the square brackets.
    return random.choice(words[m.group(1)])

text = "[person] is feeling really [how] today, so he's not going [where]."
print(re.sub(r'\[(.+?)\]', random_str, text))

# Example output: me is feeling really stupid today, so he's not going away.

Unlike the format method, this allows for more sophisticated processing of placeholders, e.g.

[person:upper] got $[amount if amount else 0] etc 

Basically, you can build your own "templating engine" on top of this.
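As a rough illustration of what such a small "templating engine" might look like, here is a minimal sketch that supports `[name]` and `[name:modifier]` tokens. The `WORDS` and `MODIFIERS` tables, the `render` function, and the token syntax are all assumptions made up for this example; conditional expressions like the one shown above are not handled.

```python
import random
import re

# Hypothetical word lists and modifier table, for illustration only.
WORDS = {
    'person': ['you', 'me'],
    'how': ['fine', 'stupid'],
}
MODIFIERS = {
    'upper': str.upper,
    'title': str.title,
}

# Matches [name] or [name:modifier]; both parts are plain word characters.
TOKEN = re.compile(r'\[(\w+)(?::(\w+))?\]')

def render(template, words=WORDS):
    """Replace each [name] or [name:modifier] token with a random word."""
    def substitute(match):
        name, modifier = match.group(1), match.group(2)
        if name not in words:
            # Leave unknown tokens untouched instead of raising KeyError.
            return match.group(0)
        word = random.choice(words[name])
        if modifier in MODIFIERS:
            # Unknown modifiers are silently ignored in this sketch.
            word = MODIFIERS[modifier](word)
        return word
    return TOKEN.sub(substitute, template)

print(render("[person:upper] is feeling really [how] today, [unknown]."))
```

Leaving unknown tokens in place is just one possible choice; raising an error would make typos in the template easier to spot.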

+0

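This is great, I love how clean and efficient it is. It works, and as a Python beginner, understanding it gives me a head start. :) The smart thing would be to write a dictionary file, save it on disk, and load it into the "words" dictionary here... Any pointers on what the syntax of the dictionary file should look like? Thanks a lot! – bitworks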

+0

@bitworks: the simplest and most convenient option is json: http://docs.python.org/2/library/json.html – georg
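To make the JSON suggestion concrete, here is a small sketch (assuming Python 3) that writes the word lists to a file and loads them back before substituting. The file name `words.json` and the temporary directory are placeholders for this example; in practice the file would be written once by hand or by another script.

```python
import json
import os
import random
import re
import tempfile

words = {
    'person': ['you', 'me'],
    'how': ['fine', 'stupid'],
    'where': ['away', 'out'],
}

# Hypothetical location; a temporary directory keeps the example self-contained.
path = os.path.join(tempfile.mkdtemp(), 'words.json')

# Save the dictionary. The file holds a JSON object mapping names to lists.
with open(path, 'w', encoding='utf-8') as f:
    json.dump(words, f, indent=4)

# Load it back into a plain dict of lists.
with open(path, encoding='utf-8') as f:
    loaded = json.load(f)

def random_str(m):
    return random.choice(loaded[m.group(1)])

text = "[person] is feeling really [how] today, so he's not going [where]."
print(re.sub(r'\[(.+?)\]', random_str, text))
```

The file on disk looks almost exactly like the Python literal, except that JSON requires double quotes around strings.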

2

You can use the format method.

>>> a = 'The weather today is {weather_state}.' 
>>> a.format(weather_state = 'awesome') 
'The weather today is awesome.' 
>>> 

Also:

>>> b = '{person} is feeling really {how} today, so he\'s not going {where}.' 
>>> b.format(person = 'Alegen', how = 'wacky', where = 'to work') 
"Alegen is feeling really wacky today, so he's not going to work." 
>>> 

Of course, this approach only works IF you can switch from square brackets to curly ones.
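To get random replacements with this approach, one option is to ask the standard library which field names a template contains and pick a word for each. The sketch below uses `string.Formatter().parse` for that; the `fill` function and the `words` table are names invented for this example, and it assumes simple field names such as `{weather_state}` rather than indexed or attribute lookups.

```python
import random
import string

def fill(template, words):
    """Fill every {name} field in template with a random entry from words[name]."""
    # parse() yields (literal_text, field_name, format_spec, conversion) tuples;
    # field_name is None for trailing literal text.
    names = {field for _, field, _, _ in string.Formatter().parse(template) if field}
    picks = {name: random.choice(words[name]) for name in names}
    return template.format(**picks)

# Hypothetical word lists, for illustration only.
words = {
    'weather_state': ['warm', 'bad', 'mildly suiting for my old bones'],
}

print(fill('The weather today is {weather_state}.', words))
```

Note that a field that appears twice in the template receives the same word both times, and a field with no entry in `words` raises a `KeyError`.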

0

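If you use curly braces instead of square brackets, your string can be used as a string formatting template. You can use itertools.product to fill it with every combination of substitutions: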

import itertools as IT 

text = "{person} is feeling really {how} today, so he's not going {where}." 
persons = ['Buster', 'Arthur'] 
hows = ['hungry', 'sleepy'] 
wheres = ['camping', 'biking'] 

for person, how, where in IT.product(persons, hows, wheres): 
    print(text.format(person=person, how=how, where=where)) 

which yields

Buster is feeling really hungry today, so he's not going camping. 
Buster is feeling really hungry today, so he's not going biking. 
Buster is feeling really sleepy today, so he's not going camping. 
Buster is feeling really sleepy today, so he's not going biking. 
Arthur is feeling really hungry today, so he's not going camping. 
Arthur is feeling really hungry today, so he's not going biking. 
Arthur is feeling really sleepy today, so he's not going camping. 
Arthur is feeling really sleepy today, so he's not going biking. 

To generate random sentences, you can use random.choice:

import random

for i in range(5):
    person = random.choice(persons)
    how = random.choice(hows)
    where = random.choice(wheres)
    print(text.format(person=person, how=how, where=where))

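If you must use square brackets rather than curly braces in your format, you can replace the square brackets with braces and then proceed as above: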

text = "[person] is feeling really [how] today, so he's not going [where]." 
text = text.replace('[','{').replace(']','}') 
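One caveat with a plain replace: if the text already contains literal curly braces, `format` will treat them as fields. A possible workaround, sketched here with a hypothetical helper named `brackets_to_braces`, is to escape existing braces by doubling them before converting the brackets.

```python
def brackets_to_braces(text):
    """Turn [name] into {name}, escaping any literal braces already present."""
    # Double existing braces first so str.format renders them literally.
    text = text.replace('{', '{{').replace('}', '}}')
    # Then convert the square-bracket tokens into format fields.
    return text.replace('[', '{').replace(']', '}')

template = brackets_to_braces("[person] likes the set {1, 2} and is going [where].")
print(template.format(person='Buster', where='camping'))
# Buster likes the set {1, 2} and is going camping.
```

This sketch assumes that every square bracket in the text marks a token; text with literal square brackets would need a different convention.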
+0

This 'person=person, how=how, where=where' thing could get really stupid if they have hundreds of them. – georg

+0

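I decided to stay away from 'format(**locals())' here because it doesn't make it clear how the substitution is done. But if you really do have hundreds of variables, 'format(**locals())' is the way to go. – unutbu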
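For comparison, here is a short sketch of both styles discussed in these comments: `format(**locals())` inside a function, and an explicit dictionary of picks. The function names and the `options` table are made up for this example.

```python
import random

text = "{person} is feeling really {how} today, so he's not going {where}."

def make_sentence():
    person = random.choice(['Buster', 'Arthur'])
    how = random.choice(['hungry', 'sleepy'])
    where = random.choice(['camping', 'biking'])
    # locals() holds exactly person, how and where at this point,
    # so each field is filled from the local variable of the same name.
    return text.format(**locals())

def make_sentence_explicit(options):
    # Build the mapping explicitly, which scales to many placeholders
    # without relying on local variable names.
    picks = {name: random.choice(values) for name, values in options.items()}
    return text.format(**picks)

# Hypothetical word lists, for illustration only.
options = {
    'person': ['Buster', 'Arthur'],
    'how': ['hungry', 'sleepy'],
    'where': ['camping', 'biking'],
}

print(make_sentence())
print(make_sentence_explicit(options))
```

The explicit dictionary keeps the substitution visible at the call site, which addresses the readability concern while still avoiding one keyword argument per placeholder.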