2012-11-24
0

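Hello, just a simple question, I hope. I want to use this program to generate random text from a corpus, in this case part of a book. I get an attribute error as soon as the program starts.

I have a text file that is my corpus (this is just the beginning; I won't post the whole thing here):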

The Project Gutenberg EBook of My Man Jeeves, by P. G. Wodehouse 
#27 in our series by P. G. Wodehouse 

Copyright laws are changing all over the world. Be sure to check the 
copyright laws for your country before downloading or redistributing 
this or any other Project Gutenberg eBook. 

This header should be the first thing seen when viewing this Project 
Gutenberg file. Please do not remove it. Do not change or edit the 
header without written permission. 

Please read the "legal small print," and other information about the 
eBook and Project Gutenberg at the bottom of this file. Included is 
important information about your specific rights and restrictions in 
how the file may be used. You can also find out about how to make a 
donation to Project Gutenberg, and how to get involved. 

etc etc etc 

Next, this is the class I want to use:

import random 

class Markov(object):

    def __init__(self, open_file):
        self.cache = {}
        self.open_file = open_file
        self.words = self.file_to_words()
        self.word_size = len(self.words)
        self.database()


def file_to_words(self):
    self.open_file.seek(0)
    data = self.open_file.read()
    words = data.split()
    return words


def triples(self):
    """ Generates triples from the given data string. So if our string were
        "What a lovely day", we'd generate (What, a, lovely) and then
        (a, lovely, day).
    """

    if len(self.words) < 3:
        return

    for i in range(len(self.words) - 2):
        yield (self.words[i], self.words[i+1], self.words[i+2])

def database(self):
    for w1, w2, w3 in self.triples():
        key = (w1, w2)
        if key in self.cache:
            self.cache[key].append(w3)
        else:
            self.cache[key] = [w3]

def generate_markov_text(self, size=25):
    seed = random.randint(0, self.word_size-3)
    seed_word, next_word = self.words[seed], self.words[seed+1]
    w1, w2 = seed_word, next_word
    gen_words = []
    for i in xrange(size):
        gen_words.append(w1)
        w1, w2 = w2, random.choice(self.cache[(w1, w2)])
    gen_words.append(w2)
    return ' '.join(gen_words)

Finally, here is the main script, which gives the error: "'Markov' object has no attribute 'file_to_words'"

import Class 
file_ = open('derp.txt') 
markov = Class.Markov(file_) 
markov.generate_markov_text() 

What is wrong here? Thanks.

+2

Your file_to_words is not indented, so it is not part of the Markov class. It is a bare function. – Keith

Answers

2

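You need to indent the file_to_words method to make it part of the Markov class. The way you have it now, it is a module-level function in the Class module. Move everything in the file_to_words method (including the def line) 4 spaces to the right.

Update: the same goes for all the other methods. Python uses whitespace/indentation to denote scope.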
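To make the fix concrete, here is a sketch of the class with every method indented under `class Markov`. It is not the asker's exact code. It assumes Python 3 (so `range` replaces `xrange`), it uses `io.StringIO` as a stand-in for the opened file, and it adds a guard for the case where the chain reaches a word pair with no recorded follower, which the posted code does not handle.

```python
import io
import random


class Markov(object):
    def __init__(self, open_file):
        self.cache = {}
        self.open_file = open_file
        self.words = self.file_to_words()
        self.word_size = len(self.words)
        self.database()

    # Indented one level: this is now a method of Markov.
    def file_to_words(self):
        self.open_file.seek(0)
        return self.open_file.read().split()

    def triples(self):
        # Yields nothing when there are fewer than three words.
        for i in range(len(self.words) - 2):
            yield (self.words[i], self.words[i + 1], self.words[i + 2])

    def database(self):
        # Map each pair of consecutive words to the words that follow it.
        for w1, w2, w3 in self.triples():
            self.cache.setdefault((w1, w2), []).append(w3)

    def generate_markov_text(self, size=25):
        # Needs at least three words, as in the posted code.
        seed = random.randint(0, self.word_size - 3)
        w1, w2 = self.words[seed], self.words[seed + 1]
        gen_words = []
        for _ in range(size):
            gen_words.append(w1)
            followers = self.cache.get((w1, w2))
            if not followers:
                # Assumption: stop early instead of raising KeyError.
                break
            w1, w2 = w2, random.choice(followers)
        gen_words.append(w2)
        return ' '.join(gen_words)


# Stand-in corpus; with a real file, pass open('derp.txt') instead.
sample = io.StringIO("What a lovely day what a lovely night")
markov = Markov(sample)
print(markov.generate_markov_text(size=5))
```

Note that `generate_markov_text` only returns a string. When the code is run as a script rather than typed into an interactive session, nothing is displayed unless the result is passed to `print`.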

+0

Thanks.. now I feel like an idiot. – user1378618

+0

Hey, we've all done it once. That's how you learn. – Whatang

+0

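One last question: can you explain why the program doesn't produce any word output now that I run it? I basically just took the program from here and tried running it for fun, but I can't match his output. http://agiliq.com/blog/2009/06/generating-pseudo-random-text-with-markov-chains-u/ – user1378618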

1

From the code you posted, all methods other than __init__ are not part of the Markov class because of their indentation.
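A stripped-down illustration of the same effect, assuming Python 3. The names `Broken`, `Fixed`, and `helper` are made up for this example and do not appear in the question.

```python
class Broken(object):
    def __init__(self):
        # Looked up on the instance and its class; nothing is found there.
        self.value = self.helper()


# Not indented, so this is a module-level function, not a method of Broken.
def helper(self):
    return 42


class Fixed(object):
    def __init__(self):
        self.value = self.helper()

    # Indented under the class, so it is a method of Fixed.
    def helper(self):
        return 42


try:
    Broken()
except AttributeError as exc:
    print(exc)

print(Fixed().value)  # prints 42
```

`Broken()` fails in the same way as the Markov class in the question, because `__init__` calls a method that the class never received.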