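Read a file and skip the header part of a text file in Python
I downloaded a book in text format from gutenberg.org and am trying to read the text, but I want to skip the beginning part of the file and then parse the rest with a process function I wrote. How can I do this?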
Here is the start of the text file.
> The Project Gutenberg EBook of The Kama Sutra of Vatsyayana, by Vatsyayana
This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever. You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.gutenberg.net
Title: The Kama Sutra of Vatsyayana
Translated From The Sanscrit In Seven Parts With Preface,
Introduction and Concluding Remarks
Author: Vatsyayana
Translator: Richard Burton
Bhagavanlal Indrajit
Shivaram Parashuram Bhide
Release Date: January 18, 2009 [EBook #27827]
Language: English
*** START OF THIS PROJECT GUTENBERG EBOOK THE KAMA SUTRA OF VATSYAYANA ***
Produced by Bruce Albrecht, Carla Foust, Jon Noring and
the Online Distributed Proofreading Team at
http://www.pgdp.net
And here is my code, which currently processes the whole file.
import string

def process_file(filename):
    """Opens a file and returns a dict mapping each word to its count."""
    h = dict()
    fin = open(filename)
    for line in fin:
        process_line(line, h)
    return h

def process_line(line, h):
    # Treat hyphens as word separators.
    line = line.replace('-', ' ')
    for word in line.split():
        # Strip surrounding punctuation and whitespace, then normalize case.
        word = word.strip(string.punctuation + string.whitespace)
        word = word.lower()
        h[word] = h.get(word, 0) + 1
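Don't forget to close the file. You may want to use the 'with' keyword, i.e. 'with open(filename) as fin:'. When you exit the with context, the context manager closes the file for you. Also, I upvoted nightcraker's answer. – 2011-04-13 19:51:18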
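One possible way to combine the two points above (skipping the header and closing the file with 'with') is sketched below. It assumes the header always ends at the first line beginning with "*** START OF", as in the excerpt shown in the question; other Project Gutenberg files may use different wording. The names START_MARKER, skip_header and count_words are made up for this illustration, and the 'if word:' guard is a small addition that is not in the original code.

```python
import string

# Assumption: the header ends at the first line that starts with this text,
# as in the excerpt quoted in the question.
START_MARKER = "*** START OF"


def skip_header(lines, marker=START_MARKER):
    """Consume lines from an iterator up to and including the marker line."""
    for line in lines:
        if line.startswith(marker):
            break


def process_line(line, h):
    """Count the words of one line into the dict h."""
    line = line.replace('-', ' ')
    for word in line.split():
        word = word.strip(string.punctuation + string.whitespace).lower()
        if word:  # added guard: ignore tokens that were only punctuation
            h[word] = h.get(word, 0) + 1


def count_words(lines):
    """Skip the header, then count words in the remaining lines."""
    h = {}
    it = iter(lines)   # one shared iterator, so the second loop resumes
    skip_header(it)    # where skip_header stopped
    for line in it:
        process_line(line, h)
    return h


def process_file(filename):
    """Open a file and return a dict mapping each word to its count."""
    with open(filename) as fin:  # the file is closed on leaving the block
        return count_words(fin)
```

Because a file object is its own iterator, the loop in count_words continues from the line after the marker. If the marker never appears, the whole file is consumed by skip_header and the result is an empty dict, so a real script may want to check for that case.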