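Using Python, I have to write a script that basically "cleans" a text file of data. So far I have removed all the unwanted characters or replaced them with acceptable ones (for example, a dash - can be replaced with a space). Now I have reached the point where I have to separate words that are joined together. Here is a snippet of the first 15 lines of the text file, in which joined words need to be separated at capital letters: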
AccessibleComputing Computer accessibility
AfghanistanHistory History of Afghanistan
AfghanistanGeography Geography of Afghanistan
AfghanistanPeople Demographics of Afghanistan
AfghanistanCommunications Communications in Afghanistan
AfghanistanMilitary Afghan Armed Forces
AfghanistanTransportations Transport in Afghanistan
AfghanistanTransnationalIssues Foreign relations of Afghanistan
AssistiveTechnology Assistive technology
AmoeboidTaxa Amoeba
AsWeMayThink As We May Think
AlbaniaHistory History of Albania
AlbaniaPeople Demographics of Albania
AlbaniaEconomy Economy of Albania
AlbaniaGovernment Politics of Albania
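What I want to do is separate the joined words at the point where a capital letter occurs. For example, I want the first line to look like this: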
Accessible Computing Computer accessibility
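One possible way to express this rule is a regular expression that matches the empty position between a lowercase letter and an uppercase letter, and puts a space there. This is only a minimal sketch: the function name `split_joined_words` is hypothetical, and it assumes the joined words contain plain ASCII letters only, as in the sample above.

```python
import re

# Zero-width pattern: a position preceded by a lowercase letter
# and followed by an uppercase letter (ASCII only, by assumption).
BOUNDARY = re.compile(r'(?<=[a-z])(?=[A-Z])')

def split_joined_words(line):
    """Insert a space wherever a capital letter directly follows a lowercase one."""
    return BOUNDARY.sub(' ', line)

print(split_joined_words("AccessibleComputing Computer accessibility"))
# prints: Accessible Computing Computer accessibility
```

Words that are already separated are left alone, because a capital letter that follows a space does not match the pattern.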
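The script must take a file as input and write the result to an output file. This is what I have so far, and it doesn't work at all! (I don't know whether I'm on the right track or not.)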
import re
input_file = open("C:\\Users\\Lucas\\Documents\\Python\\pagelinkSample_10K_cleaned2.txt",'r')
output_file = open("C:\\Users\\Lucas\\Documents\\Python\\pagelinkSample_10K_cleaned3.txt",'w')
for line in input_file:
    if line.contains('A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'):
        newline = line.
    output_file.write(newline)
input_file.close()
output_file.close()
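What I want to do is insert a space before each capital letter that is attached to the previous word. I saw this topic earlier, but I couldn't figure out the file input :( – lsch91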
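For the file-input part, a minimal sketch might look like the following. Note that Python strings have no `contains` method, so the sketch applies a regular-expression substitution to every line instead of testing for capitals first. The names `split_joined_words` and `clean_file` are hypothetical, and the pattern assumes ASCII letters only.

```python
import re

def split_joined_words(line):
    # Put a space at each lowercase-to-uppercase boundary (ASCII letters assumed).
    return re.sub(r'(?<=[a-z])(?=[A-Z])', ' ', line)

def clean_file(in_path, out_path):
    # The with statement closes both files even if an error occurs mid-loop.
    with open(in_path, 'r') as src, open(out_path, 'w') as dst:
        for line in src:
            # Each line keeps its trailing newline, so write() needs no extra one.
            dst.write(split_joined_words(line))

# Example call, using the paths from the question:
# clean_file("C:\\Users\\Lucas\\Documents\\Python\\pagelinkSample_10K_cleaned2.txt",
#            "C:\\Users\\Lucas\\Documents\\Python\\pagelinkSample_10K_cleaned3.txt")
```

Iterating over the file object reads one line at a time, so the whole file never has to fit in memory.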