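- Remove the punctuation
- Split the text into words at newlines and spaces, then store the words in an array
- Check the text file for errors using the checkSpelling.m function file
- Sum up the errors to get the (assumed) total number of errors in the article
- If there is no suggestion, there is no error; return -1
- If the total number of errors is > 20, return 1
- If the total number of errors is <= 20, return -1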
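The counting and threshold steps in the list above can be sketched as follows. This is only an illustration: the `suggestion` values are made-up sample data in the shape that checkSpelling returns, and the names `numErrors`, `threshold` and `label` are hypothetical.

```matlab
% Hypothetical results, one entry per word, in the shape checkSpelling returns:
% [] for a correctly spelled word, otherwise a cell array of suggestions
% or the string 'no suggestion'.
suggestion = {[], {'hello', 'help'}, [], {'test'}, 'no suggestion'};

% Any non-empty entry means the word was flagged as misspelled.
numErrors = sum(~cellfun(@isempty, suggestion));

% Decision rule from the list: more than 20 errors gives 1, otherwise -1.
threshold = 20;
if numErrors > threshold
    label = 1;
else
    label = -1;
end
```

Note that 'no suggestion' is a non-empty string, so this count treats a misspelled word without suggestions as an error.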
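I want to check a paragraph for spelling errors, but I am having trouble getting rid of the punctuation. The problem may also have some other cause; the code returns an error when I run it. How can I get rid of the punctuation and check for spelling errors?
My data2 file is: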
checkSpelling.m
function suggestion = checkSpelling(word)
h = actxserver('word.application');
h.Document.Add;
correct = h.CheckSpelling(word);
if correct
    suggestion = []; %return empty if spelled correctly
else
    %If incorrect and there are suggestions, return them in a cell array
    if h.GetSpellingSuggestions(word).count > 0
        count = h.GetSpellingSuggestions(word).count;
        for i = 1:count
            suggestion{i} = h.GetSpellingSuggestions(word).Item(i).get('name');
        end
    else
        %If incorrect but there are no suggestions, return this:
        suggestion = 'no suggestion';
    end
end
%Quit Word to release the server
h.Quit
f19.m
for i = 1:1
    data2=fopen(strcat('DATA\PRE-PROCESS_DATA\F19\',int2str(i),'.txt'),'r')
    CharData = fread(data2, '*char')'; %read text file and store data in CharData
    fclose(data2);
    word_punctuation=regexprep(CharData,'[`~!@#$%^&*()-_=+[{]}\|;:\''<,>.?/','')
    word_newLine = regexp(word_punctuation, '\n', 'split')
    word = regexp(word_newLine, ' ', 'split')
    [sizeData b] = size(word)
    suggestion = cellfun(@checkSpelling, word, 'UniformOutput', 0)
    A19(i)=sum(~cellfun(@isempty,suggestion))
    feature19(A19(i)>=20)=1
    feature19(A19(i)<20)=-1
end
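Two things in f19.m look like likely causes of trouble. In the regexprep pattern, the bracket expression appears to end at the first `]`, so the rest of the pattern is read as a literal sequence and the punctuation is probably never matched; the unescaped `-` between `)` and `_` is also read as a character range. Separately, calling regexp with 'split' on a cell array returns a cell array of cell arrays, so checkSpelling would receive a cell rather than a string. A minimal sketch of one way to clean and split the text, assuming that keeping only letters, digits, apostrophes and whitespace is acceptable (the sample text and the names `txt`, `clean` and `words` are made up for illustration):

```matlab
% Hypothetical sample text standing in for the contents of one data file.
txt = sprintf('Helo, world!\nIt''s a tset; isn''t it?');

% Remove everything except letters, digits, apostrophes and whitespace.
% A negated character class avoids having to escape each punctuation mark.
clean = regexprep(txt, '[^a-zA-Z0-9''\s]', '');

% Split on any run of whitespace (spaces and newlines) in a single step,
% which gives a flat cell array of strings instead of nested cells.
words = strsplit(strtrim(clean));
words = words(~cellfun(@isempty, words));   % drop empty entries, if any

% Each element is now a plain string, so it could be passed on as:
% suggestion = cellfun(@checkSpelling, words, 'UniformOutput', false);
```

Apostrophes are kept here so that contractions such as "isn't" stay intact; dropping `''` from the pattern would strip them as well.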