Using the code below (a bit messy, I admit), I split a string on commas, with one condition: the string is not split when the commas only set off single words. For example, it does not split "Yup, there's a reason why you want to hit the sack just minutes after climax", but it does split "The increase in heart rate, which you get from masturbating, is directly beneficial to the circulation, and can reduce the likelihood of a heart attack" into
['The increase in heart rate', 'which you get from masturbating', 'is directly beneficial to the circulation', 'and can reduce the likelihood of a heart attack']
Split string by comma, but on conditions (ignore comma separated single words)
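The problem is that the code fails its purpose when it encounters a string like this: "When men ejaculate, it releases a slew of chemicals including oxytocin, vasopressin, and prolactin, all of which naturally help you hit the pillow."
I don't want a split after oxytocin, only after prolactin. I need a regex to do this.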
    import os
    import textwrap
    import re
    import io
    from textblob import TextBlob

    # input_string holds the text to split (defined elsewhere)
    string = str(input_string)
    listy = [x.strip() for x in string.split(',')]
    listy = [x.replace('\n', '') for x in listy]
    # Drop periods that are not part of a decimal number
    listy = [re.sub(r'(?<!\d)\.(?!\d)', '', x) for x in listy]
    # Remove any empty strings; list() so the result can be indexed in Python 3
    listy = list(filter(None, listy))
    newstring = []
    for segment in listy:
        wc = TextBlob(segment).word_counts
        if listy[-1] != segment:
            if len(wc) > 3:  # len(segment.split(' ')) > 7:
                # Long segment: mark a real split point
                newstring.append(segment + "&&")
            else:
                # Short segment: keep the comma, do not split here
                newstring.append(segment + ",")
        else:
            newstring.append(segment)
    sep = [x.strip() for x in (' '.join(newstring)).split('&&')]
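One possible regex-only direction, offered as a rough sketch rather than a definitive answer: split on a comma unless what follows it is a single word (optionally preceded by "and" or "or") that is itself followed by another comma. The pattern and the names `LIST_AWARE_COMMA` and `split_clauses` are hypothetical and do not come from the original post.

```python
import re

# Hypothetical pattern (an assumption, not from the original post): split on a
# comma unless the text after it is a single word, optionally preceded by
# "and"/"or", that is itself followed by another comma. Such a comma is
# treated as sitting inside a list of single words.
LIST_AWARE_COMMA = re.compile(r",(?!\s+(?:and\s+|or\s+)?\w+,)")


def split_clauses(text):
    """Split text on commas that do not look like list separators."""
    parts = LIST_AWARE_COMMA.split(text)
    return [part.strip() for part in parts if part.strip()]


sentence = (
    "When men ejaculate, it releases a slew of chemicals including oxytocin, "
    "vasopressin, and prolactin, all of which naturally help you hit the pillow."
)
for clause in split_clauses(sentence):
    print(clause)
# When men ejaculate
# it releases a slew of chemicals including oxytocin, vasopressin, and prolactin
# all of which naturally help you hit the pillow.
```

This sketch has known gaps. It still splits "Yup, there's a reason ..." after "Yup", because the lookahead only inspects what comes after a comma, so a word-count check like the one in the code above would still be needed for that case. It also does not protect a list that ends the sentence without a trailing comma.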
Although I believe the correct English usage is 'a, b and c' rather than 'a, b, and c'. So if it is proper English, then just ',(?!\s+\w+,)' would work. – kaza
Sure, thanks for the detailed answer. Upvoting you.
Excellent answer.