2013-12-10 54 views
2

Python: replace a word based on a condition

On standard input, I provide the following file:

#123  595739778  "neutral"  Won the match #getin
#164  595730008  "neutral"  Good girl

The data#2 file looks like this:

labels 1 0 -1
-1 0.272653 0.139626 0.587721
1 0.0977782 0.0748234 0.827398

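I want to look at the first value of each row in the data#2 file: if it is -1, replace the label with negative; if it is 1, with positive; if it is 0, with neutral.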

Here are my problems:

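  1. Start reading the data#2 file from the second line.
  2. I am having trouble with the replacement. I want to replace it as shown below, but it raises an error saying it expects one more argument, even though I already pass 2 arguments.
  3. If I do it like the following (note the print statement):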

    if binary == "-1": 
        senti = str.replace(senti.strip('"'),"negative") 
    elif binary == "1": 
        senti = str.replace(senti.strip('"'),"positive") 
    elif binary == "0": 
        senti = str.replace(senti.strip('"'),"neutral") 
    print id, "\t", num, "\t", senti, "\t", sent 
    

    But if I do it like this (note where the print is), it does not get out of the 'if condition':

    if binary == "-1": 
        senti = str.replace(senti.strip('"'),"negative") 
    elif binary == "1": 
        senti = str.replace(senti.strip('"'),"positive") 
    elif binary == "0": 
        senti = str.replace(senti.strip('"'),"neutral") 
    

    print id, "\t", num, "\t", senti, "\t", sent

How should I print it, then? The output I get is:

    #123  595739778  "neutral"  Won the match #getin
    #164  595730008  "neutral"  Good girl

Expected output (the replacement only swaps in negative, positive, or neutral according to the data#2 file):

    #123  595739778  negative  Won the match #getin 
    #164  595730008  positive  Good girl 

Error:

Traceback (most recent call last):
  File "./combine.py", line 17, in <module>
    senti = str.replace(senti.strip('"'),"negative")
TypeError: replace() takes at least 2 arguments (1 given)

Here is my code:

for line in sys.stdin: 
    (id,num,senti,sent) = re.split("\t+",line.strip()) 
    tweet = re.split("\s+", sent.strip().lower()) 
    f = open("data#2.txt","r") 
    for line1 in f: 
     (binary,rest,rest1,test2) = re.split("\s", line1.strip()) 
     if binary == "-1": 
      senti = str.replace(senti.strip('"'),"negative") 
     elif binary == "1": 
      senti = str.replace(senti.strip('"'),"positive") 
     elif binary == "0": 
      senti = str.replace(senti.strip('"'),"neutral") 
     print id, "\t", num, "\t", senti, "\t", sent 

Can you post the error you are getting? – qmorgan

@qmorgan check my edit – fscore

Answers

3

You are actually missing an argument to replace. Because it is a method of the string itself, you can call it in either of two ways:

In [72]: str.replace('one','o','1') 
Out[72]: '1ne' 

In [73]: 'one'.replace('o','1') 
Out[73]: '1ne' 
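
To make the error from the question concrete, here is a small sketch of the two call styles next to the failing one. It is written in Python 3 syntax (the code in this thread is Python 2), the helper name `try_unbound_replace` is purely illustrative, and the exact wording of the `TypeError` message varies between Python versions.

```python
def try_unbound_replace(text, new):
    """Call str.replace the way the question does: one argument short."""
    try:
        # str.replace(text, new) treats `text` as the string and `new` as
        # the `old` substring, so the replacement argument is missing.
        return str.replace(text, new)
    except TypeError as exc:
        return "TypeError: %s" % exc

# These two calls are equivalent; both pass `old` and `new` explicitly.
unbound = str.replace("one", "o", "1")
bound = "one".replace("o", "1")

print(unbound, bound)
print(try_unbound_replace("neutral", "negative"))
```

The second argument the interpreter asks for is the replacement text: in the unbound form, the first argument is the string being operated on.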

In your code, you would probably want something like:

if binary == "-1": 
     senti = senti.strip('"').replace("-1","negative") 
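
One caveat, sketched below in Python 3 syntax: `replace` only changes the string when the `old` substring actually occurs in it, and the sample `senti` values contain `neutral`, not `-1`. Assuming the goal is simply to overwrite the label, a dictionary lookup is one alternative. `LABELS` and `relabel` are hypothetical names, not part of the original code.

```python
# Mapping taken from the question: -1 -> negative, 1 -> positive, 0 -> neutral.
LABELS = {"-1": "negative", "1": "positive", "0": "neutral"}

def relabel(senti, binary):
    """Return the label for `binary`, or the unquoted original if unknown."""
    return LABELS.get(binary, senti.strip('"'))

# replace() leaves the string untouched when `old` does not occur in it.
unchanged = '"neutral"'.strip('"').replace("-1", "negative")

print(unchanged)
print(relabel('"neutral"', "-1"))
```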

To skip the first line of the data#2 file, one option is:

f = open("data#2.txt","r") 
for line1 in f.readlines()[1:]: # skip the first line 
    #rest of your code here 
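
Another way to skip the header, shown here as a rough sketch in Python 3 syntax, is to consume one line with `next()` before looping, which avoids reading the whole file into a list. The function name `rows_after_header` is made up for illustration, and `io.StringIO` stands in for the real file so the example runs on its own.

```python
import io

def rows_after_header(handle):
    """Yield stripped, non-empty lines from `handle`, skipping the first line."""
    next(handle, None)  # consume the header; the default avoids an error on empty input
    for line in handle:
        line = line.strip()
        if line:
            yield line

# Stand-in for open("data#2.txt"), using the sample rows from the question.
sample = io.StringIO(
    "labels 1 0 -1\n"
    "-1 0.272653 0.139626 0.587721\n"
    "1 0.0977782 0.0748234 0.827398\n"
)
rows = list(rows_after_header(sample))
print(rows)
```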

Edit: after our chat conversation, I think what you want is more like the following:

import re
import sys

# Read the data#2 file once, skipping its header line
f = open("data#2.txt","r")
datalines = f.readlines()[1:]
f.close()

count = 0

for line in sys.stdin:
    if count == len(datalines): break # stop once every data#2 row has been used
    (tweetid,num,senti,tweets) = re.split(r"\t+",line.strip())
    tweet = re.split(r"\s+", tweets.strip().lower())
    # grab the data#2 row that corresponds to this input line
    (binary,rest,rest1,test2) = re.split(r"\s", datalines[count].strip())
    if binary == "-1":
        sentiment = "negative"
    elif binary == "1":
        sentiment = "positive"
    elif binary == "0":
        sentiment = "neutral"
    # print once per input line, outside the if/elif chain
    print tweetid, "\t", num, "\t", sentiment, "\t", tweets
    count += 1 # move on to the next data#2 row
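
For comparison, the same pairing can be sketched with `zip`, which walks both inputs in step and stops at the shorter one, so no counter is needed. This is only an illustration in Python 3 syntax: `LABELS` and `relabel_tweets` are hypothetical names, and it assumes the input is tab-separated as in the `re.split` call above and that row N of the data#2 file belongs to input line N.

```python
import io
import re

LABELS = {"-1": "negative", "1": "positive", "0": "neutral"}

def relabel_tweets(tweet_lines, data_lines):
    """Pair each tweet line with its data row and swap in the mapped label."""
    out = []
    # zip stops at the shorter input, replacing the manual counter and break.
    for line, data in zip(tweet_lines, data_lines):
        tweetid, num, senti, text = re.split(r"\t+", line.strip())
        binary = data.split()[0]
        # Fall back to the original (unquoted) label for unknown values.
        sentiment = LABELS.get(binary, senti.strip('"'))
        out.append("\t".join([tweetid, num, sentiment, text]))
    return out

# Stand-ins for sys.stdin and the data#2 rows after the header.
tweets = io.StringIO(
    '#123\t595739778\t"neutral"\tWon the match #getin\n'
    '#164\t595730008\t"neutral"\tGood girl\n'
)
data = ["-1 0.272653 0.139626 0.587721", "1 0.0977782 0.0748234 0.827398"]
for row in relabel_tweets(tweets, data):
    print(row)
```

Because each input line is paired with exactly one data row, each tweet is emitted once, rather than once per data#2 row as happens when the file is re-read inside the loop.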

Hi, that worked, but check #3 in my edit – fscore

I can't understand what you are saying here. Can you rephrase it? – qmorgan

Please check my edit – fscore