2017-05-15 40 views

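Count occurrences of a string in one column based on the value in another column (Python)

Sorry for the really basic question. I know there are posts about this everywhere, but I can't get my head around it, even with all the help on other pages.

To start with, I'm a beginner, so apologies for the vague code. I just want to count how many times certain strings occur in column 2 for as long as the value in column 1 stays the same. When that value changes, the count should start over. It sounds simple, but I'm confused by Python reading my text file as a string (which gives me trouble with strip, split and so on). I can't get this code to work. Could someone please help out this struggling noob?

Input: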

6 ABMV
6 ABMV
6 FOOD
6 FOOD
6 IDLE
10 IDLE
10 ABMV
10 IDLE

Code:

#! /usr/bin/env python

from collections import Counter

outfile = open ("counts_outfile.txt", "w")

with open("test_counts.txt", "r") as infile:
    lines = infile.readlines()
    for i, item in enumerate(lines):
        lines[i] = item.rstrip().split('\t')
    last_chimp = lines[0][0]
    behavior = lines[0][1]
    nr_ABMV = 0
    nr_FOOD = 0
    nr_IDLE = 0

    for lines in infile:
        chimp = lines[0][0]
        behavior = lines[0][1]
        if chimp == last_chimp:
            if behavior == "ABMV":
                nr_ABMV += 1
            elif behavior == "FOOD":
                nr_FOOD += 1
            elif behavior == "IDLE":
                nr_IDLE += 1
            else:
                continue
    else:
        outline = "chimp_header %s\t%s\t%s\t%s" % (last_chimp, nr_ABMV, nr_FOOD, nr_IDLE)
        outfile.write(outline)
        last_chimp == lines[0][0]
        nr_ABMV = 0
        nr_FOOD = 0
        nr_IDLE = 0

outfile.close()

Thanks in advance. You would be helping me, and apparently a lot of chimps (chimpanzees), a great deal!

Regards,

Can you include the expected output?

Answer

Here is an example that stays very close to your code:

with open("test_counts.txt", "r") as infile:
    # Keep only non-blank lines, stripped of surrounding whitespace.
    lines = [line.strip() for line in infile if line.strip()]

# In text mode "\n" is translated to the platform's line ending on write.
row = "chimp_header {:>4} {:4} {:4} {:4}\n"
last_chimp = lines[0].split()[0]
behavior = {"ABMV": 0, "FOOD": 0, "IDLE": 0}

with open("counts_outfile.txt", "w") as outfile:
    outfile.write(row.format('chimp', 'ABMV', 'FOOD', 'IDLE'))

    for line in lines:
        line_split = line.split()
        chimp = line_split[0]

        if chimp != last_chimp:
            # The chimp changed: write the finished counts and start over.
            outfile.write(row.format(last_chimp, behavior["ABMV"], behavior["FOOD"], behavior["IDLE"]))
            last_chimp = chimp
            behavior = {"ABMV": 0, "FOOD": 0, "IDLE": 0}
        behavior[line_split[1]] += 1

    # The last chimp is never followed by a change, so write it here.
    outfile.write(row.format(last_chimp, behavior["ABMV"], behavior["FOOD"], behavior["IDLE"]))
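
For anyone comparing this with the attempt in the question, the sketch below isolates three things that appear to trip it up: a file can only be iterated once after `readlines()`, indexing an unsplit line gives characters instead of columns, and `==` compares without assigning. It uses `io.StringIO` as a stand-in for test_counts.txt and assumes tab-separated columns, as the `split('\t')` in the question suggests; the three sample rows are made up for illustration.

```python
import io

# Stand-in for test_counts.txt (assumed tab-separated; illustrative rows only).
infile = io.StringIO("6\tABMV\n6\tFOOD\n10\tIDLE\n")

rows = infile.readlines()               # reads the file to the end
leftover = [line for line in infile]    # a second pass finds nothing left
print(leftover)                         # []

# Indexing an unsplit line returns single characters, not columns.
raw = rows[2]                           # "10\tIDLE\n"
print(repr(raw[0][0]))                  # '1'
print(raw.rstrip().split("\t"))         # ['10', 'IDLE']

# `==` only compares and throws the result away; `=` assigns.
last_chimp = "6"
last_chimp == "10"
print(last_chimp)                       # 6
last_chimp = "10"
print(last_chimp)                       # 10
```

Looping over the already-split list instead of the file object, and assigning with `=`, avoids all three.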

Here is another example, using Counter and a dictionary:

from collections import Counter

with open("test_counts.txt", "r") as infile:
    # One tuple of columns, (chimp, behavior), per non-blank line.
    lines = [tuple(line.split()) for line in infile if line.strip()]

# Start every chimp at zero so a behavior it never shows still prints as 0.
chimps = {line[0]: {"ABMV": 0, "FOOD": 0, "IDLE": 0} for line in lines}
for key, count in Counter(lines).items():
    chimps[key[0]][key[1]] = count

# In text mode "\n" is translated to the platform's line ending on write.
row = "chimp_header {:>4} {:4} {:4} {:4}\n"
with open("counts_outfile.txt", "w") as outfile:
    outfile.write(row.format('chimp', 'ABMV', 'FOOD', 'IDLE'))
    # Dicts keep insertion order on Python 3.7+, so chimps come out in file order.
    for chimp, counts in chimps.items():
        outfile.write(row.format(chimp, counts["ABMV"], counts["FOOD"], counts["IDLE"]))

Both examples produce the same result:

chimp_header chimp ABMV FOOD IDLE
chimp_header    6    2    2    1
chimp_header   10    1    0    2
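
A third variant, offered only as a sketch: `itertools.groupby` groups consecutive rows that share the same first column, which matches the "start over when the value changes" rule from the question. The function name `count_runs` and the `BEHAVIORS` tuple are names made up for this illustration, and the sample rows are copied from the question's input.

```python
from collections import Counter
from itertools import groupby

# Hypothetical helper names; the behavior codes are those from the question.
BEHAVIORS = ("ABMV", "FOOD", "IDLE")

def count_runs(lines):
    """Yield (chimp, [counts]) for each run of consecutive rows with the same chimp."""
    rows = (line.split() for line in lines if line.strip())
    for chimp, group in groupby(rows, key=lambda row: row[0]):
        counts = Counter(row[1] for row in group)
        # A Counter returns 0 for behaviors that never occurred in this run.
        yield chimp, [counts[name] for name in BEHAVIORS]

sample = """6 ABMV
6 ABMV
6 FOOD
6 FOOD
6 IDLE
10 IDLE
10 ABMV
10 IDLE
""".splitlines()

for chimp, counts in count_runs(sample):
    print(chimp, *counts)
# 6 2 2 1
# 10 1 0 2
```

One difference worth knowing: `groupby` starts a new group whenever the chimp ID changes, so an ID that reappears later in the file gets a second row, whereas the Counter and dictionary version merges all rows for the same ID.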

I hope this gives you some ideas.

Thank you very much! :) – visse226
