2016-03-14 616 views
3
import csv 
with open('Class1scores.csv') as inf: 
    for line in inf:
        parts = line.split()
        if len(parts) > 1:
            print(parts[4])


f = open('Class1scores.csv') 
csv_f = csv.reader(f) 
newlist = [] 
for row in csv_f: 

    row[1] = int(row[1]) 
    row[2] = int(row[2]) 
    row[3] = int(row[3]) 

    maximum = max(row[1:3]) 
    row.append(maximum) 
    average = round(sum(row[1:3])/3) 
    row.append(average) 
    newlist.append(row[0:4]) 

averageScore = [[x[3], x[0]] for x in newlist] 
print('\nStudents Average Scores From Highest to Lowest\n') 

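The code here is meant to read the CSV file and, for the three score columns (column 0 is the user name), add up all three scores and divide by three. It doesn't calculate a proper average, though; it just takes the score from the last column.

Formatting data in a CSV file in Python (calculating an average)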

csv file

+1

Can you post the first few lines of your CSV file? – Igor

+0

What is the point of opening the file twice? – Seekheart

+0

Billy, take a look at my answer. You can cut out the parts you don't need and adapt the rest to your own needs. – Igor

Answers

3

Basically, you want statistics for each row. In general, you should do it like this:

import csv 

with open('data.csv', 'r', newline='') as f:
    rows = csv.reader(f)
    for row in rows:
        if not row:
            continue  # skip blank lines

        name = row[0]
        # csv.reader yields strings, so convert the scores to integers
        scores = [int(score) for score in row[1:]]

        # calculate statistics of scores
        attributes = {
            'NAME': name,
            'MAX': max(scores),
            'MIN': min(scores),
            'AVE': 1.0 * sum(scores) / len(scores)
        }

        output_mesg = "name: {NAME:s} \t high: {MAX:d} \t low: {MIN:d} \t ave: {AVE:f}"
        print(output_mesg.format(**attributes))

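Try not to worry about whether doing a particular thing is locally inefficient. A good Pythonic script should be as readable as possible for everyone.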

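In your code, I found two errors:

  1. Appending to row has no lasting effect: row is a local variable in the loop, and only the slice row[0:4] is stored in newlist, so the appended maximum and average are discarded.

  2. row[1:3] gives only the second and third elements. row[1:4] gives what you want, as does row[1:]. Slice indices in Python are end-exclusive.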
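The end-exclusive slicing described in point 2 can be checked with a small sketch. The row below is made-up sample data, not taken from the asker's file, and the variable names are chosen only for illustration.

```python
# Hypothetical row, as it would look after the int() conversions in the question.
row = ['John', 7, 10, 9]

first_two = row[1:3]    # indices 1 and 2 only; index 3 is excluded
all_three = row[1:4]    # indices 1, 2 and 3
to_the_end = row[1:]    # everything after the name

# Dividing by len() avoids hard-coding the number of scores.
average = sum(all_three) / len(all_three)

print(first_two)
print(all_three)
print(to_the_end)
print(average)
```

With the slice corrected, the sum covers all three scores instead of only the first two.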

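And some questions for you to think about:

If I can open the file in Excel and it isn't that big, why not just do it in Excel? Can I use every tool at hand to get the job done as quickly as possible? Can I finish this task within 30 seconds?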

+1

Your code contains an error, as it doesn't output anything. Doesn't the print line need parentheses, since it's Python 3? – Billy

+0

@Billy I looked up the Python 3 print examples at https://docs.python.org/3.1/tutorial/inputoutput.html and added parentheses and a format function call. Hopefully that is right. – Mai

+0

As I'm new to Python, I'm curious why you multiply sum(scores)/len(scores) by 1.0. Is it to prevent the use of an integer data type? –
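A quick sketch of the division behaviour this comment asks about: in Python 3 the `/` operator always performs true division, so the `1.0 *` factor only matters on Python 2, where `/` between two integers truncates. The scores below are invented for illustration.

```python
scores = [7, 10, 9]  # made-up sample values

true_avg = sum(scores) / len(scores)           # Python 3: always a float
floor_avg = sum(scores) // len(scores)         # floor division: an integer
forced_avg = 1.0 * sum(scores) / len(scores)   # a float on Python 2 and 3 alike

print(true_avg, floor_avg, forced_avg)
```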

2

Here is one way to do it, in two parts. First, we create a dictionary with the names as keys and lists of results as values.

import csv 


fileLineList = [] 
averageScoreDict = {} 

with open('Class1scores.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        fileLineList.append(row)

for row in fileLineList:
    highest = 0
    lowest = 0
    total = 0
    average = 0
    for column in row:
        # Only the score columns are digits; the name is skipped.
        if column.isdigit():
            column = int(column)
            if column > highest:
                highest = column
            # lowest == 0 means "not set yet".
            if column < lowest or lowest == 0:
                lowest = column
            total += column
    # Assumes exactly three scores per row.
    average = total / 3
    averageScoreDict[row[0]] = [highest, lowest, round(average)]

print(averageScoreDict) 

Output:

{'Milky': [7, 4, 5], 'Billy': [6, 5, 6], 'Adam': [5, 2, 4], 'John': [10, 7, 9]}

Now that we have our dictionary, we can turn it into a list and sort that to create the final output you want. See this updated code:

import csv 
from operator import itemgetter 


fileLineList = [] 
averageScoreDict = {} # Creating an empty dictionary here. 

with open('Class1scores.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        fileLineList.append(row)

for row in fileLineList:
    highest = 0
    lowest = 0
    total = 0
    average = 0
    for column in row:
        if column.isdigit():
            column = int(column)
            if column > highest:
                highest = column
            if column < lowest or lowest == 0:
                lowest = column
            total += column
    average = total / 3
    # Here is where we put the empty dictionary created earlier to good use.
    # We assign the list of values to the key, in this case the contents of
    # the first column of the CSV.
    # For the first line of the file, the key would be 'John'.
    # We are assigning a list to John which is 3 integers:
    # highest, lowest and average (which is a float we round).
    averageScoreDict[row[0]] = [highest, lowest, round(average)]

averageScoreList = []

# Here we "unpack" the dictionary we have created and build a list of pairs:
# the key, which is the name, and the single value we want, in this case
# the average.
for key, value in averageScoreDict.items():
    averageScoreList.append([key, value[2]])

# Sorting the list using the value instead of the name.
averageScoreList.sort(key=itemgetter(1), reverse=True)

print('\nStudents Average Scores From Highest to Lowest\n') 
print(averageScoreList) 

Output:

Students Average Scores From Highest to Lowest

[['John', 9], ['Billy', 6], ['Milky', 5], ['Adam', 4]]

+0

Could you comment each line, so that I get a better understanding of the dictionary concept? – Billy

+0

I'll do you one better: read this. I think he does a great job of explaining the concept. [Hands-on Python Tutorial > 1.12. Dictionaries](http://anh.cs.luc.edu/python/hands-on/3.1/handsonHtml/dictionaries.html#dictionaries) – Igor

+0

Also, is there any way to print the data line by line, since it is stored in an array and you can't just use "\n"? – Billy
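One possible way to print such a list one entry per line is to loop over it, or to build one string per entry and join them with "\n". This is a minimal sketch that uses the list from the output above as sample data; the name `lines` is introduced here for illustration only.

```python
averageScoreList = [['John', 9], ['Billy', 6], ['Milky', 5], ['Adam', 4]]

# Option 1: loop over the list and print each pair on its own line.
for name, average in averageScoreList:
    print(name, average)

# Option 2: build one string per entry, then join the strings with newlines.
lines = ['{}: {}'.format(name, average) for name, average in averageScoreList]
print('\n'.join(lines))
```

The join approach works because "\n" is applied to strings built from the list, not to the list itself.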