2016-04-21 171 views

Formatting JSON output

I have a JSON file containing key-value data. It looks like this:

{ 
    "professors": [ 
     { 
      "first_name": "Richard", 
      "last_name": "Saykally", 
      "helpfullness": "3.3", 
      "url": "http://www.ratemyprofessors.com/ShowRatings.jsp?tid=111119", 
      "reviews": [ 
       { 
        "attendance": "N/A", 
        "class": "CHEM 1A", 
        "textbook_use": "It's a must have", 
        "review_text": "Tests were incredibly difficult (averages in the 40s) and lectures were essentially useless. I attended both lectures every day and still was unable to grasp most concepts on the midterms. Scope out a good GSI to get help and ride the curve." 
       }, 
       { 
        "attendance": "N/A", 
        "class": "CHEMISTRY1A", 
        "textbook_use": "Essential to passing", 
        "review_text": "Saykally really isn't as bad as everyone made him out to be. If you go to his lectures he spends about half the time blowing things up, but if you actually read the texts before his lectures and pay attention to what he's writing/saying, you'd do okay. He posts practice tests that were representative of actual tests and curves the class nicely!" 
       }] 
     }, 
     { 
     "first_name": "Laura", 
     "last_name": "Stoker", 
     "helpfullness": "4.1", 
     "url": "http://www.ratemyprofessors.com/ShowRatings.jsp?tid=536606", 
     "reviews": [ 
      { 
       "attendance": "N/A", 
       "class": "PS3", 
       "textbook_use": "You need it sometimes", 
       "review_text": "Stoker is by far the best professor. If you put in the effort, take good notes, and ask questions, you will be fine in the class. As far as her lecture, she does go a bit fast, but her lecture is in the form of an outline. As long as you take good notes, you will have everything you need for exams. She is funny and super nice if you speak with her" 
      }, 
      { 
       "attendance": "Mandatory", 
       "class": "164A", 
       "textbook_use": "Barely cracked it open", 
       "review_text": "AMAZING professor. She has a good way of keeping lectures interesting. Yes, she can be a little everywhere and really quick with her lecture, but the GSI's are useful to make sure you understand the material. Oh, and did I mention she's hilarious!" 
      }] 
    }] 
}

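I want to do several things. I am trying to find the ['class'] key that is mentioned most often in the reviews, then get the class name and the number of times it was mentioned. Then I want to write the output in the format below, including a professors array. That array holds only the professors' information; for example, for CHEM 1A and CHEMISTRY1A it is Richard Saykally.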

{ 
    "courses": [ 
     { 
      "course_name": # class name 
      "course_mentioned_times": # the number of times the class was mentioned 
      "professors": [ # the professors who teach this class, taken from the JSON file shown above 
       { 
        "first_name": "professor first name", 
        "last_name": "professor last name" 
       } 
      ] 
     } 
    ] 
}

So I want to sort the values from my JSON file from maximum to minimum. So far, all I have been able to figure out is:

import json

if __name__ == "__main__":
    with open('result.json') as open_json:
        load_as_json = json.load(open_json)['professors']

    # Flatten the data into one record per review.
    output_records = []
    for items in load_as_json:
        for classes in items['reviews']:
            output_records.append({
                'class': classes['class'],
                'first_name': items['first_name'],
                'last_name': items['last_name'],
                # 'department': items['department'],
                'reviews': classes['review_text'],
            })

    # Write the file once, in text mode: json.dump emits str, not bytes.
    with open('output_info.json', 'w') as outfile:
        json.dump(output_records, outfile, indent=4)

Possible duplicate of http://stackoverflow.com/questions/18871217/how-to-custom-sort-a-list-of-dict-to-use-in-json-dumps

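Your question title mentions formatting, but it sounds like it is really about sorting the data in a JSON file. Is that right? You also need to state your input and desired output more clearly and explicitly. – martineau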

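Benji, Stack Overflow is a question-and-answer site. Readers like you ask questions, and other readers try to answer them. Your post contains a lot of information, but it is missing the one thing that makes Stack Overflow work: a question. Do you have a specific programming question?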

Answer

I think this program does what you want:

import json


with open('result.json') as open_json:
    load_as_json = json.load(open_json)

courses = {}
for professor in load_as_json['professors']:
    for review in professor['reviews']:
        course = courses.setdefault(review['class'], {})
        course.setdefault('course_name', review['class'])
        course.setdefault('course_mentioned_times', 0)
        course['course_mentioned_times'] += 1
        course.setdefault('professors', [])
        prof_name = {
            'first_name': professor['first_name'],
            'last_name': professor['last_name'],
        }
        if prof_name not in course['professors']:
            course['professors'].append(prof_name)

courses = {
    'courses': sorted(courses.values(),
                      key=lambda x: x['course_mentioned_times'],
                      reverse=True)
}
with open('output_info.json', 'w') as outfile:
    json.dump(courses, outfile, indent=4)

The result, using the example input from the question:

{ 
    "courses": [ 
     { 
      "professors": [ 
       { 
        "first_name": "Laura", 
        "last_name": "Stoker" 
       } 
      ], 
      "course_name": "PS3", 
      "course_mentioned_times": 1 
     }, 
     { 
      "professors": [ 
       { 
        "first_name": "Laura", 
        "last_name": "Stoker" 
       } 
      ], 
      "course_name": "164A", 
      "course_mentioned_times": 1 
     }, 
     { 
      "professors": [ 
       { 
        "first_name": "Richard", 
        "last_name": "Saykally" 
       } 
      ], 
      "course_name": "CHEM 1A", 
      "course_mentioned_times": 1 
     }, 
     { 
      "professors": [ 
       { 
        "first_name": "Richard", 
        "last_name": "Saykally" 
       } 
      ], 
      "course_name": "CHEMISTRY1A", 
      "course_mentioned_times": 1 
     } 
    ] 
} 
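
For comparison, the same counting and sorting could also be sketched with `collections.Counter`, whose `most_common()` method returns the entries ordered from most to least mentioned. This is only an illustrative variant, not the answer's code: the function name `summarize_courses` and the tiny `sample` input (where CHEM 1A is deliberately mentioned twice) are made up for the example, and it assumes Python 3.7+ so that plain dicts keep insertion order.

```python
from collections import Counter


def summarize_courses(data):
    """Count how often each class is mentioned and list who teaches it."""
    mentions = Counter()
    professors = {}  # course name -> dict used as an ordered set of (first, last)
    for professor in data["professors"]:
        name = (professor["first_name"], professor["last_name"])
        for review in professor["reviews"]:
            course = review["class"]
            mentions[course] += 1
            # Using the name as a dict key keeps each professor only once.
            professors.setdefault(course, {})[name] = None
    return {
        "courses": [
            {
                "course_name": course,
                "course_mentioned_times": count,
                "professors": [
                    {"first_name": first, "last_name": last}
                    for first, last in professors[course]
                ],
            }
            # most_common() yields (course, count) pairs, highest count first.
            for course, count in mentions.most_common()
        ]
    }


# Hypothetical sample input, shaped like the question's file.
sample = {
    "professors": [
        {"first_name": "Richard", "last_name": "Saykally",
         "reviews": [{"class": "CHEM 1A"}, {"class": "CHEM 1A"},
                     {"class": "CHEMISTRY1A"}]},
        {"first_name": "Laura", "last_name": "Stoker",
         "reviews": [{"class": "PS3"}]},
    ]
}
summary = summarize_courses(sample)
```

The resulting `summary` dictionary can be passed to `json.dump(summary, outfile, indent=4)` in the same way as the `courses` dictionary above.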
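
Now my output looks like this, but the professor names are duplicated: "courses": { "professors": { "first_name": "Richard", "last_name": "Saykally" }, { "first_name": "Richard", "last_name": "Saykally" }, I want each professor name to appear only once, so there is a single Richard Saykally in the professors array for that particular class. Multiple professors are fine, but without repeated names. – Benji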
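
One way to drop such repeated entries, sketched here under the assumption that a professor is identified by the first and last name together, is to track the names already seen in a set. The helper name `unique_professors` and the `duplicated` list are hypothetical and serve only as an illustration.

```python
def unique_professors(professors):
    """Return the list with repeated names removed, keeping the original order."""
    seen = set()
    unique = []
    for prof in professors:
        # Dicts are not hashable, so a (first, last) tuple serves as the key.
        key = (prof["first_name"], prof["last_name"])
        if key not in seen:
            seen.add(key)
            unique.append(prof)
    return unique


# Hypothetical input with one repeated professor.
duplicated = [
    {"first_name": "Richard", "last_name": "Saykally"},
    {"first_name": "Richard", "last_name": "Saykally"},
    {"first_name": "Laura", "last_name": "Stoker"},
]
deduped = unique_professors(duplicated)
```

Checking membership before appending, as in `if prof_name not in course['professors']`, has the same effect; the set version mainly helps when the lists get long.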

@Benji - I have updated my answer.

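You are a lifesaver. One last question. My course names are formatted as letters followed by numbers. I want to compare the letters and then print the course that is mentioned most often. Say there are CHEM 1A and CHEM 214 -> I compare the leading letters of both (CHEM and CHEM) -> they are the same. So I would add only the more frequently mentioned of those two courses to my dictionary. – Benji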
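
A possible sketch of that idea: take the leading letters of each course name with a regular expression, group by that prefix, and keep only the most-mentioned course per group. The helper names `letter_prefix` and `top_course_per_prefix`, and the mention counts in the example list, are invented for illustration. Note the assumptions: a name with no leading letters (such as 164A) is treated as its own group, and CHEMISTRY1A would get the prefix CHEMISTRY, so it would not be merged with CHEM 1A.

```python
import re


def letter_prefix(course_name):
    """Return the leading letters of a course name, upper-cased."""
    match = re.match(r"[A-Za-z]+", course_name.strip())
    # Names that start with a digit have no letter prefix; use the full name.
    return match.group(0).upper() if match else course_name


def top_course_per_prefix(courses):
    """Keep, for each letter prefix, the course with the most mentions."""
    best = {}
    for course in courses:
        prefix = letter_prefix(course["course_name"])
        current = best.get(prefix)
        if (current is None or
                course["course_mentioned_times"] > current["course_mentioned_times"]):
            best[prefix] = course
    return best


# Hypothetical entries in the answer's output format; the counts are made up.
courses = [
    {"course_name": "CHEM 1A", "course_mentioned_times": 5},
    {"course_name": "CHEM 214", "course_mentioned_times": 2},
    {"course_name": "PS3", "course_mentioned_times": 1},
    {"course_name": "164A", "course_mentioned_times": 4},
]
top = top_course_per_prefix(courses)
```

With this input, the CHEM group keeps CHEM 1A because it has more mentions than CHEM 214. On a tie, the course seen first is kept.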