How to iterate over JSON that spans multiple pages

I have written a program that iterates over a multi-page JSON object.

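import json
import subprocess

def get_orgs(token, url):
    # Pass curl its arguments as a list; a single command string does not work with shell=False
    cmd = ['curl', '-i', '-k', '-X', 'GET',
           '-H', 'Content-Type:application/json',
           '-H', 'Authorization:Bearer ' + token, url]
    pipe = subprocess.Popen(cmd, shell=False, stdout=subprocess.PIPE, stdin=subprocess.PIPE)
    data = pipe.communicate()[0].decode('utf-8')
    row = None  # stays None if no line of the output parses as JSON
    for line in data.split('\n'):
        print(line)
        try:
            row = json.loads(line)
            # The response is a one-element list, so the keys live in row[0]
            print("next page url ", row[0]['next'])
        except (ValueError, LookupError, TypeError):
            # Header lines and blank lines are not JSON; skip them
            pass
    return row

my_data = get_orgs(u'MyBeearerToken', "https://data.ratings.com/v1.0/org/576/portfolios/36/companies/")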

The JSON object looks like this:

[{"results": [{"liquidity":"Strong","earningsPerformance":"Average"}],
"next":"https://data.ratings.com/v1.0/org/576/portfolios/36/companies/?page=2"}]

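I iterate using the 'next' key, but sometimes it points to an invalid page (one that does not exist). Is there a rule for how many records a JSON object holds per page? If so, I would use it to estimate how many pages there might be.

Edit: adding more detail. The JSON has only 2 keys, ['results', 'next']. If there are more pages, the 'next' key holds the URL of the next page (as you can see in the output above). Otherwise it contains None. The problem is that sometimes, instead of None, it points to a next page that does not exist. So I would like to see whether I can count the rows in the JSON and divide by some number to work out how many pages the loop needs to go through.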

2016-09-22 Tammy

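It is not clear to me what you are trying to achieve. Your problem seems to be that you request some JSON from a server. The JSON contains a URL to the next data set, for lack of a better word. Are you having trouble extracting the correct URL, or is the URL you extract from the response incorrect? In the latter case, the problem is not in your code. Why use curl instead of a built-in Python solution such as [urllib.request](https://docs.python.org/3.5/library/urllib.request.html)? – Maurice

Hi Maurice, thanks for your reply. I am sitting behind a corporate proxy, and curl works fine. With urllib2 or requests I get an authentication error. – Tammy

@Maurice, I edited the question to give more detail about the problem – Tammy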

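In my opinion, urllib2 or urllib.request would be a better choice than curl, because it makes the code easier to understand, but if that is a constraint, I can work with that ;-)

Assuming the JSON response arrives on a single line (otherwise your json.loads call will throw an exception), the task is pretty simple. This gives you the number of items behind the results key: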

row = [{'next': 'https://data.ratings.com/v1.0/org/576/portfolios/36/companies/?page=2', 'results': [{'earningsPerformance':'Average','liquidity': 'Strong'}, {'earningsPerformance':'Average','liquidity': 'Strong'}]}] 
result_count = len(row[0]["results"])
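Note that the record count of one page does not tell you how many pages exist: JSON itself has no rule about page size, that is decided by the API. A more robust approach is probably to keep following 'next' and stop as soon as it is None, the page cannot be fetched, or the page has no results. Here is a minimal sketch of that idea. The names iter_results, fetch_page and max_pages are made up for illustration, and fetch_page is assumed to be any function that returns the decoded JSON for a URL, or None when the page does not exist (for example a thin wrapper around get_orgs).

```python
def iter_results(fetch_page, first_url, max_pages=1000):
    """Yield every record, following 'next' links until they run out.

    fetch_page is a hypothetical callable: it takes a URL and returns the
    decoded JSON (a one-element list, as in the question), or None when
    the page does not exist or could not be parsed.
    """
    url = first_url
    seen = set()  # guards against a 'next' link that loops back
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        page = fetch_page(url)
        if not page:
            break  # invalid or missing page: stop instead of failing
        body = page[0]
        results = body.get("results") or []
        if not results:
            break  # an empty page is treated as the end of the data
        for record in results:
            yield record
        url = body.get("next")  # None ends the loop


# Usage with fake pages standing in for the real server
fake_pages = {
    "page1": [{"results": [{"liquidity": "Strong"}], "next": "page2"}],
    "page2": [{"results": [{"liquidity": "Weak"}], "next": "page3"}],
    # "page3" is deliberately absent, like the invalid page in the question
}
records = list(iter_results(fake_pages.get, "page1"))
print(len(records))
```

With this shape you never need to know the page count in advance; the bogus 'next' URL simply ends the iteration.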

An alternative solution using httplib2 should look something like this (I have not tested this):

import httplib2
import json

h = httplib2.Http('.cache')
url = "https://data.ratings.com/v1.0/org/576/portfolios/36/companies/"
token = "Your_token"
try:
    response, content = h.request(
        url,
        # The header name is 'Authorization'; 'Bearer <token>' is its value
        headers={'Content-Type': 'application/json', 'Authorization': 'Bearer ' + token}
    )
    # Convert the response bytes to a string
    content = content.decode('utf-8')  # You could get the charset from the header as well
    try:
        data = json.loads(content)
        result_count = len(data[0]["results"])
        # Yay, we got the result count!
    except Exception:
        # Do something if the server responds with garbage
        pass
except httplib2.HttpLib2Error:
    # Handle the exceptions, here's a list: https://httplib2.readthedocs.io/en/latest/libhttplib2.html#httplib2.HttpLib2Error
    pass
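As for getting the charset from the header instead of hard-coding 'utf-8', a small sketch using only the standard library could look like the following. The helper name charset_from_content_type is made up, and the assumption that the header value is available as response['content-type'] on the httplib2 response is mine, so check it against your setup.

```python
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Return the charset named in a Content-Type header value.

    Falls back to `default` when the header is missing or names no charset.
    """
    if not header_value:
        return default
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset parameter in lower case,
    # or the fallback object when there is none
    return msg.get_content_charset(default)


# Assumed usage with the response from the snippet above:
#   encoding = charset_from_content_type(response.get('content-type'))
#   content = content.decode(encoding)
print(charset_from_content_type("application/json; charset=ISO-8859-1"))
```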

For more information on httplib2 and why it is awesome, I recommend reading Dive Into Python.

2016-09-22 17:18:33 Maurice
