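How can I get a repo's commit count using the GitHub V3 API?

I am trying to count commits for many large GitHub repos using the API, so I would like to avoid fetching the entire list of commits (for example: api.github.com/repos/jasonrudolph/keyboard/commits) and counting them. How can I get a repo's commit count using the GitHub V3 API?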

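If I had the hash of the first (initial) commit, I could use this technique to compare the first commit to the latest; it happily reports the total_commits in between (so I would need to add one). Unfortunately, I cannot see how to elegantly get the first commit using the API.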
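The arithmetic behind the compare approach can be sketched as follows. This is a minimal illustration, not the asker's code: the helper names `compare_url` and `commit_count_from_compare` are made up here, and the response is assumed to be the parsed JSON of the compare endpoint with its `total_commits` field.

```python
def compare_url(owner, repo, base_sha, head):
    """Build the compare URL for base...head (helper name is hypothetical)."""
    return f"https://api.github.com/repos/{owner}/{repo}/compare/{base_sha}...{head}"


def commit_count_from_compare(compare_json):
    """total_commits excludes the base commit itself, so add one for it."""
    return compare_json["total_commits"] + 1


# Example with a stubbed response instead of a real HTTP call.
print(compare_url("jasonrudolph", "keyboard", "FIRST_SHA", "master"))
print(commit_count_from_compare({"total_commits": 41}))
```

The catch, as noted above, is that `FIRST_SHA` has to come from somewhere.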

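The base repo URL does give me created_at (for example: api.github.com/repos/jasonrudolph/keyboard), so I could get a reduced commit set by limiting the commits to those up to the creation date (for example: api.github.com/repos/jasonrudolph/keyboard/commits?until=2013-03-30T16:01:43Z) and using the earliest one (always listed last?), or maybe the one with an empty parent (I am not sure whether forked projects have initial parent commits).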
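The "empty parent" idea can be sketched like this, assuming `commits` is the parsed JSON list returned by the commits endpoint, where each item carries a `sha` and a `parents` array. The function name `root_commits` is hypothetical, and a repository can have more than one parentless commit, so the result is a list.

```python
def root_commits(commits):
    """Return the SHAs of commits that list no parents."""
    return [c["sha"] for c in commits if not c.get("parents")]


# Stubbed data in the shape of the commits listing (newest first).
sample = [
    {"sha": "c3", "parents": [{"sha": "c2"}]},
    {"sha": "c2", "parents": [{"sha": "c1"}]},
    {"sha": "c1", "parents": []},
]
print(root_commits(sample))
```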

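Is there a better way to get the first commit hash for a repo?

Better yet, this whole thing seems convoluted for a simple statistic, and I wonder if I am missing something. Any better ideas for using the API to get the repo commit count?

Edit: This somewhat similar question is trying to filter by certain files ("and within them to specific files"), so it has a different answer.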

Source

2015-01-13 SteveCoffman

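Possible duplicate of [github api: how to efficiently find the number of commits of a repository?](http://stackoverflow.com/questions/15919539/github-api-how-to-efficiently-find-the-number-of-commits-of-a-repository) –

Not the same question. Thanks though! – SteveCoffman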

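You can consider using the GraphQL API v4 to get the commit count for multiple repositories in a single request by using aliases. The following fetches the commit count for all branches of 3 different repositories (up to 100 branches per repo):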

{
    gson: repository(owner: "google", name: "gson") {
        ...RepoFragment
    }
    martian: repository(owner: "google", name: "martian") {
        ...RepoFragment
    }
    keyboard: repository(owner: "jasonrudolph", name: "keyboard") {
        ...RepoFragment
    }
}

fragment RepoFragment on Repository {
    name
    refs(first: 100, refPrefix: "refs/heads/") {
        edges {
            node {
                name
                target {
                    ... on Commit {
                        id
                        history(first: 0) {
                            totalCount
                        }
                    }
                }
            }
        }
    }
}


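RepoFragment is a fragment, which avoids repeating the same query fields for each repo.

If you only need the commit count on the default branch, it is more straightforward: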

{
    gson: repository(owner: "google", name: "gson") {
        ...RepoFragment
    }
    martian: repository(owner: "google", name: "martian") {
        ...RepoFragment
    }
    keyboard: repository(owner: "jasonrudolph", name: "keyboard") {
        ...RepoFragment
    }
}

fragment RepoFragment on Repository {
    name
    defaultBranchRef {
        name
        target {
            ... on Commit {
                id
                history(first: 0) {
                    totalCount
                }
            }
        }
    }
}
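
For many repositories, the aliased default-branch query can be generated rather than typed by hand. The following is a rough sketch under some assumptions: the aliases `r0`, `r1`, ... and the helper names `build_query`, `extract_counts` and `fetch_counts` are invented for illustration, the request goes to `https://api.github.com/graphql` with a personal access token, and error handling is omitted.

```python
import json
import urllib.request

# Same fragment as above, trimmed to the field we need.
FRAGMENT = """
fragment RepoFragment on Repository {
  defaultBranchRef {
    target {
      ... on Commit {
        history(first: 0) {
          totalCount
        }
      }
    }
  }
}
"""


def build_query(repos):
    """Build one aliased query for a list of (owner, name) pairs."""
    fields = [
        f"  r{i}: repository(owner: {json.dumps(owner)}, name: {json.dumps(name)}) {{ ...RepoFragment }}"
        for i, (owner, name) in enumerate(repos)
    ]
    return "{\n" + "\n".join(fields) + "\n}\n" + FRAGMENT


def extract_counts(repos, data):
    """Map "owner/name" to totalCount; None if the repo or its default branch is missing."""
    counts = {}
    for i, (owner, name) in enumerate(repos):
        node = data.get(f"r{i}")
        ref = node and node.get("defaultBranchRef")
        counts[f"{owner}/{name}"] = ref["target"]["history"]["totalCount"] if ref else None
    return counts


def fetch_counts(repos, token):
    """POST the query to the GraphQL endpoint (requires network access and a token)."""
    body = json.dumps({"query": build_query(repos)}).encode()
    request = urllib.request.Request(
        "https://api.github.com/graphql",
        data=body,
        headers={"Authorization": f"bearer {token}", "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return extract_counts(repos, payload.get("data") or {})
```

Numbered aliases are used because a repository name may contain characters, such as hyphens, that are not valid in a GraphQL alias.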


Source

2017-11-27 19:53:25

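If you are looking for the total number of commits in the default branch, you might consider a different approach.

Use the Repo Contributors API to fetch a list of all contributors: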

https://developer.github.com/v3/repos/#list-contributors

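Each item in the list contains a contributions field, which tells you how many commits that user authored in the default branch. Sum those fields across all contributors and you should get the total number of commits in the default branch.

The list of contributors is often much shorter than the list of commits, so it should take fewer requests to compute the total number of commits in the default branch.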
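
A minimal sketch of this summation, assuming unauthenticated access to a public repo and the standard `page`/`per_page` pagination parameters. The function names are hypothetical, and the page fetcher is passed in as an argument so the counting logic can be tried without network access.

```python
import json
import urllib.request


def fetch_contributors_page(owner, repo, page, per_page=100):
    """Fetch one page of contributors, including anonymous ones."""
    url = (f"https://api.github.com/repos/{owner}/{repo}/contributors"
           f"?anon=true&per_page={per_page}&page={page}")
    with urllib.request.urlopen(url) as response:
        body = response.read()
    # An empty body (e.g. for an empty repository) is treated as "no contributors".
    return json.loads(body) if body else []


def count_commits_via_contributors(owner, repo, fetch_page=fetch_contributors_page):
    """Sum the contributions field over every page until an empty page is returned."""
    total = 0
    page = 1
    while True:
        contributors = fetch_page(owner, repo, page)
        if not contributors:
            break
        total += sum(c["contributions"] for c in contributors)
        page += 1
    return total


# Usage (performs real HTTP requests):
# print(count_commits_via_contributors("jasonrudolph", "keyboard"))
```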

Source

2015-01-14 08:54:40

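Thanks. When I use [a link like this](https://api.github.com/repos/jquery/jquery/contributors?anon=true), it seems to be limited to 30 items. I found that requests that return multiple items are paginated to 30 items by default. You can request further pages with the ?page parameter. So if you get 30, you need to check whether there are more pages and add them to the initial result. – SteveCoffman

@SteveCoffman Yes, that is the expected behavior: https://developer.github.com/v3/#pagination –

It looks like both approaches (yours and mine) are workable, and neither is elegant. I will accept your answer unless someone comes up with something we both overlooked. Thanks. – SteveCoffman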

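I just made a little script to do this. It may not work with large repositories, since it does not handle GitHub's rate limits. It also requires the Python requests package.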

#!/usr/bin/env python3
import requests

GITHUB_API_BRANCHES = 'https://%(token)s@api.github.com/repos/%(namespace)s/%(repository)s/branches'
GITHUB_API_COMMITS = 'https://%(token)s@api.github.com/repos/%(namespace)s/%(repository)s/commits?sha=%(sha)s&page=%(page)i'


def github_commit_counter(namespace, repository, access_token=''):
    commit_store = list()  # SHAs seen on any branch (may contain duplicates)

    branches = requests.get(GITHUB_API_BRANCHES % {
        'token': access_token,
        'namespace': namespace,
        'repository': repository,
    }).json()

    print('Branch'.ljust(47), 'Commits')
    print('-' * 55)

    for branch in branches:
        page = 1
        branch_commits = 0

        while True:  # walk the commit pages until an empty page comes back
            commits = requests.get(GITHUB_API_COMMITS % {
                'token': access_token,
                'namespace': namespace,
                'repository': repository,
                'sha': branch['name'],
                'page': page
            }).json()

            page_commits = len(commits)

            for commit in commits:
                commit_store.append(commit['sha'])

            branch_commits += page_commits

            if page_commits == 0:
                break

            page += 1

        print(branch['name'].ljust(45), str(branch_commits).rjust(9))

    commit_store = set(commit_store)  # count commits shared between branches once
    print('-' * 55)
    print('Total'.ljust(42), str(len(commit_store)).rjust(12))

# for private repositories, get your own token from
# https://github.com/settings/tokens
# github_commit_counter('github', 'gitignore', access_token='fnkr:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')
github_commit_counter('github', 'gitignore')

Source

2015-07-12 18:56:35 fnkr

Something has changed, and now the code fails with the error 'line 36, in github_commit_counter: commit_store.append(commit['sha'])' – Whitecat

I was wrong. The script does work. I hit my rate_limit. – Whitecat
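
The error reported in the comments is consistent with how the script reads an API error: when a request is rejected, the API returns a JSON object with a `message` field instead of a list of commits, and iterating over that object yields string keys, so `commit['sha']` fails. A small guard along the following lines would make the cause visible. This is only a sketch: the helper names are hypothetical, and `headers` is assumed to be a mapping containing the `X-RateLimit-Reset` header (a Unix timestamp).

```python
import time


def require_commit_list(payload):
    """Raise with the API's own message if the payload is an error object, not a list."""
    if isinstance(payload, dict):
        raise RuntimeError(payload.get("message", "unexpected API response"))
    return payload


def seconds_until_reset(headers, now=None):
    """Seconds to wait until the rate limit window resets (0 if unknown or already past)."""
    now = time.time() if now is None else now
    reset = int(headers.get("X-RateLimit-Reset", 0))
    return max(0, reset - int(now))


# Example with stubbed values instead of a real response.
print(seconds_until_reset({"X-RateLimit-Reset": "1000"}, now=940))
```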

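Simple solution: look at the page number. GitHub paginates for you, so you can calculate the number of commits by getting the last page number from the Link header, subtracting one (the last page has to be added manually), multiplying by the page size, then fetching the last page of results, taking the size of that array, and adding the two numbers together. That is at most two API calls!

Here is my implementation, which uses the octokit gem in Ruby to get the total number of commits for an entire organization: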

@github = Octokit::Client.new access_token: key, auto_traversal: true, per_page: 100

Octokit.auto_paginate = true
repos = @github.org_repos('my_company', per_page: 100)

# * take the pagination number
# * get the last page
# * see how many items are on it
# * multiply the number of pages - 1 by the page size
# * and add the two together. Boom. Commit count in 2 api calls
def calc_total_commits(repos)
  total_sum_commits = 0

  repos.each do |e|
    repo = Octokit::Repository.from_url(e.url)
    number_of_commits_in_first_page = @github.commits(repo).size
    repo_sum = 0
    if number_of_commits_in_first_page >= 100
      links = @github.last_response.rels

      unless links.empty?
        last_page_url = links[:last].href

        /.*page=(?<page_num>\d+)/ =~ last_page_url
        repo_sum += (page_num.to_i - 1) * 100 # we add the last page manually
        repo_sum += links[:last].get.data.size
      end
    else
      repo_sum += number_of_commits_in_first_page
    end
    puts "Commits for #{e.name} : #{repo_sum}"
    total_sum_commits += repo_sum
  end
  puts "TOTAL COMMITS #{total_sum_commits}"
end

Yes, I know the code is dirty; it was just thrown together in a few minutes.
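
The same page arithmetic, sketched in Python for comparison. The function names are hypothetical, the Link header value is assumed to have the usual `<url>; rel="..."` form, and the HTTP calls themselves are left out so that only the parsing and the sum are shown.

```python
import re


def last_page_number(link_header):
    """Return the page number of the rel="last" link, or None if there is none."""
    for part in link_header.split(","):
        if 'rel="last"' in part:
            # Match the page parameter itself, not per_page.
            match = re.search(r"[?&]page=(\d+)", part)
            if match:
                return int(match.group(1))
    return None


def total_from_pages(last_page, per_page, items_on_last_page):
    """All pages before the last one are full; the last page is counted separately."""
    return (last_page - 1) * per_page + items_on_last_page


header = ('<https://api.github.com/repositories/1/commits?per_page=100&page=2>; rel="next", '
          '<https://api.github.com/repositories/1/commits?per_page=100&page=7>; rel="last"')
print(total_from_pages(last_page_number(header), 100, 42))
```

If the response has no rel="last" link, everything fit on the first page and its length is already the total.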

Source

2017-04-12 16:04:55 snowe
