2010-10-12 46 views
1

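Python statistical analysis

Given 15 players - 2 goalkeepers, 5 defenders, 5 midfielders and 3 strikers - each with a cost and a score, I want to work out the highest-scoring team I can afford. Each team must consist of 1 GK plus a formation such as 4:4:2, 4:3:3, etc. I started with sample data in this layout:

player role points cost

I then did the following to evaluate all combinations.

I read each line into a list (one list per role), then ran nested itertools loops to get every combination: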

if line[1] == "G": G.append(line[0]) 
if line[1] == "D": D.append(line[0]) 
if line[1] == "M": M.append(line[0]) 
if line[1] == "S": S.append(line[0]) 

for gk in itertools.combinations(G, 1):
    for de in itertools.combinations(D, 4):
        for mi in itertools.combinations(M, 4):
            for st in itertools.combinations(S, 2):
                teams[str(count)] = " ".join(gk) + " " + " ".join(de) + " " + " ".join(mi) + " " + " ".join(st)
                count += 1

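Having got the teams, I calculate each team's points total and its cost. If the cost is below the threshold, I print the team.
But if I now scale this up to 20 goalkeepers, 150 defenders, 150 midfielders and 100 strikers, I understandably run out of memory.
What can I do to perform this analysis? Is it a generator that I need, rather than a recursive function?

Many thanks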

Answers

5

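You can probably solve this with recursion. The code below shows the basic outline, but ignores some details, such as the requirement that a team consist of a certain number of players of each type.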

players = [{'name': 'A', 'score': 5, 'cost': 10},
           {'name': 'B', 'score': 10, 'cost': 3},
           {'name': 'C', 'score': 6, 'cost': 8}]

def player_cost(player):
    return player['cost']
def player_score(player):
    return player['score']
def total_score(players):
    return sum(player['score'] for player in players)

def finance_team_recurse(budget, available_players):
    affordable_players = []
    for player in available_players:
        if player_cost(player) <= budget:
            # Since we've ordered available players, the first player appended
            # will be the one with the highest score.
            affordable_players.append(player)
    result = []
    if affordable_players:
        candidate_player = affordable_players[0]
        other_players = affordable_players[1:]
        # if you include candidate_player on your team
        team_with_candidate = finance_team_recurse(
            budget - player_cost(candidate_player), other_players)
        team_with_candidate.append(candidate_player)
        score_of_team_with_candidate = total_score(team_with_candidate)
        if score_of_team_with_candidate > total_score(other_players):
            result = team_with_candidate
        else:
            # if you exclude candidate_player from your team
            team_without_candidate = finance_team_recurse(budget, other_players)
            score_of_team_without_candidate = total_score(team_without_candidate)
            if score_of_team_with_candidate > score_of_team_without_candidate:
                result = team_with_candidate
            else:
                result = team_without_candidate
    return result

def finance_team(budget, available_players):
    tmp = available_players[:]
    # Sort so player with highest score is first. (Greedy algorithm?)
    tmp.sort(key=player_score, reverse=True)
    return finance_team_recurse(budget, tmp)

print(finance_team(20, players))
# [{'score': 6, 'cost': 8, 'name': 'C'}, {'score': 10, 'cost': 3, 'name': 'B'}]

20 choose 1 = 20 combinations 
150 choose 4 = 20260275 combinations 
100 choose 2 = 4950 combinations 

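So the teams dict ends up with a total of 20 * 20260275 * 20260275 * 4950 = 40637395564486875000L items (the 150-choose-4 factor appears twice, once for defenders and once for midfielders). That takes a lot of memory.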
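The count can be reproduced in a few lines. This is only a sketch: it assumes Python 3.8+ for `math.comb`, and the `pools` dict is a made-up name holding the pool sizes from the question alongside the number picked per role in a 4:4:2 formation.

```python
import math

# role -> (players available, players picked); sizes taken from the question
pools = {"G": (20, 1), "D": (150, 4), "M": (150, 4), "S": (100, 2)}

total = 1
for role, (available, picked) in pools.items():
    ways = math.comb(available, picked)  # "available choose picked"
    print(role, ways)
    total *= ways  # independent choices multiply

print(total)
```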

for gk in itertools.combinations(G, 1):
    for de in itertools.combinations(D, 4):
        for mi in itertools.combinations(M, 4):
            for st in itertools.combinations(S, 2):
                # Don't collect the results into a dict.
                # That's what's killing you (memory-wise).
                # Just compute the cost and
                # just print the result here.
                pass  # placeholder so the skeleton parses
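As a rough illustration of processing each team inside the loop instead of storing it, the sketch below remembers only the best affordable team seen so far, so memory use stays constant. The `Player` record, the sample data and the `best_team` helper are all invented for this example. It still visits every combination, so it is only practical for small pools.

```python
import itertools
from collections import namedtuple

# Hypothetical record type and sample data, for illustration only.
Player = namedtuple("Player", "name score cost")
G = [Player("g1", 4, 5), Player("g2", 6, 9)]
D = [Player("d1", 5, 6), Player("d2", 3, 4), Player("d3", 7, 9),
     Player("d4", 4, 5), Player("d5", 6, 8)]
M = [Player("m1", 8, 10), Player("m2", 5, 6), Player("m3", 6, 7),
     Player("m4", 4, 4), Player("m5", 7, 9)]
S = [Player("s1", 9, 12), Player("s2", 6, 7), Player("s3", 8, 10)]

def best_team(G, D, M, S, budget):
    """Stream every 4:4:2 team and keep only the best one within budget."""
    best, best_score = None, -1
    for gk in itertools.combinations(G, 1):
        for de in itertools.combinations(D, 4):
            for mi in itertools.combinations(M, 4):
                for st in itertools.combinations(S, 2):
                    team = gk + de + mi + st  # one flat tuple of 11 players
                    if sum(p.cost for p in team) > budget:
                        continue  # too expensive, discard immediately
                    score = sum(p.score for p in team)
                    if score > best_score:
                        best, best_score = team, score
    return best, best_score  # (None, -1) if no team fits the budget

team, score = best_team(G, D, M, S, budget=80)
print(score, [p.name for p in team])
```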

PS. 40637395564486875000L is on the order of 10**19. Assuming your program can process 10**6 combinations per second, it would take about 1.3 million years to finish...
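The estimate is easy to check. The throughput of 10**6 combinations per second is the assumption stated above, and the year length of 365.25 days is an extra assumption made here.

```python
total_teams = 40637395564486875000   # combination count quoted above
rate = 10**6                         # assumed combinations per second
seconds_per_year = 365.25 * 24 * 3600

years = total_teams / rate / seconds_per_year
print("%.2f million years" % (years / 1e6))
```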

+0

In a thousand years, computers will be faster! – florin 2010-10-12 19:13:09

+0

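+1: Proper use of combinatorics. This is a textbook example of computability, big-**O** complexity and what can't be done. Brilliant. Wish I could upvote it more than once. – 2010-10-12 19:35:56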

+0

OK, OK, fine? – user317225 2010-10-12 19:41:44

0

Functions and generators help a lot:

import itertools

def make_teams(G, D, M, S):
    """ returns all possible teams """
    for gk in itertools.combinations(G, 1):
        for de in itertools.combinations(D, 4):
            for mi in itertools.combinations(M, 4):
                for st in itertools.combinations(S, 2):
                    # One flat tuple of players, so the helpers below can iterate it.
                    yield gk + de + mi + st

def get_cost(team):
    return sum(member.cost for member in team)

def get_score(team):
    return sum(member.score for member in team)

def good_teams(min_score=0):
    for team in make_teams(G, D, M, S):
        # Compare the score (not the cost) against min_score.
        if get_score(team) > min_score:
            yield team

for team in good_teams(min_score=100):
    print(team)

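This still generates every possible combination, so you will now probably run out of time instead of memory.

What you are doing looks like a variation of the knapsack problem - you can do better than trying every possible combination, but not much better.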
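To give a feel for what "better than trying every combination" can look like, here is a dynamic-programming sketch in the knapsack style. It is not taken from the answer above: it assumes costs are non-negative integers, the names `Player`, `best_by_cost` and `best_score` are invented, and it returns only the best score, not the team. Its work grows with pool size times budget rather than with the number of combinations.

```python
from collections import namedtuple

# Hypothetical record type and sample data, for illustration only.
Player = namedtuple("Player", "name role score cost")
players = [
    Player("g1", "G", 4, 5), Player("g2", "G", 6, 9),
    Player("d1", "D", 5, 6), Player("d2", "D", 3, 4), Player("d3", "D", 7, 9),
    Player("d4", "D", 4, 5), Player("d5", "D", 6, 8),
    Player("m1", "M", 8, 10), Player("m2", "M", 5, 6), Player("m3", "M", 6, 7),
    Player("m4", "M", 4, 4), Player("m5", "M", 7, 9),
    Player("s1", "S", 9, 12), Player("s2", "S", 6, 7), Player("s3", "S", 8, 10),
]
NEG = float("-inf")  # marks "no such selection exists"

def best_by_cost(pool, picks, budget):
    """Return a list t where t[c] is the best score from exactly `picks`
    players of `pool` whose costs sum to exactly c (NEG if impossible)."""
    dp = [[NEG] * (budget + 1) for _ in range(picks + 1)]
    dp[0][0] = 0
    for p in pool:
        # k runs downwards so that each player is counted at most once
        for k in range(picks, 0, -1):
            for c in range(budget, p.cost - 1, -1):
                if dp[k - 1][c - p.cost] != NEG:
                    dp[k][c] = max(dp[k][c], dp[k - 1][c - p.cost] + p.score)
    return dp[picks]

def best_score(players, formation, budget):
    """Best total score of a team matching `formation` within `budget`."""
    total = [NEG] * (budget + 1)
    total[0] = 0
    for role, picks in formation.items():
        pool = [p for p in players if p.role == role]
        table = best_by_cost(pool, picks, budget)
        merged = [NEG] * (budget + 1)
        for spent, so_far in enumerate(total):
            if so_far == NEG:
                continue
            # only consider role costs that still fit in the remaining budget
            for c, s in enumerate(table[: budget - spent + 1]):
                if s != NEG:
                    merged[spent + c] = max(merged[spent + c], so_far + s)
        total = merged
    return max(total)  # NEG if no team fits the budget

formation = {"G": 1, "D": 4, "M": 4, "S": 2}
print(best_score(players, formation, budget=80))
```

Recovering the team itself would need back-pointers alongside the scores, which is left out to keep the sketch short.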

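One way to get a good solution quickly is to sort the players by score per unit of cost. The highest-scoring teams should tend to come out first, but there is no guarantee that you get the best solution. Wikipedia calls this a "greedy approximation algorithm".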

import itertools

def score_per_cost(player):
    # float() avoids integer division under Python 2.
    return float(player.score) / player.cost

def sorted_combinations(seq, n):
    return itertools.combinations(
        sorted(seq, key=score_per_cost, reverse=True), n)

def make_teams(G, D, M, S):
    """ returns all possible teams """
    for gk in sorted_combinations(G, 1):
        for de in sorted_combinations(D, 4):
            for mi in sorted_combinations(M, 4):
                for st in sorted_combinations(S, 2):
                    # One flat tuple of players, so get_cost() can iterate it.
                    yield gk + de + mi + st

def get_cost(team):
    return sum(member.cost for member in team)

def top_teams(n):
    return itertools.islice(make_teams(G, D, M, S), n)

for team in top_teams(100):
    print(team)

I'll leave adding the requirement "cost per team < threshold" to the reader (hint: it's one line in make_teams :P).
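One possible reading of that hint, as a sketch only: pass the threshold into `make_teams` and test each team before yielding it. The `max_cost` parameter, the `Player` record and the sample data are assumptions made for this example, and the answerer may have had a different one-liner in mind.

```python
import itertools
from collections import namedtuple

# Hypothetical record type and sample data, for illustration only.
Player = namedtuple("Player", "name score cost")
G = [Player("g1", 4, 5), Player("g2", 6, 9)]
D = [Player("d1", 5, 6), Player("d2", 3, 4), Player("d3", 7, 9),
     Player("d4", 4, 5), Player("d5", 6, 8)]
M = [Player("m1", 8, 10), Player("m2", 5, 6), Player("m3", 6, 7),
     Player("m4", 4, 4), Player("m5", 7, 9)]
S = [Player("s1", 9, 12), Player("s2", 6, 7), Player("s3", 8, 10)]

def score_per_cost(player):
    return float(player.score) / player.cost

def sorted_combinations(seq, n):
    return itertools.combinations(
        sorted(seq, key=score_per_cost, reverse=True), n)

def get_cost(team):
    return sum(member.cost for member in team)

def make_teams(G, D, M, S, max_cost):
    """Yield teams costing less than max_cost, best value players first."""
    for gk in sorted_combinations(G, 1):
        for de in sorted_combinations(D, 4):
            for mi in sorted_combinations(M, 4):
                for st in sorted_combinations(S, 2):
                    team = gk + de + mi + st
                    if get_cost(team) < max_cost:  # the added threshold check
                        yield team

# Show the first three affordable teams in greedy order.
for team in itertools.islice(make_teams(G, D, M, S, max_cost=81), 3):
    print(get_cost(team), [p.name for p in team])
```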

+0

When I looked at the knapsack problem, I almost felt ill! – user317225 2010-10-12 19:43:58