2012-10-12 68 views
1

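A Pythonic way to build a data structure

I have implemented the code below and it works without any problems. But I am not happy with it, because it doesn't look pretty. More than anything, it doesn't feel Pythonic.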

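So I thought I would ask the Stack Overflow community for advice. This method gets its data from a SQL query, by way of another method that returns a dictionary, and then does pattern matching and counting based on the data in that dictionary. I would like to do this in a Pythonic way and return a better data structure.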

Here is the code:

import re

def getLaguageUserCount(self):
    bots = self.getBotUsers()
    user_template_dic = self.getEnglishTemplateUsers()
    print user_template_dic
    user_by_language = {}
    en1Users = []
    en2Users = []
    en3Users = []
    en4Users = []
    en5Users = []
    en_N_Users = []
    en1 = 0
    en2 = 0
    en3 = 0
    en4 = 0
    en5 = 0
    enN = 0
    lang_regx = re.compile(r'User_en-([1-5n])', re.M|re.I)
    for userId, langCode in user_template_dic.iteritems():
        if userId not in bots:
            print 'printing key value'
            for item in langCode:
                item = item.replace('--', '-')
                match_lang_obj = lang_regx.match(item)
                if match_lang_obj is not None:
                    # re.I lets the pattern capture 'n' or 'N', so normalize the case
                    level = match_lang_obj.group(1).upper()
                    if level == '1':
                        en1 += 1
                        en1Users.append(userId)
                    elif level == '2':
                        en2 += 1
                        en2Users.append(userId)
                    elif level == '3':
                        en3 += 1
                        en3Users.append(userId)
                    elif level == '4':
                        en4 += 1
                        en4Users.append(userId)
                    elif level == '5':
                        en5 += 1
                        en5Users.append(userId)
                    elif level == 'N':
                        enN += 1
                        en_N_Users.append(userId)
                else:
                    print "Group didn't match our regex: " + item
        else:
            print userId + ' is a bot'
    user_by_language['en-1-users'] = en1Users
    user_by_language['en-2-users'] = en2Users
    user_by_language['en-3-users'] = en3Users
    user_by_language['en-4-users'] = en4Users
    user_by_language['en-5-users'] = en5Users
    user_by_language['en-N-users'] = en_N_Users
    user_by_language['en-1'] = en1
    user_by_language['en-2'] = en2
    user_by_language['en-3'] = en3
    user_by_language['en-4'] = en4
    user_by_language['en-5'] = en5
    user_by_language['en-n'] = enN
    return user_by_language
+1

This is better suited to http://codereview.stackexchange.com

+0

How do I move this to the place you suggested? Do I just copy and paste it, or is there a way to "flag it to be moved"?

Answers

3

You can avoid all of those lists and add the data directly to the user_by_language dictionary.

I would define it as:

user_by_language = collections.defaultdict(list) 

After the regex matches, just do this:

user_by_language['en-%s-users' % match_lang_obj.group(1)].append(userId) 

Finally, take the length of each of those lists and store it under en-1, en-2 ...
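
Put together, the suggestion above might look roughly like the sketch below. It is written for Python 3, whereas the question's code is Python 2. The function name `count_users_by_language` and its two parameters are hypothetical stand-ins for the method and for the results of `self.getEnglishTemplateUsers()` and `self.getBotUsers()`. Two behaviours differ from the original and are assumptions of this sketch: levels that no user has are simply absent from the result instead of appearing as an empty list and a zero, and the count for level N is stored under `en-N` rather than `en-n`.

```python
import re
from collections import defaultdict

# Same pattern as in the question: captures the level (1-5 or N), ignoring case.
LANG_RE = re.compile(r'User_en-([1-5n])', re.I)


def count_users_by_language(user_templates, bots):
    """Group non-bot user ids by English level and count them.

    user_templates: dict mapping a user id to an iterable of template names
                    (hypothetical stand-in for self.getEnglishTemplateUsers()).
    bots: collection of user ids to skip
          (hypothetical stand-in for self.getBotUsers()).
    """
    users_by_level = defaultdict(list)
    for user_id, templates in user_templates.items():
        if user_id in bots:
            continue
        for template in templates:
            match = LANG_RE.match(template.replace('--', '-'))
            if match:
                # Normalize 'n' to 'N' so both spellings share one key.
                level = match.group(1).upper()
                users_by_level['en-%s-users' % level].append(user_id)

    # Copy the lists into a plain dict, then add one count per level.
    result = dict(users_by_level)
    for key, users in users_by_level.items():
        # 'en-1-users' -> 'en-1'
        result[key[:-len('-users')]] = len(users)
    return result
```

Calling `count_users_by_language({'alice': ['User_en-1']}, set())` would return a dictionary with the list under `'en-1-users'` and its length under `'en-1'`. Deriving the counts from the list lengths at the end means there are no separate counters to keep in sync with the lists.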