2017-01-25 105 views

How to sort a list using specific rules

Hi, I have complex data objects that I want to sort by s. A simplified version is below:

class Data(object):
    def __init__(self, s):
        self.s = s

Each of these data objects is placed in a specific category so it is easier to use later. Again, a simplified version is below:

class DataCategory(object):
    def __init__(self, id1, id2, linked_data=None):
        self.id1 = id1
        self.id2 = id2
        self.ld = linked_data

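I want to sort the data by their s number, but there are a few more rules. If a data object from the first data collection has just been used, I then want to use a data object from the second data collection, provided its number is the same or lower. Here is what I get and what I want to achieve: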

# order I get 
# [['p02g01r05', 5], ['p02g01r01', 4], ['p01g01r05', 4], ['p01g01r01', 3], ['p01g01r02', 2], ['p01g01r03', 2], ['p01g01r06', 2], ['p02g01r02', 2], ['p02g01r03', 2], ['p02g01r04', 2], ['p01g01r04', 1], ['p02g01r06', 1]] 
# order I want 
# [['p02g01r05', 5], ['p01g01r05', 4], ['p02g01r01', 4], ['p01g01r01', 3], ['p02g01r02', 2], ['p01g01r02', 2], ['p02g01r03', 2], ['p01g01r03', 2], ['p02g01r04', 2], ['p01g01r06', 2], ['p02g01r06', 1], ['p01g01r04', 1]]

This is what I have written so far, but I suspect I am heading in the wrong direction. I think the list of indexes to replace is correct.

# Some data objects 
p01g01r01 = Data(3) 
p01g01r02 = Data(2) 
p01g01r03 = Data(2) 
p01g01r04 = Data(1) 
p01g01r05 = Data(4) 
p01g01r06 = Data(2) 

p02g01r01 = Data(4) 
p02g01r02 = Data(2) 
p02g01r03 = Data(2) 
p02g01r04 = Data(2) 
p02g01r05 = Data(5) 
p02g01r06 = Data(1) 

p01g01 = DataCategory("01", "01", []) 
p02g01 = DataCategory("02", "01", []) 


# link data to data category 
def ldtdc(dc):
    lst = []
    data = "p" + dc.id1 + "g" + dc.id2 + "r"
    for i in range(1, 7):
        # zero-pad the unit number to two digits, e.g. "01"
        lst.append(data + "%02d" % i)
    return lst

p01g01.ld = ldtdc(p01g01) 
p02g01.ld = ldtdc(p02g01) 


# /@= This starts to get way too complicated fast ############################ 
def lstu(ag, dg):
    lst = []
    # data list of first collection
    dlofc = []
    # data list of second collection
    dlosc = []

    # for every data unit that exists in data collection
    for unit in ag.ld:
        # lst.append([unit, globals()[unit].s+10])
        lst.append([unit, globals()[unit].s])
        dlofc.append([unit, globals()[unit].s])

    for unit in dg.ld:
        lst.append([unit, globals()[unit].s])
        dlosc.append([unit, globals()[unit].s])

    # lambda function is used here to sort list by data value ([1] is index of the item)
    lst = sorted(lst, key=lambda x: x[1], reverse=True)
    # current index
    ci = 0

    previous_data = ["last data unit will be stored here", 0]
    # sorted list
    slst = []

    for unit in lst:
        try:
            next_data = lst[ci+1]
        except IndexError:
            next_data = ["endoflist", 0]
        if previous_data[0] == "last data unit will be stored here":
            pass
        elif previous_data[0][:6] == unit[0][:6]:
            if unit[0][:6] not in dlofc[0][0]:
                slst.append([unit[0], unit[1], ci])
            elif unit[0][:6] not in dlosc[0][0]:
                slst.append([unit[0], unit[1], ci])
            else:
                print("Error")

        previous_data = unit
        ci += 1

    print("slist below")
    print(slst)

    return lst
# \@= END ##################################################################### 


print(p01g01.ld)
print(p02g01.ld)


data_list = lstu(p01g01, p02g01)
print(data_list)

What is a fast and correct way to sort this kind of data?

+1

Have you considered the 'sorted' function or the 'list.sort' method? – skyking

+0

As you can see in the example above, I already use sorted, but it is not enough to meet all the requirements for the new list. – Hsin

+0

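Are you aware that you can control how 'sorted' and 'list.sort' compare elements while sorting? Once you control that, I don't see why you shouldn't be able to use 'sorted' or 'list.sort'. – skyking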
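
One way to read the comment above, as a minimal sketch: 'sorted' accepts a 'key' function, and a tuple key can order by 's' descending while breaking ties by name. The 'items' list is a shortened sample in the question's [name, s] format, and the variable names are only illustrative. A key sees one element at a time, so this sketch alone does not produce the alternation between collections that the question asks for.

```python
# Shortened sample data in the question's [name, s] format (illustrative only).
items = [['p01g01r01', 3], ['p02g01r01', 4], ['p01g01r05', 4], ['p02g01r05', 5]]

# Negating s sorts it in descending order; the name breaks ties in ascending order.
by_s_then_name = sorted(items, key=lambda item: (-item[1], item[0]))
print(by_s_then_name)
# [['p02g01r05', 5], ['p01g01r05', 4], ['p02g01r01', 4], ['p01g01r01', 3]]
```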

Answers

0

Found a solution. The new lstu function:

# replaced lambda with normal function
def get_key(item):
    return item[1]


def lstu(ag, dg):
    # ag list
    agslst = []
    # dg list
    dgslst = []

    # for every unit in first data collection
    # (the category stores its unit names in .ld, see DataCategory)
    for unit in ag.ld:
        agslst.append([unit, globals()[unit].s])
    # sorted first data collection list
    agslst = sorted(agslst, key=get_key, reverse=True)
    print(agslst)

    for unit in dg.ld:
        dgslst.append([unit, globals()[unit].s])
    # 2nd collection sorted list
    dgslst = sorted(dgslst, key=get_key, reverse=True)
    print(dgslst)

    lst = []
    # last item
    li = ["Empty", 0]

    # merge while both lists still have items
    while agslst and dgslst:
        if agslst[0][1] == dgslst[0][1]:
            # same s: switch to dg if the last item came from ag
            if li[0][:6] == agslst[0][0][:6]:
                li = dgslst.pop(0)
            else:
                li = agslst.pop(0)
        elif agslst[0][1] > dgslst[0][1]:
            li = agslst.pop(0)
        else:
            li = dgslst.pop(0)
        lst.append(li)

    # one list is empty now; append whatever is left in the other
    lst.extend(agslst)
    lst.extend(dgslst)

    return lst

This way the new (final) list meets the requirements mentioned earlier.

Output:

[['p02g01r05', 5], ['p01g01r05', 4], ['p02g01r01', 4], ['p01g01r01', 3], ['p02g01r02', 2], ['p01g01r02', 2], ['p02g01r03', 2], ['p01g01r03', 2], ['p02g01r04', 2], ['p01g01r06', 2], ['p02g01r06', 1], ['p01g01r04', 1]]

I'm open to any optimization suggestions.
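
One possible optimization, offered as a sketch rather than a definitive version: the function below takes the [name, s] pairs directly instead of looking names up through globals(), and uses collections.deque so that removing the front item does not shift the rest of the list the way pop(0) does. The name merge_alternating and the variables first, second and result are hypothetical. It assumes, as the answer does, that the first six characters of a name identify its collection.

```python
from collections import deque


def merge_alternating(first, second):
    """Merge two lists of [name, s] pairs by descending s.

    On a tie, take from the collection that did not supply the previous item.
    Assumption: the first six characters of the name (e.g. 'p01g01')
    identify the collection.
    """
    # Sort each collection by s, highest first; deque gives O(1) popleft.
    a = deque(sorted(first, key=lambda item: item[1], reverse=True))
    b = deque(sorted(second, key=lambda item: item[1], reverse=True))

    merged = []
    last_prefix = None
    while a and b:
        if a[0][1] > b[0][1]:
            source = a
        elif a[0][1] < b[0][1]:
            source = b
        else:
            # Tie: switch away from the collection that was used last.
            source = b if last_prefix == a[0][0][:6] else a
        item = source.popleft()
        last_prefix = item[0][:6]
        merged.append(item)

    # One deque is empty now; the other is already in descending order.
    merged.extend(a)
    merged.extend(b)
    return merged


# The question's data, written out as [name, s] pairs.
first = [['p01g01r01', 3], ['p01g01r02', 2], ['p01g01r03', 2],
         ['p01g01r04', 1], ['p01g01r05', 4], ['p01g01r06', 2]]
second = [['p02g01r01', 4], ['p02g01r02', 2], ['p02g01r03', 2],
          ['p02g01r04', 2], ['p02g01r05', 5], ['p02g01r06', 1]]

result = merge_alternating(first, second)
print(result)
```

With the question's data this should give the same order as the "order I want" list in the question.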

1

Have you tried sorting by the string first, and then by the number in each item?

>>> items = [['p02g01r05', 5], ['p02g01r01', 4], ['p01g01r05', 4], ['p01g01r01', 3], ['p01g01r02', 2], ['p01g01r03', 2], ['p01g01r06', 2], ['p02g01r02', 2], ['p02g01r03', 2], ['p02g01r04', 2], ['p01g01r04', 1], ['p02g01r06', 1]] 
>>> partially_sorted = sorted(items, key=lambda item: item[0], reverse=True) 
>>> sorted(partially_sorted, key=lambda item: item[1], reverse=True) 
[['p02g01r05', 5], ['p02g01r01', 4], ['p01g01r05', 4], ['p01g01r01', 3], ['p02g01r04', 2], ['p02g01r03', 2], ['p02g01r02', 2], ['p01g01r06', 2], ['p01g01r03', 2], ['p01g01r02', 2], ['p02g01r06', 1], ['p01g01r04', 1]] 
+0

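It won't work. If items have the same "s", there should be one item from p01g01 and then one item from p02g01. In the example above, we get many items with the same "s" from the same collection in a row. – Hsin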

+0

So is it basically merging two sorted lists? One sorted list named p01g01 and the other p02g01? – aisbaa
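
To illustrate the "merging two sorted lists" reading, here is a small sketch using heapq.merge from the standard library. It assumes Python 3.5 or later, where heapq.merge accepts key and reverse; the thread's own code is Python 2. The lists p01 and p02 are shortened sample data. A plain merge like this only guarantees descending 's'; it does not apply the rule of switching collections on a tie.

```python
import heapq

# Two lists already sorted by s, highest first (shortened sample data).
p01 = [['p01g01r05', 4], ['p01g01r01', 3], ['p01g01r02', 2]]
p02 = [['p02g01r05', 5], ['p02g01r01', 4], ['p02g01r02', 2]]

# key and reverse are available on heapq.merge from Python 3.5 onward.
merged = list(heapq.merge(p01, p02, key=lambda item: item[1], reverse=True))

# Only the s values are shown, since tie order is not the point here.
values = [s for _, s in merged]
print(values)
```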

+0

No, Python's sort is stable https://en.wikipedia.org/wiki/Sorting_algorithm#Stability – aisbaa
