Python list comprehension over database data

I want to reduce the duplicates in a SQL Server database table, and the conditions are complicated. All of the table data has already been pulled into Python lists.

In this step, I want to use a list comprehension to collect a field value from every row of a list of lists whose dupID matches a value in another list.

List A is a list of unique dupIDs: [134L, 1610L, 1861L, 2026L, 3211L, 4134L, 4363L, 4453L, 4733L,...]

List B is two-dimensional:

Row# dupID nameID SSN   personID 

[[85097L, 236479L, 241583, '999-99-0000', 359913, datetime.datetime(2012, 9, 9, 0, 0)] 

[78654L, 236479L, 996783, '999-99-0000', NULL, datetime.datetime(2008, 5, 4, 0, 0)]...]

This is the loop I would like to speed up with a list comprehension:

personIDList = []
for i in range(len(A)):
    for j in range(len(B)):
        if A[i] == B[j][1]:  # if dupID == dupID
            personIDList.append(B[j][4])  # append personID

2013-10-10 Albert

"I would like to speed it up with a list comprehension"? What makes you think a list comprehension would be faster? – Johnsyweb

First, iterate over the elements rather than over the indices, which gives you this:

personIDList = []
for a in A:
    for b in B:
        if a == b[1]:
            personIDList.append(b[4])

This can then easily be turned into a list comprehension:

personIDList = [b[4] for a in A for b in B if a == b[1]]
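As a quick check that the loop and the comprehension agree, here is a minimal sketch. It assumes Python 3 (so the `L` suffix on the integers is dropped) and uses `None` where the question shows NULL. The first two rows follow the question's sample; the last two rows and the contents of `A` are made up for illustration.

```python
import datetime

# Hypothetical sample data: rows 3 and 4 and the IDs in A are invented.
A = [236479, 1610]
B = [
    [85097, 236479, 241583, "999-99-0000", 359913, datetime.datetime(2012, 9, 9)],
    [78654, 236479, 996783, "999-99-0000", None, datetime.datetime(2008, 5, 4)],
    [11111, 1610, 555555, "999-99-1111", 42, datetime.datetime(2010, 1, 1)],
    [22222, 9999, 666666, "999-99-2222", 77, datetime.datetime(2011, 1, 1)],
]

# Explicit nested loop, iterating over elements instead of indices.
loop_result = []
for a in A:
    for b in B:
        if a == b[1]:
            loop_result.append(b[4])

# The same logic written as a list comprehension.
comprehension_result = [b[4] for a in A for b in B if a == b[1]]

print(loop_result == comprehension_result)
```

Both versions still compare every element of A with every row of B, so the comprehension changes the syntax but not the amount of work.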

2013-10-10 16:25:59 poke

First convert the list of unique IDs to a set:

s = set(A)

Then use a list comprehension that iterates over the other list:

personIDList = [item[4] for item in B if item[1] in s]

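Your approach is O(N**2), or more precisely O(len(A) * len(B)). This one is O(N), that is O(len(A) + len(B)), because a membership test on a set takes constant time on average.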
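One thing worth knowing about the set version is that it returns the same personIDs in a different order: it follows the row order of B, while the nested version groups results by the order of A. The sketch below illustrates this under the assumption that A holds unique values, as the question states. The data is made up and Python 3 is assumed.

```python
from collections import Counter

# Hypothetical sample data (date column omitted for brevity).
A = [1610, 236479]
B = [
    [85097, 236479, 241583, "999-99-0000", 359913],
    [11111, 1610, 555555, "999-99-1111", 42],
    [78654, 236479, 996783, "999-99-0000", None],
    [22222, 9999, 666666, "999-99-2222", 77],
]

# Nested version: every ID in A is compared with every row in B.
nested = [b[4] for a in A for b in B if a == b[1]]

# Set version: one pass over B with an average constant-time lookup per row.
s = set(A)
single_pass = [item[4] for item in B if item[1] in s]

# Same items, possibly in a different order.
same_items = Counter(nested) == Counter(single_pass)
print(nested, single_pass, same_items)
```

If the order of the result matters, or if A could contain repeated IDs, the two versions are not interchangeable without further work.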

2013-10-10 16:30:57

+1 Good idea to use a set here! – poke

Here is how you turn a for loop into a list comprehension:

my_list = [] 
for i in something: 
    my_list.append(i+7)

becomes

my_list = [i+7 for i in something]

And here is how you turn a nested for loop into a list comprehension:

my_list = []
for i in first_thing:
    for j in second_thing:
        my_list.append(i + j)

becomes

my_list = [i + j for i in first_thing for j in second_thing]
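The `for` clauses in the comprehension appear in the same order as the nested loops, outermost first. A small sketch with made-up input values shows that both forms produce the same list in the same order:

```python
# Hypothetical inputs chosen only to make the ordering visible.
first_thing = [1, 2]
second_thing = [10, 20]

# Nested loops: the outer loop runs over first_thing.
my_list = []
for i in first_thing:
    for j in second_thing:
        my_list.append(i + j)

# Comprehension: the leftmost `for` is the outer loop.
flat = [i + j for i in first_thing for j in second_thing]

print(my_list, flat)
```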

So in your case, you want to do this:

personIDList = [b[4] for a in A for b in B if a == b[1]]

2013-10-10 16:31:13 rlms

import numpy as np 
A = np.array(A) 
B = np.array(B) 
person_ids = B[np.in1d(list(B[:,1]),A)][:,4]

I think so, at least... it would be easier if you posted example A and B lists.

I always like doing things with numpy :P

We can break it apart:

dup_ids_in_b = list(B[:,1])  # take column 1 (dupID) from B (we use list so it's not of type `object`)
boolmask_b_dups_in_a = np.in1d(dup_ids_in_b, A)  # True, True, False, ... True at every index i where B[i][1] is in A
person_ids = B[boolmask_b_dups_in_a][:,4]  # from the rows marked True in the last step, take column index 4 (personID)

which is more readable.
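For readers without NumPy, the same three steps (take the dupID column, build a boolean mask, select the personID from the masked rows) can be sketched in plain Python. This is an analogue of the idea, not the NumPy code itself; the sample data is made up, Python 3 is assumed, and `itertools.compress` stands in for boolean-mask indexing.

```python
from itertools import compress

# Hypothetical sample data (date column omitted for brevity).
A = [236479, 1610]
B = [
    [85097, 236479, 241583, "999-99-0000", 359913],
    [78654, 236479, 996783, "999-99-0000", None],
    [11111, 1610, 555555, "999-99-1111", 42],
    [22222, 9999, 666666, "999-99-2222", 77],
]

lookup = set(A)

# Step 1: take column 1 (dupID) from every row of B.
dup_ids_in_b = [row[1] for row in B]

# Step 2: boolean mask, True where the row's dupID is in A.
mask = [dup_id in lookup for dup_id in dup_ids_in_b]

# Step 3: keep the masked rows and take column index 4 (personID).
person_ids = [row[4] for row in compress(B, mask)]

print(mask, person_ids)
```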

2013-10-10 16:34:49

Thank you all very much for the help! I will post what I end up with. – Albert
