Checking data from a CSV against a list

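Hi, I'm new to Python and I'm trying to expand my knowledge by building something usable. I'm trying to write a function that creates a list of 6 random numbers drawn from the range 1 to 59. I've cracked that part; the next part is the tricky one. I now want to check a CSV file for the numbers in the random set and print a notification if two or more numbers from the set are found. I've tried print (df[df[0:].isin(luckyDip)]), with some success: it checks the DataFrame for the numbers and shows the ones that match, but it also shows the rest of the DataFrame as NaN, which isn't pleasant to read and isn't really what I want.

I'm just looking for some pointers on what to do next, or even just what to search Google for. Below is the code I've been messing around with.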

import random 
import pandas as pd 

url ='https://www.national-lottery.co.uk/results/euromillions/draw-history/csv' 
df = pd.read_csv(url, sep=',', na_values=".") 

lottoNumbers = [1,2,3,4,5,6,7,8,9,10, 
      11,12,13,14,15,16,17,18,19,20, 
      21,22,23,24,25,26,27,28,29,30, 
      31,32,33,34,35,36,37,38,39,40, 
      41,42,43,44,45,46,47,48,49,50, 
      51,52,53,54,55,56,57,58,59] 
luckyDip = random.sample(lottoNumbers, k=6) #Picks 6 numbers at random 
print (sorted(luckyDip))  
print (df[df[0:].isin(luckyDip)])

Source

2017-05-23 Mortgage1

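If you just want to flatten the array and drop the NaN values, you can add this to the end of your code:

import numpy as np  # needed for float64 and isnan
matches = df[df[0:].isin(luckyDip)].values.flatten().astype(np.float64) 
print(matches[~np.isnan(matches)])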
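
To see what the mask-then-flatten step does without downloading anything, here is a minimal sketch on a tiny hand-made DataFrame. The column names, the numbers in `draws`, and the fixed `lucky_dip` list are all made up for illustration; they are not taken from the real draw history.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the downloaded draw history (made-up numbers).
draws = pd.DataFrame(
    {"Ball 1": [3, 11, 19], "Ball 2": [14, 23, 28], "Ball 3": [27, 34, 41]}
)
# Fixed picks instead of random.sample so the result is repeatable.
lucky_dip = [19, 23, 41, 50, 7, 2]

# Keep the matching cells, turn every other cell into NaN, then flatten to 1-D.
masked = draws.where(draws.isin(lucky_dip))
flat = masked.to_numpy(dtype=float).ravel()

# Drop the NaN placeholders so only the matched numbers remain.
found = flat[~np.isnan(flat)]
print(found)
```

Note that flattening throws away which draw each number came from, so this only tells you which of your numbers appeared somewhere in the data.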

Source

2017-05-23 21:30:58 user2188329

Not as elegant as @ayhan's solution, but this works:

import random 
import pandas as pd 

url ='https://www.national-lottery.co.uk/results/euromillions/draw-history/csv' 
df = pd.read_csv(url, index_col=0, sep=',') 

lottoNumbers = range(1, 60) 

tries = 0 
while True: 
    tries+=1 
    luckyDip = random.sample(lottoNumbers, k=6) #Picks 6 numbers at random 

    # subset of balls 
    draws = df.iloc[:,0:7] 

    # True where there is match 
    matches = draws.isin(luckyDip) 

    # Gives the sum of Trues 
    sum_of_trues = matches.sum(1) 

    # you are looking for matches where sum_of_trues is 6 
    final = sum_of_trues[sum_of_trues == 6] 
    if len(final) > 0: 
        print("Took", tries) 
        print(final) 
        break

The result looks like this:

Took 15545 
DrawDate 
16-May-2017 6 
dtype: int64
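
The same row-sum idea can be pointed at the original question ("two or more matches") by making the threshold a parameter instead of hard-coding 6. This is only a sketch: `rows_with_matches` is a hypothetical helper name, and the draw dates and numbers below are invented so the example runs offline.

```python
import pandas as pd


def rows_with_matches(draws, picks, min_matches=2):
    """Return the per-row match count for rows with at least min_matches hits."""
    # isin gives True/False per cell; summing across columns counts hits per draw.
    hits = draws.isin(picks).sum(axis=1)
    return hits[hits >= min_matches]


# Made-up draw history indexed by draw date (illustrative values only).
draws = pd.DataFrame(
    {"Ball 1": [3, 11, 19], "Ball 2": [14, 23, 28], "Ball 3": [27, 34, 41]},
    index=["01-May-2017", "05-May-2017", "09-May-2017"],
)

result = rows_with_matches(draws, [19, 23, 41, 50, 7, 2])
print(result)
```

With the real data you would pass only the ball columns (for example the `df.iloc[:,0:7]` slice used above) so that other columns cannot produce accidental matches.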

Source

2017-05-23 21:34:13 RicLeal

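You can build on what you already have by counting the non-null values in each row, then showing the rows where the number of matches is 2 or more.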

match_count = df[df[0:].isin(luckyDip)].notnull().sum(axis=1) 
print(match_count[match_count >= 2])

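This gives you the index value of each matching row and its number of matches.

Example output:

If you also want the matching values from those rows, you can add: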

index = match_count[match_count >= 2].index 
matches = [tuple(x[~pd.isnull(x)]) for x in df.loc[index][df[0:].isin(luckyDip)].values] 
print(matches)

Example output:

[(19.0, 23.0), (19.0, 41.0), (19.0, 23.0, 34.0), (23.0, 28.0)]
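
If the floats in that output (19.0 rather than 19) are a nuisance, one alternative is to skip the NaN mask and intersect each row with the picks as a plain Python set. This is a swapped-in technique rather than the approach above, and the data, dates, and variable names are invented for the example.

```python
import pandas as pd

# Made-up draw history indexed by draw date (illustrative values only).
draws = pd.DataFrame(
    {
        "Ball 1": [3, 11, 19, 2],
        "Ball 2": [14, 23, 28, 23],
        "Ball 3": [27, 34, 41, 50],
    },
    index=["01-May-2017", "05-May-2017", "09-May-2017", "12-May-2017"],
)
picks = {19, 23, 41, 50, 7, 2}

# For each draw, keep the numbers that are also in picks (as plain ints).
matched = {
    date: sorted(picks.intersection(row))
    for date, row in zip(draws.index, draws.to_numpy().tolist())
}

# Only report draws where two or more numbers matched.
matched = {date: nums for date, nums in matched.items() if len(nums) >= 2}
print(matched)
```

This loops over rows in Python, so it will be slower than the vectorised `isin` version on large files, but it keeps the draw date next to its matching numbers.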

Source

2017-05-23 21:44:01
