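numpy.random.normal with a different distribution: selecting values from a distribution

I have a power-law distribution of energies and I want to pick n random energies based on that distribution. I tried doing this manually with random numbers, but it is too inefficient for what I want to do. Is there a method in numpy (or elsewhere) that works like numpy.random.normal, except that the distribution can be specified instead of being a normal distribution? In my mind, an example might look like this (similar to numpy.random.normal):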

import numpy as np 

# Energies from within which I want values drawn 
eMin = 50. 
eMax = 2500. 

# Amount of energies to be drawn 
n = 10000 

photons = [] 

for i in range(n): 

    # Method that I just made up which would work like random.normal, 
    # i.e. return an energy on the distribution based on its probability, 
    # but take a distribution other than a normal distribution 
    photons.append(np.random.distro(eMin, eMax, lambda e: e**(-1.))) 

print(photons)

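Printing photons should give me a list of length 10000 populated by energies drawn from this distribution. If I were to histogram it, the bins at lower energies would have much larger values.

I am not sure whether such a method exists, but it seems like it should. I hope it is clear what I want to do.

Edit:

I have seen numpy.random.power, but my exponent is -1, so I don't think it will work.

Source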

2014-07-07 davly

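Is this not exactly the pdf you want? The power distribution is a special case of beta, so could you use http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.beta.html instead? – wim

@wim, I believe I want a piecewise function with f(x) = 0 outside my energy range and f(x) = x ** a inside it (where a can be any value between -5 and 5). I don't see how beta would work here. – davly

@davly I updated my answer with a code snippet in case it is useful –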

-1

Why not use eval and put the distribution in a string?

>>> import numpy
>>> cmd = "numpy.random.normal(500)"
>>> eval(cmd)

You can manipulate the string as needed to set the distribution.

Source

2014-07-07 11:53:33 BigBrownBear00

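Sorry, I misunderstood your question. – BigBrownBear00

Sampling from an arbitrary PDF is actually quite hard. There are large and dense books just about how to sample efficiently and accurately from the standard families of distributions.

For the example you gave, it looks like you could probably get by with a custom inversion method.

Source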

2014-07-07 12:49:57

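How would I implement a custom inversion method? I don't see it listed [here](http://docs.scipy.org/doc/numpy/reference/routines.random.html) – davly

Derive the inverse of the CDF. Use `random_sample()` to get values uniformly distributed between 0 and 1. Pass these through the inverse CDF to get values that follow the desired distribution. In your case, the inverse CDF is `lambda u: eMin * (eMax/eMin)**u`. –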
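The inversion recipe in the comment above can be sketched for the general case the asker mentions, a pdf proportional to e**a on [eMin, eMax]. This is only a minimal sketch: the function name `sample_power_law` and its parameters are hypothetical, it assumes eMin > 0, and it uses NumPy's `default_rng` generator rather than `random_sample()`.

```python
import numpy as np

def sample_power_law(a, e_min, e_max, size, rng=None):
    """Draw `size` values from a pdf proportional to e**a on [e_min, e_max].

    Hypothetical helper for illustration; assumes 0 < e_min < e_max.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random(size)  # uniform samples on [0, 1)
    if a == -1:
        # CDF(e) = log(e / e_min) / log(e_max / e_min); inverting gives
        # the expression quoted in the comment above.
        return e_min * (e_max / e_min) ** u
    k = a + 1.0
    # CDF(e) = (e**k - e_min**k) / (e_max**k - e_min**k); solve for e.
    return (e_min**k + u * (e_max**k - e_min**k)) ** (1.0 / k)

# Values from the question: 10000 energies between 50 and 2500, exponent -1.
photons = sample_power_law(-1.0, 50.0, 2500.0, 10000, rng=np.random.default_rng(0))
```

Because the whole array of uniform values is transformed at once, no Python-level loop over n is needed.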

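If you want to sample from an arbitrary distribution, you need the inverse of the cumulative distribution function (cdf), not of the pdf.

You can then sample a probability uniformly from the range [0,1] and feed it into the inverse of the cdf to get the corresponding value.

It is often not possible to obtain the cdf from the pdf analytically. However, if you are happy to approximate the distribution, you can evaluate f(x) at regular intervals over its domain, do a cumsum over that vector to approximate the cdf, and then approximate the inverse from that.

Rough code snippet: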

import matplotlib.pyplot as plt
import numpy as np
import scipy.interpolate

def f(x):
    """
    Substitute this function with your arbitrary distribution;
    it must be positive over the domain.
    """
    return 1.0 / x


# You should vary inputVals to cover the domain of f (for better accuracy you can
# be clever about the spacing of values as well). Here I space them logarithmically
# up to 1, then at regular intervals, but you could definitely do better.
# The log-spaced range (1e-5 up to just below 1) is an illustrative choice.
inputVals = np.hstack([np.logspace(-5, 0, 50, endpoint=False), np.arange(1, 10000)])

# Everything else should just work
funcVals = np.array([f(x) for x in inputVals])
# Approximate the cdf as a running sum of f(x) * dx, then normalise it to end at 1
cdf = np.zeros(len(funcVals))
cdf[1:] = np.cumsum(funcVals[:-1] * np.diff(inputVals))
cdf /= cdf[-1]

# You could also improve the approximation by choosing an appropriate interpolator
inverseCdf = scipy.interpolate.interp1d(cdf, inputVals)

# Grab 100k samples from the distribution
samples = inverseCdf(np.random.uniform(0, 1, size=100000))

plt.hist(samples, bins=500)
plt.show()
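The same numerical idea could be wrapped in a helper that resembles the `np.random.distro` call imagined in the question. What follows is a sketch under assumptions, not an existing NumPy function: the name `sample_from_pdf`, the evenly spaced grid, and the trapezoidal rule are illustrative choices, and the pdf is assumed to be strictly positive on [lo, hi] and to accept NumPy arrays.

```python
import numpy as np

def sample_from_pdf(pdf, lo, hi, size, grid_points=10001, rng=None):
    """Approximate inverse-cdf sampling from an unnormalised pdf on [lo, hi].

    Hypothetical helper for illustration; `pdf` must accept a NumPy array
    and be strictly positive on the interval.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.linspace(lo, hi, grid_points)
    y = pdf(x)
    # Cumulative trapezoidal integral of the pdf, starting at 0.
    cdf = np.concatenate(([0.0], np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))))
    cdf /= cdf[-1]  # normalise so the cdf ends at exactly 1
    u = rng.random(size)  # uniform samples on [0, 1)
    # Swapping the roles of x and cdf in np.interp evaluates the inverse cdf.
    return np.interp(u, cdf, x)

# Values from the question: 1/e on [50, 2500], 10000 draws.
energies = sample_from_pdf(lambda e: e ** -1.0, 50.0, 2500.0, 10000,
                           rng=np.random.default_rng(0))
```

A finer grid, or one spaced logarithmically for steep distributions, trades memory for accuracy.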

Source

2014-07-07 13:47:15
