$\Beta$分布推导与可视化

科技资讯 投稿 5500 0 评论

$\Beta$分布推导与可视化

$\Gamma$函数

$\displaystyle \Gamma(x = \int_0^\infty t^{x-1}e^{-t} dt $

$\Gamma$函数具有以下性质:

2、$\Gamma$函数满足递推关系:$\Gamma(x + 1 = x\Gamma(x$(注意和整数阶乘的联系)。

$\Beta$分布

基于伯努利实验的推导

$\Beta$分布(Beta分布)与伯努利试验相关。在伯努利试验中,假设硬币朝上的概率为$p$。当抛$a+b$次硬币,硬币朝上的次数为$a$时,计算该情况的概率为

$ \displaystyle C_{a+b}^ap^a(1-p^b$

$\displaystyle k \int_0^1C_{a+b}^ap^a(1-p^b  dp=1$

$\displaystyle k =\left(\int_0^1C_{a+b}^ap^a(1-p^b dp\right^{-1}=a+b+1 $

from scipy.special import comb

def Int(func, l, h, n=1000: #模拟定积分
    a = np.linspace(l, h, n
    return func(a.sum(*(h-l/n
a, b = 5, 2 #取任意整数
k = a + b + 1
def func(x:
    return comb(a+b, a * (x**a *((1-x**b
Int(func, 0, 1 * k # = 1

获得概率密度函数:

$ \begin{align} f(p; a, b&=(a+b+1C_{a+b}^ap^a(1-p^b\\ &=\frac{(a+b+1!}{a!b!}p^a(1-p^b\\ \end{align} $

$ \begin{align}f(p; a, b= \frac{\Gamma(a+b+2}{\Gamma(a+1\Gamma(b+1}p^a(1-p^b\end{align}$

$ \begin{align}\displaystyle f(x;\alpha,\beta=\frac{\Gamma(\alpha+\beta}{\Gamma(\alpha\Gamma(\beta}x^{\alpha-1}(1-x^{\beta-1}=\frac{1}{\Beta(\alpha,\beta}x^{\alpha-1}(1-x^{\beta-1}\end{align}$

联合概率密度

正整数情况

关于(2式,我们把其中的$a,b,p$都看作随机变量,再除以一个归一化系数,就可以构成这三个随机变量的联合概率密度,从而可以非常直观地理解$\Beta$分布。分别固定抽样次数$n=0,1,2,5,10,15$,可视化如下:

此外,当在y轴方向上进行求和,可以得到$p$的边缘分布,为均匀分布;而当在x轴方向上进行积分,得到$a$的边缘分布,也是均匀分布。但感觉边缘分布似乎没有什么意义,不知理解是否正确。可视化代码如下:

import matplotlib.pyplot as plt from scipy.special import gamma import numpy as np import matplotlib matplotlib.rcParams['font.family'] = 'Times New Roman' def Beta(a, b, p: g1, g2, g3 = gamma(a+b+2, gamma(a+1, gamma(b+1 return g1/(g2*g3 * p**(a * (1-p**(b def BetaHot(n, samp_n = 1000: p = np.linspace(0, 1, samp_n a = np.linspace(0, n, n+1 P, A = np.meshgrid(p, a Z = Beta(A, n-A, P/(n+1 per_width = samp_n//(n+1 Z1 = np.repeat(Z, per_width, 0 # 热力图 plt.imshow(Z1, origin='lower', cmap='Blues' plt.colorbar( # 关于p的密度图 for i, t in enumerate(Z: plt.plot(np.linspace(-0.5, samp_n-0.5, samp_n, i*per_width+per_width*t-0.5, '--', c='red' # 绘图设置 plt.xlabel("p" plt.ylabel('a' old_yticks = np.linspace(per_width/2-0.5, Z1.shape[0]-0.5-per_width/2, n+1 plt.yticks(old_yticks, [f'{i:.0f}' for i in np.linspace(0, n, n+1] old_xticks = np.linspace(-0.5, samp_n-0.5, 6 plt.xticks(old_xticks, [f'{i:.1f}' for i in np.linspace(0, 1, 6] plt.ylim([0, samp_n+4] plt.title("n = a + b = %d"%n plt.savefig('img/n = %d.png'%n plt.show( for n in [0, 1, 2, 5, 10, 15]: BetaHot(n

一般情况

以上可视化的是$a,b$取为正整数时$\Beta$分布的情况,即(2式。对于更一般的情况(4式,即取$\alpha,\beta$为大于0的实数,在(3式中就是$a>-1,b>-1$,尽管并不符合真实伯努利试验的情况,但仍可以计算。可视化如下:

 
import matplotlib.pyplot as plt
from scipy.special import gamma
import numpy as np
import matplotlib 
matplotlib.rcParams['font.family'] = 'Times New Roman'

def Beta(a, b, p:
    g1, g2, g3 = gamma(a+b+2, gamma(a+1, gamma(b+1
    beta = g1/(g2*g3 * p**(a * (1-p**(b
    return beta

for n in [-1, 0, 1, 2, 5, 10]:
    for a in np.linspace(-0.9, n+0.9, 3:
        p = np.linspace(0.0001, 0.9999, 300
        b = n-a
        y = Beta(a, b, p
        plt.plot(p, y, label="a = %.1f, b = %.1f"%(a, b
    plt.legend(loc='upper center'
    plt.ylim([-0.1, 3]
    plt.title('a + b = %.1f' % n
    plt.savefig('img/a + b = %.1f.png' % n
    plt.show(

参考

1.  共轭先验: https://www.zhihu.com/question/41846423?sort=created

 

编程笔记 » $\Beta$分布推导与可视化

赞同 (26) or 分享 (0)
游客 发表我的评论   换个身份
取消评论

表情
(0)个小伙伴在吐槽