
Gamma and Related Distributions

Introduction

We define a function $\Gamma (\alpha)$, called the Gamma function, which extends the factorial, as follows:

\[\begin{align*} \Gamma (\alpha) = \int_{0}^\infty y^{\alpha - 1} e^{-y} \cdot dy \end{align*}\]

From this definition, we can see that

\[\begin{align*} &\Gamma(\alpha) = (\alpha - 1) \Gamma (\alpha - 1) \text{ where } \alpha > 1, \Gamma (1) = 1 \end{align*}\]

by integration by parts. For more on the motivation behind the Gamma function, see [2].
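
As a quick numerical sanity check of the recursion and of the factorial extension (a minimal sketch using Python's standard library; the test values are arbitrary):

```python
from math import gamma, factorial, isclose

# Gamma(alpha) = (alpha - 1) * Gamma(alpha - 1) for alpha > 1
for alpha in [1.5, 2.7, 5.0, 10.3]:
    assert isclose(gamma(alpha), (alpha - 1) * gamma(alpha - 1))

# For positive integers n, Gamma(n) = (n - 1)!  (the factorial extension)
for n in range(1, 8):
    assert isclose(gamma(n), factorial(n - 1))

print("Gamma recursion and factorial identity hold numerically.")
```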

Gamma Distribution

From the Gamma function, an important distribution called the Gamma distribution is defined; it is widely used for modeling waiting times, for example in life testing.

Assume a Poisson process with rate $\lambda$, and consider a time interval of length $w$.
Let the random variable $W$ be the waiting time until exactly $k$ events occur (we fix $k$ as a positive integer), and let $X$ be the number of events in an interval of length $w$.
Then,
Then,

\[\begin{align*} P(W > w) &= \sum_{x=0}^{k-1} P(X = x) \\ &= \sum_{x=0}^{k-1} \frac{(\lambda w)^x e^{-\lambda w}}{x!} \\ &= \int_{\lambda w}^\infty \frac{1}{\Gamma (k)} z^{k-1} e^{-z} \cdot dz \text{ by I.B.P } \end{align*}\]

Thus, the CDF of $W$ is

\[\begin{align*} F_W (w) &= \int_{0}^{\lambda w} \frac{1}{\Gamma (k)} z^{k-1} e^{-z} \cdot dz \\ &= \int_{0}^{w} \frac{\lambda^k}{\Gamma (k)} x^{k-1} e^{-\lambda x} \cdot dx \\ &\Rightarrow W \sim \Gamma (k, \frac{1}{\lambda}) \end{align*}\]
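
The identity between the Poisson tail sum and the Gamma CDF can be confirmed numerically (a sketch using `scipy.stats`; the values of $\lambda$, $k$, and $w$ are arbitrary):

```python
from scipy import stats

lam, k, w = 2.0, 3, 1.7      # rate, number of events, interval length (arbitrary)

# P(W > w) as a Poisson tail sum: sum_{x=0}^{k-1} P(X = x) with X ~ Poisson(lam * w)
tail_sum = sum(stats.poisson.pmf(x, lam * w) for x in range(k))

# The same probability from the Gamma(k, 1/lam) survival function: P(W > w) = 1 - F_W(w)
gamma_tail = stats.gamma.sf(w, a=k, scale=1 / lam)

print(tail_sum, gamma_tail)  # the two values coincide
```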

We also encounter this distribution often in Bayesian statistics, since it is the conjugate prior of several other distributions (for example, the Poisson and the exponential).

PDF

\[\begin{align*} f(x) = \frac{1}{\Gamma (\alpha) \beta^\alpha} x^{\alpha - 1} e^{-\frac{x}{\beta}} \; (0 < x < \infty) \end{align*}\]



We call $\alpha$ the shape parameter and $\beta$ the scale parameter.

MGF

\[\begin{align*} M(t) &= \mathbb{E}[e^{tX}] \\ &= \frac{1}{\Gamma (\alpha) \beta^\alpha} \int_{0}^\infty x^{\alpha - 1} e^{-x (\frac{1}{\beta} - t)} \cdot dx \\ &= \frac{(\frac{1}{\beta} - t)^{-\alpha}}{\beta^\alpha} \\ &= \frac{1}{(1- \beta t)^\alpha} \; (t < \frac{1}{\beta}) \end{align*}\]
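
The closed form can be checked against a Monte Carlo estimate of $\mathbb{E}[e^{tX}]$ (a sketch; the parameter values are arbitrary, with $t < 1/\beta$):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, t = 3.0, 2.0, 0.2           # shape, scale, and a point with t < 1/beta

x = rng.gamma(shape=alpha, scale=beta, size=1_000_000)
mgf_mc = np.exp(t * x).mean()            # Monte Carlo estimate of E[e^{tX}]
mgf_closed = (1 - beta * t) ** (-alpha)  # (1 - beta t)^{-alpha}

print(mgf_mc, mgf_closed)                # agree to a few decimal places
```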

Mean, Variance

\[\begin{align*} \mathbb{E}[X] &= \alpha \beta \\ \text{Var}[X] &= \alpha \beta^2 \end{align*}\]
$\mathbf{Proof.}$
\[\begin{align*} \mathbb{E}[X] &= M'(0) = \left. \frac{\alpha \beta}{(1 - \beta t)^{\alpha + 1}} \right|_{t=0} = \alpha \beta \\ \text{Var}[X] &= M''(0) - (\alpha \beta)^2 = (\alpha^2 + \alpha) \beta^2 - \alpha^2 \beta^2 = \alpha \beta^2._\blacksquare \end{align*}\]
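
These moments can be confirmed directly with `scipy.stats.gamma` (a minimal sketch; the parameter values are arbitrary):

```python
from scipy import stats

alpha, beta = 4.5, 1.5
mean, var = stats.gamma(a=alpha, scale=beta).stats(moments="mv")

print(mean, alpha * beta)        # both 6.75
print(var, alpha * beta ** 2)    # both 10.125
```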


Additivity, Scaling property

$\mathbf{Thm\ 1.1.}$ Suppose $X_1, \cdots X_n$ are independent and $X_i \sim \Gamma (\alpha_i, \beta)$. Then, $X = \sum_{i=1}^n X_i \sim \Gamma (\sum_{i=1}^n \alpha_i, \beta)$.

$\mathbf{Proof.}$

Note that the MGF of $X_i$ is $M_i (t) = (1 - \beta t)^{-\alpha_i}$. Then,

\[\begin{align*} M(t) = \mathbb{E}(e^{tX}) = \prod_{i=1}^n \mathbb{E}(e^{t X_i}) = (1 - \beta t)^{-\sum_{i=1}^n \alpha_i} \end{align*}\]

Done by uniqueness of MGFs.
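
The additivity can also be checked by simulation: fitting a Gamma distribution to sums of independent Gamma samples recovers the summed shape parameter (a minimal sketch using `scipy.stats.gamma.fit` with the location fixed at 0; the parameter values are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alphas, beta = [1.0, 2.5, 4.0], 2.0

# Sum of independent Gamma(alpha_i, beta) samples
x = sum(rng.gamma(shape=a, scale=beta, size=200_000) for a in alphas)

# Fitted shape should be close to sum(alphas) = 7.5, fitted scale close to beta = 2
a_hat, _, scale_hat = stats.gamma.fit(x, floc=0)
print(a_hat, scale_hat)
```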


$\mathbf{Thm\ 1.2.}$ Suppose $X \sim \Gamma (\alpha, \beta)$. Then, for $k > 0$, $kX \sim \Gamma (\alpha, k\beta)$.

$\mathbf{Proof.}$

Note that the MGF of $X$ is $M (t) = (1 - \beta t)^{-\alpha}$. Then,

\[\begin{align*} M_{kX}(t) = \mathbb{E}(e^{t \cdot kX}) = M(kt) = (1 - k \beta t)^{-\alpha} \end{align*}\]

Done by uniqueness of MGFs.
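
Likewise, the scaling property can be checked by comparing moments of $kX$ against a $\Gamma(\alpha, k\beta)$ distribution (a sketch with arbitrary parameter values):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
alpha, beta, k = 3.0, 1.5, 4.0

kx = k * rng.gamma(shape=alpha, scale=beta, size=500_000)

# Moments of kX versus Gamma(alpha, k * beta)
mean, var = stats.gamma(a=alpha, scale=k * beta).stats(moments="mv")
print(kx.mean(), mean)   # both ≈ alpha * k * beta = 18
print(kx.var(), var)     # both ≈ alpha * (k * beta)^2 = 108
```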



Exponential Distribution

The exponential distribution is a special case of the Gamma distribution with $\alpha = 1$ and $\beta = \frac{1}{\lambda}$: for $X \sim \text{Exp}(\lambda)$,

\[\begin{align*} f(x) = \lambda e^{-\lambda x} \; (0 < x < \infty) \end{align*}\]

Thus, $X \sim \Gamma (1, \frac{1}{\lambda})$.
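
The correspondence is easy to verify numerically by comparing the two densities (a small sketch; the value of $\lambda$ is arbitrary):

```python
import numpy as np
from scipy import stats

lam = 0.7
x = np.linspace(0.01, 10, 200)

# Exponential(lambda) density versus Gamma(alpha=1, beta=1/lambda) density
np.testing.assert_allclose(
    stats.expon.pdf(x, scale=1 / lam),
    stats.gamma.pdf(x, a=1, scale=1 / lam),
)
print("Exp(lambda) and Gamma(1, 1/lambda) densities coincide.")
```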




$\boldsymbol{\chi}^2$ Distribution

When $\alpha = \frac{r}{2}$ and $\beta = 2$, the Gamma distribution becomes a $\boldsymbol{\chi}^2$ distribution with $r$ degrees of freedom.

PDF

\[\begin{align*} f(x) = \frac{1}{\Gamma (r/2) 2^{r/2}} x^{\frac{r}{2} - 1} e^{-\frac{x}{2}} \; (0 < x < \infty) \end{align*}\]



MGF

\[\begin{align*} M(t) = (1- 2 t)^{-r/2} \; (t < \frac{1}{2}) \end{align*}\]

Mean, Variance

\[\begin{align*} \mathbb{E}[X] &= r \\ \text{Var}[X] &= 2r \end{align*}\]

Formula of moments

$\mathbf{Thm\ 1.3.}$ Suppose $X \sim \chi^2 (r)$. If $k > -\frac{r}{2}$, $\mathbb{E}(X^k)$ exists and $\mathbb{E}(X^k) = \frac{2^k \Gamma (k + \frac{r}{2})}{\Gamma(\frac{r}{2})}$

$\mathbf{Proof.}$
\[\begin{align*} \mathbb{E}(X^k) &= \int_{0}^\infty \frac{1}{\Gamma(r/2) 2^{r/2}} x^{\frac{r}{2} + k - 1} e^{-\frac{x}{2}} \cdot dx \\ &= \int_{0}^\infty \frac{1}{\Gamma(r/2) 2^{r/2}} 2^{\frac{r}{2} + k} y^{\frac{r}{2} + k - 1} e^{-y} \cdot dy \quad (x = 2y) \\ &= \frac{2^k \Gamma (\frac{r}{2} + k)}{\Gamma (\frac{r}{2})}._\blacksquare \end{align*}\]
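
Thm 1.3 can be checked against Monte Carlo estimates, including non-integer and negative $k$ (a sketch; $r$ and the $k$ values below are arbitrary choices with $k > -\frac{r}{2}$):

```python
import numpy as np
from scipy.special import gamma as G

rng = np.random.default_rng(3)
r = 5
x = rng.chisquare(df=r, size=2_000_000)

for k in [0.5, 1, 2, -1.0]:                    # any k > -r/2 works
    closed = 2 ** k * G(k + r / 2) / G(r / 2)  # formula from Thm 1.3
    print(k, (x ** k).mean(), closed)          # Monte Carlo vs. closed form
```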



$\boldsymbol{\beta}$ Distribution

Let $X_1 \sim \Gamma (\alpha, 1)$ and $X_2 \sim \Gamma (\beta, 1)$ be independent random variables. Then, we want to find the distribution of $Y_1 = X_1 + X_2$, $Y_2 = \frac{X_1}{X_1 + X_2}$.

First, the joint PDF of $X_1$ and $X_2$ is

\[\begin{aligned} f(x_1, x_2) &= f_1 (x_1) \cdot f_2 (x_2) \\ &= (\frac{1}{\Gamma (\alpha)} x_1^{\alpha - 1} e^{-x_1}) \cdot (\frac{1}{\Gamma (\beta)} x_2^{\beta - 1} e^{-x_2}) \end{aligned}\]

And, notice that $X_1 = Y_1 \cdot Y_2$, $X_2 = Y_1 (1 - Y_2)$. Then, the Jacobian is

\[\begin{align*} |J|=\left|\begin{array}{cc} Y_2 & Y_1 \\ 1-Y_2 & -Y_1 \end{array}\right|=\left|-Y_1\right|=Y_1 \end{align*}\]

As a result, the joint PDF of $Y_1$ and $Y_2$ is

\[\begin{align*} g(y_1, y_2) = \frac{1}{\Gamma (\alpha) \Gamma(\beta)} y_1^{\alpha + \beta - 1} y_2^{\alpha - 1} (1 - y_2)^{\beta - 1} e^{-y_1} \; (y_1 \geq 0, 0 \leq y_2 \leq 1) \end{align*}\]

Thus, the marginal PDF of $Y_2$ is

\[\begin{align*} f_2(y_2) &= \int_{0}^{\infty} \frac{1}{\Gamma (\alpha) \Gamma(\beta)} y_1^{\alpha + \beta - 1} y_2^{\alpha - 1} (1 - y_2)^{\beta - 1} e^{-y_1} \cdot dy_1 \\ &= \frac{\Gamma (\alpha + \beta)}{\Gamma (\alpha) \Gamma (\beta)} y_2^{\alpha - 1} (1 - y_2)^{\beta - 1} \end{align*}\]

The distribution of $Y_2$, constructed from two $\Gamma$-distributed random variables, is called the $\boldsymbol{\beta}$ distribution. It appears frequently in Bayesian statistics as the conjugate prior for the Bernoulli, binomial, negative binomial, and geometric distributions.
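
This construction can be verified by simulation: normalizing one of two independent Gamma variables by their sum indeed produces a $\text{Beta}(\alpha, \beta)$ sample (a sketch using `scipy.stats`; the parameter values are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
a, b = 2.0, 5.0

x1 = rng.gamma(shape=a, scale=1.0, size=500_000)   # X1 ~ Gamma(a, 1)
x2 = rng.gamma(shape=b, scale=1.0, size=500_000)   # X2 ~ Gamma(b, 1)
y2 = x1 / (x1 + x2)                                # should be Beta(a, b)

# Compare the empirical CDF against the Beta(a, b) CDF at a few points
for q in [0.1, 0.3, 0.5, 0.7]:
    print(q, (y2 <= q).mean(), stats.beta.cdf(q, a, b))
```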

PDF

\[\begin{align*} f(x) = \frac{\Gamma (\alpha + \beta)}{\Gamma (\alpha) \Gamma (\beta)} x^{\alpha - 1} (1 - x)^{\beta - 1} \; (0 \leq x \leq 1) \end{align*}\]



Mean, Variance

\[\begin{align*} \mathbb{E}[X] &= \frac{\alpha}{\alpha + \beta} \\ \text{Var}[X] &= \frac{\alpha \beta}{(\alpha + \beta + 1)(\alpha + \beta)^2} \end{align*}\]
$\mathbf{Proof.}$ Using the normalization of the PDF above, $\int_0^1 x^{a - 1} (1-x)^{b - 1} \cdot dx = \frac{\Gamma(a) \Gamma(b)}{\Gamma(a+b)}$:
\[\begin{align*} \mathbb{E}[X] &= \int_0^1 \frac{\Gamma (\alpha + \beta)}{\Gamma (\alpha) \Gamma (\beta)} x^{\alpha} (1 - x)^{\beta-1} \cdot dx = \frac{\Gamma (\alpha + \beta)}{\Gamma (\alpha) \Gamma (\beta)} \cdot \frac{\Gamma(\alpha + 1) \Gamma(\beta)}{\Gamma(\alpha + \beta + 1)} = \frac{\alpha}{\alpha+\beta} \\ \mathbb{E}[X^2] &= \frac{\Gamma (\alpha + \beta)}{\Gamma (\alpha) \Gamma (\beta)} \cdot \frac{\Gamma(\alpha + 2) \Gamma(\beta)}{\Gamma(\alpha + \beta + 2)} = \frac{\alpha (\alpha + 1)}{(\alpha+\beta)(\alpha+\beta+1)} \\ \text{Var}[X] &= \mathbb{E}[X^2] - (\mathbb{E}[X])^2 = \frac{\alpha \beta}{(\alpha + \beta + 1)(\alpha + \beta)^2}._\blacksquare \end{align*}\]
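
As a quick check, the formulas agree with `scipy.stats.beta` (a minimal sketch with arbitrary parameters):

```python
from scipy import stats

a, b = 2.0, 5.0                                     # alpha, beta
mean, var = stats.beta(a, b).stats(moments="mv")

print(mean, a / (a + b))                            # both ≈ 0.2857
print(var, a * b / ((a + b + 1) * (a + b) ** 2))    # both ≈ 0.02551
```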



Dirichlet Distribution

There is a multivariate generalization of the $\beta$ distribution, called the Dirichlet distribution, or multivariate beta distribution (MBD).

Let $X_1, \cdots, X_{k + 1}$ be independent random variables with $X_i \sim \Gamma(\alpha_i, 1)$. Consider the one-to-one transformation of random variables

\[Y_i = \left\{\begin{array}{l} X_1 + \cdots + X_{k+1} \; (i = k+1) \\ \frac{X_i}{X_1 + \cdots + X_{k+1}} \; (i=1,2, \cdots, k) \end{array} \quad \leadsto X_i = \left\{\begin{array}{l} Y_{k+1} - Y_{k+1} \left( Y_1 + \cdots + Y_k \right) \; (i=k+1) \\ Y_i Y_{k+1} \; (i=1, \cdots, k) \end{array}\right.\right.\]

Then, adding each of the first $k$ rows to the last row, the Jacobian $J$ is

\[\begin{align*} J = \begin{vmatrix} Y_{k+1} & 0 & \cdots & 0 & Y_1 \\ 0 & Y_{k+1} & \cdots & 0 & Y_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & Y_{k+1} & Y_k \\ -Y_{k+1} & -Y_{k+1} & \cdots & -Y_{k+1} & 1-Y_1 - \cdots - Y_k \end{vmatrix} = \begin{vmatrix} Y_{k+1} & 0 & \cdots & 0 & Y_1 \\ 0 & Y_{k+1} & \cdots & 0 & Y_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & Y_{k+1} & Y_k \\ 0 & 0 & \cdots & 0 & 1 \end{vmatrix} = Y_{k+1}^k \end{align*}\]

Thus, the joint PDF of $Y_1, \cdots, Y_{k+1}$ is

\[\begin{aligned} g(y_1, \cdots, y_{k+1}) = \frac{y_1^{\alpha_1 - 1} \cdots y_k^{\alpha_k - 1} (1 - y_1 - \cdots - y_k)^{\alpha_{k + 1} - 1}}{\Gamma (\alpha_1) \cdot \cdots \cdot \Gamma (\alpha_{k+1})} \; y_{k+1}^{\alpha_1 + \cdots + \alpha_{k+1} - 1} e^{-y_{k+1}} \end{aligned}\]

By integrating out $y_{k+1}$, which contributes a factor of $\Gamma(\alpha_1 + \cdots + \alpha_{k+1})$, we obtain the PDF of the Dirichlet distribution.

Dirichlet distributions are commonly used as prior distributions in Bayesian statistics, and in fact, the Dirichlet distribution is the conjugate prior of the categorical distribution and multinomial distribution.
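
The Gamma-normalization construction above is also a standard way to sample from a Dirichlet distribution; the sketch below (with arbitrary $\alpha_i$) compares it against NumPy's built-in sampler:

```python
import numpy as np

rng = np.random.default_rng(5)
alphas = np.array([1.0, 2.0, 3.0, 4.0])   # alpha_1, ..., alpha_{k+1}
n = 500_000

# Normalize independent Gamma(alpha_i, 1) variables by their sum
g = rng.gamma(shape=alphas, scale=1.0, size=(n, len(alphas)))
y = g / g.sum(axis=1, keepdims=True)

# Component means should equal alpha_i / sum(alphas); compare with rng.dirichlet
print(y.mean(axis=0))
print(rng.dirichlet(alphas, size=n).mean(axis=0))
print(alphas / alphas.sum())
```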

PDF

\[\begin{align*} f(x_1, \cdots, x_k) = \frac{\Gamma (\alpha_1 + \cdots + \alpha_{k+1})}{\Gamma (\alpha_1) \cdot \cdots \cdot \Gamma (\alpha_{k+1})} x_1^{\alpha_1 - 1} \cdots x_k^{\alpha_k - 1} (1 - x_1 - \cdots - x_k)^{\alpha_{k + 1} - 1} \end{align*}\]

where $x_i \geq 0$ and $0 \leq x_1 + \cdots + x_k \leq 1$.





Reference

[1] Hogg, R., McKean, J. & Craig, A., Introduction to Mathematical Statistics, Pearson, 2019
[2] Wikipedia, Gamma function
[3] Wikipedia, Gamma distribution
[4] Wikipedia, Exponential distribution
[5] Wikipedia, Chi-squared distribution
[6] Wikipedia, Beta distribution
[7] Wikipedia, Dirichlet distribution
