
Maximum Likelihood Method - Part II

In this post, under the regularity conditions (R0)-(R2) from the previous post, we will see that the maximum likelihood estimator (MLE) is a consistent estimator.


The maximum likelihood estimators are consistent estimators

Recall that we denote the true parameter by $\theta_0$ and

$\mathbf{Assumption\ 1.1}$ (Regularity Conditions).
(R0) The cdfs are distinct; i.e., $\theta_1 \neq \theta_2 \to F(x_i; \theta_1) \neq F(x_i; \theta_2)$.
(R1) The pdfs have common support for all $\theta \in \Omega$.
(R2) The point $\theta_0$ is an interior point in $\Omega$.

$\mathbf{Thm\ 2.1.}$ Assume that $X_1, \cdots, X_n$ are iid with pdf $f(x; \theta_0)$ satisfying (R0) through (R2), and further that $f(x;\theta)$ is differentiable with respect to $\theta$ in $\Omega$. Then, the likelihood equation

$\frac{\partial}{\partial \theta} l(\theta) = 0$


has a solution $\widehat{\theta}_n$ such that $\widehat{\theta}_n \overset{P}{\to} \theta_0$.

$Proof$. By (R2), for some $a > 0$, $(\theta_0 - a, \theta_0 + a) \subset \Omega$.

Define the event

$S_n = \{ \mathbf{X} : l(\theta_0; \mathbf{X}) > l(\theta_0 - a; \mathbf{X}) \} \cap \{ \mathbf{X} : l(\theta_0; \mathbf{X}) > l(\theta_0 + a; \mathbf{X}) \}$


By Theorem 1.1 in the previous post, $P(S_n) \to 1$.

On $S_n$, since $f(x; \theta)$ is differentiable with respect to $\theta$, the likelihood attains a local maximum at some point $\widehat{\theta}_n$ with $\theta_0 - a < \widehat{\theta}_n < \theta_0 + a$ and $l'(\widehat{\theta}_n) = 0$. Hence $S_n \subset \{ |\widehat{\theta}_n - \theta_0| < a \}$, so that

$P(|\widehat{\theta}_n - \theta_0| < a) \geq P(S_n) \to 1$.

Note that Theorem 1.1 relies on the identity $E_{\theta_0} [\frac{f(X_i; \theta)}{f(X_i; \theta_0)}] = \int \frac{f(x; \theta)}{f(x; \theta_0)} f(x; \theta_0) dx = 1$, which holds because the pdfs have common support by (R1). Thus, the statement is true$._\blacksquare$
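To make the identity above concrete, here is a minimal Monte Carlo sketch, assuming a toy exponential model that is not from the post: true rate $\theta_0 = 2$ and an alternative $\theta = 3.5$. Both densities are positive on $(0, \infty)$, so (R1) holds, and the sample mean of the ratio should come out close to 1.

```python
import numpy as np

# Monte Carlo check of E_{theta0}[ f(X; theta) / f(X; theta0) ] = 1.
# Assumed toy model (not from the post): f(x; theta) = theta * exp(-theta * x)
# on (0, inf), so all densities share the same support and (R1) holds.
rng = np.random.default_rng(0)

theta0, theta = 2.0, 3.5                               # assumed parameter values
x = rng.exponential(scale=1 / theta0, size=1_000_000)  # draws from f(x; theta0)

ratio = (theta * np.exp(-theta * x)) / (theta0 * np.exp(-theta0 * x))
print(ratio.mean())  # should be close to 1, matching the identity in the proof
```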

In summary, asymptotically, the likelihood function is maximized at the true parameter $\theta_0$. So, in estimating $\theta_0$, it seems natural to consider the value of $\theta$ that maximizes the likelihood.
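To see the consistency claim in action, here is a small simulation sketch, again assuming the toy exponential model above (true rate $\theta_0 = 2$, not from the post). For that model the likelihood equation gives $\widehat{\theta}_n = 1/\bar{X}_n$, and the fraction of replications with $|\widehat{\theta}_n - \theta_0| < a$ should climb toward 1 as $n$ grows, mirroring $P(|\widehat{\theta}_n - \theta_0| < a) \to 1$.

```python
import numpy as np

# Simulation sketch of consistency for an assumed exponential model:
# the MLE of the rate is 1 / (sample mean), and we estimate
# P(|theta_hat_n - theta0| < a) for increasing sample sizes n.
rng = np.random.default_rng(1)

theta0, a, reps = 2.0, 0.1, 1000                 # assumed true rate, window, replications
for n in [50, 500, 5000]:
    samples = rng.exponential(scale=1 / theta0, size=(reps, n))
    theta_hat = 1 / samples.mean(axis=1)         # MLE of the exponential rate
    coverage = np.mean(np.abs(theta_hat - theta0) < a)
    print(f"n = {n:5d}   P(|theta_hat - theta0| < {a}) ~ {coverage:.3f}")
```

The printed fractions should increase toward 1, which is exactly the convergence in probability asserted by Thm 2.1 for this assumed model.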

