Basic Concept of Probability Distributions 8: Normal Distribution

PDF & CDF
The probability density function is $$f(x;\mu,\sigma) = {1\over\sqrt{2\pi}\sigma}e^{-{1\over2}{(x-\mu)^2\over\sigma^2}}$$ The cumulative distribution function is defined by $$F(x;\mu,\sigma) =\Phi\left({x-\mu\over\sigma}\right)$$ where $$\Phi(z) = {1\over\sqrt{2\pi}}\int_{-\infty}^{z}e^{-{1\over2}x^2}\,dx$$
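As a quick numerical check, R's built-in dnorm and pnorm implement $f$ and $\Phi$ directly; the values $x = 1$, $\mu = 0.5$, $\sigma = 2$ below are arbitrary. R code:

mu = 0.5; sigma = 2; x = 1                                  # arbitrary example values
exp(-0.5 * ((x - mu) / sigma)^2) / (sqrt(2 * pi) * sigma)   # pdf formula above
dnorm(x, mean = mu, sd = sigma)                             # built-in density, same value
pnorm((x - mu) / sigma)                                     # Phi((x - mu) / sigma)
pnorm(x, mean = mu, sd = sigma)                             # built-in cdf, same value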
Proof:
$$\begin{align*}\int_{-\infty}^{\infty}f(x;\mu,\sigma)\,dx &=\int_{-\infty}^{\infty}{1\over\sqrt{2\pi}\sigma}e^{-{1\over2}{(x-\mu)^2\over\sigma^2}}\,dx\\&= {1\over\sqrt{2\pi}\sigma}\int_{-\infty}^{\infty}e^{-{1\over2}{(x-\mu)^2\over\sigma^2}}\,dx\\&= {1\over\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-{1\over2}y^2}\,dy\quad\quad\quad\left(\mbox{setting } y={x-\mu\over\sigma}\Rightarrow dx =\sigma\, dy\right)\end{align*}$$
Let $I =\int_{-\infty}^{\infty}e^{-{1\over2}y^2}\,dy$. Then $$\begin{align*} I^2 &=\int_{-\infty}^{\infty}e^{-{1\over2}y^2}\,dy\int_{-\infty}^{\infty}e^{-{1\over2}x^2}\,dx\\&=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-{1\over2}(y^2+x^2)}\,dy\,dx\quad\quad\quad\left(\mbox{setting } x=r\cos\theta,\ y=r\sin\theta\right)\\&=\int_{0}^{\infty}\int_{0}^{2\pi}e^{-{1\over2}r^2}\,r\,d\theta\, dr\quad\quad\quad\left(\mbox{double integral: }\iint\limits_{D}f(x, y)\,dx\,dy =\iint\limits_{D^*}f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta\right)\\&= 2\pi\int_{0}^{\infty}re^{-{1\over2}r^2}\,dr\\&= -2\pi e^{-{1\over2}r^2}\Big|_{0}^{\infty}\\&= 2\pi\end{align*}$$
Hence $I =\sqrt{2\pi}$ and $$\int_{-\infty}^{\infty}f(x;\mu,\sigma)\,dx = {1\over\sqrt{2\pi}}\cdot\sqrt{2\pi} = 1$$
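The normalization can also be verified numerically with R's integrate; the parameters below are arbitrary. R code:

mu = 0.5; sigma = 2                                         # arbitrary parameters
f = function(x) exp(-0.5 * ((x - mu) / sigma)^2) / (sqrt(2 * pi) * sigma)
integrate(f, lower = -Inf, upper = Inf)                     # returns 1 (up to numerical error)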
Standard Normal Distribution
If $X$ is normally distributed with parameters $\mu$ and $\sigma^2$, then $$Z = {X-\mu\over\sigma}$$ is normally distributed with parameters 0 and 1.
Proof:
An important preliminary result is that if $X$ is normally distributed with parameters $\mu$ and $\sigma^2$ and $a > 0$, then $Y = aX + b$ is normally distributed with parameters $a\mu + b$ and $a^2\sigma^2$. Denote by $F_{Y}$ the cumulative distribution function of $Y$: $$\begin{align*} F_{Y}(x) &= P(Y\leq x)\\&= P(aX + b\leq x)\\&= P\left(X\leq {x-b\over a}\right)\\&= F_{X}\left({x-b\over a}\right)\end{align*} $$ where $F_{X}(x)$ is the cumulative distribution function of $X$. By differentiation, the probability density function of $Y$ is $$\begin{align*} f_{Y}(x) &= {1\over a}f_{X}\left({x-b\over a}\right)\\&= {1\over\sqrt{2\pi}a\sigma}e^{-{1\over2}{\left({x-b\over a} -\mu\right)^2\over\sigma^2}}\\&= {1\over\sqrt{2\pi}(a\sigma)}e^{-{1\over2}{(x-b - a\mu)^2\over a^2\sigma^2}}\\&= {1\over\sqrt{2\pi}(a\sigma)}e^{-{1\over2}{(x-(b + a\mu))^2\over (a\sigma)^2}}\end{align*} $$ which shows that $Y$ is normally distributed with parameters $a\mu + b$ and $a^2\sigma^2$. Applying this result with $a = 1/\sigma > 0$ and $b = -\mu/\sigma$ shows that $Z = {X-\mu\over\sigma}$ is normally distributed with parameters 0 and 1.
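This standardization is what allows every normal probability to be computed from the standard normal cdf alone; in R, pnorm with mean and sd arguments agrees with pnorm applied to the standardized value (the numbers below are arbitrary). R code:

mu = 3; sigma = 3; x = 5                 # arbitrary values
pnorm(x, mean = mu, sd = sigma)          # P(X <= x) computed directly
pnorm((x - mu) / sigma)                  # P(Z <= (x - mu) / sigma), identical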
Mean
The expected value is $$E[X] =\mu$$
Proof:
$$\begin{align*} E[Z] &=\int_{-\infty}^{\infty}xf_{Z}(x)\,dx\quad\quad\quad\quad\quad\left(\mbox{setting } Z={X-\mu\over\sigma}\right)\\&= {1\over\sqrt{2\pi}}\int_{-\infty}^{\infty}xe^{-{1\over2}x^2}\,dx\\&= -{1\over\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-{1\over2}x^2}\,d\left(-{1\over2}x^2\right)\\&= -{1\over\sqrt{2\pi}}e^{-{1\over2}x^2}\Big|_{-\infty}^{\infty}\\&= 0\end{align*} $$ Hence $$\begin{align*} E[X] &= E\left[\sigma Z+\mu\right]\\&=\sigma E[Z] +\mu\\&=\mu\end{align*} $$
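A quick simulation illustrates $E[X] =\mu$ (sample size and parameters below are arbitrary). R code:

set.seed(1)                              # for reproducibility
mean(rnorm(1e6, mean = 3, sd = 2))       # close to mu = 3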
Variance
The variance is $$\mbox{Var}(X) =\sigma^2$$
Proof:
$$\begin{align*} E\left[Z^2\right] &= {1\over\sqrt{2\pi}}\int_{-\infty}^{\infty}x^2e^{-{1\over2}x^2}\,dx\quad\quad\quad\quad\quad\left(\mbox{setting } Z={X-\mu\over\sigma}\right)\\&= {1\over\sqrt{2\pi}}\left(-xe^{-{1\over2}x^2}\Big|_{-\infty}^{\infty} +\int_{-\infty}^{\infty}e^{-{1\over2}x^2}\,dx\right)\quad\quad\quad(\mbox{integrating by parts})\\&= {1\over\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-{1\over2}x^2}\,dx\quad\quad\quad(\mbox{the standard normal density integrates to 1})\\&= 1\end{align*} $$ The integration by parts uses $$u= x,\quad dv = xe^{-{1\over2}x^2}\,dx$$ $$\implies du = dx,\quad v =\int xe^{-{1\over2}x^2}\,dx = -e^{-{1\over2}x^2}$$ $$\implies\int x^2e^{-{1\over2}x^2}\,dx =-xe^{-{1\over2}x^2} +\int e^{-{1\over2}x^2}\,dx$$ Since $E[Z] = 0$, this gives $\mbox{Var}(Z) = E[Z^2] - (E[Z])^2 = 1$. Hence $$\mbox{Var}(X) =\mbox{Var}(\sigma Z +\mu)=\sigma^2\mbox{Var}(Z) =\sigma^2$$
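A simulation with the same arbitrary parameters illustrates $\mbox{Var}(X) =\sigma^2$. R code:

set.seed(1)                              # for reproducibility
var(rnorm(1e6, mean = 3, sd = 2))        # close to sigma^2 = 4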
Examples
1. If $X$ is a normal random variable with parameters $\mu = 3$ and $\sigma^2 = 9$, find (a) $P(2 < X < 5)$; (b) $P(X > 0)$; (c) $P(|X - 3| > 6)$.
Solution:
(a) $$\begin{align*} P(2 < X < 5) &= P\left({2-3\over3} < {X - 3\over 3} < {5-3\over 3}\right)\\&= P\left(-{1\over3} < Z < {2\over3}\right)\\&=\Phi\left({2\over3}\right) -\Phi\left(-{1\over3}\right) = 0.3780661\end{align*} $$ R code:
pnorm(2/3) - pnorm(-1/3)

# [1] 0.3780661 

(b) $$\begin{align*} P(X > 0) &= P\left({X-3\over3} > {0-3\over3}\right)\\&= P\left(Z > -1\right)\\&= 1 -\Phi(-1) = 0.8413447\end{align*} $$ R code:
1 - pnorm(-1)

# [1] 0.8413447

(c) $$\begin{align*} P(|X - 3| > 6) &= P(X > 9) + P(X < -3)\\&= P\left({X-3\over3} > {9-3\over3}\right) + P\left({X-3\over3} < {-3-3\over3}\right)\\&= P(Z > 2) + P(Z < -2)\\&= 1-\Phi(2) +\Phi(-2) = 0.04550026\end{align*} $$ R code:
1 - pnorm(2) + pnorm(-2)

# [1] 0.04550026 

2. Let $X$ be normally distributed with standard deviation $\sigma$. Determine $P\left(|X-\mu|\geq 2\sigma\right)$. Compare with Chebyshev's Inequality.
Solution:
$$\begin{align*} P\left(|X-\mu|\geq 2\sigma\right) &= P\left({X-\mu\over\sigma}\geq 2\right) + P\left({X-\mu\over\sigma}\leq -2\right)\\&=2\cdot P\left({X-\mu\over\sigma}\leq -2\right) = 2\Phi(-2)\end{align*} $$ R code:
2 * pnorm(-2)

# [1] 0.04550026

By Chebyshev's Inequality, the bound is $$P\left(|X-\mu|\geq 2\sigma\right)\leq {1\over2^2}=0.25$$ which is much weaker than the exact value $2\Phi(-2)\approx 0.0455$.
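The gap between the exact value and the Chebyshev bound can be seen directly. R code:

2 * pnorm(-2)    # exact probability

# [1] 0.04550026

1 / 2^2          # Chebyshev upper bound

# [1] 0.25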
3. Let $X$ be a normally distributed random variable with expected value $\mu=5$. Assume $P(X\leq 0) = 0.1$. What is the variance of $X$?
Solution:
$$\begin{align*} P(X\leq 0) &= P\left({X - 5\over\sigma}\leq {0-5\over\sigma}\right)\\&= P\left(Z\leq -{5\over\sigma}\right) = 0.1\end{align*} $$ Hence by using R:
z = qnorm(0.1)

var = (-5/z)^2

z; var

# [1] -1.281552

# [1] 15.22186 

$$-{5\over\sigma} = -1.281552\Rightarrow\sigma^2 = 15.22186$$
4. A normally distributed random variable $X$ satisfies $P(X\leq 0) = 0.4$ and $P(X\geq 10) = 0.1$. What is the expected value $\mu$ and the standard deviation $\sigma$?
Solution:
$$P(X\leq 0) = 0.4\Rightarrow\Phi\left({-\mu\over\sigma}\right) = 0.4$$ and $$P(X\geq 10) = 0.1\Rightarrow\Phi\left({10-\mu\over\sigma}\right) = 0.9$$ Thus $$\begin{cases}{-\mu\over\sigma}=-0.2533471\\{10-\mu\over\sigma}=1.281552\end{cases}\Rightarrow\begin{cases}\mu = 1.650579\\\sigma= 6.515088\end{cases}$$ R code:
z1 = qnorm(0.4); z2 = qnorm(0.9)

s = 10 / (z2 - z1)

mu = -s * z1

z1; z2

# [1] -0.2533471

# [1] 1.281552

mu; s

# [1] 1.650579

# [1] 6.515088 

5. Consider independent random variables $X\sim N(1, 3)$ and $Y\sim N(2, 4)$. What is $P(X + Y\leq 5)$?
Solution:
Since $X$ and $Y$ are independent, $X + Y$ is again normally distributed, with parameters $$\mu =\mu_1 +\mu_2 = 3$$ and $$\sigma^2 =\sigma_1^2 +\sigma_2^2 = 7$$ Hence $$\begin{align*} P(X + Y\leq 5) &= P\left(Z\leq {5-3\over\sqrt{7}}\right)\\&=\Phi\left({2\over\sqrt{7}}\right) = 0.7751541\end{align*} $$ R code:
pnorm(2 / sqrt(7))

# [1] 0.7751541
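The same probability can be approximated by simulation; note that rnorm takes the standard deviation, so sd = sqrt(3) and sd = 2 here (sample size arbitrary). R code:

set.seed(1)
x = rnorm(1e6, mean = 1, sd = sqrt(3))   # rnorm takes the standard deviation
y = rnorm(1e6, mean = 2, sd = 2)
mean(x + y <= 5)                         # close to pnorm(2 / sqrt(7)) = 0.7751541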

 
 
Reference
  • Ross, S. (2010). A First Course in Probability (8th Edition). Chapter 5. Pearson. ISBN: 978-0-13-603313-4.
  • Brink, D. (2010). Essentials of Statistics: Exercises. Chapter 5 & 15. ISBN: 978-87-7681-409-0.