For any two numbers $a \le b$, $[a,b] \cup (-\infty, a) = (-\infty, b]$. Thus
$$P(a \le X \le b) + P(X < a) = P(X \le b),$$
and so
$$P(a \le X \le b) = P(X \le b) - P(X < a).$$
We denote by $a^-$ the largest possible value of $X$ that is strictly less than $a$. (For example, if $X$ takes only integer values and $a = 2$, then $a^- = 1$.) Since $P(X < a) = P(X \le a^-)$, we can write
$$P(a \le X \le b) = P(X \le b) - P(X \le a^-) = F(b) - F(a^-),$$
where $F$ is the cumulative distribution function of $X$.
In particular, if $X$ takes only integer values and $a, b$ are integers, then $a^- = a - 1$ and $P(a \le X \le b) = F(b) - F(a-1)$.
Ex. Suppose $X$ follows $\mathrm{Geo}(p)$ and $1 \le n \le m$ are integers. Then
$$P(n \le X \le m) = F(m) - F(n-1) = \bigl(1 - (1-p)^m\bigr) - \bigl(1 - (1-p)^{n-1}\bigr) = (1-p)^{n-1} - (1-p)^m.$$
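As a quick numerical sanity check (not part of the original notes), here is a short Python sketch comparing this closed form against a direct sum of the geometric pmf; the values of `p`, `n`, and `m` are arbitrary choices for illustration.

```python
p, n, m = 0.3, 2, 5

# Closed form: P(n <= X <= m) = (1-p)^(n-1) - (1-p)^m
closed_form = (1 - p) ** (n - 1) - (1 - p) ** m

# Direct sum of the geometric pmf p(k) = (1-p)^(k-1) * p over k = n..m
direct_sum = sum((1 - p) ** (k - 1) * p for k in range(n, m + 1))

print(closed_form, direct_sum)  # both print 0.53193
```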
Definition. Let $X$ be a discrete random variable with set of possible values $D$ and probability mass function $p(x)$. The expected value (also called mean value or expectation) of $X$, denoted by $E(X)$ (or $\mu_X$, or just $\mu$), is defined as
$$E(X) = \mu_X = \sum_{y \in D} y \cdot p(y).$$
Ex1. Suppose $X$ follows $\mathrm{Bernoulli}(\alpha)$. Then $E(X) = 0 \cdot p(0) + 1 \cdot p(1) = 0(1-\alpha) + 1(\alpha) = \alpha$.
Ex2. Returning to the blood sample example from previous lectures: $p(1) = 0.4$, $p(2) = 0.3$, $p(3) = 0.2$, $p(4) = 0.1$, so
$$E(X) = 1(0.4) + 2(0.3) + 3(0.2) + 4(0.1) = 2.$$
Ex3. Suppose $X$ follows $\mathrm{Geo}(p)$, so $p(k; p) = (1-p)^{k-1} p$. Then
$$E(X) = \sum_{k=1}^{\infty} k \cdot (1-p)^{k-1} p = p \sum_{k=1}^{\infty} k (1-p)^{k-1} = p \sum_{k=1}^{\infty} \frac{d}{dp}\bigl(-(1-p)^k\bigr) = p \cdot \frac{d}{dp}\left[\sum_{k=1}^{\infty} -(1-p)^k\right] = p \cdot \frac{d}{dp}\left[1 + \sum_{k=0}^{\infty} -(1-p)^k\right].$$
The infinite sum is now a geometric series with ratio $1 - p$, so it equals $\frac{1}{1-(1-p)} = \frac{1}{p}$, and
$$E(X) = p \cdot \frac{d}{dp}\left(1 - \frac{1}{p}\right) = p \cdot \frac{1}{p^2} = \frac{1}{p}.$$
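The result $E(X) = 1/p$ is easy to check numerically. Below is a small Python sketch (my own, not from the notes) that compares a truncated direct sum of $k \cdot p(k)$ and a Monte Carlo estimate against $1/p$; the truncation point, seed, and trial count are arbitrary.

```python
import random

p = 0.25
random.seed(0)

# Truncated direct sum of k * p(k); the tail beyond k = 10_000 is negligible here
direct = sum(k * (1 - p) ** (k - 1) * p for k in range(1, 10_001))

# Monte Carlo: a Geo(p) sample counts trials up to and including the first success
def geometric_sample(p: float) -> int:
    k = 1
    while random.random() >= p:  # failure with probability 1 - p
        k += 1
    return k

n_trials = 100_000
mc = sum(geometric_sample(p) for _ in range(n_trials)) / n_trials

print(direct, mc, 1 / p)  # all three should be close to 4.0
```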
We want to find $E(h(X))$, where $h(x)$ could be any function of our random variable $X$. We can do this by viewing $h(X)$ as a new random variable generated by $h$.
Generally,
$$E(h(X)) = \sum_{y \in D} h(y) \cdot p(y).$$
In the special case where $h(x) = ax + b$ with $a, b$ constant (that is, $h$ is linear), $E(h(X)) = a \cdot E(X) + b$.
Ex. For the blood sample example: the cost of the experiment, $C(x)$, is a function of how many tests you had to perform. (This is not necessarily linear.) Here $C(1) = 100$, $C(2) = 140$, $C(3) = 170$, $C(4) = 190$. Given this function,
$$E(C(X)) = C(1)p(1) + C(2)p(2) + C(3)p(3) + C(4)p(4) = 100(0.4) + 140(0.3) + 170(0.2) + 190(0.1) = 135.$$
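To tie the last two points together, here is a short Python sketch (again my own addition) that computes $E(X)$ and $E(C(X))$ directly from the pmf, and then checks the linearity rule $E(aX+b) = aE(X)+b$ for arbitrarily chosen constants `a` and `b`.

```python
# pmf and cost function from the blood sample example
pmf = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
cost = {1: 100, 2: 140, 3: 170, 4: 190}

# E(X) = sum over x of x * p(x)
mean_x = sum(x * px for x, px in pmf.items())

# E(C(X)) = sum over x of C(x) * p(x); no need for the distribution of C(X) itself
mean_cost = sum(cost[x] * px for x, px in pmf.items())

# Linearity check for h(x) = a*x + b: E(aX + b) = a*E(X) + b
a, b = 50, 10
lhs = sum((a * x + b) * px for x, px in pmf.items())
rhs = a * mean_x + b

print(mean_x)     # 2.0
print(mean_cost)  # 135.0
print(lhs, rhs)   # both 110.0
```

Note that the general formula works for the non-linear $C$, while the shortcut $aE(X)+b$ applies only to the linear case.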