For any two numbers $a \le b$, $[a,b] \cup (-\infty, a) = (-\infty, b]$. Thus
$$P(a \le X \le b) + P(X < a) = P(X \le b),$$
and so
$$P(a \le X \le b) = P(X \le b) - P(X < a).$$
We denote by $a^-$ the largest possible value of $X$ that is strictly less than $a$. (For example, if $X$ takes only integer values and $a = 2$, then $a^- = 1$.) Since $P(X < a) = P(X \le a^-)$, we can write
$$P(a \le X \le b) = P(X \le b) - P(X \le a^-) = F(b) - F(a^-),$$
where $F$ is the cumulative distribution function of $X$.
In particular, if $X$ takes only integer values and $a, b$ are integers, then $a^- = a - 1$ and $P(a \le X \le b) = F(b) - F(a-1)$.
Ex. Suppose $X$ follows $\mathrm{Geo}(p)$ and $1 \le n \le m$ are integers. Then
$$P(n \le X \le m) = F(m) - F(n-1) = \bigl(1 - (1-p)^m\bigr) - \bigl(1 - (1-p)^{n-1}\bigr) = (1-p)^{n-1} - (1-p)^m.$$
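As a quick numerical sanity check (not part of the original notes), here is a short Python sketch comparing this closed form against a direct sum of the geometric pmf; the values of `p`, `n`, and `m` are arbitrary choices for illustration.

```python
p, n, m = 0.3, 2, 5

# Closed form: P(n <= X <= m) = (1-p)^(n-1) - (1-p)^m
closed_form = (1 - p) ** (n - 1) - (1 - p) ** m

# Direct sum of the geometric pmf p(k) = (1-p)^(k-1) * p over k = n..m
direct_sum = sum((1 - p) ** (k - 1) * p for k in range(n, m + 1))

print(closed_form, direct_sum)  # both print 0.53193
```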
Definition. Let $X$ be a discrete random variable with set of possible values $D$ and probability mass function $p(x)$. The expected value (also called mean value or expectation) of $X$, denoted by $E(X)$ (or $\mu_X$, or just $\mu$), is defined as
$$E(X) = \mu_X = \sum_{y \in D} y \cdot p(y).$$
Ex1. Suppose $X$ follows $\mathrm{Bernoulli}(\alpha)$. Then $E(X) = 0 \cdot p(0) + 1 \cdot p(1) = 0(1-\alpha) + 1(\alpha) = \alpha$.
Ex2. Returning to the blood sample example from previous lectures: $p(1) = 0.4$, $p(2) = 0.3$, $p(3) = 0.2$, $p(4) = 0.1$, so
$$E(X) = 1(0.4) + 2(0.3) + 3(0.2) + 4(0.1) = 2.$$
Ex3. Suppose $X$ follows $\mathrm{Geo}(p)$, so $p(k; p) = (1-p)^{k-1} p$. Then
$$E(X) = \sum_{k=1}^{\infty} k \cdot (1-p)^{k-1} p = p \sum_{k=1}^{\infty} k (1-p)^{k-1} = p \sum_{k=1}^{\infty} \frac{d}{dp}\bigl(-(1-p)^k\bigr) = p \cdot \frac{d}{dp}\left[\sum_{k=1}^{\infty} -(1-p)^k\right] = p \cdot \frac{d}{dp}\left[1 + \sum_{k=0}^{\infty} -(1-p)^k\right].$$
The infinite sum is now a geometric series with ratio $1 - p$, so it equals $\frac{1}{1-(1-p)} = \frac{1}{p}$, and
$$E(X) = p \cdot \frac{d}{dp}\left(1 - \frac{1}{p}\right) = p \cdot \frac{1}{p^2} = \frac{1}{p}.$$
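The result $E(X) = 1/p$ is easy to check numerically. Below is a small Python sketch (my own, not from the notes) that compares a truncated direct sum of $k \cdot p(k)$ and a Monte Carlo estimate against $1/p$; the truncation point, seed, and trial count are arbitrary.

```python
import random

p = 0.25
random.seed(0)

# Truncated direct sum of k * p(k); the tail beyond k = 10_000 is negligible here
direct = sum(k * (1 - p) ** (k - 1) * p for k in range(1, 10_001))

# Monte Carlo: a Geo(p) sample counts trials up to and including the first success
def geometric_sample(p: float) -> int:
    k = 1
    while random.random() >= p:  # failure with probability 1 - p
        k += 1
    return k

n_trials = 100_000
mc = sum(geometric_sample(p) for _ in range(n_trials)) / n_trials

print(direct, mc, 1 / p)  # all three should be close to 4.0
```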
We want to find $E(h(X))$, where $h(x)$ could be any function of our random variable $X$. We can do this by viewing $h(X)$ as a new random variable generated by $h$.
Generally,
$$E(h(X)) = \sum_{y \in D} h(y) \cdot p(y).$$
In the special case where $h(x) = ax + b$ with $a, b$ constant (that is, $h$ is linear), $E(h(X)) = a \cdot E(X) + b$.
Ex. For the blood sample example: the cost of the experiment, $C(x)$, is a function of how many tests you had to perform. (This is not necessarily linear.) Here $C(1) = 100$, $C(2) = 140$, $C(3) = 170$, $C(4) = 190$. Given this function,
$$E(C(X)) = C(1)p(1) + C(2)p(2) + C(3)p(3) + C(4)p(4) = 100(0.4) + 140(0.3) + 170(0.2) + 190(0.1) = 135.$$
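To tie the last two points together, here is a short Python sketch (again my own addition) that computes $E(X)$ and $E(C(X))$ directly from the pmf, and then checks the linearity rule $E(aX+b) = aE(X)+b$ for arbitrarily chosen constants `a` and `b`.

```python
# pmf and cost function from the blood sample example
pmf = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
cost = {1: 100, 2: 140, 3: 170, 4: 190}

# E(X) = sum over x of x * p(x)
mean_x = sum(x * px for x, px in pmf.items())

# E(C(X)) = sum over x of C(x) * p(x); no need for the distribution of C(X) itself
mean_cost = sum(cost[x] * px for x, px in pmf.items())

# Linearity check for h(x) = a*x + b: E(aX + b) = a*E(X) + b
a, b = 50, 10
lhs = sum((a * x + b) * px for x, px in pmf.items())
rhs = a * mean_x + b

print(mean_x)     # 2.0
print(mean_cost)  # 135.0
print(lhs, rhs)   # both 110.0
```

Note that the general formula works for the non-linear $C$, while the shortcut $aE(X)+b$ applies only to the linear case.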