STAT 400

Wed. February 19th, 2020


Binomial Probability Distribution

Suppose we repeat an experiment $n$ times ("$n$ trials"). The experiment produces two possible outcomes: success and failure, or $\{S,F\}$. The outcomes of each trial are independent. Denote $P(S)=p$, which is the same for all trials.

Let $X$ be the number of successes among the $n$ trials. Then we call these $n$ trials a binomial experiment and $X$ a binomial random variable ($X$ follows $\text{bin}(n,p)$ with parameters $n, p$).


Ex. $X$ follows $\text{bin}(4,p)$. Find the probability mass function for $X$.

Possible values of $X$: 0, 1, 2, 3, 4.

$p(0)=P(X=0)=(1-p)^4$
$p(1)=P(X=1)={4\choose 1}p(1-p)^3=4p(1-p)^3$
We multiply by 4 above because there are four possible ways to choose one success out of the four trials: ${4\choose 1}=4$.
$p(2)=P(X=2)={4\choose 2}p^2(1-p)^2=6p^2(1-p)^2$
$p(3)=P(X=3)={4\choose 3}p^3(1-p)^1=4p^3(1-p)$
$p(4)=P(X=4)={4\choose 4}p^4=p^4$
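As a sanity check, the five pmf values above can be computed numerically. A minimal sketch using only Python's standard library, with $p=0.3$ chosen arbitrarily for illustration:

```python
from math import comb

n = 4
p = 0.3  # arbitrary illustrative success probability

# p(k) = C(4, k) * p^k * (1-p)^(4-k), matching the five cases worked out above
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

print(pmf)       # p(0), p(1), p(2), p(3), p(4)
print(sum(pmf))  # the five probabilities sum to 1
```

Note that, e.g., `pmf[1]` agrees with the hand-derived $4p(1-p)^3$.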


In general, if $X$ follows $\text{bin}(n,p)$, then $P(X=k)={n\choose k}p^k(1-p)^{n-k}$.

Theorem. If $X$ follows $\text{bin}(n,p)$, then the probability mass function for $X$ is $$b(k;n,p)=\begin{cases} {n\choose k}p^k(1-p)^{n-k} & 0\le k\le n\\ 0 & \text{otherwise} \end{cases}$$

The cumulative distribution function for $X$ is $B(x;n,p)=\sum_{y=0}^{\lfloor x\rfloor} b(y;n,p)$ for $x>0$.
Note that there is no closed form for this cdf – the easiest way to calculate it is to just work out each value of $b$ by hand and sum them. However, you can use a binomial table to simplify the calculations.
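The same term-by-term summation is easy to do in code. A sketch (the function names here are my own, not standard library names):

```python
from math import comb, floor

def binomial_pmf(k, n, p):
    """b(k; n, p) = C(n, k) p^k (1-p)^(n-k) for 0 <= k <= n, else 0."""
    if 0 <= k <= n:
        return comb(n, k) * p**k * (1 - p)**(n - k)
    return 0.0

def binomial_cdf(x, n, p):
    """B(x; n, p): sum of b(y; n, p) for y = 0, ..., floor(x)."""
    return sum(binomial_pmf(y, n, p) for y in range(floor(x) + 1))

# P(X <= 2) for X ~ bin(5, 0.5): (1 + 5 + 10) / 32 = 0.5
print(binomial_cdf(2, 5, 0.5))  # → 0.5
```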


Expected Value and Variance

Theorem. If $X$ follows $\text{bin}(n,p)$, then $E(X)=np$ and $V(X)=np(1-p)$.
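The theorem can be verified numerically for a particular case by applying the definitions $E(X)=\sum_k k\,p(k)$ and $V(X)=\sum_k (k-\mu)^2 p(k)$ directly; a sketch with arbitrarily chosen parameters:

```python
from math import comb

n, p = 10, 0.25  # arbitrary parameters for illustration

pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

mean = sum(k * pmf[k] for k in range(n + 1))             # E(X) = sum k p(k)
var = sum((k - mean)**2 * pmf[k] for k in range(n + 1))  # V(X) = E[(X - mu)^2]

print(mean, n * p)           # both ≈ 2.5
print(var, n * p * (1 - p))  # both ≈ 1.875
```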


Properties

Now that we understand the binomial distribution, we can draw some conclusions about its properties and how it relates to the Bernoulli distribution:

  1. $\text{Bernoulli}(p)=\text{bin}(1,p)$.
  2. Suppose we performed a sequence of $n$ independent experiments $X_i$ that follow $\text{Bernoulli}(p)$, where $X_i$ denotes the success or failure of the $i$th experiment. Then the new random variable $X = X_1 + \cdots + X_n$ formed by summing these outcomes follows $\text{bin}(n,p)$.
  3. Suppose $X_1$ follows $\text{bin}(n_1,p)$, $X_2$ follows $\text{bin}(n_2,p)$, and $X_1$ and $X_2$ are independent. Then $Y=X_1+X_2$ follows $\text{bin}(n_1+n_2,p)$.
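Property 3 can be checked numerically: convolving the pmfs of two independent binomials with the same $p$ should reproduce the pmf of $\text{bin}(n_1+n_2, p)$ exactly. A sketch, with the parameters below chosen arbitrarily:

```python
from math import comb

def pmf(n, p):
    """pmf of bin(n, p) as a list indexed by k = 0, ..., n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

n1, n2, p = 3, 4, 0.6  # arbitrary illustrative parameters

# Distribution of Y = X1 + X2 via convolution of the two independent pmfs
conv = [0.0] * (n1 + n2 + 1)
for i, pi in enumerate(pmf(n1, p)):
    for j, pj in enumerate(pmf(n2, p)):
        conv[i + j] += pi * pj

direct = pmf(n1 + n2, p)  # bin(n1 + n2, p) computed directly

# Largest discrepancy between the two pmfs — should be ~0
print(max(abs(a - b) for a, b in zip(conv, direct)))
```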